TechTC - Technion Repository of Text Categorization Datasets

The TechTC-300 Test Collection for Text Categorization

Version: 1.0
Release date: May 2, 2004
Maintained by: Evgeniy Gabrilovich (gabr@cs.technion.ac.il)
  1. Overview
  2. Availability and usage
  3. Detailed statistics
  4. Mailing list
  5. Questions?
  6. References
  7. Additional publications

Overview

The TechTC-300 Test Collection contains 300 labeled datasets whose categorization difficulty (as measured by baseline SVM accuracy) is uniformly distributed between 0.6 and 1.0. This test collection was constructed in the course of experiments described in (Davidov et al., 2004).

The data acquisition procedure and the format of the data files for this collection are described in detail on the main TechTC page.

Availability and usage

Download the entire test collection as a single ZIP file:

The "Detailed statistics" section below contains further information on individual datasets.

Conditions of use

If you publish results based on this test collection, please cite the following paper:
Dmitry Davidov, Evgeniy Gabrilovich, and Shaul Markovitch
"Parameterized Generation of Labeled Datasets for Text Categorization Based on a Hierarchical Directory"
Proceedings of The 27th Annual International ACM SIGIR Conference, pp. 250-257, Sheffield, UK, July 2004
[Abstract / PDF]

Please also inform your readers of the current location of the data: http://techtc.cs.technion.ac.il/techtc300/techtc300.html

Software

Nil Geisweiller has kindly made available his software for creating datasets based on the ODP: https://github.com/ngeiswei/techtc-builder

Detailed statistics

The following table provides detailed information about the individual datasets in the TechTC-300 collection. Each line in this table describes one dataset. The first two columns specify the ids of the two ODP categories comprising the dataset, hyperlinked to the corresponding categories in the ODP hierarchy. The third column shows the number of documents in the dataset. The fourth and fifth columns contain the graph distance and the text distance between the two categories, respectively; see Davidov et al. (2004), Section 2.1, for the exact definitions of these metrics. The next three columns show the accuracy of text categorization with SVM, C4.5 and KNN, respectively, using all the features (i.e., without any feature selection). Finally, the last column shows the Maximum Achievable Accuracy (MAA) for this dataset.
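The nine-column layout described above can be read programmatically. The following is a minimal sketch in Python, assuming each table line is whitespace-separated exactly as printed below; the class name DatasetRow, its field names, and the function parse_row are hypothetical and are not part of the TechTC distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRow:
    """One line of the statistics table (field names are illustrative)."""
    id1: int              # ODP id of the first category
    id2: int              # ODP id of the second category
    num_docs: int         # number of documents in the dataset
    graph_distance: int   # graph distance between the two categories
    text_distance: float  # text distance between the two categories
    svm: float            # SVM accuracy, all features
    c45: float            # C4.5 accuracy, all features
    knn: float            # KNN accuracy, all features
    maa: float            # Maximum Achievable Accuracy


def parse_row(line: str) -> DatasetRow:
    """Split one table line on whitespace and convert each column."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    # The first four columns are integers, the remaining five are floats.
    return DatasetRow(
        int(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]),
        *(float(f) for f in fields[4:]),
    )


# First line of the table, copied verbatim.
row = parse_row("1092 1110 269 3 12.551497 0.95 0.9231 0.9426 0.95")
print(row.id1, row.id2, row.num_docs, row.svm, row.maa)
```

A frozen dataclass is used here only so that parsed rows cannot be modified by accident; a plain tuple or dictionary would work equally well.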

Category id1 | Category id2 | Number of documents | Graph distance | Text distance | SVM | C4.5 | KNN | MAA
1092 1110 269 3 12.551497 0.95 0.9231 0.9426 0.95
1092 135724 264 6 15.045206 0.9731 0.9425 0.9733 0.9733
1092 789236 275 14 22.826953 0.9704 0.9704 0.9778 0.9778
1110 47793 257 10 17.81247 0.964 0.948 0.872 0.964
1622 42350 200 5 6.023847 0.88 0.8401 0.7619 0.88
1996 261990 173 8 22.161406 0.9503 0.9751 0.9065 0.9751
2592 3431 188 8 21.732932 0.9442 0.9443 0.9609 0.9609
2592 42357 166 5 16.769086 0.9378 0.9502 0.9003 0.9502
2597 56702 270 5 15.892009 0.9578 0.931 0.9541 0.9578
2631 83261 282 6 15.50696 0.9297 0.926 0.8812 0.9297
2631 449165 180 7 17.490925 0.9646 0.9587 0.947 0.9646
2631 789236 284 13 21.690814 0.9677 0.9661 0.9641 0.9677
3093 421943 184 10 20.855236 0.9443 0.9555 0.9055 0.9555
3431 48472 177 8 15.809618 0.9705 0.9382 0.8914 0.9705
5310 9639 246 9 11.221992 0.9653 0.9523 0.8698 0.9653
5560 592118 254 11 24.014359 0.98 0.976 0.9342 0.98
5560 704167 275 12 17.100563 0.9778 0.9334 0.9186 0.9778
6920 8366 200 7 9.199628 0.9072 0.943 0.9042 0.943
8308 8366 200 3 4.506822 0.8501 0.8001 0.7993 0.8501
8564 8567 188 2 16.489813 0.9388 0.9221 0.7646 0.9388
8564 8767 163 4 20.637502 0.923 0.9153 0.7568 0.923
10341 10755 200 4 5.808147 0.7921 0.7305 0.7844 0.7921
10341 14271 200 3 1.634875 0.7305 0.7132 0.8164 0.8164
10341 14525 200 2 2.288228 0.6267 0.6468 0.6333 0.6468
10341 61792 200 4 2.839517 0.7214 0.7001 0.7787 0.7787
10341 186330 200 4 3.664325 0.7215 0.7572 0.7872 0.7872
10341 194927 200 4 2.565268 0.7199 0.6602 0.7817 0.7817
10350 10539 200 4 3.482156 0.74 0.7133 0.7799 0.7799
10350 13928 200 3 1.534028 0.7713 0.7143 0.6846 0.7713
10350 194915 200 4 2.351878 0.7867 0.7332 0.8468 0.8468
10385 14525 200 5 9.117423 0.86 0.82 0.78 0.86
10385 25326 200 7 8.500273 0.8858 0.845 0.8931 0.8931
10385 269078 200 7 7.337601 0.7928 0.7642 0.8513 0.8513
10385 299104 200 5 6.883686 0.7928 0.7823 0.7117 0.7928
10385 312035 200 3 6.453106 0.6857 0.6524 0.5743 0.6857
10539 10567 200 3 7.384583 0.6714 0.6215 0.8717 0.8717
10539 11346 200 6 7.495751 0.8533 0.7999 0.8534 0.8534
10539 20673 200 8 7.19373 0.7 0.6608 0.8287 0.8287
10539 61792 200 6 5.206669 0.6599 0.6416 0.7833 0.7833
10539 85489 200 7 9.984525 0.7665 0.8133 0.6521 0.8133
10539 186330 200 6 4.838736 0.6733 0.64 0.7601 0.7601
10539 194915 200 6 5.02744 0.7191 0.6439 0.8315 0.8315
10539 300332 200 8 7.383129 0.8189 0.7691 0.8474 0.8474
10567 11346 200 7 10.402671 0.8306 0.746 0.9307 0.9307
10567 12121 200 8 9.824434 0.7998 0.7269 0.9502 0.9502
10567 46076 200 7 8.909045 0.7305 0.6979 0.8922 0.8922
10762 325847 160 6 9.204028 0.8067 0.7866 0.76 0.8067
10762 524208 161 10 21.060944 0.9533 0.9567 0.8268 0.9567
10762 5861635 167 15 15.11991 0.9315 0.8441 0.9128 0.9315
11342 9639 282 11 14.219647 0.963 0.926 0.9556 0.963
11346 17360 200 6 8.81867 0.8152 0.7998 0.8845 0.8845
11346 22294 200 9 9.580476 0.875 0.8334 0.9048 0.9048
11498 14517 200 7 23.591085 0.8999 0.8817 0.9093 0.9093
13928 18479 200 5 2.752411 0.7785 0.7537 0.7906 0.7906
13928 71892 200 9 6.062736 0.893 0.8214 0.8041 0.893
13928 186330 200 5 3.108226 0.75 0.7358 0.7749 0.7749
13928 300332 200 7 4.386871 0.8133 0.7799 0.8267 0.8267
13928 312035 200 5 3.372916 0.7429 0.6893 0.7071 0.7429
14271 20186 200 6 1.9748 0.692 0.5961 0.9153 0.9153
14271 46076 200 5 3.179224 0.7536 0.6921 0.8042 0.8042
14271 194927 200 5 2.118571 0.7143 0.6964 0.7877 0.7877
14271 312035 200 5 2.780381 0.769 0.6998 0.7998 0.7998
14517 20673 200 8 11.426356 0.8545 0.834 0.8544 0.8545
14517 186330 200 6 8.272968 0.8499 0.8416 0.9084 0.9084
14518 96104 263 7 11.38311 0.956 0.936 0.8379 0.956
14518 472203 273 13 11.866412 0.9847 0.9347 0.9502 0.9847
14525 61792 200 6 5.438969 0.7666 0.74 0.8 0.8
14525 194927 200 6 5.165557 0.694 0.6752 0.8252 0.8252
14630 18479 200 7 5.450985 0.82 0.8066 0.82 0.82
14630 20186 200 8 5.657029 0.7067 0.5801 0.9599 0.9599
14630 94142 200 10 12.725833 0.8333 0.8467 0.7999 0.8467
14630 300332 200 9 8.626976 0.9065 0.8566 0.8877 0.9065
14630 312035 200 7 6.539195 0.7868 0.7265 0.789 0.789
14630 814096 200 10 8.760172 0.8752 0.8753 0.8939 0.8939
14653 5861635 181 16 17.15566 0.9234 0.8472 0.9175 0.9234
17088 312651 185 11 14.868235 0.9442 0.9164 0.9386 0.9442
17088 421943 183 13 17.302574 0.9275 0.8832 0.8942 0.9275
17360 20186 200 7 5.49414 0.7075 0.6269 0.8635 0.8635
17360 46875 200 6 7.693348 0.8215 0.8036 0.7969 0.8215
17899 48446 181 9 19.815704 0.9705 0.9529 0.8732 0.9705
17899 240218 183 11 25.197977 1 1 0.9705 1
17899 278949 186 11 16.119187 0.9469 0.9469 0.8923 0.9469
17900 61765 281 6 10.571156 0.9001 0.8839 0.9186 0.9186
17900 704167 283 7 12.319126 0.9334 0.8752 0.889 0.9334
18479 20186 200 7 5.050187 0.7643 0.725 0.943 0.943
18479 20673 200 8 6.559954 0.7844 0.6844 0.8581 0.8581
18479 46076 200 6 5.201231 0.6928 0.6786 0.8049 0.8049
18479 186330 200 6 4.240268 0.7143 0.6786 0.7672 0.7672
20186 22294 200 10 7.47135 0.8249 0.7917 0.8916 0.8916
20186 61792 200 7 4.368808 0.6572 0.582 0.9001 0.9001
20546 65374 275 6 12.661821 0.8668 0.8176 0.8329 0.8668
20546 96104 268 7 15.338422 0.9542 0.929 0.9022 0.9542
20546 215009 276 4 9.368388 0.815 0.778 0.8823 0.8823
20673 46076 200 8 7.920879 0.7691 0.7306 0.8383 0.8383
20673 269078 200 10 6.990901 0.7383 0.6537 0.8597 0.8597
20673 312035 200 8 7.867797 0.6997 0.6612 0.7767 0.7767
20826 29965 265 11 13.488412 0.964 0.944 0.94 0.964
21119 96104 264 3 9.241652 0.908 0.908 0.876 0.908
21433 418948 279 15 20.127186 0.9778 0.9593 0.9354 0.9778
21433 5823851 280 16 20.101863 0.9556 0.9519 0.9408 0.9556
22294 25575 200 9 13.086865 0.9001 0.9001 0.9166 0.9166
22294 46076 200 9 7.57941 0.8084 0.7833 0.9213 0.9213
23038 47793 239 4 10.481198 0.922 0.7959 0.8902 0.922
23038 68416 253 3 15.595265 0.9001 0.8333 0.7505 0.9001
23038 83261 256 6 16.008689 0.9458 0.9083 0.8924 0.9458
23222 430894 185 9 17.0746 0.9386 0.9276 0.8714 0.9386
23222 849002 185 13 19.454839 0.9443 0.9498 0.8775 0.9498
25575 47456 200 8 12.532419 0.8714 0.8787 0.8429 0.8787
25575 275169 200 9 9.10123 0.8786 0.8287 0.8897 0.8897
25936 94142 200 6 10.932643 0.8859 0.8716 0.7769 0.8859
28718 8564 183 8 22.416316 0.9293 0.941 0.9234 0.941
28718 849002 177 15 22.377946 0.941 0.9469 0.941 0.9469
29041 7393 179 5 13.936199 0.9587 0.9587 0.8855 0.9587
29965 68416 272 8 17.095964 0.9617 0.9694 0.9544 0.9694
40378 849002 170 16 22.142952 1 0.969 0.9752 1
40392 61765 242 13 21.213909 1 0.9375 0.9874 1
40392 789236 241 16 22.554486 1 0.9874 0.9916 1
40398 421943 190 12 25.392493 1 1 0.9833 1
40398 849002 190 16 22.756264 1 0.9776 1 1
40622 8292 188 7 17.854915 0.9386 0.9499 0.8469 0.9499
40622 69440 183 4 12.643865 0.9587 0.9117 0.9353 0.9587
42345 56702 256 4 16.739267 0.936 0.864 0.8612 0.936
43404 47186 169 11 18.365184 0.9253 0.9316 0.9223 0.9316
43404 849002 169 15 19.403904 0.9752 0.9752 0.919 0.9752
43404 5861635 172 14 17.636264 0.9751 0.9626 0.919 0.9751
45502 5838985 280 12 21.790598 0.9704 0.9815 0.9519 0.9815
46076 61792 200 6 4.968967 0.6928 0.6069 0.767 0.767
46875 61792 200 6 6.556037 0.7733 0.755 0.7866 0.7866
47418 814096 200 7 6.012111 0.8132 0.7199 0.882 0.882
47456 497201 200 10 8.467239 0.8691 0.8306 0.8376 0.8691
48446 69440 180 5 12.020613 0.9529 0.9146 0.8768 0.9529
49502 56994 276 8 20.808216 0.9741 0.9815 0.9485 0.9815
52622 60974 278 11 17.91339 0.9778 0.9593 0.9797 0.9797
56994 96104 270 9 17.2761 0.958 0.927 0.9294 0.958
58108 85489 200 8 9.964476 0.8573 0.8502 0.8072 0.8573
60532 8567 185 7 20.390316 0.9387 0.9526 0.9221 0.9526
60741 849002 189 17 19.155203 0.9665 0.9609 0.9165 0.9665
60974 789236 285 17 22.318235 0.9712 0.9713 0.9697 0.9713
61792 814096 200 9 6.796871 0.8934 0.8467 0.8999 0.8999
69753 85489 200 8 13.180523 0.9132 0.8999 0.8532 0.9132
72031 849002 179 18 21.460154 0.9882 0.9882 0.8763 0.9882
83261 88266 276 5 19.031703 0.9271 0.9271 0.8796 0.9271
85489 90753 200 9 11.796554 0.8668 0.8867 0.8265 0.8867
100241 17900 276 5 11.722277 0.9271 0.8732 0.8545 0.9271
106614 114202 180 2 4.787238 0.753 0.6807 0.6088 0.753
108204 2631 275 4 14.836963 0.9075 0.8895 0.8098 0.9075
108204 42345 255 2 11.665873 0.82 0.7687 0.6736 0.82
114202 58074 175 7 14.48158 0.9752 0.944 0.8979 0.9752
114202 190888 173 8 15.84069 0.9439 0.9501 0.8664 0.9501
114857 40392 218 11 23.677141 1 1 0.9952 1
114857 312807 256 8 17.356788 0.95 0.9584 0.8957 0.9584
114857 535947 255 10 20.287722 0.9499 0.9479 0.9164 0.9499
114857 789236 259 15 20.311252 0.968 0.98 0.9579 0.98
123412 17899 142 4 19.850449 0.9615 0.9578 0.8768 0.9615
123412 233389 152 9 24.321307 0.9715 0.9715 0.9177 0.9715
123412 325847 140 5 15.061287 0.9538 0.9424 0.8802 0.9538
123906 2592 187 5 16.728951 0.9499 0.9442 0.9222 0.9499
123906 463854 186 4 12.101879 0.9219 0.8722 0.8665 0.9219
124388 7393 176 8 20.277789 0.9587 0.9705 0.9193 0.9705
124388 23112 187 10 19.279275 0.9721 0.9553 0.8426 0.9721
127007 17900 280 10 19.969902 0.9667 0.9556 0.9368 0.9667
127749 72031 175 13 22.771321 0.9564 0.944 0.8878 0.9564
135724 2631 273 5 16.196168 0.9704 0.9427 0.9334 0.9704
137433 449165 54 10 25.058798 1 1 0.875 1
138526 2597 276 3 9.441735 0.9464 0.9386 0.931 0.9464
138526 2631 282 6 13.620904 0.9112 0.8636 0.9223 0.9223
138526 789236 280 15 24.749013 0.9667 0.9704 0.9611 0.9704
139208 23038 256 4 8.823366 0.9168 0.8459 0.9416 0.9416
173089 40398 186 11 22.116641 1 0.9764 0.9941 1
173089 524208 175 9 24.781624 0.9565 0.9316 0.9065 0.9565
181232 215009 279 13 17.714288 0.9408 0.9125 0.9186 0.9408
181232 257734 276 11 22.981654 0.9963 0.9963 0.963 0.9963
181232 789236 284 15 24.482248 0.9712 0.9784 0.9768 0.9784
186330 46076 200 6 5.102811 0.6929 0.6714 0.7547 0.7547
186330 94142 200 9 11.952712 0.9144 0.8644 0.793 0.9144
186330 195558 200 9 12.166242 0.9153 0.9461 0.923 0.9461
186330 300332 200 8 6.698188 0.7931 0.7801 0.7695 0.7931
186330 314499 200 10 7.245434 0.8859 0.886 0.858 0.886
190005 58074 176 15 20.615594 0.9882 0.9823 0.9529 0.9882
190005 72031 178 17 19.251093 0.9469 0.941 0.9177 0.9469
190005 287061 183 13 18.581909 0.9276 0.9386 0.8665 0.9386
190005 454516 180 14 13.772803 0.911 0.9221 0.8941 0.9221
190005 849002 181 3 21.024926 0.972 0.9553 0.9407 0.972
190005 5861635 184 18 22.439759 0.9776 0.9608 0.922 0.9776
194915 67777 200 9 8.790863 0.9066 0.8733 0.9473 0.9473
194915 194927 200 2 2.30485 0.6189 0.6627 0.7127 0.7127
194915 324745 200 6 2.858575 0.7334 0.6218 0.8698 0.8698
194927 20186 200 3 2.779625 0.6733 0.6883 0.8866 0.8866
194927 46875 200 6 6.176581 0.7252 0.7127 0.8253 0.8253
194927 61792 200 6 3.833458 0.7467 0.7549 0.7868 0.7868
194927 299104 200 6 4.468027 0.8468 0.76 0.7825 0.8468
194927 312035 200 6 5.343655 0.7999 0.7533 0.7643 0.7999
203793 7393 177 4 11.823288 0.9059 0.9176 0.8628 0.9176
203793 28718 179 6 20.362148 0.9588 0.9351 0.9294 0.9588
203793 81066 187 4 15.926914 0.9165 0.8999 0.8637 0.9165
203793 86383 179 9 18.326251 0.941 0.9469 0.853 0.9469
203793 204402 184 5 19.068405 0.9832 0.9665 0.9276 0.9832
204402 7393 175 5 14.771864 0.9646 0.947 0.9117 0.9646
204402 29041 186 4 10.037915 0.9833 0.9387 0.8975 0.9833
204402 72031 179 10 18.370387 1 0.9941 0.9293 1
204402 287061 184 8 21.580647 0.9665 0.9776 0.9277 0.9776
205242 463854 185 3 13.828423 0.9166 0.8776 0.8565 0.9166
210192 8564 185 7 16.363127 0.9529 0.9352 0.9015 0.9529
210192 520393 180 12 14.85061 0.9705 0.9823 0.9385 0.9823
211244 224533 275 7 16.90307 0.9445 0.8668 0.9004 0.9445
215009 61765 278 9 9.869783 0.8964 0.79 0.9338 0.9338
215009 418948 279 14 13.200056 0.9667 0.8964 0.9371 0.9667
217155 3093 173 6 17.236737 0.947 0.9266 0.825 0.947
222417 472203 269 12 15.997784 0.9693 0.9425 0.9502 0.9693
224533 25321 266 9 19.617231 0.9693 0.9655 0.9194 0.9693
224533 83261 282 2 15.388378 0.9223 0.9001 0.8633 0.9223
224533 88266 280 5 18.404896 0.9556 0.8927 0.8464 0.9556
233389 86383 185 10 25.929839 0.9764 0.9882 0.9411 0.9882
233389 458776 194 2 8.298795 0.8943 0.8584 0.8499 0.8943
233389 849002 190 14 23.535371 0.9776 0.9721 0.9504 0.9776
234662 2597 248 8 21.427624 0.9708 0.9665 0.9339 0.9708
234662 52622 245 8 20.27262 0.9707 0.9666 0.9459 0.9707
234662 5823851 255 9 20.747704 0.964 0.968 0.9343 0.968
238688 56994 284 4 9.982394 0.9038 0.9037 0.8888 0.9038
238688 57037 277 5 17.413948 0.963 0.9371 0.8998 0.963
240218 271300 186 13 28.626135 0.9944 0.9888 0.9664 0.9944
240218 325847 181 12 22.655347 0.9941 0.9882 0.9175 0.9941
240218 474717 185 3 3.66868 0.6221 0.5806 0.6556 0.6556
240790 47793 266 10 17.338881 0.9809 0.9732 0.931 0.9809
261259 8564 187 6 21.037791 0.9498 0.9249 0.8692 0.9498
261259 60532 184 7 21.228216 0.9332 0.922 0.922 0.9332
261259 81066 184 6 19.377676 0.9331 0.911 0.8896 0.9331
261990 8564 183 6 23.098271 0.9646 0.9587 0.9058 0.9646
263248 5861635 186 15 18.974992 0.9441 0.9165 0.9331 0.9441
266541 60741 194 15 18.240194 0.9832 0.9054 0.9165 0.9832
266541 278949 193 14 18.787951 0.9665 0.9471 0.8721 0.9665
266541 301161 185 14 26.270622 0.9176 0.8843 0.9059 0.9176
266541 5861635 190 17 17.554899 0.9442 0.9054 0.9164 0.9442
268608 49870 182 10 27.149274 0.9665 0.9555 0.8948 0.9665
269078 46076 200 8 8.081078 0.8144 0.7429 0.886 0.886
269078 324745 200 8 6.001252 0.8215 0.75 0.8716 0.8716
271300 49870 183 12 17.146705 0.8945 0.9027 0.8415 0.9027
271300 849002 183 17 20.576227 0.9164 0.9165 0.9053 0.9165
271300 5861635 186 16 19.024593 0.9443 0.936 0.8999 0.9443
275733 58074 178 10 21.881103 0.9882 0.9647 0.9529 0.9882
278949 40348 196 12 24.880203 1 0.9944 1 1
278949 849002 188 16 23.232176 0.9609 0.9442 0.9318 0.9609
280052 83450 164 11 22.82512 0.9252 0.9377 0.8628 0.9377
280052 325847 179 11 12.462949 0.8763 0.8471 0.8412 0.8763
299104 46076 200 6 5.118133 0.7787 0.7107 0.7295 0.7787
299104 58108 200 9 9.436247 0.9073 0.8645 0.8215 0.9073
299104 312035 200 6 4.72939 0.7 0.6642 0.6785 0.7
300332 85489 200 5 10.500182 0.8465 0.8333 0.689 0.8465
301161 849002 180 16 27.268891 0.9351 0.9351 0.9264 0.9351
303829 789236 279 16 28.901195 1 0.9926 0.9741 1
312651 49870 184 13 22.289205 0.9665 0.9555 0.9387 0.9665
312651 5861635 187 17 22.551089 0.9776 0.972 0.9029 0.9776
312807 449927 277 7 16.34671 0.9618 0.9578 0.8868 0.9618
316970 85489 200 9 9.31842 0.8502 0.8502 0.8366 0.8502
319115 472203 277 15 13.272473 0.9732 0.9348 0.9502 0.9732
324745 61792 200 6 2.943728 0.6999 0.6784 0.8643 0.8643
324745 85489 200 7 7.537932 0.8644 0.8143 0.9144 0.9144
325847 8564 184 9 17.069471 0.9292 0.9176 0.8707 0.9292
325847 5861635 181 15 15.542789 0.9411 0.8824 0.9057 0.9411
332386 61792 200 8 4.989915 0.8399 0.8399 0.6734 0.8399
332386 85489 200 7 7.637073 0.8335 0.86 0.8199 0.86
344007 47793 264 9 18.789578 0.9732 0.9656 0.9117 0.9732
344007 789236 283 14 24.048275 0.9713 0.9785 0.982 0.982
364836 71892 181 14 12.020263 0.8691 0.8768 0.8119 0.8768
378028 5841153 284 3 7.917506 0.667 0.589 0.5929 0.667
406522 85489 200 7 7.192182 0.8716 0.8501 0.8858 0.8858
415500 454516 182 12 18.515327 0.9667 0.9721 0.8974 0.9721
415500 5861635 186 14 16.567769 0.961 0.95 0.9331 0.961
418948 71432 241 12 24.296459 0.9567 0.9654 0.9307 0.9654
418948 789236 284 16 25.554202 0.9784 0.9856 0.9661 0.9856
421943 789236 271 16 23.507444 0.9741 0.9741 0.9704 0.9741
449927 789236 280 14 20.396345 0.963 0.9704 0.9778 0.9778
458776 8564 192 7 23.53972 0.9442 0.9387 0.8276 0.9442
458776 81066 189 7 23.041299 0.9443 0.9499 0.9083 0.9499
458776 849002 186 14 25.059444 0.9609 0.9526 0.9219 0.9609
458776 5861635 189 13 22.064432 0.9609 0.9443 0.9219 0.9609
463854 58074 177 9 18.121832 0.9882 0.9705 0.9352 0.9882
472203 57037 273 10 17.864909 0.9732 0.9694 0.929 0.9732
472203 71432 236 13 16.939539 0.9592 0.9502 0.9454 0.9592
472203 789236 279 17 20.809081 0.9741 0.9593 0.9778 0.9778
474717 849002 182 17 28.053523 0.9888 0.9888 0.9442 0.9888
520393 849002 183 16 21.303591 0.9721 0.9609 0.9498 0.9721
535947 57037 272 8 24.040567 0.9655 0.9484 0.9232 0.9655
592118 68416 259 3 12.131505 0.796 0.764 0.6374 0.796
1155181 2597 275 12 12.186778 0.9578 0.9616 0.9291 0.9616
1155181 5560 269 14 21.014174 0.981 0.9809 0.96 0.981
1155181 29965 274 12 14.694333 0.9694 0.9655 0.9406 0.9694
1155181 40392 238 14 24.871264 1 0.9784 0.9979 1
1155181 138526 277 13 12.020397 0.9657 0.9579 0.8907 0.9657
1155181 789236 279 16 21.04015 0.9704 0.9741 0.9539 0.9741
5823851 789236 285 17 26.625303 0.982 0.9749 0.982 0.982
5838985 789236 278 15 25.414314 0.9963 1 0.9704 1
5861635 60741 192 6 9.792075 0.9055 0.7445 0.9555 0.9555
5861635 72031 182 17 18.055396 0.9705 0.9823 0.8941 0.9823
5861635 849002 185 19 20.011387 0.9497 0.9609 0.9053 0.9609
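In the rows listed above, the MAA value appears to coincide with the highest of the three classifier accuracies on the same line. The sketch below, written in Python, checks this observation on a few rows copied verbatim from the table and groups the same rows by the first decimal of their SVM accuracy (the difficulty measure used for this collection). The function names and the sample list are hypothetical, and the check is an observation about the table, not a definition taken from the cited papers.

```python
from collections import Counter

# A few rows copied verbatim from the table above.
SAMPLE_ROWS = [
    "1092 1110 269 3 12.551497 0.95 0.9231 0.9426 0.95",
    "10341 14525 200 2 2.288228 0.6267 0.6468 0.6333 0.6468",
    "10341 14271 200 3 1.634875 0.7305 0.7132 0.8164 0.8164",
    "17899 240218 183 11 25.197977 1 1 0.9705 1",
]


def maa_matches_best(line: str) -> bool:
    """True if the MAA column equals the best of SVM, C4.5 and KNN."""
    fields = line.split()
    svm, c45, knn, maa = (float(x) for x in fields[5:9])
    return maa == max(svm, c45, knn)


def svm_tenth(line: str) -> int:
    """SVM accuracy truncated to tenths (6 for 0.6x, ..., 10 for 1.0).

    The value is scaled to an integer first, so that floating-point
    rounding cannot push it across a bin boundary.
    """
    svm = float(line.split()[5])
    return round(svm * 10000) // 1000


print(all(maa_matches_best(r) for r in SAMPLE_ROWS))
print(Counter(svm_tenth(r) for r in SAMPLE_ROWS))
```

Running the same two functions over all lines of the table would show how the datasets spread over the 0.6 to 1.0 difficulty range.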


Mailing list

To receive periodic updates and to participate in discussions on TechTC, please subscribe to the TechTC mailing list at http://groups.yahoo.com/group/techtc.

Questions?

If you have questions or comments, please post them to the mailing list (see above), or email me directly at gabr@cs.technion.ac.il.

References

  1. Dmitry Davidov, Evgeniy Gabrilovich, and Shaul Markovitch
    "Parameterized Generation of Labeled Datasets for Text Categorization Based on a Hierarchical Directory"
    The 27th Annual International ACM SIGIR Conference, pp. 250-257, Sheffield, UK, July 2004
    [Abstract / PDF]

  2. Evgeniy Gabrilovich and Shaul Markovitch
    "Text Categorization with Many Redundant Features: Using Aggressive Feature Selection to Make SVMs Competitive with C4.5"
    The 21st International Conference on Machine Learning (ICML), pp. 321-328, Banff, Alberta, Canada, July 2004
    [Abstract / PDF]

Additional publications

If you are using this test collection and want your article(s) listed here, please email me at gabr@cs.technion.ac.il.

  1. Sugiyama, M., Ide, T., Nakajima, S., & Sese, J.
    "Semi-supervised local Fisher discriminant analysis for dimensionality reduction"
    Machine Learning, 2010

Evgeniy Gabrilovich
gabr@cs.technion.ac.il

Last updated on August 24, 2011