| Dataset | Classes | Train samples | Test samples | Source |
|---|---|---|---|---|
| IMDb | 2 | 25 000 | 25 000 | link |
| AG’s News | 4 | 120 000 | 7 600 | link |
| Sogou News | 5 | 450 000 | 60 000 | link |
| DBPedia | 14 | 560 000 | 70 000 | link |
| Yelp Review Polarity | 2 | 560 000 | 38 000 | link |
| Yelp Review Full | 5 | 650 000 | 50 000 | link |
| Yahoo! Answers | 10 | 1 400 000 | 60 000 | link |
| Amazon Review Full | 5 | 3 000 000 | 650 000 | link |
| Amazon Review Polarity | 2 | 3 600 000 | 400 000 | link |

- [1]: CNN: Character-level convolutional networks for text classification (paper)
- [2]: VDCNN: Very deep convolutional networks for text classification (paper)
- [3]: HAN: Hierarchical Attention Networks for Document Classification (paper), all credit goes to @cedias
- [4]: Transformer Encoder: Attention Is All You Need (encoder part) (paper), credit to Yu-Hsiang Huang's work

HAN word (red) and sentence (blue) attention weights at prediction:
Results are reported as follows: (i) / (ii), shown in the tables as the paper accuracy and repo accuracy columns.
- (i): Test set accuracy reported by the paper
- (ii): Test set accuracy reproduced here

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | | |
| VDCNN 9 layers | | |
| VDCNN 17 layers | | |
| VDCNN 29 layers | | |
| HAN | 90.5 | |
| Transformer | 88.6 | |

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | 84.35 | 88.30 |
| VDCNN 9 layers | 90.17 | 89.22 |
| VDCNN 17 layers | 90.61 | 90.00 |
| VDCNN 29 layers | 91.27 | 90.43 |
| HAN | 92.4 | |
| Transformer | 93.2 | |

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | 91.35 | 93.53 |
| VDCNN 9 layers | 96.42 | 93.50 |
| VDCNN 17 layers | 96.49 | |
| VDCNN 29 layers | 96.64 | 87.90 |
| HAN | 96. | |
| Transformer | 95.6 | |

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | 98.02 | 98.15 |
| VDCNN 9 layers | 98.75 | 98.35 |
| VDCNN 17 layers | 98.02 | 98.15 |
| VDCNN 29 layers | 98.71 | |
| HAN | 99.0 | |
| Transformer | 98.7 | |

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | | |
| VDCNN 9 layers | 94.73 | 93.97 |
| VDCNN 17 layers | 94.95 | 94.73 |
| VDCNN 29 layers | 95.72 | 94.75 |
| HAN | | |

| Model | paper accuracy | repo accuracy |
|---|---|---|
| CNN small | | |
| VDCNN 9 layers | 61.96 | 61.18 |
| VDCNN 17 layers | 62.59 | |
| VDCNN 29 layers | 64.26 | 62.73 |
| HAN | 63. | |

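To read the tables at a glance, it can help to compute the gap between the reproduced and the published accuracy for each model. The snippet below is a minimal sketch of that comparison, using only the rows of the second results table above that have both numbers filled in. The names `results`, `accuracy_gap` and `gaps` are illustrative and are not part of this repository.

```python
# Accuracies copied from the second results table above.
# Each entry maps a model name to (paper accuracy, repo accuracy).
results = {
    "CNN small": (84.35, 88.30),
    "VDCNN 9 layers": (90.17, 89.22),
    "VDCNN 17 layers": (90.61, 90.00),
    "VDCNN 29 layers": (91.27, 90.43),
}


def accuracy_gap(paper, repo):
    """Return repo accuracy minus paper accuracy, rounded to 2 decimals."""
    return round(repo - paper, 2)


# Hypothetical helper structure: model name -> gap in accuracy points.
gaps = {model: accuracy_gap(paper, repo) for model, (paper, repo) in results.items()}

for model, gap in gaps.items():
    # A positive gap means the reproduction scored above the paper.
    print(f"{model}: {gap:+.2f}")
```

A positive value means the reproduction scored above the number reported by the paper, a negative value means it fell short.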