848717d1bb6dcb417941b32247e995ac
This model is a fine-tuned version of albert/albert-xxlarge-v1 on the nyu-mll/glue dataset. It achieves the following results on the evaluation set at the final logged epoch (epoch 11):
- Loss: 0.3848
- Data Size: 1.0
- Epoch Runtime: 307.6984
- Accuracy: 0.9088
- F1 Macro: 0.9087
- Rouge1: 0.9088
- Rouge2: 0.0
- Rougel: 0.9088
- Rougelsum: 0.9088
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
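As a quick consistency check on the list above, the following minimal sketch recomputes the total batch sizes from the per-device values and the device count. The `RunConfig` class and `learning_rate_at` helper are hypothetical names introduced only for illustration. The sketch assumes that `train_batch_size` and `eval_batch_size` are per-device values and that gradient accumulation is 1, since the card does not list it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container holding values copied from the hyperparameter list."""

    learning_rate: float = 5e-05
    train_batch_size: int = 8  # assumed to be per device
    eval_batch_size: int = 8  # assumed to be per device
    seed: int = 42
    num_devices: int = 4
    gradient_accumulation_steps: int = 1  # assumption: not listed in the card
    lr_scheduler_type: str = "constant"
    num_epochs: int = 50

    @property
    def total_train_batch_size(self) -> int:
        # Samples consumed per optimizer step across all devices.
        return (
            self.train_batch_size
            * self.num_devices
            * self.gradient_accumulation_steps
        )

    @property
    def total_eval_batch_size(self) -> int:
        # Samples evaluated per step across all devices.
        return self.eval_batch_size * self.num_devices


def learning_rate_at(step: int, config: RunConfig) -> float:
    """Learning rate at a given step under a constant schedule (no warmup, no decay)."""
    if config.lr_scheduler_type != "constant":
        raise ValueError("this sketch only covers the constant schedule")
    return config.learning_rate


config = RunConfig()
print(config.total_train_batch_size)  # 8 per device * 4 devices = 32
print(config.total_eval_batch_size)  # 8 per device * 4 devices = 32
print(learning_rate_at(10_000, config))
```

Under these assumptions the computed totals agree with the `total_train_batch_size` and `total_eval_batch_size` values reported in the list.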
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7221 | 0 | 4.0115 | 0.5048 | 0.3416 | 0.5046 | 0.0 | 0.5046 | 0.5044 |
| No log | 1 | 3273 | 0.4753 | 0.0078 | 6.2258 | 0.7871 | 0.7834 | 0.7869 | 0.0 | 0.7868 | 0.7875 |
| 0.0091 | 2 | 6546 | 0.3686 | 0.0156 | 8.3708 | 0.8386 | 0.8353 | 0.8386 | 0.0 | 0.8386 | 0.8386 |
| 0.3873 | 3 | 9819 | 0.2283 | 0.0312 | 12.7594 | 0.9204 | 0.9204 | 0.9204 | 0.0 | 0.9202 | 0.9204 |
| 0.2796 | 4 | 13092 | 0.2260 | 0.0625 | 21.0479 | 0.9175 | 0.9175 | 0.9175 | 0.0 | 0.9176 | 0.9176 |
| 0.2356 | 5 | 16365 | 0.1960 | 0.125 | 38.0719 | 0.9267 | 0.9266 | 0.9268 | 0.0 | 0.9267 | 0.9267 |
| 0.2192 | 6 | 19638 | 0.2178 | 0.25 | 70.9393 | 0.9226 | 0.9226 | 0.9230 | 0.0 | 0.9224 | 0.9228 |
| 0.19 | 7 | 22911 | 0.1886 | 0.5 | 153.8451 | 0.9278 | 0.9277 | 0.9278 | 0.0 | 0.9279 | 0.9278 |
| 0.1771 | 8 | 26184 | 0.1892 | 1.0 | 308.0908 | 0.9263 | 0.9263 | 0.9263 | 0.0 | 0.9263 | 0.9265 |
| 0.0974 | 9 | 29457 | 0.2689 | 1.0 | 307.7917 | 0.9187 | 0.9187 | 0.9187 | 0.0 | 0.9186 | 0.9189 |
| 0.0794 | 10 | 32730 | 0.2984 | 1.0 | 303.5691 | 0.9197 | 0.9196 | 0.9199 | 0.0 | 0.9195 | 0.9197 |
| 0.1164 | 11 | 36003 | 0.3848 | 1.0 | 307.6984 | 0.9088 | 0.9087 | 0.9088 | 0.0 | 0.9088 | 0.9088 |
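The table can be scanned programmatically to compare the final logged epoch with the best one. The sketch below copies the Epoch, Validation Loss, and Accuracy columns from the table. The `EpochLog` type and the helper functions are hypothetical names used only for this illustration.

```python
from typing import List, NamedTuple


class EpochLog(NamedTuple):
    epoch: int
    validation_loss: float
    accuracy: float


# Epoch, Validation Loss, and Accuracy columns copied from the table above.
LOGS: List[EpochLog] = [
    EpochLog(0, 0.7221, 0.5048),
    EpochLog(1, 0.4753, 0.7871),
    EpochLog(2, 0.3686, 0.8386),
    EpochLog(3, 0.2283, 0.9204),
    EpochLog(4, 0.2260, 0.9175),
    EpochLog(5, 0.1960, 0.9267),
    EpochLog(6, 0.2178, 0.9226),
    EpochLog(7, 0.1886, 0.9278),
    EpochLog(8, 0.1892, 0.9263),
    EpochLog(9, 0.2689, 0.9187),
    EpochLog(10, 0.2984, 0.9197),
    EpochLog(11, 0.3848, 0.9088),
]


def best_by_loss(logs: List[EpochLog]) -> EpochLog:
    """Return the row with the lowest validation loss."""
    return min(logs, key=lambda row: row.validation_loss)


def best_by_accuracy(logs: List[EpochLog]) -> EpochLog:
    """Return the row with the highest accuracy."""
    return max(logs, key=lambda row: row.accuracy)


final = LOGS[-1]
print("lowest validation loss:", best_by_loss(LOGS))
print("highest accuracy:", best_by_accuracy(LOGS))
print("final logged epoch:", final)
```

In this table, epoch 7 has both the lowest validation loss (0.1886) and the highest accuracy (0.9278), while the final logged epoch 11 reports a loss of 0.3848 and an accuracy of 0.9088. The card does not say which checkpoint was saved, so whether the published weights correspond to epoch 7 or epoch 11 remains an open question.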
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1
Model tree for contemmcm/848717d1bb6dcb417941b32247e995ac
Base model
albert/albert-xxlarge-v1