# 29cb06aeb56cbdecf76bdc4c3e179e78
This model is a fine-tuned version of albert/albert-xxlarge-v1 on the nyu-mll/glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.5375
- Data Size: 1.0
- Epoch Runtime: 946.5174
- Accuracy: 0.8565
- F1 Macro: 0.8562
- Rouge1: 0.8563
- Rouge2: 0.0
- Rougel: 0.8565
- Rougelsum: 0.8565
## Model description
More information needed
## Intended uses & limitations
More information needed
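As a hedged starting point, the sketch below loads the checkpoint for sequence classification with 🤗 Transformers. The task, label mapping, and example sentence pair are assumptions for illustration, since none of them are documented in this card.

```python
# Hedged usage sketch: assumes this checkpoint carries a sequence-classification head
# fine-tuned on a GLUE task; labels and example inputs below are NOT documented in the card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "contemmcm/29cb06aeb56cbdecf76bdc4c3e179e78"  # repo id from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# GLUE tasks are single-sentence or sentence-pair classification; a pair is shown here.
inputs = tokenizer(
    "A man inspects the uniform of a figure.",
    "The man is sleeping.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()
print(predicted_class, model.config.id2label.get(predicted_class, predicted_class))
```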
## Training and evaluation data
More information needed
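The card names the nyu-mll/glue dataset but not which GLUE configuration was used for fine-tuning. A minimal loading sketch with 🤗 Datasets, using "mnli" purely as a placeholder config, would look like this:

```python
# Hedged sketch: "mnli" is a placeholder; the actual GLUE config used here is not documented.
from datasets import load_dataset

glue = load_dataset("nyu-mll/glue", "mnli")
print(glue)              # DatasetDict with the available splits
print(glue["train"][0])  # inspect one example (for MNLI: premise, hypothesis, label)
```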
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (mirrored in the sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
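Expressed as 🤗 Transformers `TrainingArguments`, these settings correspond roughly to the sketch below. The output directory is a hypothetical placeholder, and the 4-GPU distributed setup comes from the launcher (e.g. `torchrun`) rather than from these arguments.

```python
# Hedged sketch of TrainingArguments mirroring the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="albert-xxlarge-v1-glue",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,        # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```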
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.3291 | 0 | 6.0977 | 0.3167 | 0.2066 | 0.3170 | 0.0 | 0.3167 | 0.3168 |
| 1.0205 | 1 | 12271 | 0.6707 | 0.0078 | 13.6475 | 0.7407 | 0.7360 | 0.7408 | 0.0 | 0.7409 | 0.7405 |
| 0.4835 | 2 | 24542 | 0.4064 | 0.0156 | 20.9804 | 0.8509 | 0.8499 | 0.8508 | 0.0 | 0.8509 | 0.8510 |
| 0.4203 | 3 | 36813 | 0.3873 | 0.0312 | 35.9227 | 0.8565 | 0.8558 | 0.8566 | 0.0 | 0.8565 | 0.8566 |
| 0.4056 | 4 | 49084 | 0.3664 | 0.0625 | 65.0683 | 0.8628 | 0.8626 | 0.8626 | 0.0 | 0.8627 | 0.8627 |
| 0.3384 | 5 | 61355 | 0.3593 | 0.125 | 123.7432 | 0.8647 | 0.8641 | 0.8646 | 0.0 | 0.8647 | 0.8647 |
| 0.3534 | 6 | 73626 | 0.3995 | 0.25 | 241.1688 | 0.8552 | 0.8547 | 0.8550 | 0.0 | 0.8552 | 0.8554 |
| 0.3062 | 7 | 85897 | 0.3659 | 0.5 | 476.0455 | 0.8677 | 0.8674 | 0.8674 | 0.0 | 0.8675 | 0.8676 |
| 0.252 | 8 | 98168 | 0.3545 | 1.0 | 948.0369 | 0.8707 | 0.8705 | 0.8704 | 0.0 | 0.8705 | 0.8707 |
| 0.2087 | 9 | 110439 | 0.3708 | 1.0 | 946.9071 | 0.8664 | 0.8659 | 0.8663 | 0.0 | 0.8665 | 0.8664 |
| 0.1756 | 10 | 122710 | 0.4547 | 1.0 | 947.1643 | 0.8609 | 0.8604 | 0.8606 | 0.0 | 0.8606 | 0.8607 |
| 0.1215 | 11 | 134981 | 0.5158 | 1.0 | 945.5557 | 0.8592 | 0.8585 | 0.8589 | 0.0 | 0.8592 | 0.8591 |
| 0.1242 | 12 | 147252 | 0.5375 | 1.0 | 946.5174 | 0.8565 | 0.8562 | 0.8563 | 0.0 | 0.8565 | 0.8565 |
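For reference, the Accuracy and F1 Macro columns can be computed with the `evaluate` library as sketched below. This is an assumption about the metric implementation, shown with toy inputs, not the exact evaluation script used for this card.

```python
# Hedged sketch of how Accuracy and F1 Macro could be computed from predictions and labels.
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

predictions = [0, 1, 2, 1]  # toy predictions
references = [0, 1, 1, 1]   # toy labels

print(accuracy.compute(predictions=predictions, references=references))
print(f1.compute(predictions=predictions, references=references, average="macro"))
```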
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1