twscrape-prepared-regression-e5-base-4k-3epochs
This model is a fine-tuned version of dwzhu/e5-base-4k on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4350
- Mse: 0.0003
- Target 0 Mse: 0.0009
- Target 1 Mse: 0.0003
- Target 2 Mse: 0.0001
- Target 3 Mse: 0.0000
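The card does not say how the overall Mse relates to the per-target values. A minimal sketch, assuming the overall figure is the unweighted mean of the four per-target MSEs: the helper `per_target_mse` and the toy data are hypothetical, and only the four rounded numbers in `reported_per_target` come from this card.

```python
from statistics import mean


def per_target_mse(predictions, targets):
    """Return one MSE per target column for row-aligned predictions and targets."""
    num_targets = len(targets[0])
    return [
        mean((p[j] - t[j]) ** 2 for p, t in zip(predictions, targets))
        for j in range(num_targets)
    ]


# Hypothetical toy data: two examples with four targets each (not model output).
toy_predictions = [[0.10, 0.20, 0.30, 0.40], [0.50, 0.60, 0.70, 0.80]]
toy_targets = [[0.10, 0.25, 0.30, 0.40], [0.40, 0.60, 0.70, 0.80]]
toy_mse = per_target_mse(toy_predictions, toy_targets)

# Rounded per-target MSEs as reported on the evaluation set in this card.
reported_per_target = [0.0009, 0.0003, 0.0001, 0.0000]

# Assumption: the overall Mse is the unweighted mean over targets.
macro_mse = mean(reported_per_target)
print(f"macro MSE from rounded per-target values: {macro_mse:.6f}")
```

Under that assumption the mean rounds to the reported 0.0003, so the numbers are at least consistent with an unweighted average. Because the inputs are already rounded to four decimals, this is a plausibility check and not a reconstruction of the exact metric.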
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 256
- total_eval_batch_size: 256
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
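To make the batch size and schedule settings concrete, here is a rough pure-Python sketch of a linear warmup followed by a half-cosine decay. It assumes no gradient accumulation (none is listed), that warmup steps are the warmup ratio times the total step count rounded up, and that the total step count is the 4767 shown in the training results. The function `lr_at` is a hypothetical helper, not part of any library.

```python
import math

BASE_LR = 1e-05
PER_DEVICE_BATCH = 32
NUM_DEVICES = 8
TOTAL_STEPS = 4767  # final step listed in the training results
WARMUP_RATIO = 0.1

# Effective batch size, assuming gradient accumulation of 1.
effective_batch = PER_DEVICE_BATCH * NUM_DEVICES

# Assumption: warmup length is ratio * total steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    progress = (step - warmup_steps) / (TOTAL_STEPS - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at(step):.3e}")
```

With these assumptions the effective batch size matches the listed total_train_batch_size of 256, the learning rate peaks at 1e-05 once warmup ends, and it decays to zero at the final step.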
Training results
| Training Loss | Epoch | Step | Validation Loss | Mse | Target 0 Mse | Target 1 Mse | Target 2 Mse | Target 3 Mse |
|---|---|---|---|---|---|---|---|---|
| 1.4561 | 1.0 | 1589 | 1.4744 | 0.0003 | 0.0010 | 0.0003 | 0.0001 | 0.0000 |
| 1.4648 | 2.0 | 3178 | 1.4401 | 0.0003 | 0.0009 | 0.0003 | 0.0001 | 0.0000 |
| 1.1685 | 3.0 | 4767 | 1.4350 | 0.0003 | 0.0009 | 0.0003 | 0.0001 | 0.0000 |
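Since the training dataset is not documented, the step counts give the only hint of its size. The sketch below is a back-of-the-envelope estimate, assuming one optimizer step per global batch of 256 (no gradient accumulation) and that a partial final batch is kept and not dropped. The variable names are illustrative.

```python
STEPS_PER_EPOCH = 1589  # step count at epoch 1.0
TOTAL_STEPS = 4767      # step count at epoch 3.0
TOTAL_BATCH = 256       # total_train_batch_size from the hyperparameters
NUM_EPOCHS = 3

# Sanity check: the per-epoch step count should account for all steps.
steps_match = STEPS_PER_EPOCH * NUM_EPOCHS == TOTAL_STEPS

# Upper bound: every step consumed a full global batch.
max_examples = STEPS_PER_EPOCH * TOTAL_BATCH

# Lower bound: the last step of the epoch held a single example.
min_examples = (STEPS_PER_EPOCH - 1) * TOTAL_BATCH + 1

print(steps_match, min_examples, max_examples)
```

Under these assumptions the training split would hold roughly 406,529 to 406,784 examples. Distributed samplers may pad or drop examples to balance devices, so treat the range as approximate.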
Framework versions
- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.21.0
Model tree for AlekseyKorshuk/twscrape-prepared-regression-e5-base-4k-3epochs
Base model
dwzhu/e5-base-4k