End of training

Files changed (5)

README.md +66 -46
config.json +0 -1
model.safetensors +1 -1
tokenizer_config.json +3 -1
training_args.bin +2 -2

README.md CHANGED

@@ -2,12 +2,29 @@
 license: apache-2.0
 base_model: distilbert/distilbert-base-uncased
 tags:
 - generated_from_trainer
 datasets:
 - hdfs_rlhf_log_summary_dataset
 model-index:
 - name: log_sage_reward_model
-  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,7 +34,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the hdfs_rlhf_log_summary_dataset dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0005
 ## Model description
@@ -37,57 +55,59 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 1.41e-05
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 40
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 11   | 0.0022          |
-| No log        | 2.0   | 22   | 0.0049          |
-| No log        | 3.0   | 33   | 0.0006          |
-| No log        | 4.0   | 44   | 0.0006          |
-| No log        | 5.0   | 55   | 0.0008          |
-| No log        | 6.0   | 66   | 0.0003          |
-| No log        | 7.0   | 77   | 0.0005          |
-| No log        | 8.0   | 88   | 0.0010          |
-| No log        | 9.0   | 99   | 0.0008          |
-| No log        | 10.0  | 110  | 0.0007          |
-| No log        | 11.0  | 121  | 0.0007          |
-| No log        | 12.0  | 132  | 0.0006          |
-| No log        | 13.0  | 143  | 0.0006          |
-| No log        | 14.0  | 154  | 0.0004          |
-| No log        | 15.0  | 165  | 0.0007          |
-| No log        | 16.0  | 176  | 0.0007          |
-| No log        | 17.0  | 187  | 0.0006          |
-| No log        | 18.0  | 198  | 0.0004          |
-| No log        | 19.0  | 209  | 0.0005          |
-| No log        | 20.0  | 220  | 0.0006          |
-| No log        | 21.0  | 231  | 0.0006          |
-| No log        | 22.0  | 242  | 0.0006          |
-| No log        | 23.0  | 253  | 0.0009          |
-| No log        | 24.0  | 264  | 0.0006          |
-| No log        | 25.0  | 275  | 0.0007          |
-| No log        | 26.0  | 286  | 0.0005          |
-| No log        | 27.0  | 297  | 0.0005          |
-| No log        | 28.0  | 308  | 0.0004          |
-| No log        | 29.0  | 319  | 0.0004          |
-| No log        | 30.0  | 330  | 0.0005          |
-| No log        | 31.0  | 341  | 0.0005          |
-| No log        | 32.0  | 352  | 0.0005          |
-| No log        | 33.0  | 363  | 0.0005          |
-| No log        | 34.0  | 374  | 0.0004          |
-| No log        | 35.0  | 385  | 0.0004          |
-| No log        | 36.0  | 396  | 0.0005          |
-| No log        | 37.0  | 407  | 0.0005          |
-| No log        | 38.0  | 418  | 0.0005          |
-| No log        | 39.0  | 429  | 0.0005          |
-| No log        | 40.0  | 440  | 0.0005          |
 ### Framework versions

 license: apache-2.0
 base_model: distilbert/distilbert-base-uncased
 tags:
+- trl
+- reward-trainer
 - generated_from_trainer
 datasets:
 - hdfs_rlhf_log_summary_dataset
+metrics:
+- accuracy
 model-index:
 - name: log_sage_reward_model
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: hdfs_rlhf_log_summary_dataset
+      type: hdfs_rlhf_log_summary_dataset
+      config: default
+      split: None
+      args: default
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 1.0
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the hdfs_rlhf_log_summary_dataset dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1669
+- Accuracy: 1.0
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 1.41e-05
+- train_batch_size: 6
+- eval_batch_size: 24
 - seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 96
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 40
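
The `total_train_batch_size` of 96 in the new hyperparameters is consistent with the per-device batch size multiplied by the gradient accumulation steps. A minimal sketch of that arithmetic, assuming a single device (the card does not state the device count) and a hypothetical helper name:

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not list it.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values from the updated card: train_batch_size=6, gradient_accumulation_steps=16
print(effective_batch_size(6, 16))  # 96
```

In the new training results table the step count equals the epoch count, so there is one optimizer step per epoch. That suggests the training set is small relative to the effective batch size, although the card does not state the dataset size.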
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| No log        | 1.0   | 1    | 0.6950          | 0.5      |
+| No log        | 2.0   | 2    | 0.6896          | 1.0      |
+| No log        | 3.0   | 3    | 0.6843          | 1.0      |
+| No log        | 4.0   | 4    | 0.6789          | 1.0      |
+| No log        | 5.0   | 5    | 0.6735          | 1.0      |
+| No log        | 6.0   | 6    | 0.6671          | 1.0      |
+| No log        | 7.0   | 7    | 0.6597          | 1.0      |
+| No log        | 8.0   | 8    | 0.6510          | 1.0      |
+| No log        | 9.0   | 9    | 0.6403          | 1.0      |
+| 0.0839        | 10.0  | 10   | 0.6275          | 1.0      |
+| 0.0839        | 11.0  | 11   | 0.6130          | 1.0      |
+| 0.0839        | 12.0  | 12   | 0.5955          | 1.0      |
+| 0.0839        | 13.0  | 13   | 0.5747          | 1.0      |
+| 0.0839        | 14.0  | 14   | 0.5508          | 1.0      |
+| 0.0839        | 15.0  | 15   | 0.5250          | 1.0      |
+| 0.0839        | 16.0  | 16   | 0.4984          | 1.0      |
+| 0.0839        | 17.0  | 17   | 0.4698          | 1.0      |
+| 0.0839        | 18.0  | 18   | 0.4413          | 1.0      |
+| 0.0839        | 19.0  | 19   | 0.4121          | 1.0      |
+| 0.0658        | 20.0  | 20   | 0.3850          | 1.0      |
+| 0.0658        | 21.0  | 21   | 0.3604          | 1.0      |
+| 0.0658        | 22.0  | 22   | 0.3384          | 1.0      |
+| 0.0658        | 23.0  | 23   | 0.3186          | 1.0      |
+| 0.0658        | 24.0  | 24   | 0.2995          | 1.0      |
+| 0.0658        | 25.0  | 25   | 0.2823          | 1.0      |
+| 0.0658        | 26.0  | 26   | 0.2664          | 1.0      |
+| 0.0658        | 27.0  | 27   | 0.2516          | 1.0      |
+| 0.0658        | 28.0  | 28   | 0.2384          | 1.0      |
+| 0.0658        | 29.0  | 29   | 0.2260          | 1.0      |
+| 0.0346        | 30.0  | 30   | 0.2149          | 1.0      |
+| 0.0346        | 31.0  | 31   | 0.2054          | 1.0      |
+| 0.0346        | 32.0  | 32   | 0.1971          | 1.0      |
+| 0.0346        | 33.0  | 33   | 0.1898          | 1.0      |
+| 0.0346        | 34.0  | 34   | 0.1838          | 1.0      |
+| 0.0346        | 35.0  | 35   | 0.1787          | 1.0      |
+| 0.0346        | 36.0  | 36   | 0.1746          | 1.0      |
+| 0.0346        | 37.0  | 37   | 0.1714          | 1.0      |
+| 0.0346        | 38.0  | 38   | 0.1691          | 1.0      |
+| 0.0346        | 39.0  | 39   | 0.1676          | 1.0      |
+| 0.021         | 40.0  | 40   | 0.1669          | 1.0      |
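
The first validation loss in the new table, 0.6950, is close to ln 2 ≈ 0.6931, which is the value a pairwise preference loss of the form -log σ(r_chosen - r_rejected) takes when the model scores both responses equally. The sketch below assumes that loss form; the `trl` and `reward-trainer` tags suggest it, but the card does not state the loss explicitly, and the function name is hypothetical.

```python
import math


def pairwise_reward_loss(chosen_reward, rejected_reward):
    """-log(sigmoid(chosen - rejected)), written in a numerically stable form.

    Assumed loss form; the model card does not spell it out.
    """
    diff = chosen_reward - rejected_reward
    if diff >= 0:
        # log(1 + exp(-diff)) is safe when diff is non-negative
        return math.log1p(math.exp(-diff))
    # For negative diff, rewrite as -diff + log(1 + exp(diff)) to avoid overflow
    return -diff + math.log1p(math.exp(diff))


# An untrained reward head that scores both responses the same gives ln 2
print(round(pairwise_reward_loss(0.0, 0.0), 4))  # 0.6931
```

Under this assumed loss, accuracy only requires the chosen response to score higher than the rejected one, so accuracy can reach 1.0 while the loss is still near ln 2. The loss keeps falling afterwards as the margin between the two scores grows.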
 ### Framework versions

config.json CHANGED

@@ -20,7 +20,6 @@
   "n_heads": 12,
   "n_layers": 6,
   "pad_token_id": 0,
-  "problem_type": "regression",
   "qa_dropout": 0.1,
   "seq_classif_dropout": 0.2,
   "sinusoidal_pos_embds": false,

   "n_heads": 12,
   "n_layers": 6,
   "pad_token_id": 0,
   "qa_dropout": 0.1,
   "seq_classif_dropout": 0.2,
   "sinusoidal_pos_embds": false,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:16869d0953d5b87a61040dee2caef1db68a03ed0adf7482b276d86381884e93c
 size 267829484

 version https://git-lfs.github.com/spec/v1
+oid sha256:ed35b02ba7e11272f3ddca09d5f2c2ffae2b557e8ba3a98fbd69320f3a4c23bd
 size 267829484

tokenizer_config.json CHANGED

@@ -43,9 +43,11 @@
   },
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
   "do_lower_case": true,
   "mask_token": "[MASK]",
-  "model_max_length": 1000000000000000019884624838656,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,

   },
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
   "do_lower_case": true,
   "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,

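The removed `model_max_length` value is the integer Python produces for `int(1e30)`, which Transformers uses as a sentinel meaning that no maximum length is configured. The new value of 512 sets an explicit cap instead. A small check of that, with a hypothetical helper for spotting the sentinel in a loaded config dict:

```python
import json

# Sentinel meaning "no maximum length configured"
UNSET_MAX_LENGTH = int(1e30)


def has_explicit_max_length(tokenizer_config):
    """Return True when model_max_length is set to a real limit (hypothetical helper)."""
    value = tokenizer_config.get("model_max_length")
    return value is not None and value < UNSET_MAX_LENGTH


# The two values shown in the tokenizer_config.json diff
old_config = json.loads('{"model_max_length": 1000000000000000019884624838656}')
new_config = json.loads('{"model_max_length": 512}')

print(old_config["model_max_length"] == UNSET_MAX_LENGTH)  # True
print(has_explicit_max_length(old_config), has_explicit_max_length(new_config))  # False True
```
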
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a519663b9a6387514f11ccce00d19ac348e481362fef0e7f53e66f3b08db7db
-size 4920

 version https://git-lfs.github.com/spec/v1
+oid sha256:f00328d8a44d896bbf900800303965952d869f74248f0d7dc15a100e5d582ea1
+size 4984
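
The `model.safetensors` and `training_args.bin` entries are Git LFS pointer files: each records a spec version, a SHA-256 object ID, and a byte size rather than the file contents. `model.safetensors` keeps the same size (267829484 bytes) while its hash changes, which is consistent with retrained weights of the same architecture. The sketch below shows one way to check a downloaded file against such a pointer; the helper names are hypothetical.

```python
import hashlib

# New pointer for training_args.bin, copied from the diff (without the "+" markers)
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f00328d8a44d896bbf900800303965952d869f74248f0d7dc15a100e5d582ea1
size 4984
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that raw bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])  # sha256 4984
```

To verify a real download, read the file in binary mode and pass its bytes to `matches_pointer` together with the parsed pointer.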