kingabzpro committed on
Commit ee2f9c5 · verified · 1 Parent(s): bf6f439

Update README.md

Files changed (1)
  1. README.md +42 -27
README.md CHANGED
@@ -1,6 +1,7 @@
---
base_model: openai/gpt-oss-20b
- datasets: kingabzpro/gpt-oss-20b-medical-qa
+ datasets:
+ - FreedomIntelligence/medical-o1-verifiable-problem
library_name: transformers
model_name: gpt-oss-20b-medical-qa
tags:
@@ -8,6 +9,10 @@ tags:
- trl
- sft
licence: license
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
---

# Model Card for gpt-oss-20b-medical-qa
@@ -18,19 +23,46 @@ It has been trained using [TRL](https://github.com/huggingface/trl).
## Quick start

```python
- from transformers import pipeline
-
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="kingabzpro/gpt-oss-20b-medical-qa", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
- ```
-
- ## Training procedure
-
-
-
-
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ # Load the tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
+
+ # Load the original model first
+ model_kwargs = dict(attn_implementation="eager", torch_dtype="auto", use_cache=True, device_map="auto")
+ base_model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", **model_kwargs).cuda()
+
+ # Merge fine-tuned weights with the base model
+ peft_model_id = "kingabzpro/gpt-oss-20b-medical-qa"
+ model = PeftModel.from_pretrained(base_model, peft_model_id)
+ model = model.merge_and_unload()
+
+ # `dataset` and `render_inference_harmony` are not defined in this card;
+ # see the sketch after the diff for one possible implementation of both.
+ question = dataset[0]["Open-ended Verifiable Question"]
+
+ text = render_inference_harmony(question)
+
+ inputs = tokenizer(
+     [text + tokenizer.eos_token], return_tensors="pt"
+ ).to("cuda")
+ outputs = model.generate(
+     input_ids=inputs.input_ids,
+     attention_mask=inputs.attention_mask,
+     max_new_tokens=20,
+     eos_token_id=tokenizer.eos_token_id,
+     use_cache=True,
+ )
+ response = tokenizer.batch_decode(outputs)
+ print(response[0])
+ ```
+ Output:
+
+ ```bash
+ <|start|>developer<|message|># Instructions
+
+ You are a medical expert with advanced knowledge in clinical reasoning and diagnostics. Respond with ONLY the final diagnosis/cause in ≤5 words.<|end|><|start|>user<|message|>An 88-year-old woman with osteoarthritis is experiencing mild epigastric discomfort and has vomited material resembling coffee grounds multiple times. Considering her use of naproxen, what is the most likely cause of her gastrointestinal blood loss?<|end|><|start|>assistant<|return|><|message|>Stomach ulcer<|end|><|return|>
+ ```
+ ## Training procedure
This model was trained with SFT.

### Framework versions
@@ -39,21 +71,4 @@ This model was trained with SFT.
- Transformers: 4.55.2
- Pytorch: 2.8.0.dev20250319+cu128
- Datasets: 4.0.0
- - Tokenizers: 0.21.4
-
- ## Citations
-
-
-
- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
-     title = {{TRL: Transformer Reinforcement Learning}},
-     author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-     year = 2020,
-     journal = {GitHub repository},
-     publisher = {GitHub},
-     howpublished = {\url{https://github.com/huggingface/trl}}
- }
- ```
+ - Tokenizers: 0.21.4
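
The updated Quick start indexes a `dataset` object and calls a `render_inference_harmony` helper that the card never defines. Below is a minimal, hypothetical sketch of both, assuming the dataset is the FreedomIntelligence/medical-o1-verifiable-problem corpus listed in the card metadata and that the prompt should reproduce the Harmony-style developer/user blocks visible in the sample output; the helper body and split name are assumptions, not code from the repository.

```python
from datasets import load_dataset

# Developer instruction copied from the sample output above.
SYSTEM_INSTRUCTIONS = (
    "You are a medical expert with advanced knowledge in clinical reasoning and "
    "diagnostics. Respond with ONLY the final diagnosis/cause in ≤5 words."
)

def render_inference_harmony(question: str) -> str:
    """Hypothetical helper: build a Harmony-style prompt (developer + user turn)
    matching the format shown in the sample output, ending where the assistant
    turn should begin."""
    return (
        "<|start|>developer<|message|># Instructions\n\n"
        f"{SYSTEM_INSTRUCTIONS}<|end|>"
        f"<|start|>user<|message|>{question}<|end|>"
        "<|start|>assistant"
    )

# The snippet reads dataset[0]["Open-ended Verifiable Question"], so load the
# dataset named in the card metadata (the "train" split is an assumption).
dataset = load_dataset("FreedomIntelligence/medical-o1-verifiable-problem", split="train")
question = dataset[0]["Open-ended Verifiable Question"]
print(render_inference_harmony(question))
```

With these two definitions in place, the Quick start block above should run end to end on a machine with enough GPU memory for the 20B base model.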
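
The card states only that the model was trained with SFT using TRL on the dataset above. For orientation, here is a rough sketch of how a LoRA-based SFT run could be set up with TRL's `SFTTrainer`; the column names, chat-message formatting, LoRA settings, and hyperparameters are illustrative assumptions, not the author's actual training configuration.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

# Instruction string mirrors the developer prompt seen in the sample output.
SYSTEM_INSTRUCTIONS = (
    "You are a medical expert with advanced knowledge in clinical reasoning and "
    "diagnostics. Respond with ONLY the final diagnosis/cause in ≤5 words."
)

def to_messages(example):
    # Assumed dataset schema: a question column and a reference-answer column.
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_INSTRUCTIONS},
            {"role": "user", "content": example["Open-ended Verifiable Question"]},
            {"role": "assistant", "content": example["Ground-True Answer"]},
        ]
    }

dataset = load_dataset("FreedomIntelligence/medical-o1-verifiable-problem", split="train")
dataset = dataset.map(to_messages, remove_columns=dataset.column_names)

model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b", torch_dtype="auto", device_map="auto"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="gpt-oss-20b-medical-qa",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
    peft_config=LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules="all-linear",
        task_type="CAUSAL_LM",
    ),
)
trainer.train()
```

Saving the result with `trainer.save_model()` yields a PEFT adapter checkpoint of the kind the Quick start loads through `PeftModel.from_pretrained`.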