|
|
--- |
|
|
library_name: peft |
|
|
model_name: AlterEgo |
|
|
tags: |
|
|
- base_model:adapter:Qwen/Qwen3-4B-Base |
|
|
- lora |
|
|
- sft |
|
|
- transformers |
|
|
- trl |
|
|
licence: license |
|
|
pipeline_tag: text-generation |
|
|
base_model: Qwen/Qwen3-4B-Base |
|
|
--- |
|
|
|
|
|
# AlterEgo |
|
|
|
|
|
This model is a fine-tuned version of [Qwen3-4B-Base](https://huggingface.co/Qwen/Qwen3-4B-Base). |
|
|
It has been trained using [TRL](https://github.com/huggingface/trl). |
|
|
|
|
|
This model was fine-tuned on a few thousand of my Zhihu Q&A posts, so it can be thought of as my "digital alter ego".
|
|
|
|
|
The project was inspired by the Alter Ego program built by Chihiro Fujisaki, the Ultimate Programmer in *Danganronpa*.
|
|
|
|
|
## Quick start |
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
from peft import PeftModel

base_model_id = "Qwen/Qwen3-4B-Base"
adapter_path = "hugfaceguy0001/AlterEgo"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Override the tokenizer's default template with the plain ChatML format
# (no thinking block) that the adapter was fine-tuned on.
SIMPLE_CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{{'<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n'}}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)
tokenizer.chat_template = SIMPLE_CHAT_TEMPLATE

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,  # match the bfloat16 training precision
    device_map="auto",
)

# Merge the LoRA weights into the base model for faster inference.
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()
model.eval()

with torch.no_grad():
    question = "为什么说学术圈性价比在下降?"  # "Why do people say academia's cost-benefit ratio is declining?"
    messages = [{"role": "user", "content": question}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer(text, return_tensors="pt").to(model.device)
    generation_config = GenerationConfig(
        max_new_tokens=1024,
        temperature=1.2,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
        stop_strings=["<|endoftext|>", "<|im_end|>"],
    )
    generated_ids = model.generate(
        input_ids=model_inputs.input_ids,
        attention_mask=model_inputs.attention_mask,
        generation_config=generation_config,
        tokenizer=tokenizer,  # required when stop_strings is set
    )
    # Decode only the newly generated tokens, not the prompt.
    response = tokenizer.batch_decode(
        generated_ids[:, model_inputs.input_ids.shape[1]:], skip_special_tokens=True
    )[0]
    print(response)
```
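
Note that `merge_and_unload()` folds the LoRA weights into the base model so inference runs at plain-model speed. If you would rather keep the adapter separate (for example, to switch between the base model and AlterEgo at runtime), skip that call and generate through the `PeftModel` directly.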
|
|
|
|
|
## Training procedure |
|
|
|
|
|
I trained this LoRA on top of Qwen3-4B-Base with the TRL library's `SFTTrainer`, using roughly 4,000 of my Zhihu Q&A posts. To improve quality, the model head (`lm_head`) was also included in training. Training ran on a single RTX 3090 in bfloat16, with a sequence length of 1024, a batch size of 8, gradient accumulation of 4 steps, and Flash Attention 2, for 6 epochs, taking about 6 hours.
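
For reference, a minimal sketch of that setup with TRL's `SFTTrainer` might look like the following. This is not the exact training script: the dataset path and the LoRA rank/alpha are illustrative assumptions, while the precision, sequence length, batch settings, attention implementation, and epoch count come from the description above.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical file of ~4,000 Zhihu Q&A pairs in chat "messages" format.
dataset = load_dataset("json", data_files="zhihu_qa.jsonl", split="train")

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,                          # rank/alpha not stated in the card; illustrative values
    lora_alpha=32,
    target_modules="all-linear",
    modules_to_save=["lm_head"],   # also train the model head, as described above
)

training_args = SFTConfig(
    output_dir="AlterEgo",
    max_length=1024,               # sequence length
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    num_train_epochs=6,
    bf16=True,                     # bfloat16 on a single RTX 3090
    model_init_kwargs={"attn_implementation": "flash_attention_2"},
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Base",
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```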
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- PEFT: 0.18.0
|
|
- TRL: 0.25.1 |
|
|
- Transformers: 4.57.3 |
|
|
- Pytorch: 2.9.1 |
|
|
- Datasets: 4.4.1 |
|
|
- Tokenizers: 0.22.1 |
|
|
|
|
|
## Citations |
|
|
|
|
|
|
|
|
|
|
|
Cite TRL as: |
|
|
|
|
|
```bibtex |
|
|
@misc{vonwerra2022trl, |
|
|
title = {{TRL: Transformer Reinforcement Learning}}, |
|
|
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec}, |
|
|
year = 2020, |
|
|
journal = {GitHub repository}, |
|
|
publisher = {GitHub}, |
|
|
howpublished = {\url{https://github.com/huggingface/trl}} |
|
|
} |
|
|
``` |