wang

wzx111

AI & ML interests

None yet

Recent Activity

new activity about 1 month ago

wzx111/Qwen3-1.7B-MATH-GDPO:Which post-training method was actually used for this model, GDPO or GRPO?

updated a dataset about 2 months ago

wzx111/MATH-lighteval-level3

published a dataset about 2 months ago

wzx111/MATH-lighteval-level3

New activity in wzx111/Qwen3-1.7B-MATH-GDPO about 1 month ago

Which post-training method was actually used for this model, GDPO or GRPO?

#1 opened about 1 month ago by

New activity in Qwen/Qwen3-235B-A22B 9 months ago

Does the reward function lack an n-gram repetition penalty?

#7 opened 9 months ago by

New activity in Qwen/Qwen3-1.7B 9 months ago

【Evaluation】Best practice for evaluating Qwen3 !!

#2 opened 9 months ago by

New activity in wzx111/Qwen2.5-1.5B-Open-R1-GRPO 9 months ago

Improve language tag

#1 opened 9 months ago by