Active filters: dpo, trl
Shifusen/Qwen3-Next-80B-A3B-Instruct-Decensored • Text Generation • 80B • Updated • 35 • 2
HumanLLMs/Human-Like-Qwen2.5-7B-Instruct • Text Generation • 8B • Updated • 78 • 13
Text Generation • 7B • Updated • 28 • 1
mradermacher/Role-mo-V2-7B-GGUF • 7B • Updated • 368 • 1
mradermacher/Role-mo-V2-7B-i1-GGUF • 7B • Updated • 2.16k • 1
mradermacher/Qwen3-Next-80B-A3B-Instruct-Decensored-GGUF • 80B • Updated • 2.09k • 1
wololoo/Llama-3.2-3B-TR-Instruct-DPO • Text Generation • 3B • Updated • 203 • 1
jazztoasty101/qwen3-md-natural • Text Generation • 2B • Updated • 21 • 1
lewtun/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 16
alignment-handbook/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 29 • 3
alignment-handbook/zephyr-7b-dpo-qlora • Updated • 15 • 9
amirali1985/gpt-neo-125m_hh_reward • Text Generation • 0.1B • Updated • 5
lewtun/zephyr-7b-dpo-qlora
sambar/zephyr-7b-ipo-lora • Text Generation • Updated • 2
nikkoyabut/merged_model_dpo
sambar/zephyr-7b-ipo-lora-5ep • Text Generation • Updated • 5
alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2-dpo • Text Generation • 1B • Updated • 12 • 2
Yaxin1992/mixtral-dpo-1000
adhi29/openhermes-mistral-dpo-gptq • Updated
Text Generation • 1.03M • Updated • 6
ybelkada/test-tags-model-2 • Text Generation • 1.03M • Updated • 5
justinj92/dpoplatypus-phi2 • Text Generation • 3B • Updated
lewtun/zephyr-7b-dpo-qlora-8e0975a • Updated
akashkumarbtc/openhermes-mistral-dpo-gptq • Updated
darshan8950/openhermes-mistral-dpo-gptq • Updated