Models: 454

Active filters: rlhf

samhitha2601/llama3-gsm8k-critic

3B • Updated Oct 24, 2025 • 3

AIResAgTeam/Quantum-LIMIT-Graph-v2.4.0-NSN-level-4-maturity-rust

Updated Nov 4, 2025

ziadrone/airesupdated-v6

Text Generation • Updated Nov 5, 2025 • 4 • 1

Uppaal/gpt2-ProFS-toxicity

Text Generation • 0.4B • Updated Nov 9, 2025 • 10

Uppaal/gpt-j-ProFS-toxicity

Text Generation • 6B • Updated Nov 9, 2025 • 1

Uppaal/opt-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 1

Uppaal/Mistral-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 6

Uppaal/Mistral-sft-ProFS-toxicity

Text Generation • 7B • Updated Nov 9, 2025 • 3

Uppaal/Mistral-ProFS-safety

Text Generation • 7B • Updated Nov 9, 2025 • 4

Uppaal/Mistral-sft-ProFS-safety

Text Generation • 7B • Updated Nov 9, 2025 • 4

sodeniZz/llm-course-hw2-dpo

Text Generation • 0.1B • Updated Nov 15, 2025

sodeniZz/llm-course-hw2-reward-model

Text Classification • 0.1B • Updated Nov 15, 2025

sodeniZz/llm-course-hw2-ppo

Text Generation • 0.1B • Updated Nov 15, 2025 • 1

ahczhg/qwen3-0.6b-rlhf-cot

Text Generation • Updated Nov 17, 2025 • 1

ahczhg/Llama-3.2-1B-Aegis-SFT-DPO

Text Generation • 1B • Updated Nov 17, 2025 • 47 • 1

mradermacher/Llama-3.2-1B-Aegis-SFT-DPO-GGUF

1B • Updated Nov 15, 2025 • 61

khanhrill/HistoryGPT

4B • Updated Dec 12, 2025 • 2

mradermacher/HistoryGPT-GGUF

4B • Updated Dec 15, 2025 • 28

nfsrulesFR/mega-grpo

Text Generation • Updated Nov 22, 2025

TzJ2006/JokeGPT-Model

Updated Nov 29, 2025 • 10 • 1

FutureMa/Qwen2.5-7B-Instruct-GRPO-Math

Text Generation • Updated Nov 28, 2025

noeum/noeum-1-nano

Text Generation • Updated Jan 5 • 16

MaleekNoob/qwen3-0.6b-grpo-v1

Updated Dec 18, 2025

AhmedSSoliman/medgemma-4b-digital-twin-v1

Updated Dec 5, 2025

AhmedSSoliman/gpt-oss-20b-digital-twin-v1

Text Generation • Updated Dec 8, 2025 • 3

AhmedSSoliman/octomed-7b-digital-twin-v1

Text Generation • Updated Dec 9, 2025 • 1 • 1

AIPlans/Qwen3-0.6B-ReMax

Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 2 • 2

AIPlans/Qwen3-0.6B-IPO

Reinforcement Learning • 0.6B • Updated Dec 12, 2025 • 37 • 1

mradermacher/Qwen3-0.6B-ReMax-GGUF

Reinforcement Learning • 0.6B • Updated Dec 11, 2025 • 7

gyung/lfm2-1.2b-koen-mt-v5-rl-10k-adapter

Text Generation • Updated Dec 15, 2025 • 6 • 1