HIGGS
meta-llama/Llama-3.1-8B-Instruct (Text Generation, 8B, 9.23M downloads, 5.76k likes)
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic (Text Generation, 8B, 49.6k downloads, 9 likes)
RedHatAI/Llama-3.1-8B-Instruct-NVFP4 (Text Generation, 5B, 18.8k downloads, 1 like)
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_hybrid
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_noise
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_heuristic
inference-optimization/Llama-3.1-8B-Instruct_5.5_bits_mode_hybrid
inference-optimization/Llama-3.1-8B-Instruct_5.5_bits_mode_noise
inference-optimization/Llama-3.1-8B-Instruct_5.5_bits_mode_heuristic
inference-optimization/Llama-3.1-8B-Instruct_6_bits_mode_hybrid
inference-optimization/Llama-3.1-8B-Instruct_6_bits_mode_noise
inference-optimization/Llama-3.1-8B-Instruct_6_bits_mode_heuristic
inference-optimization/Llama-3.1-8B-Instruct_6.5_bits_mode_hybrid
inference-optimization/Llama-3.1-8B-Instruct_6.5_bits_mode_noise
inference-optimization/Llama-3.1-8B-Instruct_6.5_bits_mode_heuristic
inference-optimization/Llama-3.1-8B-Instruct_7_bits_mode_hybrid
inference-optimization/Llama-3.1-8B-Instruct_7_bits_mode_noise
inference-optimization/Llama-3.1-8B-Instruct_7_bits_mode_heuristic
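The quantized checkpoints above follow a regular naming pattern: five bit widths (5 through 7 in steps of 0.5) crossed with three modes (hybrid, noise, heuristic), giving 15 variants per base model. A minimal sketch that enumerates them, assuming the `<base>_<bits>_bits_mode_<mode>` pattern inferred from this listing (note the Qwen3-30B-A3B entries further down write integer widths as `5.0` rather than `5`):

```python
from itertools import product

# Bit widths and quantization modes observed in the listing.
BIT_WIDTHS = [5, 5.5, 6, 6.5, 7]
MODES = ["hybrid", "noise", "heuristic"]


def variant_repo_ids(base: str) -> list[str]:
    """Build the 15 repo IDs for one base model, following the
    '<base>_<bits>_bits_mode_<mode>' pattern seen in the listing.
    The ':g' format drops the trailing '.0' on integer widths."""
    return [
        f"inference-optimization/{base}_{bits:g}_bits_mode_{mode}"
        for bits, mode in product(BIT_WIDTHS, MODES)
    ]


ids = variant_repo_ids("Llama-3.1-8B-Instruct")
print(len(ids))  # 15
print(ids[0])    # inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_hybrid
```

The same helper covers the Llama-3.2-1B, Llama-3.2-3B, and Qwen3-8B groups below, which use identical suffixes.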
meta-llama/Llama-3.2-1B-Instruct (Text Generation, 1B, 5.2M downloads, 1.38k likes)
inference-optimization/Llama-3.2-1B-Instruct-FP8-Dynamic (1B, 34 downloads)
inference-optimization/Llama-3.2-1B-Instruct-NVFP4 (0.8B, 50 downloads)
inference-optimization/Llama-3.2-1B-Instruct_5_bits_mode_hybrid
inference-optimization/Llama-3.2-1B-Instruct_5_bits_mode_noise
inference-optimization/Llama-3.2-1B-Instruct_5_bits_mode_heuristic
inference-optimization/Llama-3.2-1B-Instruct_5.5_bits_mode_hybrid
inference-optimization/Llama-3.2-1B-Instruct_5.5_bits_mode_noise
inference-optimization/Llama-3.2-1B-Instruct_5.5_bits_mode_heuristic
inference-optimization/Llama-3.2-1B-Instruct_6_bits_mode_hybrid
inference-optimization/Llama-3.2-1B-Instruct_6_bits_mode_noise
inference-optimization/Llama-3.2-1B-Instruct_6_bits_mode_heuristic
inference-optimization/Llama-3.2-1B-Instruct_6.5_bits_mode_hybrid
inference-optimization/Llama-3.2-1B-Instruct_6.5_bits_mode_noise
inference-optimization/Llama-3.2-1B-Instruct_6.5_bits_mode_heuristic
inference-optimization/Llama-3.2-1B-Instruct_7_bits_mode_hybrid
inference-optimization/Llama-3.2-1B-Instruct_7_bits_mode_noise
inference-optimization/Llama-3.2-1B-Instruct_7_bits_mode_heuristic
meta-llama/Llama-3.2-3B-Instruct (Text Generation, 3B, 2.17M downloads, 2.11k likes)
inference-optimization/Llama-3.2-3B-Instruct-FP8-Dynamic (3B, 29 downloads)
inference-optimization/Llama-3.2-3B-Instruct-NVFP4 (2B, 34 downloads)
inference-optimization/Llama-3.2-3B-Instruct_5_bits_mode_hybrid
inference-optimization/Llama-3.2-3B-Instruct_5_bits_mode_noise
inference-optimization/Llama-3.2-3B-Instruct_5_bits_mode_heuristic
inference-optimization/Llama-3.2-3B-Instruct_5.5_bits_mode_hybrid
inference-optimization/Llama-3.2-3B-Instruct_5.5_bits_mode_noise
inference-optimization/Llama-3.2-3B-Instruct_5.5_bits_mode_heuristic
inference-optimization/Llama-3.2-3B-Instruct_6_bits_mode_hybrid
inference-optimization/Llama-3.2-3B-Instruct_6_bits_mode_noise
inference-optimization/Llama-3.2-3B-Instruct_6_bits_mode_heuristic
inference-optimization/Llama-3.2-3B-Instruct_6.5_bits_mode_hybrid
inference-optimization/Llama-3.2-3B-Instruct_6.5_bits_mode_noise
inference-optimization/Llama-3.2-3B-Instruct_6.5_bits_mode_heuristic
inference-optimization/Llama-3.2-3B-Instruct_7_bits_mode_hybrid
inference-optimization/Llama-3.2-3B-Instruct_7_bits_mode_noise
inference-optimization/Llama-3.2-3B-Instruct_7_bits_mode_heuristic
Qwen/Qwen3-8B (Text Generation, 8B, 8.95M downloads, 1.07k likes)
RedHatAI/Qwen3-8B-FP8-dynamic (Text Generation, 8B, 21.8k downloads, 12 likes)
RedHatAI/Qwen3-8B-NVFP4 (Text Generation, 5B, 2.45k downloads, 2 likes)
inference-optimization/Qwen3-8B_5_bits_mode_hybrid
inference-optimization/Qwen3-8B_5_bits_mode_noise
inference-optimization/Qwen3-8B_5_bits_mode_heuristic
inference-optimization/Qwen3-8B_5.5_bits_mode_hybrid
inference-optimization/Qwen3-8B_5.5_bits_mode_noise
inference-optimization/Qwen3-8B_5.5_bits_mode_heuristic
inference-optimization/Qwen3-8B_6_bits_mode_hybrid
inference-optimization/Qwen3-8B_6_bits_mode_noise
inference-optimization/Qwen3-8B_6_bits_mode_heuristic
inference-optimization/Qwen3-8B_6.5_bits_mode_hybrid
inference-optimization/Qwen3-8B_6.5_bits_mode_noise
inference-optimization/Qwen3-8B_6.5_bits_mode_heuristic
inference-optimization/Qwen3-8B_7_bits_mode_hybrid
inference-optimization/Qwen3-8B_7_bits_mode_noise
inference-optimization/Qwen3-8B_7_bits_mode_heuristic
Qwen/Qwen3-30B-A3B (Text Generation, 1.27M downloads, 882 likes)
RedHatAI/Qwen3-30B-A3B-FP8-dynamic (Text Generation, 31B, 3.83k downloads, 3 likes)
RedHatAI/Qwen3-30B-A3B-NVFP4 (Text Generation, 17B, 33.3k downloads, 2 likes)
inference-optimization/Qwen3-30B-A3B_5.0_bits_mode_hybrid (20B, 35 downloads)
inference-optimization/Qwen3-30B-A3B_5.0_bits_mode_noise (20B, 43 downloads)
inference-optimization/Qwen3-30B-A3B_5.0_bits_mode_heuristic (20B, 38 downloads)
inference-optimization/Qwen3-30B-A3B_5.5_bits_mode_hybrid (22B, 37 downloads)
inference-optimization/Qwen3-30B-A3B_5.5_bits_mode_noise (22B, 41 downloads, 1 like)
inference-optimization/Qwen3-30B-A3B_5.5_bits_mode_heuristic (22B, 38 downloads)
inference-optimization/Qwen3-30B-A3B_6.0_bits_mode_hybrid (23B, 37 downloads)
inference-optimization/Qwen3-30B-A3B_6.0_bits_mode_noise (24B, 36 downloads)
inference-optimization/Qwen3-30B-A3B_6.0_bits_mode_heuristic (23B, 37 downloads)
inference-optimization/Qwen3-30B-A3B_6.5_bits_mode_hybrid (24B, 33 downloads)
inference-optimization/Qwen3-30B-A3B_6.5_bits_mode_noise (25B, 41 downloads)
inference-optimization/Qwen3-30B-A3B_6.5_bits_mode_heuristic (25B, 38 downloads)
inference-optimization/Qwen3-30B-A3B_7.0_bits_mode_hybrid (25B, 39 downloads)
inference-optimization/Qwen3-30B-A3B_7.0_bits_mode_noise (27B, 41 downloads)
inference-optimization/Qwen3-30B-A3B_7.0_bits_mode_heuristic (27B, 41 downloads)
Qwen/Qwen3-30B-A3B-Instruct-2507 (Text Generation, 1.07M downloads, 802 likes)
inference-optimization/Qwen3-30B-A3B-Instruct-2507-FP8-Dynamic
inference-optimization/Qwen3-30B-A3B-Instruct-2507-NVFP4
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_hybrid (20B, 22 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_noise (20B, 24 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_heuristic (20B, 27 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_hybrid (22B, 25 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_noise (22B, 27 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_heuristic (22B, 24 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_hybrid (23B, 23 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_noise (23B, 19 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_heuristic (23B, 32 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_hybrid (25B, 27 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_noise (25B, 21 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_heuristic (25B, 28 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_hybrid (26B, 30 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_noise (26B, 29 downloads)
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_heuristic (27B, 28 downloads)
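The bit widths in this collection translate directly into weight-storage savings. A rough back-of-the-envelope sketch, using weight bytes ≈ params × bits / 8; this ignores quantization scales, embeddings kept in higher precision, and runtime overhead, so real checkpoint sizes are somewhat larger:

```python
# Rough weight-memory estimate for an 8B-parameter model at the
# bit widths used in this collection, plus BF16 and FP8 baselines.
PARAMS = 8_000_000_000  # illustrative; matches the 8B models listed


def weight_gib(params: int, bits: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes."""
    return params * bits / 8 / 2**30


for bits in [16, 8, 7, 6.5, 6, 5.5, 5]:
    print(f"{bits:>4g} bits -> {weight_gib(PARAMS, bits):5.2f} GiB")
```

Going from BF16 to the 5-bit variants cuts weight memory by more than 3x, which is why the smaller effective parameter counts (e.g. 8B listed as 5B for NVFP4) appear next to the quantized repos.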