GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published 17 days ago • 91
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29 • 45
Context Engineering 2.0: The Context of Context Engineering Paper • 2510.26493 • Published Oct 30 • 7
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning Paper • 2508.10433 • Published Aug 14 • 144
ProX General Models Collection base models trained on ProX curated data. • 16 items • Updated Oct 10, 2024 • 1
ProX Math Models Collection base models trained on ProX curated openwebmath-pro. • 5 items • Updated Oct 10, 2024 • 1
ProX Refining Models Collection Adapted small language models used to generate data-refining programs • 5 items • Updated Oct 10, 2024 • 5
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22 • 63
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention Paper • 2507.01004 • Published Jul 1 • 10
OctoThinker-Llama-1B Family Collection What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training. • 6 items • Updated Jul 6 • 2
OctoThinker-Llama-3B Family Collection What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training. • 6 items • Updated Jul 6 • 2
OctoThinker-Llama-8B Family Collection What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training. • 3 items • Updated Jul 6 • 3
Mid-training Analysis Checkpoints (Llama-3.2-3B) Collection What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training. • 10 items • Updated Jul 7 • 1