
Collections

Discover the best community collections!

Collections including paper arxiv:2503.16416

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published Jan 30 • 29
DiLoCo: Distributed Low-Communication Training of Language Models

Paper • 2311.08105 • Published Nov 14, 2023 • 16
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo

Paper • 2503.09799 • Published Mar 12 • 15
Muon is Scalable for LLM Training

Paper • 2502.16982 • Published Feb 24 • 8

Agent Evaluation

MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models

Paper • 2507.12806 • Published Jul 17 • 20
Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 299
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 303
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31 • 54
Seedream 3.0 Technical Report

Paper • 2504.11346 • Published Apr 15 • 70

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20 • 76
Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries

Paper • 2508.15760 • Published Aug 21 • 46
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

Paper • 2508.01780 • Published Aug 3 • 20
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

Paper • 2304.08244 • Published Apr 14, 2023 • 1
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 158

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8 • 186
Survey of User Interface Design and Interaction Techniques in Generative AI Applications

Paper • 2410.22370 • Published Oct 28, 2024 • 12
Survey of Hallucination in Natural Language Generation

Paper • 2202.03629 • Published Feb 8, 2022

Fun journal papers I've read

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published Mar 5 • 232
Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 171
Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

CoSTAast: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing

Paper • 2503.10613 • Published Mar 13 • 79
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published Dec 13, 2024 • 35
Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

