FaithLens: Detecting and Explaining Faithfulness Hallucination Paper • 2512.20182 • Published Dec 23, 2025 • 9
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 97
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 Paper • 2510.19600 • Published Oct 22, 2025 • 69
QueST: Incentivizing LLMs to Generate Difficult Problems Paper • 2510.17715 • Published Oct 20, 2025 • 35
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published Sep 30, 2025 • 20
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning Paper • 2505.16483 • Published May 22, 2025 • 10
Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement Paper • 2410.15633 • Published Oct 21, 2024 • 7