Article Transformers v5: Simple model definitions powering the AI ecosystem +2 6 days ago • 223
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published Oct 22 • 38
Holo1.5 Collection Holo1.5 - Open Foundation Models for Computer Use Agents • 5 items • Updated Sep 15 • 34
Holo1 Collection Vision-Language Action Model for use in Surfer-H web navigation agent • 6 items • Updated Jun 10 • 48
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21 • 234
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 303
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 155
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google +1 Feb 19 • 72
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 249