Frank Sommers PRO

fsommers

AI & ML interests

None yet

Recent Activity

liked a model about 10 hours ago

google/gemma-4-31B

upvoted a paper 1 day ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

liked a Space 1 day ago

webml-community/Gemma-4-WebGPU

upvoted a paper 1 day ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published 3 days ago • 101

upvoted an article 2 days ago

Welcome Gemma 4: Frontier multimodal intelligence on device

Article • 7 days ago • 792

upvoted 2 papers 5 days ago

BHDD: A Burmese Handwritten Digit Dataset

Paper • 2603.21966 • Published 17 days ago • 1

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published 7 days ago • 132

upvoted an article 14 days ago

Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline

Article • 27 days ago

upvoted a paper 15 days ago

Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs

Paper • 2603.16932 • Published 26 days ago • 86

upvoted a paper 21 days ago

Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published 29 days ago • 152

upvoted a collection about 1 month ago

Qwen3.5

Collection

21 items • Updated about 1 month ago • 1.46k

upvoted an article about 2 months ago

Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model

Article • Feb 4

upvoted an article 2 months ago

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Article • Jul 9, 2025 • 792

upvoted a paper 2 months ago

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 67

upvoted 2 papers 3 months ago

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published Jan 20 • 37

Typhoon OCR: Open Vision-Language Model For Thai Document Extraction

Paper • 2601.14722 • Published Jan 21 • 15

upvoted a collection 3 months ago

PP-OCRv5

Collection

PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated Sep 15, 2025 • 54

upvoted a paper 4 months ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 265

upvoted an article 4 months ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • Dec 1, 2025 • 307

upvoted 2 papers 5 months ago

SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models

Paper • 2511.15605 • Published Nov 19, 2025 • 25

TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval

Paper • 2511.16528 • Published Nov 20, 2025 • 24

upvoted a collection 5 months ago

Qwen3-VL

Collection

37 items • Updated Dec 31, 2025 • 687

upvoted a paper 5 months ago

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 103
