carlizor's Collections
Image Vision
Updated Aug 12, 2025
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 • Image-Text-to-Text • 5B • Updated Feb 3, 2025 • 604 • 186
AIDC-AI/Ovis1.6-Gemma2-9B • Image-Text-to-Text • Updated Aug 15, 2025 • 24.4k • 273
nvidia/NVLM-D-72B • Image-Text-to-Text • Updated Jan 14, 2025 • 106k • 775
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 301 • 1.71k
deepseek-ai/Janus-1.3B • Any-to-Any • Updated Jan 27, 2025 • 4.11k • 592
deepseek-ai/JanusFlow-1.3B • Any-to-Any • 2B • Updated Jan 27, 2025 • 346 • 151
NexaAI/OmniVLM-968M • 0.5B • Updated Aug 20, 2025 • 1.99k • 530
vikhyatk/moondream2 • Image-Text-to-Text • Updated Sep 23, 2025 • 2.88M • 1.37k
stepfun-ai/GOT-OCR2_0 • Image-Text-to-Text • Updated Feb 4, 2025 • 102k • 1.53k
jiuhai/florence-vl-8b-sft • Updated Dec 3, 2024 • 22 • 21
AI-Safeguard/Ivy-VL-llava • Visual Question Answering • 4B • Updated Apr 28, 2025 • 250 • 71
OpenGVLab/InternVL2_5-78B • Image-Text-to-Text • 78B • Updated Sep 11, 2025 • 379 • 192
Qwen/QVQ-72B-Preview • Image-Text-to-Text • Updated Jan 12, 2025 • 363 • 609
deepseek-ai/deepseek-vl2 • Image-Text-to-Text • Updated Dec 18, 2024 • 3.5k • 379
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • 8B • Updated Dec 15, 2025 • 19.1k • 564
prithivMLmods/Qwen2-VL-OCR-2B-Instruct • Image-Text-to-Text • 2B • Updated May 2, 2025 • 1.52k • 101
ByteDance/Sa2VA-1B • Image-Text-to-Text • 1B • Updated Sep 8, 2025 • 925 • 29
HuggingFaceTB/SmolVLM-500M-Instruct • Image-Text-to-Text • 0.5B • Updated Apr 8, 2025 • 15.8k • 186
Qwen/Qwen2.5-VL-72B-Instruct • Image-Text-to-Text • Updated Jun 6, 2025 • 150k • 593
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • Updated Apr 6, 2025 • 3.14M • 1.45k
OpenGVLab/InternVideo2_5_Chat_8B • Video-Text-to-Text • 8B • Updated Aug 4, 2025 • 2.03k • 88
nvidia/Eagle2-9B • Image-Text-to-Text • 9B • Updated Jan 28, 2025 • 65 • 62
stepfun-ai/GOT-OCR-2.0-hf • Image-Text-to-Text • 0.6B • Updated Jan 31, 2025 • 22.3k • 225
allenai/olmOCR-7B-0225-preview • Image-Text-to-Text • 8B • Updated Aug 19, 2025 • 4.87k • 706
microsoft/Magma-8B • Robotics • 9B • Updated Dec 10, 2025 • 543 • 413
marco/mcdse-2b-v1 • 2B • Updated Oct 29, 2024 • 3.02k • 56
CohereLabs/aya-vision-8b • Image-Text-to-Text • 9B • Updated Jan 9 • 45.3k • 316
Skywork/Skywork-R1V-38B • Image-Text-to-Text • 38B • Updated Aug 12, 2025 • 46.9k • 127
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • Updated Sep 17, 2025 • 45k • 1.61k
Qwen/Qwen2.5-VL-32B-Instruct • Image-Text-to-Text • Updated Apr 14, 2025 • 1.09M • 476
reducto/RolmOCR • Image-Text-to-Text • 8B • Updated Apr 2, 2025 • 2.96k • 579
moonshotai/Kimi-VL-A3B-Thinking • Image-Text-to-Text • 16B • Updated 16 days ago • 79.3k • 445
XiaomiMiMo/MiMo-VL-7B-RL • Image-Text-to-Text • 8B • Updated Jun 7, 2025 • 1.4k • 168
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1 • Image-Text-to-Text • Updated Dec 4, 2025 • 910k • 175
ByteDance/Dolphin • Image-Text-to-Text • Updated Jul 16, 2025 • 3.12k • 513
nanonets/Nanonets-OCR-s • Image-Text-to-Text • 4B • Updated Jun 20, 2025 • 28.8k • 1.58k
echo840/MonkeyOCR • Image-Text-to-Text • Updated Aug 28, 2025 • 258 • 514
moonshotai/Kimi-VL-A3B-Thinking-2506 • Image-Text-to-Text • 16B • Updated 16 days ago • 82.6k • 348
prithivMLmods/DREX-062225-exp • Image-Text-to-Text • 8B • Updated Jul 20, 2025 • 16 • 6
zai-org/GLM-4.1V-9B-Thinking • Image-Text-to-Text • 10B • Updated Oct 25, 2025 • 254k • 771
HelloKKMe/GTA1-72B • Image-Text-to-Text • 73B • Updated Jul 8, 2025 • 14 • 4
rednote-hilab/dots.ocr • Image-Text-to-Text • Updated Oct 31, 2025 • 250k • 1.23k