Image Vision - a carlizor Collection

Image Vision

Updated Aug 12, 2025

Salesforce/xgen-mm-phi3-mini-instruct-r-v1

Image-Text-to-Text • 5B • Updated Feb 3, 2025 • 604 • 186
AIDC-AI/Ovis1.6-Gemma2-9B

Image-Text-to-Text • Updated Aug 15, 2025 • 24.4k • 273
nvidia/NVLM-D-72B

Image-Text-to-Text • Updated Jan 14, 2025 • 106k • 775
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 301 • 1.71k
deepseek-ai/Janus-1.3B

Any-to-Any • Updated Jan 27, 2025 • 4.11k • 592
deepseek-ai/JanusFlow-1.3B

Any-to-Any • 2B • Updated Jan 27, 2025 • 346 • 151
NexaAI/OmniVLM-968M

0.5B • Updated Aug 20, 2025 • 1.99k • 530
vikhyatk/moondream2

Image-Text-to-Text • Updated Sep 23, 2025 • 2.88M • 1.37k
stepfun-ai/GOT-OCR2_0

Image-Text-to-Text • Updated Feb 4, 2025 • 102k • 1.53k
jiuhai/florence-vl-8b-sft

Updated Dec 3, 2024 • 22 • 21
AI-Safeguard/Ivy-VL-llava

Visual Question Answering • 4B • Updated Apr 28, 2025 • 250 • 71
OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • 78B • Updated Sep 11, 2025 • 379 • 192
Qwen/QVQ-72B-Preview

Image-Text-to-Text • Updated Jan 12, 2025 • 363 • 609
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated Dec 18, 2024 • 3.5k • 379
allenai/Molmo-7B-D-0924

Image-Text-to-Text • 8B • Updated Dec 15, 2025 • 19.1k • 564
prithivMLmods/Qwen2-VL-OCR-2B-Instruct

Image-Text-to-Text • 2B • Updated May 2, 2025 • 1.52k • 101
ByteDance/Sa2VA-1B

Image-Text-to-Text • 1B • Updated Sep 8, 2025 • 925 • 29
HuggingFaceTB/SmolVLM-500M-Instruct

Image-Text-to-Text • 0.5B • Updated Apr 8, 2025 • 15.8k • 186
Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • Updated Jun 6, 2025 • 150k • 593
Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated Apr 6, 2025 • 3.14M • 1.45k
OpenGVLab/InternVideo2_5_Chat_8B

Video-Text-to-Text • 8B • Updated Aug 4, 2025 • 2.03k • 88
nvidia/Eagle2-9B

Image-Text-to-Text • 9B • Updated Jan 28, 2025 • 65 • 62
stepfun-ai/GOT-OCR-2.0-hf

Image-Text-to-Text • 0.6B • Updated Jan 31, 2025 • 22.3k • 225
allenai/olmOCR-7B-0225-preview

Image-Text-to-Text • 8B • Updated Aug 19, 2025 • 4.87k • 706
microsoft/Magma-8B

Robotics • 9B • Updated Dec 10, 2025 • 543 • 413
marco/mcdse-2b-v1

2B • Updated Oct 29, 2024 • 3.02k • 56
CohereLabs/aya-vision-8b

Image-Text-to-Text • 9B • Updated Jan 9 • 45.3k • 316
Skywork/Skywork-R1V-38B

Image-Text-to-Text • 38B • Updated Aug 12, 2025 • 46.9k • 127
docling-project/SmolDocling-256M-preview

Image-Text-to-Text • Updated Sep 17, 2025 • 45k • 1.61k
Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • Updated Apr 14, 2025 • 1.09M • 476
reducto/RolmOCR

Image-Text-to-Text • 8B • Updated Apr 2, 2025 • 2.96k • 579
moonshotai/Kimi-VL-A3B-Thinking

Image-Text-to-Text • 16B • Updated 16 days ago • 79.3k • 445
XiaomiMiMo/MiMo-VL-7B-RL

Image-Text-to-Text • 8B • Updated Jun 7, 2025 • 1.4k • 168
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • Updated Dec 4, 2025 • 910k • 175
ByteDance/Dolphin

Image-Text-to-Text • Updated Jul 16, 2025 • 3.12k • 513
nanonets/Nanonets-OCR-s

Image-Text-to-Text • 4B • Updated Jun 20, 2025 • 28.8k • 1.58k
echo840/MonkeyOCR

Image-Text-to-Text • Updated Aug 28, 2025 • 258 • 514
moonshotai/Kimi-VL-A3B-Thinking-2506

Image-Text-to-Text • 16B • Updated 16 days ago • 82.6k • 348
prithivMLmods/DREX-062225-exp

Image-Text-to-Text • 8B • Updated Jul 20, 2025 • 16 • 6
zai-org/GLM-4.1V-9B-Thinking

Image-Text-to-Text • 10B • Updated Oct 25, 2025 • 254k • 771
HelloKKMe/GTA1-72B

Image-Text-to-Text • 73B • Updated Jul 8, 2025 • 14 • 4
rednote-hilab/dots.ocr

Image-Text-to-Text • Updated Oct 31, 2025 • 250k • 1.23k