Wakals's Collections
CoVT: Chain-of-Visual-Thought

updated 12 days ago

Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought!

  • Wakals/CoVT-7B-seg_depth_dino

    8B params • Updated 2 days ago • 140 downloads • 2 likes

  • Wakals/CoVT-7B-seg_depth_dino_edge

    8B params • Updated 2 days ago • 261 downloads • 2 likes

  • Wakals/CoVT-7B-depth

    8B params • Updated 2 days ago • 40 downloads • 2 likes

  • Wakals/CoVT-7B-seg

    8B params • Updated 2 days ago • 37 downloads • 1 like

  • Wakals/CoVT-LLaVA-13B-depth

    13B params • Updated 2 days ago • 23 downloads • 2 likes

  • Wakals/CoVT-Dataset

    Viewer • Updated 2 days ago • 1.17M rows • 3.02k downloads • 9 likes

  • Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

    Paper • 2511.19418 • Published 13 days ago • 26 upvotes