AI & ML interests

Computer Vision

Recent Activity

prithivMLmods 
posted an update 9 days ago
view post
Post
4234
Multimodal-Edge Demo, a node-based inference canvas demo, is now live on Spaces. It features node-based Transformers for fast inference across 10+ edge-device multimodal models on the Hub, all within a single space. The series includes models from Qwen3.5, Qwen3-VL, Gemma 4, and the LFM 2.5 VL model series, with support for reasoning and grounding tasks.

🤗 Demo: prithivMLmods/Multimodal-Edge-Node
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Multimodal-Edge-Node
✅ Multimodal Apps Collections: https://huggingface.co/collections/prithivMLmods/hall-of-multimodal-apps

🤗 > To learn more, visit the app page or the respective model pages.
prithivMLmods 
posted an update 17 days ago
view post
Post
1849
Now, a collection of various compression schemes for Qwen3.6 and the abliterated version 1 of dense models is available on the Hub. Check it out via the links below. 👇

🔗 Qwen3.6-MoE: https://huggingface.co/collections/prithivMLmods/qwen36-35b-a3b-compressions
🔗 Qwen3.6-27B Compressions: https://huggingface.co/collections/prithivMLmods/qwen36-27b-compressions

🤗 > To learn more, visit the app page or the respective model pages.
prithivMLmods 
posted an update 22 days ago
view post
Post
4161
HY-World-2.0 — A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds is now available on Spaces, and it works both as native Gradio components and in Gradio server mode.

> HY-World-2.0-Demo: prithivMLmods/HY-World-2.0-Demo
> HY-World-2.0 [Server Mode]: prithivMLmods/HY-World-2.0-Demo
> Featuring 3D reconstruction and Gaussian splats with the Rerun viewer, along with camera poses, depth maps, and surface normals.
> In Server Mode, Gradio is served via FastAPI, with FastAPI remaining the top-level server.
> Model: tencent/HY-World-2.0
> GitHub: https://github.com/PRITHIVSAKTHIUR/HY-World-2.0-Demo

🤗To learn more, visit the app page or the respective model pages.
prithivMLmods 
posted an update 28 days ago
view post
Post
6194
A new comparator on Spaces showcases Standard FLUX.2 Decoder vs. FLUX.2 Small Decoder. The Small Decoder is ~1.4× faster, uses ~1.4× less VRAM, and maintains near-identical image quality. It has ~28M parameters with narrower channels [96, 192, 384, 384] vs. [128, 256, 512, 512], and the demo supports sequence generation by running both decoders simultaneously and comparing the results side by side.

🤗 Comparator: prithivMLmods/Flux.2-4B-Decoder-Comparator
🔗 FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
🚁 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

🤗 > App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.
prithivMLmods 
posted an update 29 days ago
view post
Post
4227
Now, a collection of various compression schemes for Gemma 4 and the abliterated version 1 of dense models is available on the Hub. Check it out via the links below. 👇

🔗Gemma 4 Compression(s)- https://huggingface.co/collections/prithivMLmods/gemma-4-compressions
🔗Gemma 4 Uncensored [MAX] + Compression(s) - [`β ]- https://huggingface.co/collections/prithivMLmods/gemma-4-uncensored-max-compressions
🔗Gemma 4 Compression(s) - MoE- https://huggingface.co/collections/prithivMLmods/gemma-4-compressions-moe
🔗Gemma-4 F32 GGUF- https://huggingface.co/collections/prithivMLmods/gemma-4-f32-gguf

🤗 > To learn more, visit the app page or the respective model pages.
prithivMLmods 
posted an update about 1 month ago
view post
Post
2321
Now the demo for image detection based on SAM3 and Gemma-4 (*Filter) is available on Spaces, using full-fledged Transformers inference with multimodal reasoning for processed images. It also supports video segmentation (mask), video segmentation (annotation), and image click segmentation.

🤗 Demo Space: prithivMLmods/SAM3-Gemma4-CUDA
🥽 SAM3: facebook/sam3
🔗 gemma-4-E2B-it: google/gemma-4-E2B-it

To learn more, visit the app page or the respective model pages.
  • 1 reply
·
prithivMLmods 
posted an update about 1 month ago
view post
Post
4760
The demo for Image Detection (*Filter) based on SAM3 and Qwen-3.5 is now available on Hugging Face Spaces using Transformers inference, with multimodal reasoning for processed images, and it also supports video segmentation (mask), video segmentation (annotation), and image click segmentation.

🤗 Demo Space: prithivMLmods/SAM3-Plus-Qwen3.5
🥽 SAM3: facebook/sam3
🔗 Qwen-3.5: Qwen/Qwen3.5-2B

To learn more, visit the app page or the respective model pages.
  • 5 replies
·