TensorLens: End-to-End Transformer Analysis via High-Order Attention Tensors Paper • 2601.17958 • Published 4 days ago • 1
Adaptive Nonlinear Vector Autoregression: Robust Forecasting for Noisy Chaotic Time Series Paper • 2507.08738 • Published Jul 11, 2025 • 1
Post 15382 Want to iterate on a Hugging Face Space with an LLM? Now you can easily convert any entire HF repo (Model, Dataset or Space) to a text file and feed it to a language model! multimodalart/repo2txt
UniFusion: Vision-Language Model as Unified Encoder in Image Generation Paper • 2510.12789 • Published Oct 14, 2025 • 19
EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 44
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions Paper • 2506.16679 • Published Jun 20, 2025 • 1
Post 18101 Self-Forcing, a real-time distilled video model from Wan 2.1 by @adobe, is out, and they open-sourced it 🐐 I've built a live real-time demo on Spaces 📹💨 multimodalart/self-forcing
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability Paper • 2506.02138 • Published Jun 2, 2025 • 1
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 148
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations Paper • 2303.09289 • Published Mar 16, 2023 • 2
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge Paper • 2309.11575 • Published Sep 20, 2023
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation Paper • 2305.15296 • Published May 24, 2023 • 1
Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? Paper • 2305.18398 • Published May 28, 2023 • 2
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Paper • 2209.08891 • Published Sep 19, 2022 • 2
The Stable Artist: Steering Semantics in Diffusion Latent Space Paper • 2212.06013 • Published Dec 12, 2022 • 1
LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment Paper • 2406.05113 • Published Jun 7, 2024 • 3