Ministral 3 Collection A collection of edge models, with Base, Instruct, and Reasoning variants, in 3 different sizes: 3B, 8B, and 14B. All with vision capabilities. • 9 items • Updated 5 days ago • 115
Article Transformers v5: Simple model definitions powering the AI ecosystem 7 days ago • 224
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 10 days ago • 152
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3 • 47
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Paper • 2411.05007 • Published Nov 7, 2024 • 22
Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18 • 88
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 144
TempFlow-GRPO: When Timing Matters for GRPO in Flow Models Paper • 2508.04324 • Published Aug 6 • 11
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 60
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13 • 53
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 398
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11 • 75
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 192