ML Foundations

non-profit

Recent Activity

sedrickkeh

authored a paper 3 days ago

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Paper • 2512.04072 • Published 4 days ago • 2

anas-awadalla

updated a model 22 days ago

mlfoundations/Gelato-30B-A3B

Image-Text-to-Text • 31B • Updated 22 days ago • 1.44k • 25

anas-awadalla

updated a dataset 26 days ago

mlfoundations/Click-100k

Viewer • Updated 26 days ago • 101k • 648 • 10

djghosh

updated a collection about 1 month ago

🍨 Gelato

From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents • 5 items • Updated 22 days ago

djghosh

updated a dataset about 1 month ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated about 1 month ago • 13.5k • 43

djghosh

published a dataset about 1 month ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated about 1 month ago • 13.5k • 43

pratyushmaini

authored 5 papers 4 months ago

Model-tuning Via Prompts Makes NLP Models Adversarially Robust

Paper • 2303.07320 • Published Mar 13, 2023

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

Paper • 2404.07177 • Published Apr 10, 2024 • 1

Rethinking LLM Memorization through the Lens of Adversarial Compression

Paper • 2404.15146 • Published Apr 23, 2024

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

Paper • 2506.12618 • Published Jun 14

BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining

Paper • 2508.10975 • Published Aug 14 • 60

ronakdm

authored 3 papers 4 months ago

Distributionally Robust Optimization with Bias and Variance Reduction

Paper • 2310.13863 • Published Oct 21, 2023

The Benefits of Balance: From Information Projections to Variance Reduction

Paper • 2408.15065 • Published Aug 27, 2024 • 1

A Generalization Theory for Zero-Shot Prediction

Paper • 2507.09128 • Published Jul 12

JJitsev

authored 2 papers 6 months ago

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4 • 48

Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets

Paper • 2506.04598 • Published Jun 5 • 7

marianna13

authored 3 papers 6 months ago

Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

Paper • 2406.02061 • Published Jun 4, 2024 • 2

DataComp-LM: In search of the next generation of training sets for language models

Paper • 2406.11794 • Published Jun 17, 2024 • 54

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4 • 48

ryanmarten

authored a paper 6 months ago

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4 • 48