
Liam Duignan

Lduignan1


AI & ML interests

NLP/Named Entity Recognition/LLMs

Recent Activity

upvoted an article 4 days ago

Luth: Efficient French Specialization for Small Language Models

Article • Aug 11, 2025

liked a model 9 days ago

AI-MO/NuminaMath-7B-CoT

Text Generation • 7B • Updated Jul 19, 2024 • 252 • 26

liked a dataset 21 days ago

juletxara/mgsm

Viewer • Updated Oct 9, 2025 • 2.84k • 11.5k • 44

liked a dataset 23 days ago

HuggingFaceH4/ultrachat_200k

Viewer • Updated Oct 16, 2024 • 515k • 54.3k • 696

upvoted a paper 24 days ago

VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors

Paper • 2604.02486 • Published 30 days ago • 10

liked a dataset 28 days ago

abedk/GSM8K-cultural

Viewer • Updated Mar 31, 2025 • 1.2k • 35 • 2

upvoted a paper 28 days ago

Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?

Paper • 2503.18018 • Published Mar 23, 2025 • 7

upvoted a paper about 1 month ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 189

upvoted a collection about 1 month ago

Pensez-LLM

Collection • French-English reasoning model • 4 items • Updated Mar 2 • 4

liked a model about 1 month ago

Qwen/Qwen3.5-9B

Image-Text-to-Text • 10B • Updated Mar 2 • 7.44M • 1.37k

upvoted a paper about 2 months ago

Reasoning Models Struggle to Control their Chains of Thought

Paper • 2603.05706 • Published Mar 5 • 37

liked a dataset 3 months ago

nvidia/Nemotron-Math-v2

Viewer • Updated Feb 11 • 7.09M • 2.82k • 178

upvoted an article 3 months ago

TextQuests: How Good are LLMs at Text-Based Video Games?

Article • Aug 12, 2025

liked a Space 4 months ago

Evaluation Guidebook

Space • Explore LLM benchmark trends over time • 312

liked a dataset 5 months ago

allenai/ai2_arc

Viewer • Updated Dec 21, 2023 • 7.79k • 426k • 333

upvoted an article 5 months ago

Integrating benchmarks into LM Evaluation Harness

Article • Jul 21, 2025

upvoted an article 6 months ago

Supercharge your OCR Pipelines with Open Models

Article • Oct 21, 2025 • 309

liked a model 7 months ago

ibm-granite/granite-docling-258M

Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 123k • 1.16k

liked a Space 12 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

Space • Evaluate multilingual models using FineTasks

liked a Space about 1 year ago

OpenLLM French leaderboard 🇫🇷

Space • Explore and submit LLM benchmarks
