Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

liked a Space 1 day ago

Qwen/Qwen3-TTS

upvoted a paper 2 days ago

Agentic-R: Learning to Retrieve for Agentic Search

liked a model 2 days ago

XiaomiMiMo/MiMo-V2-Flash

upvoted a paper 2 days ago

Agentic-R: Learning to Retrieve for Agentic Search

Paper • 2601.11888 • Published 9 days ago • 19

upvoted a collection 2 days ago

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 7 items • Updated 6 days ago • 124

upvoted a collection 6 days ago

GLM-4.7

3 items • Updated 7 days ago • 59

upvoted a paper about 1 month ago

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published Dec 18, 2025 • 34

upvoted an article about 1 month ago

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

Article • Dec 15, 2025 • 106

upvoted 2 articles about 2 months ago

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

Article • Dec 9, 2025 • 82

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

Article • Dec 8, 2025 • 48

upvoted an article 2 months ago

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

Article • Nov 21, 2025 • 24

upvoted a collection 3 months ago

PromptMII

Prompt-MII: Meta-Learning Instruction Induction for LLMs. Link to paper: https://arxiv.org/abs/2510.16932 • 4 items • Updated Oct 21, 2025 • 2

upvoted a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

upvoted an article 4 months ago

mem-agent: Equipping LLM Agents with Memory Using RL

Article • Oct 9, 2025 • 32

upvoted a paper 4 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 129

upvoted a collection 4 months ago

Qwen3-Omni

6 items • Updated 26 days ago • 181

upvoted 5 papers 5 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 195

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 229

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28, 2025 • 110

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23, 2025 • 24

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 160

upvoted a collection 5 months ago

NVIDIA Nemotron V2

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 6 days ago • 101

upvoted a paper 5 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97