Hugging Face

Yikang Shen

YikangS
  • yikangshen

AI & ML interests

None yet

Organizations

IBM, JetMoE, IBM Research

upvoted a collection over 1 year ago

Power-LM

Collection
Dense & MoE LLMs trained with the Power learning rate scheduler. • 3 items • Updated Mar 2 • 16
upvoted a paper over 1 year ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23
upvoted 2 papers almost 2 years ago

Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48

Efficient Continual Pre-training by Mitigating the Stability Gap

Paper • 2406.14833 • Published Jun 21, 2024 • 20
upvoted 2 papers about 2 years ago

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 32

The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 14