JetMoE: Reaching Llama2 Performance with 0.1M Dollars • Paper 2404.07413 • Published Apr 11, 2024
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models • Paper 2404.07839 • Published Apr 11, 2024
TransformerFAM: Feedback attention is working memory • Paper 2404.09173 • Published Apr 14, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length • Paper 2404.08801 • Published Apr 12, 2024