Key-Value Means Collection Models featured in the Key-Value Means paper. • 23 items • Updated 1 day ago • 1
GLiNER-Relex: A Unified Framework for Joint Named Entity Recognition and Relation Extraction Paper • 2605.10108 • Published 2 days ago • 1
A Causal Language Modeling Detour Improves Encoder Continued Pretraining Paper • 2605.12438 • Published 1 day ago • 2
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content Paper • 2506.20331 • Published Jun 25, 2025 • 6
Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling Paper • 2604.28075 • Published 13 days ago • 19
German LLM Benchmarks Collection Improved German versions of widely used LLM benchmarks • 4 items • Updated 9 days ago • 1
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models Paper • 1511.09249 • Published Nov 30, 2015 • 1
BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs Paper • 2604.02045 • Published Apr 2 • 36
Decoding Text Spans for Efficient and Accurate Named-Entity Recognition Paper • 2604.20447 • Published 21 days ago • 2
GlotSuite Collection GlotSuite: Paving the Way for Bringing Generative AI to Underserved Communities • 17 items • Updated 28 days ago • 3
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs nielsr • Apr 7 • 61
fiNERweb Collection A multilingual dataset for NER covering 91 languages and 25 scripts • 3 items • Updated Dec 16, 2025 • 3
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published Mar 19 • 31