minpeter's Collections
[Dataset] Pretrain-corpus
Updated Jul 22
PleIAs/common_corpus • Viewer • Updated Jun 10 • 470M • 41.4k • 320
HuggingFaceFW/fineweb • Viewer • Updated Jul 11 • 52.5B • 202k • 2.48k
HuggingFaceFW/fineweb-edu • Viewer • Updated Jul 11 • 3.5B • 238k • 828
HuggingFaceFW/fineweb-2 • Viewer • Updated Oct 27 • 4.48B • 85.9k • 704
data-is-better-together/fineweb-c • Viewer • Updated Jul 8 • 88.7k • 1.46k • 57
allenai/dolmino-mix-1124 • Viewer • Updated Oct 29 • 170M • 30.6k • 87
allenai/dolma • Updated Apr 17, 2024 • 1.99k • 964
allenai/olmo-mix-1124 • Viewer • Updated Aug 19 • 621M • 29.3k • 82
mlfoundations/dclm-baseline-1.0 • Preview • Updated Jul 22, 2024 • 297k • 248
Zyphra/Zyda-2 • Preview • Updated Aug 6 • 141k • 85