Hugging Face

Hanbaike/kyrgyz_spm_tokenizer

Kyrgyz
kyrgyz
tokenization
sentencepiece
BPE
Unigram
kyrgyz_spm_tokenizer/text (442 MB)
  • 1 contributor
History: 1 commit
Hanbaike
Upload folder using huggingface_hub
059fc10 verified 8 months ago
  • kir_community_2017-sentences.txt (55.9 MB)
  • kir_newscrawl_2011_300K-sentences.txt (59 MB)
  • kir_newscrawl_2016_1M-sentences.txt (211 MB)
  • kir_wikipedia_2010_10K-sentences.txt (1.95 MB)
  • kir_wikipedia_2016_300K-sentences.txt (57.9 MB)
  • kir_wikipedia_2021_300K-sentences.txt (56.3 MB)