Model Card for CLASS-IT

Model Description

CLASS-IT is a 140M-parameter language model based on the LLaMA architecture.

The model is first pre-trained for 8 epochs on a cleaned version of the BabyLM Challenge strict-track dataset. After pre-training, it is instruction-tuned for 10 epochs on two additional datasets (8.7M words in total):

  • a conversational dataset derived from Switchboard, and

  • an educational dataset based on an augmented version of Simple English Wikipedia (to be released soon).
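Below is a minimal usage sketch with the 🤗 Transformers library. Note that the repository ID is a placeholder (the actual Hub ID is not stated in this card), and the plain-text prompt is an assumption, since the instruction template used for tuning is not documented here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository ID -- replace with the actual Hub ID of CLASS-IT.
model_id = "your-org/CLASS-IT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # ~140M params, F32 weights

# Plain-text prompt; the exact instruction format expected by the model is assumed here.
prompt = "Explain what a volcano is in simple terms."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 140M parameters the model runs comfortably on CPU, so no device placement is shown above.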

Evaluation

The model has been submitted to the 2025 BabyLM Challenge – Interaction Track: https://huggingface.co/spaces/BabyLM-community/babylm-leaderboard-2025-all-tasks

Citation

This model was introduced in the paper:
“CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs”
(Capone, Bondielli & Lenci, BabyLM Challenge 2025)
📄 arXiv: 2510.25364

Cite as (BibTeX):

@inproceedings{capone-etal-2025-class,
    title = "{CLASS}-{IT}: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for {B}aby{LM}s",
    author = "Capone, Luca  and
      Bondielli, Alessandro  and
      Lenci, Alessandro",
    editor = "Charpentier, Lucas  and
      Choshen, Leshem  and
      Cotterell, Ryan  and
      Gul, Mustafa Omer  and
      Hu, Michael Y.  and
      Liu, Jing  and
      Jumelet, Jaap  and
      Linzen, Tal  and
      Mueller, Aaron  and
      Ross, Candace  and
      Shah, Raj Sanjay  and
      Warstadt, Alex  and
      Wilcox, Ethan Gotlieb  and
      Williams, Adina",
    booktitle = "Proceedings of the First BabyLM Workshop",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.babylm-main.30/",
    pages = "436--444",
    ISBN = "TODO"
}