TEXT_Datasets
Datasets for fine-tuning, instruction tuning, and evaluation of text models from projecte-aina
projecte-aina/ceil
Note: Named Entity Recognition
projecte-aina/catalanqa
Note: QA dataset
projecte-aina/GuiaCat
Note: Sentiment analysis
projecte-aina/CaWikiTC
Note: Text classification
projecte-aina/ancora-ca-ner
Note: Named Entity Recognition
projecte-aina/teca
Note: Textual entailment
projecte-aina/viquiquad
Note: Extractive-QA
projecte-aina/xquad-ca
Note: Cross-lingual-QA, Extractive-QA
projecte-aina/WikiCAT_ca
Note: Text classification
projecte-aina/Parafraseja
Note: Paraphrase
projecte-aina/sts-ca
Note: Semantic Textual Similarity
projecte-aina/wnli-ca
Note: Textual entailment
projecte-aina/tecla
Note: Text classification
projecte-aina/vilaquad
Note: Extractive-QA
projecte-aina/catalan_general_crawling
Note: A 435-million-token web corpus of Catalan, mainly intended for pretraining language models and word representations.
projecte-aina/raco_forums
Note: A 19-million-sentence corpus of Catalan user-generated text built from the forums, mainly intended for pretraining language models and word representations.
projecte-aina/catalan_government_crawling
Note: A 39-million-token web corpus of Catalan, mainly intended for pretraining language models and word representations.
projecte-aina/catalan_textual_corpus
Note: A 1760-million-token web corpus of Catalan, mainly intended for pretraining language models and word representations.
projecte-aina/CoQCat
Note: Conversational QA
projecte-aina/caBreu
Note: Summarization
projecte-aina/CaSERa-catalan-stance-emotions-raco
Note: Emotion and dynamic stance detection
projecte-aina/InToxiCat
Note: Abusive language detection
projecte-aina/UD_Catalan-AnCora
Note: POS tagging
projecte-aina/CaSSA-catalan-structured-sentiment-analysis
Note: Sentiment analysis
projecte-aina/CaSET-catalan-stance-emotions-twitter
Note: Emotion, static stance, and dynamic stance detection.
projecte-aina/COPA-ca
Note: Commonsense reasoning
projecte-aina/xnli-ca
Note: Textual entailment
projecte-aina/casum
Note: Summarization
projecte-aina/vilasum
Note: Summarization
projecte-aina/CATalog
Note: Language Modeling
projecte-aina/mgsm_ca
Note: Question Answering
projecte-aina/MentorES
Note: Instruction Tuning
projecte-aina/MentorCA
Note: Instruction Tuning
projecte-aina/openbookqa_ca
Note: Question Answering
projecte-aina/PAWS-ca
Note: Paraphrase Identification
projecte-aina/NLUCat
Note: Intent classification, span identification, and example generation.
projecte-aina/siqa_ca
Note: Multiple Choice Question Answering
projecte-aina/piqa_ca
Note: Multiple Choice Question Answering
projecte-aina/xstorycloze_ca
Note: Multiple Choice Commonsense Reasoning
projecte-aina/arc_ca
Note: Multiple Choice Question Answering
projecte-aina/oasst1_ca
Note: Instruction Tuning
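
The entries above are Hugging Face Hub repository IDs, so any of them can in principle be loaded with the `datasets` library. The following is a minimal sketch, not an official recipe from projecte-aina: the helper names `repo_id` and `load_collection_dataset` are hypothetical, and it assumes `datasets` is installed and the Hub is reachable. Some repositories may need extra arguments (for example a configuration name), which can be passed through as keyword arguments.

```python
ORG = "projecte-aina"  # organisation prefix used by every entry in the list


def repo_id(name: str) -> str:
    """Build the full Hub repository ID from a short dataset name (hypothetical helper)."""
    return f"{ORG}/{name}"


def load_collection_dataset(name: str, **kwargs):
    """Download a dataset from the list by its short name (hypothetical helper).

    Assumes `pip install datasets` and network access. Split and column
    names differ between datasets, so inspect the returned object before
    relying on any particular field.
    """
    # Imported lazily so the helpers above work without the optional dependency.
    from datasets import load_dataset

    return load_dataset(repo_id(name), **kwargs)
```

For example, `load_collection_dataset("catalanqa")` would request `projecte-aina/catalanqa`; printing the result shows which splits and columns that dataset provides.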