---
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: A fantastical portal opening into another dimension, swirling energy.
- text: Analyze the concept of political trust and its importance for governance.
- text: What makes a particular escape room experience engaging and successful?
- text: What is the function of the lymphatic system?
- text: Desenvolva um conto fictício sobre um mapa antigo que guia para um tesouro
    cultural perdido.
metrics:
- accuracy
pipeline_tag: text-classification
library_name: setfit
inference: true
base_model: ibm-granite/granite-embedding-107m-multilingual
model-index:
- name: SetFit with ibm-granite/granite-embedding-107m-multilingual
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8924137931034483
      name: Accuracy
---

As of 28/07/2025, a simpler approach than using this model is to use one of the [GLiClass models](https://huggingface.co/cnmoro/gliclass-base-v3.0-onnx) and match the user's prompt against the prompt classes. This model will remain available here nonetheless.

------------------------

# SetFit with ibm-granite/granite-embedding-107m-multilingual

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

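The contrastive step above trains on pairs of texts rather than on single examples. The following is a minimal sketch of how such pairs can be built from labeled prompts, assuming the simplest possible scheme: every pair of examples is enumerated, with a target of 1.0 for matching labels and 0.0 otherwise. The helper name `make_contrastive_pairs` is hypothetical, and SetFit itself samples pairs according to its `sampling_strategy` rather than enumerating them all.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only: SetFit samples pairs instead of enumerating all of them.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Three prompts taken from the label examples in this card.
examples = [
    ("What is superconductivity?", "physics"),
    ("Explain the concept of a quantum field.", "physics"),
    ("Explain the concept of intellectual property.", "law"),
]
texts = [text for text, _ in examples]
labels = [label for _, label in examples]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Pairs with target 1.0 are pulled together in embedding space and pairs with target 0.0 are pushed apart, which is what lets a simple classification head separate the classes afterwards.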
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 30 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label              | Examples |
|:-------------------|:---------|
| sentiment_analysis | <ul><li>'Como a análise de sentimento pode melhorar a tomada de decisões estratégicas?'</li><li>'Dê um exemplo de como a ironia afeta a análise de sentimento.'</li><li>'What are the key advantages of using transformer-based models (e.g., BERT, RoBERTa) for sentiment analysis tasks?'</li></ul> |
| marketing          | <ul><li>'What are the emerging trends in voice search optimization for marketing?'</li><li>'How can augmented reality (AR) be integrated into marketing experiences?'</li><li>'How does psychographic segmentation differ from demographic segmentation?'</li></ul> |
| entertainment      | <ul><li>'Explore the challenges of balancing artistic integrity with commercial viability in entertainment.'</li><li>'Discuss the impact of major sporting events as a form of entertainment.'</li><li>"O que torna uma canção um 'hit global' na era da internet?"</li></ul> |
| image_generation   | <ul><li>'Um grupo de pássaros migratórios voando em formação perfeita no céu azul, com nuvens ao fundo.'</li><li>'A photorealistic rendering of a gourmet dish, top-down view, professional food photography.'</li><li>'Uma ponte suspensa de madeira em uma floresta tropical densa, com névoa subindo do chão.'</li></ul> |
| complex_reasoning  | <ul><li>'Implement a neural-symbolic system that combines rule-based reasoning with deep learning to answer complex scientific questions from literature.'</li><li>'Create a method to reconstruct 3D objects from multiple 2D images taken under varying lighting and viewpoints with occlusions.'</li><li>'Descreva métodos para criar jogos educacionais que se adaptem dinamicamente ao progresso do aluno.'</li></ul> |
| education          | <ul><li>'Propose ways to reduce the achievement gap among different socio-economic groups.'</li><li>'O papel do esporte no desenvolvimento integral dos estudantes.'</li><li>'A importância da colaboração entre escolas e comunidade.'</li></ul> |
| mathematics        | <ul><li>'Discuss the concept of convexity in optimization.'</li><li>'How does game theory use mathematical models?'</li><li>'Descreva uma estratégia eficaz para resolver problemas de matemática que envolvem múltiplas etapas.'</li></ul> |
| biology            | <ul><li>'Explique a regulação da temperatura corporal em humanos.'</li><li>'Descreva os diferentes níveis de organização biológica.'</li><li>'O que é a homeostase e por que ela é vital?'</li></ul> |
| extraction         | <ul><li>'Faça uma síntese das principais características dos dados apresentados no relatório de desmatamento, destacando tendências e padrões observáveis.'</li><li>'É seu dever revelar os elementos-chave que explicam a relação entre desmatamento e políticas públicas, destacando causas e soluções propostas.'</li><li>"Build a detailed analysis of the competitor's marketing funnel, from awareness to conversion."</li></ul> |
| engineering        | <ul><li>'What are the main methods of controlled demolition of structures?'</li><li>'Como a sustentabilidade pode ser integrada no projeto e construção de edifícios residenciais?'</li><li>'Discuss the application of artificial intelligence in predictive maintenance of electrical equipment.'</li></ul> |
| ethics             | <ul><li>'Defina ética e moralidade, destacando suas principais diferenças e interconexões.'</li><li>'Is there a universal ethic that applies to all humans?'</li><li>'How do personal values shape ethical choices?'</li></ul> |
| law                | <ul><li>'O que é a Lei Anticorrupção?'</li><li>'Quais os direitos dos animais no direito brasileiro?'</li><li>'Explain the concept of intellectual property.'</li></ul> |
| general_knowledge  | <ul><li>'Qual é a importância da agricultura para a economia brasileira?'</li><li>'Quais espécies animais são consideradas ameaçadas de extinção no Brasil?'</li><li>'O que é a imunidade vacinal e como as vacinas funcionam?'</li></ul> |
| geopolitics        | <ul><li>'Discuss the role of the International Criminal Court in global justice and accountability.'</li><li>'Examine the role of proxy conflicts in modern geopolitical competition.'</li><li>'Examine the role of the World Trade Organization in a protectionist global economy.'</li></ul> |
| summarization      | <ul><li>'Resuma os resultados de uma avaliação educacional nacional.'</li><li>'Resuma um artigo jornalístico investigativo explicando os fatos.'</li><li>'Summarize user experience testing results to prioritize UI improvements.'</li></ul> |
| healthcare         | <ul><li>'Describe the role of a paramedic in the pre-hospital emergency care setting.'</li><li>'What is sepsis and why is it a medical emergency?'</li><li>'Discuss the medical implications of an aging global population.'</li></ul> |
| spiritual          | <ul><li>'A experiência do êxtase espiritual.'</li><li>'Aspectos do misticismo e o inexplicável.'</li><li>'How do you manage expectations on your spiritual journey?'</li></ul> |
| coding             | <ul><li>'Implemente uma função para balancear expressões matemáticas adicionando parênteses corretamente.'</li><li>'Desenvolva um algoritmo que transforme uma expressão regular em um autômato finito determinístico.'</li><li>'Desenvolva um algoritmo para reconhecimento de padrões em strings baseado em autômatos finitos não determinísticos.'</li></ul> |
| tool               | <ul><li>'Leia o conteúdo de uma página web e resuma os principais pontos.'</li><li>'Traduza este texto do português para inglês usando um serviço externo.'</li><li>'Retrieve the top 5 upcoming tech conferences worldwide this year.'</li></ul> |
| politics           | <ul><li>'Explain the process of judicial review and its role in a constitutional government.'</li><li>'Como as campanhas eleitorais influenciam o eleitorado e quais estratégias são utilizadas?'</li><li>'Discuta o funcionamento de um regime presidencialista e suas vantagens e desvantagens.'</li></ul> |
| business           | <ul><li>'Princípios e metodologias da gestão ágil de projetos (Agile) aplicadas a empresas.'</li><li>'O papel do CEO moderno em um cenário de negócios em constante mudança.'</li><li>'How can businesses optimize their operational processes for greater efficiency?'</li></ul> |
| creativity         | <ul><li>'Write a story about a labyrinth that reconfigures itself based on the visitor’s fears.'</li><li>'Escreva uma crônica de humor sobre as peculiaridades do transporte público em uma capital brasileira.'</li><li>'Desenvolva um conto que envolva um segredo escondido dentro de uma música popular brasileira.'</li></ul> |
| physics            | <ul><li>'What is superconductivity?'</li><li>'O que é a física do plasma e onde ela é estudada/aplicada?'</li><li>'Explain the concept of a quantum field.'</li></ul> |
| psychological      | <ul><li>'Identificando e apoiando dificuldades de aprendizagem.'</li><li>'Analyze the concept of social loafing and ways to mitigate it.'</li><li>'Discuss the psychological factors influencing academic performance and learning.'</li></ul> |
| history            | <ul><li>'Analyze the concept of "historical turning points."'</li><li>'Explore the history of human-animal relationships.'</li><li>'How did the fall of the Berlin Wall affect European integration?'</li></ul> |
| translation        | <ul><li>"Convert this folk song from Portuguese to English: 'Asa Branca - Luiz Gonzaga'."</li><li>"Poderia traduzir esta citação filosófica do latim para português: 'Cogito, ergo sum'."</li><li>"Convert this Brazilian lullaby to English: 'Boi da cara preta, pega essa criança que tem medo de careta.'"</li></ul> |
| basic_reasoning    | <ul><li>'Se um carro gasta 10 litros de combustível para percorrer 100 km, quanto gastará em 250 km?'</li><li>'If the sum of two numbers is 35 and their difference is 5, what are the numbers?'</li><li>'O que é maior: 1/2 ou 0,6?'</li></ul> |
| finance            | <ul><li>'Describe the process of a company going public (IPO).'</li><li>'Discuss the role of regulations in preventing financial crises.'</li><li>"Discuss the concept of 'too big to fail' in the banking sector."</li></ul> |
| chemistry          | <ul><li>'Como a temperatura afeta a velocidade das reações químicas?'</li><li>'O que são os ligantes em compostos de coordenação?'</li><li>'What is green chemistry? List and explain at least three of its core principles.'</li></ul> |
| roleplay           | <ul><li>'Personifique um cineasta independendente buscando financiamento para um projeto arriscado.'</li><li>'Atue como um editor de jogos digitais assistindo testes beta e tomando decisões de ajustes finais.'</li><li>'You are a psychologist exploring childhood trauma with a patient using therapeutic techniques.'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8924   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("cnmoro/prompt-router")
# Run inference
preds = model("What is the function of the lymphatic system?")
```

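Since the model predicts one of the 30 labels listed under Model Labels, its output can be used to dispatch a prompt to a suitable backend. The following is a minimal sketch of that idea, not part of the model or of SetFit: `route_prompt`, the handler functions, and the backend names are all hypothetical, and the classifier is passed in as a plain callable so the example runs without downloading anything. In practice the callable could wrap the `model` object loaded above, assuming it returns a single label for a single input string.

```python
from typing import Callable, Dict


def route_prompt(
    prompt: str,
    classify: Callable[[str], object],
    handlers: Dict[str, Callable[[str], str]],
    default: Callable[[str], str],
) -> str:
    """Classify the prompt and pass it to the handler registered for its label."""
    label = str(classify(prompt))
    handler = handlers.get(label, default)
    return handler(prompt)


# Hypothetical handlers: each one stands in for a call to a different backend.
def handle_coding(prompt: str) -> str:
    return f"[code-backend] {prompt}"


def handle_image(prompt: str) -> str:
    return f"[image-backend] {prompt}"


def handle_default(prompt: str) -> str:
    return f"[general-backend] {prompt}"


handlers = {"coding": handle_coding, "image_generation": handle_image}


# Stand-in classifier for illustration; replace with a wrapper around the
# loaded SetFit model, e.g. `lambda p: model(p)` (assumption, not verified here).
def fake_classify(prompt: str) -> str:
    return "coding" if "algorithm" in prompt.lower() else "physics"


print(route_prompt("Write an algorithm to sort a list.", fake_classify, handlers, handle_default))
print(route_prompt("What is superconductivity?", fake_classify, handlers, handle_default))
```

Labels without a registered handler fall through to the default, so new classes can be supported incrementally.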
<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 3   | 11.6859 | 38  |

| Label              | Training Sample Count |
|:-------------------|:----------------------|
| creativity         | 176                   |
| extraction         | 283                   |
| image_generation   | 173                   |
| education          | 181                   |
| summarization      | 174                   |
| chemistry          | 174                   |
| sentiment_analysis | 179                   |
| geopolitics        | 181                   |
| translation        | 179                   |
| history            | 177                   |
| coding             | 158                   |
| politics           | 181                   |
| healthcare         | 178                   |
| business           | 170                   |
| complex_reasoning  | 152                   |
| psychological      | 174                   |
| biology            | 172                   |
| mathematics        | 178                   |
| marketing          | 177                   |
| physics            | 177                   |
| engineering        | 176                   |
| roleplay           | 171                   |
| finance            | 175                   |
| basic_reasoning    | 154                   |
| ethics             | 180                   |
| entertainment      | 180                   |
| tool               | 166                   |
| law                | 173                   |
| spiritual          | 175                   |
| general_knowledge  | 170                   |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 16)
- max_steps: 2400
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- evaluation_strategy: steps
- eval_max_steps: -1
- load_best_model_at_end: True

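To make the `loss: CosineSimilarityLoss` setting concrete, here is a minimal pure-Python sketch of what that loss computes: the mean squared error between the cosine similarity of two embeddings and a target of 1.0 (same label) or 0.0 (different label). The two-dimensional vectors are toy values chosen for illustration, not real embeddings, and the function names are hypothetical. As far as the SetFit documentation describes it, `distance_metric` and `margin` apply only to triplet-style losses, so they likely have no effect with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over (u, v, target) triples."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, same label: no error
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, different label: no error
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # same direction, different label: squared error 1
]

loss = cosine_similarity_loss(toy_pairs)
print(f"loss = {loss:.4f}")
```

Only the third pair contributes, which is the behaviour the fine-tuning relies on: embeddings of prompts from different classes are penalized for pointing in the same direction.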
### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0004 | 1    | 0.2374        | -               |
| 0.0208 | 50   | 0.2111        | -               |
| 0.0417 | 100  | 0.2087        | -               |
| 0.0625 | 150  | 0.1995        | -               |
| 0.0833 | 200  | 0.1984        | 0.1876          |
| 0.1042 | 250  | 0.1894        | -               |
| 0.125  | 300  | 0.1872        | -               |
| 0.1458 | 350  | 0.1818        | -               |
| 0.1667 | 400  | 0.1758        | 0.1587          |
| 0.1875 | 450  | 0.1647        | -               |
| 0.2083 | 500  | 0.1547        | -               |
| 0.2292 | 550  | 0.1404        | -               |
| 0.25   | 600  | 0.1342        | 0.1252          |
| 0.2708 | 650  | 0.1309        | -               |
| 0.2917 | 700  | 0.1209        | -               |
| 0.3125 | 750  | 0.1329        | -               |
| 0.3333 | 800  | 0.1068        | 0.1055          |
| 0.3542 | 850  | 0.1131        | -               |
| 0.375  | 900  | 0.1006        | -               |
| 0.3958 | 950  | 0.1033        | -               |
| 0.4167 | 1000 | 0.1005        | 0.0922          |
| 0.4375 | 1050 | 0.1133        | -               |
| 0.4583 | 1100 | 0.0898        | -               |
| 0.4792 | 1150 | 0.0918        | -               |
| 0.5    | 1200 | 0.0983        | 0.0855          |
| 0.5208 | 1250 | 0.0947        | -               |
| 0.5417 | 1300 | 0.0921        | -               |
| 0.5625 | 1350 | 0.1045        | -               |
| 0.5833 | 1400 | 0.09          | 0.0763          |
| 0.6042 | 1450 | 0.0893        | -               |
| 0.625  | 1500 | 0.0823        | -               |
| 0.6458 | 1550 | 0.0853        | -               |
| 0.6667 | 1600 | 0.0881        | 0.0713          |
| 0.6875 | 1650 | 0.0837        | -               |
| 0.7083 | 1700 | 0.0886        | -               |
| 0.7292 | 1750 | 0.0784        | -               |
| 0.75   | 1800 | 0.0838        | 0.0680          |
| 0.7708 | 1850 | 0.0743        | -               |
| 0.7917 | 1900 | 0.0788        | -               |
| 0.8125 | 1950 | 0.084         | -               |
| 0.8333 | 2000 | 0.0772        | 0.0659          |
| 0.8542 | 2050 | 0.0872        | -               |
| 0.875  | 2100 | 0.0808        | -               |
| 0.8958 | 2150 | 0.0649        | -               |
| 0.9167 | 2200 | 0.0795        | 0.0651          |
| 0.9375 | 2250 | 0.0774        | -               |
| 0.9583 | 2300 | 0.0687        | -               |
| 0.9792 | 2350 | 0.0787        | -               |
| 1.0    | 2400 | 0.0786        | 0.0647          |

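The Epoch column in the table above is consistent with the step number divided by `max_steps` (2400), rounded to four decimals. The short sketch below reproduces a few rows and estimates how many contrastive pairs were processed, assuming each step consumes one batch of 8 pairs (the first element of `batch_size`). The constant and function names are hypothetical, and the pair count is an estimate under that assumption rather than a logged figure.

```python
MAX_STEPS = 2400  # max_steps from the training hyperparameters
EMBEDDING_BATCH_SIZE = 8  # first element of batch_size: (8, 8)


def epoch_at(step, steps_per_epoch=MAX_STEPS):
    """Fraction of the embedding-phase epoch completed at a given step."""
    return round(step / steps_per_epoch, 4)


# Reproduce a few rows of the Epoch column.
for step in (1, 50, 200, 1200, 2400):
    print(step, epoch_at(step))

# Estimated number of text pairs processed during contrastive fine-tuning,
# assuming one batch of EMBEDDING_BATCH_SIZE pairs per step.
pairs_seen = MAX_STEPS * EMBEDDING_BATCH_SIZE
print("pairs seen:", pairs_seen)
```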
### Framework Versions
- Python: 3.11.11
- SetFit: 1.2.0.dev0
- Sentence Transformers: 5.0.0
- Transformers: 4.53.2
- PyTorch: 2.7.1+cu126
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->