---
language: pl
tags:
- ner
- polish
- token-classification
- onnx
datasets:
- custom
widget:
- text: "Wczoraj kupiłem w Biedronce mleko za 4,99 zł"
- text: "Spotkałem Pawła Kowalskiego w Warszawie"
- text: "W przyszły wtorek jadę do Krakowa"
---

# nodoki-ner-polish

A custom Polish NER model for [nodoki.app](https://github.com/yourusername/nodoki), a local-first PWA for personal information management.

## Model Description

This model is a DistilBERT checkpoint fine-tuned for Polish Named Entity Recognition (NER) with the following custom entity types:

- **PERSON** - Names of people (👤 osoba)
- **POLISH_CITY** - Polish cities (📍 miasto)
- **FOREIGN_CITY** - International cities (📍 miasto)
- **POLISH_STORE** - Polish retail chains (🛒 sklep)
- **AMOUNT_PLN** - Money amounts in PLN (💰 kwota PLN)
- **MONTH** - Polish month names (📅 miesiąc)
- **WEEKDAY** - Days of the week (📆 dzień tygodnia)
- **DATE_RELATIVE** - Relative dates like wczoraj, dzisiaj, jutro (🕐 data względna)
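
For display purposes, the entity types above can be mapped to their emoji and Polish labels. The following is a minimal sketch: the names `ENTITY_LABELS` and `describeEntity` are hypothetical, and it assumes the model may emit tags with BIO prefixes (`B-`, `I-`), which this card does not confirm.

```javascript
// Display metadata for each entity type, copied from the list above.
const ENTITY_LABELS = {
  PERSON: { emoji: '👤', label: 'osoba' },
  POLISH_CITY: { emoji: '📍', label: 'miasto' },
  FOREIGN_CITY: { emoji: '📍', label: 'miasto' },
  POLISH_STORE: { emoji: '🛒', label: 'sklep' },
  AMOUNT_PLN: { emoji: '💰', label: 'kwota PLN' },
  MONTH: { emoji: '📅', label: 'miesiąc' },
  WEEKDAY: { emoji: '📆', label: 'dzień tygodnia' },
  DATE_RELATIVE: { emoji: '🕐', label: 'data względna' },
};

// Assumption: tags may carry a BIO prefix ("B-" or "I-"); strip it before lookup.
// Unknown tags (for example "O") are returned unchanged.
function describeEntity(tag) {
  const type = tag.replace(/^[BI]-/, '');
  const meta = ENTITY_LABELS[type];
  return meta ? `${meta.emoji} ${meta.label}` : type;
}
```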

## Performance

- **F1 Score**: 0.9985
- **Format**: ONNX quantized (uint8)
- **Size**: ~130 MB

## Usage with Transformers.js

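```javascript
import { pipeline } from '@xenova/transformers';

// Download the tokenizer and ONNX weights from the Hugging Face Hub
// and build a token-classification pipeline.
const classifier = await pipeline(
  'token-classification',
  'HerqAI/nodoki-ner-polish'
);

// Run NER on a Polish sentence; the result is an array of tagged tokens.
const result = await classifier('Wczoraj kupiłem w Biedronce mleko za 4,99 zł');
console.log(result);
```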
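
The pipeline returns one entry per tagged sub-word token, so multi-token entities usually need to be merged before display. The sketch below shows one possible way to do that. It assumes each entry has `entity`, `score`, `index`, and `word` fields, that labels follow a BIO scheme, and that WordPiece continuations start with `##`; none of these are confirmed by this card. The helper `groupEntities` and the `sample` data are hypothetical, and the scores are illustrative, not real model output.

```javascript
// Merge consecutive token-level predictions into entity spans.
// Assumptions: BIO-style labels and "##" WordPiece continuation markers.
function groupEntities(tokens) {
  const groups = [];
  let current = null;
  for (const tok of tokens) {
    if (tok.entity === 'O') {
      current = null;
      continue;
    }
    const match = /^([BI])-(.+)$/.exec(tok.entity);
    // Unprefixed labels are treated like "I-" so adjacent tokens still merge.
    const prefix = match ? match[1] : 'I';
    const type = match ? match[2] : tok.entity;
    const isSubword = tok.word.startsWith('##');
    const piece = isSubword ? tok.word.slice(2) : tok.word;
    const continues =
      current !== null &&
      tok.index === current.lastIndex + 1 && // tokens must be adjacent
      current.type === type &&
      (prefix === 'I' || isSubword);
    if (continues) {
      current.text += isSubword ? piece : ' ' + piece;
      current.scores.push(tok.score);
      current.lastIndex = tok.index;
    } else {
      current = { type, text: piece, scores: [tok.score], lastIndex: tok.index };
      groups.push(current);
    }
  }
  // Report the lowest token score as a conservative span-level confidence.
  return groups.map(({ type, text, scores }) => ({
    type,
    text,
    score: Math.min(...scores),
  }));
}

// Hypothetical token-level output, for illustration only.
const sample = [
  { entity: 'B-DATE_RELATIVE', score: 0.99, index: 1, word: 'Wczoraj' },
  { entity: 'B-POLISH_STORE', score: 0.98, index: 5, word: 'Bied' },
  { entity: 'I-POLISH_STORE', score: 0.97, index: 6, word: '##ron' },
  { entity: 'I-POLISH_STORE', score: 0.96, index: 7, word: '##ce' },
];
const grouped = groupEntities(sample);
console.log(grouped);
```

Words are joined with a single space, so entities that contain punctuation (such as `4,99 zł`) may come out with extra spaces and need further cleanup.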

## Training Details

- Base model: `distilbert-base-multilingual-cased`
- Training samples: 5,000
- Validation samples: 1,000
- Test samples: 750
- Epochs: 4
- Learning rate: 2e-5
- Batch size: 16

## License

Apache 2.0