---
license: apache-2.0
language:
- uk
metrics:
- f1
- precision
- recall
base_model:
- 51la5/roberta-large-NER
pipeline_tag: token-classification
library_name: spacy
model-index:
- name: roberta-large-ner-uk
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9468
    - name: NER Recall
      type: recall
      value: 0.9416
    - name: NER F1
      type: f1
      value: 0.9442
tags:
- ner
- uk
datasets:
- lang-uk/UberText-NER-Silver
---
# roberta-large-ner-uk

A transformer-based NER model for Ukrainian, trained on a combination of human-annotated data (NER-UK 2.0) and high-quality silver-standard annotations (UberText-NER-Silver). Fine-tuned from `51la5/roberta-large-NER`, it achieves state-of-the-art performance on a wide range of Ukrainian named entities, with an NER F1 of 0.9442 (precision 0.9468, recall 0.9416).

## Model Details

- **Model type:** Transformer-based encoder (spaCy pipeline)
- **Language (NLP):** Ukrainian
- **License:** Apache 2.0
- **Finetuned from model:** `51la5/roberta-large-NER`
- **Entity Types (13):** `PERS`, `ORG`, `LOC`, `DATE`, `TIME`, `JOB`, `MON`, `PCT`, `PERIOD`, `DOC`, `QUANT`, `ART`, `MISC`
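
Because the label inventory is fixed, downstream code can validate or filter predictions against it. The following is a minimal sketch rather than part of the released pipeline: `ENTITY_LABELS` and `filter_entities` are hypothetical names, and the function only assumes an object that exposes spaCy-style `ents` with `text` and `label_` attributes.

```python
# Label inventory copied from the "Entity Types" list in Model Details.
ENTITY_LABELS = frozenset({
    "PERS", "ORG", "LOC", "DATE", "TIME", "JOB", "MON",
    "PCT", "PERIOD", "DOC", "QUANT", "ART", "MISC",
})


def filter_entities(doc, keep):
    """Return (text, label) pairs for entities whose label is in `keep`.

    `doc` is anything with a spaCy-style `ents` sequence (hypothetical helper,
    not shipped with the model). `keep` must be a subset of ENTITY_LABELS;
    otherwise a ValueError is raised so that typos such as "PERSON" fail early.
    """
    keep = set(keep)
    unknown = keep - ENTITY_LABELS
    if unknown:
        raise ValueError(f"Unknown entity labels: {sorted(unknown)}")
    return [(ent.text, ent.label_) for ent in doc.ents if ent.label_ in keep]
```

Rejecting unknown labels up front is a design choice: a misspelled label would otherwise silently return an empty list.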

## Usage

```python
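import spacy

# Load the pipeline by package name; this assumes the model package is already installed.
nlp = spacy.load("roberta-large-ner-uk")

# Run NER on a Ukrainian sentence.
doc = nlp("Президент України Володимир Зеленський виступив у Брюсселі.")

# Each entity exposes its surface text and its predicted label.
print([(ent.text, ent.label_) for ent in doc.ents])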
```
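
For more than a handful of texts, spaCy's `nlp.pipe` processes inputs in batches instead of one call per string. The sketch below assumes the same loaded pipeline as above; `extract_entities` and the default `batch_size=8` are illustrative choices, not settings recommended by the authors.

```python
def extract_entities(nlp, texts, batch_size=8):
    """Run `nlp` over `texts` in batches and collect entity spans per text.

    Returns one list per input text, each holding
    (text, label, start_char, end_char) tuples. Hypothetical helper.
    """
    results = []
    # nlp.pipe yields Doc objects in the same order as the input texts.
    for doc in nlp.pipe(texts, batch_size=batch_size):
        results.append(
            [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents]
        )
    return results


# Usage, assuming the pipeline package is installed:
# import spacy
# nlp = spacy.load("roberta-large-ner-uk")
# for ents in extract_entities(nlp, ["Перший текст.", "Другий текст."]):
#     print(ents)
```

The character offsets make it possible to map each entity back to its position in the original string.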

## Authors

[Vladyslav Radchenko](https://huggingface.co/pofce), [Nazarii Drushchak](https://huggingface.co/ndrushchak)