temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v1

Full transformers checkpoint derived from OpenMed/OpenMed-PII-mLiteClinical-Base-135M-v1 and tuned for Irish core PII:

  • PPSN
  • account_number
  • bank_routing_number
  • credit_debit_card
  • PASSPORT_NUMBER
  • postcode
  • phone_number
  • email
  • first_name
  • last_name
  • swift_bic

The main focus is English and Irish Gaelic (ga) text in Irish administrative, citizen-support, and HSE-style documents.

Included Artifacts

  • Full transformers model files in the repo root
  • Dynamic int8 ONNX export in onnx/model_quantized.onnx
  • inference_mask.py for the full model
  • inference_mask_onnx.py for the ONNX int8 artifact
  • clean benchmark summaries in eval/

Recommended Inference

Highest accuracy:

python3 inference_mask.py \
  --text "My PPSN is 1234567TW and call me on 087 123 4567." \
  --json

Fast CPU path:

python3 inference_mask_onnx.py \
  --text "My PPSN is 1234567TW and call me on 087 123 4567." \
  --json

Dedicated Eircode example:

python3 inference_mask.py \
  --text "My Eircode is D02 X285." \
  --json
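
For batch use from Python, the commands above can be wrapped in a small helper. This is a minimal sketch, not part of the repo: `build_command` and `mask_text` are hypothetical names, and it assumes the script is run from the repo root and that `--json` prints a single JSON document to stdout. The output schema is not documented here, so the parsed result is returned without further interpretation.

```python
import json
import subprocess
import sys


def build_command(text, script="inference_mask.py"):
    """Build the argv list for one of the bundled inference scripts.

    Only the --text and --json flags shown above are used.
    """
    return [sys.executable, script, "--text", text, "--json"]


def mask_text(text, script="inference_mask.py"):
    """Run the script and parse its stdout.

    Assumption: with --json the script writes one JSON document to stdout.
    """
    result = subprocess.run(
        build_command(text, script),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `mask_text("My Eircode is D02 X285.")` would run the full model; passing `script="inference_mask_onnx.py"` would select the ONNX int8 path instead.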

Benchmarks

Reference comparison on the manual Irish core suite and PPSN regression suites:

| Label | Base OpenMed | Previous Public Model | This Release | ONNX Q8 |
|---|---|---|---|---|
| PPSN | 0.0000 | 0.0800 | 0.8000 | 0.7273 |
| account_number | 0.3333 | 0.3333 | 1.0000 | 1.0000 |
| bank_routing_number | 0.0000 | 0.0000 | 1.0000 | 1.0000 |
| credit_debit_card | 0.1538 | 0.1818 | 1.0000 | 0.3333 |
| PASSPORT_NUMBER | 0.0000 | 0.0000 | 1.0000 | 1.0000 |
| postcode | 0.0000 | 0.0000 | 1.0000 | 1.0000 |
| phone_number | 0.0000 | 0.0000 | 0.8571 | 0.8571 |
| email | 0.7059 | 1.0000 | 1.0000 | 1.0000 |
| first_name | 0.8947 | 0.8947 | 1.0000 | 1.0000 |
| last_name | 0.8889 | 0.8889 | 1.0000 | 1.0000 |
| swift_bic | 0.0000 | 0.0000 | 1.0000 | 1.0000 |

Edge and multilingual PPSN checks:

| Suite | Base OpenMed | Previous Public Model | This Release | ONNX Q8 |
|---|---|---|---|---|
| edge_ppsn | 0.0000 | 0.4211 | 0.5000 | 0.4000 |
| edge_phone_number | 0.1429 | 0.1429 | 0.6316 | 0.5000 |
| multilingual_ppsn | 0.0000 | 0.9704 | 0.9940 | 0.9882 |

Multilingual PPSN throughput on CPU (eval/multilingual_ppsn_v1_all.jsonl):

  • Base OpenMed: 42.30 examples/s
  • Previous public PPSN model: 42.63 examples/s
  • This release: 41.18 examples/s
  • ONNX Q8: 81.99 examples/s
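
The relative cost of each artifact is easier to read as a ratio. The short sketch below uses only the throughput figures listed above; the variable names are illustrative and not part of the repo.

```python
# Throughput figures (examples/s) copied from the list above.
throughput = {
    "Base OpenMed": 42.30,
    "Previous public PPSN model": 42.63,
    "This release": 41.18,
    "ONNX Q8": 81.99,
}

baseline = throughput["This release"]

# Speed of each artifact relative to the full checkpoint of this release.
relative = {name: rate / baseline for name, rate in throughput.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f}x")
```

On these numbers the ONNX Q8 export runs at roughly twice the throughput of the full checkpoint, while the three full-precision models are within a few percent of each other.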

Practical Reading Of The Benchmarks

  • This release is materially better than the previous public PPSN-only model on Irish phones, Eircodes, account details, passport numbers, and names.
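  • The bundled ONNX int8 export roughly doubles CPU throughput (81.99 vs 41.18 examples/s on the multilingual PPSN suite), but it is not accuracy-identical to the full checkpoint.
  • The largest ONNX drop is on credit_debit_card (1.0000 to 0.3333), followed by the edge suites (edge_phone_number 0.6316 to 0.5000, edge_ppsn 0.5000 to 0.4000) and PPSN on the core suite (0.8000 to 0.7273). Use the full model when those matter.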
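
To see where the ONNX export diverges from the full checkpoint, the two columns can be compared row by row. The sketch below is illustrative only: it copies the "This Release" and "ONNX Q8" scores from the benchmark tables above and ranks the rows where the quantized export scores lower.

```python
# (full checkpoint, ONNX Q8) scores copied from the two benchmark tables above.
scores = {
    "PPSN": (0.8000, 0.7273),
    "account_number": (1.0000, 1.0000),
    "bank_routing_number": (1.0000, 1.0000),
    "credit_debit_card": (1.0000, 0.3333),
    "PASSPORT_NUMBER": (1.0000, 1.0000),
    "postcode": (1.0000, 1.0000),
    "phone_number": (0.8571, 0.8571),
    "email": (1.0000, 1.0000),
    "first_name": (1.0000, 1.0000),
    "last_name": (1.0000, 1.0000),
    "swift_bic": (1.0000, 1.0000),
    "edge_ppsn": (0.5000, 0.4000),
    "edge_phone_number": (0.6316, 0.5000),
    "multilingual_ppsn": (0.9940, 0.9882),
}

# Drop = full score minus ONNX score; keep only rows where ONNX is worse.
drops = {
    name: round(full - q8, 4)
    for name, (full, q8) in scores.items()
    if full > q8
}

# Largest regression first.
ranked = sorted(drops.items(), key=lambda item: item[1], reverse=True)

for name, drop in ranked:
    print(f"{name}: -{drop:.4f}")
```

Rows that do not appear in the output are the ones where the quantized export matches the full checkpoint on these suites.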

License And Attribution

  • Model weights in this repo are distributed under Apache-2.0.
  • Base model: OpenMed/OpenMed-PII-mLiteClinical-Base-135M-v1
  • Training data included synthetic Irish data plus attributed upstream data from:
    • joelniklaus/mapa (cc-by-4.0)
    • gretelai/synthetic_pii_finance_multilingual (apache-2.0)
  • See NOTICE for attribution details.

Portfolio Comparison

Updated: 2026-03-16.

Use this section for the fastest public comparison across the temsa PII masking portfolio.

  • The first core table only includes public checkpoints that ship both comparable q8 accuracy and q8 CPU throughput.
  • The first PPSN table only includes public artifacts that ship comparable PPSN accuracy and CPU throughput.
  • Missing cells in the archive tables mean the older release did not ship that metric in its public bundle.
  • DiffMask rows use the reconciled clean_single_pass harness that matches the deployed runtime.
  • GlobalPointer rows use the public raw-only span-matrix release bundle and its packaged q8 ONNX artifact.
  • The same content is shipped as PORTFOLIO_COMPARISON.md inside each public model repo.

Irish Core PII: Comparable Public Checkpoints

| Repo | Stack | Full Core F1 | Q8 Core F1 | Q8 Multilingual PPSN F1 | Q8 Core ex/s |
|---|---|---|---|---|---|
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc6 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 282.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc5 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 282.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc3 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 317.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc2 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 292.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc1 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 337.3 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc29 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 232.7 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc28 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 232.7 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc25 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 212.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc24 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 278.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc23 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 237.6 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc22 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 106.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc21 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 150.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc20 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 181.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc19 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 73.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc18 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 126.2 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc17 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc16 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc15 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc14 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 119.2 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc13 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 126.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc12 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 73.6 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc11 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 94.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc10 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc9 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 119.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc8 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 128.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc7 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 89.0 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc6 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 89.0 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc5 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 84.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc4 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9333 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc3 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9333 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc2 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9222 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc1 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9222 | 61.5 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc4 | GlobalPointer raw-only span-matrix | 1.0000 | 1.0000 | 0.9333 | 221.6 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc3 | GlobalPointer raw-only span-matrix | 1.0000 | 1.0000 | 0.9213 | 204.9 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc2 | GlobalPointer raw-only span-matrix | 0.9934 | 0.9934 | 0.9326 | 231.2 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8 | Raw-only token-span | 0.9737 | 0.9737 | 0.9176 | 46.1 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc7 | Hybrid classifier + generated scanner spec | 1.0000 | 0.9934 | 1.0000 | 30.0 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc6 | Hybrid classifier + repair decoders | 1.0000 | 0.9934 | 1.0000 | 29.5 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc5 | Hybrid classifier + repair decoders | 0.9737 | 0.9669 | 0.9333 | 34.4 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc4 | Hybrid classifier + repair decoders | 0.9870 | 0.9740 | 0.9600 | 114.2 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 | Hybrid classifier + repair decoders | 0.9806 | 0.9677 | 0.9333 | 44.9 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 | Hybrid classifier + repair decoders | 0.9554 | 0.9615 | 0.7887 | 119.1 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v1 | Hybrid classifier baseline | 0.9530 | 0.9333 | 0.9882 | 103.3 |
| temsa/IrishCore-DiffMask-135M-v1-rc6 | DiffMask token-span, scanner-free | 0.9801 | 0.9733 | 0.9274 | 130.3 |
| temsa/IrishCore-DiffMask-135M-v1-rc5 | DiffMask token-span, scanner-free | 0.9733 | 0.9733 | 0.9379 | 249.2 |
| temsa/IrishCore-DiffMask-135M-v1-rc4 | DiffMask token-span, scanner-free | 0.9733 | 0.9733 | 0.9371 | 29.5 |
| temsa/IrishCore-DiffMask-135M-v1-rc3 | DiffMask token-span, scanner-free | 0.9664 | 0.9664 | 0.9591 | 30.0 |
| temsa/IrishCore-DiffMask-135M-v1-rc2 | DiffMask token-span, scanner-free | 0.9664 | 0.9664 | 0.9212 | 247.1 |
| temsa/IrishCore-DiffMask-135M-v1-rc1 | DiffMask token-span, scanner-free | 0.9801 | 0.9934 | 0.9412 | 251.2 |

Irish Core PII: Other Public Checkpoints

| Repo | Stack | Full Core F1 | Q8 Core F1 | Q8 Multilingual PPSN F1 | Notes |
|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1 | Hybrid classifier prototype | 0.9487 | | | Predates the public q8 artifact. |

Finance-boundary q8 F1 is 1.0000 for OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc6, OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc7, OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8, and all public IrishCore-DiffMask releases from rc1 to rc6. OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc5 ships 0.8750 on that public q8 suite.

PPSN-Only: Comparable Public Artifacts

| Repo | Artifact | Irish Large F1 | Multilingual PPSN F1 | User Raw F1 | QA v8 F1 | CPU ex/s |
|---|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1 | fp32 canonical checkpoint | 0.8979 | 0.9704 | 0.8000 | 0.7385 | 57.4 |
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1-fp16 | fp16 CPU/GPU artifact | | 0.9704 | 0.8000 | 0.7385 | 45.8 |
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1-q8 | dynamic int8 CPU artifact | | 0.9040 | | | 132.1 |

PPSN-Only: Historical Public Checkpoints

| Repo | Main Published Metrics | Notes |
|---|---|---|
| temsa/OpenMed-PPSN-mLiteClinical-v1 | same as canonical fp32 repo: multilingual 0.9704, user raw 0.8000 | Legacy alias; prefer temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1. |
| temsa/OpenMed-PPSN-v6-raw-rc2 | irish_reg_v5 0.8750; user_raw 0.8000; qa_v8 0.7385 | Raw PPSN-only research checkpoint; no packaged multilingual CPU benchmark row. |
| temsa/OpenMed-PPSN-v5_1 | irish_large_v2 raw 0.9285; qa_v6 hybrid strict 1.0000 | Hybrid PPSN-only checkpoint; predates the canonical multilingual suite packaging. |
| temsa/OpenMed-PPSN-v5 | irish_reg_v5 raw 0.8235; irish_reg_v5 hybrid strict 1.0000 | Hybrid PPSN-only checkpoint; predates the canonical multilingual suite packaging. |
| temsa/OpenMed-PPSN-v4 | synthetic non-PPSN drift check only | Predates the current PPSN eval suite; no packaged apples-to-apples multilingual CPU row. |

If you need the strongest current raw-only Irish core model, start with IrishCore-GlobalPointer-135M-v1-rc4. If you need the fastest CPU-first raw-only line, compare it against IrishCore-DiffMask-135M-v1-rc6. If you need a PPSN-only artifact, compare the canonical fp32, fp16, and q8 variants of OpenMed-mLiteClinical-IrishPPSN-135M-v1 directly in the table above.
