pii-ner-nemotron

Model summary

A named-entity recognition (NER) model for extracting personally identifiable information (PII) from multilingual text, trained on the Nemotron dataset.

  • Base model: xlm-roberta-large
  • Repository: scanpatch/pii-ner-nemotron
  • Training run name: pii-ner-nemotron
  • Export timestamp (UTC): 2025-12-29T12:06:13.731145+00:00

Labels

Entity types

  • address
  • address_apartment
  • address_building
  • address_city
  • address_country
  • address_district
  • address_geolocation
  • address_house
  • address_postal_code
  • address_region
  • address_street
  • date
  • document_number
  • email
  • first_name
  • ip
  • last_name
  • middle_name
  • military_individual_number
  • mobile_phone
  • name
  • name_initials
  • nickname
  • organization
  • snils
  • tin
  • vehicle_number
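
Downstream code often does not need all 27 fine-grained types. The sketch below collapses the labels listed above into coarser groups. The grouping itself (every address_* label under "address", the name-like labels under "person") and the helper names coarse_label and group_by_coarse_label are illustrative assumptions, not part of the model or its configuration.

```python
from collections import defaultdict
from typing import Dict, List

# Entity types copied from the list above.
ENTITY_TYPES = [
    "address", "address_apartment", "address_building", "address_city",
    "address_country", "address_district", "address_geolocation",
    "address_house", "address_postal_code", "address_region",
    "address_street", "date", "document_number", "email", "first_name",
    "ip", "last_name", "middle_name", "military_individual_number",
    "mobile_phone", "name", "name_initials", "nickname", "organization",
    "snils", "tin", "vehicle_number",
]

# Hypothetical grouping of name-like labels; adjust to your own policy.
PERSON_LABELS = {
    "first_name", "last_name", "middle_name",
    "name", "name_initials", "nickname",
}


def coarse_label(label: str) -> str:
    """Map a fine-grained label to an illustrative coarse group."""
    if label == "address" or label.startswith("address_"):
        return "address"
    if label in PERSON_LABELS:
        return "person"
    # Everything else keeps its own label.
    return label


def group_by_coarse_label(labels: List[str]) -> Dict[str, List[str]]:
    """Collect fine-grained labels under their coarse group."""
    groups = defaultdict(list)
    for label in labels:
        groups[coarse_label(label)].append(label)
    return dict(groups)


groups = group_by_coarse_label(ENTITY_TYPES)
for group, members in groups.items():
    print(group, "<-", ", ".join(members))
```

Applying coarse_label to the entity_group field of each prediction gives a simpler view for reporting or policy checks while the model itself still predicts the full label set.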

Evaluation

Metric                     Value
test_f1                    0.9768405285513023
test_precision             0.9734942064790006
test_recall                0.9802099354987895
test_accuracy              0.9977181928808507
train_runtime              1693.5057
train_samples_per_second   238.116
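
As a quick consistency check on the figures above, F1 can be recomputed as the harmonic mean of precision and recall. This is a sketch that assumes the reported scores are micro-averaged, so the identity F1 = 2PR / (P + R) holds. The second calculation assumes the two training figures are the standard Hugging Face Trainer metrics, with train_runtime in seconds. Neither assumption is stated in the card.

```python
# Values copied from the evaluation table.
precision = 0.9734942064790006
recall = 0.9802099354987895
reported_f1 = 0.9768405285513023

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"recomputed F1: {f1:.6f} (reported: {reported_f1:.6f})")

# Assumption: runtime is in seconds and throughput is averaged over the run,
# so the product approximates the total samples seen across all epochs.
train_runtime_s = 1693.5057
train_samples_per_second = 238.116
samples_processed = train_runtime_s * train_samples_per_second
print(f"approx. samples processed during training: {samples_processed:,.0f}")
```

If the recomputed F1 diverged noticeably from the reported one, that would suggest a different averaging scheme (for example macro-averaging over entity types).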

How to use

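from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
ner = pipeline(
    "token-classification",
    model="scanpatch/pii-ner-nemotron",
    # Merge sub-word tokens into whole entity spans.
    aggregation_strategy="simple",
)

text = "Contact me at test@example.com and my phone is +380 67 123 45 67."
# Each result is a dict with entity_group, score, word, start, and end.
print(ner(text))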
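
A common next step is to mask the detected spans. The sketch below replaces each span with a [LABEL] placeholder using the start and end character offsets that the token-classification pipeline returns when an aggregation strategy is set. The helper names redact and fake_span are hypothetical, and the sample entities are hand-written stand-ins so the example runs without downloading the model. They are not real model output.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each entity span with a [LABEL] placeholder."""
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = "[" + ent["entity_group"].upper() + "]"
        text = text[: ent["start"]] + placeholder + text[ent["end"]:]
    return text


def fake_span(text: str, value: str, label: str) -> Dict:
    """Hand-written stand-in for one pipeline result (not model output)."""
    start = text.index(value)
    return {
        "entity_group": label,
        "word": value,
        "start": start,
        "end": start + len(value),
        "score": 1.0,
    }


text = "Contact me at test@example.com and my phone is +380 67 123 45 67."
sample_entities = [
    fake_span(text, "test@example.com", "email"),
    fake_span(text, "+380 67 123 45 67", "mobile_phone"),
]

redacted = redact(text, sample_entities)
print(redacted)
```

With the real model, redact(text, ner(text)) would take the place of the hand-written list. This sketch assumes the spans do not overlap. Overlapping predictions would need to be merged or filtered first.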

Model size

  • Parameters: 0.6B
  • Tensor type: F32
  • Weights format: Safetensors