token classification models

20 models · ranked by HuggingFace downloads

indonesian-roberta-base-posp-tagger

indonesian-roberta-base-posp-tagger is a RoBERTa-based part-of-speech tagger for Indonesian text: each input token receives a POS label aligned to its text position.

2,558,549 ↓ · 10 ♡

bert-base-NER

bert-base-NER is a BERT-base model fine-tuned for named entity recognition: each input token receives an entity label aligned to its text position.

1,811,772 ↓ · 720 ♡

stanford-deidentifier-base

stanford-deidentifier-base uses a BERT encoder with a per-token classification head to flag identifying information in text for de-identification. NER-style fine-tunes of this kind commonly use the BIO tagging scheme.

1,255,786 ↓ · 81 ♡
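
The BIO scheme mentioned above marks the first token of an entity with B-, continuation tokens with I-, and everything else with O. A minimal sketch of turning such per-token tags into entity spans follows; the function name, example sentence, and tag values are illustrative assumptions, not the output of any model listed here.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag extends the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity (if any) and skip this token.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical example input.
tokens = ["Angela", "Merkel", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

Dropping a stray I- tag, as this sketch does, is one of several reasonable policies; some decoders instead treat it as the start of a new entity.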

wikineural-multilingual-ner

wikineural-multilingual-ner is a multilingual named entity recognition model: each input token receives an entity label aligned to its text position.

888,160 ↓ · 165 ♡

A DistilBERT model fine-tuned on Twitter/X data for token classification tasks, likely part-of-speech tagging or named entity recognition on social media text. BERTweet-based initialization means it handles informal spelling, hashtags, and abbreviations better than standard BERT. The training split and label schema are not publicly documented.

749,718 ↓ · 0 ♡

fullstop-punctuation-multilang-large

fullstop-punctuation-multilang-large restores punctuation in unpunctuated text: it labels each token with the punctuation mark, if any, that should follow it. This is useful for cleaning up speech-to-text transcripts.

664,441 ↓ · 177 ♡
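
Punctuation restoration is token classification with punctuation marks as the label set. The sketch below shows how per-word labels can be merged back into text; the label set (with "0" meaning no punctuation) and the example predictions are assumptions for illustration, so check the model card for the actual labels.

```python
from typing import List


def apply_punctuation(words: List[str], labels: List[str]) -> str:
    """Append each predicted mark to its word; "0" means no punctuation."""
    pieces = []
    for word, label in zip(words, labels):
        pieces.append(word if label == "0" else word + label)
    return " ".join(pieces)


# Hypothetical predictions, one label per word.
words = ["hello", "how", "are", "you", "i", "am", "fine"]
labels = [",", "0", "0", "?", "0", "0", "."]
print(apply_punctuation(words, labels))
```

A real pipeline would also need to merge subword pieces back into words before this step and would typically restore capitalization separately.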

bert-large-cased-finetuned-conll03-english

bert-large-cased-finetuned-conll03-english is a cased BERT-large model fine-tuned on the English CoNLL-2003 dataset for named entity recognition, assigning an entity label to each token in a sequence.

652,031 ↓ · 96 ♡

xlm-roberta-large-ner-hrl

xlm-roberta-large-ner-hrl uses an XLM-RoBERTa large encoder with a per-token classification head for multilingual named entity recognition. NER fine-tunes of this kind commonly use the BIO tagging scheme.

573,311 ↓ · 15 ♡

punctuate-all

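punctuate-all uses a RoBERTa-family encoder with a per-token classification head. As its name indicates, it targets punctuation restoration rather than NER: each token is labeled with the punctuation mark, if any, that should follow it.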

537,291 ↓ · 28 ♡

layoutreader

layoutreader is a token classification model, built on Microsoft's LayoutLMv3, that predicts reading order in document layouts. It classifies tokens by their sequential reading position, which makes it useful for document parsing pipelines that need to linearize visual document structure.

517,756 ↓ · 43 ♡

sat-3l-sm

SAT-3l-sm (Segment Any Text, 3-layer small) is a multilingual text segmentation model supporting over 85 languages, designed to split continuous text into meaningful sentence or paragraph segments. Unlike rule-based sentence tokenisers that rely on punctuation, SAT uses a contextual XLM-RoBERTa-based token classifier, so it can handle languages with unusual or absent punctuation conventions. The small variant trades some accuracy for faster inference.

475,569 ↓ · 12 ♡

deid_roberta_i2b2

deid_roberta_i2b2 is a RoBERTa model fine-tuned on the i2b2 de-identification dataset to detect and classify protected health information (PHI) in clinical notes. It identifies PHI spans such as names, dates, locations, and IDs. MIT-licensed for integration into clinical NLP de-identification pipelines.

436,119 ↓ · 39 ♡
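
Once a de-identification model has located PHI spans, a pipeline typically replaces them with placeholders. A minimal sketch, assuming the spans arrive as non-overlapping (start, end, label) character offsets; the label names and the example note are hypothetical and do not reflect this model's actual label schema.

```python
from typing import List, Tuple


def redact(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end, label) character span with a [LABEL] placeholder.

    Assumes the spans do not overlap.
    """
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


# Hypothetical note and predicted spans.
note = "John Smith was admitted on 2020-01-05."
spans = [(0, 10, "PATIENT"), (27, 37, "DATE")]
print(redact(note, spans))
```

Replacing from the end of the string backwards avoids having to recompute offsets after each substitution changes the text length.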

llmlingua-2-xlm-roberta-large-meetingbank

llmlingua-2-xlm-roberta-large-meetingbank is an LLMLingua-2 prompt compression model: an XLM-RoBERTa large token classifier, trained on MeetingBank data, that labels each token as keep or discard so that prompts can be shortened before they are sent to an LLM.

430,411 ↓ · 28 ♡
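
LLMLingua-2 treats prompt compression as token classification: each token gets a probability of being preserved, and the highest-scoring tokens are kept in their original order. The sketch below illustrates only that selection step; the function name, the probabilities, and the rounding rule are illustrative assumptions, not the library's API.

```python
from typing import List


def compress(tokens: List[str], keep_probs: List[float], rate: float) -> List[str]:
    """Keep the top `rate` fraction of tokens by keep-probability, in original order."""
    n_keep = max(1, round(len(tokens) * rate))
    # Rank token indices from most to least likely to be preserved.
    ranked = sorted(range(len(tokens)), key=lambda i: keep_probs[i], reverse=True)
    # Restore the original word order among the selected tokens.
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept]


# Hypothetical tokens and keep-probabilities.
tokens = ["the", "council", "will", "vote", "on", "the", "budget", "tomorrow"]
probs = [0.10, 0.90, 0.30, 0.95, 0.20, 0.15, 0.92, 0.80]
print(" ".join(compress(tokens, probs, rate=0.5)))
```

Selecting by rank rather than by a fixed probability threshold makes the output length predictable for a given target rate.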

ner-english-fast

ner-english-fast is Flair's fast English NER model, built on the Flair framework's sequence labeling approach with character-level language model embeddings. 'Fast' indicates a smaller, speed-optimized variant of Flair's standard NER model. It recognizes the standard entity classes PER, ORG, LOC, and MISC.

415,692 ↓ · 26 ♡

bert-base-NER-Russian

bert-base-NER-Russian is a BERT-base model fine-tuned for named entity recognition on Russian text. It handles the common NER categories (persons, organizations, locations) using the standard token-classification approach.

357,936 ↓ · 22 ♡

privacy-filter

Built for token classification and NER, privacy-filter is a model with publicly available weights. privacy-filter is Apache 2.0-licensed, clearing it for closed-source and paid products. Check the privacy-filter model card for benchmarks and intended use before adopting it.

316,092 ↓ · 1,583 ♡

roberta-large-ner-english

roberta-large-ner-english is a RoBERTa-based open-weight model aimed at token classification and NER. Its permissive MIT terms let it go straight into commercial pipelines. Read the model card for hardware requirements and licensing fine print before deploying.

308,637 ↓ · 79 ♡

bert-portuguese-ner

Built for token classification and NER, bert-portuguese-ner is a BERT-based model with publicly available weights. It is MIT-licensed, clearing it for closed-source and paid products. Check the model card for benchmarks and intended use before adopting it.

304,136 ↓ · 9 ♡

bert-base-chinese-ws

bert-base-chinese-ws is an open-weight token classification model in the BERT family; the "ws" suffix indicates Chinese word segmentation rather than NER. It is distributed under GPL-3.0, a copyleft license worth reading before you ship. Like most open checkpoints, it rewards a quick in-domain eval before commitment.

296,571 ↓ · 19 ♡

bert-base-multilingual-cased-ner-hrl

bert-base-multilingual-cased-ner-hrl is a BERT-based open-weight model focused on multilingual token classification and NER. It lists a non-standard license, so confirm permissions before deployment. Check the model card for benchmarks and intended use before adopting it.

292,339 ↓ · 82 ♡
