---
license: cc-by-sa-4.0
datasets:
- bavarian-nlp/gliner-bavarian-v0.1
language:
- bar
base_model:
- gerturax/gerturax-3
tags:
- GLiNER
- Bavarian
---

# Bavarian GLiNER Model (v0.1)

**GLiNER** is a Named Entity Recognition (NER) model that uses bidirectional transformer encoders (similar to BERT) to detect any type of entity. It offers a practical alternative to traditional NER models, which are restricted to predefined entity types, and to Large Language Models (LLMs), which, while flexible, are often too large and expensive for resource-limited environments.

The initial GLiNER models were trained mainly on English data. Thankfully, [GLiNER-X](https://huggingface.co/collections/knowledgator/gliner-x-684320a3f1220315c651d2f5) improved performance and adaptability across diverse languages using multilingual NER datasets.

However, GLiNER-X does not support Bavarian at the moment, so this repository hosts the first GLiNER model for Bavarian 🥨

The Bavarian GLiNER model uses the strong-performing [GERTuraX-3](https://huggingface.co/gerturax/gerturax-3) as its backbone and was trained on over 100,000 sentences from the [Gemini-powered Bavarian NER Dataset](https://huggingface.co/datasets/bavarian-nlp/gemini-bavarian-ner-v0.1).

# Installation & Usage

Install the latest GLiNER package, including the tokenizers dependency, to get started:

```bash
pip3 install -U "gliner[tokenizers]"
```

After that, the Bavarian GLiNER model is ready to use:

```python
from gliner import GLiNER

# Download and load the Bavarian GLiNER checkpoint from the Hugging Face Hub
model = GLiNER.from_pretrained("bavarian-nlp/gliner-bavarian-v0.1")

text = """Oktobafestln woan friaha in Bayern koa Sejtnheit. Se hom dozua deand, as eihglogade Meaznbia voam Ofong vo da neien Brausaison afz'braucha. D'Wuazln van heiting Mingara Oktobafest gengan 200 Joar zrugg. Zan easchnt Moi hods om 17. Oktoba 1810 stottg'fundn. Om 12. Oktoba 1810 hod ba da Hozadfeia van Kronprinz Ludwig (spada Ludwig I.)
und Prinzessin Therese af ana Wiesn voa dena Stodmauan vo Minga a groß's Pferdlrenna stottg'fundn."""

# Entity types are free-form labels chosen at inference time
label_set = ["location", "organization", "person", "prince", "event", "date"]

# Keep only predictions with a confidence score of at least 0.5
entities = model.predict_entities(text, label_set, threshold=0.5)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
```

This outputs:

```text
Oktobafestln => event
Bayern => location
Mingara Oktobafest => event
17. Oktoba 1810 => date
12. Oktoba 1810 => date
Ludwig => prince
Ludwig I. => prince
Therese => prince
Minga => location
```

# Changelog

* 09.07.2025: Initial version of this repo. More details about evaluation and pretraining will follow!

# Licence

The Bavarian GLiNER model is licensed under CC-BY-SA-4.0.
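
# Grouping predictions by label

The usage example above prints one line per prediction. When several mentions share a label, it can be handy to collect them per entity type. The following is a minimal sketch, not part of the GLiNER package: `group_by_label` is a hypothetical helper, and it assumes only that each prediction is a dict with `"text"` and `"label"` keys, as used in the loop above. The sample list is hand-copied from the output shown above so that the snippet runs without downloading the model.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by label, keeping first-seen order and dropping duplicates.

    Hypothetical helper (not a GLiNER API). Assumes each item is a dict
    with "text" and "label" keys.
    """
    grouped = defaultdict(list)
    for entity in entities:
        text, label = entity["text"], entity["label"]
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# Hand-copied subset of the predictions from the usage example;
# in practice, pass the list returned by model.predict_entities(...).
sample_entities = [
    {"text": "Bayern", "label": "location"},
    {"text": "Mingara Oktobafest", "label": "event"},
    {"text": "17. Oktoba 1810", "label": "date"},
    {"text": "12. Oktoba 1810", "label": "date"},
    {"text": "Minga", "label": "location"},
]

grouped = group_by_label(sample_entities)
for label, texts in grouped.items():
    print(label, "=>", texts)
```

To use it with real predictions, replace `sample_entities` with the `entities` list from the usage example.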