GLiNER Community


https://github.com/urchade/GLiNER/

Activity Feed

AI & ML interests

NER, relation extraction, information retrieval

Recent Activity

Ihor updated a model 25 days ago

gliner-community/gliner_large-v2.5

Ihor updated a model 25 days ago

gliner-community/gliner_medium-v2.5

Ihor updated a model 25 days ago

gliner-community/gliner_small-v2.5


Ihor

updated 3 models 25 days ago

updated a Space 28 days ago

README

🏃

urchade

authored a paper about 1 month ago

Pioneer Agent: Continual Improvement of Small Language Models in Production

Paper • 2604.09791 • Published Apr 10 • 11

Ihor

authored a paper 3 months ago

The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder

Paper • 2602.18487 • Published Feb 11 • 6

Ihor

published a Space 3 months ago

README

🏃

Ihor

posted an update 3 months ago

Post

364

🧠 One Model to Classify, Verify, and Guard — Meet GLiClass-Instruct

When we first released GLiClass, it was a fast, zero-shot text classifier that could rival cross-encoders at a fraction of the cost. But classification alone wasn't enough. Our real ambition was a single, lightweight model that could handle a diverse range of text-understanding tasks by framing them as classification.
We are excited to announce GLiClass-Instruct – a leap forward that transforms GLiClass from a classifier into an instruction-following, multi-task engine.
What's new:
▪️ Hierarchical labeling: organize labels into structured groups for complex taxonomies
▪️ In-context learning via examples: provide few-shot examples to adapt on the fly, no fine-tuning needed
▪️ Prompting support: guide classification behavior with natural-language task descriptions
▪️ EWC for preventing catastrophic forgetting: add new capabilities without losing old ones
▪️ 3x faster inference thanks to FlashDeBERTa
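To make the EWC item above concrete, here is a minimal sketch of the standard Elastic Weight Consolidation penalty: a quadratic pull toward the old parameters, weighted by an estimate of how important each parameter was for the old task. This is not GLiClass's training code; the function name `ewc_penalty`, the flat parameter lists, and the `lam` value are assumptions made for illustration.

```python
def ewc_penalty(params, old_params, fisher, lam=1.0):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i) ** 2.

    params     -- current parameter values (flat list, hypothetical layout)
    old_params -- parameter values after training on the previous task
    fisher     -- per-parameter importance estimates (diagonal Fisher information)
    """
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("params, old_params and fisher must have the same length")
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2 for p, p_old, f in zip(params, old_params, fisher)
    )


# Toy values: the first parameter is important for the old task (high Fisher),
# the second is not (zero Fisher), so only drift in the first is penalised.
old = [1.0, -2.0, 0.5]
fisher = [10.0, 0.0, 1.0]
new = [1.1, 3.0, 0.5]
penalty = ewc_penalty(new, old, fisher, lam=2.0)
```

In training, this term would be added to the new-task loss, so parameters that mattered for earlier capabilities stay close to their previous values while unimportant ones are free to move.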
New multi-task capabilities:
Beyond topic classification and sentiment analysis, GLiClass now supports:
▪️ Hallucination detection: verify whether LLM outputs are grounded in context
▪️ Rule-following verification: check if content complies with custom guidelines
▪️ Safety classification: detect prompt injections, jailbreaks, and harmful requests
These tasks are crucial for building reliable and efficient agentic systems, where every LLM output needs to be verified, every user input needs to be screened, and every response needs to follow the rules, all at minimal latency.
We release 3 instruction-following models (edge, base, large), with the large model matching SoTA classification models while unlocking entirely new task categories.
🔗 Explore more:
GitHub repo: https://github.com/Knowledgator/GLiClass
Models: https://huggingface.co/knowledgator/gliclass-multitask-large-v1.0
Our other solutions: https://www.knowledgator.com/

Ihor

posted an update 4 months ago

Post

318

Meet **GLinker** — an ultra-fast, modular, **zero-shot entity linking** framework 🚀

When we introduced the **GLiNER bi-encoder** in 2024, it enabled efficient zero-shot NER across hundreds of entity types. But that was just the beginning. Our bigger goal was always clear: **linking text to millions of entities dynamically, without retraining**.

In other words: **true entity linking at scale** ⚡

This unlocks powerful applications:
▪️ More precise search with real-world entity disambiguation
▪️ Knowledge graph construction across diverse document collections
▪️ Wikification — turning raw text into richly linked, navigable knowledge

After nearly two years of research + engineering, this vision is now real.

We’re excited to release **GLinker** — a **production-ready**, zero-shot entity linking system powered by our novel **GLiNER bi-encoder**. It efficiently detects entity spans of any length and matches them directly to entity descriptions — **no retraining required**.

**Why GLinker?**
▪️ Production-ready: multi-layer caching (Redis → Elasticsearch → PostgreSQL)
▪️ Research-friendly: fully configurable YAML pipelines
▪️ High performance: precomputed embeddings for bi-encoder models
▪️ Scalable by design: DAG-based execution + efficient batch processing

GLinker transforms raw text into **structured, disambiguated entity mentions**, bridging unstructured language with large, evolving knowledge bases.
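The multi-layer caching idea (Redis → Elasticsearch → PostgreSQL) can be sketched as a read-through lookup that tries the fastest store first and backfills faster layers on a hit. This is only an illustration under stated assumptions: plain dictionaries stand in for the three backends, and the class name `TieredLookup`, the entity record, and its fields are hypothetical rather than GLinker's actual API.

```python
class TieredLookup:
    """Read-through lookup over an ordered list of stores, fastest first."""

    def __init__(self, layers):
        # layers: list of (name, dict-like store), ordered fastest to slowest
        self.layers = layers

    def get(self, key):
        for depth, (name, store) in enumerate(self.layers):
            if key in store:
                value = store[key]
                # Backfill every faster layer so the next lookup hits earlier.
                for _, faster in self.layers[:depth]:
                    faster[key] = value
                return value, name
        return None, None


# Plain dicts stand in for Redis, Elasticsearch and PostgreSQL (assumption).
redis_like, es_like = {}, {}
pg_like = {"Q90": {"label": "Paris", "description": "capital of France"}}
kb = TieredLookup(
    [("redis", redis_like), ("elasticsearch", es_like), ("postgres", pg_like)]
)

first = kb.get("Q90")    # served by the slowest layer, then backfilled
second = kb.get("Q90")   # now served by the fastest layer
missing = kb.get("Q0")   # unknown key falls through every layer
```

The ordering trades latency for capacity: the fastest layer holds only recently used entities, while the slowest one holds the full knowledge base.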

🔗 Explore more:
GitHub: https://github.com/Knowledgator/GLinker
Report: https://github.com/Knowledgator/GLinker/blob/main/papers/GLiNER_bi_Encoder_paper.pdf
Linking models: https://huggingface.co/collections/knowledgator/gliner-linker
Bi-encoder models: https://huggingface.co/collections/knowledgator/gliner-bi-encoder

Ihor

posted an update 7 months ago

Post

1332

Hey builders 👷‍♀️

We’re Knowledgator, the team behind open-source NLP models like GLiNER, GLiClass, and many others used for zero-shot text classification and information extraction.

If you’ve explored them on Hugging Face or used our frameworks from GitHub, we’d love your input:
🧩 Which of our models, like GLiNER or zero-shot classifiers, do you find helpful in your practical workflows?
🧩 How’s the setup, performance, and accuracy been for you?
🧩 Anything confusing, buggy, or missing that would make your workflow smoother?

Your feedback helps us improve speed, clarity, and stability for everyone in the open-source community.

💬 Comment directly here or join the discussion. We read every one 😉:
GitHub: https://github.com/Knowledgator
Discord: https://discord.gg/GXRcAVJQ
Hugging Face: knowledgator

📝 Want to shape our next release?
Click here to complete this 2-min survey: https://docs.google.com/forms/d/e/1FAIpQLSdyz2UMHrMDX8S9stpBk0wyfngtKSYzwk-02mN1VNYDdTw8OQ/viewform

Ihor

authored a paper 9 months ago

GLiClass: Generalist Lightweight Model for Sequence Classification Tasks

Paper • 2508.07662 • Published Aug 11, 2025 • 9

urchade

authored a paper 10 months ago

GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface

Paper • 2507.18546 • Published Jul 24, 2025 • 39

Ihor

authored a paper about 1 year ago

GLiNER-biomed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition

Paper • 2504.00676 • Published Apr 1, 2025 • 5

Ihor

posted an update over 1 year ago

Post

1930

🚀 Reproducing DeepSeek R1 for Text-to-Graph Extraction

I’ve been working on replicating DeepSeek R1, focusing on zero-shot text-to-graph extraction—a challenging task where LMs extract entities and relations from text based on predefined types.

🧠 Key Insight:
Language models struggle when constrained by entity/relation types. Supervised training alone isn’t enough, but reinforcement learning (RL), specifically Group Relative Policy Optimization (GRPO), shows promise.

💡 Why GRPO?
It trains the model to generate structured graphs, optimizing multiple reward functions (format, JSON validity, and extraction accuracy).
It allows the model to learn from both positive and hard negative examples dynamically.
RL can be fine-tuned to emphasize relation extraction improvements.
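As a rough sketch of how several reward functions can feed GRPO, the snippet below scores a group of sampled completions on JSON validity, format, and entity F1, then converts the summed rewards into group-relative advantages (each reward minus the group mean, divided by the group standard deviation). The output schema with `entities` and `relations` keys, the equal reward weights, and all function names are assumptions for illustration, not the setup used in the experiments.

```python
import json
import statistics


def json_reward(completion):
    """1.0 if the completion parses as JSON, else 0.0."""
    try:
        json.loads(completion)
        return 1.0
    except json.JSONDecodeError:
        return 0.0


def format_reward(completion):
    """1.0 if the parsed object has 'entities' and 'relations' keys (assumed schema)."""
    try:
        graph = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if isinstance(graph, dict) and graph.keys() >= {"entities", "relations"}:
        return 1.0
    return 0.0


def f1_reward(completion, gold_entities):
    """F1 between predicted and gold entity strings; 0.0 for unusable output."""
    try:
        predicted = set(json.loads(completion).get("entities", []))
    except (json.JSONDecodeError, AttributeError, TypeError):
        return 0.0
    overlap = len(predicted & gold_entities)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold_entities)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards):
    """Normalise rewards within one group of samples for the same prompt."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


gold = {"Marie Curie", "Warsaw"}
group = [
    '{"entities": ["Marie Curie", "Warsaw"], "relations": []}',
    '{"entities": ["Marie Curie"], "relations": []}',
    "not json",
]
# Equal weighting of the three rewards is an assumption.
rewards = [json_reward(c) + format_reward(c) + f1_reward(c, gold) for c in group]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group average receive a positive advantage and are reinforced, while malformed or inaccurate ones receive a negative advantage and act as negatives.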

📊 Early Results:
Even with limited training, F1 scores consistently improved, and we saw clear benefits from RL-based optimization. More training = better performance!

🔬 Next Steps:
We’re scaling up experiments with larger models and high-quality data. Stay tuned for updates! Meanwhile, check out one of our experimental models here:
Ihor/Text2Graph-R1-Qwen2.5-0.5b

📔 Learn more details from the blog post: https://medium.com/p/d8b648d9f419

Feel free to share your thoughts and ask questions!

2 replies

Ihor

posted an update over 1 year ago

Post

1244

🚀 Welcome the New and Improved GLiNER-Multitask! 🚀

Since the release of our beta version, GLiNER-Multitask has received many positive responses and has been adopted in consulting, research, and production environments. Thank you, everyone, for your feedback. It helped us rethink the strengths and weaknesses of the first model, and we are excited to present the next iteration of this multi-task information extraction model.

💡 What’s New?
Here are the key improvements in this latest version:
🔹 Expanded Task Support: Now includes text classification and other new capabilities.
🔹 Enhanced Relation Extraction: Significantly improved accuracy and robustness.
🔹 Improved Prompt Understanding: Optimized for open-information extraction tasks.
🔹 Better Named Entity Recognition (NER): More accurate and reliable results.

🔧 How We Made It Better:
These advancements were made possible by:
🔹 Leveraging a better and more diverse dataset.
🔹 Using a larger backbone model for increased capacity.
🔹 Implementing advanced model merging techniques.
🔹 Employing self-learning strategies for continuous improvement.
🔹 Better training strategies and hyperparameter tuning.

📄 Read the Paper: https://arxiv.org/abs/2406.12925
⚙️ Try the Model: knowledgator/gliner-multitask-v1.0
💻 Test the Demo: knowledgator/GLiNER_HandyLab
📌 Explore the Repo: https://github.com/urchade/GLiNER

Ihor

posted an update over 1 year ago

Post

434

🚀 Let’s transform LLMs into encoders 🚀

Auto-regressive LMs have ruled, but encoder-based architectures like GLiNER are proving to be just as powerful for information extraction while offering better efficiency and interpretability. 🔍✨

Past encoder backbones were limited by small pre-training datasets and old techniques, but with innovations like LLM2Vec, we've transformed decoders into high-performing encoders! 🔄💡

What’s New?
🔹 Converted Llama & Qwen decoders to advanced encoders
🔹 Adapted the GLiNER architecture to work with rotary positional encodings
🔹New GLiNER (zero-shot NER) & GLiClass (zero-shot classification) models

🔥 Check it out:

New models: knowledgator/llm2encoder-66d1c76e3c8270397efc5b5e

GLiNER package: https://github.com/urchade/GLiNER

GLiClass package: https://github.com/Knowledgator/GLiClass

💻 Read our blog for more insights, and stay tuned for what’s next!
https://medium.com/@knowledgrator/llm2encoders-e7d90b9f5966

Ihor

updated a model over 1 year ago

gliner-community/gliner_xxl-v2.5

Token Classification • Updated Aug 31, 2024 • 22 • 7

Ihor

posted an update almost 2 years ago

Post

772

🚀 Meet the new GLiNER architecture 🚀
GLiNER revolutionized zero-shot NER by demonstrating that lightweight encoders can achieve excellent results. We're excited to continue R&D with this spirit 🔥. Our new bi-encoder and poly-encoder architectures were developed to address the main limitations of the original GLiNER architecture and bring the following new possibilities:

🔹 An unlimited number of entities can be recognized at once.
🔹 Faster inference when entity embeddings are precomputed.
🔹 Better generalization to unseen entities.

Because the bi-encoder architecture can lack inter-label understanding, we also developed a poly-encoder architecture with post-fusion. It matches or exceeds the original GLiNER on many benchmark datasets while keeping the bi-encoder advantages listed above.
Now, it’s possible to run GLiNER with hundreds of entities much faster and more reliably.
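The speed-up from precomputed entity embeddings comes from encoding each label once and then scoring every candidate span with a cheap dot product. The sketch below shows only that mechanic. The `toy_encode` function (normalised letter counts) is a deliberately crude stand-in for a transformer encoder, and `LabelIndex` is a hypothetical name, not part of the GLiNER package.

```python
import math
from collections import Counter


def toy_encode(text):
    """Stand-in encoder (assumption): L2-normalised letter counts, 26 dimensions."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    vec = [counts.get(chr(ord("a") + i), 0) for i in range(26)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class LabelIndex:
    """Encodes each label once and reuses the vectors for every span."""

    def __init__(self, labels):
        self.labels = list(labels)
        # Precomputed once; adding more texts never re-encodes the labels.
        self.vectors = [toy_encode(label) for label in self.labels]

    def best_label(self, span):
        span_vec = toy_encode(span)
        scores = [
            sum(a * b for a, b in zip(span_vec, vec)) for vec in self.vectors
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.labels[best], scores[best]


index = LabelIndex(["person", "location", "organization"])
label, score = index.best_label("persons")
```

Because labels and text are encoded independently, the label side can be cached or stored in a vector index, which is what makes very large label sets practical. The cost is that labels do not see each other during encoding.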

📌 Try the new models here:
knowledgator/gliner-bi-encoders-66c492ce224a51c54232657b

4 replies

Ihor

posted an update almost 2 years ago

Post

911

🚀 Meet Our New Line of Efficient and Accurate Zero-Shot Classifiers! 🚀

The new architecture brings better inter-label understanding and can solve complex classification tasks in a single forward pass.

Key Applications:
✅ Multi-class classification (up to 100 classes in a single run)
✅ Topic classification
✅ Sentiment analysis
✅ Event classification
✅ Prompt-based constrained classification
✅ Natural Language Inference
✅ Multi- and single-label classification
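The difference between single-label and multi-label classification can be sketched as two ways of turning per-label scores into predictions: a softmax that picks exactly one label, or independent sigmoids with a threshold that may return several. This is a common convention and a minimal sketch only. Whether GLiClass does exactly this internally is an assumption, and the logits, labels, and `threshold` default below are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classify(logits, labels, multi_label=False, threshold=0.5):
    """Return (label, score) pairs from one set of per-label logits."""
    if multi_label:
        # Each label is scored independently; any number can pass the threshold.
        scores = [sigmoid(x) for x in logits]
        return [(l, s) for l, s in zip(labels, scores) if s >= threshold]
    # Labels compete with each other; exactly one is returned.
    scores = softmax(logits)
    best = max(range(len(labels)), key=scores.__getitem__)
    return [(labels[best], scores[best])]


labels = ["sports", "politics", "finance"]
logits = [2.0, 1.5, -3.0]  # hypothetical scores from one forward pass
single = classify(logits, labels)
multi = classify(logits, labels, multi_label=True)
```

Here the single-label call returns only "sports", while the multi-label call returns both "sports" and "politics" because each clears the threshold on its own.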

knowledgator/gliclass-6661838823756265f2ac3848
knowledgator/GLiClass_SandBox
knowledgator/gliclass-base-v1.0-lw

Ihor

authored a paper almost 2 years ago

GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks

Paper • 2406.12925 • Published Jun 14, 2024 • 25


Team members 2
