@Maziyar Panahi, first of all, love the work you are doing!
However, I am a bit overwhelmed by the volume of models available. I understand offering differently sized models (S/M/L) for different hardware environments and use cases, but why different models for different datasets and concepts? I thought the whole benefit of the GLiNER architecture was its ability to enable zero-shot identification and support a multitude of semantically similar targets.
It seems like, with the catalog of models you've provided, the user needs to determine the specific entities of interest prior to selecting a model (e.g. oncology vs. pharmacology vs. chemistry). Shouldn't a sufficiently large model trained on a sufficiently large (merged) dataset be capable of producing similar results, without a separate family of models for each family of concepts? (Rough sketch of what I mean below.)
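To be concrete, here's my mental model (possibly wrong!): a single checkpoint queried zero-shot with whichever domain's labels apply at inference time. The checkpoint name and label sets below are placeholders, not your actual models:

```python
from gliner import GLiNER

# One large model trained on a merged dataset (hypothetical checkpoint name).
model = GLiNER.from_pretrained("some-org/gliner-large-merged")

# Domain-specific label sets supplied at inference time, not training time.
oncology_labels = ["tumor type", "cancer stage", "biomarker"]
pharma_labels = ["drug name", "dosage", "adverse event"]

text = "The patient received 50 mg of imatinib for stage II GIST."

# predict_entities() takes the label set as an argument, so the same
# weights could serve multiple domains without retraining.
for ent in model.predict_entities(text, pharma_labels, threshold=0.5):
    print(ent["text"], "->", ent["label"])
```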
Just trying to understand the motivation here. Also, I see you released the GPT-OSS dataset for summarization (thank you!). Have you also released the training data for the NER tasks somewhere that I'm not seeing? I'm trying to work out whether there's a starting point I could use to consolidate all of these model variants into a single set of models.
Also, do you have a list of the specific labels you used when assessing the models against the benchmark datasets? (I'm assuming they aren't identical to the ground-truth labels, since some of them likely need to be collapsed; toy example below.)
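Something like this is what I'm imagining for the collapsing step (the label names are made up, purely for illustration):

```python
# Collapsing fine-grained ground-truth labels into the coarser labels
# used at evaluation time (all names hypothetical).
GROUND_TRUTH_TO_EVAL_LABEL = {
    "B-Chemical": "chemical",
    "I-Chemical": "chemical",
    "B-Disease": "disease",
    "I-Disease": "disease",
}

def collapse(label: str) -> str:
    """Map a benchmark's ground-truth label to its evaluation label."""
    return GROUND_TRUTH_TO_EVAL_LABEL.get(label, "other")
```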
Any guidance is much appreciated!