Entity linking and multilabel entity recognition

#3
by Yoelis - opened

Hello, and first of all, thanks for your work!

I'm exploring a two-step process to extract entities from different domains. First, I iteratively build a small ontology using suggestions from the decoder and small data samples, grouping them and optionally linking them to DBpedia/Wikidata entries. Then, once I'm happy with my small ontology, I would like to feed it back to GLiNER so it extracts everything in one batch based on my custom ontology. To do that, I need (1) a way to express some hierarchy: for instance, I would like to extract any person, but also mentions of a specific person such as Charlie Chaplin (so it should give me back "Chaplin" under the label "person" and also under the label "Charlie Chaplin"). I'd also like (2) some primitive entity resolution based on user knowledge (e.g., "CC" denotes Charlie Chaplin, so the model should extract "CC" with the label "Charlie Chaplin").

Among the key features listed on the home page, you mention multilabel entity recognition, which would help me with (1). You also mention the possibility of doing entity linking, which I think would help with (2). Could you elaborate on how to use those features?
The best I could find was to express some hierarchy with labels such as "person_charlie_chaplin", which will match "charlie" or "chaplin", plus another label "person" that will match any other person in the text. Is there a better way?
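To make the goal concrete, here is a minimal sketch of the behaviour I'm after, written as plain post-processing rather than anything built into GLiNER. The helper `expand_labels`, the `HIERARCHY` and `ALIASES` tables, and the sample spans are all hypothetical; the only thing I'm assuming about the model is that each extracted span comes back as a dict with `"text"` and `"label"` keys.

```python
# Hypothetical post-processing sketch, not part of GLiNER itself.
# Assumption: each extracted span is a dict with "text" and "label" keys.

# Child label -> parent label (my small ontology).
HIERARCHY = {"Charlie Chaplin": "person"}

# User knowledge: lowercase surface form -> canonical label.
ALIASES = {
    "cc": "Charlie Chaplin",
    "chaplin": "Charlie Chaplin",
    "charlie chaplin": "Charlie Chaplin",
}


def expand_labels(entities):
    """Return (text, label) pairs with alias and parent labels added."""
    pairs = []
    for ent in entities:
        labels = [ent["label"]]
        # Primitive entity resolution: map a known alias to its canonical label.
        canonical = ALIASES.get(ent["text"].lower())
        if canonical is not None and canonical not in labels:
            labels.append(canonical)
        # Walk up the hierarchy so a specific label also yields its parents.
        for label in list(labels):
            parent = HIERARCHY.get(label)
            while parent is not None:
                if parent not in labels:
                    labels.append(parent)
                parent = HIERARCHY.get(parent)
        pairs.extend((ent["text"], label) for label in labels)
    return pairs


# Hand-written spans standing in for model output (not real predictions).
sample = [
    {"text": "Chaplin", "label": "person"},
    {"text": "CC", "label": "person"},
    {"text": "Buster Keaton", "label": "person"},
]
result = expand_labels(sample)
```

With this, "Chaplin" and "CC" each end up under both "person" and "Charlie Chaplin", while "Buster Keaton" stays under "person" only. I'd prefer to get the same result from the model directly if the multilabel and entity linking features allow it.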
