techysanoj committed · Commit 08f1d7c · verified · 1 Parent(s): d00f07d

Update README.md

Files changed (1)
  1. README.md +3 -42
README.md CHANGED
@@ -23,8 +23,8 @@ tags:
  - indicnlp
  ---
 
- # IndicNER
- IndicNER is a model trained to complete the task of identifying named entities from sentences in Indian languages. Our model is specifically fine-tuned to the 11 Indian languages mentioned above over millions of sentences. The model is then benchmarked over a human annotated testset and multiple other publicly available Indian NER datasets.
  The 11 languages covered by IndicNER are: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
 
  ## Training Corpus
@@ -39,48 +39,9 @@ Update 20 Dec 2022: We released a new paper documenting IndicNER and Naamapadam.
 
  You can use [this Colab notebook](https://colab.research.google.com/drive/1sYa-PDdZQ_c9SzUgnhyb3Fl7j96QBCS8?usp=sharing) for samples on using IndicNER or for finetuning a pre-trained model on Naampadam dataset to build your own NER models.
 
- <!-- citing information -->
- ## Citing
-
- If you are using IndicNER, please cite the following article:
- ```
- @misc{mhaske2022naamapadam,
- doi = {10.48550/ARXIV.2212.10168},
- url = {https://arxiv.org/abs/2212.10168},
- author = {Mhaske, Arnav and Kedia, Harshit and Doddapaneni, Sumanth and Khapra, Mitesh M. and Kumar, Pratyush and Murthy, Rudra and Kunchukuttan, Anoop},
- title = {Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages}
- publisher = {arXiv},
- year = {2022},
- copyright = {arXiv.org perpetual, non-exclusive license}
- }
-
- ```
- We would like to hear from you if:
-
- - You are using our resources. Please let us know how you are putting these resources to use.
- - You have any feedback on these resources.
 
 
  <!-- License -->
  ## License
 
- The IndicNER code (and models) are released under the MIT License.
-
- <!-- Contributors -->
- ## Contributors
- - Arnav Mhaske <sub> ([AI4Bharat](https://ai4bharat.org), [IITM](https://www.iitm.ac.in)) </sub>
- - Harshit Kedia <sub> ([AI4Bharat](https://ai4bharat.org), [IITM](https://www.iitm.ac.in)) </sub>
- - Sumanth Doddapaneni <sub> ([AI4Bharat](https://ai4bharat.org), [IITM](https://www.iitm.ac.in)) </sub>
- - Mitesh M. Khapra <sub> ([AI4Bharat](https://ai4bharat.org), [IITM](https://www.iitm.ac.in)) </sub>
- - Pratyush Kumar <sub> ([AI4Bharat](https://ai4bharat.org), [Microsoft](https://www.microsoft.com/en-in/), [IITM](https://www.iitm.ac.in)) </sub>
- - Rudra Murthy <sub> ([AI4Bharat](https://ai4bharat.org), [IBM](https://www.ibm.com))</sub>
- - Anoop Kunchukuttan <sub> ([AI4Bharat](https://ai4bharat.org), [Microsoft](https://www.microsoft.com/en-in/), [IITM](https://www.iitm.ac.in)) </sub>
-
- This work is the outcome of a volunteer effort as part of the [AI4Bharat initiative](https://ai4bharat.iitm.ac.in).
-
-
- <!-- Contact -->
- ## Contact
- - Anoop Kunchukuttan ([anoop.kunchukuttan@gmail.com](mailto:anoop.kunchukuttan@gmail.com))
- - Rudra Murthy V ([rmurthyv@in.ibm.com](mailto:rmurthyv@in.ibm.com))
-
 
  - indicnlp
  ---
 
+ # fine-tuned IndicNER
+ fine-tuned IndicNER is a model trained to complete the task of identifying named entities from sentences in Indian languages. Our model is specifically fine-tuned to the 11 Indian languages mentioned above over millions of sentences. The model is then benchmarked over a human annotated testset and multiple other publicly available Indian NER datasets.
  The 11 languages covered by IndicNER are: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
 
  ## Training Corpus
 
 
  You can use [this Colab notebook](https://colab.research.google.com/drive/1sYa-PDdZQ_c9SzUgnhyb3Fl7j96QBCS8?usp=sharing) for samples on using IndicNER or for finetuning a pre-trained model on Naampadam dataset to build your own NER models.
 
 
 
  <!-- License -->
  ## License
 
+ The fine-tuned-IndicNER code (and models) are released under the MIT License.
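
The README text in this diff describes the model as identifying named entities in sentences. As a minimal sketch of what consuming such output might look like, the snippet below merges per-token tags into entity spans. It assumes the model emits one BIO-style tag per token (for example `B-PER`, `I-PER`, `O`); the tag names, the sample sentence, and the helper `group_bio_entities` are illustrative assumptions and are not taken from the README or the model's configuration.

```python
from typing import List, Optional, Tuple


def group_bio_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity_text, entity_type) pairs.

    Hypothetical helper: assumes tags look like "B-PER", "I-PER", or "O".
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        # "B-PER" -> ("B", "-", "PER"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_type):
            # A "B-" tag, or an "I-" tag that does not continue the open
            # entity, closes any open entity and starts a new one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label
        elif prefix == "I":
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # Any other tag (normally "O") closes the open entity, if any.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Illustrative Hindi sentence with hand-written (not model-produced) tags.
sample_tokens = ["सचिन", "तेंदुलकर", "मुंबई", "में", "रहते", "हैं"]
sample_tags = ["B-PER", "I-PER", "B-LOC", "O", "O", "O"]
print(group_bio_entities(sample_tokens, sample_tags))
```

With the hand-written tags above, the helper returns one person span and one location span. Real predictions would come from the model itself, for instance via the Colab notebook linked in the README, and subword tokens would need to be realigned to words before a grouping step like this one.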