---
language:
- sa # Sanskrit
- en # English (since some models are English)
- mr # Marathi
license: apache-2.0
tags:
- compound-type-identification
- multi-task-learning
- contextual-embedding
- bert
- xlm-roberta
- pos-tagging
- dependency-parsing
- sanskrit
- marathi
- english
- nlp
- classification
model_name: SaCTI
datasets:
- custom
metrics:
- accuracy
- f1
---

# SaCTI: Sanskrit Compound Type Identifier

Trained models for the paper ["A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit"](https://aclanthology.org/2022.coling-1.358/). If you use these models, please cite the paper.

## How to use the models

### 1. Clone the GitHub repository of the paper

```bash
git clone https://github.com/ashishgupta2598/SaCTI.git
```

### 2. Create a new environment and activate it

```bash
conda create --name sactienv python=3.9
conda activate sactienv
```

### 3. Install all required packages

```bash
pip3 install -r requirements.txt
```

### 4. Download the model corresponding to your experiment from this Hugging Face repository

Available models:

```
/save_models_english
/save_models_marathi
/save_models_saCTIbase_coarse
/save_models_saCTIbase_fine
/save_models_saCTIlarge_coarse
/save_models_saCTIlarge_fine
```

Each of the above folders contains a `bert` model, a `posdep` model, and an `xlm-roberta-base` model.

### 5. Run the following command in bash

```bash
python3 main.py --model_path='' --experiment='' --training=False
```

The valid experiment names are `english`, `marathi`, `sacti-base_coarse`, `sacti-base_fine`, `sacti-large_coarse`, `sacti-large_fine`.

**NOTE**: These models were obtained by running the training pipeline described in the official GitHub repository with the default `batch size = 75` and `epochs = 70`.
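The `model.pth` files above are ordinary PyTorch checkpoints, so they can be inspected directly with `torch.load` before running the pipeline. Below is a minimal, self-contained sketch: it saves a dummy checkpoint first (a stand-in for a downloaded file such as `save_models_english/bert/model.pth`, whose real contents depend on the repository's training code), then loads it back.

```python
# Minimal sketch for inspecting a downloaded PyTorch checkpoint.
# The file written here is a dummy stand-in so the snippet is runnable;
# in practice you would point torch.load at e.g.
# save_models_english/bert/model.pth after downloading it.
import torch

dummy_state = {"layer.weight": torch.zeros(2, 2)}
torch.save(dummy_state, "model.pth")

# map_location="cpu" lets the checkpoint load on machines without a GPU.
state = torch.load("model.pth", map_location="cpu")
print(sorted(state.keys()))  # lists the tensors stored in the checkpoint
```

Loading with `map_location="cpu"` is useful when the checkpoints were saved on a GPU machine but you are inspecting them on a CPU-only one.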
## Folder Structure

```
├── LICENSE
├── README.md
├── save_models_english/
│   ├── bert/
│   │   └── model.pth
│   ├── posdep/
│   │   └── model.pth
│   └── xlm-roberta-base/
│       └── customized-mwt-ner/
│           ├── customized-mwt-ner.tagger.mdl
│           └── customized-mwt-ner.vocabs.json
├── save_models_marathi/
│   └── ... (same structure as above)
├── save_models_saCTIbase_coarse/
│   └── ... (same structure as above)
├── save_models_saCTIbase_fine/
│   └── ... (same structure as above)
├── save_models_saCTIlarge_coarse/
│   └── ... (same structure as above)
└── save_models_saCTIlarge_fine/
    └── ... (same structure as above)
```

Each folder contains three models:

1. `bert/model.pth`
2. `posdep/model.pth`
3. `xlm-roberta-base/customized-mwt-ner/`
   - `customized-mwt-ner.tagger.mdl`
   - `customized-mwt-ner.vocabs.json`

## Citation

```bibtex
@inproceedings{sandhan-etal-2022-novel,
    title = "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in {S}anskrit",
    author = "Sandhan, Jivnesh and Gupta, Ashish and Terdalkar, Hrishikesh and Sandhan, Tushar and Samanta, Suvendu and Behera, Laxmidhar and Goyal, Pawan",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.358",
    pages = "4071--4083",
    abstract = "The phenomenon of compounding is ubiquitous in Sanskrit. It serves for achieving brevity in expressing thoughts, while simultaneously enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, where we consider the problem of identifying semantic relations between the components of a compound word. Earlier approaches solely rely on the lexical information obtained from the components and ignore the most crucial contextual and syntactic information useful for SaCTI. However, the SaCTI task is challenging primarily due to the implicitly encoded context-sensitive semantic relation between the compound components. Thus, we propose a novel multi-task learning architecture which incorporates the contextual information and enriches the complementary syntactic information using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the benchmark datasets for SaCTI show 6.1 points (Accuracy) and 7.7 points (F1-score) absolute gain compared to the state-of-the-art system. Further, our multi-lingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi languages.",
}
```

## License

This project is licensed under the terms of the `Apache License 2.0`.

## Acknowledgements

1. The models in this repository were obtained by training as described in the [original paper](https://aclanthology.org/2022.coling-1.358/).
2. Official [GitHub repository](https://github.com/ashishgupta2598/SaCTI) of the paper.