---
language:
- sa # Sanskrit
- en # English (since some models are English)
- mr # Marathi
license: apache-2.0
tags:
- compound-type-identification
- multi-task-learning
- contextual-embedding
- bert
- xlm-roberta
- pos-tagging
- dependency-parsing
- sanskrit
- marathi
- english
- nlp
- classification
model_name: SaCTI
datasets:
- custom
metrics:
- accuracy
- f1
---

# SaCTI: Sanskrit Compound Type Identifier

Trained models for the paper ["A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit"](https://aclanthology.org/2022.coling-1.358/). If you use these models, please cite the paper.

## How to use the models

### 1. Clone the GitHub repository of the paper

```bash
git clone https://github.com/ashishgupta2598/SaCTI.git
```

### 2. Create a new environment and activate it

```bash
conda create --name sactienv python=3.9
conda activate sactienv
```

### 3. Install all required packages

```bash
pip3 install -r requirements.txt
```

### 4. Download the model corresponding to your experiment from this Hugging Face repository

Available models:

```
/save_models_english
/save_models_marathi
/save_models_saCTIbase_coarse
/save_models_saCTIbase_fine
/save_models_saCTIlarge_coarse
/save_models_saCTIlarge_fine
```

Each of the above folders contains a `bert` model, a `posdep` model, and an `xlm-roberta-base` model.

### 5. Run the following command in bash

```bash
python3 main.py --model_path='' --experiment='' --training=False
```

The valid experiment names are `english`, `marathi`, `sacti-base_coarse`, `sacti-base_fine`, `sacti-large_coarse`, `sacti-large_fine`.

**NOTE**: These models were obtained by running the training pipeline described in the official GitHub repository with the default `batch size = 75` and `epochs = 70`.
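The `model.pth` files above are ordinary PyTorch checkpoints, so they can be inspected directly with `torch.load` before running the pipeline. Below is a minimal, self-contained sketch: it saves a dummy checkpoint first (a stand-in for a downloaded file such as `save_models_english/bert/model.pth`, whose real contents depend on the repository's training code), then loads it back.

```python
# Minimal sketch for inspecting a downloaded PyTorch checkpoint.
# The file written here is a dummy stand-in so the snippet is runnable;
# in practice you would point torch.load at e.g.
# save_models_english/bert/model.pth after downloading it.
import torch

dummy_state = {"layer.weight": torch.zeros(2, 2)}
torch.save(dummy_state, "model.pth")

# map_location="cpu" lets the checkpoint load on machines without a GPU.
state = torch.load("model.pth", map_location="cpu")
print(sorted(state.keys()))  # lists the tensors stored in the checkpoint
```

Loading with `map_location="cpu"` is useful when the checkpoints were saved on a GPU machine but you are inspecting them on a CPU-only one.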
## Folder Structure

```
├── LICENSE
├── README.md
├── save_models_english/
│   ├── bert/
│   │   └── model.pth
│   ├── posdep/
│   │   └── model.pth
│   └── xlm-roberta-base/
│       └── customized-mwt-ner/
│           ├── customized-mwt-ner.tagger.mdl
│           └── customized-mwt-ner.vocabs.json
├── save_models_marathi/
│   └── ... (same structure as above)
├── save_models_saCTIbase_coarse/
│   └── ... (same structure as above)
├── save_models_saCTIbase_fine/
│   └── ... (same structure as above)
├── save_models_saCTIlarge_coarse/
│   └── ... (same structure as above)
└── save_models_saCTIlarge_fine/
    └── ... (same structure as above)
```

Each folder contains three models:

1. `bert/model.pth`
2. `posdep/model.pth`
3. `xlm-roberta-base/customized-mwt-ner/`
   - `customized-mwt-ner.tagger.mdl`
   - `customized-mwt-ner.vocabs.json`

## Citation

```bibtex
@inproceedings{sandhan-etal-2022-novel,
    title = "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in {S}anskrit",
    author = "Sandhan, Jivnesh and Gupta, Ashish and Terdalkar, Hrishikesh and Sandhan, Tushar and Samanta, Suvendu and Behera, Laxmidhar and Goyal, Pawan",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.358",
    pages = "4071--4083",
    abstract = "The phenomenon of compounding is ubiquitous in Sanskrit. It serves for achieving brevity in expressing thoughts, while simultaneously enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, where we consider the problem of identifying semantic relations between the components of a compound word. Earlier approaches solely rely on the lexical information obtained from the components and ignore the most crucial contextual and syntactic information useful for SaCTI. However, the SaCTI task is challenging primarily due to the implicitly encoded context-sensitive semantic relation between the compound components. Thus, we propose a novel multi-task learning architecture which incorporates the contextual information and enriches the complementary syntactic information using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the benchmark datasets for SaCTI show 6.1 points (Accuracy) and 7.7 points (F1-score) absolute gain compared to the state-of-the-art system. Further, our multi-lingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi languages.",
}
```

## License

This project is licensed under the terms of the `Apache License 2.0`.

## Acknowledgements

1. The models in this repository were obtained by training as described in the [original paper](https://aclanthology.org/2022.coling-1.358/).
2. Official [GitHub repository](https://github.com/ashishgupta2598/SaCTI) of the paper.