Official code for the paper ["Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing"](https://arxiv.org/abs/2201.11374).
If you use this code, please cite our paper.
## Requirements
* Python 3.7
* PyTorch 1.1.0
* CUDA 9.0
* Gensim 3.8.1

We assume that conda is already installed.
```bash
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch
pip install gensim==3.8.1
```
## Pretrained embeddings for Sanskrit
* Pretrained FastText embeddings for STBC/VST can be obtained from [here](https://drive.google.com/drive/folders/1SwdEqikTq-N2vOL7QSUX2vqi3faZE7bq?usp=sharing). Make sure the `.txt` file is placed in `data/`.
* The main results are reported for systems trained on the combined train and dev splits.
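
Before training, it may help to confirm that the embeddings file is in place. The snippet below is a minimal sketch and is not part of the repository: the helper name `check_embeddings` is hypothetical, and because the exact file name is not stated here, it only checks that some `.txt` file exists in `data/`.

```bash
# Hypothetical helper (not part of this repository).
# Succeeds if at least one .txt file exists directly inside the given
# directory (default: data/); otherwise prints a message and fails.
check_embeddings() {
  local dir="${1:-data}"
  local f
  for f in "$dir"/*.txt; do
    # If the glob matches nothing, it stays literal and -f is false.
    if [ -f "$f" ]; then
      echo "Found embeddings file: $f"
      return 0
    fi
  done
  echo "No .txt file found in $dir/" >&2
  return 1
}
```

One way to use it is to gate the training script on the check, for example `check_embeddings data && bash run_STBC.sh`, so that a missing download is reported before training starts.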
## How to train the model for Sanskrit
The proposed system has two stages: (1) pretraining and (2) integration. To run both, execute the bash script `run_STBC.sh` or `run_VST.sh` for the respective dataset. These scripts should let you reproduce the results reported in Section 3 and Table 2 of the paper.
```bash
bash run_STBC.sh
```
## Citations
```bibtex
@misc{sandhan_systematic,
  doi = {10.48550/ARXIV.2201.11374},
  url = {https://arxiv.org/abs/2201.11374},
  author = {Sandhan, Jivnesh and Behera, Laxmidhar and Goyal, Pawan},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
## Acknowledgements
Our ensembled system is built on top of the ["DCST Implementation"](https://github.com/rotmanguy/DCST).