# **Khasi Fill-Mask Model**
This project demonstrates how to use the Hugging Face Transformers library to perform a fill-mask task using the **`jefson08/kha-roberta`** model. The fill-mask task predicts the most likely token(s) to replace the `[MASK]` token in a given sentence.
---
## **Usage**
### **1. Import Dependencies**
```python
from transformers import pipeline, AutoTokenizer
```
### **2. Initialize the Model and Tokenizer**
Load the tokenizer and model pipeline:
```python
# Initialisation
tokenizer = AutoTokenizer.from_pretrained('jefson08/kha-roberta')
fill_mask = pipeline(
    "fill-mask",
    model="jefson08/kha-roberta",
    tokenizer=tokenizer,
    device="cuda",  # use "cuda" for GPU or omit for CPU
)
```
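The hard-coded `device="cuda"` fails on machines without a GPU. A minimal sketch of an automatic fallback, assuming PyTorch is installed (it is already a listed dependency):

```python
import torch

# Select the GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
```

The resulting `device` string can then be passed as `pipeline(..., device=device)`.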
### **3. Predict the [MASK] Token**
Provide a sentence with a `[MASK]` token for prediction:
```python
# Predict [MASK] token
sentence = "Nga dei u briew u ba [MASK] bha."
predictions = fill_mask(sentence)
# Display predictions
for prediction in predictions:
    print(f"{prediction['sequence']} (score: {prediction['score']:.4f})")
```
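The mask placeholder is a plain substring of the input, so a masked sentence can also be built programmatically. A small sketch, assuming `[MASK]` is this tokenizer's mask token (for other models, check `tokenizer.mask_token`):

```python
mask_token = "[MASK]"  # assumption: matches tokenizer.mask_token for this model

# Mask out a single word to ask the model for likely replacements.
original = "Nga dei u briew u ba stad bha."
masked = original.replace("stad", mask_token, 1)
print(masked)  # Nga dei u briew u ba [MASK] bha.
```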
---
## **Example Output**
Given the input sentence:
```plaintext
"Nga dei u briew u ba [MASK] bha."
```
The model might output:
```plaintext
[{'score': 0.09230164438486099,
'token': 6086,
'token_str': 'mutlop',
'sequence': 'Nga dei u briew u ba mutlop bha.'},
{'score': 0.051360130310058594,
'token': 2059,
'token_str': 'stad',
'sequence': 'Nga dei u briew u ba stad bha.'},
{'score': 0.045497000217437744,
'token': 1864,
'token_str': 'khuid',
'sequence': 'Nga dei u briew u ba khuid bha.'},
{'score': 0.04180142655968666,
'token': 668,
'token_str': 'kham',
'sequence': 'Nga dei u briew u ba kham bha.'},
{'score': 0.027332570403814316,
'token': 2817,
'token_str': 'khlaiñ',
'sequence': 'Nga dei u briew u ba khlaiñ bha.'}]
```
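Each entry in the returned list is a dict with `score`, `token`, `token_str`, and `sequence` keys, so picking the best completion is straightforward. A sketch over an abbreviated copy of the example output above:

```python
# Abbreviated copy of the example predictions shown above.
predictions = [
    {"score": 0.0923, "token": 6086, "token_str": "mutlop",
     "sequence": "Nga dei u briew u ba mutlop bha."},
    {"score": 0.0514, "token": 2059, "token_str": "stad",
     "sequence": "Nga dei u briew u ba stad bha."},
]

# The pipeline already returns results sorted by score,
# but max() makes the intent explicit.
best = max(predictions, key=lambda p: p["score"])
print(best["token_str"])  # mutlop
```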
---
## **Model Information**
The `jefson08/kha-roberta` model is a masked language model trained on Khasi text. Given a sentence containing a `[MASK]` token, it ranks candidate replacement tokens by probability, reflecting its contextual understanding of the language.
---
## **Dependencies**
- [Transformers](https://huggingface.co/docs/transformers): Provides the pipeline and model-loading utilities.
- [PyTorch](https://pytorch.org/): Backend framework for running the model.
Install the dependencies with:
```bash
pip install transformers torch
```
---
## **Acknowledgements**
- Hugging Face [Transformers](https://huggingface.co/docs/transformers) library.
- Model by [N Donald Jefferson Thabah](https://huggingface.co/jefson08/kha-roberta).
---
## **License**
This project is licensed under the MIT License. See the [LICENSE](./LICENSE) file for more details.
---