	---
	license: afl-3.0
	datasets:
	- WillHeld/hinglish_top
	language:
	- en
	- hi
	metrics:
	- accuracy
	library_name: transformers
	pipeline_tag: fill-mask
	---

	### SRDberta

	This is a BERT model trained for masked language modeling on Hinglish data.

	Hinglish is the hybrid language spoken in India that combines elements of Hindi and English. It is commonly used in informal conversation and in media such as Bollywood films.

	### Dataset
	The [Hinglish-TOP dataset](https://huggingface.co/datasets/WillHeld/hinglish_top) has the following columns:
	- en_query
	- cs_query
	- en_parse
	- cs_parse
	- domain
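
As a minimal sketch of how these columns might be consumed, the helper below collects the text of the `cs_query` column from dataset rows. It assumes `cs_query` holds the code-switched (Hinglish) query and that a `train` split exists; this card does not say which column or split was used for training, and the function names `code_switched_queries` and `load_queries` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def code_switched_queries(rows: Iterable[Dict[str, Optional[str]]]) -> List[str]:
    """Collect non-empty, whitespace-trimmed `cs_query` strings from rows."""
    queries = []
    for row in rows:
        text = (row.get("cs_query") or "").strip()
        if text:
            queries.append(text)
    return queries


def load_queries(split: str = "train") -> List[str]:
    """Download one split and return its `cs_query` texts.

    Assumption: the split name "train" exists for this dataset.
    Requires the `datasets` package and network access.
    """
    from datasets import load_dataset

    dataset = load_dataset("WillHeld/hinglish_top", split=split)
    return code_switched_queries(dataset)
```

Calling `load_queries()` downloads the data, while `code_switched_queries` can be tried on any list of dicts that have the same column names.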

	### Training
	| Epoch | Loss |
	|:--:|:--:|
	| 1 | 0.0485 |
	| 2 | 0.00837 |
	| 3 | 0.00812 |
	| 4 | 0.0029 |
	| 5 | 0.014 |
	| 6 | 0.00748 |
	| 7 | 0.0041 |
	| 8 | 0.00543 |
	| 9 | 0.00304 |
	| 10 | 0.000574 |

	### Inference
	```python
	from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

	# Load the tokenizer and the masked-LM weights from the Hub.
	tokenizer = AutoTokenizer.from_pretrained("SRDdev/SRDBerta")
	model = AutoModelForMaskedLM.from_pretrained("SRDdev/SRDBerta")

	# Build the fill-mask pipeline from the objects loaded above.
	fill = pipeline('fill-mask', model=model, tokenizer=tokenizer)
	```
	```python
	fill_mask = fill.tokenizer.mask_token
	fill(f'Aap {fill_mask} ho?')
	```
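
The fill-mask pipeline returns a list of candidate dictionaries, each with `score`, `token`, `token_str`, and `sequence` keys. As a small sketch of how that output could be summarized, the hypothetical helper `top_predictions` below ranks candidates by score. The sample data is hand-written for illustration and is not output from this model.

```python
from typing import Dict, List, Tuple


def top_predictions(results: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token_str, score) pairs."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"].strip(), round(r["score"], 4)) for r in ranked[:k]]


# Hand-written stand-in for pipeline output; tokens and scores are made up.
sample_results = [
    {"score": 0.10, "token": 102, "token_str": "kahan", "sequence": "Aap kahan ho?"},
    {"score": 0.70, "token": 101, "token_str": "kaise", "sequence": "Aap kaise ho?"},
    {"score": 0.05, "token": 103, "token_str": "kaun", "sequence": "Aap kaun ho?"},
]

print(top_predictions(sample_results, k=2))
```

With a loaded pipeline, the same helper can be applied to the result of `fill(f'Aap {fill_mask} ho?')`.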

	### Citation
	Author: @[SRDdev](https://huggingface.co/SRDdev)
	```
	Name: Shreyas Dixit
	Framework: PyTorch
	Year: Jan 2023
	Pipeline: fill-mask
	GitHub: https://github.com/SRDdev
	LinkedIn: https://www.linkedin.com/in/srddev/
	```