	{
	"model_id": "DeepPavlov/rubert-base-cased-conversational",
	"downloads": 216822,
	"tags": [
	"transformers",
	"pytorch",
	"jax",
	"bert",
	"feature-extraction",
	"ru",
	"endpoints_compatible",
	"region:us"
	],
	"description": "--- language: - ru --- # rubert-base-cased-conversational Conversational RuBERT \\(Russian, cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\\) was trained on OpenSubtitles\\[1\\], Dirty, Pikabu, and a Social Media segment of Taiga corpus\\[2\\]. We assembled a new vocabulary for Conversational RuBERT model on this data and initialized the model with RuBERT. 08.11.2021: upload model with MLM and NSP heads \\[1\\]: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation \\(LREC 2016\\) \\[2\\]: Shavrina T., Shapovalova O. \\(2017\\) TO THE METHODOLOGY OF CORPUS CONSTRUCTION FOR MACHINE LEARNING: «TAIGA» SYNTAX TREE CORPUS AND PARSER. in proc. of “CORPORA2017”, international conference , Saint-Petersbourg, 2017.",
	"model_explanation_gemini": "Russian conversational BERT model trained on diverse dialogue datasets for tasks like masked language modeling and next sentence prediction."
	}