timcryt/epo_lstm


	---
	license: cc-by-sa-4.0
	language:
	- eo
	pipeline_tag: text-generation
	---

	## Description (Priskribo)

	This is a simple generative LSTM model for Esperanto, trained on public data from Esperanto Telegram groups. As a result, the text it generates uses internet slang and an informal style.

	Tio ĉi estas simpla generativa LSTM-modelo por Esperanto, trejnita per publikaj datumoj de esperantaj Telegram-grupoj, uzante interretan slangon kaj neformalan stilon.


	The model can be used in two modes: generative, where a plain query is enough, and dialog, where the query must end with the ">" symbol. Note that, because of peculiarities of the training data, the model barely preserves the semantics of the question in dialog mode, so using this mode is not recommended.

	La modelo uzeblas en du reĝimoj: generativa (ĉi-okaze sufiĉas simpla demando) kaj dialoga (ĉi-okaze la demando devas finiĝi je la simbolo ">"). Bonvolu noti, ke pro la naturo de la trejnaj datumoj en dialoga reĝimo la modelo apenaŭ konservas la semantikon de la demando, kaj uzi ĉi tiun reĝimon estas malrekomendinde.

	## Usage (Uzado)

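	```python
	# Assumes the repository's epo_lstm module is importable and the
	# model files are available locally under ./version_1.
	from epo_lstm import LSTMGeneratorPipeline, EOLSTMGenerator
	from transformers import PreTrainedTokenizerFast

	generator = LSTMGeneratorPipeline(
	    model=EOLSTMGenerator.from_pretrained("./version_1"),
	    tokenizer=PreTrainedTokenizerFast(
	        tokenizer_file="./version_1/tokenizer.json",
	        bos_token="[BOS]",
	        eos_token="\n",  # a newline marks the end of a sequence
	        unk_token="[UNK]",
	        pad_token="[PAD]",
	    ),
	)

	# Generative mode: a plain prompt is enough.
	print(generator('saluton, mi estas'))
	```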
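
The two modes from the description differ only in how the prompt ends, so a small helper can make the choice explicit. The sketch below is not part of the `epo_lstm` package: `build_prompt` and its `mode` parameter are hypothetical names, and appending ">" with no extra whitespace is an assumption, since the model card only states that a dialog query must end with that symbol.

```python
def build_prompt(text: str, mode: str = "generative") -> str:
    """Return a prompt for the given mode (hypothetical helper, not in epo_lstm)."""
    text = text.strip()
    if mode == "generative":
        # Generative mode: the plain text is used as-is.
        return text
    if mode == "dialog":
        # Dialog mode: the query must end with ">".
        # Appending it without whitespace is an assumption.
        return text if text.endswith(">") else text + ">"
    raise ValueError(f"unknown mode: {mode!r}")


print(build_prompt("saluton, mi estas"))
print(build_prompt("kiel vi fartas?", mode="dialog"))
```

The returned string could then be passed to the `generator` pipeline from the usage example, for instance `generator(build_prompt("kiel vi fartas?", mode="dialog"))`. Dialog mode remains not recommended, for the reasons given in the description.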