---
license: afl-3.0
language:
- kk
base_model: nur-dev/llama-1.9B-kaz
library_name: transformers
---

# LLaMA 1.9B Kazakh Instruct Model

This repository contains the LLaMA 1.9B model fine-tuned on a Kazakh-language dataset for instruction-based tasks. The model is trained to provide helpful, relevant, and context-aware responses to prompts in Kazakh. It is particularly effective at answering questions, providing explanations, and assisting in educational and professional contexts.

The model ships with an integrated chat template that structures conversations into the input format the model expects. The tokenizer supports this template, so messages can be formatted automatically before they are passed to the model.

The template follows this structure:

```jinja
{%- if messages[0]['role'] == 'system' %}
{%- set offset = 1 %}
{%- else %}
{%- set offset = 0 %}
{%- endif %}
<|begin_of_text|>
{%- for message in messages %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }}
{%- endfor %}
{{- '<|start_header_id|>' + 'көмекші' + '<|end_header_id|>\n\n' }}
```

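Note that the template closes with the Kazakh assistant role `көмекші` ("assistant") as the generation prompt, and the examples below use `пайдаланушы` ("user") as the user role. As a minimal sketch, the snippet below prints the rendered prompt for a one-message conversation; the expected output shape (shown in the comments) follows from the template above, assuming the tokenizer ships with that template:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nur-dev/llama-1.9B-kaz-instruct")

# A single user turn; "Сәлем!" means "Hello!" in Kazakh
messages = [{"role": "пайдаланушы", "content": "Сәлем!"}]

print(tokenizer.apply_chat_template(messages, tokenize=False))
# Expected shape:
# <|begin_of_text|><|start_header_id|>пайдаланушы<|end_header_id|>
#
# Сәлем!<|eot_id|><|start_header_id|>көмекші<|end_header_id|>
```
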
## Model Details

- **Model Name**: LLaMA 1.9B Kazakh Instruct
- **Model ID**: `nur-dev/llama-1.9B-kaz-instruct`
- **Parameters**: 1.94 billion
- **Architecture**: Causal Language Model (LLaMA)
- **Tokenizer**: LLaMA tokenizer
- **Language**: Kazakh

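As a quick sanity check, the parameter count above can be reproduced by summing the sizes of all weight tensors after loading the model:

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("nur-dev/llama-1.9B-kaz-instruct")

# Sum the element counts of every parameter tensor (~1.94B expected)
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e9:.2f}B parameters")
```
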
## Training Data

The model was fine-tuned on a dataset of 22,000 samples designed for instruction-based tasks. The dataset includes a diverse set of prompts and responses that help the model handle a wide range of topics, from everyday queries to specialized questions.

## How to Use

### Using the Model Directly for Inference

This example uses the `LlamaForCausalLM` and `AutoTokenizer` classes to load the model, format a conversation with the chat template, and generate a response with sampling parameters such as `top_k`, `top_p`, and `temperature`.

```python
from transformers import LlamaForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer
model_directory = "nur-dev/llama-1.9B-kaz-instruct"
model = LlamaForCausalLM.from_pretrained(model_directory)
tokenizer = AutoTokenizer.from_pretrained(model_directory)

# Set the model to evaluation mode and move it to the appropriate device
model.eval()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Example conversation in Kazakh; the user role is the Kazakh word
# "пайдаланушы" ("user"), matching the chat template above
conversation_history = [
    # "You are a reliable AI assistant who answers questions and provides information."
    {"role": "system", "content": "Сіз сұрақтарға жауап беріп, ақпарат ұсынатын сенімді AI көмекшісісіз."},
    # "What changes can artificial intelligence bring to the healthcare sector?"
    {"role": "пайдаланушы", "content": "Жасанды интеллект денсаулық сақтау саласына қандай өзгерістер енгізе алады?"}
]

# Format the conversation using the model's built-in chat template
formatted_conversation = tokenizer.apply_chat_template(conversation_history, tokenize=False)

# Tokenize the input
input_ids = tokenizer.encode(formatted_conversation, return_tensors="pt").to(device)

# Generate a response from the model
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=1000,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2,
        do_sample=True,
        top_k=10,
        top_p=0.5,
        eos_token_id=tokenizer.eos_token_id,
        temperature=1.3
    )

# Decode and print the model's response
# (special tokens are kept to show the template structure)
response = tokenizer.decode(output[0], skip_special_tokens=False)
print(response)
```

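Continuing the example above, if you want only the newly generated reply without the echoed prompt, you can slice off the input tokens before decoding:

```python
# Keep only the tokens generated after the prompt
new_tokens = output[0][input_ids.shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
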
### Using the Pipeline for Text Generation

The `pipeline` API abstracts away most of the setup, letting you generate responses with less boilerplate. Here the assistant answers the same Kazakh question about AI in healthcare.

```python
from transformers import pipeline

# Initialize the text-generation pipeline
pipe = pipeline("text-generation", model="nur-dev/llama-1.9B-kaz-instruct")

# Define the conversation messages (same system and user prompts as above)
messages = [
    {"role": "system", "content": "Сіз сұрақтарға жауап беріп, ақпарат ұсынатын сенімді AI көмекшісісіз."},
    {"role": "пайдаланушы", "content": "Жасанды интеллект денсаулық сақтау саласына қандай өзгерістер енгізе алады?"}
]

response = pipe(messages, max_new_tokens=128)[0]['generated_text']

print(response)
```

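With chat-style (list-of-messages) input, recent `transformers` releases return the whole conversation under `generated_text`, in which case the assistant's reply is the last entry. A defensive way to handle both return formats (this depends on the installed version):

```python
# `generated_text` may be a plain string (older versions) or the full
# message list (newer versions with chat input)
if isinstance(response, list):
    print(response[-1]["content"])  # assistant's reply only
else:
    print(response)
```
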
## Citation

```bibtex
@misc{nurgali_kadyrbek_2024,
  author    = { {NURGALI Kadyrbek} },
  title     = { llama-1.9B-kaz-instruct (Revision 4059a4e) },
  year      = 2024,
  url       = { https://huggingface.co/nur-dev/llama-1.9B-kaz-instruct },
  doi       = { 10.57967/hf/3114 },
  publisher = { Hugging Face }
}
```