---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
---

# Using NeyabAI

## Direct Use

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("XsoraS/NeyabAI")
tokenizer = GPT2TokenizerFast.from_pretrained("XsoraS/NeyabAI")
```
```python
def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors='pt')
    output_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask,
                                max_length=512, do_sample=True, top_p=0.8, temperature=0.7,
                                num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

prompt = "Hello"
raw = generate_response("### Human: " + prompt + " \n### AI:")
response = ' '.join(raw.replace("</s>", "").split())  # drop stray end-of-sequence tokens and collapse whitespace
print(response)
```
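
To run generation on a GPU, the model and the tokenized inputs must live on the same device. A minimal sketch, assuming CUDA is available (it falls back to CPU otherwise):

```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors='pt').to(device)  # keep inputs on the model's device
    output_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask,
                                max_length=512, do_sample=True, top_p=0.8, temperature=0.7,
                                pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```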

## Fine-Tuning

This repository demonstrates how to fine-tune the NeyabAI (GPT-2) language model on a custom dataset using PyTorch and Hugging Face's Transformers library. The code provides an end-to-end example, from loading and tokenizing the dataset to training the model and tracking its loss and accuracy.

## Requirements

- Python 3.6+
- PyTorch
- Transformers (Hugging Face)
- NumPy

You can install the required packages using pip:

```bash
pip install torch transformers numpy
```

## Fine-Tuning Script

The following script outlines the steps for fine-tuning GPT-2 on a custom dataset:
```python
import torch
from torch.optim import AdamW  # transformers' AdamW is deprecated; use the PyTorch implementation
from torch.utils.data import DataLoader, TensorDataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import numpy as np

# Load pre-trained model and tokenizer
model_name = "XsoraS/NeyabAI"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Example dataset
dataset = ["Your custom dataset goes here."]  # Replace with your actual dataset

# Tokenization function
def tokenize_function(text):
    return tokenizer(text, padding='max_length', truncation=True, max_length=512)

# Tokenize the dataset
tokenized_inputs = [tokenize_function(text) for text in dataset]
input_ids = [enc['input_ids'] for enc in tokenized_inputs]
attention_masks = [enc['attention_mask'] for enc in tokenized_inputs]

# Convert to torch tensors
input_ids = torch.tensor(input_ids)
attention_masks = torch.tensor(attention_masks)
labels = input_ids.clone()
labels[attention_masks == 0] = -100  # exclude padding positions from the loss

# Create DataLoader
batch_size = 8
dataset = TensorDataset(input_ids, attention_masks, labels)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Configure device (train in full precision; calling model.half() without a
# gradient scaler tends to destabilize the optimizer)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set up optimizer
optimizer = AdamW(model.parameters(), lr=3e-5)

# Define accuracy calculation: logits at position i predict the token at i+1,
# so shift predictions and labels before comparing, and skip padding (-100)
def calculate_accuracy(preds, labels):
    pred_flat = np.argmax(preds[:, :-1, :], axis=-1).flatten()
    labels_flat = labels[:, 1:].flatten()
    mask = labels_flat != -100
    return np.sum((pred_flat == labels_flat) & mask) / max(mask.sum(), 1)

# Training loop (simplified)
model.train()
for epoch in range(3):  # Adjust the number of epochs as needed
    for batch in dataloader:
        batch = tuple(t.to(device) for t in batch)
        input_ids, attention_masks, labels = batch

        outputs = model(input_ids, attention_mask=attention_masks, labels=labels)
        loss = outputs.loss
        logits = outputs.logits

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        preds = logits.detach().cpu().numpy()
        label_ids = labels.detach().cpu().numpy()
        acc = calculate_accuracy(preds, label_ids)

        print(f"Loss: {loss.item():.4f}, Accuracy: {acc:.4f}")

print("Training complete!")
```
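
Once training finishes, you will typically want to persist the fine-tuned weights. A minimal sketch using the standard `save_pretrained`/`from_pretrained` round trip (the `./neyabai-finetuned` directory name is just an example):

```python
# Save the fine-tuned model and tokenizer to a local directory
model.save_pretrained("./neyabai-finetuned")
tokenizer.save_pretrained("./neyabai-finetuned")

# Reload later for inference
model = GPT2LMHeadModel.from_pretrained("./neyabai-finetuned")
tokenizer = GPT2TokenizerFast.from_pretrained("./neyabai-finetuned")
```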

## Notes

- **Dataset:** Replace the `dataset` variable with your actual dataset; a loading sketch follows this list.
- **Max Length:** Adjust the `max_length` parameter in `tokenize_function` based on the length of your input texts.
- **Batch Size and Learning Rate:** Tune `batch_size` and the learning rate (`lr`) to fit your dataset and hardware capabilities.
- **Epochs:** Adjust the number of epochs based on your convergence criteria.
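
For instance, if your corpus is a plain-text file with one training example per line (the `train.txt` filename here is hypothetical), the `dataset` list in the script above can be built like this:

```python
# Read one training example per line, skipping blank lines
with open("train.txt", encoding="utf-8") as f:
    dataset = [line.strip() for line in f if line.strip()]
```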

## Acknowledgments

- This project uses the [Transformers](https://huggingface.co/transformers/) library by Hugging Face.
- Inspired by various fine-tuning examples and tutorials from the Hugging Face community.