---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
---

# Using NeyabAI

## Direct Use

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("XsoraS/NeyabAI")
tokenizer = GPT2TokenizerFast.from_pretrained("XsoraS/NeyabAI")
```
```python
def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors='pt')
    output_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask,
                                max_length=512, do_sample=True, top_p=0.8, temperature=0.7,
                                num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

prompt = "Hello"
raw = generate_response("### Human: " + prompt + " \n### AI:")
response = ' '.join(raw.replace("</s>", "").split())  # drop stray end-of-sequence tokens and collapse whitespace
print(response)
```
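
To run generation on a GPU, the model and the tokenized inputs must live on the same device. A minimal sketch, assuming CUDA is available (it falls back to CPU otherwise):

```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors='pt').to(device)  # keep inputs on the model's device
    output_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask,
                                max_length=512, do_sample=True, top_p=0.8, temperature=0.7,
                                pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```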

## Fine-Tuning

This repository demonstrates how to fine-tune the NeyabAI (GPT-2) language model on a custom dataset using PyTorch and Hugging Face's Transformers library. The code provides an end-to-end example, from loading and tokenizing the dataset to training the model and tracking its loss and accuracy.

## Requirements

- Python 3.6+
- PyTorch
- Transformers (Hugging Face)
- NumPy

You can install the required packages using pip:

```bash
pip install torch transformers numpy
```

## Fine-Tuning Script

The following script outlines the steps for fine-tuning GPT-2 on a custom dataset:
```python
import torch
from torch.optim import AdamW  # transformers' AdamW is deprecated; use the PyTorch implementation
from torch.utils.data import DataLoader, TensorDataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import numpy as np

# Load pre-trained model and tokenizer
model_name = "XsoraS/NeyabAI"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Example dataset
dataset = ["Your custom dataset goes here."]  # Replace with your actual dataset

# Tokenization function
def tokenize_function(text):
    return tokenizer(text, padding='max_length', truncation=True, max_length=512)

# Tokenize the dataset
tokenized_inputs = [tokenize_function(text) for text in dataset]
input_ids = [enc['input_ids'] for enc in tokenized_inputs]
attention_masks = [enc['attention_mask'] for enc in tokenized_inputs]

# Convert to torch tensors
input_ids = torch.tensor(input_ids)
attention_masks = torch.tensor(attention_masks)
labels = input_ids.clone()
labels[attention_masks == 0] = -100  # exclude padding positions from the loss

# Create DataLoader
batch_size = 8
dataset = TensorDataset(input_ids, attention_masks, labels)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Configure device (train in full precision; calling model.half() without a
# gradient scaler tends to destabilize the optimizer)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set up optimizer
optimizer = AdamW(model.parameters(), lr=3e-5)

# Define accuracy calculation: logits at position i predict the token at i+1,
# so shift predictions and labels before comparing, and skip padding (-100)
def calculate_accuracy(preds, labels):
    pred_flat = np.argmax(preds[:, :-1, :], axis=-1).flatten()
    labels_flat = labels[:, 1:].flatten()
    mask = labels_flat != -100
    return np.sum((pred_flat == labels_flat) & mask) / max(mask.sum(), 1)

# Training loop (simplified)
model.train()
for epoch in range(3):  # Adjust the number of epochs as needed
    for batch in dataloader:
        batch = tuple(t.to(device) for t in batch)
        input_ids, attention_masks, labels = batch

        outputs = model(input_ids, attention_mask=attention_masks, labels=labels)
        loss = outputs.loss
        logits = outputs.logits

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        preds = logits.detach().cpu().numpy()
        label_ids = labels.detach().cpu().numpy()
        acc = calculate_accuracy(preds, label_ids)

        print(f"Loss: {loss.item():.4f}, Accuracy: {acc:.4f}")

print("Training complete!")
```
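
Once training finishes, you will typically want to persist the fine-tuned weights. A minimal sketch using the standard `save_pretrained`/`from_pretrained` round trip (the `./neyabai-finetuned` directory name is just an example):

```python
# Save the fine-tuned model and tokenizer to a local directory
model.save_pretrained("./neyabai-finetuned")
tokenizer.save_pretrained("./neyabai-finetuned")

# Reload later for inference
model = GPT2LMHeadModel.from_pretrained("./neyabai-finetuned")
tokenizer = GPT2TokenizerFast.from_pretrained("./neyabai-finetuned")
```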

## Notes

- **Dataset:** Replace the `dataset` variable with your actual dataset; a loading sketch follows this list.
- **Max Length:** Adjust the `max_length` parameter in `tokenize_function` based on the length of your input texts.
- **Batch Size and Learning Rate:** Tune `batch_size` and the learning rate (`lr`) to fit your dataset and hardware capabilities.
- **Epochs:** Adjust the number of epochs based on your convergence criteria.
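
For instance, if your corpus is a plain-text file with one training example per line (the `train.txt` filename here is hypothetical), the `dataset` list in the script above can be built like this:

```python
# Read one training example per line, skipping blank lines
with open("train.txt", encoding="utf-8") as f:
    dataset = [line.strip() for line in f if line.strip()]
```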

## Acknowledgments

- This project uses the [Transformers](https://huggingface.co/transformers/) library by Hugging Face.
- Inspired by various fine-tuning examples and tutorials from the Hugging Face community.