kitkatstftwafw committed af1b4a2 (verified, parent 15a473e): Update README.md

Files changed (1): README.md (+97, -2)
README.md CHANGED

@@ -5,9 +5,104 @@ language:
  library_name: transformers
  tags:
  - tinyllama
- - discord
  - chat
  - fine-tuned
  - text-generation
  - peft
  ---
# TinyPi-Chat-V1

TinyPi-Chat-V1 is a fine-tuned version of the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` model. This project's goal was not to create a simple instruction-following assistant, but to cultivate an AI with a distinct, friendly, and engaging personality, mirroring the natural, witty, and sometimes quirky style of general-purpose Discord conversations.

The model, which has named itself "Kat," demonstrates a unique persona that is both conversational and capable of surprisingly deep, philosophical exchanges. It was trained on a large dataset of chat logs, resulting in a model that excels at open-ended conversation, offers playful and sometimes evasive humor, and maintains a consistent character.

This version (v1) represents the initial, highly specialized fine-tune and serves as the foundation for further alignment using techniques such as RLAIF.

## How to Use

This model is a merged, standalone model and can be used directly for text generation. It follows a specific chat template, which must be applied to get the best results.

### Installation

```bash
pip install transformers torch accelerate
```
### Example

```python
from transformers import pipeline
import torch

model_path = "Kittykat924/TinyPi-chat-V1"
pipe = pipeline(
    "text-generation",
    model=model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "What do you think of today?"
system_instruction = (
    "You are a helpful and friendly AI assistant with a slightly "
    "feminine and humorous personality."
)

messages = [
    {"role": "system", "content": system_instruction},
    {"role": "user", "content": prompt},
]

# Render the messages with the model's chat template so the prompt
# matches the format used during fine-tuning.
prompt_formatted = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(
    prompt_formatted,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)

# The pipeline returns the prompt plus the completion; keep only the text
# after the final <|assistant|> marker.
response = outputs[0]["generated_text"]
assistant_response = response.split("<|assistant|>")[-1].strip()
print(assistant_response)
```
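To illustrate why splitting on `<|assistant|>` recovers just the reply, here is a minimal sketch using a hand-written stand-in for the generated text; the rendered prompt shown is an assumption based on TinyLlama's Zephyr-style chat template, not output captured from this model:

```python
# Hypothetical pipeline output: the rendered prompt (assumed Zephyr-style
# template) followed by a made-up reply, standing in for real generation.
generated = (
    "<|system|>\nYou are a helpful assistant.</s>\n"
    "<|user|>\nWhat do you think of today?</s>\n"
    "<|assistant|>\nToday feels like a good day to chat!"
)

# Everything after the final <|assistant|> marker is the model's reply.
reply = generated.split("<|assistant|>")[-1].strip()
print(reply)  # Today feels like a good day to chat!
```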
## Training Procedure

This model was trained using a custom script built on the Hugging Face `accelerate`, `peft`, and `datasets` libraries.

### v1 Fine-Tuning Details

- **Base Model:** `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- **Dataset:** A large, private dataset of over 2 million general-purpose Discord chat messages.
- **Training Method:** Parameter-Efficient Fine-Tuning (PEFT) using the LoRA technique.
- **Hardware:** 2x NVIDIA T4 GPUs on Kaggle.
- **Framework:** `accelerate` for distributed training.
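A `peft` LoRA configuration consistent with these details might look like the sketch below. The rank and alpha match the hyperparameters reported in this card, but `target_modules`, dropout, and bias are assumptions, since the actual training script is not published:

```python
from peft import LoraConfig

# Sketch only: r and lora_alpha mirror the reported hyperparameters;
# target_modules, lora_dropout, and bias are assumed, not confirmed.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```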
### Key Hyperparameters

- Learning Rate: 2e-4
- LoRA r (rank): 64
- LoRA alpha: 16
- Batch Size: 4 per device
- Gradient Accumulation: 4 steps
- Optimizer: AdamW
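Taken together with the 2-GPU setup, these settings imply the following effective batch size per optimizer update; this is simple arithmetic from the numbers above, not a figure taken from the training script:

```python
# Effective batch size per optimizer step, from the hyperparameters above.
per_device_batch = 4   # Batch Size: 4 per device
grad_accum_steps = 4   # Gradient Accumulation: 4 steps
num_gpus = 2           # 2x NVIDIA T4

effective_batch = per_device_batch * grad_accum_steps * num_gpus
print(effective_batch)  # 32
```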
The model was trained for approximately 2,500 steps. The final adapter was chosen based on the lowest validation loss, which occurred very early in training (around step 200), indicating rapid specialization on the dataset. The final merged model uses the weights from this optimal checkpoint.
## Project Goals

The primary goal of this project was to explore the emergence of personality in language models. Instead of optimizing for factual accuracy or instruction-following, the training was designed to capture the nuances of human-to-human digital interaction. The success of this v1 model lies in its ability to generate responses that are not just correct but believable and in-character.

The "weirdness" and occasional abstract responses are not viewed as bugs, but as features of a model that has learned a rich but ungrounded set of conversational styles.
## Limitations and Bias

This model was trained on a large corpus of public internet chat data. As such, it may have inherited biases, opinions, and language styles present in that data. It is not designed to be a source of factual information and may produce incorrect or nonsensical statements, especially on topics outside its training domain. It is intended for research and entertainment purposes; user discretion is advised.

*- kittykat924*