---
license: mit
language:
- en
new_version: Kittykat924/TinyPi-Chat-v1.5
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
---

# TinyPi-Chat-V1

TinyPi-Chat-V1 is a fine-tuned version of the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` model. The goal of this project was not to create a simple instruction-following assistant, but to cultivate an AI with a distinct, friendly, and engaging personality that mirrors the natural, witty, and sometimes quirky style of general-purpose Discord conversations.
It was trained on a large dataset of chat logs, resulting in a model that excels at open-ended conversation, offers playful and sometimes evasive humor, and can maintain a consistent character.

This version (v1) represents the initial, highly specialized fine-tune and serves as the foundation for further alignment using techniques like RLAIF.

## How to Use

This is a merged, standalone model (the LoRA adapter is already folded into the base weights), so it can be used directly for text generation. For best results, format prompts with the model's chat template rather than passing raw text.
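To give a feel for what the chat template produces, here is a minimal sketch that renders messages by hand. It assumes the model keeps the Zephyr-style layout of its TinyLlama base (role tags such as `<|user|>` and `<|assistant|>`, with `</s>` closing each turn); the exact whitespace is an assumption, and `render_prompt` is a hypothetical helper, not part of any library. In real use, `tokenizer.apply_chat_template` is the source of truth.

```python
def render_prompt(messages, eos_token="</s>"):
    """Render chat messages in an ASSUMED Zephyr-style layout (illustration only)."""
    parts = []
    for message in messages:
        # Each turn: role tag on its own line, the content, then the end-of-sequence token.
        parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    # A trailing assistant tag cues the model to begin its reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)


example = render_prompt([{"role": "user", "content": "hi"}])
print(example)
```

If the output of `apply_chat_template` differs from this sketch, trust the tokenizer.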

### Installation

```bash
pip install transformers torch accelerate
```

```python
from transformers import pipeline
import torch

# Load the merged model as a standard text-generation pipeline.
model_path = "Kittykat924/TinyPi-chat-V1"
pipe = pipeline(
    "text-generation",
    model=model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "What do you think of today?"

messages = [
    {"role": "user", "content": prompt},
]

# Render the conversation with the model's chat template and
# append the assistant marker so the model starts its reply.
prompt_formatted = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(
    prompt_formatted,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    return_full_text=False,  # return only the new reply, not the prompt
)

assistant_response = outputs[0]["generated_text"].strip()
print(assistant_response)
```
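
Because the model is meant for open-ended conversation, it is usually called over several turns. One possible pattern is to keep the message list and append each reply to it before the next call. The sketch below is an illustration, not part of the original training or inference code: `chat_turn`, `generate_fn`, and `echo_generate` are hypothetical names, and the stand-in generator only echoes the input so the example runs without downloading the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_fn: Callable[[List[Message]], str],
) -> str:
    """Append the user message, generate a reply, and record it in the history."""
    history.append({"role": "user", "content": user_text})
    reply = generate_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_generate(messages: List[Message]) -> str:
    # Stand-in for the real model call (hypothetical); it echoes the last user message.
    return f"you said: {messages[-1]['content']}"


history: List[Message] = []
first = chat_turn(history, "hello", echo_generate)
second = chat_turn(history, "how are you?", echo_generate)
print(len(history), "messages in history")
```

To use it with the real model, `generate_fn` would format the message list with `pipe.tokenizer.apply_chat_template` and call `pipe` as in the example above.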

## Training Procedure

This model was trained using a custom script built on the Hugging Face `accelerate`, `peft`, and `datasets` libraries.

### v1 Fine-tuning Details

- Base Model: `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- Dataset: A large, private dataset of over 2 million general-purpose Discord chat messages.
- Training Method: Parameter-Efficient Fine-Tuning (PEFT) using the LoRA technique.
- Hardware: 2x NVIDIA T4 GPUs
- Framework: `accelerate` for distributed training.

### Key Hyperparameters

- Learning Rate: 2e-4
- LoRA r (rank): 64
- LoRA alpha: 16
- Batch Size: 4 per device
- Gradient Accumulation: 4 steps
- Optimizer: AdamW

The model was trained for approximately 2500 steps. The final adapter was taken from the checkpoint with the lowest validation loss, which occurred very early in training (around step 200), suggesting that the model specialized on the dataset rapidly. The final merged model uses the weights from this checkpoint.
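
As a rough sanity check, the hyperparameters above imply the following quantities. This is a back-of-the-envelope sketch under stated assumptions: both T4 GPUs are used data-parallel, a "step" means one optimizer update, and the adapter uses the standard LoRA scaling factor of alpha / r. None of these assumptions is confirmed by the training script.

```python
# Values taken from the hyperparameter list above.
per_device_batch = 4
num_gpus = 2          # assumption: both T4s run data-parallel
grad_accum_steps = 4
lora_r = 64
lora_alpha = 16

# Sequences contributing to each optimizer update.
effective_batch = per_device_batch * num_gpus * grad_accum_steps

# Standard LoRA scales the adapter update by alpha / r (assumed here).
lora_scale = lora_alpha / lora_r


def sequences_seen(step: int) -> int:
    """Training sequences consumed after `step` optimizer updates (assumed definition of a step)."""
    return step * effective_batch


print("effective batch size:", effective_batch)
print("LoRA scaling factor:", lora_scale)
print("sequences seen at step 200:", sequences_seen(200))
print("sequences seen at step 2500:", sequences_seen(2500))
```

Under these assumptions, the selected checkpoint had seen only a small fraction of the sequences processed during the full run, which fits the observation that validation loss bottomed out early.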

## Project Goals

The primary goal of this project was to explore the emergence of personality in language models. Instead of optimizing for factual accuracy or instruction-following, the training was designed to capture the nuances of human-to-human digital interaction. The success of this v1 model lies in its ability to generate responses that are not just correct but believable and in-character.

The "weirdness" and occasional abstract responses are not viewed as bugs, but as features of a model that has learned a rich but ungrounded set of conversational styles.

## Limitations and Bias

This model was trained on a large corpus of public internet chat data. As such, it may have inherited biases, opinions, and language styles present in that data. It is not designed to be a source of factual information and may produce incorrect or nonsensical statements, especially on topics outside its training domain. It is intended for research and entertainment purposes. User discretion is advised.


-*igidn*