Update README.md
README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
---
|
| 2 |
-
base_model: Qwen/Qwen2.5-
|
| 3 |
tags:
|
| 4 |
- text-generation-inference
|
| 5 |
- transformers
|
|
@@ -19,16 +19,16 @@ pipeline_tag: text-generation
|
|
| 19 |
library_name: transformers
|
| 20 |
---
|
| 21 |
|
| 22 |
-

|
|
| 68 |
|
| 69 |
prompt = "How tall is the Eiffel tower?"
|
| 70 |
messages = [
|
| 71 |
-
{"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5
|
| 72 |
{"role": "user", "content": prompt}
|
| 73 |
]
|
| 74 |
text = tokenizer.apply_chat_template(
|
|
|
|
| 1 |
---
|
| 2 |
+
base_model: Qwen/Qwen2.5-7B-Instruct
|
| 3 |
tags:
|
| 4 |
- text-generation-inference
|
| 5 |
- transformers
|
|
|
|
| 19 |
library_name: transformers
|
| 20 |
---
|
| 21 |
|
| 22 |
+

|
| 23 |
|
| 24 |
+
# Zurich 7B GammaCorpus v2-10k
|
| 25 |
*A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
|
| 26 |
|
| 27 |
## Overview
|
| 28 |
+
Zurich 7B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 7B Instruct** model. Zurich is designed to outperform other models of similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k).
|
| 29 |
|
| 30 |
## Model Details
|
| 31 |
+
- **Base Model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
|
| 32 |
- **Type:** Causal Language Model
|
| 33 |
- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
|
| 34 |
- **Number of Parameters:** 7.61B
|
|
|
|
| 38 |
|
| 39 |
## Training Details
|
| 40 |
|
| 41 |
+
Zurich-7B-GCv2-10k was fine-tuned on a single T4 GPU for ~20 minutes using the [Unsloth](https://unsloth.ai/) framework. It was trained for **60 epochs**.
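
As a rough sanity check on the single-T4 setup, the weight memory implied by the 7.61B parameters listed under Model Details can be estimated per precision. This is a back-of-the-envelope sketch, not the author's training configuration: the precision table is an illustrative assumption, the README does not say which precision or quantization was used, and the estimate ignores activations, gradients, and optimizer state. For reference, a T4 has 16 GB of memory.

```python
# Rough weight-memory estimate for a 7.61B-parameter model.
# Assumption: only the weights are counted; activations, gradients,
# optimizer state, and KV cache are ignored.
PARAMS = 7.61e9  # parameter count listed under Model Details

# Illustrative storage formats (hypothetical choices, not from the README).
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16/bfloat16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name:>17}: {gb:6.2f} GB")
```

At 16-bit precision the weights alone come to roughly 15.2 GB, which nearly fills a 16 GB T4, so a quantized or adapter-based setup seems plausible. The README does not confirm this.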
|
| 42 |
|
| 43 |
## Usage
|
| 44 |
|
|
|
|
| 57 |
```python
|
| 58 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
| 59 |
|
| 60 |
+
model_name = "rubenroy/Zurich-7B-GCv2-10k"
|
| 61 |
|
| 62 |
model = AutoModelForCausalLM.from_pretrained(
|
| 63 |
model_name,
|
|
|
|
| 68 |
|
| 69 |
prompt = "How tall is the Eiffel tower?"
|
| 70 |
messages = [
|
| 71 |
+
{"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 7B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
|
| 72 |
{"role": "user", "content": prompt}
|
| 73 |
]
|
| 74 |
text = tokenizer.apply_chat_template(
|