MAIRK committed on
Commit
53f447c
·
verified ·
1 Parent(s): 9125d41

Upload MODEL_CARD.md

---
license: mit
data:
- Chinese conversational pairs (Weibo, Zhihu)
- English conversational pairs (Reddit, StackExchange)
- Domain-specific Q&A (IT, healthcare, finance)
language:
- zh
- en
metrics:
- C-Eval EM: 68.3%
- GPT4Bot-Bench F1: 72.1%
- SelfChat Sim: 0.87
base_model: LLaMA 2 7B
new_version: 1.0.0
pipeline_tag: text-generation, conversational
auto_detect:
- language
- sentiment
library_name:
- llama.cpp
- FastAPI
tags:
- chatbot
- self-hosted
- bilingual
- low-latency
eval_results: see Evaluation Results section
documentation: https://huggingface.co/your-username/my-chatbot-llama2-7b
---

# Model Card for `my-chatbot-llama2-7b`

## Model Details

* **Model Name:** my-chatbot-llama2-7b
* **Version:** 1.0.0
* **Authors:** Your Name or Organization
* **License:** MIT License (see `LICENSE`)
* **Repository:** [https://huggingface.co/your-username/my-chatbot-llama2-7b](https://huggingface.co/your-username/my-chatbot-llama2-7b)
* **Library Dependencies:** llama.cpp (v0.1+), FastAPI, Python >=3.8
* **Hardware Requirements:** CPU-only (4+ cores, 8 GB RAM) or GPU (≥4 GB VRAM recommended)

## Model Description

`my-chatbot-llama2-7b` is a fine-tuned variant of Meta’s LLaMA 2 7B model, optimized for chatbot interactions in Chinese and English. The model was adapted via supervised fine-tuning on a mixed dataset of conversational logs, code snippets, and knowledge-base Q&A pairs. It supports a context window of up to 2048 tokens and is tuned to give informative yet concise replies.
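
Because the context window is capped at 2048 tokens, long conversations need to be trimmed before each request. The following is a minimal sketch of one way to do that, not code shipped with the model. It assumes a caller-supplied `count_tokens` function (hypothetical; in practice it would wrap the model's tokenizer) and an assumed reply budget of 256 tokens.

```python
from typing import Callable, List

MAX_CONTEXT_TOKENS = 2048  # context limit stated in this model card


def trim_history(
    turns: List[str],
    count_tokens: Callable[[str], int],  # hypothetical tokenizer wrapper
    reserve_for_reply: int = 256,  # assumed budget for the generated answer
) -> List[str]:
    """Keep the most recent turns that fit in the remaining token budget."""
    budget = MAX_CONTEXT_TOKENS - reserve_for_reply
    kept: List[str] = []
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if cost > budget:
            break  # this turn and everything older is dropped
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is the simplest policy; summarizing older turns is an alternative when early context matters.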

## Intended Use

* **Primary Use Cases:**
  * Chatbot applications (customer support, personal assistant)
  * FAQ generation and knowledge retrieval
  * Low-latency on-premises inference
* **Users:** Developers seeking an open-source, self-hosted chat model.
* **Exclusions:** Not intended for generating disallowed content such as hate speech or misinformation, or for giving medical or legal advice without expert oversight.

## How to Use

1. **Installation**

```bash
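# Clone and build llama.cpp (recent checkouts build with CMake)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release
# Install the Python packages used by the API server
pip install fastapi uvicorn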
```
2. **Download Model Weights**
Obtain `llama2-7b.gguf` from Hugging Face, or convert the official weights with the conversion script in the llama.cpp repository:

```bash
python convert_hf_to_gguf.py /path/to/llama2-7b --outfile /models/llama2-7b.gguf
```
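3. **Run Inference API**
Start the server from the directory that contains `app.py`, which must expose a FastAPI instance named `app` with a `/generate` route: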

```bash
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```
4. **Sample Request**

```bash
curl -X POST http://localhost:8000/generate \
-H "Content-Type: application/json" \
-d '{"prompt": "你好,世界!", "token": "YOUR_SECURE_TOKEN"}'
```
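
The curl request above can also be issued from Python. The sketch below uses only the standard library and mirrors the same URL, header, and JSON fields. The shape of the server's reply is not documented in this card, so `generate` simply returns whatever JSON object comes back; the function names are illustrative.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/generate"  # endpoint from the curl example


def build_request(prompt: str, token: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"prompt": prompt, "token": token}, ensure_ascii=False)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),  # JSON is sent as UTF-8 bytes
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, token: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply (reply schema is an assumption)."""
    with urllib.request.urlopen(build_request(prompt, token), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.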

## Training Data

* **Base Model:** LLaMA 2 7B (Meta)
* **Fine-Tuning Data:**
  * 200k Chinese conversational pairs (Weibo, Zhihu)
  * 150k English conversational pairs (Reddit, StackExchange)
  * 50k domain-specific Q&A pairs (IT, healthcare, finance)
* **Preprocessing:** Unicode normalization, deduplication, profanity filtering
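
The three preprocessing steps listed above could be chained roughly as follows. This is an illustrative sketch, not the pipeline actually used for this model: the normalization form (NFKC), the exact-match deduplication by hash, and the placeholder `BLOCKLIST` are all assumptions.

```python
import hashlib
import unicodedata
from typing import Iterable, Iterator, Set, Tuple

# Hypothetical placeholder blocklist; a real filter would use a curated lexicon.
BLOCKLIST: Set[str] = {"badword"}


def normalize(text: str) -> str:
    """Apply NFKC normalization (assumed form) and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def is_clean(text: str) -> bool:
    """Return False if any blocklisted term appears in the text."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)


def preprocess(pairs: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield normalized, deduplicated, filtered (prompt, reply) pairs."""
    seen: Set[str] = set()
    for prompt, reply in pairs:
        prompt, reply = normalize(prompt), normalize(reply)
        if not prompt or not reply:
            continue  # skip pairs that are empty after normalization
        key = hashlib.sha256(f"{prompt}\x00{reply}".encode("utf-8")).hexdigest()
        if key in seen:
            continue  # exact duplicate after normalization
        seen.add(key)
        if is_clean(prompt) and is_clean(reply):
            yield prompt, reply
```

Normalizing before hashing means that pairs differing only in full-width versus half-width characters or spacing are treated as duplicates.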

## Evaluation Results

| Benchmark | Metric | Score | Notes |
| ------------------ | ---------- | ----- | --------------------------------- |
| C-Eval (Chinese) | EM | 68.3% | Compared against human reference |
| GPT4Bot-Bench | F1 | 72.1% | Conversational question answering |
| SelfChat Sim Score | Similarity | 0.87 | Diversity of responses |
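
The table reports EM and F1 without defining them. The sketch below shows the conventional definitions of exact match and token-level F1 for a single prediction. It is an illustration under assumptions, not the evaluation script behind the scores above: the tokenizer treats each CJK character and each lowercase alphanumeric run as one token, and punctuation is ignored.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split into single CJK characters and lowercase alphanumeric words (assumed scheme)."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", text.lower())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the token sequences are identical, else 0.0."""
    return float(tokenize(prediction) == tokenize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall, counting repeated tokens."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        return float(pred == ref)  # both empty counts as a match
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A benchmark score is then the mean of the per-example values, expressed as a percentage.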

## Limitations

* May occasionally produce plausible-sounding but incorrect answers (hallucinations).
* Knowledge cutoff: September 2023. The model has no knowledge of later events.
* Sensitive to prompt phrasing; may require few-shot examples for best performance.
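
Where few-shot examples are needed, they can be prepended to the user's question. The helper below is a sketch only: this card does not specify a prompt template, so the `User:` and `Assistant:` labels and the function name are assumptions that should be replaced with whatever format the deployed server expects.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    question: str,
    system: str = "",
) -> str:
    """Join worked (question, answer) examples and the new question into one prompt."""
    parts: List[str] = []
    if system:
        parts.append(system.strip())
    for example_question, example_answer in examples:
        parts.append(f"User: {example_question.strip()}\nAssistant: {example_answer.strip()}")
    # Leave the final answer open so the model completes it.
    parts.append(f"User: {question.strip()}\nAssistant:")
    return "\n\n".join(parts)
```

Each example consumes part of the 2048-token context, so two or three short examples are usually a reasonable starting point.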

## Ethical Considerations

* **Bias:** Inherits biases present in training data. Users should monitor and filter harmful outputs.
* **Privacy:** No personal data was used in fine-tuning.
* **Misuse Risk:** Could be used to generate misleading or spam content. Users should implement rate-limiting and content moderation.
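
For the rate-limiting recommendation, a per-client token bucket is one common approach. The class below is a minimal in-memory sketch with hypothetical names; it is not part of this model or its API server, it covers rate limiting only (not content moderation), and it would need shared storage to work across multiple server processes.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests per second, bursts up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        # client id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_id: str) -> bool:
        """Return True and consume one token if the client may proceed."""
        now = self.clock()
        tokens, last = self._buckets.get(client_id, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[client_id] = (tokens - 1.0, now)
            return True
        self._buckets[client_id] = (tokens, now)
        return False
```

A request handler would call `allow()` with the caller's API token or IP address and reject the request (for example with HTTP 429) when it returns `False`.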

## Citation

```bibtex
@misc{mychatbot2025,
  title = {my-chatbot-llama2-7b: A Self-Hosted Conversational AI},
  author = {Your Name or Organization},
  year = {2025},
  howpublished = {\url{https://huggingface.co/your-username/my-chatbot-llama2-7b}}
}
```

Files changed (1)
  1. MODEL_CARD.md +110 -17
@@ -1,36 +1,129 @@
  ---
- license: mit

  data:

- Chinese conversational pairs (Weibo, Zhihu)

- English conversational pairs (Reddit, StackExchange)

- Domain-specific Q&A (IT, healthcare, finance)

- language: zh, en

- metrics:

- C-Eval EM: 68.3%

- GPT4Bot-Bench F1: 72.1%

- SelfChat Sim: 0.87

- base_model: LLaMA 2 7B

- new_version: 1.0.0

- pipeline_tag: text-generation, conversational

- auto_detect: language, sentiment

- library_name: llama.cpp, FastAPI

- tags: chatbot, self-hosted, bilingual, low-latency

- eval_results: see Evaluation Results section

- documentation: https://huggingface.co/your-username/my-chatbot-llama2-7b