TheBloke committed
Commit 08d9358 · 1 Parent(s): d2c9679

Upload README.md

Files changed (1):
  1. README.md +22 -28
README.md CHANGED
@@ -1,11 +1,11 @@
  ---
- base_model: migtissera/Tess-Medium-200K-v1.0
  inference: false
  license: other
  license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
  license_name: yi-34b
  model_creator: Migel Tissera
- model_name: Tess Medium 200K v1.0
  model_type: yi
  prompt_template: 'SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack
  when necessary to construct a clear, cohesive Chain of Thought reasoning. Always
@@ -37,14 +37,14 @@ quantized_by: TheBloke
  <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
  <!-- header end -->

- # Tess Medium 200K v1.0 - AWQ
  - Model creator: [Migel Tissera](https://huggingface.co/migtissera)
- - Original model: [Tess Medium 200K v1.0](https://huggingface.co/migtissera/Tess-Medium-200K-v1.0)

  <!-- description start -->
  ## Description

- This repo contains AWQ model files for [Migel Tissera's Tess Medium 200K v1.0](https://huggingface.co/migtissera/Tess-Medium-200K-v1.0).

  These files were quantised using hardware kindly provided by [Massed Compute](https://massedcompute.com/).

@@ -65,10 +65,10 @@ It is supported by:
  <!-- repositories-available start -->
  ## Repositories available

- * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-AWQ)
- * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-GPTQ)
- * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-GGUF)
- * [Migel Tissera's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/migtissera/Tess-Medium-200K-v1.0)
  <!-- repositories-available end -->

  <!-- prompt-template start -->
@@ -93,7 +93,7 @@ Models are released as sharded safetensors files.

  | Branch | Bits | GS | AWQ Dataset | Seq Len | Size |
  | ------ | ---- | -- | ----------- | ------- | ---- |
- | [main](https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-AWQ/tree/main) | 4 | 128 | [wikitext](https://huggingface.co/datasets/wikitext/viewer/wikitext-2-raw-v1) | 4096 | 19.23 GB

  <!-- README_AWQ.md-provided-files end -->

@@ -105,11 +105,11 @@ Please make sure you're using the latest version of [text-generation-webui](http
  It is strongly recommended to use the text-generation-webui one-click-installers unless you're sure you know how to make a manual install.

  1. Click the **Model tab**.
- 2. Under **Download custom model or LoRA**, enter `TheBloke/Tess-Medium-200K-v1.0-AWQ`.
  3. Click **Download**.
  4. The model will start downloading. Once it's finished it will say "Done".
  5. In the top left, click the refresh icon next to **Model**.
- 6. In the **Model** dropdown, choose the model you just downloaded: `Tess-Medium-200K-v1.0-AWQ`
  7. Select **Loader: AutoAWQ**.
  8. Click Load, and the model will load and is now ready for use.
  9. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.
@@ -127,7 +127,7 @@ Documentation on installing and using vLLM [can be found here](https://vllm.read
  For example:

  ```shell
- python3 -m vllm.entrypoints.api_server --model TheBloke/Tess-Medium-200K-v1.0-AWQ --quantization awq --dtype auto
  ```

  - When using vLLM from Python code, again set `quantization=awq`.
@@ -152,7 +152,7 @@ prompts = [prompt_template.format(prompt=prompt) for prompt in prompts]

  sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

- llm = LLM(model="TheBloke/Tess-Medium-200K-v1.0-AWQ", quantization="awq", dtype="auto")

  outputs = llm.generate(prompts, sampling_params)

@@ -172,7 +172,7 @@ Use TGI version 1.1.0 or later. The official Docker container is: `ghcr.io/huggi
  Example Docker parameters:

  ```shell
- --model-id TheBloke/Tess-Medium-200K-v1.0-AWQ --port 3000 --quantize awq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
  ```

  Example Python code for interfacing with TGI (requires [huggingface-hub](https://github.com/huggingface/huggingface_hub) 0.17.0 or later):
@@ -239,7 +239,7 @@ pip3 install .
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

- model_name_or_path = "TheBloke/Tess-Medium-200K-v1.0-AWQ"

  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
  model = AutoModelForCausalLM.from_pretrained(
@@ -353,29 +353,23 @@ And thank you again to a16z for their generous grant.

  <!-- footer end -->

- # Original model card: Migel Tissera's Tess Medium 200K v1.0


  # Tess

- ![Tess](https://huggingface.co/migtissera/Tess-XL-v1.0/resolve/main/Tess.png)

- Tess, short for Tessoro/Tessoso, is a general purpose Large Language Model series. Tess-XS is trained on the Mistral-7B base.


  # Prompt Format:

  ```
- SYSTEM:
  USER: What is the relationship between Earth's atmosphere, magnetic field and gravity?
  ASSISTANT:
  ```

- # Synthia-CoT Format:
- Tess also supports Synthia-CoT format:
-
- ```
- SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack when necessary to construct a clear, cohesive Chain of Thought reasoning. Always answer without hesitation.
- USER:
- ASSISTANT:
- ```
 
  ---
+ base_model: migtissera/Tess-M-Creative-v1.0
  inference: false
  license: other
  license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
  license_name: yi-34b
  model_creator: Migel Tissera
+ model_name: Tess M Creative v1.0
  model_type: yi
  prompt_template: 'SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack
  when necessary to construct a clear, cohesive Chain of Thought reasoning. Always

  <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
  <!-- header end -->

+ # Tess M Creative v1.0 - AWQ
  - Model creator: [Migel Tissera](https://huggingface.co/migtissera)
+ - Original model: [Tess M Creative v1.0](https://huggingface.co/migtissera/Tess-M-Creative-v1.0)

  <!-- description start -->
  ## Description

+ This repo contains AWQ model files for [Migel Tissera's Tess M Creative v1.0](https://huggingface.co/migtissera/Tess-M-Creative-v1.0).

  These files were quantised using hardware kindly provided by [Massed Compute](https://massedcompute.com/).

  <!-- repositories-available start -->
  ## Repositories available

+ * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-AWQ)
+ * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-GPTQ)
+ * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-GGUF)
+ * [Migel Tissera's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/migtissera/Tess-M-Creative-v1.0)
  <!-- repositories-available end -->

  <!-- prompt-template start -->

  | Branch | Bits | GS | AWQ Dataset | Seq Len | Size |
  | ------ | ---- | -- | ----------- | ------- | ---- |
+ | [main](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-AWQ/tree/main) | 4 | 128 | [wikitext](https://huggingface.co/datasets/wikitext/viewer/wikitext-2-raw-v1) | 4096 | 19.23 GB

  <!-- README_AWQ.md-provided-files end -->

  It is strongly recommended to use the text-generation-webui one-click-installers unless you're sure you know how to make a manual install.

  1. Click the **Model tab**.
+ 2. Under **Download custom model or LoRA**, enter `TheBloke/Tess-M-Creative-v1.0-AWQ`.
  3. Click **Download**.
  4. The model will start downloading. Once it's finished it will say "Done".
  5. In the top left, click the refresh icon next to **Model**.
+ 6. In the **Model** dropdown, choose the model you just downloaded: `Tess-M-Creative-v1.0-AWQ`
  7. Select **Loader: AutoAWQ**.
  8. Click Load, and the model will load and is now ready for use.
  9. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.

  For example:

  ```shell
+ python3 -m vllm.entrypoints.api_server --model TheBloke/Tess-M-Creative-v1.0-AWQ --quantization awq --dtype auto
  ```

  - When using vLLM from Python code, again set `quantization=awq`.

  sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

+ llm = LLM(model="TheBloke/Tess-M-Creative-v1.0-AWQ", quantization="awq", dtype="auto")

  outputs = llm.generate(prompts, sampling_params)

  Example Docker parameters:

  ```shell
+ --model-id TheBloke/Tess-M-Creative-v1.0-AWQ --port 3000 --quantize awq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
  ```

  Example Python code for interfacing with TGI (requires [huggingface-hub](https://github.com/huggingface/huggingface_hub) 0.17.0 or later):
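The client snippet itself falls outside this diff's context lines. As a stand-in, here is a minimal sketch that posts the Tess prompt format to the TGI server launched with the Docker parameters above, using only the Python standard library against TGI's REST `/generate` route; the endpoint address, default system message, and `query_tgi` helper are illustrative assumptions, not part of the original README:

```python
import json
import urllib.request

# Endpoint matching the example Docker parameters above (--port 3000);
# adjust host/port for your own TGI deployment (assumption, not from the README).
ENDPOINT = "http://127.0.0.1:3000/generate"

# The SYSTEM/USER/ASSISTANT turn layout from this repo's prompt_template metadata.
PROMPT_TEMPLATE = "SYSTEM: {system}\nUSER: {prompt}\nASSISTANT:"


def query_tgi(prompt, system="You are a helpful AI assistant."):
    """POST one formatted prompt to TGI's /generate route and return the completion."""
    payload = {
        "inputs": PROMPT_TEMPLATE.format(system=system, prompt=prompt),
        "parameters": {"max_new_tokens": 256, "temperature": 0.7},
    }
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

The `huggingface-hub` `InferenceClient` mentioned above wraps this same HTTP API; the raw-request version is shown only to keep the sketch dependency-free.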
 
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

+ model_name_or_path = "TheBloke/Tess-M-Creative-v1.0-AWQ"

  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
  model = AutoModelForCausalLM.from_pretrained(

  <!-- footer end -->

+ # Original model card: Migel Tissera's Tess M Creative v1.0


  # Tess

+ ![Tess](https://huggingface.co/migtissera/Tess-M-v1.0/resolve/main/Tess.png)

+ Tess, short for Tessoro/Tessoso, is a general purpose Large Language Model series. Tess-M series is trained on the Yi-34B-200K base.
+
+ Tess-M-Creative is an AI most suited for creative tasks, such as writing, role play, design and exploring novel concepts. While it has been trained on STEM, its reasoning capabilities may lag state-of-the-art. Please download Tess-M-STEM series for reasoning, logic and STEM related tasks.

  # Prompt Format:

  ```
+ SYSTEM: <ANY SYSTEM CONTEXT>
  USER: What is the relationship between Earth's atmosphere, magnetic field and gravity?
  ASSISTANT:
  ```
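The format above can also be assembled programmatically. A minimal sketch, where `format_prompt` is a hypothetical helper (not from the README) and the system message is copied from this repo's `prompt_template` metadata (the Tree-of-Thoughts instruction):

```python
# System message taken verbatim from this repo's prompt_template metadata.
COT_SYSTEM = (
    "Elaborate on the topic using a Tree of Thoughts and backtrack when "
    "necessary to construct a clear, cohesive Chain of Thought reasoning. "
    "Always answer without hesitation."
)


def format_prompt(user_message, system_context=""):
    """Render one turn in the SYSTEM/USER/ASSISTANT format shown above."""
    return (
        f"SYSTEM: {system_context}\n"
        f"USER: {user_message}\n"
        f"ASSISTANT:"
    )


print(format_prompt(
    "What is the relationship between Earth's atmosphere, magnetic field and gravity?",
    COT_SYSTEM,
))
```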