Upload README.md

README.md CHANGED

@@ -1,11 +1,11 @@
---
base_model: migtissera/Tess-M-Creative-v1.0
inference: false
license: other
license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
license_name: yi-34b
model_creator: Migel Tissera
model_name: Tess M Creative v1.0
model_type: yi
prompt_template: 'SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack
  when necessary to construct a clear, cohesive Chain of Thought reasoning. Always
@@ -37,14 +37,14 @@ quantized_by: TheBloke
<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
<!-- header end -->

# Tess M Creative v1.0 - AWQ
- Model creator: [Migel Tissera](https://huggingface.co/migtissera)
- Original model: [Tess M Creative v1.0](https://huggingface.co/migtissera/Tess-M-Creative-v1.0)

<!-- description start -->
## Description

This repo contains AWQ model files for [Migel Tissera's Tess M Creative v1.0](https://huggingface.co/migtissera/Tess-M-Creative-v1.0).

These files were quantised using hardware kindly provided by [Massed Compute](https://massedcompute.com/).

@@ -65,10 +65,10 @@
<!-- repositories-available start -->
## Repositories available

* [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-AWQ)
* [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-GPTQ)
* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-GGUF)
* [Migel Tissera's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/migtissera/Tess-M-Creative-v1.0)
<!-- repositories-available end -->

<!-- prompt-template start -->

@@ -93,7 +93,7 @@ Models are released as sharded safetensors files.

| Branch | Bits | GS | AWQ Dataset | Seq Len | Size |
| ------ | ---- | -- | ----------- | ------- | ---- |
| [main](https://huggingface.co/TheBloke/Tess-M-Creative-v1.0-AWQ/tree/main) | 4 | 128 | [wikitext](https://huggingface.co/datasets/wikitext/viewer/wikitext-2-raw-v1) | 4096 | 19.23 GB |

<!-- README_AWQ.md-provided-files end -->
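As a rough sanity check on the Size column, packed 4-bit weights alone account for most of the 19.23 GB. This is a back-of-the-envelope sketch, not part of the original card; the ~34.4B parameter count for the Yi-34B base is an approximation, and the remainder is group scales, zero points and tensors kept at higher precision:

```python
# Back-of-the-envelope size of the packed 4-bit AWQ weights.
# 34.4e9 is an approximate parameter count for the Yi-34B base (assumption).
params = 34.4e9
bits_per_weight = 4
size_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> gigabytes
print(f"~{size_gb:.1f} GB of packed 4-bit weights")
```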

@@ -105,11 +105,11 @@
It is strongly recommended to use the text-generation-webui one-click installers unless you're sure you know how to make a manual install.

1. Click the **Model tab**.
2. Under **Download custom model or LoRA**, enter `TheBloke/Tess-M-Creative-v1.0-AWQ`.
3. Click **Download**.
4. The model will start downloading. Once it's finished it will say "Done".
5. In the top left, click the refresh icon next to **Model**.
6. In the **Model** dropdown, choose the model you just downloaded: `Tess-M-Creative-v1.0-AWQ`.
7. Select **Loader: AutoAWQ**.
8. Click **Load**; when loading finishes, the model is ready for use.
9. If you want any custom settings, set them, then click **Save settings for this model** followed by **Reload the Model** in the top right.

@@ -127,7 +127,7 @@
For example:

```shell
python3 -m vllm.entrypoints.api_server --model TheBloke/Tess-M-Creative-v1.0-AWQ --quantization awq --dtype auto
```

- When using vLLM from Python code, again set `quantization=awq`.

@@ -152,7 +152,7 @@ prompts = [prompt_template.format(prompt=prompt) for prompt in prompts]

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="TheBloke/Tess-M-Creative-v1.0-AWQ", quantization="awq", dtype="auto")

outputs = llm.generate(prompts, sampling_params)

@@ -172,7 +172,7 @@
Example Docker parameters:

```shell
--model-id TheBloke/Tess-M-Creative-v1.0-AWQ --port 3000 --quantize awq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
```

Example Python code for interfacing with TGI (requires [huggingface-hub](https://github.com/huggingface/huggingface_hub) 0.17.0 or later):
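The three token flags in the Docker line above partition the context window: input tokens plus generated tokens must fit inside `--max-total-tokens`. A quick check of the budget these particular values imply:

```python
# Context budget implied by the TGI flags above.
max_total_tokens = 4096   # --max-total-tokens
max_input_length = 3696   # --max-input-length
max_new_tokens = max_total_tokens - max_input_length
print(f"room for up to {max_new_tokens} generated tokens per request")
```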

@@ -239,7 +239,7 @@ pip3 install .
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name_or_path = "TheBloke/Tess-M-Creative-v1.0-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(
```

@@ -353,29 +353,23 @@ And thank you again to a16z for their generous grant.

<!-- footer end -->

# Original model card: Migel Tissera's Tess M Creative v1.0


# Tess

![Tess](https://huggingface.co/migtissera/Tess-M-v1.0/resolve/main/Tess.png)

Tess, short for Tessoro/Tessoso, is a general-purpose Large Language Model series. The Tess-M series is trained on the Yi-34B-200K base.

Tess-M-Creative is an AI best suited to creative tasks, such as writing, role play, design and exploring novel concepts. While it has been trained on STEM, its reasoning capabilities may lag behind the state of the art. Please download the Tess-M-STEM series for reasoning, logic and STEM-related tasks.

# Prompt Format:

```
SYSTEM: <ANY SYSTEM CONTEXT>
USER: What is the relationship between Earth's atmosphere, magnetic field and gravity?
ASSISTANT:
```
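Because the format above is plain text, it is easy to assemble programmatically. A minimal sketch (the helper name is ours, not part of any library):

```python
def build_tess_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the SYSTEM/USER/ASSISTANT format above."""
    return f"SYSTEM: {system}\nUSER: {user}\nASSISTANT:"

prompt = build_tess_prompt(
    "You are a helpful assistant.",
    "What is the relationship between Earth's atmosphere, magnetic field and gravity?",
)
print(prompt)
```

The model's reply is whatever the backend generates after the trailing `ASSISTANT:` marker.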