Instructions for using blockblockblock/General-Stories-Mistral-7B-bpw4 with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use blockblockblock/General-Stories-Mistral-7B-bpw4 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="blockblockblock/General-Stories-Mistral-7B-bpw4")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("blockblockblock/General-Stories-Mistral-7B-bpw4")
model = AutoModelForCausalLM.from_pretrained("blockblockblock/General-Stories-Mistral-7B-bpw4")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use blockblockblock/General-Stories-Mistral-7B-bpw4 with vLLM:
Install from pip and serve the model:
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "blockblockblock/General-Stories-Mistral-7B-bpw4"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "blockblockblock/General-Stories-Mistral-7B-bpw4",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
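Both vLLM and SGLang (next section) expose an OpenAI-compatible API, so the server can also be called from Python. A minimal sketch, assuming the `openai` client is installed (`pip install openai`) and the vLLM server above is running on port 8000; for SGLang, change the port in `base_url` to 30000:

```python
from openai import OpenAI

# Local servers ignore the API key, but the client requires one to be set.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="blockblockblock/General-Stories-Mistral-7B-bpw4",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```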
- SGLang
How to use blockblockblock/General-Stories-Mistral-7B-bpw4 with SGLang:
Install from pip and serve the model:
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "blockblockblock/General-Stories-Mistral-7B-bpw4" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "blockblockblock/General-Stories-Mistral-7B-bpw4",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images:
```sh
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "blockblockblock/General-Stories-Mistral-7B-bpw4" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "blockblockblock/General-Stories-Mistral-7B-bpw4",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

- Docker Model Runner
How to use blockblockblock/General-Stories-Mistral-7B-bpw4 with Docker Model Runner:
```sh
docker model run hf.co/blockblockblock/General-Stories-Mistral-7B-bpw4
```
# General-Stories-Mistral-7B
This model is based on my dataset General-Stories-Collection, which contains 1.3 million stories written especially for a general audience.
After an extensive training period spanning over 15 days, this model has been meticulously honed to deliver captivating narratives with broad appeal. Leveraging a vast synthetic dataset comprising approximately 1.3 million stories tailored for diverse readership, this model possesses a deep understanding of narrative intricacies and themes. What sets my model apart is not just its ability to generate stories, but its capacity to evoke emotion, spark imagination, and forge connections with its audience.
I am excited to introduce this powerful tool, ready to spark imagination and entertain readers worldwide with its versatile storytelling capabilities.
As we embark on this exciting journey of AI storytelling, I invite you to explore the endless possibilities my model has to offer. Whether you're a writer seeking inspiration, a reader in search of a captivating tale, or a creative mind eager to push the boundaries of storytelling, my model is here to inspire, entertain, and enrich your literary experience.
Kindly note that this is the QLoRA version.
## GGUF & Exllama
GGUF: TBA
Exllama: TBA
## Training
The entire dataset was trained on 4 x A100 80GB GPUs. Training for 2 epochs took more than 15 days. The Axolotl codebase was used for training, with Mistral-7B-v0.1 as the base model.
## Example Prompt
This model uses the ChatML prompt format.
```
<|im_start|>system
You are a Helpful Assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
You can modify the above prompt as per your requirements.
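For illustration, here is a minimal sketch that fills in this template by hand and generates with Transformers; the system message, user message, and `max_new_tokens` value are placeholders rather than values prescribed by the model card:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "blockblockblock/General-Stories-Mistral-7B-bpw4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build the ChatML prompt shown above; end with the assistant tag so the
# model continues from there.
prompt = (
    "<|im_start|>system\n"
    "You are a Helpful Assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Write a short story about a lighthouse keeper.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```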
I want to say a special thanks to the open-source community for helping & guiding me to better understand AI/model development.
Thank you for your love & support.
## Example Output
Examples 1-4 (sample generations shown as images on the model page).