Script?

by Vezora - opened Jan 2, 2024

Discussion

Vezora

Jan 2, 2024

Could you please share the training script?

UphamProjects

Jan 3, 2024

I second that request. Is it anything like the method used by Q-bert/Mamba-370M? Or was this done with shell commands? A Python solution would be awesome. I used this model to make summaries of some emails and calls with no training, and it works pretty well for generation.

UphamProjects

Jan 11, 2024

Here's how I'm training mine. It would probably be safe to assume I don't know what I'm doing, but I can finagle some half-coherent answers from this.
I'm using the 370M state space model and training it on a random assortment of insurance PDFs covering handbooks, histories, and basic comp. Nothing is organized, but it does train.
https://colab.research.google.com/drive/199DTxoqJFRwrsykIbZpuIxVd40RCP-LJ?usp=sharing

lkurlandski

Jan 13, 2024

@UphamProjects I think you need to open up access to that link so that it is set to "Anyone Can View".

UphamProjects

Jan 15, 2024

I've been tinkering with it. It's still not organized, but it should be accessible for now.

eramax

Jan 24, 2024

Could you please share the training script?

UphamProjects

Jan 24, 2024

Here. You'll need to change paths and stuff, but this should let you train on Colab.

https://colab.research.google.com/drive/16AKSrMI3jEgXfWObrJalmeypqZSSDySo?usp=sharing

eramax

Jan 25, 2024

Thanks

ramzeez88

Mar 27, 2024 (edited Mar 27, 2024)

Could you train the Hugging Face (Transformers) Mamba variant, please?

UphamProjects

Mar 28, 2024

That's new since I last checked. Yeah, I'll check it out.

UphamProjects

Mar 28, 2024

https://colab.research.google.com/drive/1HB69O16hFeQwLZdfIiGlqDrdQliY1Sbb?usp=sharing

Training is pretty straightforward and works out of the box on Colab, as they say on the model page. Just mind the transformers install.
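
For readers who can't open the notebook, here is a minimal sketch of what LoRA fine-tuning of a transformers Mamba checkpoint can look like. It is not the notebook's code. The model ID, dataset, text field, LoRA target modules, and hyperparameters are all assumptions chosen for illustration, and it uses the plain transformers Trainer rather than trl's SFTTrainer, whose arguments differ between versions.

```python
# Sketch only: every name and value below is an assumption, not the poster's setup.
MODEL_ID = "state-spaces/mamba-130m-hf"  # assumed small Mamba checkpoint in transformers format
LORA_TARGETS = ["x_proj", "embeddings", "in_proj", "out_proj"]  # assumed module names


def tokenize_batch(batch, tokenizer, text_field="quote", max_length=256):
    """Tokenize one batch of raw text, truncating long samples."""
    return tokenizer(batch[text_field], truncation=True, max_length=max_length)


def train():
    # Heavy imports live here so the helper above can be used without them.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    if tokenizer.pad_token is None:
        # The collator needs a pad token to batch variable-length samples.
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    lora_config = LoraConfig(
        r=8, target_modules=LORA_TARGETS, task_type="CAUSAL_LM", bias="none"
    )
    model = get_peft_model(model, lora_config)  # train only the LoRA adapters

    # Assumed example dataset with a "quote" text column; swap in your own data.
    dataset = load_dataset("Abirate/english_quotes", split="train")
    dataset = dataset.map(
        lambda batch: tokenize_batch(batch, tokenizer),
        batched=True,
        remove_columns=dataset.column_names,
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="./results",
            num_train_epochs=1,
            per_device_train_batch_size=4,
            logging_steps=10,
            learning_rate=2e-3,
        ),
        train_dataset=dataset,
        # mlm=False gives causal language modeling labels (copies of input_ids).
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()


if __name__ == "__main__":
    train()
```

Changing `MODEL_ID`, the dataset, and `text_field` should be enough to point this at other data, though batch size and learning rate will likely need tuning.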

ramzeez88

Mar 28, 2024

I have never trained a model yet. I will have a look at the Colab though :)

UphamProjects

Mar 28, 2024

Just a note, use
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft mamba-ssm "causal-conv1d>=1.2.0"
not
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft
The second will still let you train but will royally hog the GPU. (The version specifier is quoted so the shell does not treat ">" as an output redirect.)
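
A quick way to tell which of the two installs you ended up with is to check whether the kernel packages are importable before training. This is a minimal sketch: the import names `mamba_ssm` and `causal_conv1d` correspond to the two extra pip packages above, and the helper name is made up for illustration. It only checks that the packages can be found, not that their CUDA kernels actually load on your GPU.

```python
import importlib.util

# Import names (not pip names) of the packages that provide the fast kernels.
FAST_PATH_MODULES = ("mamba_ssm", "causal_conv1d")


def missing_fast_path_modules(modules=FAST_PATH_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_fast_path_modules()
    if missing:
        # Without these, transformers is expected to use its slower fallback path.
        print("Missing:", ", ".join(missing))
    else:
        print("Both kernel packages are importable.")
```

If either name is reported missing, rerun the first pair of install commands and restart the runtime.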
