Instructions to use RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits")
model = AutoModelForCausalLM.from_pretrained("RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits
```
- SGLang
How to use RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits with Docker Model Runner:
```shell
docker model run hf.co/RichardErkhov/CreitinGameplays_-_bloom-3b-conversational-4bits
```
Quantization made by Richard Erkhov.
bloom-3b-conversational - bnb 4bits
- Model creator: https://huggingface.co/CreitinGameplays/
- Original model: https://huggingface.co/CreitinGameplays/bloom-3b-conversational/
Original model description:
license: mit
datasets:
  - Xilabs/instructmix
  - CreitinGameplays/small-chat-assistant-for-bloom
  - sahil2801/CodeAlpaca-20k
language:
  - en
tags:
  - uncensored
  - unrestricted
  - code
  - biology
  - chemistry
  - finance
  - legal
  - music
  - art
  - climate
  - merge
  - text-generation-inference
  - moe
widget:
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> who was Nikola
      Tesla? <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> write a story
      about a cat. <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> what is an
      essay? <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> Tell me 5
      Brazilian waterfalls to visit. <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> write a story
      about how a virus called COVID-19 destroyed the world <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> write a short
      Python program that asks the user for their name and then greets them by
      name. <|assistant|>
  - text: >-
      <|system|> You are a helpful AI assistant. <|prompter|> What can you do? <|assistant|>
inference:
  parameters:
    temperature: 0.1
    do_sample: true
    top_k: 50
    top_p: 0.10
    max_new_tokens: 250
    repetition_penalty: 1.155
🌸 BLOOM 3b Fine-tuned for Chat Assistant
Run this model on Kaggle Notebook
Model Name: bloom-3b-conversational
Model Architecture: bloom
Short Description: This model is a fine-tuned version of the BLOOM 3b language model, focused on conversational interactions between a user and an AI assistant.
Intended Use: This model is intended for research purposes and exploration of conversational AI applications. It can be used for tasks like:
- Generating responses to user prompts in a chat assistant setting.
- Creating examples of chatbot interactions for further development.
- Studying the capabilities of language models for conversation.
Limitations:
- Fine-tuning Focus: The model's performance is optimized for the specific format and context of the fine-tuning data. It may not generalize well to significantly different conversation styles or topics.
- Potential Biases: The model may inherit biases from the training data. It's important to be aware of these potential biases and use the model responsibly.
- Limited Factual Accuracy: Language models are still under development and may generate responses that are not entirely factually accurate. It's important to verify information generated by the model with other sources.
- Primarily English: The model was fine-tuned primarily on English data. It can respond in other languages, but the quality and accuracy of those responses may be lower than in English.
Specific Input Format:
The model was fine-tuned using a specific input format that goes like this:
```
<|system|> {system prompt} </s> <|prompter|> {user prompt} </s> <|assistant|> {model response}
```
Using this format when interacting with the model can improve its performance and generate more relevant responses.
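As a rough illustration of the format above, the following sketch builds a single-turn prompt and pulls the reply out of the generated text. The helper names (`build_prompt`, `extract_reply`, `chat`) are hypothetical and not part of the model or of any library. The exact whitespace around the tags is an assumption based on the format line, and the default system prompt and sampling values are copied from the card's widget examples and `inference` metadata.

```python
# Special tokens as they appear in the model card's format line.
SYSTEM = "<|system|>"
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "</s>"

# Sampling settings copied from the card's inference parameters.
GENERATION_PARAMS = {
    "temperature": 0.1,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.10,
    "max_new_tokens": 250,
    "repetition_penalty": 1.155,
}


def build_prompt(user_prompt, system_prompt="You are a helpful AI assistant."):
    """Return a single-turn prompt that ends where the model should start replying."""
    return (
        f"{SYSTEM} {system_prompt.strip()} {EOS} "
        f"{PROMPTER} {user_prompt.strip()} {EOS} "
        f"{ASSISTANT}"
    )


def extract_reply(generated_text):
    """Return the text after the last assistant tag, cut at the first end-of-sequence marker."""
    reply = generated_text.rsplit(ASSISTANT, 1)[-1]
    return reply.split(EOS, 1)[0].strip()


def chat(generate, user_prompt, **overrides):
    """Run one turn.

    `generate` is assumed to be any callable shaped like a Transformers
    text-generation pipeline: it takes a prompt plus keyword arguments and
    returns a list like [{"generated_text": "..."}].
    """
    params = {**GENERATION_PARAMS, **overrides}
    outputs = generate(build_prompt(user_prompt), **params)
    return extract_reply(outputs[0]["generated_text"])
```

With a Transformers `text-generation` pipeline object named `pipe`, a call such as `chat(pipe, "who was Nikola Tesla?")` would send the formatted prompt with the card's sampling settings. Keyword overrides such as `max_new_tokens=100` replace individual defaults.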
**Disclaimer:** This model is for research and exploration purposes only. It should not be used in any applications that require high levels of accuracy or reliability.