Etherll/Mellum-4b-sft-rust
Etherll/Mellum-4b-sft-rust is a large language model (LLM) fine-tuned specifically for Rust code Fill-in-the-Middle (FIM) tasks. It is built on the JetBrains/Mellum-4b-base model.
This model has been fine-tuned on the Etherll/CodeFIM-Rust-Mellum dataset, which comprises approximately 57,000 Rust-specific FIM examples, to enhance its proficiency in completing Rust code snippets accurately and contextually.
A GGUF version for CPU inference is also available: Etherll/Mellum-4b-sft-rust-GGUF.
Model Description
This model leverages the LLaMA-style architecture of Mellum-4b-base (4 billion parameters) and its extensive pre-training on over 4 trillion tokens. The fine-tuning process focused on adapting the model to the nuances of Rust syntax and common coding patterns for FIM tasks.
Key Features:
- Specialized for Rust: Optimized for Fill-in-the-Middle tasks in Rust.
- Based on Mellum-4b-base: Benefits from JetBrains' robust base model.
- Efficient: Suitable for both cloud and local deployment.
- IDE Integration Ready: Designed for use in developer tooling, and works particularly well with Continue.dev for an enhanced coding assistant experience.
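To make the local-deployment point concrete, here is a back-of-the-envelope estimate of the memory needed for the weights alone. It assumes a nominal 4 billion parameters (the exact count may differ) and ignores activations, KV cache, and runtime overhead. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 4e9  # assumption: "4 billion parameters" taken at face value

# Bytes per parameter for a few common storage precisions.
for label, width in [("fp32", 4.0), ("bf16/fp16", 2.0), ("4-bit", 0.5)]:
    print(f"{label:>10}: ~{weight_memory_gib(NOMINAL_PARAMS, width):.2f} GiB")
```

Actual usage will be higher than these figures once context length and runtime buffers are accounted for.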
Fine-tuning Data
- Dataset: Etherll/CodeFIM-Rust-Mellum
- Size: ~57,000 rows
- Focus: Rust code Fill-in-the-Middle
FIM Format
This model is trained to recognize a specific format for Fill-in-the-Middle tasks. When providing input for FIM, please use the following structure:
<filename>{{{filename}}}
<fim_suffix>{{{suffix_code}}}<fim_prefix>{{{prefix_code}}}<fim_middle>
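As a minimal sketch of the format above, the Python helper below assembles a FIM prompt. It assumes the triple-brace fields are template placeholders to be replaced with raw text, and that the filename line ends with a single newline, as displayed. The names `build_fim_prompt` and `generate_middle` are hypothetical. `generate_middle` relies on the standard Transformers loading and generation calls and is only defined here, not run.

```python
MODEL_ID = "Etherll/Mellum-4b-sft-rust"


def build_fim_prompt(filename: str, prefix: str, suffix: str) -> str:
    """Assemble a FIM prompt. Note that the suffix comes before the prefix."""
    return (
        f"<filename>{filename}\n"
        f"<fim_suffix>{suffix}<fim_prefix>{prefix}<fim_middle>"
    )


def generate_middle(prompt: str, max_new_tokens: int = 64) -> str:
    """Ask the model for the missing middle (downloads the weights on first use)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # return_token_type_ids=False is passed defensively: some tokenizers emit
    # token_type_ids, which generate() may reject for causal language models.
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, i.e. the proposed middle section.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Illustrative Rust snippet with the function body left for the model to fill.
prompt = build_fim_prompt(
    filename="src/lib.rs",
    prefix="pub fn add(a: i32, b: i32) -> i32 {\n    ",
    suffix="\n}\n",
)
print(prompt)
```

The completed file is then the prefix, the generated middle, and the suffix concatenated in that order.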
How to Use
With Continue.dev
For the best integrated development experience, it's highly recommended to use this model with Continue.dev.
Refer to the Continue.dev documentation for instructions on how to add custom LLMs.
GGUF Version
A GGUF version is available at Etherll/Mellum-4b-sft-rust-GGUF. This format is suitable for local inference on CPU (and on GPU with appropriate builds) using tools like llama.cpp and Ollama.
Support & Community
If you need any help, have questions, or just want to chat, feel free to message me on Discord: etherl
