Instructions for using marcoonorato91/LLAMUsic2-1b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use marcoonorato91/LLAMUsic2-1b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="marcoonorato91/LLAMUsic2-1b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("marcoonorato91/LLAMUsic2-1b")
model = AutoModelForCausalLM.from_pretrained("marcoonorato91/LLAMUsic2-1b", device_map="auto")
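
Once the tokenizer and model are loaded directly, a reply can be produced with `generate()`. The following is a minimal sketch, not the authors' reference code: it assumes the tokenizer ships a chat template (as the Llama 3.2 instruct tokenizers do), and the helper names `build_messages` and `generate_reply` as well as the default `max_new_tokens=256` are hypothetical choices made for illustration.

```python
def build_messages(user_prompt, system_prompt="You are LLAMUsic, an artificial intelligence expert of music."):
    """Return a chat-style message list (system prompt followed by the user turn)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(user_prompt, model_id="marcoonorato91/LLAMUsic2-1b", max_new_tokens=256):
    """Load the model and generate one reply. Downloads weights on first call."""
    # Imported lazily so that defining this function does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Assumption: the tokenizer provides a chat template for this model.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("Suggest a chord progression in E minor.")` would load the model and return the decoded reply as a string.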

Local Apps

vLLM

How to use marcoonorato91/LLAMUsic2-1b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "marcoonorato91/LLAMUsic2-1b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marcoonorato91/LLAMUsic2-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
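
The same request can be issued from Python. This is a hedged sketch using only the standard library. It assumes the vLLM server from the commands above is listening on `localhost:8000` and returns the usual OpenAI-style completions body (`choices[0].text`). The function names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="marcoonorato91/LLAMUsic2-1b",
                             max_tokens=512, temperature=0.5):
    """Build a POST request equivalent to the curl call (does not send it)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Assumes an OpenAI-compatible response with a non-empty "choices" list.
    """
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` would return the generated continuation.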

Use Docker

docker model run hf.co/marcoonorato91/LLAMUsic2-1b

SGLang

How to use marcoonorato91/LLAMUsic2-1b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "marcoonorato91/LLAMUsic2-1b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marcoonorato91/LLAMUsic2-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
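
The server answers with a JSON body. As a rough sketch of how the generated text might be pulled out of it, the helper below assumes the OpenAI-style completions layout (a `choices` list whose items carry a `text` field). The name `extract_completion_texts` is hypothetical, and the sample body is hand-written for illustration, not real output from this model.

```python
import json


def extract_completion_texts(response_body):
    """Return the "text" field of every choice in an OpenAI-style completions body."""
    data = json.loads(response_body)
    return [choice["text"] for choice in data.get("choices", [])]


# Hand-written sample body for illustration only (not actual model output).
sample_body = json.dumps({
    "choices": [
        {"index": 0, "text": " (generated text would appear here)", "finish_reason": "length"}
    ]
})

print(extract_completion_texts(sample_body))
```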

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "marcoonorato91/LLAMUsic2-1b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marcoonorato91/LLAMUsic2-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use marcoonorato91/LLAMUsic2-1b with Docker Model Runner:
```
docker model run hf.co/marcoonorato91/LLAMUsic2-1b
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.


Model Information

WELCOME TO LLAMUSIC 2! LLAMUsic is a fine-tuned version of the Llama 3.2 instruction-tuned generative model in the 1B size (text in/text out).

Model Developers: Marco Onorato, Riccardo Preite, Niccolò Monaco

Model Architecture: Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

Supported Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported.

Llama 3.2 Model Family: Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.

Model Release Date: 2025-03-11

Status: This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety.

License: MIT License. Please use this model conscientiously.

Feedback: You can contact info.llamusic@gmail.com

Intended Use

Intended Use Cases: Llama 3.2 is intended for personal and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat and agentic applications such as knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting. Pretrained models can be adapted for a variety of additional natural language generation tasks. Similarly, quantized models can be adapted for a variety of on-device use cases with limited compute resources.

Out of Scope: Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

How to use

Use with transformers

With transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or the Auto classes with the generate() function.

Update your transformers installation with `pip install --upgrade transformers`.

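import torch
from transformers import pipeline

model_id = "marcoonorato91/LLAMUsic2-1b"

# Build a text-generation pipeline with bfloat16 weights and automatic device placement.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input: a system prompt followed by the user request.
messages = [
    {"role": "system", "content": "You are LLAMUsic, an artificial intelligence expert of music."},
    {"role": "user", "content": "Write a guitar tab in the style of Metallica and include lyrics."},
]

outputs = pipe(
    messages,
    max_new_tokens=4096,
)

# The last entry of the returned conversation is the newly generated assistant message.
print(outputs[0]["generated_text"][-1])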
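
With chat-style input, the pipeline returns the whole conversation as a list of message dictionaries, so the final `print` shows a dictionary with `role` and `content` keys. A small helper can return just the reply text. This is a sketch: the name `last_assistant_reply` is hypothetical, and `mock_outputs` is a hand-built stand-in for real pipeline output so the snippet runs without loading the model.

```python
def last_assistant_reply(outputs):
    """Return the text of the final assistant message from chat pipeline output."""
    conversation = outputs[0]["generated_text"]
    last_message = conversation[-1]
    if last_message.get("role") != "assistant":
        raise ValueError("The last message is not from the assistant.")
    return last_message["content"]


# Hand-built stand-in with the same shape as the pipeline's chat output.
mock_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are LLAMUsic, an artificial intelligence expert of music."},
            {"role": "user", "content": "Write a guitar tab in the style of Metallica and include lyrics."},
            {"role": "assistant", "content": "(model reply would appear here)"},
        ]
    }
]

print(last_assistant_reply(mock_outputs))
```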

Use with `ollama`

Please follow the instructions here to install ollama.

Then you can pull the model from the public llamusic ollama hub.

Two models are available: the standard version and the Q4_K_M quantized version.


Safetensors: model size 1B params, tensor type F16
