Instructions to use Mattimax/DAC60M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Mattimax/DAC60M with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Mattimax/DAC60M")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Mattimax/DAC60M")
model = AutoModelForCausalLM.from_pretrained("Mattimax/DAC60M")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Mattimax/DAC60M with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Mattimax/DAC60M"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mattimax/DAC60M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
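The same request can also be issued from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on localhost:8000. The helper names build_chat_request and chat are illustrative and are not part of vLLM.

```python
import json
import urllib.request

# Assumption: the server started with `vllm serve` listens on this address.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt, model="Mattimax/DAC60M", base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, chat("What is the capital of France?") returns the generated reply as a string. Passing base_url="http://localhost:30000" sends the same request to a server on port 30000 instead.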

Use Docker

docker model run hf.co/Mattimax/DAC60M

SGLang

How to use Mattimax/DAC60M with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Mattimax/DAC60M" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mattimax/DAC60M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Mattimax/DAC60M" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mattimax/DAC60M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Mattimax/DAC60M with Docker Model Runner:
```
docker model run hf.co/Mattimax/DAC60M
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.


DAC60M

DAC60M is a compact language model developed by M.INC. Research and trained by Mattimax, designed to explore the trade-off between small model size and effective text generation in Italian.

The model adopts a LLaMA-style decoder-only architecture with roughly 67 million parameters in total, and is optimized for research, experimentation, and deployment on limited resources.

Key Facts

Developer: M.INC. Research
Trainer: Mattimax (https://huggingface.co/Mattimax)
Model type: Decoder-only Transformer (LLaMA-style causal LM)
Parameters: ~67M
Primary language: Italian
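
As a rough guide to what deployment on limited resources means here, the weight footprint follows from the parameter count and the bytes stored per parameter. The sketch below assumes the approximate ~67M figure above and the float32 dtype listed in the model configuration. The float16 row is a hypothetical cast, not a released variant, and activations and the KV cache are ignored.

```python
# Approximate parameter count from the Key Facts above.
APPROX_PARAMS = 67_000_000

# Bytes per parameter for common dtypes (float16 is a hypothetical cast).
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_megabytes(num_params, dtype):
    """Return the size of the weights alone, in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_megabytes(APPROX_PARAMS, dtype):.0f} MB")
```

This gives roughly 268 MB for the float32 weights, and about half that if the weights were cast to float16.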

🔍 Overview

DAC60M is intended as a lightweight but structurally sound model, designed for:

experimenting with compact LLaMA architectures,
running fast, low-cost fine-tuning,
testing conversational pipelines on limited hardware,
serving as a base for distillation or academic research.

The goal is not to compete with larger-scale models, but to offer a clean, transparent, and easily extensible baseline in the small language model segment.

🧠 Architecture

DAC60M uses a custom variant of LlamaForCausalLM.

Core Configuration

{
  "architectures": ["LlamaForCausalLM"],
  "model_type": "llama",
  "hidden_size": 512,
  "intermediate_size": 2048,
  "num_hidden_layers": 8,
  "num_attention_heads": 8,
  "num_key_value_heads": 8,
  "head_dim": 64,
  "hidden_act": "silu",
  "max_position_embeddings": 2048,
  "vocab_size": 32768,
  "attention_bias": false,
  "attention_dropout": 0.0,
  "mlp_bias": false,
  "rms_norm_eps": 1e-06,
  "rope_theta": 10000.0,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "initializer_range": 0.02,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "torch_dtype": "float32",
  "use_cache": true,
  "transformers_version": "4.51.3"
}
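
The ~67M figure can be checked against this configuration. The sketch below counts weights for a standard LLaMA-style layout: untied input and output embeddings, four bias-free attention projections, a gated three-matrix MLP, and RMSNorm weights. That layout is an assumption about how Transformers builds LlamaForCausalLM from these values, not something the card states.

```python
# Values copied from the configuration above.
cfg = {
    "hidden_size": 512,
    "intermediate_size": 2048,
    "num_hidden_layers": 8,
    "num_attention_heads": 8,
    "num_key_value_heads": 8,
    "head_dim": 64,
    "vocab_size": 32768,
    "tie_word_embeddings": False,
}


def count_parameters(c):
    """Count weights for an assumed standard LLaMA-style layout (no biases)."""
    h, ffn = c["hidden_size"], c["intermediate_size"]
    q_dim = c["num_attention_heads"] * c["head_dim"]
    kv_dim = c["num_key_value_heads"] * c["head_dim"]

    embed = c["vocab_size"] * h
    # The output projection is a separate matrix unless embeddings are tied.
    lm_head = 0 if c["tie_word_embeddings"] else c["vocab_size"] * h
    attn = h * q_dim + 2 * h * kv_dim + q_dim * h  # q, k, v, o projections
    mlp = 3 * h * ffn                              # gate, up, down projections
    norms = 2 * h                                  # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    final_norm = h
    return embed + lm_head + c["num_hidden_layers"] * per_layer + final_norm


total = count_parameters(cfg)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the count is 67,117,568, consistent with the ~67M stated in the Key Facts. About half of it (33,554,432) sits in the two embedding matrices.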

🔤 Tokenizer

DAC60M uses the tokenizer of the model:

sapienzanlp/Minerva-350M-base-v1.0 https://huggingface.co/sapienzanlp/Minerva-350M-base-v1.0

Rationale:

Large vocabulary (32k)
Good multilingual support
Proven stability

📚 Training

Training details:

Framework: PyTorch + HuggingFace Transformers
Objective: Causal Language Modeling
Precision: float32

(Further details on dataset, token count, and schedule can be added if available)
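
For readers unfamiliar with the objective, causal language modeling trains the model to predict each token from the tokens before it, and scores those predictions with cross-entropy. The sketch below illustrates that loss in plain Python on toy numbers. It is not the training code used for DAC60M, and the logits and token ids are made up for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def causal_lm_loss(logits, token_ids):
    """Mean next-token cross-entropy.

    logits[t] holds the scores produced after reading token_ids[: t + 1],
    so it is compared with token_ids[t + 1] (the shift-by-one target).
    The last position has no target and is skipped.
    """
    losses = []
    for t in range(len(token_ids) - 1):
        target = token_ids[t + 1]
        losses.append(-log_softmax(logits[t])[target])
    return sum(losses) / len(losses)


# Toy example: vocabulary of 4 tokens, sequence of 3 tokens.
token_ids = [1, 3, 2]
uniform_logits = [[0.0, 0.0, 0.0, 0.0] for _ in token_ids]
print(causal_lm_loss(uniform_logits, token_ids))
```

With uniform logits over four tokens the loss equals ln(4), about 1.386, which is the value an untrained model would score on this toy vocabulary. Training lowers the loss by moving probability mass toward the actual next token.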

🎯 Intended Use

DAC60M is suitable for:

Text generation
Autocompletion
Experimental chatbots
Studying scaling laws
Distillation

It is not designed for:

Medical use
Legal use
Critical decision making

⚠️ Limitations

Limited capabilities compared with models above 1B parameters
Possible hallucinations
Sensitive to prompt quality

🛡️ Ethical Considerations

The model may generate incorrect or misleading content. It is the user's responsibility to:

Filter outputs
Implement moderation
Avoid harmful uses

🔁 Reproducibility

To reproduce the environment:

pip install transformers==4.51.3 torch
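
After installing, it can help to confirm that the environment actually matches the pin. The sketch below uses the standard library's importlib.metadata. The helper name check_pin is illustrative, and torch is left unchecked because the card does not specify a version for it.

```python
from importlib import metadata

# Pin taken from the install command above.
PINNED = {"transformers": "4.51.3"}


def check_pin(package, expected, installed=None):
    """Return (ok, installed_version); installed_version is None if missing.

    `installed` can be passed explicitly to skip the environment lookup.
    """
    if installed is None:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            return False, None
    return installed == expected, installed


for name, expected in PINNED.items():
    ok, found = check_pin(name, expected)
    print(f"{name}: expected {expected}, found {found}, ok={ok}")
```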

📌 Citation

@misc{dac60m,
  title={DAC60M: A Compact LLaMA-style Language Model},
  author={M.INC. Research and Mattimax},
  year={2025},
  url={https://huggingface.co/Mattimax}
}