Instructions for using Loom-Labs/Daedalus-1-8B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Loom-Labs/Daedalus-1-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Loom-Labs/Daedalus-1-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Loom-Labs/Daedalus-1-8B")
model = AutoModelForCausalLM.from_pretrained("Loom-Labs/Daedalus-1-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Loom-Labs/Daedalus-1-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Loom-Labs/Daedalus-1-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loom-Labs/Daedalus-1-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
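
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Loom-Labs/Daedalus-1-8B",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the reply as a string. Passing `base_url="http://localhost:30000"` would target a server on a different port.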

Use Docker

docker model run hf.co/Loom-Labs/Daedalus-1-8B

SGLang

How to use Loom-Labs/Daedalus-1-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Loom-Labs/Daedalus-1-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loom-Labs/Daedalus-1-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Loom-Labs/Daedalus-1-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loom-Labs/Daedalus-1-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use Loom-Labs/Daedalus-1-8B with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Loom-Labs/Daedalus-1-8B to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Loom-Labs/Daedalus-1-8B to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Loom-Labs/Daedalus-1-8B to start chatting

Load model with FastModel

# Install Unsloth first (shell command): pip install unsloth
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="Loom-Labs/Daedalus-1-8B",
    max_seq_length=2048,
)

Docker Model Runner

How to use Loom-Labs/Daedalus-1-8B with Docker Model Runner:

```
docker model run hf.co/Loom-Labs/Daedalus-1-8B
```

Daedalus-1-8B

Daedalus-1-8B is an 8-billion-parameter language model for code generation and reasoning, developed by Noema Research. It is a fine-tuned derivative of Seed-Coder-8B-Reasoning, with enhancements for instruction following, structured code generation, and improved safety alignment.

Model Overview

Base model: ByteDance-Seed/Seed-Coder-8B-Reasoning
Architecture: Decoder-only transformer
Parameters: ~8.25B
Context length: Long-context support (up to ~64k tokens)
Domain: Programming and natural language reasoning
Primary applications:
- Code generation and completion
- Debugging and error explanation
- Unit test generation
- Structured outputs (e.g., JSON, function calls)
License: MIT

Key Improvements

Relative to the base model, Daedalus introduces targeted post-training improvements:

- Instruction tuning for developer-oriented tasks
- Structured output fidelity, supporting JSON and schema-constrained responses
- Enhanced reasoning for debugging and multi-step problem solving
- Reduced error rate in code execution benchmarks
- Safety-oriented adjustments, including avoidance of unsafe coding patterns
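
Even with improved structured-output fidelity, it is prudent to validate responses before using them. The following is a minimal sketch, assuming the model was asked to reply with a single JSON object and may wrap it in a Markdown code fence. `parse_json_reply` and its `required_keys` check are hypothetical helpers, not part of the model or of Transformers.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_json_reply(text, required_keys=()):
    """Parse a model reply as a JSON object, or raise ValueError."""
    cleaned = text.strip()
    # Drop a surrounding Markdown fence (with optional language tag) if present.
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        first_newline = cleaned.find("\n")
        if first_newline == -1:
            raise ValueError("reply contains a fence but no payload")
        cleaned = cleaned[first_newline + 1:-len(FENCE)].strip()
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data
```

A caller might retry the generation or fall back to a default when `ValueError` is raised. Full schema validation would need a dedicated validator, which this sketch does not attempt.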

Usage

The model is released in Hugging Face Transformers format. Example:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoemaResearch/Daedalus-1-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role":"system", "content":"You are Daedalus, a coding assistant."},
    {"role":"user", "content":"Write a memory-efficient quicksort in Python with unit tests."}
]

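inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # return a dict so it can be unpacked into generate()
    return_tensors="pt",
).to(model.device)
# do_sample=True is needed for temperature and top_p to take effect
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.2, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))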

Recommended settings:

- temperature=0.2–0.6 for more deterministic code generation (lower values reduce randomness)
- top_p=0.9–0.95 for balanced creativity and correctness
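
As a sketch of how these ranges might be enforced in application code, the hypothetical helper below builds keyword arguments for `model.generate`. It sets `do_sample=True`, since Transformers only applies `temperature` and `top_p` when sampling is enabled. The range checks simply mirror the recommendations above and are not enforced by the model itself.

```python
def sampling_kwargs(temperature=0.2, top_p=0.95, max_new_tokens=1024):
    """Return generate() kwargs, rejecting values outside the recommended ranges."""
    if not 0.2 <= temperature <= 0.6:
        raise ValueError("temperature should be within 0.2-0.6")
    if not 0.9 <= top_p <= 0.95:
        raise ValueError("top_p should be within 0.9-0.95")
    return {
        "do_sample": True,  # required for temperature/top_p to take effect
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }
```

It could then be used as `model.generate(**inputs, **sampling_kwargs(temperature=0.3))`.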

Evaluation

Daedalus inherits strong performance on competitive programming and reasoning tasks from Seed-Coder-8B-Reasoning. Internal evaluations indicate:

- Higher unit test pass rates
- Improved structured output validity
- Reduced incidence of hallucinated APIs

A comprehensive benchmark report will be released in a future update. For upstream benchmarks, please refer to the Seed-Coder-8B-Reasoning model card.

Limitations

Daedalus remains subject to common limitations of large language models:

- Hallucinated libraries or functions: the model may generate non-existent APIs
- Insecure coding patterns: suggestions should be reviewed for security and safety
- Reasoning errors: multi-step solutions may fail on complex edge cases
- Dependence on prompt quality: outputs are sensitive to phrasing and context

All generated code should be verified, linted, and tested before use in production.
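
One inexpensive check for hallucinated libraries is to confirm that every module the generated code imports can be found in the current environment. The sketch below is an illustration under that assumption. `unresolved_imports` is a hypothetical helper, and it checks only top-level module names, not whether the functions called on them exist.

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return top-level module names imported by `source` that cannot be found."""
    tree = ast.parse(source)  # raises SyntaxError if the code does not parse
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which depend on package context.
            modules.add(node.module.split(".")[0])
    # find_spec returns None when a top-level module is not installed.
    return sorted(m for m in modules if importlib.util.find_spec(m) is None)
```

A non-empty result suggests the code should be reviewed before it is run. An empty result does not replace linting and tests.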

Responsible Use

- Do not provide secrets or credentials in prompts.
- Use outputs only in controlled, sandboxed, or reviewed environments.
- Do not use the model to generate malicious software or unsafe code.
- We encourage the use of additional guardrails (static analyzers, test harnesses, execution sandboxes) in deployment contexts.
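
As a starting point for a test harness, generated code can at least be run in a separate process with a time limit. The sketch below is a hypothetical helper, not part of any Daedalus tooling. It assumes a local Python interpreter and is not a security boundary, so a real deployment would still need a container or VM-level sandbox.

```python
import subprocess
import sys
import tempfile


def run_generated_code(source, timeout_s=5.0):
    """Run `source` in a separate Python process and capture its output.

    This limits runtime and keeps the code out of the caller's process, but it
    is NOT a security boundary: the child can still access files and the network.
    """
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: isolated mode
            cwd=workdir,  # run in a throwaway working directory
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises subprocess.TimeoutExpired when exceeded
        )
    return result.returncode, result.stdout, result.stderr
```

The return code and captured streams can then be passed to assertions or logged for review.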

Model Variants

- Full-precision (safetensors): for research and high-fidelity inference
- bf16 / fp16: for efficient inference on modern accelerators
- Quantized variants (int8, int4): for resource-constrained environments
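
To help choose a variant, the weight storage can be estimated from the parameter count. The sketch below is back-of-the-envelope arithmetic. It assumes about 8.25B parameters, treats full precision as 32-bit floats, and ignores quantization metadata, activations, and the KV cache, so real memory use will be higher.

```python
PARAMS = 8.25e9  # approximate parameter count from the model overview

# Assumed storage width per parameter for each variant.
BITS_PER_PARAM = {"fp32": 32, "bf16": 16, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(precision, params=PARAMS):
    """Approximate weight storage in GB (10^9 bytes) for a given precision."""
    return params * BITS_PER_PARAM[precision] / 8 / 1e9


for name in BITS_PER_PARAM:
    print(f"{name}: ~{weight_memory_gb(name):.1f} GB")
```

Under these assumptions the bf16 weights alone take about 16.5 GB, and int4 reduces this to roughly a quarter.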

Citation

If you use this model, please cite both Daedalus and the underlying Seed-Coder base model:

@misc{noema2025daedalus,
  title={Daedalus-1-8B},
  author={Noema Research},
  year={2025},
  howpublished={\url{https://huggingface.co/NoemaResearch/Daedalus-1-8B}}
}

Acknowledgements

Daedalus builds upon the Seed-Coder family of models developed by ByteDance-Seed. We thank the Seed team for releasing their models under permissive terms, enabling further research and refinement.