How to use from llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RoeAcquisitions/aether-mini-v1
# Run inference directly in the terminal:
llama cli -hf RoeAcquisitions/aether-mini-v1
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RoeAcquisitions/aether-mini-v1
# Run inference directly in the terminal:
llama cli -hf RoeAcquisitions/aether-mini-v1
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RoeAcquisitions/aether-mini-v1
# Run inference directly in the terminal:
./llama-cli -hf RoeAcquisitions/aether-mini-v1
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RoeAcquisitions/aether-mini-v1
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RoeAcquisitions/aether-mini-v1
Use Docker
docker model run hf.co/RoeAcquisitions/aether-mini-v1
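
Once `llama-server` is running, any HTTP client can talk to its OpenAI-compatible endpoint. The sketch below uses only the Python standard library and makes several assumptions: the server listens on llama.cpp's default address `http://127.0.0.1:8080`, the chat route is `/v1/chat/completions`, and the response follows the OpenAI chat-completions shape. The helper names `build_chat_request` and `chat` are illustrative and not part of llama.cpp.

```python
import json
import urllib.request

# Assumption: llama-server was started locally with its default host and port.
BASE_URL = "http://127.0.0.1:8080"


def build_chat_request(prompt, base_url=BASE_URL, model="aether-mini-v1"):
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the prompt to the local server and return the reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response with a "choices" list.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Write a Python function to calculate fibonacci numbers"))` should print the model's reply. If the server was started on a different host or port, pass `base_url=...` accordingly.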

Aether Mini V1

Aether Mini V1 is a fast, efficient 0.5B-parameter language model optimized for general tasks, code generation, and instruction following. It is based on the Qwen2.5 architecture and fine-tuned for enterprise use cases.

Features

  • Fast Inference: Sub-second response times on consumer hardware
  • Code Generation: Optimized for Python, JavaScript, TypeScript, and more
  • Instruction Following: Fine-tuned for precise instruction adherence
  • Low Resource: Runs on CPU with minimal memory requirements
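
To put the memory claim in rough numbers, the weights-only footprint can be estimated from the parameter count and the bits stored per parameter. This is a back-of-the-envelope sketch, not a measurement: it assumes exactly 0.5B parameters, ignores the KV cache, activations, and runtime overhead, and the function name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weights-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Assumption: 0.5e9 parameters, as stated for Aether Mini V1.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(0.5e9, bits):.2f} GB")
```

Under these assumptions the weights take about 1 GB at 16-bit precision and about 0.25 GB at 4-bit quantization. Actual usage will be higher once context and runtime buffers are included.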

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RoeAcquisitions/aether-mini-v1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"}
]

# Render the chat template to a prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=512)
# Drop the prompt tokens so that only the newly generated reply is decoded.
new_tokens = generated_ids[:, inputs["input_ids"].shape[1]:]
response = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(response)

API

Access via Aether Tech AI API:

curl https://aether-models.nebulahq.work/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aether-mini-v1",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
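
The same request can be made from Python. The sketch below mirrors the curl call using only the standard library. It assumes the API returns JSON in the OpenAI chat-completions shape, which the `/v1/chat/completions` path suggests but the card does not confirm. The helper names `build_request` and `send` are illustrative.

```python
import json
import urllib.request

API_URL = "https://aether-models.nebulahq.work/v1/chat/completions"


def build_request(messages, api_key, model="aether-mini-v1"):
    """Build (but do not send) the authenticated POST request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `send(build_request([{"role": "user", "content": "Hello"}], api_key))`. Reading `api_key` from an environment variable rather than hard-coding it keeps the key out of source control.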

Pricing

  • Input: $0.10 per 1M tokens
  • Output: $0.30 per 1M tokens
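
As a worked example of these rates, the cost of a workload is the input tokens times the input rate plus the output tokens times the output rate, each divided by one million. The sketch below assumes billing is strictly linear in token count with no minimums, rounding, or tiers, which the card does not specify. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Rates from the pricing list, in USD per 1M tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.10")
OUTPUT_PRICE_PER_MILLION = Decimal("0.30")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated cost in USD, assuming purely linear billing."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost
```

For instance, 2,000,000 input tokens and 500,000 output tokens come to $0.20 + $0.15 = $0.35. `Decimal` is used instead of floats so that small currency amounts are not affected by binary rounding.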

License

Apache 2.0

Contact

Downloads last month: 113
Format: GGUF
Model size: 0.5B params
Architecture: qwen2