πŸ”₯ Aras-Ember v2

Aras-Ember v2 is a lightweight conversational AI model developed by Sparrow AI Solutions.

It is built by fine-tuning Gemma-2-2B, a language model created by Google, using the Ember dataset.

Aras-Ember v2 is designed for research, experimentation, and lightweight conversational AI applications.

This project is independent and not affiliated with Google.


⚠️ Important Notice

This model is a derivative work of the Gemma model family released by Google.

Use of this model is subject to:

  • Gemma Terms of Use: https://ai.google.dev/gemma/terms
  • Gemma Prohibited Use Policy: https://ai.google.dev/gemma/prohibited_use_policy

By downloading, using, or distributing this model you agree to comply with those terms.


[Open In Colab](https://colab.research.google.com/drive/1wjdkj7niIiKeBpVVjnyBrSYu88lCtZxn?usp=sharing)

🧠 Model Details

  • Model name: Aras-Ember v2
  • Developer: Sparrow AI Solutions
  • Base model: google/gemma-2-2b
  • Architecture: Gemma decoder-only transformer
  • Parameter count: ~2 billion

Training Approach

  • LoRA instruction tuning
  • Conversational fine-tuning
  • LoRA weights merged into the base model
  • Exported as a standalone full model

Frameworks Used

  • Transformers
  • PEFT
  • PyTorch
  • Hugging Face Datasets

πŸ“š Dataset

Training dataset:

https://huggingface.co/datasets/sparrowaisolutions/ember-dataset

Dataset Structure

{
  "instruction": "...",
  "response": "..."
}

The dataset consists of instruction–response conversational pairs designed for training instruction‑following language models.
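For supervised fine-tuning, each record can be flattened into a single training string. The helper below is a minimal sketch of that mapping, using the prompt template this card recommends; the function name `format_example` is illustrative, not part of the released training code.

```python
def format_example(record: dict) -> str:
    """Map one {"instruction", "response"} record to a single
    training string in the model's recommended prompt format."""
    return (
        "You are Aras-Ember, a creative AI assistant.\n\n"
        f"Prompt: {record['instruction']}\n"
        f"Response: {record['response']}"
    )

example = {
    "instruction": "Explain black holes simply.",
    "response": "A black hole is a region of space where gravity is so strong that nothing escapes.",
}
print(format_example(example))
```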

Dataset Intended Uses

  • Conversational AI
  • Creative generation
  • Instruction following

βš™οΈ Training Details

Training Configuration

  • Base model: Gemma‑2‑2B
  • Training method: LoRA fine‑tuning
  • Dataset size: ~30,000 examples
  • Epochs: 2

Optimization

  • Mixed precision (FP16)
  • Gradient accumulation
  • LoRA adapters merged after training

Training objective:

Instruction‑following conversational generation.
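A LoRA setup along these lines can be expressed with the PEFT library. This is a configuration sketch only: the rank, alpha, dropout, and target modules shown are common defaults for Gemma-style models, not the actual values used to train Aras-Ember v2, which were not published.

```python
from peft import LoraConfig

# Illustrative hyperparameters -- NOT the actual Aras-Ember v2 training values.
lora_config = LoraConfig(
    r=16,                # adapter rank
    lora_alpha=32,       # scaling factor
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Attention projections typically targeted in Gemma-style models
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```

After training, adapters set up this way are typically folded back into the base weights (e.g. with PEFT's `merge_and_unload()`) and saved as a standalone model, matching the "LoRA weights merged into the base model" step described above.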


πŸš€ Usage

Install Dependencies

pip install transformers torch accelerate

(Accelerate is required for device_map="auto" in the example below.)

Example Inference

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "sparrowaisolutions/aras-ember-v2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = """
You are Aras-Ember, a creative AI assistant.

Write a short poem about the sea and the moon.
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=1.1,
    top_p=0.95,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

πŸ’¬ Prompt Format

Recommended prompt format:

You are Aras-Ember, a creative AI assistant.

Prompt: <instruction>
Response:

Example:

Prompt: Explain black holes simply.
Response:
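A small helper can assemble this template programmatically. The sketch below is illustrative: only the template itself comes from this card, and `build_prompt` is a hypothetical name.

```python
SYSTEM_LINE = "You are Aras-Ember, a creative AI assistant."

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the recommended prompt template,
    leaving "Response:" open for the model to complete."""
    return f"{SYSTEM_LINE}\n\nPrompt: {instruction}\nResponse:"

print(build_prompt("Explain black holes simply."))
```

The returned string can be passed directly to the tokenizer in the inference example above.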

πŸ§ͺ Intended Use

Aras‑Ember v2 is intended for:

  • Conversational AI research
  • Educational projects
  • Experimentation with LLM fine‑tuning
  • Creative writing generation
  • Lightweight chatbots
  • AI development testing

❌ Out-of-Scope Uses

This model should not be used for:

  • Medical advice
  • Legal advice
  • Safety‑critical systems
  • Automated decision‑making
  • Misinformation generation
  • Illegal or harmful activities

Users are responsible for how the model is used.


⚠️ Limitations

Because this is a relatively small language model:

  • May generate incorrect or fabricated information
  • Limited reasoning ability
  • Limited long‑context understanding
  • Performance depends on prompt quality
  • Not suitable for high‑stakes applications

πŸ— Architecture

Base architecture:

Gemma‑2‑2B by Google

Reference model:

https://huggingface.co/google/gemma-2-2b

This project modifies the base model through instruction fine‑tuning only.

No architectural changes were made.


πŸ‘¨β€πŸ’» Authors

Developed by:

Sparrow AI Solutions

Hugging Face profile:

https://huggingface.co/sparrowaisolutions


❀️ Acknowledgements

Special thanks to:

  • Google Gemma team
  • Hugging Face
  • Open‑source AI community

πŸ“œ License

This project is released under the Apache License 2.0.

However, because the model is derived from Gemma, use of the model is also subject to:

  • Gemma Terms of Use: https://ai.google.dev/gemma/terms
  • Gemma Prohibited Use Policy: https://ai.google.dev/gemma/prohibited_use_policy

Users must comply with both licenses when using or distributing this model.


βš–οΈ Disclaimer

The model is provided "AS IS", without warranty of any kind.

The developers are not responsible for any damages, misuse, or consequences resulting from the use of this model.

Users assume full responsibility for ensuring compliance with applicable laws and policies.


πŸ“„ Research Paper

EMBER Dataset and ARAS-EMBER Models: Open Lightweight AI Systems for Creative and Conversational Language Generation

DOI: https://doi.org/10.6084/m9.figshare.31617994

πŸ“… Version

Current version: Aras‑Ember v2
Release date: 2026
