Zyora-Byte-32B

An AI Teaching Assistant for Anna University Engineering Students, fine-tuned from Qwen2.5-Coder-32B-Instruct using QLoRA.

Model Details

Model Description

Zyora-Byte-32B (ByteBuddy) is a 32B parameter language model fine-tuned to help engineering students at Anna University with syllabus information, course details, problem-solving, and coding assistance. It understands both formal queries and casual student language.

  • Developed by: Zyora Labs
  • Model type: Causal Language Model (LoRA Adapter)
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from: Qwen/Qwen2.5-Coder-32B-Instruct

Uses

Direct Use

  • Answer syllabus questions for Anna University R2021 curriculum
  • Provide course information (credits, semesters, prerequisites)
  • Solve math problems (calculus, Laplace transforms, differential equations)
  • Generate code with explanations (C, Python)
  • Academic guidance for engineering students

Supported Branches

  • Computer Science and Engineering (CSE)
  • Electronics and Communication Engineering (ECE)
  • Electrical and Electronics Engineering (EEE)
  • Mechanical Engineering
  • Civil Engineering
  • Information Technology (IT)
  • Computer Science - Data Science
  • Computer Science - Cyber Security
  • Computer Science - Business Systems (CSBS)

Out-of-Scope Use

  • Non-Anna University curriculum queries
  • Medical, legal, or financial advice
  • Content generation for harmful purposes

Bias, Risks, and Limitations

  • Focused specifically on Anna University R2021 curriculum
  • Best performance for Semesters 1-4 (based on training data)
  • May hallucinate for courses not in the training dataset
  • Requires the base model (Qwen2.5-Coder-32B-Instruct) for inference, since only the adapters are distributed (see the merge sketch below)
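
If a standalone checkpoint is preferred, the adapters can be merged into a full-precision copy of the base model with peft's merge_and_unload. A minimal sketch, assuming enough memory to hold the bf16 32B weights; the output path is a placeholder:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Merging into a quantized base is not supported, so load the base in bf16.
# device_map="cpu" keeps the ~64 GB of weights off the GPU during the merge.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)

# Attach the adapters, then fold them into the base weights.
model = PeftModel.from_pretrained(base, "zyoralabs/zyora-Byte-32B")
merged = model.merge_and_unload()

# Save a self-contained checkpoint ("zyora-byte-32b-merged" is a hypothetical path).
merged.save_pretrained("zyora-byte-32b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct").save_pretrained("zyora-byte-32b-merged")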

Recommendations

Users should verify critical academic information with official university sources. This model is meant to assist, not replace, official curriculum documents.

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "zyoralabs/zyora-Byte-32B")

# Generate response
prompt = """<|im_start|>system
You are ByteBuddy, an AI teaching assistant helping engineering students understand their coursework.<|im_end|>
<|im_start|>user
What are the subjects in CSE Semester 1?<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
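
The prompt above writes the ChatML markers by hand. The tokenizer's built-in chat template produces the same string and is less error-prone; a minimal sketch, reusing the tokenizer loaded above:

messages = [
    {"role": "system", "content": "You are ByteBuddy, an AI teaching assistant helping engineering students understand their coursework."},
    {"role": "user", "content": "What are the subjects in CSE Semester 1?"},
]

# Renders the Qwen ChatML template and appends the assistant header,
# matching the hand-written prompt above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)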

Training Details

Training Data

  • Source: Custom dataset extracted from Anna University R2021 syllabus PDFs
  • Samples: 16,109 training examples
  • Content: Course information, credits, semesters, study tips, project ideas, placement prep, problem-solving examples

Training Procedure

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Batch size: 2 per device
  • Gradient accumulation: 4 steps
  • Effective batch size: 8
  • Learning rate: 2e-4
  • Scheduler: Cosine with 3% warmup
  • Optimizer: paged_adamw_8bit
  • Epochs: 3
  • Final loss: 0.00844
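
The training script itself is not published; as a rough guide, the hyperparameters above map onto transformers' TrainingArguments as sketched below (output_dir is a placeholder, and this is not the exact configuration used):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="zyora-byte-32b-qlora",  # hypothetical output path
    per_device_train_batch_size=2,       # batch size 2 per device
    gradient_accumulation_steps=4,       # effective batch size 2 x 4 = 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,                   # 3% warmup
    optim="paged_adamw_8bit",
    num_train_epochs=3,
    bf16=True,                           # bf16 mixed precision
)
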
LoRA Configuration

from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
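
A config like this is attached to the quantized base with peft's standard QLoRA helpers. The sketch below is illustrative wiring rather than the published training code, and assumes base_model is the 4-bit model from the quickstart:

from peft import get_peft_model, prepare_model_for_kbit_training

# Cast norms to fp32 and enable input gradients for stable 4-bit training.
base_model = prepare_model_for_kbit_training(base_model)

# Wrap the base model with the adapters defined by lora_config above.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction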

Speeds, Sizes, Times

  • Training time: ~9 hours total
  • Adapter size: 1.0 GB
  • Training cost: ~$35 (Modal)

Evaluation

Example Queries & Results

  • "List all courses in Semester 1 for CSE" → Full course list with codes and credits
  • "How many credits is MA25C01?" → 4 credits
  • "bro what are the lab subjects in mech sem 2?" → Lab subjects list (understands casual language)
  • "Find the derivative of x³ + 2x² - 5x + 7" → 3x² + 4x - 5
  • "Find the Laplace transform of e^(-2t)" → 1/(s+2)
  • "Write a C program for factorial" → Complete working code with explanation

Environmental Impact

  • Hardware Type: NVIDIA A100-80GB
  • Hours used: ~9 hours
  • Cloud Provider: Modal
  • Compute Region: US

Technical Specifications

Model Architecture and Objective

  • Architecture: Transformer (Qwen2.5 architecture)
  • Parameters: 32B in the base model, plus LoRA adapters (~1.0 GB on disk)
  • Objective: Causal Language Modeling

Compute Infrastructure

Hardware

  • 1x NVIDIA A100-80GB GPU
  • 64GB system RAM
  • 8 CPU cores

Software

  • PyTorch 2.5.0
  • Transformers 4.46.3
  • PEFT 0.18.1
  • bitsandbytes 0.44.0

Hardware Requirements for Inference

  • 4-bit quantized: ~18 GB VRAM
  • 8-bit quantized: ~32 GB VRAM
  • Full precision (bf16): ~64 GB VRAM
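
For the 8-bit row, only the quantization config from the quickstart changes; a minimal sketch (the rest of the loading code is identical):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit loading needs roughly 32 GB of VRAM for the 32B base model.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
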
Citation

@misc{zyora-byte-32b-2025,
  author = {Zyora Labs},
  title = {Zyora-Byte-32B: AI Teaching Assistant for Anna University},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/zyoralabs/zyora-Byte-32B}
}

Model Card Authors

Zyora Labs

Model Card Contact

Zyora Labs on Hugging Face
