Tiny QA Model (2M)

A 2M-parameter question-answering model built to probe how small a generative QA model can be while remaining usable. Given its extreme size constraints, its answers to questions are only somewhat coherent.

Model Details

Parameters: ~2M (1.5M non-embedding)
Architecture: Llama (loadable with any standard Llama-compatible loader)
Language: English
Training data: ishanb3d/synthetic_qa
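
The split between total and non-embedding parameters can be illustrated with a rough count for a Llama-style decoder. This is a minimal sketch, assuming the standard Llama layout (no bias terms, RMSNorm weights, gated MLP). The function name `llama_param_counts` and every hyperparameter value in the example call are hypothetical; they are not the configuration of this model, which is not stated here.

```python
def llama_param_counts(vocab_size, hidden_size, intermediate_size, num_layers,
                       num_heads, num_kv_heads, tie_embeddings=True):
    """Estimate parameter counts for a Llama-style decoder (assumes no biases)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    # Attention: q and o projections are hidden x hidden; k and v are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size

    # The final RMSNorm adds one more hidden-sized vector.
    non_embedding = num_layers * (attention + mlp + norms) + hidden_size

    # Input embedding table; an untied output head doubles it.
    embedding = vocab_size * hidden_size
    if not tie_embeddings:
        embedding *= 2

    return {
        "embedding": embedding,
        "non_embedding": non_embedding,
        "total": embedding + non_embedding,
    }


# Illustrative values only, NOT this model's real configuration.
counts = llama_param_counts(vocab_size=1000, hidden_size=64, intermediate_size=256,
                            num_layers=2, num_heads=4, num_kv_heads=4)
print(counts)
```

At this scale the embedding table is a large share of the total, which is why the two figures are reported separately.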

Prompt Format

Prompts should follow this exact format, where \n denotes a single newline character:

<bos>Question: What is the purpose of unit testing in software projects?\nAnswer:
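
Because the format is strict, it can help to build prompts in one place. The helper below is a minimal sketch; the name `build_prompt` is hypothetical, and collapsing whitespace inside the question is an assumption made here so that the question stays on a single line.

```python
def build_prompt(question: str, bos_token: str = "<bos>") -> str:
    """Format a question using the prompt template shown above."""
    # Assumption: collapse runs of whitespace (including newlines) so the
    # only newline in the prompt is the one before "Answer:".
    question = " ".join(question.split())
    return f"{bos_token}Question: {question}\nAnswer:"


prompt = build_prompt("What is the purpose of unit testing in software projects?")
print(repr(prompt))
```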

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ishanb3d/atto-language-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

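# The prompt already starts with <bos>, so skip the tokenizer's automatic
# special tokens to avoid a possible duplicate BOS token.
prompt = "<bos>Question: What is the purpose of unit testing in software projects?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)

# Generate up to 64 new tokens and print the decoded text (prompt plus answer).
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))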
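
The decoded string contains the prompt as well as the generated text. The sketch below shows one way to pull out just the answer. It assumes, without confirmation from the model card, that the model may keep generating further "Question:" lines after the first answer. The name `extract_answer` is hypothetical.

```python
def extract_answer(decoded: str) -> str:
    """Return the text after the first "Answer:" marker.

    Anything from a following "Question:" marker onward is dropped, on the
    assumption that it is an unwanted continuation.
    """
    _, marker, tail = decoded.partition("Answer:")
    if not marker:
        # No marker found: return the text unchanged apart from trimming.
        return decoded.strip()
    answer, _, _ = tail.partition("Question:")
    return answer.strip()


sample = "Question: What is X?\nAnswer: It is Y.\nQuestion: Another?"
print(extract_answer(sample))
```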

Intended Use

This model is intended exclusively for research and development, for example studying small-model behavior, capability limits, and synthetic-data training dynamics.

Limitations

At only 2M parameters, output quality is limited. Responses may be incoherent, factually wrong, or otherwise unreliable, and the model should not be used in production or any setting requiring accuracy or safety.

License

Released under CC BY 4.0.

Safetensors

Model size: 1.57M params
Tensor type: F32
