---
license: cc-by-4.0
datasets:
- ishanb3d/synthetic_qa
language:
- en
tags:
- question-answering
- llama
- tiny-model
- experimental
pipeline_tag: text-generation
---

# Tiny QA Model (2M)

A 2M-parameter question-answering model built to probe how small a generative
QA model can be while remaining usable. Its responses to questions are somewhat
coherent, considering its extreme size constraints.

## Model Details

- Parameters: ~2M (1.5M non-embedding)
- Architecture: Llama (loadable with any standard Llama-compatible loader)
- Language: English
- Training data: [ishanb3d/synthetic_qa](https://huggingface.co/datasets/ishanb3d/synthetic_qa)
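
The split between total and non-embedding parameters can be sanity-checked with rough arithmetic for a Llama-style decoder. The sketch below is an estimate only, not this model's actual configuration: every hyperparameter value in the example call is hypothetical, chosen just to land near the sizes listed above, and the function name `llama_param_counts` is made up for illustration. It assumes bias-free linear layers, RMSNorm weights, a gated MLP, and tied input/output embeddings.

```python
def llama_param_counts(vocab_size, hidden_size, intermediate_size,
                       num_layers, num_heads, num_kv_heads,
                       tie_embeddings=True):
    """Estimate parameter counts for a Llama-style decoder (no biases)."""
    head_dim = hidden_size // num_heads
    kv_dim = head_dim * num_kv_heads
    # Attention: q_proj and o_proj are hidden x hidden;
    # k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # Gated MLP: gate_proj, up_proj, down_proj.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    # Add the final RMSNorm after the last layer.
    non_embedding = num_layers * (attention + mlp + norms) + hidden_size
    # Untied models carry a separate output projection of the same shape.
    embedding = vocab_size * hidden_size * (1 if tie_embeddings else 2)
    return {
        "embedding": embedding,
        "non_embedding": non_embedding,
        "total": embedding + non_embedding,
    }


# Hypothetical hyperparameters, NOT read from this model's config.json.
counts = llama_param_counts(
    vocab_size=4096, hidden_size=128, intermediate_size=344,
    num_layers=8, num_heads=4, num_kv_heads=4,
)
print(counts)
```

For exact figures on the real checkpoint, load the model and sum `p.numel()` over `model.parameters()` instead of relying on this estimate.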

## Prompt Format

Prompts should follow this exact format, where `\n` stands for a single newline character:

```
<bos>Question: What is the purpose of unit testing in software projects?\nAnswer:
```
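
Because the format has to match exactly, it can help to build prompts with a small helper rather than by hand. This is a minimal sketch: `build_prompt` and the `BOS` constant are hypothetical names, and collapsing whitespace inside the question is an assumption made here so the question stays on one line, not something the model card requires.

```python
BOS = "<bos>"  # literal marker copied from the prompt format above


def build_prompt(question: str) -> str:
    """Wrap a question in the documented prompt template."""
    # Assumption: collapse runs of whitespace (including newlines)
    # so the question occupies a single line.
    question = " ".join(question.split())
    return f"{BOS}Question: {question}\nAnswer:"


print(repr(build_prompt("What is the purpose of unit testing in software projects?")))
```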

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the tokenizer and weights from the Hugging Face Hub.
model_id = "ishanb3d/atto-language-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt must match the documented format; "\n" here is a real newline.
prompt = "<bos>Question: What is the purpose of unit testing in software projects?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 64 new tokens; the output holds the prompt plus the continuation.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
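
The decoded string contains the prompt as well as the generated continuation, so a little post-processing is usually needed to isolate the answer. The helper below is one possible sketch: `extract_answer` is a hypothetical name, and cutting the text at a second `Question:` marker is an assumption about how a model this small might run on, not documented behavior.

```python
def extract_answer(decoded: str) -> str:
    """Return the text after the first 'Answer:' marker, trimmed."""
    _, found, tail = decoded.partition("Answer:")
    if not found:
        # No marker present: fall back to the whole string.
        return decoded.strip()
    # Assumption: if the model starts a new "Question:", drop that part.
    answer, _, _ = tail.partition("Question:")
    return answer.strip()


# Made-up decoded text, used only to exercise the helper.
sample = "Question: What is 2+2?\nAnswer: Four.\nQuestion: What is"
print(extract_answer(sample))
```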

## Intended Use

This model is intended exclusively for research and development, for example
studying small-model behavior, capability limits, and synthetic-data training dynamics.

## Limitations

At only 2M parameters, output quality is limited. Responses may be incoherent,
factually wrong, or otherwise unreliable, and the model should not be used in
production or any setting requiring accuracy or safety.

## License

Released under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/).