---
license: cc-by-4.0
datasets:
- ishanb3d/synthetic_qa
language:
- en
tags:
- question-answering
- llama
- tiny-model
- experimental
pipeline_tag: text-generation
---

# Tiny QA Model (2M)

A **2M-parameter** question-answering model built to probe the lower limits of how small a usable generative QA model can be. Despite its extreme size constraints, it produces somewhat coherent responses to questions.

## Model Details

- **Parameters:** ~2M (1.5M non-embedding)
- **Architecture:** Llama (loadable with any standard Llama-compatible loader)
- **Language:** English
- **Training data:** [ishanb3d/synthetic_qa](https://huggingface.co/datasets/ishanb3d/synthetic_qa)

## Prompt Format

Prompts should follow this exact format, where `\n` stands for a newline character:

```
Question: What is the purpose of unit testing in software projects?\nAnswer:
```

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ishanb3d/atto-language-model"

# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt must follow the "Question: ...\nAnswer:" format.
prompt = "Question: What is the purpose of unit testing in software projects?\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Intended Use

This model is intended **exclusively for research and development**, for example studying small-model behavior, capability limits, and synthetic-data training dynamics.

## Limitations

At only 2M parameters, output quality is limited. Responses may be incoherent, factually wrong, or otherwise unreliable, and the model should **not** be used in production or in any setting that requires accuracy or safety.

## License

Released under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/).
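
## Prompt Helper (Sketch)

Because the prompt format is strict, it may help to build prompts with a small helper rather than by hand. The following is a minimal sketch, not part of the released model: `build_prompt` and `extract_answer` are hypothetical names introduced here for illustration, and `extract_answer` assumes the decoded output begins with the exact prompt text, which may not hold for every tokenizer.

```python
def build_prompt(question: str) -> str:
    # Wrap a question in the "Question: ...", newline, "Answer:" format.
    return f"Question: {question.strip()}\nAnswer:"


def extract_answer(generated: str, prompt: str) -> str:
    # Assumption: the decoded output is the prompt followed by the continuation.
    # If the prompt is not found at the start, return the whole text unchanged
    # (apart from surrounding whitespace) instead of guessing.
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()
```

In practice, the string returned by `build_prompt` would be passed to the tokenizer, and the decoded output of `model.generate` would be passed to `extract_answer` together with the same prompt.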