---
base_model:
- Qwen/Qwen3-1.7B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: other
license_name: anvdl-1.0
license_link: https://huggingface.co/apexion-ai/Nous-V1-8B/blob/main/LICENSE.md
language:
- en
- fr
- pt
- de
- ro
- sv
- da
- bg
- ru
- cs
- el
- uk
- es
- nl
- sk
- hr
- pl
- lt
- nb
- nn
- fa
- sl
- gu
- lv
- it
- oc
- ne
- mr
- be
- sr
- lb
- vec
- as
- cy
- szl
- ast
- hne
- awa
- mai
- bho
- sd
- ga
- fo
- hi
- pa
- bn
- or
- tg
- yi
- lmo
- lij
- scn
- fur
- sc
- gl
- ca
- is
- sq
- li
- prs
- af
- mk
- si
- ur
- mag
- bs
- hy
- zh
- yue
- my
- ar
- he
- mt
- id
- ms
- tl
- ceb
- jv
- su
- min
- ban
- pag
- ilo
- war
- ta
- te
- kn
- ml
- tr
- az
- uz
- kk
- ba
- tt
- th
- lo
- fi
- et
- hu
- vi
- km
- ja
- ko
- ka
- eu
- ht
- pap
- kea
- tpi
- sw
---

![banner](https://huggingface.co/NoemaResearch/Apollo-1-4B/resolve/main/img/banner.png)

# Apollo-1-2B

[![Model](https://img.shields.io/badge/Model-Apollo--1--2B-blue)](https://huggingface.co/NoemaResearch/Apollo-1-2B)
[![Base](https://img.shields.io/badge/Base-Qwen3--1.7B-green)](https://huggingface.co/Qwen/Qwen3-1.7B)
[![License](https://img.shields.io/badge/License-anvdl--1.0-yellow)](LICENSE)

Apollo-1-2B is a **2-billion-parameter instruction-tuned model** developed by **Noema Research**.
It is based on [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) and optimized for **general reasoning, language understanding, and lightweight deployment**.

This model is the first release in the **Apollo series**, intended as a foundation for scalable experimentation and real-world applications in constrained environments.


---

## Model Overview

- **Base model:** `Qwen3-1.7B`
- **Architecture:** Decoder-only transformer
- **Parameters:** ~2B
- **Context length:** up to 32k tokens (inherits Qwen3 long-context support)
- **Domain:** General-purpose reasoning and instruction following
- **Primary applications:**
  - Conversational AI
  - Lightweight reasoning tasks
  - Education and tutoring
  - Prototype agents and assistants
- **License:** anvdl-1.0

---

## Key Features

- **Instruction-tuned**: More reliable responses in conversational and task-oriented settings
- **Lightweight deployment**: Optimized for environments with limited compute or memory resources
- **Extended context**: Inherits long-context capability from the Qwen3 base
- **Balanced outputs**: Improved refusal behaviors and reduced hallucinations compared to the base model
- **Multilingual ability**: Retains multilingual knowledge from the Qwen3 family

---

## Usage

The model is available in Hugging Face Transformers format. Example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoemaResearch/Apollo-1-2B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are Apollo, a reasoning assistant."},
    {"role": "user", "content": "Explain the difference between supervised and unsupervised learning."},
]

# return_dict=True returns input_ids and attention_mask, so **inputs can be unpacked below
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Enable sampling explicitly so that temperature and top_p take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Recommended settings:**

* `temperature=0.5–0.9`
* `top_p=0.85–0.95`
* For structured outputs (e.g. JSON), use lower temperatures for stability

---

## Evaluation

Apollo-1-2B has been evaluated internally on a range of reasoning and language tasks.

Key findings:

* Improved **instruction following** relative to Qwen3-1.7B
* More **concise and accurate responses** in structured tasks
* Maintains **multilingual performance** from the base model
* Effective for **lightweight assistant applications**

Future work will include publishing comprehensive benchmark comparisons against other models in the 1–3B parameter range.

---

## Limitations

* **Reasoning depth**: As a 2B parameter model, Apollo cannot match larger-scale LLMs on complex reasoning tasks
* **Knowledge coverage**: May lack depth in specialized or low-resource domains
* **Hallucinations**: Although reduced, the model may still generate incorrect or fabricated information
* **Sensitivity to prompts**: Outputs vary with prompt phrasing; careful prompt design recommended

---

## Responsible Use

* Do not rely on Apollo for critical decision-making without human oversight
* Generated outputs may contain inaccuracies; verification is required for factual or sensitive use cases
* Avoid providing personal, private, or sensitive information in prompts
* This model should not be used to generate disallowed, unsafe, or harmful content

---

## Model Variants

* **Full precision (safetensors)**: research and full-fidelity inference
* **bf16 / fp16**: optimized for inference on GPUs/TPUs
* **Quantized versions (int8 / int4)**: for deployment in constrained hardware environments

---

## Citation

If you use this model, please cite both Apollo-1-2B and the Qwen3 base model:

```bibtex
@misc{noema2025apollo,
  title={Apollo-1-2B},
  author={Noema Research},
  year={2025},
  howpublished={\url{https://huggingface.co/NoemaResearch/Apollo-1-2B}}
}
```

---

## Acknowledgements

Apollo-1-2B builds upon the [Qwen3](https://huggingface.co/Qwen) series of models.
We thank the Qwen team for making their work openly available under permissive terms, which enabled this derivative research.

---
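
As a supplement to the Model Variants section, the following is a rough back-of-the-envelope sketch of how much memory the weights alone might occupy in each variant. It rests on several assumptions that are not stated in this card: "full precision" is taken to mean fp32, the parameter count is the rounded "~2B" figure, and quantized formats are treated as exactly 1 or 0.5 bytes per parameter with no overhead for scales or zero points. The helper name `weight_memory_gb` is hypothetical. The estimate excludes the KV cache, activations, and framework overhead, so real usage will be higher.

```python
# Approximate bytes per parameter for each variant (weights only).
# Assumption: "full precision" means fp32; quantization metadata is ignored.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight memory in GB (1 GB = 1e9 bytes)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    num_params = 2e9  # assumption: the card's "~2B" figure, not an exact count
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>5}: ~{weight_memory_gb(num_params, dtype):.1f} GB")
```

Under these assumptions the weights come to roughly 4 GB in bf16/fp16 and about 1 GB in int4, which is one way to decide which variant fits a given device before downloading it.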