---
base_model:
- Qwen/Qwen3-8B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: other
license_name: anvdl-1.0
license_link: https://huggingface.co/apexion-ai/Nous-V1-8B/blob/main/LICENSE.md
language:
- en
- fr
- pt
- de
- ro
- sv
- da
- bg
- ru
- cs
- el
- uk
- es
- nl
- sk
- hr
- pl
- lt
- nb
- nn
- fa
- sl
- gu
- lv
- it
- oc
- ne
- mr
- be
- sr
- lb
- vec
- as
- cy
- szl
- ast
- hne
- awa
- mai
- bho
- sd
- ga
- fo
- hi
- pa
- bn
- or
- tg
- yi
- lmo
- lij
- scn
- fur
- sc
- gl
- ca
- is
- sq
- li
- prs
- af
- mk
- si
- ur
- mag
- bs
- hy
- zh
- yue
- my
- ar
- he
- mt
- id
- ms
- tl
- ceb
- jv
- su
- min
- ban
- pag
- ilo
- war
- ta
- te
- kn
- ml
- tr
- az
- uz
- kk
- ba
- tt
- th
- lo
- fi
- et
- hu
- vi
- km
- ja
- ko
- ka
- eu
- ht
- pap
- kea
- tpi
- sw
---

![banner](https://huggingface.co/NoemaResearch/Apollo-1-4B/resolve/main/img/banner.png)

# Apollo-1-8B

[![Model](https://img.shields.io/badge/Model-Apollo--1--8B-blue)](https://huggingface.co/NoemaResearch/Apollo-1-8B)
[![Base](https://img.shields.io/badge/Base-Qwen3--8B-green)](https://huggingface.co/Qwen/Qwen3-8B)
[![License](https://img.shields.io/badge/License-anvdl--1.0-yellow)](LICENSE)

Apollo-1-8B is an **8-billion-parameter instruction-tuned model** developed by **Noema Research**. It is based on [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) and optimized for **advanced reasoning, instruction following, and high-performance deployment**.

This model is the **large-scale member** of the Apollo series, balancing strong reasoning capabilities with efficiency for multi-domain applications.


---

## Model Overview

* **Base model:** `Qwen3-8B`
* **Architecture:** Decoder-only transformer
* **Parameters:** ~8B
* **Context length:** up to 32k tokens (inherits Qwen3 long-context support)
* **Domain:** General-purpose reasoning, instruction following, and code generation
* **Primary applications:**

  * Advanced conversational AI
  * Multi-step reasoning and problem solving
  * Knowledge assistants and tutoring systems
  * Software development and code generation
* **License:** anvdl-1.0

---

## Key Features

* **Instruction tuning** for reliable multi-step reasoning and task completion
* **Extended reasoning depth** compared to Apollo-1-4B for complex queries
* **Long-context handling**, inherited from the Qwen3 architecture
* **Multilingual coverage**, supporting diverse languages and domains
* **Balanced resource requirements**, deployable on high-end consumer hardware and cloud GPUs

---

## Usage

The model is available in Hugging Face Transformers format. Example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoemaResearch/Apollo-1-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are Apollo, a reasoning assistant."},
    {"role": "user", "content": "Explain the differences between supervised, unsupervised, and reinforcement learning with examples."},
]

# return_dict=True returns input_ids and attention_mask as a mapping,
# which is what the **inputs unpacking in generate() expects.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# do_sample=True ensures temperature and top_p are actually applied.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Recommended settings:**

* `temperature=0.4–0.8`
* `top_p=0.9–0.95`
* Lower temperatures yield more deterministic and concise answers

---

## Evaluation

Apollo-1-8B demonstrates stronger reasoning and instruction-following
capabilities relative to Apollo-1-4B, with internal evaluations indicating:

* Higher accuracy on complex multi-step reasoning tasks
* More robust **instruction adherence**
* Reduced **hallucinations** in factual and structured outputs
* High efficiency for large-context tasks

A full benchmark report will be provided in a future update. For upstream performance details, see the [Qwen3-8B model card](https://huggingface.co/Qwen/Qwen3-8B).

---

## Limitations

* **Reasoning scale**: While improved, Apollo-1-8B cannot match larger models (14B+) on extremely complex or open-ended tasks
* **Knowledge breadth**: Some highly specialized or niche knowledge may be limited
* **Hallucinations**: May generate plausible but incorrect information
* **Prompt sensitivity**: Outputs remain dependent on careful prompt formulation

---

## Responsible Use

* Do not rely on Apollo-1-8B for critical decisions without human oversight
* Verify outputs before applying them in factual, legal, or safety-critical contexts
* Avoid providing personal or sensitive data in prompts
* Do not use the model to generate unsafe, harmful, or disallowed content

---

## Model Variants

* **Full precision (safetensors)**: research and high-fidelity inference
* **bf16 / fp16**: efficient inference on modern accelerators
* **Quantized versions (int8 / int4)**: deployment in resource-constrained environments

---

## Citation

If you use this model, please cite Apollo-1-8B using the entry below, and cite the Qwen3 base model as described on the [Qwen3-8B model card](https://huggingface.co/Qwen/Qwen3-8B):

```bibtex
@misc{noema2025apollo8b,
  title={Apollo-1-8B},
  author={Noema Research},
  year={2025},
  howpublished={\url{https://huggingface.co/NoemaResearch/Apollo-1-8B}}
}
```

---

## Acknowledgements

Apollo-1-8B builds upon the [Qwen3](https://huggingface.co/Qwen) family of models. We thank the Qwen team for open-sourcing their models and enabling derivative research.
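## Estimating Memory Footprint

The variants listed under Model Variants differ mainly in how many bytes each weight occupies, so a back-of-the-envelope estimate can help when choosing one for a given GPU. The sketch below is illustrative only: the helper `weight_memory_gib` is hypothetical, the parameter count is rounded to a flat 8e9 (the card states ~8B), "full precision" is assumed to mean fp32, and the result covers weights only, ignoring the KV cache, activations, and quantization metadata.

```python
# Rough, weight-only memory estimate per precision.
# Assumptions (not from the model card): exactly 8e9 parameters,
# "full precision" == fp32, and no overhead for KV cache, activations,
# or quantization scales/zero-points.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB for `dtype`."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"Unknown dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>5}: ~{weight_memory_gib(8e9, dtype):.1f} GiB")
```

Under these assumptions, bf16 weights come to roughly 14.9 GiB and int4 weights to roughly 3.7 GiB. Actual usage will be higher once the context window fills, so treat these figures as lower bounds.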