---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- code-generation
- lightweight
- 1.54B
base_model:
- Qwen/Qwen2.5-Coder-1.5B-Instruct
---

<p align="center">
  <img alt="HOS-OSS-1.54B" src="https://huggingface.co/hydffgg/HOS-OSS-270M/resolve/main/HOS-OSS-270M.png">
</p>
# HOS-OSS-1.54B

HOS-OSS-1.54B is a lightweight 1.54B-parameter causal language model optimized for text and code generation tasks. It is designed for fast inference, low resource usage, and local deployment.

---
## 🚀 Overview

- **Model size:** ~1.54B parameters
- **Architecture:** LLaMA-style decoder-only transformer
- **Base model:** Qwen2.5-Coder-1.5B-Instruct (distilled / adapted)
- **Framework:** 🤗 Transformers
- **Use cases:**
  - Code generation
  - Instruction following
  - Chat-style completion
  - Lightweight local AI assistant

---
## ⚡ Features

- Fast inference on low-end GPUs
- Runs on Kaggle / Colab without large VRAM
- Suitable for edge deployment
- Clean instruction-response formatting

---
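The instruction-response format used by this model can be sketched as a small helper. This is a minimal illustration only: `build_prompt` is a hypothetical helper name, and the `User:`/`Assistant:` layout is taken from the usage example in this card.

```python
# Hypothetical helper illustrating this card's instruction-response
# prompt layout: "User: <message>\nAssistant:"
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the User/Assistant chat format."""
    return f"User: {user_message}\nAssistant:"

print(build_prompt("Write a Python Hello World"))
```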
## 🧠 Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "hydffgg/HOS-OSS-1.54B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "User: Write a Python Hello World\nAssistant:"

inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,  # required for temperature to take effect
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
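Note that `generate()` returns the prompt tokens followed by the newly generated tokens, so the decoded string echoes the prompt. A minimal sketch for keeping only the assistant's reply (`extract_reply` is a hypothetical helper, not part of the model's API; slicing the output tensor by `inputs["input_ids"].shape[1]` before decoding achieves the same result):

```python
# Hypothetical post-processing helper: strip the echoed prompt from
# decoded generate() output, leaving only the assistant's reply.
def extract_reply(full_text: str, prompt: str) -> str:
    if full_text.startswith(prompt):
        return full_text[len(prompt):].strip()
    return full_text.strip()

decoded = "User: Write a Python Hello World\nAssistant: print('Hello, World!')"
print(extract_reply(decoded, "User: Write a Python Hello World\nAssistant:"))
# -> print('Hello, World!')
```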