---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- code-generation
- lightweight
- 1.54B
base_model:
- Qwen/Qwen2.5-Coder-1.5B-Instruct
---


# HOS-OSS-1.54B

HOS-OSS-1.54B is a lightweight 1.54B-parameter causal language model optimized for text and code generation. It is designed for fast inference, low resource usage, and local deployment.

---

## 🚀 Overview

- **Model size:** ~1.54B parameters
- **Architecture:** LLaMA-style decoder-only transformer
- **Base model:** Qwen2.5-Coder-1.5B-Instruct (distilled / adapted)
- **Framework:** 🤗 Transformers
- **Use cases:**
  - Code generation
  - Instruction following
  - Chat-style completion
  - Lightweight local AI assistant

---

## ⚡ Features

- Fast inference on low-end GPUs
- Runs on Kaggle / Colab without large VRAM requirements
- Suitable for edge deployment
- Clean instruction-response formatting

---

## 🧠 Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "hydffgg/HOS-OSS-1.54B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "User: Write a Python Hello World\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,   # sampling must be enabled for temperature to apply
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
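For multi-turn use, the `User:`/`Assistant:` convention shown in the prompt above can be wrapped in a small helper. This is a minimal sketch, not part of the model's API; the `build_prompt` name and `history` format are assumptions for illustration.

```python
def build_prompt(user_message: str, history=None) -> str:
    """Format a chat-style prompt in the User:/Assistant: convention.

    history: optional list of (user_turn, assistant_turn) pairs from
    earlier exchanges, prepended before the new message.
    """
    lines = []
    for user_turn, assistant_turn in (history or []):
        lines.append(f"User: {user_turn}")
        lines.append(f"Assistant: {assistant_turn}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")  # model continues from here
    return "\n".join(lines)


prompt = build_prompt(
    "Now add a comment to it",
    history=[("Write a Python Hello World", 'print("Hello, World!")')],
)
print(prompt)
```

The resulting string can be passed to `tokenizer(...)` exactly as in the example above.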