---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- code-generation
- lightweight
- 1.54B
base_model:
- Qwen/Qwen2.5-Coder-1.5B-Instruct
---

<p align="center">
  <img alt="HOS-OSS-1.54B" src="https://huggingface.co/hydffgg/HOS-OSS-270M/resolve/main/HOS-OSS-270M.png">
</p>


# HOS-OSS-1.54B

HOS-OSS-1.54B is a lightweight 1.54B parameter causal language model optimized for text and code generation tasks.  
It is designed for fast inference, low resource usage, and local deployment.

---

## 🚀 Overview

- **Model size:** ~1.54B parameters  
- **Architecture:** LLaMA-style decoder-only transformer  
- **Base model:** Qwen2.5-Coder-1.5B-Instruct (distilled / adapted)  
- **Framework:** 🤗 Transformers  
- **Use cases:**  
  - Code generation  
  - Instruction following  
  - Chat-style completion  
  - Lightweight local AI assistant  

---

## ⚡ Features

- Fast inference on low-end GPUs
- Runs on Kaggle / Colab without large VRAM
- Suitable for edge deployment
- Clean instruction-response formatting
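
The `User:`/`Assistant:` plain-text template used in the usage example below can be wrapped in a small helper. Note this template is inferred from the example prompt, not from a documented chat template for this model:

```python
def build_prompt(user_message: str) -> str:
    # Assumed plain-text instruction format matching the example below;
    # the card does not document an official chat template.
    return f"User: {user_message}\nAssistant:"

print(build_prompt("Write a Python Hello World"))
# User: Write a Python Hello World
# Assistant:
```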

---

## 🧠 Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "hydffgg/HOS-OSS-1.54B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "User: Write a Python Hello World\nAssistant:"

inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,   # required for temperature to take effect
        temperature=0.7
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```