AxionLab-official committed on
Commit 5fe3a00 · verified · 1 Parent(s): a355081

Update README.md

Files changed (1):
  1. README.md +115 -78
README.md CHANGED
@@ -12,76 +12,93 @@ tags:
  - chatbot
  ---
 
- 🧠 MiniBot-0.9M-Base
 
- Ultra-lightweight GPT-2 style language model (~900K parameters) specialized in Portuguese conversational text.
 
- 📌 Model Overview
 
- MiniBot-0.9M-Base is a tiny decoder-only Transformer (~0.9M parameters) based on the GPT-2 architecture, designed for efficient text generation in Portuguese.
 
- This model is a base (pretrained) model, meaning it was trained for next-token prediction without instruction tuning or alignment.
 
- It is intended primarily for:
 
- 🧪 Fine-tuning experiments
- 🎮 Playground usage
- ⚡ Ultra-fast local inference
- 🧠 Research on small-scale language models
- 🎯 Key Characteristics
- 🇧🇷 Language: Portuguese (primary)
- 🧠 Architecture: GPT-2 style (decoder-only Transformer)
- 🔤 Embeddings: GPT-2 compatible embeddings
- 📉 Parameters: ~900,000
- ⚙️ Objective: Causal Language Modeling (next-token prediction)
- 🚫 Alignment: None (base model)
- 🏗️ Architecture Details
 
- MiniBot-0.9M follows a scaled-down GPT-2 design, including:
 
- Token + positional embeddings
- Multi-head self-attention
- Feed-forward (MLP) layers
- Autoregressive decoding
 
- Despite its small size, it preserves the core inductive biases of GPT-2, making it ideal for experimentation and educational purposes.
 
- 📚 Training
- Dataset
 
- The model was trained on a Portuguese conversational dataset, including:
 
- Pure text
 
- Training Notes
 
- Focused on language pattern learning, not reasoning
 
- No instruction tuning (no RLHF, no alignment)
 
- Lightweight training pipeline
- Optimized for small-scale experimentation
- 💡 Capabilities
 
- ✅ Strengths:
 
- Text generation in Portuguese
- Basic dialogue structure
- Continuation of simple prompts
- Learning of linguistic patterns
 
- ❌ Limitations:
 
- Very limited reasoning
- Loss of context in long conversations
- Inconsistent responses
- Possible repetition or incoherence
 
- 👉 This model behaves as a statistical language generator, not a reasoning system.
 
- 🚀 Usage
- Hugging Face Transformers
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
 
@@ -90,7 +107,7 @@ model_name = "AxionLab-official/MiniBot-0.9M-Base"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForCausalLM.from_pretrained(model_name)
 
- prompt = "The cat "
  inputs = tokenizer(prompt, return_tensors="pt")
 
  outputs = model.generate(
@@ -98,49 +115,69 @@ outputs = model.generate(
      **inputs,
      max_new_tokens=50,
      temperature=0.8,
      top_p=0.95,
-     do_sample=True
  )
 
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
 
- ⚙️ Recommended Generation Settings
 
- For better results:
 
- temperature: 0.7 – 1.0
- top_p: 0.9 – 0.95
- do_sample: True
- max_new_tokens: 30 – 80
- 🧪 Intended Use
 
- This is a foundation model, ideal for:
 
- 🧠 Fine-tuning (chat, instruction, roleplay, tools)
- 🎮 Prompt playground experimentation
- 🔬 Research in tiny LLMs
- 📉 Benchmarking small architectures
- ⚠️ Limitations
 
- Due to its extremely small size:
 
- Limited world knowledge
- Weak generalization
- No safety alignment
- Not suitable for production use
- 🔮 Future Work
 
- Planned directions:
 
- 🧠 Instruction-tuned version (MiniBot-Instruct)
- 📚 Larger dataset scaling
- 🔤 Tokenizer improvements
- 📈 Larger variants (1M–10M params)
- 🤖 Experimental reasoning fine-tuning
- 📜 License
 
- MIT
 
- 👤 Author
 
- Developed by AxionLab
 
 
 
  - chatbot
  ---
 
+ # 🧠 MiniBot-0.9M-Base
 
+ > **Ultra-lightweight GPT-2 style language model (~900K parameters) specialized in Portuguese conversational text.**
 
+ [![Model](https://img.shields.io/badge/🤗%20Hugging%20Face-MiniBot--0.9M--Base-yellow)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)
+ [![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+ [![Language](https://img.shields.io/badge/Language-Portuguese-blue)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)
+ [![Parameters](https://img.shields.io/badge/Parameters-~900K-orange)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)
 
+ ---
+
+ ## 📌 Overview
+
+ **MiniBot-0.9M-Base** is a tiny decoder-only Transformer (~0.9M parameters) based on the GPT-2 architecture, designed for efficient text generation in **Portuguese**.
+
+ This is a **base (pretrained) model**, trained purely for next-token prediction with no instruction tuning or alignment of any kind. It serves as the foundation for fine-tuned variants such as [MiniBot-0.9M-Instruct](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Instruct).
 
+ ---
 
+ ## 🎯 Key Characteristics
 
+ | Attribute | Detail |
+ |---|---|
+ | 🇧🇷 **Language** | Portuguese (primary) |
+ | 🧠 **Architecture** | GPT-2 style (decoder-only Transformer) |
+ | 🔤 **Embeddings** | GPT-2 compatible |
+ | 📉 **Parameters** | ~900K |
+ | ⚙️ **Objective** | Causal Language Modeling (next-token prediction) |
+ | 🚫 **Alignment** | None (base model) |
 
+ ---
 
+ ## 🏗️ Architecture
 
+ MiniBot-0.9M follows a scaled-down GPT-2 design:
 
+ - Token embeddings + positional embeddings
+ - Multi-head self-attention
+ - Feed-forward (MLP) layers
+ - Autoregressive decoding
 
+ Despite its small size, it preserves the core inductive biases of GPT-2, making it well-suited for experimentation and educational purposes.
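A rough parameter budget for such a scaled-down GPT-2 can be sketched from the components listed above. The hyperparameters below (vocabulary size, context length, hidden size, layer count) are illustrative assumptions chosen to land near ~0.9M; the model card does not publish the actual configuration:

```python
def gpt2_param_count(vocab_size: int, n_ctx: int, d_model: int, n_layer: int) -> int:
    """Approximate parameter count for a GPT-2 style decoder with tied embeddings."""
    embeddings = vocab_size * d_model + n_ctx * d_model  # token + positional tables
    # Per block: 2 LayerNorms (4d), QKV projection (3d^2 + 3d), attention output
    # projection (d^2 + d), MLP up (4d^2 + 4d) and down (4d^2 + d) projections.
    per_block = 12 * d_model**2 + 13 * d_model
    final_layernorm = 2 * d_model
    return embeddings + n_layer * per_block + final_layernorm

# Illustrative configuration only (not the published MiniBot config)
total = gpt2_param_count(vocab_size=10_000, n_ctx=1024, d_model=64, n_layer=4)
print(f"{total:,}")  # 905,600 -- roughly the advertised ~0.9M
```

Note how the embedding tables dominate at this scale, which is why tiny models tend to shrink the vocabulary or hidden size before cutting layers.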
 
+ ---
 
+ ## 📚 Training Dataset
 
+ The model was trained on a Portuguese conversational dataset focused on language pattern learning.
 
+ **Training notes:**
+ - Pure next-token prediction objective
+ - No instruction tuning (no SFT, no RLHF, no alignment)
+ - Lightweight training pipeline
+ - Optimized for small-scale experimentation
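The "pure next-token prediction objective" above means the training labels are simply the input tokens shifted one position: position t predicts token t+1. A minimal sketch of that loss computation (pure Python, with made-up toy probabilities standing in for real model outputs):

```python
import math

def causal_lm_loss(token_ids, probs):
    """Average negative log-likelihood of each token given its prefix.

    token_ids: the training sequence
    probs: probs[t][v] = model probability of vocab id v at position t
           (toy numbers here; a real model derives these from its logits)
    """
    # Labels are the inputs shifted left by one position.
    labels = token_ids[1:]
    nll = [-math.log(probs[t][label]) for t, label in enumerate(labels)]
    return sum(nll) / len(nll)

# Toy vocabulary of 3 ids; training sequence [0, 2, 1]
probs = [
    {0: 0.2, 1: 0.3, 2: 0.5},  # distribution after seeing token 0
    {0: 0.1, 1: 0.8, 2: 0.1},  # distribution after seeing tokens 0, 2
]
loss = causal_lm_loss([0, 2, 1], probs)  # ~0.458
```

Minimizing this quantity over a large corpus is the entire training signal for a base model; no human preference or instruction data is involved.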
 
+ ---
 
+ ## 💡 Capabilities
 
+ ### ✅ Strengths
 
+ - Portuguese text generation
+ - Basic dialogue structure
+ - Simple prompt continuation
+ - Linguistic pattern learning
 
+ ### ❌ Limitations
 
+ - Very limited reasoning ability
+ - Loses context in long conversations
+ - Inconsistent outputs
+ - Prone to repetition or incoherence
+
+ > ⚠️ This model behaves as a statistical language generator, not a reasoning system.
+
+ ---
+
+ ## 🚀 Getting Started
+
+ ### Installation
+
+ ```bash
+ pip install transformers torch
+ ```
 
+ ### Usage with Hugging Face Transformers
 
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
 
  model_name = "AxionLab-official/MiniBot-0.9M-Base"
 
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForCausalLM.from_pretrained(model_name)
 
+ prompt = "User: Me explique o que é gravidade\nBot:"
  inputs = tokenizer(prompt, return_tensors="pt")
 
  outputs = model.generate(
      **inputs,
      max_new_tokens=50,
      temperature=0.8,
      top_p=0.95,
+     do_sample=True,
  )
 
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
 
+ ### ⚙️ Recommended Settings
 
+ | Parameter | Recommended Value | Description |
+ |---|---|---|
+ | `temperature` | `0.7 – 1.0` | Controls randomness |
+ | `top_p` | `0.9 – 0.95` | Nucleus sampling |
+ | `do_sample` | `True` | Enable sampling |
+ | `max_new_tokens` | `30 – 80` | Response length |
 
+ > 💡 Base models generally benefit from higher temperature values than instruct variants, since there is no fine-tuning to constrain the output distribution.
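Nucleus sampling, which `top_p` controls, keeps only the smallest set of tokens whose cumulative probability reaches `p` and renormalizes before sampling. A minimal sketch of that filtering step (pure Python; the example distribution is made up for illustration):

```python
def top_p_filter(probs: dict, p: float) -> dict:
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches p, then renormalize. probs maps token -> probability."""
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}

# Made-up next-token distribution for illustration
dist = {"gato": 0.5, "cachorro": 0.3, "casa": 0.15, "xyz": 0.05}
filtered = top_p_filter(dist, p=0.9)
# The low-probability tail ("xyz") is dropped; the rest is renormalized.
```

Lowering `p` trims more of the unreliable tail, which is one practical lever against the repetition and incoherence noted in the limitations above.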
 
 
 
 
 
+ ---
+
+ ## 🧪 Intended Use Cases
+
+ | Use Case | Suitability |
+ |---|---|
+ | 🧠 Fine-tuning (chat, instruction, roleplay) | ✅ Ideal |
+ | 🎮 Prompt playground & experimentation | ✅ Ideal |
+ | 🔬 Research on tiny LLMs | ✅ Ideal |
+ | 📉 Benchmarking small architectures | ✅ Ideal |
+ | ⚡ Local / CPU-only applications | ✅ Ideal |
+ | 🏭 Critical production environments | ❌ Not recommended |
+
+ ---
+
+ ## ⚠️ Disclaimer
+
+ - Extremely small model (~900K parameters)
+ - Limited world knowledge and weak generalization
+ - No safety or alignment measures
+ - **Not suitable for production use**
 
+ ---
+
+ ## 🔮 Future Work
 
+ - [x] 🎯 Instruction-tuned version → [`MiniBot-0.9M-Instruct`](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Instruct)
+ - [ ] 📚 Larger and more diverse dataset
+ - [ ] 🔤 Tokenizer improvements
+ - [ ] 📈 Scaling to 1M–10M parameters
+ - [ ] 🧠 Experimental reasoning fine-tuning
 
+ ---
 
+ ## 📜 License
 
+ Distributed under the **MIT License**. See [`LICENSE`](LICENSE) for details.
 
+ ---
 
+ ## 👤 Author
+
+ Developed by **[AxionLab](https://huggingface.co/AxionLab-official)** 🔬
+
+ ---
 
+ <div align="center">
+ <sub>MiniBot-0.9M-Base · AxionLab · MIT License</sub>
+ </div>