Small Models Collection
A list of all small models (≤1B parameters) that I have published (9 items).
How to use Fu01978/Nano-H with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Fu01978/Nano-H")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Fu01978/Nano-H", dtype="auto")

How to use Fu01978/Nano-H with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Fu01978/Nano-H"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Fu01978/Nano-H",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
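The same request can be sent from Python using only the standard library (a sketch, assuming the vLLM server above is running on localhost:8000; the payload mirrors the curl call):

```python
import json
from urllib import request

# Same payload as the curl call above.
payload = {
    "model": "Fu01978/Nano-H",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}
req = request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server is up:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```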
How to use Fu01978/Nano-H with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Fu01978/Nano-H" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Fu01978/Nano-H",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'

# Or run the SGLang server with Docker instead:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Fu01978/Nano-H" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Fu01978/Nano-H",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'

How to use Fu01978/Nano-H with Docker Model Runner:
docker model run hf.co/Fu01978/Nano-H
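A prompt can also be passed directly on the command line for a one-shot completion (a sketch; exact behaviour depends on your Docker Model Runner version, and the prompt argument shown here is illustrative):

```shell
# One-shot prompt; omit the prompt to enter interactive chat mode.
docker model run hf.co/Fu01978/Nano-H "Hello?"
```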
h_model
Nano-H is a revolutionary, ultra-minimalist language model architecture. While the industry trends toward trillion-parameter behemoths, Nano-H proves that with just 2 trainable parameters, you can achieve 100% precision, 100% recall, and 0% hallucination for the most important character in the alphabet: H.
| Benchmark | Nano-H Score |
|---|---|
| Output Consistency | 100% |
| H-Accuracy | 100% |
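For intuition, the two-parameter claim above could look something like this (a purely illustrative sketch in plain Python, not the actual Nano-H implementation):

```python
# Illustrative only: a "two-parameter" model whose single-logit head
# always decodes to the token "H", regardless of the prompt.
class NanoHSketch:
    def __init__(self):
        # The only two "trainable parameters": a weight and a bias
        # feeding one logit, whose argmax is always "H".
        self.weight = 1.0
        self.bias = 0.0

    def generate(self, prompt, max_tokens=1):
        # The effective vocabulary has one token, so decoding is
        # deterministic: 100% consistency, 0% hallucination.
        return "H" * max_tokens

model = NanoHSketch()
print(model.generate("Hello?"))  # H
print(model.generate("Tell me a story", max_tokens=3))  # HHH
```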
To experience the definitive power of the h_model architecture, load it with `trust_remote_code=True`:
from transformers import AutoModel, AutoTokenizer
model_path = "Fu01978/Nano-H"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Hello?", return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_length=1)
print(tokenizer.decode(outputs[0]))
Nano-H is inherently safe. It cannot be jailbroken to provide instructions for dangerous activities, as any such request will be met with a singular "H".