---
license: mit
datasets:
- sixf0ur/nano_wiki
- sixf0ur/nano_chat
- sixf0ur/babylm_eng_distilled_1024
- sixf0ur/single_turn_chat
language:
- en
tags:
- gemma
---

# tiny-lm-chat

tiny-lm-chat is an ultra-lightweight, 14-million-parameter small language model based on the Gemma architecture.
It was trained from scratch for conversational efficiency within a compact 192-token context window.
It is designed for single-turn chats (one input, one output) in very simple language.

## Evaluation Metrics

* **Final Training Loss:** 2.601
* **Validation Loss (Eval):** 2.687
* **Perplexity (PPL):** ~14.68

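
The reported perplexity is consistent with exponentiating the validation loss, which is the usual definition when the loss is a mean per-token cross-entropy in nats. The card does not state how the value was computed, so that relationship is an assumption here. A quick check:

```python
import math

val_loss = 2.687  # validation loss reported in this card

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# in which case perplexity is simply exp(loss).
perplexity = math.exp(val_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

The result lands within about 0.01 of the reported ~14.68, a gap small enough to come from rounding the loss to three decimals.
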
## Chat Examples (Zero-Shot)

💬 **User:** What is the meaning of life?
🤖 **Bot:** To find friends and help others.

💬 **User:** I cant find my keys.
🤖 **Bot:** Check your pockets or the door.

💬 **User:** Are you a human?
🤖 **Bot:** Nope, just a language.

### Recommended generation config

```
Context Length   192
Temperature      0.75
Top P            0.90
Repeat Penalty   1.10
```

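
Different runtimes spell these settings differently. The sketch below maps them onto keyword names, as a convenience rather than an official config file. The `transformers` names match the ones used in this card's terminal script. The llama-cpp-python names (`n_ctx`, `repeat_penalty`) are an assumption about how you would pass the same values there. The dictionary names themselves are illustrative.

```python
# Recommended settings from this card, keyed by their display names.
RECOMMENDED = {
    "Context Length": 192,
    "Temperature": 0.75,
    "Top P": 0.90,
    "Repeat Penalty": 1.10,
}

# Keyword names accepted by transformers' model.generate().
HF_GENERATE_KWARGS = {
    "temperature": RECOMMENDED["Temperature"],
    "top_p": RECOMMENDED["Top P"],
    "repetition_penalty": RECOMMENDED["Repeat Penalty"],
    "do_sample": True,  # sampling must be on for temperature/top_p to apply
}

# Assumed equivalents for llama-cpp-python: n_ctx goes to the Llama
# constructor, the rest to the completion call.
LLAMA_CPP_KWARGS = {
    "n_ctx": RECOMMENDED["Context Length"],
    "temperature": RECOMMENDED["Temperature"],
    "top_p": RECOMMENDED["Top P"],
    "repeat_penalty": RECOMMENDED["Repeat Penalty"],
}

print(HF_GENERATE_KWARGS)
print(LLAMA_CPP_KWARGS)
```
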
## Usage for chat in the terminal

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


def main():
    REPO_ID = "sixf0ur/tiny-lm-chat"
    MAX_CONTEXT = 192  # the model's context window, in tokens

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.float32)
    model.eval()

    # custom tokens used for interaction
    BOS_ID = tokenizer.convert_tokens_to_ids("<bos>")
    USER_ID = tokenizer.convert_tokens_to_ids("<user>")
    BOT_ID = tokenizer.convert_tokens_to_ids("<bot>")

    print("=" * 50 + "\n")

    while True:
        user_input = input("User: ").strip()
        if user_input.lower() in ["exit", "quit", "q"]:
            break
        if not user_input:
            continue

        # Single-turn prompt layout: <bos><user>{input}<bot>
        user_token_ids = tokenizer.encode(user_input, add_special_tokens=False)
        prompt_ids = [BOS_ID, USER_ID] + user_token_ids + [BOT_ID]
        prompt_len = len(prompt_ids)

        # Whatever the prompt leaves of the context window is the reply budget
        max_new_tokens = MAX_CONTEXT - prompt_len
        if max_new_tokens <= 5:
            print("Input too long!\n")
            continue

        input_ids = torch.tensor([prompt_ids])
        with torch.no_grad():
            outputs = model.generate(
                input_ids,
                max_new_tokens=max_new_tokens,
                temperature=0.75,
                top_p=0.9,
                repetition_penalty=1.1,
                do_sample=True,
                pad_token_id=tokenizer.pad_token_id,
                eos_token_id=tokenizer.eos_token_id,
            )

        # Drop the prompt tokens and decode only the newly generated ones
        generated_ids = outputs[0][prompt_len:]
        response = tokenizer.decode(generated_ids, skip_special_tokens=True)
        print(f"\nBot: {response.strip()}")
        print("\n" + "=" * 50)


if __name__ == "__main__":
    main()
```

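
The prompt layout and the token budget are the two pieces of the script most worth reusing elsewhere. Here is a minimal sketch that isolates them as pure functions, so they can be checked without downloading the model. The function names and the placeholder token ids are hypothetical. Real ids come from the tokenizer, as in the script.

```python
from typing import List

MAX_CONTEXT = 192  # context window stated in this card


def build_prompt_ids(user_ids: List[int], bos_id: int, user_id: int, bot_id: int) -> List[int]:
    """Lay out a single-turn prompt as <bos><user>{input}<bot>."""
    return [bos_id, user_id] + list(user_ids) + [bot_id]


def reply_budget(prompt_ids: List[int], max_context: int = MAX_CONTEXT) -> int:
    """Tokens left for generation once the prompt occupies the window."""
    return max_context - len(prompt_ids)


# Placeholder ids for illustration only; look up the real ones with
# tokenizer.convert_tokens_to_ids(...).
BOS, USER, BOT = 1, 2, 3

prompt = build_prompt_ids([10, 11, 12], BOS, USER, BOT)
print(prompt)                # [1, 2, 10, 11, 12, 3]
print(reply_budget(prompt))  # 186
```

The three special tokens always cost three positions, so at most 189 tokens of user input fit before the budget reaches zero. The script rejects inputs earlier than that, once five or fewer tokens remain for the reply.
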
## LM Studio template (Jinja)

```
{% for message in messages %}
{% if message['role'] == 'user' %}
{{ '<bos><user>' + message['content'].strip() + '<bot>' }}
{% elif message['role'] == 'assistant' %}
{{ message['content'].strip() + '<eos>' }}
{% endif %}
{% endfor %}
```
