Instructions to use Dmitriy-Zemskov/CalmaCatLM-1.5-mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Dmitriy-Zemskov/CalmaCatLM-1.5-mini with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Dmitriy-Zemskov/CalmaCatLM-1.5-mini")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Dmitriy-Zemskov/CalmaCatLM-1.5-mini", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Dmitriy-Zemskov/CalmaCatLM-1.5-mini with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Dmitriy-Zemskov/CalmaCatLM-1.5-mini"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Dmitriy-Zemskov/CalmaCatLM-1.5-mini",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/Dmitriy-Zemskov/CalmaCatLM-1.5-mini
```
- SGLang
How to use Dmitriy-Zemskov/CalmaCatLM-1.5-mini with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Dmitriy-Zemskov/CalmaCatLM-1.5-mini" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Dmitriy-Zemskov/CalmaCatLM-1.5-mini",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Dmitriy-Zemskov/CalmaCatLM-1.5-mini" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Dmitriy-Zemskov/CalmaCatLM-1.5-mini",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use Dmitriy-Zemskov/CalmaCatLM-1.5-mini with Docker Model Runner:
```bash
docker model run hf.co/Dmitriy-Zemskov/CalmaCatLM-1.5-mini
```
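The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are illustrative, and the response parsing assumes the server answers in the OpenAI completions format (a `choices` list whose items carry `text`).

```python
import json
import urllib.request

MODEL_ID = "Dmitriy-Zemskov/CalmaCatLM-1.5-mini"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the /v1/completions endpoint (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumption: the reply is a JSON object shaped like
    {"choices": [{"text": "..."}]}.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Usage (requires a running server):
#   complete("Once upon a time,")                                    # vLLM, port 8000
#   complete("Once upon a time,", base_url="http://localhost:30000") # SGLang
```

Building the request separately from sending it makes the payload easy to inspect without a server running.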
---
language:
- en
- ru
license: mit
tags:
- causal-lm
- text-generation
- chatbot
- experimental
model_type: gpt
datasets:
...
library_name: transformers
---
Discord: https://discord.gg/DUzP7CXqJt, https://discord.gg/jzwR7jFfSB

Website: https://calmacatai.draklor.ru
## License
This model is licensed under the MIT License.
# CalmaCatLM-1.5-mini
🚧 **Experimental, under-trained model** (~**12**M parameters) **based on a custom 12-layer/12-head Transformer architecture.**
**Primarily supports English** 🇬🇧. **This is my third model.**
## Description
CalmaCatLM is an **experimental generative language model** designed for text generation and dialogue tasks.
The main goal of this project is to test the full pipeline: **from implementing the architecture and training from scratch** to uploading models to the Hugging Face Hub.
## Model Details
- **Architecture: Custom Transformer Decoder (6 layers, 6 attention heads)**
- **Model size: ~12M parameters**
- **Training Approach: Pre-trained from scratch on My dataset**
- **Languages: Primarily Russian**
- **License: MIT**
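To get a feel for how a decoder of this shape reaches roughly 12M parameters, here is a rough counting sketch. It is not taken from the model's code: the hidden size (`d_model=384`) and vocabulary size (`vocab_size=4096`) are hypothetical values chosen for illustration, and the formula assumes a standard GPT-style block (4x feed-forward expansion, biases, learned positional embeddings, tied output embeddings).

```python
def estimate_decoder_params(n_layers, d_model, vocab_size, max_seq_len,
                            tie_embeddings=True):
    """Rough parameter count for a GPT-style decoder-only Transformer.

    Assumptions (not confirmed by the model card): 4x feed-forward
    expansion, biases on every linear layer, learned positional
    embeddings, and two LayerNorms per block. The number of attention
    heads does not change the count, since heads split d_model.
    """
    attention = 4 * d_model * d_model + 4 * d_model      # Q, K, V, output projections
    feed_forward = 8 * d_model * d_model + 5 * d_model   # d -> 4d -> d, with biases
    layer_norms = 2 * 2 * d_model                        # scale + shift, twice per block
    per_block = attention + feed_forward + layer_norms

    embeddings = vocab_size * d_model + max_seq_len * d_model
    final_norm = 2 * d_model
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return n_layers * per_block + embeddings + final_norm + lm_head


# Hypothetical configuration: only n_layers=6 and max_seq_len=128 come from the card.
total = estimate_decoder_params(n_layers=6, d_model=384,
                                vocab_size=4096, max_seq_len=128)
print(f"{total / 1e6:.1f}M parameters")
```

With these illustrative values the estimate lands near 12M, with most of the weight in the Transformer blocks rather than the embeddings.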
## Training Details
- **Dataset:** `My`
- **Hardware:** **Single** AMD **RX 7700 XT** (12GB VRAM)
- **Training Status: Very early checkpoint (Under-trained)**
- **Epochs:** 100
- **Batch size:** 32
- **Optimizer:** AdamW, lr = 3e-4
- **Max sequence length:** 128 tokens
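The hyperparameters above can be collected into a small config, together with a sketch of how a token stream might be cut into next-token-prediction batches. This is an illustration only, not the author's training loop: `TrainConfig` and `make_batches` are hypothetical names, and the input is assumed to be one flat list of token ids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values from the list above; the class itself is hypothetical."""
    epochs: int = 100
    batch_size: int = 32
    learning_rate: float = 3e-4  # AdamW learning rate
    max_seq_len: int = 128

    @property
    def tokens_per_step(self):
        # Tokens consumed by one optimizer step.
        return self.batch_size * self.max_seq_len


def make_batches(token_ids, cfg):
    """Yield (inputs, targets) batches for next-token prediction.

    Each example holds max_seq_len input tokens; targets are the same
    tokens shifted left by one. Examples that do not fill a complete
    batch are dropped.
    """
    window = cfg.max_seq_len + 1  # one extra token supplies the last target
    examples = [
        token_ids[start:start + window]
        for start in range(0, len(token_ids) - window + 1, cfg.max_seq_len)
    ]
    for i in range(0, len(examples) - cfg.batch_size + 1, cfg.batch_size):
        batch = examples[i:i + cfg.batch_size]
        inputs = [example[:-1] for example in batch]
        targets = [example[1:] for example in batch]
        yield inputs, targets


cfg = TrainConfig()
print(cfg.tokens_per_step)  # 32 * 128 tokens per step
```

At a batch size of 32 and 128-token sequences, each step sees 4096 tokens, which is small enough to fit comfortably on a single 12GB GPU for a model of this size.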