---
license: apache-2.0
language:
- en
base_model: []
pipeline_tag: text-generation
datasets:
- HuggingFaceTB/cosmopedia
- tiiuae/falcon-refinedweb
library_name: gguf
tags:
- text-generation
- gguf
- quantized
- 1b
- llama-cpp
---

# PT1S-1B-Q8.gguf

This is a 1-billion-parameter text generation model trained on a mixture of synthetic educational text (Cosmopedia) and filtered web-crawled data (Falcon RefinedWeb). It is built for efficient inference in a small memory footprint.

## Model Details

- **Model Type:** Text Generation
- **Parameters:** 1B
- **Quantization:** Q8_0 (8-bit quantization for high precision with reduced memory)
- **Training Data:**
  - [HuggingFaceTB/cosmopedia](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia)
  - [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)
- **Language(s):** English
- **License:** Apache 2.0
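
To put the parameter count and quantization level above into memory terms, here is a rough size estimate. It is a sketch under stated assumptions, not a measurement of this file: it takes "1B" to mean exactly 10^9 weights, assumes every weight is stored in Q8_0 blocks (32 int8 values plus one 2-byte fp16 scale per block), and ignores GGUF metadata and any tensors kept at higher precision. The function names are illustrative only.

```python
# Rough storage estimate for weights held in Q8_0 blocks.
# Assumption: all weights are Q8_0; real GGUF files also carry metadata
# and may keep some tensors at higher precision.

BLOCK_WEIGHTS = 32        # weights per Q8_0 block
BLOCK_BYTES = 2 + 32      # one fp16 scale (2 bytes) + 32 int8 values


def q8_0_bytes(n_params: int) -> int:
    """Bytes needed to store n_params weights in Q8_0 blocks."""
    n_blocks = -(-n_params // BLOCK_WEIGHTS)  # ceiling division
    return n_blocks * BLOCK_BYTES


def bits_per_weight() -> float:
    """Effective bits per weight, including the per-block scale."""
    return BLOCK_BYTES * 8 / BLOCK_WEIGHTS


n_params = 1_000_000_000  # assumed: "1B" taken as exactly 10**9 weights
size_bytes = q8_0_bytes(n_params)
print(f"{bits_per_weight()} bits per weight")
print(f"{size_bytes:,} bytes ({size_bytes / 2**30:.2f} GiB)")
```

Under these assumptions the weights need a little over 1 GB, which is the figure to budget for (plus context buffers) when planning on-device inference.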

## Training Information

The model was trained on a curated blend of:

1. **Cosmopedia**: A large-scale synthetic dataset designed to provide high-quality educational content across various domains.
2. **Falcon RefinedWeb**: A massive, filtered web dataset that provides broad world knowledge and linguistic diversity.

This combination is intended to give the model both structured knowledge from synthetic sources and a natural, "web-aware" conversational style.

## Usage

### llama.cpp

You can run this model with [llama.cpp](https://github.com/ggerganov/llama.cpp):

```bash
# Recent llama.cpp builds name the CLI binary `llama-cli`; older builds call it `./main`.
llama-cli -m PT1S-1B-Q8.gguf -p "Once upon a time," -n 128
```

### Python (via llama-cpp-python)

```python
from llama_cpp import Llama

# Load the quantized model from a local GGUF file.
llm = Llama(model_path="./PT1S-1B-Q8.gguf")
# Generate up to 100 tokens for a simple Q/A-style prompt.
output = llm("Q: What is the importance of the Cosmopedia dataset? A:", max_tokens=100)
# The result is an OpenAI-style completion dict; print only the generated text.
print(output["choices"][0]["text"])
```
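
The Python snippet above can be wrapped in a few small helpers so that prompt formatting and result handling live in one place. This is a minimal sketch, not part of the model card's official usage: `build_qa_prompt`, `extract_text`, and `ask` are hypothetical names, and the `stop=["Q:"]` setting is an assumption meant to keep the model from continuing with a follow-up question of its own.

```python
from typing import Any


def build_qa_prompt(question: str) -> str:
    """Format a question in the plain "Q: ... A:" style used above."""
    return f"Q: {question.strip()} A:"


def extract_text(completion: dict[str, Any]) -> str:
    """Return only the generated text from a completion dict."""
    return completion["choices"][0]["text"].strip()


def ask(model_path: str, question: str, max_tokens: int = 100) -> str:
    """Load the GGUF file at model_path and answer one question."""
    # Imported lazily so the helpers above work without llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path)
    # Assumption: stopping at "Q:" ends generation before a new question starts.
    completion = llm(build_qa_prompt(question), max_tokens=max_tokens, stop=["Q:"])
    return extract_text(completion)
```

With llama-cpp-python installed and the model file downloaded, a call such as `ask("./PT1S-1B-Q8.gguf", "What is the Cosmopedia dataset?")` returns the answer as a plain string. Note that `ask` reloads the model on every call; for repeated queries, create the `Llama` object once and reuse it.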

## Intended Use

This model is suited to:

- Lightweight text generation tasks.
- Educational applications.
- On-device inference where memory is limited.
- Research into small language models (SLMs).

## Limitations and Bias

Although the training data was filtered, a model of this size can still exhibit bias or generate incorrect information (hallucinations). Verify the model's output before relying on it in critical applications.