Instructions to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF", dtype="auto")

llama-cpp-python

How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF",
	filename="DragonAI-Python-SmolLM2-1.7B-Instruct.IQ4_XS.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

Use Docker

docker model run hf.co/mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

LM Studio
Jan
Ollama
How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with Ollama:
```
ollama run hf.co/mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
```

Unsloth Studio new

How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF to start chatting

Docker Model Runner
How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with Docker Model Runner:
```
docker model run hf.co/mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M
```

Lemonade

How to use mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull mradermacher/DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.DragonAI-Python-SmolLM2-1.7B-Instruct-GGUF-Q4_K_M

List all available models

lemonade list

DragonAI-Python-SmolLM2_model.py???

by MartialTerran - opened Dec 2, 2024

Discussion

MartialTerran

Dec 2, 2024

Can you provide a functional pytorch model.py and train.py that supports at least inference mode of this model using the hyperparameters in the config.json. There is no way to do model architecture research or educational experimentation on "autotransformer" which conceals the actual python scripts. Further, Huggingface sometimes vandalizes the hidden autotransformer model.py making it inoperable. For example the SmolLM2 weights are already useless for research because of a change in "autotransformers" that makes them have a size mismatch in the projections (k and v) apparently because of 3x re-use of k and v projection matrices. See e.g., https://huggingface.co/HuggingFaceTB/SmolLM2-360M/discussions The published SmolLM2 weights have already become unusable and unstudyable without providing a fixed and definite model.py and train.py to document how to implement the unusual config hyperparameters. Please help keep models usable and help the independent research community by providing working pytorch model.py and train.py for each of your model variants.

mradermacher

Owner Dec 3, 2024

Hi, this repo only contains gguf files, not a transformers model. You probably meant to ask this on the original model.

mradermacher changed discussion status to closed Dec 3, 2024

MartialTerran

Dec 8, 2024

Yes, I already asked THERE (original model) for Huggingface to publish a standalone model.py (pytorch only, not "autotransformers"), but the request is ignored. This behavior hinders research and innovation and model optimization. The mentality seems to be that the only interest should be training new models and then deployment for Application Development, not modifying and optimizing or porting the underlying model.py code. Here is an audio discussion about the disappointing direction of the current fixation on training and scaling without re-examination of model architecture. https://huggingface.co/MartialTerran/Toy_GPTs_LLMs_for_CPU_Educational
specifically:
https://huggingface.co/MartialTerran/Toy_GPTs_LLMs_for_CPU_Educational/blob/main/The%20AI%20Revolution_%20A%20Debate.wav

mradermacher

Owner Dec 8, 2024

You will not reach huggingface either there or here, unfortunately, these are just pages for a specific model.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment