Instructions to use RichardErkhov/chrisnic_-_Python_Ass-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="RichardErkhov/chrisnic_-_Python_Ass-gguf",
	filename="Python_Ass.IQ3_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Use Docker

docker model run hf.co/RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

LM Studio
Jan
Ollama
How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Ollama:
```
ollama run hf.co/RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
```

Unsloth Studio new

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RichardErkhov/chrisnic_-_Python_Ass-gguf to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RichardErkhov/chrisnic_-_Python_Ass-gguf to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for RichardErkhov/chrisnic_-_Python_Ass-gguf to start chatting

Pi new

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Docker Model Runner:
```
docker model run hf.co/RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M
```

Lemonade

How to use RichardErkhov/chrisnic_-_Python_Ass-gguf with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull RichardErkhov/chrisnic_-_Python_Ass-gguf:Q4_K_M

Run and chat with the model

lemonade run user.chrisnic_-_Python_Ass-gguf-Q4_K_M

List all available models

lemonade list

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Quantization made by Richard Erkhov.

Github

Discord

Request more models

Python_Ass - GGUF

Model creator: https://huggingface.co/chrisnic/
Original model: https://huggingface.co/chrisnic/Python_Ass/

Name	Quant method	Size
Python_Ass.Q2_K.gguf	Q2_K	2.96GB
Python_Ass.IQ3_XS.gguf	IQ3_XS	3.28GB
Python_Ass.IQ3_S.gguf	IQ3_S	3.43GB
Python_Ass.Q3_K_S.gguf	Q3_K_S	3.41GB
Python_Ass.IQ3_M.gguf	IQ3_M	3.52GB
Python_Ass.Q3_K.gguf	Q3_K	3.74GB
Python_Ass.Q3_K_M.gguf	Q3_K_M	3.74GB
Python_Ass.Q3_K_L.gguf	Q3_K_L	4.03GB
Python_Ass.IQ4_XS.gguf	IQ4_XS	4.18GB
Python_Ass.Q4_0.gguf	Q4_0	4.34GB
Python_Ass.IQ4_NL.gguf	IQ4_NL	4.38GB
Python_Ass.Q4_K_S.gguf	Q4_K_S	4.37GB
Python_Ass.Q4_K.gguf	Q4_K	4.58GB
Python_Ass.Q4_K_M.gguf	Q4_K_M	4.58GB
Python_Ass.Q4_1.gguf	Q4_1	4.78GB
Python_Ass.Q5_0.gguf	Q5_0	5.21GB
Python_Ass.Q5_K_S.gguf	Q5_K_S	5.21GB
Python_Ass.Q5_K.gguf	Q5_K	5.34GB
Python_Ass.Q5_K_M.gguf	Q5_K_M	5.34GB
Python_Ass.Q5_1.gguf	Q5_1	5.65GB
Python_Ass.Q6_K.gguf	Q6_K	6.14GB
Python_Ass.Q8_0.gguf	Q8_0	7.95GB

Original model description:

license: llama3.1 language: - en - it base_model: - meta-llama/Llama-3.1-8B pipeline_tag: text-generation library_name: transformers tags: - code

Python Code Assistant based on LLaMA 3.1

This model is a specialized Python coding assistant, fine-tuned from LLaMA 3.1 8B Instruct using a two-stage training approach with carefully curated Python programming datasets.

Model Description

The model has been trained to assist with Python programming tasks through a progressive fine-tuning approach:

First Training Stage

Base Model: LLaMA 3.1 8B Instruct
Dataset: iamtarun/python_code_instructions_18k_alpaca
Training Focus: Understanding Python programming instructions and generating appropriate code responses

Second Training Stage

Dataset: flytech/python-codes-25k
Focus: Enhancing code generation capabilities and understanding of advanced Python concepts

Training Methodology

The model employs several advanced training techniques to ensure optimal performance:

LoRA Fine-tuning Parameters:
- Rank (r): 8
- Alpha: 16
- Dropout: 0.1
- Target Modules: Query and Value Projections
Training Optimizations:
- 4-bit quantization (NF4 format)
- Gradient checkpointing
- Dynamic learning rate adjustment
- Early stopping with patience=3
- Adaptive batch processing
- Memory-efficient training with automated cleanup

Model Architecture

Base Architecture: LLaMA 3.1 8B Instruct
Training Format: 4-bit quantization with double quantization
Memory Efficient: Optimized for deployment with reduced memory footprint

Intended Uses

This model is designed for:

Generating Python code from natural language descriptions
Assisting with code completion and suggestions
Explaining Python concepts and best practices
Helping with code debugging and optimization
Supporting Python development tasks

Training Data

The model was trained on a combination of:

18,000 Python programming instructions and implementations from the Alpaca dataset
25,000 Python code examples and explanations

Performance and Limitations

Strengths

Specialized in Python programming tasks
Memory-efficient implementation
Trained with gradient stability monitoring
Optimized for practical coding assistance

Limitations

Limited to Python programming language
Based on LLaMA 3.1's knowledge cutoff
May require context for complex programming tasks

Usage Tips

To get the best results from this model:

Provide clear and specific instructions
Include relevant context when asking for code
Specify any particular Python version or library requirements
Mention any performance or style preferences

Training Hardware Requirements

The model was trained using:

GPU RTX4090 24GB VRAM
CUDA compatibility
Optimized for memory efficiency through 4-bit quantization

License and Usage Rights

Base model: LLaMA 3.1 license applies
Additional training: [Specify your license]

Citation and Contact

[christiannicoletti75@gmail.com]

Downloads last month: 162

GGUF

Model size

8B params

Architecture

llama

Hardware compatibility

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support