Instructions for using Andycurrent/Llama-3-8B-Lexi-Uncensored with libraries and local apps.

Libraries

How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with llama-cpp-python:

```
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Andycurrent/Llama-3-8B-Lexi-Uncensored",
    filename="Llama-3-8B-Lexi-Uncensored_F16.gguf",
)

llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
```
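
`create_chat_completion` returns an OpenAI-style response dictionary rather than a plain string. A minimal sketch of pulling the reply text out of that dictionary follows. `extract_reply` is a hypothetical helper name, and the sample dictionary is hand-written to illustrate the shape; it is not real model output.

```python
from typing import Any, Dict


def extract_reply(response: Dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample with the same shape as a chat completion response.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample_response))
```

With a loaded model, the same helper could be applied directly to the return value of `llm.create_chat_completion(...)`.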

Local Apps

llama.cpp

How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with llama.cpp:

Install from brew

```
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Install from WinGet (Windows)

```
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Use pre-built binary

```
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Build from source code

```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Use Docker

```
docker model run hf.co/Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

LM Studio
Jan

vLLM

How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Andycurrent/Llama-3-8B-Lexi-Uncensored"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Andycurrent/Llama-3-8B-Lexi-Uncensored",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
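
The curl call above can also be issued from Python using only the standard library. This is a minimal sketch, assuming the server listens on `http://localhost:8000` as in the curl example. `build_chat_request` and `send_chat_request` are hypothetical helper names introduced here for illustration.

```python
import json
import urllib.request

MODEL_ID = "Andycurrent/Llama-3-8B-Lexi-Uncensored"


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:8000",
                       model: str = MODEL_ID) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request,
                      timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Building the request needs no running server; sending it does.
request = build_chat_request("What is the capital of France?")
print(request.full_url)
```

Calling `send_chat_request(request)` requires the server to be running. The same request shape should also work against another OpenAI-compatible server, such as `llama-server`, by passing a different `base_url`; the right port depends on how that server was started.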

Use Docker

```
docker model run hf.co/Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Ollama
How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with Ollama:
```
ollama run hf.co/Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Unsloth Studio

How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

```
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Andycurrent/Llama-3-8B-Lexi-Uncensored to start chatting
```

Install Unsloth Studio (Windows)

```
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Andycurrent/Llama-3-8B-Lexi-Uncensored to start chatting
```

Use Hugging Face Spaces for Unsloth

No setup required. Open https://huggingface.co/spaces/unsloth/studio in your browser and search for Andycurrent/Llama-3-8B-Lexi-Uncensored to start chatting.

Docker Model Runner
How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with Docker Model Runner:
```
docker model run hf.co/Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Lemonade

How to use Andycurrent/Llama-3-8B-Lexi-Uncensored with Lemonade:

Pull the model

```
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Andycurrent/Llama-3-8B-Lexi-Uncensored:Q4_K_M
```

Run and chat with the model

```
lemonade run user.Llama-3-8B-Lexi-Uncensored-Q4_K_M
```

List all available models

```
lemonade list
```

Llama-3-8B-Lexi-Uncensored – Adaptive Conversational Model

Llama-3-8B-Lexi-Uncensored is an 8-billion-parameter conversational model tuned for users who want high responsiveness, minimal automated moderation, and a flexible instruction-following style. It is suited to self-hosted environments and research workflows.

Model Overview

Model Name: Llama-3-8B-Lexi-Uncensored
Base Model: Meta Llama-3-8B
Author / Maintainer: Orenguteng
Training Method: Dialogue-centric fine-tuning focused on open instruction patterns
License: Follows the licensing terms of the underlying Llama-3 release (see the base model's license for details)
Primary Intent: A customizable assistant for experimentation, private deployments, and alignment research

Dialogue Format

The model works best with a structured chat pattern consistent with modern instruction models, such as:

```
<|system|>
System context or behavioral instructions
<|user|>
Your prompt or message
<|assistant|>
```

This helps maintain clarity throughout extended exchanges and supports consistent instruction execution.
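
The role markers above are schematic. The exact special tokens depend on the chat template shipped with the model files, so passing role-tagged messages and letting the runtime apply the template is usually safer than writing the tokens by hand. The sketch below builds such a message list; `build_messages` is a hypothetical helper name, not part of any library.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_messages(user_prompt: str,
                   system_prompt: Optional[str] = None,
                   history: Optional[List[Message]] = None) -> List[Message]:
    """Assemble system, prior turns, and the new user prompt in order."""
    messages: List[Message] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages


messages = build_messages(
    "Your prompt or message",
    system_prompt="System context or behavioral instructions",
)
for message in messages:
    print(f'{message["role"]}: {message["content"]}')
```

The resulting list can be passed as the `messages` argument of `create_chat_completion` in llama-cpp-python or as the `messages` field of an OpenAI-compatible request. This assumes the runtime picks up the chat template stored with the model, which recent versions generally do.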

Capabilities

Follows instructions reliably across coding, reasoning, and analytical tasks
Reduced filtering enables deeper exploration during alignment or RLHF research
Capable of maintaining coherent multi-step chains of thought
Performs well in creative writing, drafting, role-play, and idea development
Effective in local inference setups, including quantized runtimes
Designed for sustained, multi-turn conversations without drifting
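
Sustained multi-turn use still has to fit within the model's context window, so long conversations usually need some form of history management. The sketch below assumes a deliberately simple policy: keep a leading system message plus the most recent messages. `trim_history` and `max_messages` are hypothetical names, and counting messages rather than tokens is a simplification.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_messages: int = 8) -> List[Message]:
    """Keep a leading system message plus the most recent max_messages others."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    # Only a system message in the first position is treated as persistent.
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    return system + rest[-max_messages:]


conversation = [{"role": "system", "content": "You are a concise assistant."}]
for turn in range(1, 6):
    conversation.append({"role": "user", "content": f"Question {turn}"})
    conversation.append({"role": "assistant", "content": f"Answer {turn}"})

trimmed = trim_history(conversation, max_messages=4)
print([m["content"] for m in trimmed])
```

A token-based budget would be more precise; this version only illustrates the idea of dropping the oldest turns while preserving the system instructions.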

Recommended Use Cases

Local AI assistant scenarios – brainstorming, drafting, explaining concepts
Developer tooling – code generation, review, technical guides
Research & experimentation – probing model behavior, tuning, alignment studies
Privacy-sensitive workflows – running locally without external dependencies
Creative tasks – story building, character simulation, world design

Important Considerations

The model intentionally avoids strong automated moderation.
Users are solely responsible for operating it safely and legally.
Recommended for individuals familiar with LLM deployment, prompt engineering, and governance.
Not intended for deployment in unsupervised public-facing applications.

Acknowledgements

Appreciation goes to Meta for releasing Llama-3, to the open-source community for the tools that enable fine-tuning and evaluation, and to all contributors who support accessible research into instruction-oriented language models. The structure of this card follows the reference README.


GGUF

Model size: 8B params
Architecture: llama
Hardware compatibility (available quantizations): 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
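
A rough way to relate the bit-widths above to memory needs is parameters × bits ÷ 8. The sketch below applies that to the listed 8B parameter count. It is a lower-bound estimate for the weights only: it ignores quantization metadata, tensors kept at higher precision, and the KV cache, so real files and runtime usage will be larger. `estimate_weight_gb` is a hypothetical helper name.

```python
PARAMS = 8_000_000_000  # "8B params" as listed above (rounded)
BIT_WIDTHS = [2, 3, 4, 5, 6, 8, 16]


def estimate_weight_gb(params: int, bits: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return params * bits / 8 / 1e9


for bits in BIT_WIDTHS:
    print(f"{bits:>2}-bit: ~{estimate_weight_gb(PARAMS, bits):.1f} GB")
```

Under this approximation the 4-bit variant needs roughly 4 GB for weights and the 16-bit variant roughly 16 GB, which gives a first-pass check of whether a given quantization is likely to fit on a device.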

Model tree for Andycurrent/Llama-3-8B-Lexi-Uncensored

Base model: Orenguteng/Llama-3-8B-Lexi-Uncensored
Quantized versions of the base model: 21, including this model
