Instructions for using Ansah-AI/E1 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Ansah-AI/E1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Ansah-AI/E1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Ansah-AI/E1")
model = AutoModelForCausalLM.from_pretrained("Ansah-AI/E1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Ansah-AI/E1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Ansah-AI/E1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ansah-AI/E1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
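
The same chat-completions request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are illustrative, not part of vLLM, and the default URL assumes the server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "Ansah-AI/E1") -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the request to a running server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    # OpenAI-compatible servers place the reply under choices[0].message.content.
    return payload["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the model's answer as a string. Passing `base_url="http://localhost:30000/v1"` would target a server on port 30000 instead.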

Use Docker

docker model run hf.co/Ansah-AI/E1

SGLang

How to use Ansah-AI/E1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Ansah-AI/E1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ansah-AI/E1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Ansah-AI/E1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ansah-AI/E1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Ansah-AI/E1 with Docker Model Runner:
```
docker model run hf.co/Ansah-AI/E1
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

 Ansah E1: Fine-Tuned Customer Support Model

 Model Overview
Ansah E1 is a fine-tuned version of Meta's Llama 3.2 1B Instruct, built to automate customer support across industries. It provides fast, accurate, and context-aware responses, making it suitable for businesses seeking AI-driven support solutions.  

While it is tuned primarily for e-commerce, it can also be used for SaaS, IT support, and enterprise service automation. Unlike cloud-hosted models, Ansah E1 can run locally, which keeps data private, lowers operating costs, and reduces latency.  



 Key Features
- Accurate and context-aware responses  
  - Understands structured and unstructured customer queries  
  - Maintains conversation memory for multi-turn interactions  

- Automated ticket escalation when used with LangChain or other frameworks  
  - Detects critical cases and escalates them intelligently  
  - Reduces workload by handling repetitive issues autonomously  

- Local deployment and data privacy  
  - Runs entirely on-premises for full data control  
  - Eliminates external cloud dependencies, ensuring security  

- Optimized for efficient performance  
  - Works smoothly on consumer-grade GPUs and high-performance CPUs  
  - Available in 4-bit GGUF format for lightweight, optimized deployment  

- Seamless API and tool integration  
  - Can integrate with e-commerce platforms, SaaS tools, and IT support systems  
  - Supports tool-calling functions to automate business workflows  
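
As an illustration of the multi-turn and escalation features above, here is a minimal sketch in plain Python (standing in for LangChain or a similar framework). It assumes conversation memory is provided by resending the full message list each turn, and it uses a simple keyword check for escalation. `SupportSession`, `should_escalate`, `ESCALATION_KEYWORDS`, and `echo_model` are hypothetical names, and the keyword list is a placeholder rather than anything shipped with the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]

# Hypothetical trigger phrases; a real deployment would tune these.
ESCALATION_KEYWORDS = ("chargeback", "legal", "fraud", "speak to a human")


def should_escalate(text: str) -> bool:
    """Return True when the customer message contains a trigger phrase."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in ESCALATION_KEYWORDS)


@dataclass
class SupportSession:
    """Keeps the running message list so each turn sees earlier context."""

    generate: Callable[[List[Message]], str]
    messages: List[Message] = field(default_factory=list)
    escalated: bool = False

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        if should_escalate(user_text):
            # Skip the model and hand the ticket to a person.
            self.escalated = True
            reply = "I am handing this conversation to a human agent."
        else:
            reply = self.generate(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_model(messages: List[Message]) -> str:
    """Stand-in for the model: reports how many messages it was given."""
    return f"(reply based on {len(messages)} messages)"


session = SupportSession(generate=echo_model)
first = session.ask("Where is my order?")
second = session.ask("Can I still cancel it?")
third = session.ask("This looks like fraud on my card.")
```

In practice, `generate` would wrap a call to the model (for example the Transformers pipeline or a local server), and the escalation rule could be replaced by a classifier or a framework-level router.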



 Model Details
- Base Model: Meta Llama 3.2 1B Instruct (meta-llama/Llama-3.2-1B-Instruct)  
- Fine-Tuned Data: Customer support logs, e-commerce transactions, and business service inquiries  
- Primary Use Cases:  
  - E-Commerce: Order tracking, refunds, cancellations, and payment assistance  
  - IT and SaaS Support: AI-powered help desks and troubleshooting  
  - Enterprise Automation: On-prem AI assistants for business operations  
- Hardware Compatibility:  
  - Optimized for local GPU and CPU deployment  
  - Available in GGUF format for lightweight, high-speed inference  



 Available Model Formats  
 Full Precision Model (Hugging Face Transformers)
Repository: [Ansah E1](https://huggingface.co/Ansah-AI/E1)  
- Best suited for high-accuracy, real-time inference  
- Can be loaded with 4-bit or 8-bit quantization to reduce memory use  

 4-Bit GGUF Model for Lightweight Deployment
Repository: [Ansah E1 - 4bit GGUF](https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF)  
- Designed for low-resource environments  
- Ideal for llama.cpp, KoboldAI, and local AI inference engines  
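
To compare the two formats, the weight footprint can be estimated as parameters × bits per weight ÷ 8. The sketch below uses the nominal "1B" parameter count as an assumption (the exact count may differ). It covers the weights only and ignores the KV cache, activations, and quantization overhead; GGUF K-quants such as Q4_K_M store somewhat more than 4 bits per weight on average.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Assumption: nominal parameter count taken from the "1B" label.
NOMINAL_PARAMS = 1e9

for label, bits in [("F16", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_memory_gib(NOMINAL_PARAMS, bits)
    print(f"{label:>5}: ~{size:.2f} GiB")
```

Under these assumptions the estimate is about 1.86 GiB for F16 weights and about 0.47 GiB at 4 bits per weight, which explains why the GGUF build targets low-resource environments.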



 How to Use  

 Using the Full Precision Model
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Ansah-AI/E1")
model = AutoModelForCausalLM.from_pretrained("Ansah-AI/E1")

- For optimized inference, use 4-bit or 8-bit quantization via bitsandbytes  
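
A minimal sketch of quantized loading with bitsandbytes follows. It assumes `transformers`, `accelerate` (needed for `device_map="auto"`), and `bitsandbytes` are installed, and that a GPU supported by bitsandbytes is available. The helper names `quantization_kwargs` and `load_quantized` are illustrative, and the NF4 quantization type is one reasonable choice rather than a setting recommended by the model authors.

```python
def quantization_kwargs(bits: int) -> dict:
    """Map a bit width to keyword arguments for BitsAndBytesConfig."""
    if bits == 4:
        # NF4 is a common 4-bit choice; this is an assumption, not a model requirement.
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id: str = "Ansah-AI/E1", bits: int = 4):
    """Load the tokenizer and a quantized model (downloads weights on first use)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    config = BitsAndBytesConfig(**quantization_kwargs(bits))
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=config,
        device_map="auto",  # let accelerate place the layers
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_quantized(bits=8)` would select 8-bit loading instead. The returned objects can be used with `apply_chat_template` and `generate` in the same way as the unquantized model.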



 Using the GGUF 4-Bit Model (For llama.cpp and Local Inference)
# Download the GGUF model
wget https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF/resolve/main/E1-Q4_K_M.gguf

# Run using llama.cpp (recent builds name this binary llama-cli; older builds use ./main)
./main -m E1-Q4_K_M.gguf -p "Hello, how can I assist you?"

- Works with llama.cpp, KoboldAI, and other local inference frameworks  
- Perfect for low-power devices or edge deployment  
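
The GGUF file can also be used from Python. The sketch below assumes the `huggingface_hub` and `llama-cpp-python` packages are installed, and takes the repository and file name from the download URL above. `resolve_url` and `chat_with_gguf` are illustrative helper names, and the context size and token limit are arbitrary starting values.

```python
REPO_ID = "dheerajdasari/E1-Q4_K_M-GGUF"
FILENAME = "E1-Q4_K_M.gguf"


def resolve_url(repo_id: str = REPO_ID, filename: str = FILENAME) -> str:
    """Return the direct download URL for a file on the Hugging Face Hub."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


def chat_with_gguf(prompt: str, n_ctx: int = 2048) -> str:
    """Download the GGUF file (cached after the first call) and run one chat turn."""
    # Imported lazily so resolve_url works without these packages installed.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
    llm = Llama(model_path=model_path, n_ctx=n_ctx)
    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
    )
    return result["choices"][0]["message"]["content"]
```

For example, `chat_with_gguf("Hello, how can I assist you?")` would run the same prompt as the command-line example, using the model's chat template instead of a raw prompt.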



 Conclusion  
Ansah E1 is a scalable, private, and efficient AI model designed to automate customer support across multiple industries. It eliminates cloud dependencies, ensuring cost-effective and secure deployment while providing fast, intelligent, and reliable support automation.  

Try it now:  
[Ansah E1 (Full Model)](https://huggingface.co/Ansah-AI/E1)  
[Ansah E1 - 4bit GGUF](https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF)

Downloads last month: 4

- Model size: 1B params (Safetensors)
- Tensor type: F16
- Base model: meta-llama/Llama-3.2-1B-Instruct
- Finetunes of the base model: 1728, including this model
- Quantizations: 2 models