Ansah E1: Fine-Tuned Customer Support Model
Model Overview
Ansah E1 is a fine-tuned version of Meta’s LLaMA 1B-Instruct,
built for automating customer support across industries.
It provides fast, accurate, and context-aware responses,
making it ideal for businesses seeking AI-driven support solutions.
While it is highly optimized for e-commerce,
it can also be used for SaaS, IT support, and enterprise service automation.
Unlike cloud-hosted models, Ansah E1 can run locally,
which keeps customer data on-premises and can lower operating costs and latency.
Key Features
- Accurate and context-aware responses
  - Understands structured and unstructured customer queries
  - Maintains conversation memory for multi-turn interactions
- Automated ticket escalation when used with LangChain or other frameworks
  - Detects critical cases and escalates them intelligently
  - Reduces workload by handling repetitive issues autonomously
- Local deployment and data privacy
  - Runs entirely on-premises for full data control
  - Eliminates external cloud dependencies, ensuring security
- Optimized for efficient performance
  - Works smoothly on consumer-grade GPUs and high-performance CPUs
  - Available in 4-bit GGUF format for lightweight, optimized deployment
- Seamless API and tool integration
  - Can integrate with e-commerce platforms, SaaS tools, and IT support systems
  - Supports tool-calling functions to automate business workflows
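As a rough illustration of the multi-turn and escalation features above, the sketch below shows one way an application could keep the conversation history itself and flag messages for a human agent. It is not part of the Ansah E1 release: `SupportSession`, `needs_escalation`, and the keyword list are hypothetical names invented for this example, and a real deployment would likely use a framework such as LangChain or its own policy.

```python
from dataclasses import dataclass, field

# Hypothetical trigger phrases; a real deployment would tune or replace these.
ESCALATION_KEYWORDS = ("chargeback", "legal action", "fraud", "speak to a human")


@dataclass
class SupportSession:
    """Caller-managed chat history for one customer conversation."""

    system_prompt: str
    max_turns: int = 6  # user/assistant pairs kept in the prompt (must be >= 1)
    history: list = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Record one message and drop the oldest ones beyond the window."""
        self.history.append({"role": role, "content": content})
        self.history = self.history[-2 * self.max_turns:]

    def messages(self) -> list:
        """Messages to pass to the chat template on the next call."""
        return [{"role": "system", "content": self.system_prompt}] + self.history


def needs_escalation(user_message: str) -> bool:
    """Rule-based check; stands in for whatever policy the application uses."""
    text = user_message.lower()
    return any(keyword in text for keyword in ESCALATION_KEYWORDS)


if __name__ == "__main__":
    session = SupportSession(system_prompt="You are a helpful support agent.")
    session.add("user", "My card was charged twice, this looks like fraud.")
    if needs_escalation(session.history[-1]["content"]):
        print("Routing this conversation to a human agent.")
```

The model only sees what is in the prompt, so the list returned by `session.messages()` is what would be passed to the tokenizer's chat template on each turn.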
Model Details
- Base Model: Meta LLaMA 1B-Instruct
- Fine-Tuning Data: Customer support logs, e-commerce transactions, and business service inquiries
- Primary Use Cases:
  - E-Commerce: Order tracking, refunds, cancellations, and payment assistance
  - IT and SaaS Support: AI-powered help desks and troubleshooting
  - Enterprise Automation: On-prem AI assistants for business operations
- Hardware Compatibility:
  - Optimized for local GPU and CPU deployment
  - Available in GGUF format for lightweight, high-speed inference
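To give a feel for the hardware claims above, here is a back-of-the-envelope estimate of the memory needed for the weights alone at different precisions. It assumes "1B" means exactly one billion parameters and ignores the KV cache, activations, and quantization metadata, so real usage will be somewhat higher; `weight_memory_gib` is a helper name made up for this sketch.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


NUM_PARAMS = 1e9  # assumption: "1B" taken as exactly one billion parameters

if __name__ == "__main__":
    for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
        print(f"{label:>5}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions the weights take roughly 1.86 GiB at 16-bit, 0.93 GiB at 8-bit, and 0.47 GiB at a nominal 4 bits per weight.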
Available Model Formats
Full Precision Model (Hugging Face Transformers)
Repository: [Ansah E1](https://huggingface.co/Ansah-AI/E1)
- Best suited for high-accuracy, real-time inference
- Runs efficiently with 4-bit or 8-bit quantization for optimal performance
4-Bit GGUF Model for Lightweight Deployment
Repository: [Ansah E1 - 4bit GGUF](https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF)
- Designed for low-resource environments
- Ideal for llama.cpp, KoboldAI, and other local AI inference engines
How to Use
Using the Full Precision Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Ansah-AI/E1")
model = AutoModelForCausalLM.from_pretrained("Ansah-AI/E1")
```
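Loading the model is only the first step. The sketch below shows one way to generate a reply from it using the tokenizer's chat template. It assumes the repository ships a chat template (as instruct-tuned LLaMA checkpoints usually do); `generate_reply` and the `max_new_tokens` default are choices made for this example, not settings recommended by the authors.

```python
MODEL_ID = "Ansah-AI/E1"


def generate_reply(messages: list, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Generate one assistant reply for a list of chat messages."""
    # Imported here so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Format the conversation with the model's chat template and tokenize it.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply([{"role": "user", "content": "Where is my order?"}]))
```

For repeated calls, load the tokenizer and model once and reuse them rather than reloading inside the function.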
- For optimized inference, use 4-bit or 8-bit quantization via bitsandbytes
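The quantization note above can be sketched with the `BitsAndBytesConfig` class from Transformers. This is a minimal example under several assumptions: a CUDA GPU is available, the `bitsandbytes` and `accelerate` packages are installed, and the NF4 settings shown are common defaults rather than values published for Ansah E1. `load_quantized` is a helper name invented here.

```python
MODEL_ID = "Ansah-AI/E1"


def load_quantized(model_id: str = MODEL_ID, four_bit: bool = True):
    """Load the tokenizer and a bitsandbytes-quantized model (needs a CUDA GPU)."""
    # Imported lazily so this module can be imported on machines without torch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    if four_bit:
        # Assumed settings: NF4 weights with float16 compute.
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
    else:
        quant_config = BitsAndBytesConfig(load_in_8bit=True)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # requires the accelerate package
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_quantized()
    print(f"Memory footprint: {model.get_memory_footprint() / 1024**2:.0f} MiB")
```

Passing `four_bit=False` switches to 8-bit loading, which uses more memory but quantizes the weights less aggressively.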
Using the GGUF 4-Bit Model (For Llama.cpp and Local Inference)
```bash
# Download the GGUF model
wget https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF/resolve/main/E1-Q4_K_M.gguf

# Run using llama.cpp
# (newer llama.cpp builds name this binary llama-cli instead of main)
./main -m E1-Q4_K_M.gguf -p "Hello, how can I assist you?"
```
- Works with Llama.cpp, KoboldAI, and other local inference frameworks
- Perfect for low-power devices or edge deployment
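For application code, the GGUF file can also be served over HTTP and queried from Python. The sketch below assumes the model is being served by llama.cpp's `llama-server` (for example `llama-server -m E1-Q4_K_M.gguf --port 8080`), which exposes an OpenAI-compatible `/v1/chat/completions` endpoint. The base URL, the `model` label, and the helper names are assumptions made for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: local llama-server address


def build_chat_request(user_text: str, model: str = "E1-Q4_K_M", max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
    }


def chat(user_text: str, base_url: str = BASE_URL) -> str:
    """Send one user message to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(user_text)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("I want to cancel my order. What should I do?"))
```

Because the request format follows the OpenAI chat API, the same client should work against other OpenAI-compatible local servers after changing `BASE_URL`.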
Conclusion
Ansah E1 is a scalable, private, and efficient AI model designed to automate customer support across multiple industries. It eliminates cloud dependencies, ensuring cost-effective and secure deployment while providing fast, intelligent, and reliable support automation.
Try it now:
[Ansah E1 (Full Model)](https://huggingface.co/Ansah-AI/E1)
[Ansah E1 - 4bit GGUF](https://huggingface.co/dheerajdasari/E1-Q4_K_M-GGUF)