How to use from SGLang
Use Docker images
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "OEvortex/HelpingAI-Lite" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "OEvortex/HelpingAI-Lite",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
HelpingAI-Lite
Subscribe to my YouTube channel
GGUF version here
HelpingAI-Lite is a lightweight version of the HelpingAI model that can assist with coding tasks. It was trained on a diverse range of datasets and fine-tuned to give accurate, helpful responses.
License
This model is licensed under MIT.
Datasets
The model was trained on the following datasets:
- cerebras/SlimPajama-627B
- bigcode/starcoderdata
- HuggingFaceH4/ultrachat_200k
- HuggingFaceH4/ultrafeedback_binarized
Language
The model supports English.
Usage
CPU and GPU code
from transformers import pipeline
from accelerate import Accelerator
# Initialize the accelerator
accelerator = Accelerator()
# Initialize the pipeline
pipe = pipeline("text-generation", model="OEvortex/HelpingAI-Lite", device=accelerator.device)
# Define the messages
messages = [
    {
        "role": "system",
        "content": "You are a chatbot who can help code!",
    },
    {
        "role": "user",
        "content": "Write me a function to calculate the first 10 digits of the fibonacci sequence in Python and print it out to the CLI.",
    },
]
# Prepare the prompt
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Generate predictions
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
# Print the generated text
print(outputs[0]["generated_text"])
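By default the text-generation pipeline returns the prompt followed by the completion in `generated_text`, so the print above also echoes the rendered chat template. Below is a minimal sketch of trimming the prompt off afterwards. `extract_reply` is a hypothetical helper name, not part of transformers, and the strings are made-up stand-ins rather than real model output. Passing `return_full_text=False` to the pipeline call is an alternative that should achieve the same thing.

```python
def extract_reply(generated_text: str, prompt: str) -> str:
    """Return only the completion from a text-generation result.

    Assumes the pipeline echoed the prompt at the start of the output,
    which is its default behaviour; otherwise the text is returned as-is.
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


# Stand-in strings for illustration only (not the model's real chat template).
prompt = "User: Hi\nAssistant: "
generated = prompt + "Hello! How can I help?"

print(extract_reply(generated, prompt))
```

With the usage code above, the call would look like `extract_reply(outputs[0]["generated_text"], prompt)`.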
Evaluation results
- Epoch (self-reported): 3.000
- Eval Logits/Chosen (self-reported): -2.707
- Eval Logits/Rejected (self-reported): -2.657
- Eval Logps/Chosen (self-reported): -370.130
- Eval Logps/Rejected (self-reported): -296.074
- Eval Loss (self-reported): 0.514
- Eval Rewards/Accuracies (self-reported): 0.738
- Eval Rewards/Chosen (self-reported): -0.027
- Eval Rewards/Margins (self-reported): 1.009
- Eval Rewards/Rejected (self-reported): -1.036
- Eval Runtime (self-reported): 93.591
- Eval Samples (self-reported): 2000.000
- Eval Samples per Second (self-reported): 21.370
- Eval Steps per Second (self-reported): 0.673
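Two of the reported numbers can be cross-checked against the others. The sketch below assumes, and the model card does not state this, that the reward margin is the mean chosen reward minus the mean rejected reward (the convention used by preference-optimization trainers that log metrics under these names) and that the runtime is in seconds.

```python
# Values copied from the self-reported evaluation results above.
rewards_chosen = -0.027
rewards_rejected = -1.036
eval_samples = 2000
eval_runtime_s = 93.591  # assumption: runtime is reported in seconds

# Assumption: margin = mean chosen reward - mean rejected reward.
margin = rewards_chosen - rewards_rejected

# Throughput is the number of samples divided by the wall-clock runtime.
samples_per_second = eval_samples / eval_runtime_s

print(f"margin = {margin:.3f}")                # reported: 1.009
print(f"samples/s = {samples_per_second:.3f}")  # reported: 21.370
```

Under these assumptions both derived values agree with the reported figures to three decimal places.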
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "OEvortex/HelpingAI-Lite" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "OEvortex/HelpingAI-Lite",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
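The same request as the curl call can be sent from Python. This is a minimal sketch using only the standard library. It assumes the server is listening on `localhost:30000` as in the commands above and that the response follows the OpenAI chat-completions shape (`choices[0].message.content`). `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    model: str = "OEvortex/HelpingAI-Lite",
    base_url: str = "http://localhost:30000",
) -> urllib.request.Request:
    """Build the POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style response with the reply under choices[0].
    return body["choices"][0]["message"]["content"]


# Requires the SGLang server to be running:
# print(ask("What is the capital of France?"))
```

Building the request separately from sending it keeps the payload easy to inspect without a live server.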