Aeona | Chatbot
A generative AI made using microsoft/DialoGPT-small.
It is recommended to use it alongside an AIML chatbot to reduce load, get better replies, and add a name and personality to your bot. Using an AIML chatbot also lets you hardcode some replies.
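The pairing described above can be sketched roughly as follows. This is only an illustration, not how Aeona itself is wired up: a plain dictionary stands in for the AIML rules, and `make_hybrid_responder` and `fake_model` are hypothetical names invented for this example.

```python
from typing import Callable, Dict


def make_hybrid_responder(
    hardcoded: Dict[str, str],
    generate: Callable[[str], str],
) -> Callable[[str], str]:
    """Answer from hardcoded rules when possible, else fall back to the model."""
    # Normalise keys once so lookups ignore case and surrounding whitespace.
    rules = {key.strip().lower(): reply for key, reply in hardcoded.items()}

    def respond(message: str) -> str:
        key = message.strip().lower()
        if key in rules:
            return rules[key]      # cheap rule-based reply, no model call
        return generate(message)   # generative fallback

    return respond


# Hypothetical stand-in for a call into the generative model.
def fake_model(message: str) -> str:
    return "(model reply to: {})".format(message)


bot = make_hybrid_responder({"What is your name?": "I'm Aeona!"}, fake_model)
print(bot("what is your name?"))
print(bot("Tell me a story"))
```

Only messages that miss every rule reach the model, which is where the reduced load would come from.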
AEONA
Aeona is a chatbot which hopes to be able to talk with humans as if it were a friend! Its main target platform is Discord. You can invite the bot here.
To learn more about this project and chat with the AI, you can use this website.
Aeona works by using the context of the previous messages, guessing the personality of the human who is talking with it, and adapting its own personality to better talk with the user.
Goals
The goal is to create an AI which will work with AIML in order to create the most human-like AI.
Why not an AI on its own?
For an AI it is not (realistically) possible to learn about the user and store data on them, whereas an AIML bot can do so and can even execute code! The goal of the AI is to generate responses where the AIML fails.
Hence the goal becomes to make an AI which has a wide variety of knowledge, yet is as small as possible! So we use 3 datasets:
- Movie lines: these promote longer and more thought-out responses, but they can be very random. About 200k lines!
- Discord messages: these cover a wide variety of topics and have been filtered to remove spam. This makes the AI highly random, but gives it a response to everyday questions! About 120 million messages!
- A custom dataset scraped from my messages. These messages are very narrow: training on this dataset and then sending a random message will make the AI say sorry loads of times!
Training
The Discord messages dataset simply dwarfs the other datasets, hence the datasets are repeated. This leads to them covering each other's issues!
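One way such repetition could be done is sketched below. The actual repetition factors and mixing procedure are not stated here, so this is an assumption: `repeat_to_balance` is a hypothetical helper that repeats each smaller dataset until it matches the largest one, then shuffles the result.

```python
import math
import random
from typing import Dict, List


def repeat_to_balance(datasets: Dict[str, List[str]], seed: int = 0) -> List[str]:
    """Repeat each smaller dataset so it matches the largest one in size."""
    target = max(len(lines) for lines in datasets.values())
    mixed: List[str] = []
    for lines in datasets.values():
        if not lines:
            continue  # nothing to repeat
        repeats = math.ceil(target / len(lines))
        # Repeat, then trim so every dataset contributes exactly `target` lines.
        mixed.extend((lines * repeats)[:target])
    # Shuffle with a fixed seed so the mix is reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed


mixed = repeat_to_balance({"big": ["x"] * 6, "small": ["y", "z"]})
print(len(mixed))
```

Full balancing is just one choice; a milder repetition factor would keep the larger dataset dominant while still exposing the model to the smaller ones more often.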
The AI has a context of 6 messages, which means it will reply until the 4th message from the user.
Tips for Hugging Face inference
I recommend sending the user input along with the previous 3 AI and human responses.
Using more context than this will lead to useless responses. Using less is alright, but the responses may be random.
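The tip above can be sketched as a small helper that keeps only the most recent messages and joins them the way DialoGPT-style models expect, with an end-of-text token after every turn. `build_prompt` is a hypothetical name, and the hardcoded token is an assumption for illustration; in practice `tokenizer.eos_token` should be used instead.

```python
from typing import List

EOS = "<|endoftext|>"  # assumed: the GPT-2/DialoGPT end-of-text token


def build_prompt(history: List[str], user_input: str,
                 max_history: int = 6, eos_token: str = EOS) -> str:
    """Join the last `max_history` messages and the new input, each ended by EOS."""
    # history[-0:] would return everything, so handle zero explicitly.
    recent = history[-max_history:] if max_history > 0 else []
    return "".join(turn + eos_token for turn in recent + [user_input])


# Alternating human / AI messages, oldest first.
history = ["h1", "a1", "h2", "a2", "h3", "a3", "h4", "a4"]
prompt = build_prompt(history, "new message")
print(prompt)
```

With the default `max_history=6`, the prompt holds the previous 3 human and 3 AI messages plus the new user input; older messages are dropped.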
Evaluation
Below is a comparison of Aeona vs. other baselines on the mixed dataset given above using automatic evaluation metrics.
| Model | Perplexity |
|---|---|
| Seq2seq Baseline [3] | 29.8 |
| Wolf et al. [5] | 16.3 |
| GPT-2 baseline | 99.5 |
| DialoGPT baseline | 56.6 |
| DialoGPT finetuned | 11.4 |
| PersonaGPT | 10.2 |
| Aeona | 7.9 |
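For readers unfamiliar with the metric: perplexity is conventionally the exponential of the average negative log-likelihood per token, so lower is better. How the figures in the table were computed is not stated here; the sketch below only illustrates the standard definition, and `perplexity` is a name chosen for this example.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that gives every token probability 0.25 is as uncertain
# as a uniform choice between 4 options.
print(perplexity([math.log(0.25)] * 4))
```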
Usage
Example:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepparag/Aeona")
model = AutoModelForCausalLM.from_pretrained("deepparag/Aeona")

# Let's chat for 4 lines
for step in range(4):
    # encode the new user input, add the eos_token and return a PyTorch tensor
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total length (history plus reply) to 200 tokens
    chat_history_ids = model.generate(
        bot_input_ids, max_length=200,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=4,
        do_sample=True,
        top_k=100,
        top_p=0.7,
        temperature=0.8
    )

    # pretty print the last output tokens from the bot
    print("Aeona: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
