# Chatbot Training with BART

## Overview
This project trains a chatbot using the facebook/bart-large-cnn model from Hugging Face's Transformers library. The chatbot is trained on a dataset of question-answer pairs and is capable of generating responses to user queries.
## Dependencies

Ensure you have the following libraries installed before running the script:

```bash
pip install transformers datasets torch
```
## Dataset

The chatbot is trained on a CSV dataset (`dataset.csv`) containing two columns:

- `question`: The input question.
- `answer`: The corresponding answer.

The dataset is loaded using the Hugging Face `datasets` library.
## Training Process

1. **Tokenization:**
   - Uses `AutoTokenizer` to process text.
   - Truncates and pads input to a maximum length of 256 tokens.
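A minimal sketch of the tokenization step, assuming the tokenizer that ships with the base model is used:

```python
from transformers import AutoTokenizer

# Load the tokenizer belonging to the base model.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

# Truncate and pad every example to exactly 256 tokens.
encoded = tokenizer(
    "What is the derivative of x^2?",
    max_length=256,
    truncation=True,
    padding="max_length",
)
print(len(encoded["input_ids"]))  # 256
```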
2. **Data Splitting:**
   - The dataset is split into a training set (80%) and an evaluation set (20%).
3. **Training Configuration:**
   - Uses the `Trainer` API for fine-tuning.
   - Trains for 10 epochs with a batch size of 12.
   - Saves checkpoints every epoch.
   - Loads the best model at the end.
4. **Model Saving:**
   - The trained model and tokenizer are saved in `./saved_model`.
## Inference (Generating Responses)

After training, you can generate responses using the `generate_text()` function. It supports parameters such as:

- `temperature`: Controls the randomness of responses.
- `top_p`: Nucleus sampling for response diversity.
- `repetition_penalty`: Discourages excessive repetition.
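The actual `generate_text()` lives in `chatbot.py` and its body is not shown in this README; a hedged sketch of how those parameters might feed into `model.generate` could look like this (the signature and default values are assumptions):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def generate_text(prompt, model, tokenizer, temperature=0.7, top_p=0.9,
                  repetition_penalty=1.2, max_new_tokens=64):
    """Generate a response for `prompt` via sampling (sketch, not the real code)."""
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=256)
    output_ids = model.generate(
        **inputs,
        do_sample=True,               # enables temperature / top_p sampling
        temperature=temperature,
        top_p=top_p,
        repetition_penalty=repetition_penalty,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Typical use after training:
# model = AutoModelForSeq2SeqLM.from_pretrained("./saved_model")
# tokenizer = AutoTokenizer.from_pretrained("./saved_model")
# print(generate_text("What is a derivative?", model, tokenizer))
```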
## Interactive Chatbot Mode

The script includes an interactive mode where users can input queries:

```bash
python chatbot.py
```

To exit, type `exit`.
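The interactive loop in `chatbot.py` presumably looks something like the sketch below. The `read`/`write` hooks are added here only to make the loop easy to exercise without a terminal; the real script can simply call `input` and `print`:

```python
def chat_loop(respond, read=input, write=print):
    """Read queries until the user types 'exit'; `respond` maps query -> reply."""
    while True:
        query = read("You: ")
        if query.strip().lower() == "exit":
            write("Goodbye!")
            break
        write("Bot: " + respond(query))

# Example with a stubbed respond function and scripted input:
queries = iter(["hello", "exit"])
replies = []
chat_loop(lambda q: q.upper(),
          read=lambda _: next(queries),
          write=replies.append)
print(replies)  # ['Bot: HELLO', 'Goodbye!']
```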
## Model Storage

- The trained model is stored in `./saved_model`.
- Training logs and checkpoints are stored in `./results` and `./logs`.
## Future Improvements

- Train on a larger dataset.
- Experiment with a different BART variant such as `facebook/bart-large-xsum`.
- Integrate a web-based frontend.
## Author
This project was created for research and development in chatbot training using transformer-based models.