# Chatbot Training with BART

## Overview
This project trains a chatbot using the facebook/bart-large-cnn model from Hugging Face's Transformers library. The chatbot is trained on a dataset of question-answer pairs and is capable of generating responses to user queries.
## Dependencies

Ensure you have the following libraries installed before running the script:

```bash
pip install transformers datasets torch
```
## Dataset

The chatbot is trained on a CSV dataset (`dataset.csv`) containing two columns:

- `question`: The input question.
- `answer`: The corresponding answer.

The dataset is loaded using the Hugging Face `datasets` library.
## Training Process

1. **Tokenization:**
   - Uses `AutoTokenizer` to process text.
   - Truncates and pads input to a maximum length of 256 tokens.
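A minimal sketch of the tokenization step, assuming the tokenizer that ships with the base model is used:

```python
from transformers import AutoTokenizer

# Load the tokenizer belonging to the base model.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

# Truncate and pad every example to exactly 256 tokens.
encoded = tokenizer(
    "What is the derivative of x^2?",
    max_length=256,
    truncation=True,
    padding="max_length",
)
print(len(encoded["input_ids"]))  # 256
```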
2. **Data Splitting:**
   - The dataset is split into a training set (80%) and an evaluation set (20%).
3. **Training Configuration:**
   - Uses the `Trainer` API for fine-tuning.
   - Trains for 10 epochs with a batch size of 12.
   - Saves checkpoints every epoch.
   - Loads the best model at the end.
4. **Model Saving:**
   - The trained model and tokenizer are saved in `./saved_model`.
## Inference (Generating Responses)

After training, you can generate responses using the `generate_text()` function. It supports parameters such as:

- `temperature`: Controls the randomness of responses.
- `top_p`: Nucleus sampling for response diversity.
- `repetition_penalty`: Discourages excessive repetition.
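The actual `generate_text()` lives in `chatbot.py` and its body is not shown in this README; a hedged sketch of how those parameters might feed into `model.generate` could look like this (the signature and default values are assumptions):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def generate_text(prompt, model, tokenizer, temperature=0.7, top_p=0.9,
                  repetition_penalty=1.2, max_new_tokens=64):
    """Generate a response for `prompt` via sampling (sketch, not the real code)."""
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=256)
    output_ids = model.generate(
        **inputs,
        do_sample=True,               # enables temperature / top_p sampling
        temperature=temperature,
        top_p=top_p,
        repetition_penalty=repetition_penalty,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Typical use after training:
# model = AutoModelForSeq2SeqLM.from_pretrained("./saved_model")
# tokenizer = AutoTokenizer.from_pretrained("./saved_model")
# print(generate_text("What is a derivative?", model, tokenizer))
```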
## Interactive Chatbot Mode

The script includes an interactive mode where users can input queries:

```bash
python chatbot.py
```

To exit, type `exit`.
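The interactive loop in `chatbot.py` presumably looks something like the sketch below. The `read`/`write` hooks are added here only to make the loop easy to exercise without a terminal; the real script can simply call `input` and `print`:

```python
def chat_loop(respond, read=input, write=print):
    """Read queries until the user types 'exit'; `respond` maps query -> reply."""
    while True:
        query = read("You: ")
        if query.strip().lower() == "exit":
            write("Goodbye!")
            break
        write("Bot: " + respond(query))

# Example with a stubbed respond function and scripted input:
queries = iter(["hello", "exit"])
replies = []
chat_loop(lambda q: q.upper(),
          read=lambda _: next(queries),
          write=replies.append)
print(replies)  # ['Bot: HELLO', 'Goodbye!']
```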
## Model Storage

- The trained model is stored in `./saved_model`.
- Training logs and checkpoints are stored in `./results` and `./logs`.
## Future Improvements

- Train on a larger dataset.
- Experiment with a different BART variant such as `facebook/bart-large-xsum`.
- Integrate a web-based frontend.
## Author
This project was created for research and development in chatbot training using transformer-based models.