# Question Difficulty Classification Model
## Introduction
This project classifies question-answer pairs by difficulty as Easy, Medium, or Hard. You can pass a single question-answer pair (question and answer separated by a comma) or a list of question-answer pairs to the model.
For this task, I fine-tuned the pre-trained [bert-base-cased](https://huggingface.co/bert-base-cased) model on the [Question-Answer Dataset](https://www.kaggle.com/datasets/rtatman/questionanswer-dataset) by [Carnegie Mellon University](https://www.cmu.edu/).
## Table of Contents
- [Model Details](#model-details)
- [Dependencies](#dependencies)
- [How to Use the Model](#how-to-use-the-model)
- [Risks, Limitations and Biases](#risks-limitations-and-biases)
- [Training](#training)
## Model Details
**Model Description:** This model is a fine-tuned checkpoint of [bert-base-cased](https://huggingface.co/bert-base-cased), which was pretrained on a large corpus of English data in a self-supervised fashion.
This model reaches an accuracy of 95% on the dev set (for comparison, the bert-base-uncased version reaches an accuracy of 97%).
- **Developed by:** Hugging Face
- **Model Type:** Text Classification
- **Language(s):** English
- **License:** Apache-2.0
- **Parent Model:** For more details about BERT, we encourage users to check out [this model card](https://huggingface.co/bert-base-cased).
- **Resources for more information:**
    - [Model Documentation](https://huggingface.co/docs/transformers/main/en/model_doc/bert)
## Dependencies
- Transformers
- TensorFlow
- Python 3.7.13
- NumPy
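If these are not already available, a typical pip-based setup could look like the following (this card does not pin package versions, so you may need to match them to your environment):
```bash
pip install transformers tensorflow numpy
```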
## How to Use the Model
1. Import the essential libraries:

```python
from transformers import TFBertModel   # needed so Keras can resolve the BERT layer in the saved model
from transformers import BertTokenizer
import tensorflow as tf
```
2. Load the model and tokenizer:
```python
# If loading fails because of the custom BERT layer, passing
# custom_objects={'TFBertModel': TFBertModel} to load_model may help.
questionclassification_model = tf.keras.models.load_model('<path to the model>')
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
```
3. Define the essential functions:
```python
def prepare_data(input_text):
    # Tokenize a batch of "question,answer" strings into fixed-length tensors
    token = tokenizer.batch_encode_plus(
        input_text,
        max_length=256,
        truncation=True,
        padding='max_length',
        add_special_tokens=True,
        return_tensors='tf'
    )
    # Cast to float64 to match the input dtype the saved model expects
    return {
        'input_ids': tf.cast(token['input_ids'], tf.float64),
        'attention_mask': tf.cast(token['attention_mask'], tf.float64)
    }

def make_prediction(model, processed_data, classes=['Easy', 'Medium', 'Hard']):
    # Predict class probabilities, then map each pair to its most likely class
    probs = model.predict(processed_data)
    best = probs.argmax(axis=1)
    labels = [classes[i] for i in best]
    return labels, probs
```
4. Make predictions on a list of question-answer pairs:
```python
input_text = ["What is Gandhi commonly considered to be?,Father of the Nation in India",
              "What is the long-term warming of the planet's overall temperature called?,Global Warming"]
processed_data = prepare_data(input_text)
result, prob = make_prediction(questionclassification_model, processed_data=processed_data)
for i in range(len(result)):
    # Print the predicted class alongside its probability
    print(f"{result[i]} : {max(prob[i])}")
```
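As noted in the introduction, a single question-answer pair works as well; wrap it in a one-element list so `prepare_data` still receives a batch. The pair below is a hypothetical example:
```python
single_pair = ["Who wrote Hamlet?,William Shakespeare"]  # hypothetical question-answer pair
result, prob = make_prediction(questionclassification_model,
                               processed_data=prepare_data(single_pair))
print(f"{result[0]} : {max(prob[0])}")
```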
## Risks, Limitations and Biases
- The model rarely predicts the Easy category.
- 90% of the Easy questions in the dataset are yes/no questions.
- Very few public datasets are available for question difficulty classification.
- Only subject-matter experts can reliably build a dataset for this task; without that expertise, the model will produce wrong results.
## Training
#### Training Data
I used the [Question-Answer Dataset](https://www.kaggle.com/datasets/rtatman/questionanswer-dataset) by [Carnegie Mellon University](https://www.cmu.edu/) for this task.
#### Training Procedure
##### Fine-tuning hyper-parameters
The following hyper-parameters were used for fine-tuning (a sketch of how they fit together follows the list):
- learning_rate = 1e-5
- decay = 1e-6
- optimizer = Adam
- loss function = categorical cross-entropy
- max_length = 256
- num_train_epochs = 10
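Below is a minimal sketch of how these hyper-parameters could be wired together in TensorFlow/Keras. The classification head (a softmax layer over the pooled BERT output) and the input dtypes are assumptions for illustration; only the optimizer, loss, sequence length, and epoch count come from this card.
```python
import tensorflow as tf
from transformers import TFBertModel

MAX_LENGTH = 256
NUM_CLASSES = 3  # Easy, Medium, Hard

# Pre-trained BERT backbone
bert = TFBertModel.from_pretrained('bert-base-cased')

# Input names match the keys returned by prepare_data above
input_ids = tf.keras.layers.Input(shape=(MAX_LENGTH,), dtype=tf.int32, name='input_ids')
attention_mask = tf.keras.layers.Input(shape=(MAX_LENGTH,), dtype=tf.int32, name='attention_mask')

# Pooled [CLS] representation of the question-answer pair (assumed head)
pooled = bert(input_ids, attention_mask=attention_mask)[1]
output = tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')(pooled)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5, decay=1e-6),  # legacy Keras `decay` argument
    loss='categorical_crossentropy',  # expects one-hot labels over the 3 classes
    metrics=['accuracy'],
)

# model.fit(train_features, train_labels, epochs=10)
```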