language:
  - es
metrics:
  - accuracy
library_name: keras
tags:
  - code

Model to detect Chat Intention

This model was trained for academic purposes to detect a user's intention while chatting with a bot.

Classes

The model was built as part of a college project on a chatbot for a healthcare organization. It predicts one of three classes:

0: Normal Conversation
1: Patient Information
2: Administrative Questions

IMPORTANT: The model was trained on Spanish sentences, so input text should be in Spanish.

Accuracy

The model reaches an accuracy of 0.85 (85%) on the 120 sentences covered by the classification report (40 per class).

Classification Report:
                          precision    recall  f1-score   support

     Normal conversation       0.87      0.82      0.85        40
     Patient information       0.83      0.85      0.84        40
Administrative questions       0.85      0.88      0.86        40

                accuracy                           0.85       120
               macro avg       0.85      0.85      0.85       120
            weighted avg       0.85      0.85      0.85       120
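
As a sanity check, the summary rows of the report can be recomputed from the per-class rows. The snippet below is a minimal sketch that copies the rounded values from the table, so its results are only accurate to about two decimals. The names `report`, `macro_avg`, and `weighted_avg` are illustrative and not part of the model's code.

```python
# Per-class values copied from the classification report (rounded to 2 decimals).
report = {
    "Normal conversation":      {"precision": 0.87, "recall": 0.82, "f1": 0.85, "support": 40},
    "Patient information":      {"precision": 0.83, "recall": 0.85, "f1": 0.84, "support": 40},
    "Administrative questions": {"precision": 0.85, "recall": 0.88, "f1": 0.86, "support": 40},
}

total = sum(row["support"] for row in report.values())


def macro_avg(metric):
    # Unweighted mean over classes
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric):
    # Mean over classes, weighted by the number of examples in each class
    return sum(row[metric] * row["support"] for row in report.values()) / total


# Each class contributes recall * support correct predictions,
# so accuracy is the support-weighted recall.
accuracy = weighted_avg("recall")

for metric in ("precision", "recall", "f1"):
    print(f"{metric:<9} macro={macro_avg(metric):.2f} weighted={weighted_avg(metric):.2f}")
print(f"accuracy  {accuracy:.2f} over {total} examples")
```

Because every class has the same support (40), the macro and weighted averages coincide, and each of them rounds to 0.85, matching the report.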

How to use it?

Use the following script:


import json
import numpy as np
import tensorflow as tf
from huggingface_hub import hf_hub_download
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences

repo_id = "pollitoconpapass/intent_classification_model"
tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.json")

# tokenizer.json holds the Keras tokenizer config plus an extra 'max_len' entry.
# Read max_len and remove it before rebuilding the tokenizer,
# because tokenizer_from_json expects a JSON string of the tokenizer config.
with open(tokenizer_path, 'r', encoding='utf-8') as f:
    loaded_tokenizer_config = json.load(f)

loaded_max_len = loaded_tokenizer_config['config'].pop('max_len')
loaded_tokenizer = tokenizer_from_json(json.dumps(loaded_tokenizer_config))

model_file_path = hf_hub_download(repo_id=repo_id, filename="intent_classification_model.keras")
loaded_model = tf.keras.models.load_model(model_file_path)

INTENT_MAP = {
    0: "Normal conversation",
    1: "Patient information",
    2: "Administrative questions"
}

def predict_single_sentence(sentence: str, max_len: int = loaded_max_len) -> tuple[str, float]:
    # Tokenize the sentence and pad it to the max_len stored with the tokenizer
    sequence = loaded_tokenizer.texts_to_sequences([sentence])
    padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post')

    # predict() returns one row of class scores per input; take the only row
    prediction = loaded_model.predict(padded_sequence, verbose=0)[0]

    # Predicted class index and its confidence as a percentage
    predicted_class = int(np.argmax(prediction))
    confidence = float(prediction[predicted_class]) * 100

    return INTENT_MAP[predicted_class], confidence


sentence = "Holaaaa"
intent, confidence = predict_single_sentence(sentence)
print(f"Intent: {intent} (Confidence: {confidence:.2f}%)")
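
To make the padding step concrete, here is a small NumPy-only sketch of what `pad_sequences(..., maxlen=..., padding='post')` does with its default `truncating='pre'`: short sequences get zeros appended, and sequences longer than the maximum length keep only their last tokens. The helper `pad_post` and the token ids are hypothetical and only illustrate the behavior. The script above should keep using `pad_sequences`.

```python
import numpy as np


def pad_post(sequences, max_len):
    """Illustrative stand-in for pad_sequences(maxlen=max_len, padding='post')."""
    padded = np.zeros((len(sequences), max_len), dtype="int32")
    for row, seq in enumerate(sequences):
        kept = seq[-max_len:]  # default truncating='pre' drops tokens from the front
        padded[row, :len(kept)] = kept  # padding='post' leaves zeros at the end
    return padded


# Hypothetical token ids: one short sentence and one longer than max_len
example = pad_post([[5, 12], [1, 2, 3, 4, 5, 6]], max_len=4)
print(example)
```

One practical consequence, under these defaults, is that a sentence longer than the stored `max_len` is classified from its final tokens only, so very long messages may be worth splitting before prediction.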