---
language:
- es
metrics:
- accuracy
library_name: keras
tags:
- code
---
# Model to detect Chat Intention
This model was trained for academic purposes to classify the user's intent while chatting with a bot.
## Classes
The model was built for a college project on a chatbot for a healthcare organization. It predicts one of three classes:
- 0: Normal Conversation
- 1: Patient Information
- 2: Administrative Questions

IMPORTANT: The model was trained on Spanish sentences.
## Accuracy
The model reached an accuracy of 0.85 (85%) on 120 evaluation sentences, 40 per class.
```sh
Classification Report:
                         precision    recall  f1-score   support

     Normal conversation      0.87      0.82      0.85        40
     Patient information      0.83      0.85      0.84        40
Administrative questions      0.85      0.88      0.86        40

                accuracy                          0.85       120
               macro avg      0.85      0.85      0.85       120
            weighted avg      0.85      0.85      0.85       120
```
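
The macro averages in the report can be reproduced from the per-class rows. The following is a minimal sketch, assuming the rounded values shown above; the `report` dictionary and the `macro_average` helper are illustrative names, not part of the model repository. Because every class has the same support (40), the weighted average coincides with the macro average here.

```python
# Per-class (precision, recall, f1) copied from the classification report above.
report = {
    "Normal conversation": (0.87, 0.82, 0.85),
    "Patient information": (0.83, 0.85, 0.84),
    "Administrative questions": (0.85, 0.88, 0.86),
}


def macro_average(rows):
    """Unweighted mean of each metric across classes, rounded to 2 decimals."""
    n = len(rows)
    return tuple(round(sum(row[i] for row in rows) / n, 2) for i in range(3))


macro_precision, macro_recall, macro_f1 = macro_average(list(report.values()))
print(macro_precision, macro_recall, macro_f1)
```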
## How to use it?
The following script downloads the tokenizer and the model from the Hub and classifies a single sentence:
```py
import json

import numpy as np
import tensorflow as tf
from huggingface_hub import hf_hub_download
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences

repo_id = "pollitoconpapass/intent_classification_model"

# Download the tokenizer config from the Hub and read it
tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.json")
with open(tokenizer_path, 'r', encoding='utf-8') as f:
    loaded_tokenizer_config = json.load(f)

# max_len is stored inside the tokenizer config: read it, then remove it
# before the config is handed to tokenizer_from_json
loaded_max_len = loaded_tokenizer_config['config']['max_len']
del loaded_tokenizer_config['config']['max_len']
loaded_tokenizer = tokenizer_from_json(json.dumps(loaded_tokenizer_config))

# Download and load the model
model_file_path = hf_hub_download(repo_id=repo_id, filename="intent_classification_model.keras")
loaded_model = tf.keras.models.load_model(model_file_path)

INTENT_MAP = {
    0: "Normal conversation",
    1: "Patient information",
    2: "Administrative questions"
}


def predict_single_sentence(sentence, max_len=loaded_max_len) -> tuple[str, float]:
    # Tokenize the sentence and pad it to max_len (defaults to the saved value)
    sequence = loaded_tokenizer.texts_to_sequences([sentence])
    padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post')
    prediction = loaded_model.predict(padded_sequence, verbose=0)[0]  # first (only) prediction

    # Predicted class and its confidence as a percentage
    predicted_class = int(np.argmax(prediction))
    confidence = float(prediction[predicted_class]) * 100
    intent = INTENT_MAP[predicted_class]
    return intent, confidence


sentence = "Holaaaa"
intent, confidence = predict_single_sentence(sentence)
print(f"Intent: {intent} (Confidence: {confidence:.2f}%)")
```
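
The script above reports only the top class. To inspect the confidence for all three intents, the `prediction` vector can be ranked instead. This is a minimal sketch, assuming the model outputs one probability per class in the order of `INTENT_MAP`; `rank_intents` is a hypothetical helper and the example vector is made up for illustration, not real model output.

```python
INTENT_MAP = {
    0: "Normal conversation",
    1: "Patient information",
    2: "Administrative questions",
}


def rank_intents(probabilities):
    """Return (intent, confidence %) pairs sorted from most to least likely."""
    pairs = [(INTENT_MAP[i], float(p) * 100) for i, p in enumerate(probabilities)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up probability vector for illustration only (assumption, not model output).
example_prediction = [0.10, 0.75, 0.15]
for intent, confidence in rank_intents(example_prediction):
    print(f"{intent}: {confidence:.2f}%")
```

With the real model, the `prediction` array computed inside `predict_single_sentence` could be passed to `rank_intents` directly, since the helper only iterates over the values.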