	---
	language:
	- es
	metrics:
	- accuracy
	library_name: keras
	tags:
	- code
	---
	# Model to detect Chat Intention

	This model was trained for academic purposes to detect a user's intent while chatting with a bot.

	## Classes
	The model was built for a college project on a chatbot for a healthcare organization. It predicts one of three classes:
	- 0: Normal Conversation
	- 1: Patient Information
	- 2: Administrative Questions

	IMPORTANT: The model was trained on Spanish sentences.

	## Accuracy
	The model reaches an accuracy of 0.85 (85%) on the 120 sentences covered by the report below (40 per class).

	```sh
	Classification Report:
	                          precision    recall  f1-score   support

	     Normal conversation       0.87      0.82      0.85        40
	     Patient information       0.83      0.85      0.84        40
	Administrative questions       0.85      0.88      0.86        40

	                accuracy                           0.85       120
	               macro avg       0.85      0.85      0.85       120
	            weighted avg       0.85      0.85      0.85       120
	```
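
As a quick sanity check on the report above, accuracy in a single-label multiclass setting equals the support-weighted mean of the per-class recall, and the macro average is the plain mean of the per-class scores. The sketch below recomputes both from the rounded values in the report, so it can only agree to about two decimals. The variable names are illustrative and are not part of the model's code.

```python
# Per-class values copied from the classification report (rounded to 2 decimals).
report = {
    "Normal conversation":      {"precision": 0.87, "recall": 0.82, "f1": 0.85, "support": 40},
    "Patient information":      {"precision": 0.83, "recall": 0.85, "f1": 0.84, "support": 40},
    "Administrative questions": {"precision": 0.85, "recall": 0.88, "f1": 0.86, "support": 40},
}

total_support = sum(row["support"] for row in report.values())

# Accuracy = (sum over classes of recall * support) / total number of samples.
accuracy = sum(row["recall"] * row["support"] for row in report.values()) / total_support

# Macro average = unweighted mean of the per-class scores.
macro_f1 = sum(row["f1"] for row in report.values()) / len(report)
macro_precision = sum(row["precision"] for row in report.values()) / len(report)

print(f"samples: {total_support}")
print(f"accuracy (from recall): {accuracy:.2f}")
print(f"macro precision: {macro_precision:.2f}")
print(f"macro f1: {macro_f1:.2f}")
```

Because every class has the same support (40), the weighted and macro averages coincide, which matches the last two rows of the report.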


	## How to use it?
	Use the following script:
	```py
	import json
	import numpy as np
	import tensorflow as tf
	from huggingface_hub import hf_hub_download
	from tensorflow.keras.preprocessing.text import tokenizer_from_json
	from tensorflow.keras.preprocessing.sequence import pad_sequences

	repo_id = "pollitoconpapass/intent_classification_model"

	# Download and load the tokenizer configuration
	tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.json")
	with open(tokenizer_path, 'r', encoding='utf-8') as f:
	    loaded_tokenizer_config = json.load(f)

	# 'max_len' is stored inside the tokenizer config but is not a Tokenizer
	# argument, so read it and remove it before rebuilding the tokenizer
	loaded_max_len = loaded_tokenizer_config['config'].pop('max_len')
	loaded_tokenizer = tokenizer_from_json(json.dumps(loaded_tokenizer_config))

	# Download and load the trained model
	model_file_path = hf_hub_download(repo_id=repo_id, filename="intent_classification_model.keras")
	loaded_model = tf.keras.models.load_model(model_file_path)

	INTENT_MAP = {
	    0: "Normal conversation",
	    1: "Patient information",
	    2: "Administrative questions"
	}

	def predict_single_sentence(sentence, max_len=loaded_max_len) -> tuple[str, float]:
	    # Tokenize the sentence and pad it to the length used in training
	    sequence = loaded_tokenizer.texts_to_sequences([sentence])
	    padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post')

	    # The batch holds a single sentence, so keep only the first prediction
	    prediction = loaded_model.predict(padded_sequence, verbose=0)[0]

	    # Predicted class and its confidence as a percentage
	    predicted_class = int(np.argmax(prediction))
	    confidence = float(prediction[predicted_class]) * 100

	    intent = INTENT_MAP[predicted_class]
	    return intent, confidence


	sentence = "Holaaaa"
	intent, confidence = predict_single_sentence(sentence)
	print(f"Intent: {intent} (Confidence: {confidence:.2f}%)")
	```
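
The last step of the script, turning the model output into an intent label and a confidence value, can be tried on its own without downloading anything. The sketch below is a minimal illustration that assumes the model returns one probability per class (for example from a softmax layer), which the card does not state explicitly. The helper name `decode_prediction`, the `threshold` parameter and the `"Uncertain"` fallback label are hypothetical and are not part of this repository.

```python
import numpy as np

INTENT_MAP = {
    0: "Normal conversation",
    1: "Patient information",
    2: "Administrative questions",
}

def decode_prediction(probabilities, threshold=0.5):
    """Map a vector of class probabilities to (intent, confidence in percent).

    `threshold` is a hypothetical cut-off: when the best class scores below it,
    the function returns the placeholder label "Uncertain" instead of an intent.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    predicted_class = int(np.argmax(probabilities))   # index of the highest score
    best_score = float(probabilities[predicted_class])
    confidence = best_score * 100                     # express as a percentage
    if best_score < threshold:
        return "Uncertain", confidence
    return INTENT_MAP[predicted_class], confidence

# Made-up probability vectors, standing in for a model output row.
confident_intent, confident_score = decode_prediction([0.10, 0.75, 0.15])
unsure_intent, unsure_score = decode_prediction([0.40, 0.30, 0.30])

print(f"Intent: {confident_intent} (Confidence: {confident_score:.2f}%)")
print(f"Intent: {unsure_intent} (Confidence: {unsure_score:.2f}%)")
```

A fallback like this may be useful in a chatbot, where asking the user to rephrase is often preferable to acting on a low-confidence guess. The value 0.5 is arbitrary and would need tuning on real data.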