vluz / toxmodel20

	---
	license: cc0-1.0
	---

	Note: Due to the nature of toxic comments, the data and code contain explicit language.

	Data is from Kaggle's Toxic Comment Classification Challenge:
	<br>
	https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge/data?select=train.csv.zip

	A copy of the data is in the `data` directory.
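For orientation, the competition's `train.csv` has an `id` column, a `comment_text` column, and six binary label columns. The sketch below counts positive labels per column using only the standard library `csv` module rather than pandas. The helper name `count_positive_labels` and the two sample rows are made up for illustration, and the `data/train.csv` path mentioned afterwards is an assumption about how the copy is laid out.

```python
import csv
import io

# Label columns of the Kaggle train.csv, in file order.
LABEL_COLUMNS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def count_positive_labels(csv_file):
    """Count how many rows are marked 1 for each label column."""
    counts = {name: 0 for name in LABEL_COLUMNS}
    for row in csv.DictReader(csv_file):
        for name in LABEL_COLUMNS:
            counts[name] += int(row[name])
    return counts


# Two made-up rows in the same layout, used only to show the call.
sample = io.StringIO(
    "id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate\n"
    "a1,hello there,0,0,0,0,0,0\n"
    "a2,placeholder rude text,1,0,1,0,1,0\n"
)
sample_counts = count_positive_labels(sample)
print(sample_counts)
```

On the real file, the same function could be called on `open("data/train.csv", newline="", encoding="utf-8")`, assuming the archive has been unzipped to that path.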

	Trained for 20 epochs on a RunPod instance.

	### 🤗 Running demo here:
	https://huggingface.co/spaces/vluz/Tox

	<hr>

	Code requires pandas, tensorflow, and streamlit. All can be installed via `pip`.

	```python
	import os
	import pickle
	import streamlit as st
	import tensorflow as tf
	from tensorflow.keras.layers import TextVectorization


	@st.cache_resource
	def load_model():
	    # Load the trained Keras model once; Streamlit caches it across reruns.
	    model = tf.keras.models.load_model(os.path.join("model", "toxmodel.keras"))
	    return model


	@st.cache_resource
	def load_vectorizer():
	    # The pickle holds the TextVectorization config and weights.
	    with open(os.path.join("model", "vectorizer.pkl"), "rb") as f:
	        from_disk = pickle.load(f)
	    new_v = TextVectorization.from_config(from_disk['config'])
	    new_v.adapt(tf.data.Dataset.from_tensor_slices(["xyz"])) # fix for Keras bug
	    new_v.set_weights(from_disk['weights'])
	    return new_v


	st.title("Toxic Comment Test")
	st.divider()
	model = load_model()
	vectorizer = load_vectorizer()
	default_prompt = "i love you man, but fuck you!"
	input_text = st.text_area("Comment:", default_prompt, height=150).lower()
	if st.button("Test"):
	    # Warn about inputs the model handles poorly, then predict anyway.
	    if not input_text:
	        st.write("⚠ Warning: Empty prompt.")
	    elif len(input_text) < 15:
	        st.write("⚠ Warning: Model is far less accurate with a small prompt.")
	    if input_text == default_prompt:
	        st.write("Expected results from default prompt are positive for 0 and 2")
	    with st.spinner("Testing..."):
	        inputv = vectorizer([input_text])
	        output = model.predict(inputv)
	        # A label counts as positive when its score exceeds 0.5.
	        res = (output > 0.5)
	    st.write(["toxic","severe toxic","obscene","threat","insult","identity hate"], res)
	    st.write(output)
	```


	Put `toxmodel.keras` and `vectorizer.pkl` into the `model` dir.
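A quick way to confirm both files are in place before launching the app is a small check like the one below. This is only a sketch: the file and directory names come from this README, while the helper name `missing_model_files` is hypothetical and not part of the original code.

```python
from pathlib import Path

# File names taken from this README; the helper itself is illustrative.
REQUIRED_FILES = ("toxmodel.keras", "vectorizer.pkl")


def missing_model_files(model_dir="model"):
    """Return the required file names that are not present in model_dir."""
    base = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from model/:", ", ".join(missing))
    else:
        print("All model files found.")
```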

	Then do:
	```
	streamlit run toxtest.py
	```

	The expected result for the default prompt is positive for labels 0 (toxic) and 2 (obscene).
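The indices 0 and 2 are positions in the label list the app displays. The sketch below shows one way to turn a row of scores into named labels using the same 0.5 threshold. The function name `flagged_labels` and the example scores are made up for illustration; they are not real model output.

```python
# Label order taken from the list the app passes to st.write.
LABELS = ["toxic", "severe toxic", "obscene", "threat", "insult", "identity hate"]


def flagged_labels(scores, threshold=0.5):
    """Return (index, label, score) for every score above the threshold."""
    return [
        (i, LABELS[i], float(s))
        for i, s in enumerate(scores)
        if s > threshold
    ]


# Hypothetical scores chosen only to illustrate the mapping.
example_scores = [0.91, 0.04, 0.87, 0.01, 0.32, 0.02]
for index, label, score in flagged_labels(example_scores):
    print(index, label, score)
```

In the app, `model.predict` returns one row per input, so the scores for a single comment would be `output[0]`.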

	<hr>

	Full code can be found here:
	<br>
	https://github.com/vluz/ToxTest/
	<br>