kingabzpro/Phi-3.5-mini-instruct-Ecommerce-Text-Classification

	---
	datasets:
	- saurabhshahane/ecommerce-text-classification
	language:
	- en
	library_name: transformers
	license: apache-2.0
	metrics:
	- accuracy
	- f1
	pipeline_tag: text-generation
	tags:
	- Ecommerce
	- Phi-3.5
	- Fine-tuned
	---

	## Phi-3.5-mini-instruct-Ecommerce-Text-Classification
	This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct), trained on the [saurabhshahane/ecommerce-text-classification](https://www.kaggle.com/datasets/saurabhshahane/ecommerce-text-classification) dataset.

	## Tutorial

	Phi-3.5-mini-instruct is customized here to classify e-commerce product text into one of four categories: Electronics, Household, Books, or Clothing.

	## Use with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
	import torch

	model_id = "kingabzpro/Phi-3.5-mini-instruct-Ecommerce-Text-Classification"

	tokenizer = AutoTokenizer.from_pretrained(model_id)

	# Load the fine-tuned weights in half precision; device_map="auto" places them on the available device(s).
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    return_dict=True,
	    low_cpu_mem_usage=True,
	    torch_dtype=torch.float16,
	    device_map="auto",
	    trust_remote_code=True,
	)

	text = "Inalsa Dazzle Glass Top, 3 Burner Gas Stove with Rust Proof Powder Coated Body, Black Toughened Glass Top, 2 Medium and 1 Small High Efficiency Brass Burners, Aluminum Mixing Tubes, Powder Coated Body, Inbuilt Stainless Steel Drip Trays, 360 degree Swivel Nozzle,Bigger Legs to Facilitate Cleaning Under Cooktop"

	# The prompt names the four categories and ends with "label:" so the model completes it with a category.
	prompt = (
	    "Classify the E-commerce text into Electronics, Household, Books and Clothing.\n"
	    f"text: {text}\n"
	    "label: "
	).strip()

	# The model is already loaded and placed, so the pipeline only needs the model and tokenizer.
	pipe = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)

	# A few new tokens are enough for a single category name.
	outputs = pipe(prompt, max_new_tokens=4, do_sample=True, temperature=0.1)

	# The generated text includes the prompt; keep only what follows the final "label:".
	print(outputs[0]["generated_text"].split("label:")[-1].strip())

	# Household
	```
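
Because the model returns free text rather than a class index, it can help to map the completion back onto the four category names. The sketch below is one possible way to do that and is not part of the original model card: `build_prompt`, `parse_label`, and the `"none"` fallback value are hypothetical names introduced for illustration. It reuses the prompt wording from the snippet above and needs no model to run.

```python
# Category names taken from the prompt in the snippet above.
LABELS = ("Electronics", "Household", "Books", "Clothing")


def build_prompt(text: str) -> str:
    """Build the classification prompt with the same wording as the snippet above."""
    return (
        "Classify the E-commerce text into Electronics, Household, Books and Clothing.\n"
        f"text: {text}\n"
        "label: "
    ).strip()


def parse_label(generated_text: str, default: str = "none") -> str:
    """Map generated text to a known category.

    Looks at what follows the last "label:" marker and returns the first
    category it starts with (case-insensitive). The `default` value for
    unrecognized completions is an assumption made for this sketch.
    """
    answer = generated_text.split("label:")[-1].strip().lower()
    for label in LABELS:
        if answer.startswith(label.lower()):
            return label
    return default


if __name__ == "__main__":
    prompt = build_prompt("3 Burner Gas Stove with Toughened Glass Top")
    # Simulate a completion appended to the prompt, as the pipeline returns it.
    print(parse_label(prompt + " Household"))
```

With the pipeline from the snippet above, the call would look like `parse_label(outputs[0]["generated_text"])`.
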
	## Results

	```bash
	Accuracy: 0.860
	Accuracy for label Electronics: 0.825
	Accuracy for label Household: 0.926
	Accuracy for label Books: 0.683
	Accuracy for label Clothing: 0.947
	```
	Classification Report:

	```bash
	              precision    recall  f1-score   support

	 Electronics       0.97      0.82      0.89        40
	   Household       0.88      0.93      0.90        81
	       Books       0.90      0.68      0.78        41
	    Clothing       0.88      0.95      0.91        38

	   micro avg       0.90      0.86      0.88       200
	   macro avg       0.91      0.85      0.87       200
	weighted avg       0.90      0.86      0.88       200
	```

	Confusion Matrix:

	```bash
	[[33  6  1  0]
	 [ 1 75  2  3]
	 [ 0  3 28  2]
	 [ 0  1  0 36]]
	```
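
The reported figures can be cross-checked against one another. The sketch below assumes that the matrix rows are true labels and the columns are predicted labels, both in the order Electronics, Household, Books, Clothing. The card does not state this, but the numbers agree with it. Per-label accuracy is then the diagonal count divided by the support, and precision is the diagonal count divided by the column sum. Under that reading, the Books and Clothing rows sum to 33 and 37 against supports of 41 and 38, which would be consistent with a few generations falling outside the four labels.

```python
# Values copied from the Results section; the row/column orientation is an assumption.
LABELS = ["Electronics", "Household", "Books", "Clothing"]
SUPPORT = [40, 81, 41, 38]
CONFUSION = [
    [33, 6, 1, 0],
    [1, 75, 2, 3],
    [0, 3, 28, 2],
    [0, 1, 0, 36],
]


def per_label_metrics(confusion, support):
    """Return {label: (precision, recall, f1)} from a confusion matrix and supports."""
    metrics = {}
    for i, label in enumerate(LABELS):
        correct = confusion[i][i]
        predicted = sum(row[i] for row in confusion)  # column sum
        precision = correct / predicted
        recall = correct / support[i]  # same as the per-label accuracy above
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = (precision, recall, f1)
    return metrics


metrics = per_label_metrics(CONFUSION, SUPPORT)
accuracy = sum(CONFUSION[i][i] for i in range(len(LABELS))) / sum(SUPPORT)

for label, (precision, recall, f1) in metrics.items():
    print(f"{label:<12} precision={precision:.2f} recall={recall:.3f} f1={f1:.2f}")
print(f"accuracy={accuracy:.3f}")
```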