# Steel Material Classification Model Upload Guide
## Step 1: Get Hugging Face Token
1. Go to https://huggingface.co/settings/tokens
2. Click "New token"
3. Give it a name (e.g., "model-upload-token")
4. Select "Write" role
5. Copy the token
## Step 2: Login to Hugging Face
```bash
huggingface-cli login
# Enter your token when prompted
```
## Step 3: Create Model Repository
```bash
huggingface-cli repo create steel-material-classifier --type model
```
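The same step can also be done from Python. This is a minimal sketch, assuming `huggingface_hub` is installed; `YOUR_USERNAME` (passed as `username`) is a placeholder for your Hugging Face namespace, and `repo_id_for`/`ensure_repo` are illustrative helper names, not part of any library.

```python
REPO_NAME = "steel-material-classifier"

def repo_id_for(username: str, name: str = REPO_NAME) -> str:
    """Build the fully-qualified repo id used by create_repo/upload."""
    return f"{username}/{name}"

def ensure_repo(username: str) -> str:
    """Create the model repo if it does not already exist."""
    # Imported lazily so repo_id_for stays dependency-free.
    from huggingface_hub import create_repo
    repo_id = repo_id_for(username)
    create_repo(repo_id, repo_type="model", exist_ok=True)
    return repo_id
```

`exist_ok=True` makes the call safe to re-run if the repo was already created via the CLI.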
## Step 4: Upload Model
```bash
# Make sure Git LFS is installed; large files such as model.safetensors need it
git lfs install

# Clone the repository
git clone https://huggingface.co/YOUR_USERNAME/steel-material-classifier
cd steel-material-classifier

# Copy all files from the model_v24 directory, then commit and push
cp /path/to/model_v24/* .
git add .
git commit -m "Initial commit: Steel material classification model"
git push
```
## Alternative: Direct Upload
```bash
# From the model_v24 directory
# (--include takes one or more space-separated glob patterns)
huggingface-cli upload YOUR_USERNAME/steel-material-classifier . --include "*.json" "*.safetensors" "*.pkl" "*.md" "*.txt" "*.py"
```
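The CLI upload above can likewise be sketched with the `huggingface_hub` Python API. This assumes `huggingface_hub` is installed and that you pass your own repo id; `upload_model` is an illustrative helper name, and `allow_patterns` mirrors the `--include` patterns from the CLI command.

```python
# Patterns matching the CLI --include list above
ALLOW_PATTERNS = ["*.json", "*.safetensors", "*.pkl", "*.md", "*.txt", "*.py"]

def upload_model(repo_id: str, folder: str = ".") -> None:
    """Upload the model directory, keeping only the whitelisted file types."""
    from huggingface_hub import upload_folder
    upload_folder(
        repo_id=repo_id,
        folder_path=folder,
        repo_type="model",
        allow_patterns=ALLOW_PATTERNS,
    )
```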
## Files to Upload
### Required Files:
- βœ… config.json
- βœ… model.safetensors
- βœ… tokenizer.json
- βœ… tokenizer_config.json
- βœ… special_tokens_map.json
- βœ… label_mapping.json
### Optional Files:
- βœ… classifier.pkl
- βœ… label_embeddings.pkl
- βœ… label_embeddings.pkl.backup
### Documentation Files:
- βœ… README.md
- βœ… requirements.txt
- βœ… inference.py
- βœ… preprocessor.py
- βœ… model_card.md
- βœ… usage.md
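Before uploading, the required files from the checklist above can be verified with a small pure-stdlib script; `missing_files` is an illustrative helper name.

```python
from pathlib import Path

# Required files, taken from the checklist above
REQUIRED = [
    "config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "label_mapping.json",
]

def missing_files(model_dir: str, required=REQUIRED) -> list:
    """Return the required files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]
```

An empty return value means the upload can proceed; otherwise the list names exactly what is still missing.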
## Model Information
- **Model Name**: steel-material-classifier
- **Base Model**: XLM-RoBERTa
- **Task**: Sequence Classification
- **Labels**: 66 steel industry materials
- **Languages**: Korean, English
- **Model Size**: ~1GB
## Usage After Upload
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model
model_name = "YOUR_USERNAME/steel-material-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Predict
text = "철광석을 κ³ λ‘œμ—μ„œ ν™˜μ›ν•˜μ—¬ 선철을 μ œμ‘°ν•˜λŠ” κ³Όμ •"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=1).item()
label = model.config.id2label[predicted_class]
confidence = predictions[0][predicted_class].item()
print(f"Predicted: {label} (Confidence: {confidence:.4f})")
```
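The softmax and argmax steps above can be sanity-checked without loading the model, using a tiny pure-Python mirror of the same math; `top_prediction` is an illustrative helper name, not part of Transformers.

```python
import math

def top_prediction(logits, id2label):
    """Mirror the softmax + argmax step above on a plain list of logits."""
    m = max(logits)                               # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)
    return id2label[idx], probs[idx]
```

This is handy for checking that a hand-built `id2label` mapping (e.g. loaded from `label_mapping.json`) lines up with raw logit indices before wiring it into the full pipeline.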