# πŸš€ MLOps Platform Startup Guide
Welcome to the MLOps Training Platform! This guide will help you get started quickly.
## ⚑ Quick Launch
### Option 1: Streamlit Web Interface (Recommended)
```bash
# Activate your virtual environment
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate
# Launch the Streamlit app
streamlit run streamlit_app.py
# The app will open in your browser at http://localhost:8501
```
### Option 2: Programmatic Usage
```bash
# Run the example script
python example_usage.py
```
### Option 3: FastAPI Backend (Original)
```bash
# Run the FastAPI server
python -m uvicorn app.main:app --reload
# API will be available at http://localhost:8000
# Interactive docs at http://localhost:8000/docs
```
## πŸ“¦ First-Time Setup Checklist
- [ ] Python 3.8+ installed
- [ ] Virtual environment created (`python -m venv venv`)
- [ ] Virtual environment activated
- [ ] Dependencies installed (`pip install -r requirements.txt`)
- [ ] At least 4GB RAM available
- [ ] Internet connection (for downloading models)
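The checklist above can be verified programmatically. A minimal sketch — the package names below are the usual ones for this stack and are an assumption; check them against `requirements.txt`:

```python
import importlib.util
import sys

def check_environment(packages=("streamlit", "transformers", "torch", "pandas")):
    """Report Python version and which expected packages are importable."""
    report = {"python_ok": sys.version_info >= (3, 8)}
    for name in packages:
        # find_spec returns None when the package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report

if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{'OK ' if ok else 'MISSING '} {item}")
```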
## 🎯 Your First Training Session
### 1. Prepare Your Data
Create a CSV file with these columns:
- `text` - Your text samples
- `label` - Binary labels (0 or 1)
**Example: phishing_data.csv**
```csv
text,label
"Legitimate business email content",0
"URGENT: Click here to claim prize!",1
"Meeting scheduled for tomorrow",0
"Your account is compromised! Act now!",1
```
### 2. Launch the Platform
```bash
streamlit run streamlit_app.py
```
### 3. Follow the Workflow
1. **Data Upload Tab**
- Upload your CSV file
   - Or click the "Sample" button to load example data
- Verify data structure and class distribution
2. **Training Config Tab**
- Select target language (English, Chinese, Khmer)
- Choose model architecture (start with DistilBERT for CPU)
- Adjust hyperparameters:
- Epochs: 3-5 for most tasks
- Batch size: 8-16 for CPU, 32-64 for GPU
- Learning rate: 2e-5 (default is good)
3. **Training Tab**
- Click "Start Training"
- Monitor progress in real-time
- Watch metrics update live
4. **Evaluation Tab**
- Review final metrics
- Test model with new text
- Download trained model
## 🌍 Language-Specific Tips
### English πŸ‡¬πŸ‡§
- Use RoBERTa or DistilBERT
- Standard preprocessing works well
- Fast training on CPU
### Chinese πŸ‡¨πŸ‡³
- Use mBERT or XLM-RoBERTa
- Automatic word segmentation with jieba
- May need more training time
### Khmer πŸ‡°πŸ‡­
- Use mBERT or XLM-RoBERTa
- Unicode normalization applied
- Ensure UTF-8 encoding in CSV
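The Unicode normalization step mentioned above typically looks like the following. A minimal illustration using the stdlib NFC form — the platform's actual normalization form may differ:

```python
import unicodedata

def normalize_text(text):
    """NFC-normalize so visually identical strings (e.g. composed vs
    decomposed code points) compare equal before tokenization."""
    return unicodedata.normalize("NFC", text)
```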
## πŸ’‘ Pro Tips
### For CPU Training
In Training Config:
- Model: `distilbert-base-multilingual-cased`
- Batch size: 8
- Max length: 128
- Epochs: 3
### For GPU Training
In Training Config:
- Model: `xlm-roberta-base`
- Batch size: 32
- Max length: 256
- Epochs: 5
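The two presets above can be kept as a plain mapping for programmatic runs. A sketch — the key names mirror the bullets here, not any specific trainer API:

```python
PRESETS = {
    "cpu": {
        "model": "distilbert-base-multilingual-cased",
        "batch_size": 8,
        "max_length": 128,
        "epochs": 3,
    },
    "gpu": {
        "model": "xlm-roberta-base",
        "batch_size": 32,
        "max_length": 256,
        "epochs": 5,
    },
}

def get_preset(device="cpu"):
    """Return a copy of the recommended training settings for a device."""
    return dict(PRESETS[device])
```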
### Dealing with Imbalanced Data
- Ensure both classes have sufficient samples (min 20-30 each)
- Consider using stratified sampling
- Monitor precision and recall, not just accuracy
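For imbalanced data, a common remedy (beyond the tips above) is weighting the loss by inverse class frequency so the rarer class counts more. A stdlib sketch of that standard formula — whether the platform exposes class weights is an assumption:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: weight = total / (n_classes * class_count),
    so the rarer class receives the larger weight."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}
```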
## πŸ› Common Issues & Solutions
### Issue: "Out of Memory"
**Solutions:**
- Reduce batch size to 4 or 8
- Use DistilBERT instead of larger models
- Reduce max sequence length to 128
### Issue: "Model download fails"
**Solutions:**
- Check internet connection
- Try with VPN if blocked
- Manually download model from Hugging Face Hub
### Issue: "Training too slow"
**Solutions:**
- Use smaller model (DistilBERT)
- Reduce dataset size for testing
- Check if GPU is available: `torch.cuda.is_available()`
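The GPU check above can be wrapped in a small helper that degrades gracefully when PyTorch is not installed — a minimal sketch:

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        # torch not installed: CPU is the only option
        return "cpu"
```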
### Issue: "Low accuracy"
**Solutions:**
- Increase number of epochs
- Try different learning rate (3e-5 or 5e-5)
- Ensure data quality and labels are correct
- Use more training data
## πŸ“Š Understanding Metrics
| Metric | What it means | When to focus on it |
|--------|---------------|---------------------|
| **Accuracy** | Overall correct predictions | Balanced datasets |
| **Precision** | Of predicted positives, how many are correct | Minimize false alarms |
| **Recall** | Of actual positives, how many were found | Don't miss any positives |
| **F1 Score** | Balance of precision and recall | General performance |
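The table's definitions reduce to a few lines of arithmetic over confusion-matrix counts (true/false positives and negatives):

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.
    Zero denominators fall back to 0.0 rather than raising."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```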
## πŸ”— Useful Resources
- [Transformers Documentation](https://huggingface.co/docs/transformers)
- [Streamlit Documentation](https://docs.streamlit.io)
- [PyTorch Tutorials](https://pytorch.org/tutorials/)
## πŸ†˜ Getting Help
1. Check the troubleshooting section in MLOPS_README.md
2. Review the logs in the training tab
3. Run `example_usage.py` to test programmatically
4. Check console output for detailed error messages
## πŸŽ‰ Next Steps
After successfully training your first model:
1. **Export Model**: Download from Evaluation tab
2. **Deploy**: Use with FastAPI backend or integrate elsewhere
3. **Iterate**: Try different languages, models, hyperparameters
4. **Scale**: Train on larger datasets with GPU
---
**Happy Training! πŸš€**
For detailed documentation, see [MLOPS_README.md](MLOPS_README.md)