harismlnaslm committed on
Commit b1b57bb · 1 Parent(s): f67dde9

Fix README.md YAML frontmatter configuration for HF Spaces

Files changed (1)
  1. README.md +82 -206

README.md CHANGED
@@ -1,243 +1,119 @@
- # Base LLM Setup - Llama 3.1 8B with LoRA
-
- Complete setup for fine-tuning the Llama 3.1 8B model using LoRA (Low-Rank Adaptation).
-
- ## 🚀 Features
-
- - **Base Model**: Llama 3.1 8B Instruct
- - **Fine-tuning**: LoRA for memory efficiency
- - **Data Format**: JSONL (JSON Lines)
- - **Environment**: Python virtual environment
- - **Inference**: vLLM for model serving
- - **Monitoring**: Logs and metrics
-
- ## 📁 Directory Structure
-
- ```
- base-llm-setup/
- ├── models/                      # Model weights
- ├── data/                        # Training datasets (JSONL)
- ├── scripts/                     # Python scripts
- │   ├── download_model.py        # Download base model
- │   ├── finetune_lora.py         # LoRA fine-tuning
- │   ├── test_model.py            # Test fine-tuned model
- │   └── create_sample_dataset.py # Create sample data
- ├── configs/                     # Configuration files
- ├── logs/                        # Training logs
- ├── venv/                        # Virtual environment
- ├── requirements.txt             # Python dependencies
- ├── setup.sh                     # Setup script
- ├── docker-compose.yml           # Docker services
- └── README.md                    # This file
- ```
-
- ## 🛠️ Prerequisites
-
- - Python 3.8+
- - CUDA-compatible GPU (for training)
- - Docker & Docker Compose
- - HuggingFace account and token
-
- ## ⚡ Quick Start
-
- ### 1. Setup Environment
-
  ```bash
- # Clone or create the folder
- cd base-llm-setup
-
- # Run the setup script
- chmod +x setup.sh
- ./setup.sh
  ```
-
- ### 2. Activate the Virtual Environment
-
  ```bash
- source venv/bin/activate
  ```
-
- ### 3. Set the HuggingFace Token
-
  ```bash
- export HUGGINGFACE_TOKEN="your_token_here"
  ```
-
- ### 4. Download the Base Model
-
  ```bash
- python scripts/download_model.py
  ```
-
- ### 5. Create the Dataset (JSONL)
-
  ```bash
- python scripts/create_sample_dataset.py
  ```
-
- ### 6. Fine-tuning with LoRA
-
- ```bash
- python scripts/finetune_lora.py
- ```
-
- ### 7. Test the Model
-
- ```bash
- python scripts/test_model.py
- ```
-
- ## 📊 JSONL Dataset Format
-
- The dataset must be in JSONL (JSON Lines) format with this structure:
-
- ```jsonl
- {"text": "Apa itu machine learning?", "category": "education", "language": "id"}
- {"text": "Jelaskan tentang deep learning", "category": "education", "language": "id"}
- {"text": "Bagaimana cara kerja neural network?", "category": "education", "language": "id"}
- ```
-
- **Fields:**
- - `text`: Training text (required)
- - `category`: Data category (optional)
- - `language`: Language (optional, default: "id")
-
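As an aside, the JSONL format described above is straightforward to produce and validate with the standard library; a minimal sketch (file path and record contents are illustrative):

```python
import json
import tempfile
from pathlib import Path

# Illustrative records matching the fields described above:
# "text" (required), "category" and "language" (optional).
records = [
    {"text": "Apa itu machine learning?", "category": "education", "language": "id"},
    {"text": "Jelaskan tentang deep learning", "category": "education", "language": "id"},
]

path = Path(tempfile.mkdtemp()) / "sample_dataset.jsonl"

# Write one JSON object per line (that is all JSONL is).
with path.open("w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read it back and check the required field is present on every row.
with path.open(encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]

assert all("text" in row for row in rows)
print(len(rows))  # number of training examples
```

Each line must be a complete, standalone JSON object; a pretty-printed multi-line object would break line-by-line loaders.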
- ## 🔧 Configuration
-
- ### Model Configuration (`configs/llama_config.yaml`)
-
- ```yaml
- model_name: "meta-llama/Llama-3.1-8B-Instruct"
- model_path: "./models/llama-3.1-8b-instruct"
- max_length: 8192
- temperature: 0.7
- top_p: 0.9
- top_k: 40
- repetition_penalty: 1.1
-
- # LoRA Configuration
- lora_config:
-   r: 16
-   lora_alpha: 32
-   lora_dropout: 0.1
-   target_modules: ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
-
- # Training Configuration
- training_config:
-   learning_rate: 2e-4
-   batch_size: 4
-   gradient_accumulation_steps: 4
-   num_epochs: 3
-   warmup_steps: 100
-   save_steps: 500
-   eval_steps: 500
- ```
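One consequence of the training settings above worth keeping in mind: with gradient accumulation, the optimizer steps once every `gradient_accumulation_steps` micro-batches, so the effective batch size is the product of the two values. A quick check:

```python
# Effective batch size under gradient accumulation:
# gradients from several small micro-batches are summed
# before a single optimizer step is taken.
batch_size = 4                   # from training_config
gradient_accumulation_steps = 4  # from training_config

effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # → 16
```

This is why reducing `batch_size` while raising `gradient_accumulation_steps` is the standard way to fit training into less GPU memory without changing the effective batch size.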
-
- ### Docker Configuration
-
  ```bash
- # Start the vLLM service
- docker-compose up -d vllm
-
- # Check status
- docker-compose ps
-
- # View logs
- docker-compose logs -f vllm
  ```
-
- ## 🧪 Testing
-
- ### Interactive Mode
- ```bash
- python scripts/test_model.py
- # Choose option 1 for interactive chat
- ```
-
- ### Batch Testing
- ```bash
- python scripts/test_model.py
- # Choose option 2 for batch testing
- ```
-
- ### Custom Prompt
- ```bash
- python scripts/test_model.py
- # Choose option 3 for a custom prompt
- ```
-
- ## 📈 Monitoring
-
- ### Training Logs
- - Logs are stored in the `logs/` folder
- - Monitor GPU usage with `nvidia-smi`
- - Check training progress in the console
-
- ### Model Performance
- - Loss metrics during training
- - Model checkpoints saved every `save_steps`
- - Evaluation metrics every `eval_steps`
-
- ## 🔍 Troubleshooting
-
- ### Common Issues
-
- 1. **CUDA Out of Memory**
-    - Reduce `batch_size`
-    - Reduce `max_length`
-    - Use gradient accumulation
-
- 2. **Model Download Failed**
-    - Check the HuggingFace token
-    - Verify the internet connection
-    - Check disk space
-
- 3. **Training Slow**
-    - Increase `batch_size` if memory allows
-    - Optimize data loading
-    - Use mixed-precision training
-
- ### Performance Tips
-
- - Use an SSD for large datasets
- - Monitor GPU temperature
- - Use appropriate learning-rate scheduling
- - Checkpoint regularly for recovery
-
- ## 📚 Dependencies
-
- See `requirements.txt` for the full list of dependencies:
-
- - **Core**: torch, transformers, peft, datasets
- - **Inference**: vllm, openai
- - **Utils**: numpy, pandas, pyyaml
- - **Dev**: pytest, black, flake8
-
- ## 🤝 Contributing
-
- 1. Fork the repository
- 2. Create a feature branch
- 3. Commit your changes
- 4. Push to the branch
- 5. Create a Pull Request
-
- ## 📄 License
-
- MIT License - see the LICENSE file for details.
-
- ## 🆘 Support
-
- If you have problems or questions:
-
- 1. Check the troubleshooting section
- 2. Review logs in the `logs/` folder
- 3. Open an issue in the repository
- 4. Contact the maintainer
-
  ---
-
- **Happy Fine-tuning! 🚀**
+ ---
+ title: Textilindo AI Assistant
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ sdk_version: "4.0.0"
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: AI Assistant for Textilindo textile company with training capabilities
+ ---
+
+ # 🤖 Textilindo AI Assistant
+
+ An intelligent AI assistant for the Textilindo textile company with advanced training capabilities, built with FastAPI and Hugging Face Transformers.
+
+ ## Features
+
+ - **Intelligent Chat Interface**: Natural-language conversations in Indonesian
+ - **Company Knowledge**: Trained on Textilindo-specific information
+ - **Model Training**: Train custom models with your data
+ - **Fast Response**: Optimized for quick customer service
+ - **Mobile Friendly**: Responsive web interface
+ - **API Ready**: RESTful API for integration
+
+ ## 🚀 Quick Start
+
+ ### Chat Interface
+ Visit the main page to start chatting with the AI assistant. Ask questions about:
+ - Company location and hours
+ - Product information
+ - Ordering and shipping
+ - Sample requests
+ - Pricing and terms
+
+ ### Training API
+
+ #### Start Training
  ```bash
+ curl -X POST "https://harismlnaslm-Textilindo-AI.hf.space/api/train/start" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model_name": "distilgpt2",
+     "dataset_path": "data/lora_dataset_20250910_145055.jsonl",
+     "config_path": "configs/training_config.yaml",
+     "max_samples": 10,
+     "epochs": 1,
+     "batch_size": 1,
+     "learning_rate": 5e-5
+   }'
  ```
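The same request body can be built programmatically before sending; a minimal standard-library sketch, with the field names taken directly from the curl example above (the HTTP call itself is omitted):

```python
import json

# Payload mirroring the curl example above.
payload = {
    "model_name": "distilgpt2",
    "dataset_path": "data/lora_dataset_20250910_145055.jsonl",
    "config_path": "configs/training_config.yaml",
    "max_samples": 10,
    "epochs": 1,
    "batch_size": 1,
    "learning_rate": 5e-5,
}

# Serialize to the JSON string that goes in the POST body;
# sending it (with urllib.request or requests) is left out here.
body = json.dumps(payload)
print(body)
```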
+
+ #### Check Training Status
  ```bash
+ curl "https://harismlnaslm-Textilindo-AI.hf.space/api/train/status"
  ```
+
+ #### Test Trained Model
  ```bash
+ curl -X POST "https://harismlnaslm-Textilindo-AI.hf.space/api/train/test"
  ```
+
+ #### Get Training Data Info
  ```bash
+ curl "https://harismlnaslm-Textilindo-AI.hf.space/api/train/data"
  ```
+
+ #### Check GPU Availability
  ```bash
+ curl "https://harismlnaslm-Textilindo-AI.hf.space/api/train/gpu"
  ```
+
+ ## 🛠️ Technical Details
+
+ ### Architecture
+ - **Framework**: FastAPI with Uvicorn
+ - **AI Model**: Llama 3.1 8B Instruct (via Hugging Face)
+ - **Training**: PyTorch with Transformers
+ - **Language**: Indonesian (Bahasa Indonesia)
+ - **Deployment**: Docker on Hugging Face Spaces
+
+ ### API Endpoints
+
+ #### Chat Endpoints
+ - `GET /` - Main chat interface
+ - `POST /chat` - Chat API endpoint
+ - `GET /health` - Health check
+ - `GET /info` - Application information
+
+ #### Training Endpoints
+ - `POST /api/train/start` - Start model training
+ - `GET /api/train/status` - Check training progress
+ - `GET /api/train/data` - Get training data information
+ - `GET /api/train/gpu` - Check GPU availability
+ - `POST /api/train/test` - Test the trained model
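For integration, a client POSTs JSON to the endpoints listed above. A sketch of building (but not sending) such a request with the standard library; note the `"message"` field name is an assumption for illustration only, since the chat request schema is not documented here:

```python
import json
import urllib.request

BASE = "https://harismlnaslm-Textilindo-AI.hf.space"

# Build a POST to the /chat endpoint listed above.
# The body field name "message" is hypothetical; check the app's
# /info endpoint or source for the actual request schema.
req = urllib.request.Request(
    url=BASE + "/chat",
    data=json.dumps({"message": "Halo"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.get_method(), req.full_url)
```

Passing the prepared `req` to `urllib.request.urlopen` would perform the actual call.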
+
+ ### Environment Variables
+ Set these in your Space settings:
+
  ```bash
+ # Required: Hugging Face API key
+ HUGGINGFACE_API_KEY=your_api_key_here
+
+ # Optional: model selection
+ DEFAULT_MODEL=meta-llama/Llama-3.1-8B-Instruct
  ```
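In application code, these variables are typically read at startup; a minimal standard-library sketch, using the variable names above and falling back to the documented default model when the optional one is unset:

```python
import os

# Required: the app cannot call the Hugging Face API without this key.
# (None here means the Space settings are missing the secret.)
api_key = os.environ.get("HUGGINGFACE_API_KEY")

# Optional: fall back to the documented default model.
model = os.environ.get("DEFAULT_MODEL", "meta-llama/Llama-3.1-8B-Instruct")

print(model)
```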
+
+ ## 📞 Support
+
+ For technical issues:
+ 1. Check the `/health` endpoint
+ 2. Review the Space logs
+ 3. Verify environment variables
+ 4. Test with mock responses
+
  ---
+
+ *Built with ❤️ for Textilindo customers*