init

Files changed (3)

README.md +0 -121
scripts/upload_instructions.md +0 -97
scripts/upload_to_huggingface.py → upload_to_huggingface.py +1 -1

README.md DELETED

@@ -1,121 +0,0 @@
-# Handwriting Recognition
-Complete handwriting recognition system using CNN-BiLSTM-CTC on the IAM dataset.
-## 📁 Files
-### 1. **analysis.ipynb** - Dataset Analysis
-- Exploratory Data Analysis (EDA)
-- 5 detailed charts saved to `charts/` folder
-- Run locally or on Colab (no GPU needed)
-### 2. **train_colab.ipynb** - Model Training (GPU)
-- **⚡ Google Colab GPU compatible**
-- Full training pipeline
-- CNN-BiLSTM-CTC model (~9.1M parameters)
-- Automatic model saving
-- Download trained model for deployment
-## 🚀 Quick Start
-### Option 1: Analyze Dataset (Local/Colab)
-```bash
-jupyter notebook analysis.ipynb
-```
-- No GPU needed
-- Generates 5 EDA charts
-- Fast (~2 minutes)
-### Option 2: Train Model (Google Colab GPU)
-1. **Upload `train_colab.ipynb` to Google Colab**
-2. **Change runtime to GPU:**
-   - Runtime → Change runtime type → GPU (T4 recommended)
-3. **Run all cells**
-4. **Download trained model** (last cell)
-**Training Time:** ~1-2 hours for 20 epochs on T4 GPU
-## 📊 Charts Generated
-From `analysis.ipynb`:
-1. `charts/01_sample_images.png` - 10 sample handwritten texts
-2. `charts/02_text_length_distribution.png` - Text statistics
-3. `charts/03_image_dimensions.png` - Image analysis
-4. `charts/04_character_frequency.png` - Character distribution
-5. `charts/05_summary_statistics.png` - Summary table
-## 🎯 Model Details
-**Architecture:**
-- **CNN**: 7 convolutional blocks (feature extraction)
-- **BiLSTM**: 2 layers, 256 hidden units (sequence modeling)
-- **CTC Loss**: Alignment-free training
-**Dataset:** Teklia/IAM-line (Hugging Face)
-- Train: 6,482 samples
-- Validation: 976 samples
-- Test: 2,915 samples
-**Metrics:**
-- **CER** (Character Error Rate)
-- **WER** (Word Error Rate)
-## 💾 Model Files
-After training in Colab:
-- `best_model.pth` - Trained model weights
-- `training_history.png` - Loss/CER/WER plots
-- `predictions.png` - Sample predictions
-## 📦 Requirements
-```
-torch>=2.0.0
-datasets>=2.14.0
-pillow>=9.5.0
-numpy>=1.24.0
-matplotlib>=3.7.0
-seaborn>=0.13.0
-jupyter>=1.0.0
-jiwer>=3.0.0
-```
-## 🔧 Usage
-### Load Trained Model
-```python
-import torch
-# Load checkpoint
-checkpoint = torch.load('best_model.pth')
-char_mapper = checkpoint['char_mapper']
-# Create model
-from train_colab import CRNN  # Copy model class
-model = CRNN(num_chars=len(char_mapper.chars))
-model.load_state_dict(checkpoint['model_state_dict'])
-model.eval()
-# Predict
-# ... (preprocessing + inference)
-```
-## 📝 Notes
-- **GPU strongly recommended** for training (use Colab T4)
-- Training on CPU will be extremely slow (~20x slower)
-- Colab free tier: 12-hour limit, sufficient for 20 epochs
-- Model checkpoint includes character mapper for deployment
-## 🎓 Training Tips
-1. **Start with fewer epochs** (5-10) to test
-2. **Monitor CER/WER** - stop if not improving
-3. **Increase epochs** if still improving (up to 50)
-4. **Save checkpoint** before Colab disconnects
-5. **Download model immediately** after training
-## 📄 License
-Dataset: IAM Database (research use)
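
Note on the removed README: it names CER and WER as the evaluation metrics but never defines them. The sketch below shows one common definition (Levenshtein edit distance divided by reference length). It is an illustration in plain Python, not the notebook's actual code; the removed requirements list `jiwer`, which the notebook may use instead. The function names `edit_distance`, `cer`, and `wer` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition both rates can exceed 1.0 when the hypothesis is much longer than the reference, so a value is best read as "edits per reference unit" rather than a percentage capped at 100.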

scripts/upload_instructions.md DELETED

@@ -1,97 +0,0 @@
-# Upload Model to Hugging Face Hub
-## Quick Start (3 Steps)
-### 1. Install Hugging Face Hub
-```bash
-pip install huggingface_hub
-```
-### 2. Login to Hugging Face
-```bash
-huggingface-cli login
-```
-Enter your Hugging Face token when prompted. Get your token from: https://huggingface.co/settings/tokens
-### 3. Run Upload Script
-```bash
-python upload_to_huggingface.py
-```
----
-## Alternative: Manual Upload via Web Interface
-1. Go to https://huggingface.co/new
-2. Create a new model repository (e.g., `handwriting-recognition-iam`)
-3. Click "Files" → "Add file" → "Upload files"
-4. Upload:
-   - `best_model.pth`
-   - `README.md`
-   - `requirements.txt`
-   - `train_colab.ipynb`
-   - `training_history.png`
----
-## Alternative: Upload from Python (Colab/Script)
-```python
-from huggingface_hub import HfApi, create_repo, upload_file
-# Login first (in Colab)
-from huggingface_hub import notebook_login
-notebook_login()
-# Create repository
-api = HfApi()
-repo_id = "your-username/handwriting-recognition-iam"
-create_repo(repo_id, repo_type="model", exist_ok=True)
-# Upload model
-upload_file(
-    path_or_fileobj="best_model.pth",
-    path_in_repo="best_model.pth",
-    repo_id=repo_id,
-    repo_type="model"
-)
-print(f"✓ Uploaded! View at: https://huggingface.co/{repo_id}")
-```
----
-## What Gets Uploaded
-- ✅ `best_model.pth` - Trained model checkpoint (105MB)
-- ✅ `README.md` - Project documentation
-- ✅ `requirements.txt` - Dependencies
-- ✅ `train_colab.ipynb` - Training notebook
-- ✅ `training_history.png` - Training metrics visualization
----
-## Customization
-Edit `upload_to_huggingface.py` to change:
-- `REPO_NAME` - Your preferred repository name
-- `private=False` - Set to `True` for private repository
-- `FILES_TO_UPLOAD` - Add/remove files to upload
----
-## Troubleshooting
-### "Authentication required"
-```bash
-huggingface-cli login
-```
-### "Repository already exists"
-- The script uses `exist_ok=True`, so it will update existing repo
-- Or change `REPO_NAME` to create a new one
-### Large file upload fails
-- Hugging Face supports files up to 50GB
-- Your model (105MB) should upload fine
-- If it fails, try uploading via web interface
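
Note on the removed instructions: the Python alternative uploads only `best_model.pth`, while the "What Gets Uploaded" list names five files. A minimal sketch of uploading the whole list, skipping files that are not present locally, could look like the following. The helper names `split_existing` and `upload_all` are hypothetical, and the real `upload_to_huggingface.py` is not shown in full in this diff, so it may work differently.

```python
from pathlib import Path

# File names taken from the "What Gets Uploaded" list in the removed instructions.
FILES_TO_UPLOAD = [
    "best_model.pth",
    "README.md",
    "requirements.txt",
    "train_colab.ipynb",
    "training_history.png",
]


def split_existing(files, base_dir="."):
    """Split file names into (present, missing) relative to base_dir."""
    base = Path(base_dir)
    present = [name for name in files if (base / name).is_file()]
    missing = [name for name in files if name not in present]
    return present, missing


def upload_all(repo_id, files=FILES_TO_UPLOAD, base_dir=".", private=False):
    """Create the model repo if needed and upload every file that exists locally."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id, repo_type="model", private=private, exist_ok=True)
    present, missing = split_existing(files, base_dir)
    for name in present:
        api.upload_file(
            path_or_fileobj=str(Path(base_dir) / name),
            path_in_repo=name,
            repo_id=repo_id,
            repo_type="model",
        )
    return present, missing
```

Returning the `missing` list lets the caller report skipped files instead of failing on the first absent one, which seems useful when `training_history.png` or the notebook was never downloaded from Colab.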

scripts/upload_to_huggingface.py → upload_to_huggingface.py RENAMED

@@ -8,7 +8,7 @@ from huggingface_hub import HfApi, create_repo, upload_file
 # Configuration
 MODEL_PATH = "best_model.pth"
 REPO_NAME = "handwriting-recognition-iam"  # Change this to your preferred name
-USERNAME = None  # Will use your HF username automatically
+USERNAME = "IsmatS"  # Will use your HF username automatically
 # Files to upload
 FILES_TO_UPLOAD = [
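
Note on this hunk: the trailing comment still says the username is picked up automatically, but the new value is hard-coded. If automatic lookup is the intent, one way to keep `USERNAME = None` working is to fall back to the logged-in account. The sketch below assumes `HfApi().whoami()` returns a dict with a `"name"` key; `resolve_repo_id` and its `whoami` parameter are hypothetical and not part of the script shown here.

```python
def resolve_repo_id(repo_name, username=None, whoami=None):
    """Return "<username>/<repo_name>", looking the username up when it is None.

    `whoami` is an optional callable returning a dict with a "name" key; it
    exists so the lookup can be replaced in tests. By default the logged-in
    Hugging Face account is queried.
    """
    if username is None:
        if whoami is None:
            # Imported lazily: only needed when no username is supplied.
            from huggingface_hub import HfApi

            whoami = HfApi().whoami
        username = whoami()["name"]
    return f"{username}/{repo_name}"
```

With an explicit value, `resolve_repo_id("handwriting-recognition-iam", "IsmatS")` gives `"IsmatS/handwriting-recognition-iam"` without touching the network; with `username=None` the call requires a prior `huggingface-cli login`.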