---
language:
- multilingual
tags:
- few-shot-learning
- siamese-network
- image-similarity
- omniglot
- contrastive-learning
- pytorch
- onnx
datasets:
- omniglot
metrics:
- accuracy
model-index:
- name: siamese-few-shot
  results:
  - task:
      type: image-classification
    dataset:
      name: Omniglot
      type: omniglot
    metrics:
    - type: accuracy
      value: 97.07
      name: 5-way 5-shot accuracy
license: apache-2.0
---

# Siamese Network for Few-Shot Image Recognition

Few-shot image recognition using a Siamese Network trained on Omniglot.
The model recognises new character classes from as few as one example.

## Results

![Training curves]()

| Configuration | Accuracy |
|---------------|----------|
| 5-way 1-shot  | 95.10%   |
| 5-way 5-shot  | 97.07%   |
| 10-way 1-shot | 90.05%   |
| 10-way 5-shot | 94.83%   |

Evaluated on 145 held-out test classes never seen during training.
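The exact protocol in `eval.py` is not shown here, but one common way to score an N-way K-shot episode is nearest-class-centroid matching by cosine similarity on the embeddings; `episode_accuracy` below is a hypothetical sketch of that scheme, not the project's implementation.

```python
import numpy as np

def episode_accuracy(support, support_labels, query, query_labels):
    """Score one N-way K-shot episode: assign each query embedding the
    class whose mean (re-normalised) support embedding is most similar
    under cosine similarity."""
    classes = np.unique(support_labels)
    # Class prototypes: mean of the K support embeddings, re-normalised.
    protos = np.stack([support[support_labels == c].mean(axis=0) for c in classes])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    preds = classes[(q @ protos.T).argmax(axis=1)]  # cosine = dot product on unit vectors
    return float((preds == query_labels).mean())
```

Reported accuracy is then the mean of this score over many randomly sampled episodes.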
|
|
## Architecture

- Backbone: pretrained ResNet-18 with the final FC stripped → 512-d features
- Embedding head: Linear(512→256) → BN → ReLU → Linear(256→128) → L2 norm
- Loss: Contrastive loss with margin = 1.0
- Distance: Cosine similarity on unit-sphere embeddings
|
|
## Project Structure

```
siamese-few-shot/
├── src/
│   ├── dataset.py          # SiamesePairDataset + EpisodeDataset
│   ├── model.py            # EmbeddingNet + SiameseNet
│   ├── loss.py             # ContrastiveLoss
│   ├── train.py            # Training + validation loop
│   ├── run_training.py     # Main training entry point
│   ├── eval.py             # N-way K-shot episodic evaluation
│   └── demo.py             # Gradio demo
├── checkpoints/
│   ├── best.pt
│   └── siamese_embedding.onnx
├── data/
│   └── class_split.json
├── requirements.txt
└── README.md
```

## Quickstart

```bash
git clone https://huggingface.co/<your-username>/siamese-few-shot
cd siamese-few-shot
pip install -r requirements.txt

# Run the Gradio demo
cd src && python demo.py

# Run episodic evaluation
cd src && python eval.py

# Retrain from scratch
cd src && python run_training.py
```

## Training Details

- Dataset: Omniglot (background split, 964 classes)
- Train / val / test split: 70% / 15% / 15% of classes
- Epochs: 30
- Batch size: 32
- Optimiser: Adam, lr = 1e-3
- Scheduler: CosineAnnealingLR
- Augmentation: RandomCrop, HorizontalFlip, ColorJitter
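Contrastive training consumes labelled image pairs rather than single images. How `SiamesePairDataset` draws its pairs is not shown in this card, so the function below is a hypothetical sketch of one typical sampling scheme (the name `sample_pair` and the `p_same` ratio are assumptions): with probability `p_same` draw two images of one class, otherwise one image each from two different classes.

```python
import random

def sample_pair(class_to_images: dict, p_same: float = 0.5, rng: random.Random = random):
    """Draw one contrastive training pair: (image_a, image_b, label),
    where label is 1 for a same-class pair and 0 for a different-class pair."""
    classes = list(class_to_images)
    if rng.random() < p_same:
        c = rng.choice(classes)
        a, b = rng.sample(class_to_images[c], 2)   # two distinct images, same class
        return a, b, 1
    c1, c2 = rng.sample(classes, 2)                # two distinct classes
    return rng.choice(class_to_images[c1]), rng.choice(class_to_images[c2]), 0
```

Balancing positives and negatives (here 50/50) keeps the contrastive loss from being dominated by easy different-class pairs.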
|
|
## Requirements

```
torch>=2.0
torchvision>=0.15
timm
gradio
onnx
onnxruntime-gpu
pillow
numpy
matplotlib
scikit-learn
tqdm
wandb
```

## Demo

Upload any two handwritten character images. The model returns a
cosine similarity score and a same/different class decision.

Trained on Latin, Greek, Cyrillic, Japanese, and 25 other alphabets
via the Omniglot dataset. Also tested on Indian script characters
(Tamil, Hindi, Telugu, Kannada, Bengali, Malayalam, Gujarati, Punjabi).
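Given two unit-sphere embeddings, the demo's decision reduces to a cosine threshold. The sketch below shows that final step; the `0.5` threshold and the name `same_character` are illustrative assumptions, not the demo's tuned values.

```python
import numpy as np

def same_character(z1: np.ndarray, z2: np.ndarray, threshold: float = 0.5):
    """Compare two embeddings: return (cosine similarity, same-class decision)."""
    z1 = z1 / np.linalg.norm(z1)
    z2 = z2 / np.linalg.norm(z2)
    sim = float(z1 @ z2)                 # cosine similarity on unit vectors
    return sim, sim >= threshold
```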