---
language:
- multilingual
tags:
- few-shot-learning
- siamese-network
- image-similarity
- omniglot
- contrastive-learning
- pytorch
- onnx
datasets:
- omniglot
metrics:
- accuracy
model-index:
- name: siamese-few-shot
  results:
  - task:
      type: image-classification
    dataset:
      name: Omniglot
      type: omniglot
    metrics:
    - type: accuracy
      value: 97.07
      name: 5-way 5-shot accuracy
license: apache-2.0
---

# Siamese Network for Few-Shot Image Recognition

Few-shot image recognition using a Siamese Network trained on Omniglot.
Recognises new character classes from as few as one example.

## Results

![Few-shot evaluation results](eval_results.png)

| Configuration  | Accuracy |
|----------------|----------|
| 5-way 1-shot   | 95.10%   |
| 5-way 5-shot   | 97.07%   |
| 10-way 1-shot  | 90.05%   |
| 10-way 5-shot  | 94.83%   |

Evaluated on 145 held-out test classes never seen during training.
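The numbers above come from N-way K-shot episodic evaluation. A minimal sketch of how one episode could be scored, assuming each class is represented by the re-normalised mean of its K support embeddings and queries are assigned to the most similar class by cosine similarity (the function name and episode layout here are illustrative, not the actual `eval.py` API):

```python
import torch
import torch.nn.functional as F

def score_episode(support, support_labels, queries, query_labels, n_way):
    """Accuracy of one N-way K-shot episode.

    support:  (n_way * k_shot, d) L2-normalised support embeddings
    queries:  (n_query, d)        L2-normalised query embeddings
    """
    # Mean of each class's support embeddings, projected back onto the unit sphere
    prototypes = torch.stack([
        F.normalize(support[support_labels == c].mean(dim=0), dim=0)
        for c in range(n_way)
    ])
    sims = queries @ prototypes.T          # cosine similarity (vectors are unit-norm)
    preds = sims.argmax(dim=1)             # nearest class per query
    return (preds == query_labels).float().mean().item()
```

Averaging this over many randomly sampled episodes of unseen classes yields the table above.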

## Architecture

- Backbone: ResNet-18 pretrained, final FC stripped β†’ 512-d features
- Embedding head: Linear(512β†’256) β†’ BN β†’ ReLU β†’ Linear(256β†’128) β†’ L2 norm
- Loss: Contrastive loss with margin=1.0
- Distance: Cosine similarity on unit-sphere embeddings

## Project Structure

    siamese-few-shot/
    β”œβ”€β”€ src/
    β”‚   β”œβ”€β”€ dataset.py        # SiamesePairDataset + EpisodeDataset
    β”‚   β”œβ”€β”€ model.py          # EmbeddingNet + SiameseNet
    β”‚   β”œβ”€β”€ loss.py           # ContrastiveLoss
    β”‚   β”œβ”€β”€ train.py          # Training + validation loop
    β”‚   β”œβ”€β”€ run_training.py   # Main training entry point
    β”‚   β”œβ”€β”€ eval.py           # N-way K-shot episodic evaluation
    β”‚   └── demo.py           # Gradio demo
    β”œβ”€β”€ checkpoints/
    β”‚   β”œβ”€β”€ best.pt
    β”‚   └── siamese_embedding.onnx
    β”œβ”€β”€ data/
    β”‚   └── class_split.json
    β”œβ”€β”€ requirements.txt
    └── README.md

## Quickstart

    git clone https://huggingface.co/<your-username>/siamese-few-shot
    cd siamese-few-shot
    pip install -r requirements.txt

    # Run Gradio demo
    cd src && python demo.py

    # Run episodic evaluation
    cd src && python eval.py

    # Retrain from scratch
    cd src && python run_training.py

## Training Details

- Dataset: Omniglot (background split, 964 classes)
- Train / val / test split: 70% / 15% / 15% of classes
- Epochs: 30
- Batch size: 32
- Optimiser: Adam lr=1e-3
- Scheduler: CosineAnnealingLR
- Augmentation: RandomCrop, HorizontalFlip, ColorJitter
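The optimiser and scheduler settings above can be sketched as follows (the `nn.Linear` stand-in replaces the real model, and the epoch body is elided):

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(512, 128)   # stand-in for the Siamese network
epochs = 30
optimiser = Adam(model.parameters(), lr=1e-3)
scheduler = CosineAnnealingLR(optimiser, T_max=epochs)

for epoch in range(epochs):
    # ... one epoch of pair-based training would go here ...
    scheduler.step()                # anneal the learning rate once per epoch
```

With `T_max` equal to the epoch count, the learning rate decays smoothly from 1e-3 to 0 over the run.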

## Requirements

    torch>=2.0
    torchvision>=0.15
    timm
    gradio
    onnx
    onnxruntime-gpu
    pillow
    numpy
    matplotlib
    scikit-learn
    tqdm
    wandb

## Demo

Upload any two handwritten character images. The model returns a
cosine similarity score and a same / different class decision.
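The demo's decision rule can be sketched as a threshold on cosine similarity between the two embeddings (the 0.5 cut-off here is an assumption for illustration, not the demo's tuned value):

```python
import torch
import torch.nn.functional as F

def same_class(z1, z2, threshold=0.5):
    """Return (cosine similarity, same-class decision) for two embeddings.

    threshold is illustrative; the demo's actual cut-off may differ.
    """
    sim = F.cosine_similarity(z1, z2, dim=0).item()
    return sim, sim >= threshold
```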

Trained on Latin, Greek, Cyrillic, Japanese, and 25 other alphabets
via the Omniglot dataset. Also tested on Indian script characters
(Tamil, Hindi, Telugu, Kannada, Bengali, Malayalam, Gujarati, Punjabi).