---
license: cc-by-4.0
datasets:
- S3CUR/badger-55-watermeter
language:
- en
library_name: pytorch
base_model: facebook/dinov2-small
tags:
- water-meter
- meter-reading
- badger-model-55
- dinov2
- angular-regression
- digit-classification
- ocr
pipeline_tag: image-classification
---
# badger-55-meterreader
A standalone, clean-room reference reader for the
[**S3CUR/badger-55-watermeter**](https://huggingface.co/datasets/S3CUR/badger-55-watermeter)
dataset. End-to-end pipeline: full-frame meter photo → deskew → rectify
the digit strip → DINOv2 features → two trained heads → 8-digit reading.
The repo demonstrates that the published dataset is self-sufficient: with
no other inputs you can `pip install -r requirements.txt`, run `train.py`,
and reproduce the same per-slot accuracy reported in the dataset card.
<p align="center">
<img src="demo_sample_midsnap_d6.jpg" width="900" alt="annotated reading: 06672429, d6 mid-snap"/><br/>
<em><code>06672429</code>: d6 at θ=97° (just past park-2, mid-snap stress case)</em>
</p>
<p align="center">
<img src="demo_sample_rollover_d4.jpg" width="900" alt="annotated reading: 06629789, d4 mid-roll"/><br/>
<em><code>06629789</code>: d4 = 9 mid-roll (single rollover edge)</em>
</p>
<p align="center">
<img src="demo_sample_rollover_d6d7.jpg" width="900" alt="annotated reading: 06615699, double rollover"/><br/>
<em><code>06615699</code>: d6 = 9 <strong>and</strong> d7 = 9 (double rollover, hardest case)</em>
</p>
<p align="center"><sub>Each card shows the chosen digit, θ, the consensus confidence, and both voters (CLA + P90). A grey informational chip means that head disagreed with the consensus.</sub></p>
## Layout
```
badger-55-meterreader/
├── README.md
├── requirements.txt
├── rectifier.py                 # deskew + digit-window detect + affine warp
├── models.py                    # DINOv2 wrapper + head architectures
├── train.py                     # download dataset, train three heads
├── demo.py                      # one-image rectify + infer + annotate
├── weights/
│   ├── digit_classifier.pt      # 10-class digit head (pooled d4-d7)
│   ├── d4d5_predictor90.pt      # 90-bin angular head, slots 4+5
│   └── d6d7_predictor90.pt      # 90-bin angular head, slots 6+7 (incl. platinum atlas)
├── sample_midsnap_d6.jpg        # d6 slightly past park-2; v0 stress case
├── sample_rollover_d4.jpg       # d4 = 9 mid-roll
├── sample_rollover_d6d7.jpg     # d6 = 9 AND d7 = 9 (double rollover)
└── demo_sample_*.jpg            # reference annotated outputs for each input
```
## Install
```bash
python3 -m venv venv
./venv/bin/pip install -U pip
./venv/bin/pip install -r requirements.txt
```
## Train
```bash
./venv/bin/python3 train.py
```
The first run pulls `S3CUR/badger-55-watermeter` (just `slots.parquet`,
~29 MB; bytes are embedded inline, so there is no per-file rate-limit
dance) and `facebook/dinov2-small` (~85 MB) into `~/.cache/huggingface/`.
It then:
1. Extracts DINOv2-small CLS features for each slot crop in the dataset.
2. Trains `d4d5_predictor90.pt` on slot 4+5 angular labels (KL on
wrapped-Gaussian soft targets).
3. Trains `d6d7_predictor90.pt` on slot 6+7 angular labels (includes
the platinum d7 atlas: continuous-rotation ground truth on the
one drum that genuinely sweeps every angle).
4. Trains `digit_classifier.pt` (10-way digit, pooled across slots 4-7).
Total wall-clock: **~60 seconds** on a recent CUDA GPU. Weights land in
`./weights/`. The per-head recipe (epoch counts, learning rates) was
swept upstream and baked into `train.py`'s `RECIPE` dict; override it with
`--epochs N` only if you want to experiment.
CLI flags:
```bash
./venv/bin/python3 train.py --epochs 120 # override default recipe
./venv/bin/python3 train.py --skip-classifier # angular heads only
./venv/bin/python3 train.py --device cpu # CPU-only (much slower)
./venv/bin/python3 train.py --local-parquet PATH # skip HF, use a local slots.parquet
```
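The "KL on wrapped-Gaussian soft targets" objective used in steps 2 and 3 can be illustrated with a minimal, standard-library sketch. It is not taken from `train.py`: the target width `SIGMA_DEG`, the bin-centre convention, and both function names are assumptions made for illustration only.

```python
import math

N_BINS = 90                  # from this README: 90-bin angular head
BIN_DEG = 360.0 / N_BINS     # 4 degrees per bin
SIGMA_DEG = 6.0              # ASSUMED target width; not read from train.py


def wrapped_gaussian_target(theta_deg, sigma_deg=SIGMA_DEG, wraps=2):
    """Soft label over N_BINS bins for an angle given in degrees."""
    weights = []
    for i in range(N_BINS):
        # ASSUMED convention: bin i is centred at (i + 0.5) * BIN_DEG.
        center = (i + 0.5) * BIN_DEG
        # Sum shifted copies of the Gaussian so mass wraps across 0/360.
        w = sum(
            math.exp(-((theta_deg - center + 360.0 * k) ** 2)
                     / (2.0 * sigma_deg ** 2))
            for k in range(-wraps, wraps + 1)
        )
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0.0
    )


target = wrapped_gaussian_target(358.0)   # an angle close to the 0/360 seam
uniform = [1.0 / N_BINS] * N_BINS
loss_vs_uniform = kl_divergence(target, uniform)
```

Because the target wraps, an angle near 358° places nearly equal mass on the bins just below and just above the 0/360 seam, so the loss does not penalise a prediction for landing on the "other side" of the wrap.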
## Demo
```bash
./venv/bin/python3 demo.py --image sample_midsnap_d6.jpg
```
Or pick a frame straight from the dataset with no `--image`:
```bash
./venv/bin/python3 demo.py
```
Without `--image`, the demo pulls `captures.parquet` (~1 GB, bytes also
embedded) and picks a clean test-split frame to decode. It writes
`demo_output.jpg` showing:
- The input frame (scaled)
- The rectified strip (175 × 736)
- Eight per-slot voter cards with the chosen digit, theta, confidence,
and which head produced the digit
### Three included test images
| File | Expected reading | What it stresses |
|------|------------------|------------------|
| `sample_midsnap_d6.jpg` | `06672429` (667,242.9 gal) | d6 sits at θ=97°, just past park-2; the v0-contamination stress case |
| `sample_rollover_d4.jpg` | `06629789` (662,978.9 gal) | d4 = 9 mid-roll (single rollover edge) |
| `sample_rollover_d6d7.jpg` | `06615699` (661,569.9 gal) | d6 = 9 **and** d7 = 9 (double rollover, hardest case) |
All three should match digit-for-digit. Annotated reference outputs are
shown at the top of this card.
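The gallon figures in the table follow from reading the eighth digit as tenths of a gallon, which is how all three rows line up. A small sketch of that conversion is below; `reading_to_gallons` is a hypothetical helper, not part of `demo.py`.

```python
def reading_to_gallons(reading: str) -> str:
    """Format an 8-digit reading as gallons, treating the last digit as tenths."""
    if len(reading) != 8 or not (reading.isascii() and reading.isdigit()):
        raise ValueError("expected exactly 8 ASCII digits, got %r" % reading)
    whole = int(reading[:7])      # d0..d6: whole gallons
    tenths = reading[7]           # d7: tenths of a gallon
    return f"{whole:,}.{tenths}"


for sample in ("06672429", "06629789", "06615699"):
    print(sample, "->", reading_to_gallons(sample), "gal")
```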
## How the per-slot decision is made
Two heads vote on every slot. When they agree, the consensus is taken
with `max(p)`. When they disagree:
| Slot | Tiebreak | Why |
|------|----------|-----|
| d0 | classifier | The source meter's d0 was always 0; the classifier learned a hard constant and the angular head on a constant has no real θ signal. |
| d1–d7 | predictor90 | The angular head's θ disambiguates mid-snap and mid-roll cases the classifier wobbles on. P90 trained on slots {4,5,6,7} generalizes to the upper drums cleanly when those drums *visually* sweep through digit faces (e.g. mid-snap). |
The classifier is still rendered in the voter row of every card
(disagreements show as a grey informational chip), so you can see at a
glance where it diverges from P90.
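The voting rule in the table can be sketched as a small function. This is an illustration, not the code in `demo.py`: the `Vote` type and `decide_slot` name are hypothetical, "consensus with `max(p)`" is read here as "report the larger of the two heads' probabilities", and the confidence reported on a disagreement (the winning head's own probability) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    digit: int    # digit proposed by this head
    prob: float   # that head's probability for its proposed digit


def decide_slot(slot: int, classifier: Vote, predictor90: Vote):
    """Return (digit, confidence, source) for one slot index 0..7."""
    if classifier.digit == predictor90.digit:
        # Agreement: keep the shared digit; confidence is the larger p (assumed reading).
        return classifier.digit, max(classifier.prob, predictor90.prob), "consensus"
    if slot == 0:
        # d0 tiebreak: the classifier wins, per the table above.
        return classifier.digit, classifier.prob, "classifier"
    # d1-d7 tiebreak: the angular head wins, per the table above.
    return predictor90.digit, predictor90.prob, "predictor90"


agree = decide_slot(6, Vote(2, 0.6), Vote(2, 0.9))
d0_split = decide_slot(0, Vote(0, 0.99), Vote(1, 0.5))
d3_split = decide_slot(3, Vote(7, 0.8), Vote(2, 0.55))
```

Keeping the losing head's vote around, rather than discarding it, is what lets the cards render the grey chip for a disagreeing voter.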
### Why the classifier is only a backup voter on d1–d3
The published dataset has gold crops for slots 4-7 only β€” the source
meter's upper drums (d0-d3) never moved during the collection window
(reading stayed in the 66XXXX range), so there's no `gold_d0`/`gold_d1`/
`gold_d2`/`gold_d3` pool. The classifier is trained on the d4-d7 pool and
then applied to d0-d4 at inference. It works for d4 (in-distribution)
and d0 (hard constant), but d1-d3 are near-OOD generalization. P90's θ
direction resolves those slots more reliably.
In production this is papered over by an SDR radio anchor that pins
d0-d5; the clean-room reader doesn't have that signal, so the model has
to stand on its own.
## License
Code released under [**CC-BY-4.0**](https://creativecommons.org/licenses/by/4.0/),
matching the dataset's license. Use it, modify it, ship it.
## Attribution
> *badger-55-meterreader*. Three, 2026.
No author name, email, or institutional affiliation is associated with
this release.