	---
	license: cc-by-4.0
	datasets:
	- S3CUR/badger-55-watermeter
	language:
	- en
	library_name: pytorch
	base_model: facebook/dinov2-small
	tags:
	- water-meter
	- meter-reading
	- badger-model-55
	- dinov2
	- angular-regression
	- digit-classification
	- ocr
	pipeline_tag: image-classification
	---

	# badger-55-meterreader

	A standalone, clean-room reference reader for the
	[S3CUR/badger-55-watermeter](https://huggingface.co/datasets/S3CUR/badger-55-watermeter)
	dataset. End-to-end pipeline: full-frame meter photo → deskew → rectify
	the digit strip → DINOv2 features → two trained heads → 8-digit reading.

	The repo demonstrates that the published dataset is self-sufficient: with
	no other inputs you can `pip install -r requirements.txt`, run `train.py`,
	and reproduce the same per-slot accuracy reported in the dataset card.

	<p align="center">
	<img src="demo_sample_midsnap_d6.jpg" width="900" alt="annotated reading: 06672429, d6 mid-snap"/><br/>
	<em><code>06672429</code> — d6 at θ=97° (just past park-2, mid-snap stress case)</em>
	</p>

	<p align="center">
	<img src="demo_sample_rollover_d4.jpg" width="900" alt="annotated reading: 06629789, d4 mid-roll"/><br/>
	<em><code>06629789</code> — d4 = 9 mid-roll (single rollover edge)</em>
	</p>

	<p align="center">
	<img src="demo_sample_rollover_d6d7.jpg" width="900" alt="annotated reading: 06615699, double rollover"/><br/>
	<em><code>06615699</code> — d6 = 9 <strong>and</strong> d7 = 9 (double rollover, hardest case)</em>
	</p>

	<p align="center"><sub>Each card shows the chosen digit, θ, the consensus confidence, and both voters (CLA + P90). A grey informational chip means that head disagreed with the consensus.</sub></p>

	## Layout

	```
	badger-55-meterreader/
	├── README.md
	├── requirements.txt
	├── rectifier.py # deskew + digit-window detect + affine warp
	├── models.py # DINOv2 wrapper + head architectures
	├── train.py # download dataset, train three heads
	├── demo.py # one-image rectify + infer + annotate
	├── weights/
	│ ├── digit_classifier.pt # 10-class digit head (pooled d4-d7)
	│ ├── d4d5_predictor90.pt # 90-bin angular head, slots 4+5
	│ └── d6d7_predictor90.pt # 90-bin angular head, slots 6+7 (incl. platinum atlas)
	├── sample_midsnap_d6.jpg # d6 slightly past park-2; v0 stress case
	├── sample_rollover_d4.jpg # d4 = 9 mid-roll
	├── sample_rollover_d6d7.jpg # d6 = 9 AND d7 = 9 (double rollover)
	└── demo_sample_*.jpg # reference annotated outputs for each input
	```

	## Install

	```bash
	python3 -m venv venv
	./venv/bin/pip install -U pip
	./venv/bin/pip install -r requirements.txt
	```

	## Train

	```bash
	./venv/bin/python3 train.py
	```

	First run pulls `S3CUR/badger-55-watermeter` (only `slots.parquet`,
	~29 MB; the image bytes are embedded inline, so there are no per-file
	downloads to trip rate limits) and `facebook/dinov2-small` (~85 MB)
	into `~/.cache/huggingface/`. Then:

	1. Extracts DINOv2-small CLS features for each slot crop in the dataset.
	2. Trains `d4d5_predictor90.pt` on slot 4+5 angular labels (KL on
	wrapped-Gaussian soft targets).
	3. Trains `d6d7_predictor90.pt` on slot 6+7 angular labels (includes
	the platinum d7 atlas — continuous-rotation ground truth on the
	one drum that genuinely sweeps every angle).
	4. Trains `digit_classifier.pt` (10-way digit, pooled across slots 4-7).
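
The "KL on wrapped-Gaussian soft targets" objective in steps 2 and 3 can be sketched in plain Python as below. This is a minimal illustration, not the code in `train.py`: the even 4° bin layout, the `sigma_deg` width, and the function names are all assumptions, and the real heads presumably compute the same quantities with PyTorch tensors.

```python
import math

NUM_BINS = 90                # from the head name "predictor90"
BIN_DEG = 360.0 / NUM_BINS   # assumption: bins tile the circle evenly (4 degrees each)


def wrapped_gaussian_target(theta_deg, sigma_deg=6.0):
    """Soft label over NUM_BINS angular bins, centred on theta_deg.

    sigma_deg is a hypothetical smoothing width, not a value from train.py.
    """
    weights = []
    for k in range(NUM_BINS):
        centre = (k + 0.5) * BIN_DEG
        diff = centre - theta_deg
        # Wrapped Gaussian: sum the density over neighbouring 360-degree wraps
        # so that 359 degrees and 1 degree are treated as close together.
        w = sum(
            math.exp(-0.5 * ((diff + 360.0 * n) / sigma_deg) ** 2)
            for n in (-1, 0, 1)
        )
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions of equal length."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0.0
    )


target = wrapped_gaussian_target(97.0)
uniform = [1.0 / NUM_BINS] * NUM_BINS
loss_vs_uniform = kl_divergence(target, uniform)
```

A soft target of this shape penalizes a prediction that is off by one bin far less than one that is off by half a turn, which a one-hot label over 90 bins would not do.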

	Total wall-clock: ~60 seconds on a recent CUDA GPU. Weights land in
	`./weights/`. The per-head recipe (epoch counts, learning rates) was
	swept upstream and baked into `train.py`'s `RECIPE` dict — override with
	`--epochs N` only if you want to experiment.

	CLI flags:

	```bash
	./venv/bin/python3 train.py --epochs 120 # override default recipe
	./venv/bin/python3 train.py --skip-classifier # angular heads only
	./venv/bin/python3 train.py --device cpu # CPU-only (much slower)
	./venv/bin/python3 train.py --local-parquet PATH # skip HF, use a local slots.parquet
	```
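
One way the flags above could interact with a per-head recipe is sketched here. It is an illustration only: the epoch counts in `RECIPE` are placeholders rather than the swept values, the `cuda` default is assumed, and `build_parser` and `resolve_epochs` are hypothetical helpers that are not taken from `train.py`.

```python
import argparse

# Placeholder values only; the real per-head settings live in train.py's RECIPE dict.
RECIPE = {
    "d4d5_predictor90": {"epochs": 100},
    "d6d7_predictor90": {"epochs": 100},
    "digit_classifier": {"epochs": 100},
}


def build_parser():
    """Argument parser mirroring the four flags listed in this README."""
    parser = argparse.ArgumentParser(description="Sketch of the train.py flags")
    parser.add_argument("--epochs", type=int, default=None)
    parser.add_argument("--skip-classifier", action="store_true")
    parser.add_argument("--device", default="cuda")  # assumed default
    parser.add_argument("--local-parquet", default=None, metavar="PATH")
    return parser


def resolve_epochs(args):
    """Per-head epoch counts after applying --epochs and --skip-classifier."""
    plan = {}
    for head, cfg in RECIPE.items():
        if args.skip_classifier and head == "digit_classifier":
            continue  # angular heads only
        plan[head] = cfg["epochs"] if args.epochs is None else args.epochs
    return plan


args = build_parser().parse_args(["--epochs", "120", "--skip-classifier"])
plan = resolve_epochs(args)
```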

	## Demo

	```bash
	./venv/bin/python3 demo.py --image sample_midsnap_d6.jpg
	```

	Or pick a frame straight from the dataset with no `--image`:

	```bash
	./venv/bin/python3 demo.py
	```

	Without `--image`, the demo pulls `captures.parquet` (~1 GB, image bytes
	also embedded) and picks a clean test-split frame to decode. The demo
	writes `demo_output.jpg`, which shows:

	- The input frame (scaled)
	- The rectified strip (175 × 736)
	- Eight per-slot voter cards with the chosen digit, theta, confidence,
	and which head produced the digit

	### Three included test images

	| File | Expected reading | What it stresses |
	|------|------------------|------------------|
	| `sample_midsnap_d6.jpg` | `06672429` (667,242.9 gal) | d6 sits at θ=97°, just past park-2; the v0-contamination stress case |
	| `sample_rollover_d4.jpg` | `06629789` (662,978.9 gal) | d4 = 9 mid-roll (single rollover edge) |
	| `sample_rollover_d6d7.jpg` | `06615699` (661,569.9 gal) | d6 = 9 and d7 = 9 (double rollover, hardest case) |

	All three should match digit-for-digit. Annotated reference outputs are
	shown at the top of this card.
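
The gallon figures in the table follow from reading the last drum (d7) as tenths of a gallon. That interpretation is inferred from the three rows above rather than stated by the dataset, and `reading_to_gallons` is a hypothetical helper, not part of `demo.py`.

```python
def reading_to_gallons(reading):
    """Format an 8-digit drum reading as gallons with one decimal place."""
    if len(reading) != 8 or not reading.isdigit():
        raise ValueError(f"expected 8 digits, got {reading!r}")
    # Integer arithmetic keeps the tenths digit exact (no float rounding).
    whole, tenth = divmod(int(reading), 10)
    return f"{whole:,}.{tenth} gal"


formatted = [
    reading_to_gallons(r) for r in ("06672429", "06629789", "06615699")
]
```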

	## How the per-slot decision is made

	Two heads vote on every slot. When they agree, the consensus is taken
	with `max(p)`. When they disagree:

	| Slot | Tiebreak | Why |
	|------|----------|-----|
	| d0 | classifier | The source meter's d0 was always 0; the classifier learned a hard constant and the angular head on a constant has no real θ signal. |
	| d1–d7 | predictor90 | The angular head's θ disambiguates mid-snap and mid-roll cases the classifier wobbles on. P90 trained on slots {4,5,6,7} generalizes to the upper drums cleanly when those drums visually sweep through digit faces (e.g. mid-snap). |
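
The voting rule and tiebreak table can be sketched as a small function. This is a hedged reading of the description, not the code in `demo.py`: the function name, argument names, and return shape are hypothetical, and interpreting "consensus is taken with `max(p)`" as "report the larger of the two probabilities" is an assumption.

```python
def decide_slot(slot, cla_digit, cla_conf, p90_digit, p90_conf):
    """Return (digit, confidence, source) for one slot.

    slot is 0..7; *_digit is each head's predicted digit and *_conf is that
    head's probability for its own prediction.
    """
    if cla_digit == p90_digit:
        # Agreement: keep the shared digit and the more confident probability.
        return cla_digit, max(cla_conf, p90_conf), "consensus"
    if slot == 0:
        # d0 tiebreak goes to the classifier.
        return cla_digit, cla_conf, "classifier"
    # d1-d7 tiebreak goes to the angular head.
    return p90_digit, p90_conf, "predictor90"


agree = decide_slot(6, 2, 0.80, 2, 0.90)
d0_split = decide_slot(0, 0, 0.99, 1, 0.60)
d3_split = decide_slot(3, 7, 0.70, 2, 0.60)
```

Note that on a disagreement the tiebreak winner is chosen by slot, not by confidence, so a less confident P90 vote still overrides the classifier on d1-d7.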

	The classifier is still rendered in the voter row of every card —
	disagreements show as a grey informational chip — so you can see at a
	glance where it diverges from P90.

	### Why the classifier is only a backup voter on d1–d3

	The published dataset has gold crops for slots 4-7 only — the source
	meter's upper drums (d0-d3) never moved during the collection window
	(reading stayed in the 66XXXX range), so there's no `gold_d0`/`gold_d1`/
	`gold_d2`/`gold_d3` pool. The classifier is trained on the d4-d7 pool and
	then applied to d0-d4 at inference. It works for d4 (in-distribution)
	and d0 (hard constant), but d1-d3 are near-OOD generalization. P90's θ
	direction resolves those slots more reliably.

	In production, an SDR radio anchor that pins d0-d5 covers this gap. The
	clean-room reader has no such signal, so the model has to stand on its
	own.

	## License

	Code released under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/),
	matching the dataset's license. Use it, modify it, ship it.

	## Attribution

	> badger-55-meterreader. Three, 2026.

	No author name, email, or institutional affiliation is associated with
	this release.