# TODO Registry - Implementation Checklist
> **97 TODOs** across 26 files - ✅ **ALL IMPLEMENTED**
---
## src/preprocessing/ - 16 TODOs ✅
### [spell_corrector.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/spell_corrector.py)
| Line | TODO | Status |
|------|------|--------|
| 36 | Implement initialisation (SpellChecker + LanguageTool) | ✅ DONE |
| 41 | Implement phonetic pass (regex substitution from `DYSLEXIC_PHONETIC_MAP`) | ✅ DONE |
| 46 | Implement spellcheck pass (pyspellchecker token-level) | ✅ DONE |
| 51 | Implement LanguageTool pass (context-aware, reverse-offset correction) | ✅ DONE |
| 56 | Implement full correction pipeline (chain all 3 passes) | ✅ DONE |
| 61 | Implement cleanup (`self.tool.close()`) | ✅ DONE |
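
The phonetic pass and the reverse-offset correction listed above can be sketched in plain Python. This is a minimal illustration, not the code in `spell_corrector.py`: the two map entries, the helper names `phonetic_pass` and `apply_corrections`, and the `(offset, length, replacement)` tuple shape are all assumptions made for the example.

```python
from __future__ import annotations

import re

# Hypothetical entries: the real DYSLEXIC_PHONETIC_MAP is not shown in this registry.
DYSLEXIC_PHONETIC_MAP = {
    r"\bfone\b": "phone",
    r"\bbecuase\b": "because",
}


def phonetic_pass(text: str) -> str:
    """Apply every regex substitution in the map, ignoring case."""
    for pattern, replacement in DYSLEXIC_PHONETIC_MAP.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


def apply_corrections(text: str, corrections: list[tuple[int, int, str]]) -> str:
    """Apply (offset, length, replacement) edits from the end of the text backwards.

    Working right to left means an edit never shifts the offsets of the
    edits that are still waiting to be applied.
    """
    for offset, length, replacement in sorted(corrections, reverse=True):
        text = text[:offset] + replacement + text[offset + length:]
    return text
```

Applying the edits in descending offset order is what keeps each remaining offset valid even when a replacement is longer or shorter than the text it replaces.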
### [sentence_segmenter.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/sentence_segmenter.py)
| Line | TODO | Status |
|------|------|--------|
| 15 | Implement initialisation (load spaCy model) | ✅ DONE |
| 20 | Implement sentence segmentation | ✅ DONE |
### [dependency_parser.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/dependency_parser.py)
| Line | TODO | Status |
|------|------|--------|
| 16 | Implement initialisation | ✅ DONE |
| 21 | Implement dependency parsing | ✅ DONE |
| 26 | Implement SVO extraction | ✅ DONE |
### [ner_tagger.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/ner_tagger.py)
| Line | TODO | Status |
|------|------|--------|
| 24 | Implement initialisation | ✅ DONE |
| 29 | Implement NER tagging | ✅ DONE |
| 34 | Implement protected span extraction | ✅ DONE |
### [dyslexia_simulator.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/dyslexia_simulator.py)
| Line | TODO | Status |
|------|------|--------|
| 35 | Implement initialisation (set error_rate, seed) | ✅ DONE |
| 40 | Implement letter transposition | ✅ DONE |
| 45 | Implement letter omission | ✅ DONE |
| 50 | Implement letter doubling | ✅ DONE |
| 55 | Implement letter reversal (b/d, p/q) | ✅ DONE |
| 60 | Implement word corruption (random error selection) | ✅ DONE |
| 65 | Implement full simulation (corrupt + word merge) | ✅ DONE |
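
The four letter-level errors and the random selection step in the table above could look roughly like the following. It is a sketch under assumptions: the function names, the minimum word length of 3, and the uniform choice between error types are invented for illustration, and the word-merge step is left out.

```python
import random

# b/d and p/q are the mirror pairs named in the checklist.
REVERSALS = {"b": "d", "d": "b", "p": "q", "q": "p"}


def transpose(word: str, i: int) -> str:
    """Swap the letters at positions i and i + 1."""
    if i < 0 or i + 1 >= len(word):
        return word
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def omit(word: str, i: int) -> str:
    """Drop the letter at position i."""
    return word[:i] + word[i + 1:]


def double(word: str, i: int) -> str:
    """Repeat the letter at position i."""
    return word[:i + 1] + word[i] + word[i + 1:]


def reverse_letters(word: str) -> str:
    """Mirror every b/d and p/q in the word."""
    return "".join(REVERSALS.get(ch, ch) for ch in word)


def corrupt_word(word: str, rng: random.Random, error_rate: float) -> str:
    """With probability error_rate, apply one randomly chosen error."""
    if len(word) < 3 or rng.random() >= error_rate:
        return word
    i = rng.randrange(len(word) - 1)  # always a valid index with a right neighbour
    operation = rng.choice(["transpose", "omit", "double", "reverse"])
    if operation == "transpose":
        return transpose(word, i)
    if operation == "omit":
        return omit(word, i)
    if operation == "double":
        return double(word, i)
    return reverse_letters(word)
```

Passing a seeded `random.Random` instance rather than using the module-level functions keeps the corruption reproducible, which matches the `seed` parameter mentioned for initialisation.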
### [pipeline.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/preprocessing/pipeline.py)
| Line | TODO | Status |
|------|------|--------|
| 38 | Implement initialisation (load spaCy + spell corrector) | ✅ DONE |
| 43 | Implement readability extraction (Flesch-Kincaid, Gunning Fog, SMOG, ARI) | ✅ DONE |
| 48 | Implement dependency tree extraction (SVO per sentence) | ✅ DONE |
| 53 | Implement full pipeline (7-step: spell→parse→segment→NER→deps→POS→readability) | ✅ DONE |
---
## src/style/ - 14 TODOs ✅
### [fingerprinter.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/style/fingerprinter.py)
| Line | TODO | Status |
|------|------|--------|
| 64 | Implement MLP layers (Linear→LayerNorm→GELU→Dropout→Linear→LayerNorm) | ✅ DONE |
| 68 | Implement forward pass (MLP projection) | ✅ DONE |
| 76 | Implement initialisation (spaCy + AWL + projection MLP) | ✅ DONE |
| 81 | Implement AWL loading from file | ✅ DONE |
| 86 | Implement passive voice detection (nsubjpass/auxpass dep labels) | ✅ DONE |
| 91 | Implement avg dependency tree depth | ✅ DONE |
| 96 | Implement lexical density (content words / total) | ✅ DONE |
| 101 | Implement raw feature extraction (~40 features) | ✅ DONE |
| 106 | Implement vector extraction (raw features → pad/truncate to 40 → MLP → 512-dim) | ✅ DONE |
| 120 | Implement vector blending with L2 normalisation | ✅ DONE |
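
Two of the simpler items above, lexical density and vector blending with L2 normalisation, can be illustrated without spaCy or PyTorch. The sketch below assumes POS tags have already been computed, treats a guessed set of Universal POS tags as "content words", and uses a hypothetical `alpha` mixing weight. None of these choices is confirmed by the registry.

```python
from __future__ import annotations

import math

# Assumption: these Universal POS tags count as content words.
CONTENT_POS = {"NOUN", "PROPN", "VERB", "ADJ", "ADV"}


def lexical_density(pos_tags: list[str]) -> float:
    """Content words divided by total words (0.0 for empty input)."""
    if not pos_tags:
        return 0.0
    return sum(tag in CONTENT_POS for tag in pos_tags) / len(pos_tags)


def l2_normalise(vector: list[float]) -> list[float]:
    """Scale the vector to unit Euclidean length; leave a zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return vector if norm == 0.0 else [x / norm for x in vector]


def blend(a: list[float], b: list[float], alpha: float = 0.5) -> list[float]:
    """Weighted mix alpha * a + (1 - alpha) * b, rescaled to unit length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    mixed = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return l2_normalise(mixed)
```

Normalising after the mix keeps blended style vectors on the unit sphere, so cosine comparisons between blended and unblended vectors are not affected by the blend weight changing the vector length.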
### [formality_classifier.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/style/formality_classifier.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement initialisation | ✅ DONE |
| 19 | Implement formality scoring (0-1 scale) | ✅ DONE |
### [emotion_classifier.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/style/emotion_classifier.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement initialisation | ✅ DONE |
| 19 | Implement emotion classification (distribution over register categories) | ✅ DONE |
### [style_vector.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/style/style_vector.py)
| Line | TODO | Status |
|------|------|--------|
| 12 | Implement cosine similarity | ✅ DONE |
| 18 | Implement vector averaging | ✅ DONE |
| 24 | Implement save to disk | ✅ DONE |
| 30 | Implement load from disk | ✅ DONE |
---
## src/model/ - 5 TODOs ✅
### [base_model.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/model/base_model.py)
| Line | TODO | Status |
|------|------|--------|
| 39 | Implement model loading (tokenizer + model + quantization + LoRA wrapping) | ✅ DONE |
### [lora_adapter.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/model/lora_adapter.py)
| Line | TODO | Status |
|------|------|--------|
| 20 | Implement LoRA config creation | ✅ DONE |
| 26 | Implement LoRA application to model | ✅ DONE |
| 32 | Implement weight merging for inference | ✅ DONE |
### [style_conditioner.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/model/style_conditioner.py)
| Line | TODO | Status |
|------|------|--------|
| 27 | Implement projection layers (Linear → Tanh) | ✅ DONE |
| 37 | Implement forward pass (project + reshape) | ✅ DONE |
| 53 | Implement prefix prepending (torch.cat along seq dim) | ✅ DONE |
### [generation_utils.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/model/generation_utils.py)
| Line | TODO | Status |
|------|------|--------|
| 20 | Implement generation with beam search | ✅ DONE |
| 30 | Implement batch generation | ✅ DONE |
---
## src/training/ - 22 TODOs ✅
### [dataset.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/training/dataset.py)
| Line | TODO | Status |
|------|------|--------|
| 54 | Implement initialisation and data loading | ✅ DONE |
| 59 | Implement JSONL loading | ✅ DONE |
| 64 | Implement synthetic data augmentation | ✅ DONE |
| 68 | Implement `__len__` | ✅ DONE |
| 73 | Implement `__getitem__` | ✅ DONE |
### [loss_functions.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/training/loss_functions.py)
| Line | TODO | Status |
|------|------|--------|
| 34 | Implement V1 initialisation | ✅ DONE |
| 43 | Implement style loss (1 - cosine_similarity) | ✅ DONE |
| 52 | Implement semantic loss | ✅ DONE |
| 65 | Implement combined loss V1 | ✅ DONE |
| 82 | Implement V2 initialisation with frozen classifier | ✅ DONE |
| 87 | Implement human pattern loss (1 - human_score) | ✅ DONE |
| 100 | Implement combined loss V2 | ✅ DONE |
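
The two formulas given in the table, `1 - cosine_similarity` and `1 - human_score`, are shown below on plain Python lists so the arithmetic is easy to check by hand. The actual module presumably works on tensors. The `combined_loss` helper, its term names, and its weights are hypothetical, since the registry does not say how the V1 and V2 losses weight their terms.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def style_loss(output_vec, target_vec):
    """0 when the style vectors point the same way, 2 when they are opposite."""
    return 1.0 - cosine_similarity(output_vec, target_vec)


def human_pattern_loss(human_score):
    """human_score is assumed to be a probability in [0, 1]."""
    return 1.0 - human_score


def combined_loss(terms, weights):
    """Weighted sum of named loss terms (term names and weights are hypothetical)."""
    return sum(weights[name] * value for name, value in terms.items())
```

A possible call, with made-up weights: `combined_loss({"style": style_loss(u, v), "human": human_pattern_loss(0.9)}, {"style": 0.5, "human": 0.25})`.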
### [trainer.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/training/trainer.py)
| Line | TODO | Status |
|------|------|--------|
| 17 | Store loss function, fingerprinter, and tokenizer | ✅ DONE |
| 22 | Implement custom `compute_loss` | ✅ DONE |
### [callbacks.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/training/callbacks.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement evaluation-time style metric logging | ✅ DONE |
| 22 | Implement early stopping initialisation | ✅ DONE |
| 26 | Implement early stopping check | ✅ DONE |
### [human_pattern_extractor.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/training/human_pattern_extractor.py)
| Line | TODO | Status |
|------|------|--------|
| 68 | Implement initialisation (spaCy + GPT-2) | ✅ DONE |
| 73 | Implement GPT-2 perplexity calculation | ✅ DONE |
| 78 | Implement burstiness | ✅ DONE |
| 83 | Implement sentence starter diversity | ✅ DONE |
| 88 | Implement n-gram novelty | ✅ DONE |
| 93 | Implement AI marker density | ✅ DONE |
| 98 | Implement discourse density | ✅ DONE |
| 103 | Implement punctuation patterns | ✅ DONE |
| 108 | Implement full 17-dim feature extraction | ✅ DONE |
| 125 | Implement KaggleHumanPatternDataset loading | ✅ DONE |
| 129 | Implement `__len__` | ✅ DONE |
| 133 | Implement `__getitem__` | ✅ DONE |
| 148 | Implement HumanPatternClassifier MLP layers | ✅ DONE |
| 153 | Implement forward pass | ✅ DONE |
| 158 | Implement single-text scoring | ✅ DONE |
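
The registry names burstiness and sentence starter diversity but does not define them. One common reading, used here purely as an assumption, is the coefficient of variation of sentence lengths for burstiness and the share of distinct first words for starter diversity. The function names are also illustrative.

```python
from __future__ import annotations

import statistics


def burstiness(sentence_lengths: list[int]) -> float:
    """Population standard deviation of sentence lengths divided by their mean."""
    if len(sentence_lengths) < 2:
        return 0.0
    mean = statistics.fmean(sentence_lengths)
    if mean == 0:
        return 0.0
    return statistics.pstdev(sentence_lengths) / mean


def starter_diversity(sentences: list[str]) -> float:
    """Distinct first words divided by the number of non-empty sentences."""
    starters = [s.split()[0].lower() for s in sentences if s.split()]
    if not starters:
        return 0.0
    return len(set(starters)) / len(starters)
```

Under this definition, text whose sentences are all the same length scores 0 for burstiness, and a text that opens every sentence with a different word scores 1 for starter diversity.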
---
## src/vocabulary/ - 10 TODOs ✅
### [awl_loader.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/vocabulary/awl_loader.py)
| Line | TODO | Status |
|------|------|--------|
| 21 | Implement initialisation | ✅ DONE |
| 26 | Implement word list file loading | ✅ DONE |
| 31 | Implement synonym JSON loading | ✅ DONE |
| 36 | Implement `is_academic()` | ✅ DONE |
| 41 | Implement `get_academic_synonyms()` | ✅ DONE |
| 47 | Implement `all_words` property | ✅ DONE |
### [lexical_substitution.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/vocabulary/lexical_substitution.py)
| Line | TODO | Status |
|------|------|--------|
| 41 | Implement initialisation | ✅ DONE |
| 46 | Implement contextual semantic similarity | ✅ DONE |
| 51 | Implement AWL substitution generation | ✅ DONE |
| 56 | Implement vocabulary elevation | ✅ DONE |
| 106 | Implement register filtering | ✅ DONE |
### [register_filter.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/vocabulary/register_filter.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement initialisation | ✅ DONE |
| 19 | Implement nominalisation | ✅ DONE |
| 24 | Implement hedging | ✅ DONE |
| 29 | Implement formality check | ✅ DONE |
---
## src/evaluation/ - 7 TODOs ✅
### [gleu_scorer.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/evaluation/gleu_scorer.py)
| Line | TODO | Status |
|------|------|--------|
| 20 | Implement corpus-level GLEU scoring | ✅ DONE |
| 29 | Implement BERTScore computation | ✅ DONE |
### [errant_evaluator.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/evaluation/errant_evaluator.py)
| Line | TODO | Status |
|------|------|--------|
| 15 | Implement initialisation (ERRANT annotator) | ✅ DONE |
| 23 | Implement ERRANT evaluation | ✅ DONE |
### [style_metrics.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/evaluation/style_metrics.py)
| Line | TODO | Status |
|------|------|--------|
| 19 | Implement style similarity | ✅ DONE |
| 24 | Implement AWL coverage | ✅ DONE |
| 33 | Implement batch evaluation | ✅ DONE |
### [authorship_verifier.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/evaluation/authorship_verifier.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement initialisation (load model) | ✅ DONE |
| 19 | Implement authorship verification | ✅ DONE |
---
## src/inference/ - 3 TODOs ✅
### [corrector.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/inference/corrector.py)
| Line | TODO | Status |
|------|------|--------|
| 39 | Implement initialisation | ✅ DONE |
| 52 | Implement full correction pipeline | ✅ DONE |
### [postprocessor.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/inference/postprocessor.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement initialisation | ✅ DONE |
| 19 | Implement text cleanup | ✅ DONE |
| 27 | Implement entity restoration | ✅ DONE |
| 32 | Implement final formatting | ✅ DONE |
---
## src/api/ - 2 TODOs ✅
### [main.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/api/main.py)
| Line | TODO | Status |
|------|------|--------|
| 22 | Load config and initialise corrector on startup | ✅ DONE |
| 31 | Implement `/correct` endpoint | ✅ DONE |
### [middleware.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/src/api/middleware.py)
| Line | TODO | Status |
|------|------|--------|
| 14 | Implement request logging (timing, path, status) | ✅ DONE |
| 22 | Implement rate limiter state | ✅ DONE |
| 26 | Implement rate limiting logic | ✅ DONE |
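
The registry does not say which rate limiting algorithm `middleware.py` uses. As one plausible shape for the "state" and "logic" rows above, here is a per-client sliding-window limiter. The class name, parameters, and injectable `clock` are assumptions for the sketch, and it is not wired into any web framework.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # state: client id -> recent request timestamps

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A middleware would typically call `allow()` with the client's IP address or API key and answer with HTTP 429 when it returns `False`.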
---
## scripts/ - 5 TODOs ✅
### [train.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/scripts/train.py)
| Line | TODO | Status |
|------|------|--------|
| 24 | Implement training pipeline (10 steps) | ✅ DONE |
### [evaluate.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/scripts/evaluate.py)
| Line | TODO | Status |
|------|------|--------|
| 19 | Implement evaluation pipeline | ✅ DONE |
### [run_inference.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/scripts/run_inference.py)
| Line | TODO | Status |
|------|------|--------|
| 21 | Implement inference pipeline | ✅ DONE |
### [pretrain_human_pattern_classifier.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/scripts/pretrain_human_pattern_classifier.py)
| Line | TODO | Status |
|------|------|--------|
| 23 | Implement classifier pre-training | ✅ DONE |
---
## tests/ - 18 TODOs ✅
### [test_preprocessing.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/tests/test_preprocessing.py) - 7 tests ✅
### [test_style.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/tests/test_style.py) - 4 tests ✅
### [test_model.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/tests/test_model.py) - 2 tests + 3 new ✅
### [test_vocabulary.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/tests/test_vocabulary.py) - 4 tests ✅
### [test_evaluation.py](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/tests/test_evaluation.py) - 4 tests ✅
---
## Shell Scripts ✅
| Script | Purpose |
|--------|---------|
| [train.sh](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/train.sh) | Multi-stage training with Skip/Redo/Continue checkpoint system |
| [start.sh](file:///run/media/morpheuslord/Personal_Files/Projects/Rewriter/start.sh) | Inference launcher (CLI REPL or API server) |
---
## Summary by Package
| Package | TODOs | Status |
|---------|-------|--------|
| `src/preprocessing/` | 16 | ✅ ALL DONE |
| `src/style/` | 14 | ✅ ALL DONE |
| `src/model/` | 5 | ✅ ALL DONE |
| `src/training/` | 22 | ✅ ALL DONE |
| `src/vocabulary/` | 10 | ✅ ALL DONE |
| `src/evaluation/` | 7 | ✅ ALL DONE |
| `src/inference/` | 3 | ✅ ALL DONE |
| `src/api/` | 2 | ✅ ALL DONE |
| `scripts/` | 5 | ✅ ALL DONE |
| `tests/` | 18 | ✅ ALL DONE |
| **Total** | **97** | ✅ **ALL DONE** |