Edwin Jose Palathinkal committed on
Commit a099fb0 · 1 Parent(s): 0b0d6ed

Add CHANGELOG, HuggingFace README, and upload script for v2.0

Files changed (3)
  1. CHANGELOG.md +56 -0
  2. README_HF.md +114 -0
  3. upload.sh +71 -0
CHANGELOG.md ADDED
@@ -0,0 +1,56 @@
+ # Changelog
+
+ All notable changes to the Namer project will be documented in this file.
+
+ ## [2.0.0] - 2025-05-09
+
+ ### Added
+ - Support for numbers up to 999,999,999,999 (just under one trillion), increased from 999,999
+ - Stratified sampling during training for balanced representation across number scales
+ - Extended max output length from 20 to 25 tokens
+ - Extended max sequence length from 20 to 25 tokens
+ - Special case handling for zero in inference
+ - New test cases for billion and trillion ranges
+
+ ### Changed
+ - `InfiniteNamerDataset` now uses stratified sampling by default
+ - Default `max_int` changed from 999,999 to 999,999,999,999
+ - Training now samples equally across: units, thousands, millions, billions, trillions
+ - Model architecture unchanged but supports longer outputs
+
+ ### Fixed
+ - Small numbers (under 1M) now work correctly with large-range model
+ - Zero is now handled as a special case to prevent token repetition
+
+ ### Technical Details
+ - Training uses 5 stratified buckets (20% each):
+   - 0-999 (units)
+   - 1,000-999,999 (thousands)
+   - 1M-999M (millions)
+   - 1B-999B (billions)
+   - 1T-999T (trillions)
+ - Validation accuracy: >99.9%
+ - Model parameters: ~869K
+
+ ## [1.0.0] - 2025-05-08
+
+ ### Added
+ - Initial release
+ - Support for numbers 0-999,999 (just under one million)
+ - Transformer-based sequence-to-sequence model
+ - HuggingFace Transformers integration
+ - PyTorch native model format
+ - Interactive inference mode
+ - Training pipeline with infinite dataset
+
+ ### Features
+ - 41-token vocabulary (number words + EOS)
+ - 20-token max output length
+ - 20-digit max input sequence length
+ - 4-layer transformer encoder
+ - Cross-attention mechanism with learned queries
+
+ ---
+
+ [2.0.0]: https://github.com/edwinhere/namer/compare/v1.0.0...v2.0.0
+ [1.0.0]: https://github.com/edwinhere/namer/releases/tag/v1.0.0
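The five-bucket scheme listed under Technical Details can be sketched in a few lines of Python. This is an illustration only, not the project's `InfiniteNamerDataset`: the names `BUCKETS`, `sample_stratified`, and `bucket_of` are hypothetical, and the bounds are copied from the changelog as written. Note that the trillions bucket as listed extends past the stated default `max_int` of 999,999,999,999, so the bounds used in the real training code may differ.

```python
import random

# Inclusive (low, high) bounds per scale, copied from the changelog as written.
# Assumption: "999M", "999B", and "999T" mean the largest value of that scale.
BUCKETS = {
    "units": (0, 999),
    "thousands": (1_000, 999_999),
    "millions": (1_000_000, 999_999_999),
    "billions": (1_000_000_000, 999_999_999_999),
    "trillions": (1_000_000_000_000, 999_999_999_999_999),
}


def sample_stratified(rng: random.Random) -> int:
    """Pick a bucket uniformly (20% each), then a value uniformly inside it."""
    low, high = rng.choice(list(BUCKETS.values()))
    return rng.randint(low, high)


def bucket_of(n: int) -> str:
    """Return the name of the bucket that contains n."""
    for name, (low, high) in BUCKETS.items():
        if low <= n <= high:
            return name
    raise ValueError(f"{n} is outside every bucket")


if __name__ == "__main__":
    rng = random.Random(0)
    counts = {name: 0 for name in BUCKETS}
    for _ in range(10_000):
        counts[bucket_of(sample_stratified(rng))] += 1
    print(counts)  # each bucket receives roughly 2,000 of the 10,000 draws
```

For comparison, a single uniform draw from 0 to 999,999,999,999 lands below 1,000,000 with probability 10^6 / 10^12, or 0.0001%, which is why uniform sampling starves the small-number buckets.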
README_HF.md ADDED
@@ -0,0 +1,114 @@
+ ---
+ language: en
+ license: mit
+ library_name: pytorch
+ tags:
+ - text-generation
+ - number-to-text
+ - pytorch
+ - transformer
+ - stratified-sampling
+ pipeline_tag: text-generation
+ ---
+
+ # Namer
+
+ A PyTorch transformer model that converts **integers to their English names**, now supporting numbers up to **999,999,999,999** (just under one trillion).
+
+ ## Quick Start
+
+ ```python
+ from transformers import AutoModel
+ from namer import NamerPipeline
+
+ # Load model
+ model = AutoModel.from_pretrained(
+     "edwinhere/namer",
+     trust_remote_code=True
+ )
+
+ # Create pipeline
+ pipe = NamerPipeline(model)
+
+ # Generate number names
+ print(pipe.generate(42)) # "forty two"
+ print(pipe.generate(1234567890)) # "one billion two hundred thirty four million..."
+ print(pipe.generate(999999999999)) # "nine hundred ninety nine billion..."
+ ```
+
+ ## Model Description
+
+ Namer is a sequence-to-sequence transformer that reads the digits of a number and generates its English name.
+
+ ### Key Features
+
+ - 🎯 **Stratified Training**: Balanced sampling across number scales ensures accurate performance on both small and large numbers
+ - 📈 **Large Range**: Handles numbers from 0 to ~1 trillion (12 digits)
+ - 🚀 **Fast Inference**: Single forward pass, no autoregressive generation needed
+ - 🎓 **High Accuracy**: >99.9% validation accuracy
+
+ ### Example Conversions
+
+ | Integer | English Name |
+ |---------|-------------|
+ | 0 | zero |
+ | 42 | forty two |
+ | 123 | one hundred twenty three |
+ | 1000 | one thousand |
+ | 999999 | nine hundred ninety nine thousand nine hundred ninety nine |
+ | 1234567890 | one billion two hundred thirty four million five hundred sixty seven thousand eight hundred ninety |
+ | 999999999999 | nine hundred ninety nine billion nine hundred ninety nine million nine hundred ninety nine thousand nine hundred ninety nine |
+
+ ## Architecture
+
+ - **Type**: Transformer encoder with learned queries and cross-attention
+ - **Parameters**: ~869K
+ - **Vocabulary**: 41 tokens (number words + EOS)
+ - **Max Output Length**: 25 tokens
+ - **Input**: Digit sequences (0-9 + padding)
+
+ ## Training Details
+
+ - **Dataset**: Infinite stratified sampling across 5 scales (units, thousands, millions, billions, trillions)
+ - **Optimizer**: Adam (lr=0.001)
+ - **Epochs**: 30 with early stopping (patience=10)
+ - **Hardware**: NVIDIA RTX 3070
+ - **Validation Accuracy**: >99.9%
+
+ ### Why Stratified Sampling?
+
+ With uniform random sampling from 0 to 999,999,999,999, about 99.9999% of samples would be 1M or larger, so the model would almost never see small numbers and would fail on them. Stratified sampling gives each magnitude equal representation (20% each), ensuring robust performance across the entire range.
+
+ ## Version History
+
+ **v2.0 (Current)**
+ - Range: 0 to 999,999,999,999 (just under one trillion)
+ - Stratified sampling for balanced training
+ - Max output length: 25 tokens
+
+ **v1.0**
+ - Range: 0 to 999,999 (just under one million)
+ - Uniform random sampling
+ - Max output length: 20 tokens
+
+ ## Limitations
+
+ - Maximum: 999,999,999,999 (12 digits)
+ - Negative numbers are not supported (the absolute value is used)
+ - Decimal and fractional numbers are not supported
+
+ ## Citation
+
+ ```bibtex
+ @software{namer,
+   author = {Edwin Jose Palathinkal},
+   title = {Namer: Integer to English Name Converter},
+   url = {https://huggingface.co/edwinhere/namer},
+   year = {2025}
+ }
+ ```
+
+ ## Links
+
+ - GitHub: https://github.com/edwinhere/namer
+ - HuggingFace: https://huggingface.co/edwinhere/namer
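A rule-based converter is a convenient reference for spot-checking the model against the Example Conversions table. The sketch below is not part of the `namer` package; `int_to_words` is a hypothetical helper that assumes the table's output style (space-separated words, no hyphens, no "and") and the 12-digit limit stated under Limitations.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
SCALES = ["", "thousand", "million", "billion"]


def _below_thousand(n: int) -> list[str]:
    """Name a value in 1..999 as a list of words."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words += [ONES[hundreds], "hundred"]
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        words.append(TENS[tens])
        if ones:
            words.append(ONES[ones])
    elif rest:
        words.append(ONES[rest])
    return words


def int_to_words(n: int) -> str:
    """Return the English name of n for 0 <= n <= 999,999,999,999."""
    if not 0 <= n <= 999_999_999_999:
        raise ValueError("n must be between 0 and 999,999,999,999")
    if n == 0:
        return "zero"
    # Split into three-digit groups, least significant first.
    groups = []
    while n:
        n, group = divmod(n, 1000)
        groups.append(group)
    words = []
    for index in reversed(range(len(groups))):
        if groups[index]:
            words += _below_thousand(groups[index])
            if SCALES[index]:
                words.append(SCALES[index])
    return " ".join(words)


if __name__ == "__main__":
    for value in (0, 42, 123, 1000, 1234567890):
        print(value, "->", int_to_words(value))
```

Comparing `pipe.generate(n)` with `int_to_words(n)` over random inputs is one way to measure exact-match accuracy without a labelled dataset.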
upload.sh ADDED
@@ -0,0 +1,71 @@
+ #!/bin/bash
+ # Upload script for Namer model v2.0
+ # This script pushes the updated model to GitHub and HuggingFace
+
+ set -e
+
+ echo "=== Namer v2.0 Upload Script ==="
+ echo ""
+
+ # Colors for output
+ GREEN='\033[0;32m'
+ YELLOW='\033[1;33m'
+ NC='\033[0m' # No Color
+
+ # Step 1: Verify files exist
+ echo -e "${YELLOW}Step 1: Verifying files...${NC}"
+ required_files=("README.md" "CHANGELOG.md" "config.json" "model.safetensors" "modeling_namer.py" "namer_model.pt")
+ for file in "${required_files[@]}"; do
+     if [ ! -f "$file" ]; then
+         echo "ERROR: Required file '$file' not found!"
+         exit 1
+     fi
+     echo " ✓ $file"
+ done
+
+ # Step 2: Run tests
+ echo ""
+ echo -e "${YELLOW}Step 2: Running tests...${NC}"
+ source .venv/bin/activate
+ python -m namer test
+
+ # Step 3: Copy HF README
+ echo ""
+ echo -e "${YELLOW}Step 3: Preparing HuggingFace README...${NC}"
+ cp README_HF.md README.md.tmp
+ cp README.md README.md.git
+ cp README_HF.md README.md
+ echo " ✓ Copied README_HF.md to README.md for HF upload"
+
+ # Step 4: Commit and push to GitHub
+ echo ""
+ echo -e "${YELLOW}Step 4: Pushing to GitHub...${NC}"
+ git add -A
+ git commit -m "Namer v2.0: Support for trillions with stratified training
+
+ - Extended range from millions to trillions (0-999,999,999,999)
+ - Added stratified sampling for balanced training across scales
+ - Increased max_output_len from 20 to 25 tokens
+ - Updated documentation and added CHANGELOG
+ - All tests passing"
+ git push origin main
+ echo " ✓ Pushed to GitHub"
+
+ # Step 5: Push to HuggingFace
+ echo ""
+ echo -e "${YELLOW}Step 5: Pushing to HuggingFace...${NC}"
+ git push hf main
+ echo " ✓ Pushed to HuggingFace"
+
+ # Step 6: Restore GitHub README (README.md.git is the backup made in Step 3)
+ echo ""
+ echo -e "${YELLOW}Step 6: Restoring GitHub README...${NC}"
+ mv README.md.git README.md && rm -f README.md.tmp
+ echo " ✓ Restored"
+
+ echo ""
+ echo -e "${GREEN}=== Upload Complete! ===${NC}"
+ echo ""
+ echo "Model is now available at:"
+ echo " - GitHub: https://github.com/edwinhere/namer"
+ echo " - HuggingFace: https://huggingface.co/edwinhere/namer"
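Steps 3 and 6 temporarily replace `README.md` with the HuggingFace version and put the original back afterwards. Because the script runs under `set -e`, a failure in between (for example a rejected push) exits before the restore step runs. The swap-and-restore pattern can be made unconditional; the sketch below shows the idea in Python, where `swapped_file` is a hypothetical helper and not something the repository provides. In the shell script itself, a `trap` on `EXIT` would serve the same purpose.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


@contextlib.contextmanager
def swapped_file(target: Path, replacement: Path):
    """Temporarily overwrite `target` with `replacement`, restoring it on exit."""
    # Keep the backup outside the working tree so `git add -A` cannot pick it up.
    backup_dir = Path(tempfile.mkdtemp())
    backup = backup_dir / target.name
    shutil.copy2(target, backup)
    try:
        shutil.copy2(replacement, target)
        yield target
    finally:
        # Runs on success and on any exception raised inside the `with` body.
        shutil.copy2(backup, target)
        shutil.rmtree(backup_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        readme = Path(workdir) / "README.md"
        readme_hf = Path(workdir) / "README_HF.md"
        readme.write_text("GitHub README\n")
        readme_hf.write_text("HuggingFace README\n")
        with swapped_file(readme, readme_hf):
            print(readme.read_text(), end="")  # HuggingFace README
        print(readme.read_text(), end="")      # GitHub README
```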