Edwin Jose Palathinkal committed · Commit d46a06a · 1 Parent(s): 4b0dbc8

Add AGENTS.md with publishing procedures for GitHub and HuggingFace

  1. AGENTS.md +216 -0
AGENTS.md ADDED
@@ -0,0 +1,216 @@
+ # Agent Publishing Guide
+
+ This document explains how to publish changes to both GitHub and HuggingFace repositories.
+
+ ## Repository Structure
+
+ This project is mirrored on two platforms:
+
+ | Platform | Remote Name | URL | Purpose |
+ |----------|-------------|-----|---------|
+ | **GitHub** | `origin` | `git@github.com:edwinhere/namer.git` | Source code, issues, development |
+ | **HuggingFace** | `hf` | `https://huggingface.co/edwinhere/namer` | Model distribution, inference API |
+
+ ## Initial Setup
+
+ ```bash
+ # Clone from GitHub (primary development repo)
+ git clone git@github.com:edwinhere/namer.git
+ cd namer
+
+ # Add HuggingFace as a second remote
+ git remote add hf https://huggingface.co/edwinhere/namer
+
+ # Verify remotes
+ git remote -v
+ ```
+
+ ## File Storage Configuration
+
+ Different files use different storage backends:
+
+ | File | GitHub | HuggingFace |
+ |------|--------|-------------|
+ | `*.py`, `*.md`, `*.json` | Git | Git |
+ | `namer_model.pt` | Git LFS | Git LFS |
+ | `model.safetensors` | Git LFS | Xet (via LFS pointer) |
+
+ ### Git Attributes (`.gitattributes`)
+
+ ```gitattributes
+ # For GitHub: namer_model.pt uses git-lfs
+ *.pt filter=lfs diff=lfs merge=lfs -text
+
+ # For HuggingFace: model.safetensors uses Xet for faster downloads
+ model.safetensors filter=lfs diff=lfs merge=lfs -text
+ ```
+
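To confirm that the patterns above actually route a given file through LFS, `git check-attr` can be queried directly. The sketch below wraps that query in a helper; the name `is_lfs_tracked` is an assumption made for illustration and is not part of the repository.

```bash
# Hypothetical helper: succeeds when the path's `filter` attribute is `lfs`.
is_lfs_tracked() {
  # `git check-attr filter -- <path>` prints "<path>: filter: <value>";
  # strip everything up to the value and compare it with "lfs".
  [ "$(git check-attr filter -- "$1" | sed 's/.*: filter: //')" = "lfs" ]
}
```

In a checkout that contains the `.gitattributes` shown above, `is_lfs_tracked model.safetensors && echo tracked` should print `tracked`, while a plain source file should not match.
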
+ ## Publishing Workflow
+
+ ### 1. Make Changes
+
+ Edit files normally, then commit:
+
+ ```bash
+ git add <files>
+ git commit -m "Description of changes"
+ ```
+
+ ### 2. Push to GitHub (Origin)
+
+ ```bash
+ git push origin main
+ ```
+
+ This uploads:
+ - All code files to GitHub
+ - LFS objects (`namer_model.pt`, `model.safetensors`)
+
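+ ### 3. Push to HuggingFace
+
+ ```bash
+ GIT_LFS_SKIP_SMUDGE=1 git push hf main
+ ```
+
+ **Why `GIT_LFS_SKIP_SMUDGE=1`?**
+
+ - HuggingFace stores `model.safetensors` in its **Xet** backend (faster than LFS); the git history holds only an LFS pointer file for it
+ - `GIT_LFS_SKIP_SMUDGE=1` is an environment variable that disables the LFS smudge filter, the step that downloads LFS objects when files are checked out
+ - With it set, a checkout leaves pointer files in place instead of requesting objects from HF that may not exist
+ - The smudge filter runs on checkout, not on `git push`, so on a plain push the variable is harmless but does not change what is uploaded
+ - HF's Xet backend then serves the actual file content
+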
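The pointer file mentioned above is a small text file in the Git LFS pointer format, whose first line is `version https://git-lfs.github.com/spec/v1`. The following sketch checks whether a working-tree file is still a pointer (as it would be after a checkout with `GIT_LFS_SKIP_SMUDGE=1`) rather than the real weights. `is_lfs_pointer` is a hypothetical helper, not something the repository provides.

```bash
# Hypothetical helper: succeeds when the file starts with the LFS pointer header.
is_lfs_pointer() {
  # The header line is exactly 42 bytes long, so read only that much;
  # this avoids scanning a large binary weights file.
  [ -f "$1" ] &&
    head -c 42 -- "$1" | grep -qxF 'version https://git-lfs.github.com/spec/v1'
}
```

A possible use is `is_lfs_pointer model.safetensors && echo "pointer only, weights not downloaded"` before trying to load the model locally.
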
+ ### 4. Upload Safetensors to HuggingFace (if updated)
+
+ If `model.safetensors` changed, use the HF CLI for Xet upload:
+
+ ```bash
+ # Upload via HF CLI (uses Xet for fast transfers)
+ hf upload edwinhere/namer model.safetensors model.safetensors \
+   --commit-message "Update model weights vX.Y"
+ ```
+
+ ## Complete Publishing Example
+
+ ```bash
+ # 1. Make changes to code
+ cd /big/home/edwin/dev/namer
+ vim namer/data.py
+
+ # 2. Commit
+ git add namer/data.py
+ git commit -m "Fix edge case handling for numbers with many zeros"
+
+ # 3. Push code to both platforms
+ git push origin main
+ GIT_LFS_SKIP_SMUDGE=1 git push hf main
+
+ # 4. If model weights changed, upload safetensors
+ hf upload edwinhere/namer model.safetensors model.safetensors \
+   --commit-message "Update model weights with improved training"
+ ```
+
+ ## One-Line Push to Both
+
+ For convenience, push to both in one command:
+
+ ```bash
+ git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
+ ```
+
+ Or as separate commands, so the result of each push can be checked before running the next:
+
+ ```bash
+ # Push to GitHub
+ git push origin main
+
+ # Push to HuggingFace (skip LFS smudge to avoid download issues)
+ GIT_LFS_SKIP_SMUDGE=1 git push hf main
+ ```
+
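The two pushes can also be wrapped in a small shell function so they always run in the same order and stop at the first failure. This is only a sketch: the `publish` function and the `DRY_RUN` variable are assumptions made for illustration and are not part of the repository.

```bash
# Hypothetical wrapper around the two push commands shown above.
publish() {
  local branch="${1:-main}"   # branch to publish; defaults to main

  # With DRY_RUN set, print the commands instead of running them.
  if [ -n "${DRY_RUN:-}" ]; then
    echo "git push origin $branch"
    echo "GIT_LFS_SKIP_SMUDGE=1 git push hf $branch"
    return 0
  fi

  # Stop before touching HuggingFace if the GitHub push fails.
  git push origin "$branch" || return 1
  GIT_LFS_SKIP_SMUDGE=1 git push hf "$branch"
}
```

Running `DRY_RUN=1 publish` prints the two commands for `main` without contacting either remote, which is a cheap way to check the wrapper before relying on it.
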
+ ## Troubleshooting
+
+ ### Diverged Branches
+
+ If `hf` and `origin` have diverged:
+
+ ```bash
+ # Pull from HuggingFace first (skipping LFS downloads)
+ GIT_LFS_SKIP_SMUDGE=1 git pull hf main --rebase
+
+ # Then push back
+ git push origin main
+ GIT_LFS_SKIP_SMUDGE=1 git push hf main
+ ```
+
+ ### LFS Object Not Found
+
+ If you see "Object does not exist on the server":
+
+ ```bash
+ # Skip smudge filter to avoid downloading missing objects
+ GIT_LFS_SKIP_SMUDGE=1 git pull hf main
+ ```
+
+ ### Force Push (Use with Caution)
+
+ If history was rewritten and you need to force sync:
+
+ ```bash
+ # Force push to GitHub
+ git push origin main --force-with-lease
+
+ # Force push to HuggingFace
+ GIT_LFS_SKIP_SMUDGE=1 git push hf main --force-with-lease
+ ```
+
+ ## Verification
+
+ After publishing, verify that both remotes point at the same commits:
+
+ ```bash
+ # Refresh the remote-tracking branches first
+ git fetch origin && git fetch hf
+
+ # Check GitHub latest commit
+ git log origin/main --oneline -3
+
+ # Check HuggingFace latest commit
+ git log hf/main --oneline -3
+
+ # Both should show the same commits
+ ```
+
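Comparing the two logs by eye can be replaced with a direct hash comparison. A minimal sketch, assuming both remotes have already been fetched; `remotes_in_sync` is a hypothetical helper name, not something defined in the repository.

```bash
# Hypothetical helper: succeeds when origin/<branch> and hf/<branch>
# resolve to the same commit. Returns 2 if either ref is missing.
remotes_in_sync() {
  local branch="${1:-main}" origin_sha hf_sha
  origin_sha=$(git rev-parse --verify --quiet "refs/remotes/origin/$branch") || return 2
  hf_sha=$(git rev-parse --verify --quiet "refs/remotes/hf/$branch") || return 2
  [ "$origin_sha" = "$hf_sha" ]
}
```

For example, `remotes_in_sync main || echo "origin and hf differ"` could be run as the last step of a publish.
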
+ ## Model Files Reference
+
+ | File | Size | Purpose | Platform |
+ |------|------|---------|----------|
+ | `namer_model.pt` | 3.6 MB | PyTorch checkpoint (training/inference) | GitHub (LFS) |
+ | `model.safetensors` | 3.5 MB | Safetensors format (HF compatible) | HuggingFace (Xet) |
+
+ ## Commands Cheat Sheet
+
+ ```bash
+ # View remotes
+ git remote -v
+
+ # View LFS tracked files
+ git lfs ls-files
+
+ # View LFS files with sizes
+ git lfs ls-files --size
+
+ # Check status on both platforms
+ git fetch origin && git fetch hf
+ git log --oneline --graph --decorate --all -5
+
+ # Push to both
+ git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
+ ```
+
+ ## Badges (Cross-Platform Links)
+
+ The README maintains badges linking both platforms:
+
+ ```markdown
+ [![HuggingFace](https://img.shields.io/badge/🤗_HuggingFace-Model_Card-yellow)](https://huggingface.co/edwinhere/namer)
+ [![GitHub](https://img.shields.io/badge/🐙_GitHub-Source_Code-blue)](https://github.com/edwinhere/namer)
+ ```
+
+ Keep both badges in the README on both platforms, so a reader on either one can always reach the other.