# Agent Publishing Guide

This document explains how to publish changes to both GitHub and HuggingFace repositories.

## Repository Structure

This project is mirrored on two platforms:

| Platform | Remote Name | URL | Purpose |
|----------|-------------|-----|---------|
| **GitHub** | `origin` | `git@github.com:edwinhere/namer.git` | Source code, issues, development |
| **HuggingFace** | `hf` | `https://huggingface.co/edwinhere/namer` | Model distribution, inference API |

## Initial Setup

```bash
# Clone from GitHub (primary development repo)
git clone git@github.com:edwinhere/namer.git
cd namer

# Add HuggingFace as a second remote
git remote add hf https://huggingface.co/edwinhere/namer

# Verify remotes
git remote -v
```
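
If the setup is scripted or re-run, `git remote add` fails when the remote already exists. The sketch below is one way to make that step repeatable. `ensure_remote` is a hypothetical helper name, not something the repository ships.

```bash
#!/usr/bin/env bash
# ensure_remote is a hypothetical helper, not part of the repository.
# It adds the remote when it is missing and updates its URL when it exists.
ensure_remote() {
    local name="$1" url="$2"
    if git remote get-url "$name" >/dev/null 2>&1; then
        git remote set-url "$name" "$url"
    else
        git remote add "$name" "$url"
    fi
}

# Usage (run inside the cloned repository):
#   ensure_remote hf https://huggingface.co/edwinhere/namer
```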

## File Storage Configuration

Different files use different storage backends:

| File | GitHub | HuggingFace |
|------|--------|-------------|
| `*.py`, `*.md`, `*.json` | Git | Git |
| `namer_model.pt` | Git LFS | Git LFS |
| `model.safetensors` | Git LFS | Xet (via LFS pointer) |

### Git Attributes (`.gitattributes`)

```gitattributes
# PyTorch checkpoints (namer_model.pt) are tracked with Git LFS
*.pt filter=lfs diff=lfs merge=lfs -text

# model.safetensors is tracked through the LFS filter; HuggingFace stores its content in Xet
model.safetensors filter=lfs diff=lfs merge=lfs -text
```
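
To confirm that these rules apply to a given path, `git check-attr` reports the attribute value Git resolves for it. The helper below is a minimal sketch; `uses_lfs_filter` is a hypothetical name, and it only inspects `.gitattributes`, so it does not require Git LFS to be installed.

```bash
#!/usr/bin/env bash
# uses_lfs_filter is a hypothetical helper: it succeeds when .gitattributes
# assigns the "lfs" filter to the given path.
uses_lfs_filter() {
    # git check-attr prints "<path>: filter: <value>"; keep only the value.
    [ "$(git check-attr filter -- "$1" | sed 's/.*: filter: //')" = "lfs" ]
}

# Usage (run inside the repository):
#   uses_lfs_filter model.safetensors && echo "tracked via LFS filter"
```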

## Publishing Workflow

### 1. Make Changes

Edit files normally, then commit:

```bash
git add <files>
git commit -m "Description of changes"
```

### 2. Push to GitHub (Origin)

```bash
git push origin main
```

This uploads:
- All code files to GitHub
- LFS objects (`namer_model.pt`, `model.safetensors`)

### 3. Push to HuggingFace

```bash
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```

**Why `GIT_LFS_SKIP_SMUDGE=1`?**

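- HuggingFace stores `model.safetensors` in **Xet**, which it uses for faster transfers; the Git history only holds an LFS pointer for the file
- `GIT_LFS_SKIP_SMUDGE=1` disables the Git LFS smudge filter, the step that downloads object content when files are checked out
- Without it, Git may try to download LFS objects from HF that do not exist there, which fails with "Object does not exist on the server"
- The smudge filter runs on checkout, pull, and rebase rather than on push, so the variable matters most for the pull commands under Troubleshooting; setting it on push is harmless and keeps the commands uniform
- HF's Xet backend then serves the actual file content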
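
When the smudge filter is skipped, LFS-tracked files in the working tree stay as small text pointers instead of real content. The sketch below checks whether a file is such a pointer by looking for the version line that the Git LFS pointer format starts with. `is_lfs_pointer` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# is_lfs_pointer is a hypothetical helper: it succeeds when the file's first
# line is the Git LFS pointer version line.
is_lfs_pointer() {
    # Read only the first bytes so large binary files are not scanned.
    head -c 200 "$1" 2>/dev/null | head -n 1 \
        | grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}

# Usage:
#   is_lfs_pointer model.safetensors && echo "pointer only, content not downloaded"
```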

### 4. Upload Safetensors to HuggingFace (if updated)

If `model.safetensors` changed, use the HF CLI for Xet upload:

```bash
# Upload via HF CLI (uses Xet for fast transfers)
hf upload edwinhere/namer model.safetensors model.safetensors \
    --commit-message "Update model weights vX.Y"
```
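
Whether this step is needed can be decided from Git history rather than by memory. The sketch below reports whether a path differs between two commits. `weights_changed` is a hypothetical helper, and the defaults (`model.safetensors`, `HEAD~1`, `HEAD`) are assumptions that fit a single-commit update.

```bash
#!/usr/bin/env bash
# weights_changed is a hypothetical helper: it succeeds when the given path
# differs between two commits (defaults: model.safetensors, HEAD~1, HEAD).
weights_changed() {
    local path="${1:-model.safetensors}" old="${2:-HEAD~1}" new="${3:-HEAD}"
    local status=0
    # git diff --quiet exits 0 for no changes, 1 for changes, >1 on errors.
    git diff --quiet "$old" "$new" -- "$path" || status=$?
    [ "$status" -eq 1 ]
}

# Usage (run inside the repository, after committing):
#   weights_changed && echo "model.safetensors changed; run the hf upload step"
```

Because the file is tracked through the LFS filter, the comparison is between pointer files, which change whenever the weights do.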

## Complete Publishing Example

```bash
# 1. Make changes to code
cd /big/home/edwin/dev/namer
vim namer/data.py

# 2. Commit
git add namer/data.py
git commit -m "Fix edge case handling for numbers with many zeros"

# 3. Push code to both platforms
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main

# 4. If model weights changed, upload safetensors
hf upload edwinhere/namer model.safetensors model.safetensors \
    --commit-message "Update model weights with improved training"
```

## One-Line Push to Both

For convenience, push to both in one command:

```bash
git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
```

Or as separate commands, which makes it easier to see which push failed:

```bash
# Push to GitHub
git push origin main

# Push to HuggingFace (skip LFS smudge to avoid download issues)
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
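
The two pushes can also be wrapped in a small function that reports which remote failed. This is a sketch under assumptions: `publish_both` is a hypothetical name, and the return codes (1 for `origin`, 2 for `hf`) are an illustrative convention rather than anything the project defines.

```bash
#!/usr/bin/env bash
# publish_both is a hypothetical wrapper around the two pushes shown above.
# Returns 1 if the push to origin fails, 2 if the push to hf fails.
publish_both() {
    local branch="${1:-main}"
    git push origin "$branch" || {
        echo "push to origin failed; hf was not attempted" >&2
        return 1
    }
    GIT_LFS_SKIP_SMUDGE=1 git push hf "$branch" || {
        echo "push to hf failed; origin is already updated" >&2
        return 2
    }
}

# Usage (run inside the repository):
#   publish_both main
```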

## Troubleshooting

### Diverged Branches

If `hf` and `origin` have diverged:

```bash
# Pull from HuggingFace first (skipping LFS downloads)
GIT_LFS_SKIP_SMUDGE=1 git pull hf main --rebase

# Then push back
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
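
Before rebasing, it can help to see how far the two remotes have drifted apart. The sketch below counts the commits that exist on only one side, using `git rev-list --left-right --count`. `divergence` is a hypothetical helper name, and it assumes both remotes have already been fetched.

```bash
#!/usr/bin/env bash
# divergence is a hypothetical helper. It assumes `git fetch origin` and
# `git fetch hf` have already been run, so the tracking refs are current.
divergence() {
    local branch="${1:-main}" counts
    counts="$(git rev-list --left-right --count "origin/$branch...hf/$branch")" || return 1
    # counts holds two tab-separated numbers: commits only on origin, then only on hf.
    set -- $counts
    echo "only on origin: $1, only on hf: $2"
}

# Usage:
#   git fetch origin && git fetch hf && divergence main
```

Two non-zero counts mean the branches have truly diverged. If only one count is non-zero, that remote is simply ahead and a plain push to the other is enough.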

### LFS Object Not Found

If you see "Object does not exist on the server":

```bash
# Skip smudge filter to avoid downloading missing objects
GIT_LFS_SKIP_SMUDGE=1 git pull hf main
```

### Force Push (Use with Caution)

If history was rewritten and you need to force sync:

```bash
# Force push to GitHub
git push origin main --force-with-lease

# Force push to HuggingFace
GIT_LFS_SKIP_SMUDGE=1 git push hf main --force-with-lease
```

## Verification

After publishing, verify on both platforms:

```bash
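# Refresh remote-tracking refs first; git log reads the local copies of them
git fetch origin && git fetch hf

# Check GitHub latest commit
git log origin/main --oneline -3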

# Check HuggingFace latest commit
git log hf/main --oneline -3

# Both should show the same commits
```
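
The same comparison can be automated by comparing the commit hashes of the two remote-tracking branches. This is a minimal sketch: `remotes_in_sync` is a hypothetical helper, and its return codes (0 in sync, 1 different, 2 fetch failed or branch missing) are an assumed convention.

```bash
#!/usr/bin/env bash
# remotes_in_sync is a hypothetical helper.
# Returns 0 when origin and hf point at the same commit, 1 when they differ,
# and 2 when a fetch fails or the branch is missing on either remote.
remotes_in_sync() {
    local branch="${1:-main}" origin_tip hf_tip
    git fetch --quiet origin || return 2
    git fetch --quiet hf || return 2
    origin_tip="$(git rev-parse --verify --quiet "refs/remotes/origin/$branch")" || return 2
    hf_tip="$(git rev-parse --verify --quiet "refs/remotes/hf/$branch")" || return 2
    [ "$origin_tip" = "$hf_tip" ]
}

# Usage:
#   remotes_in_sync main && echo "GitHub and HuggingFace match"
```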

## Model Files Reference

| File | Size | Purpose | Platform |
|------|------|---------|----------|
| `namer_model.pt` | 3.6 MB | PyTorch checkpoint (training/inference) | GitHub (LFS) |
| `model.safetensors` | 3.5 MB | Safetensors format (HF compatible) | HuggingFace (Xet) |

## Commands Cheat Sheet

```bash
# View remotes
git remote -v

# View LFS tracked files
git lfs ls-files

# View LFS files with sizes
git lfs ls-files --size

# Check status on both platforms
git fetch origin && git fetch hf
git log --oneline --graph --decorate --all -5

# Push to both
git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
```

## Badges (Cross-Platform Links)

The README maintains badges linking both platforms:

```markdown
[![HuggingFace](https://img.shields.io/badge/🤗_HuggingFace-Model_Card-yellow)](https://huggingface.co/edwinhere/namer)
[![GitHub](https://img.shields.io/badge/🐙_GitHub-Source_Code-blue)](https://github.com/edwinhere/namer)
```

Keep both badges in the README so that readers on either platform can reach the other one.