# Agent Publishing Guide
This document explains how to publish changes to both GitHub and HuggingFace repositories.
## Repository Structure
This project is mirrored on two platforms:
| Platform | Remote Name | URL | Purpose |
|----------|-------------|-----|---------|
| **GitHub** | `origin` | `git@github.com:edwinhere/namer.git` | Source code, issues, development |
| **HuggingFace** | `hf` | `https://huggingface.co/edwinhere/namer` | Model distribution, inference API |
## Initial Setup
```bash
# Clone from GitHub (primary development repo)
git clone git@github.com:edwinhere/namer.git
cd namer
# Add HuggingFace as a second remote
git remote add hf https://huggingface.co/edwinhere/namer
# Verify remotes
git remote -v
```
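If the setup steps may be run more than once (for example by an automated agent), `git remote add` fails on the second run because the remote already exists. A minimal sketch of an idempotent variant follows; the helper name `ensure_remote` is hypothetical and not part of this repository.

```bash
# Hypothetical helper: add a remote if it is missing, otherwise update its URL.
ensure_remote() {
    remote_name=$1
    remote_url=$2
    if git remote get-url "$remote_name" >/dev/null 2>&1; then
        # Remote already exists: make sure it points at the expected URL.
        git remote set-url "$remote_name" "$remote_url"
    else
        git remote add "$remote_name" "$remote_url"
    fi
}

# Usage, run from inside the clone:
#   ensure_remote hf https://huggingface.co/edwinhere/namer
```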
## File Storage Configuration
Different files use different storage backends:
| File | GitHub | HuggingFace |
|------|--------|-------------|
| `*.py`, `*.md`, `*.json` | Git | Git |
| `namer_model.pt` | Git LFS | Git LFS |
| `model.safetensors` | Git LFS | Xet (via LFS pointer) |
### Git Attributes (`.gitattributes`)
```gitattributes
# For GitHub: namer_model.pt uses git-lfs
*.pt filter=lfs diff=lfs merge=lfs -text
# For HuggingFace: model.safetensors uses Xet for faster downloads
model.safetensors filter=lfs diff=lfs merge=lfs -text
```
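To confirm which paths these rules actually catch, `git check-attr` reports the attribute Git resolves for a path. The sketch below wraps it in a hypothetical helper, `filter_for`, that prints only the filter value.

```bash
# Hypothetical helper: print the filter Git resolves for a path,
# e.g. "lfs" for LFS-tracked files or "unspecified" for plain Git files.
filter_for() {
    git check-attr filter -- "$1" | sed 's/.*: filter: //'
}

# Usage, run from the repository root:
#   filter_for model.safetensors
#   filter_for namer/data.py
```

The path does not need to exist on disk; Git only evaluates the patterns in `.gitattributes`.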
## Publishing Workflow
### 1. Make Changes
Edit files normally, then commit:
```bash
git add <files>
git commit -m "Description of changes"
```
### 2. Push to GitHub (Origin)
```bash
git push origin main
```
This uploads:
- All code files to GitHub
- LFS objects (`namer_model.pt`, `model.safetensors`)
### 3. Push to HuggingFace
```bash
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
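**Why `GIT_LFS_SKIP_SMUDGE=1`?**
- HuggingFace stores `model.safetensors` on its **Xet** backend (faster than LFS); the Git repository holds only an LFS pointer file for it
- The variable disables the Git LFS smudge filter, so any checkout Git performs leaves pointer files in place instead of trying to download LFS objects from HF that may not exist
- It matters most for `git pull` and `git checkout` against `hf`; a plain `git push` performs no checkout, so setting it there is harmless and keeps the commands consistent
- HF's Xet backend then serves the actual file content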
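When the smudge filter is skipped, an LFS-tracked file in the working tree is a small text pointer rather than the real weights. Under the Git LFS pointer format, the first line of such a file is `version https://git-lfs.github.com/spec/v1`. The sketch below uses that to tell the two apart; `is_lfs_pointer` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if the file is a Git LFS pointer, not real content.
is_lfs_pointer() {
    # Only the first bytes are read, so large binary files are cheap to check.
    head -c 200 "$1" 2>/dev/null | head -n 1 \
        | grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# Usage:
#   is_lfs_pointer model.safetensors && echo "pointer only, weights not downloaded"
```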
### 4. Upload Safetensors to HuggingFace (if updated)
If `model.safetensors` changed, use the HF CLI for Xet upload:
```bash
# Upload via HF CLI (uses Xet for fast transfers)
hf upload edwinhere/namer model.safetensors model.safetensors \
--commit-message "Update model weights vX.Y"
```
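One way to decide whether this step is needed is to ask Git whether the file differs between two commits. The sketch below assumes the weights were committed in the most recent commit; `file_changed` is a hypothetical helper, and the defaults `HEAD~1` and `HEAD` are illustrative.

```bash
# Hypothetical helper: succeed if a path differs between two commits.
# Defaults compare the previous commit with the current one.
file_changed() {
    changed_path=$1
    old_rev=${2:-HEAD~1}
    new_rev=${3:-HEAD}
    # git diff --quiet exits non-zero when there are differences, so invert it.
    ! git diff --quiet "$old_rev" "$new_rev" -- "$changed_path"
}

# Usage:
#   file_changed model.safetensors && echo "weights changed, run the hf upload step"
```

For an LFS-tracked file the comparison is between pointer files, which change whenever the underlying content changes.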
## Complete Publishing Example
```bash
# 1. Make changes to code
cd /big/home/edwin/dev/namer
vim namer/data.py
# 2. Commit
git add namer/data.py
git commit -m "Fix edge case handling for numbers with many zeros"
# 3. Push code to both platforms
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main
# 4. If model weights changed, upload safetensors
hf upload edwinhere/namer model.safetensors model.safetensors \
--commit-message "Update model weights with improved training"
```
## One-Line Push to Both
For convenience, push to both in one command:
```bash
git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
Or as separate steps, so you can confirm that the first push succeeded before starting the second:
```bash
# Push to GitHub
git push origin main
# Push to HuggingFace (skip LFS smudge to avoid download issues)
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
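The two pushes can also be wrapped in a small function that reports which remote failed. This is a sketch only; `push_both` is a hypothetical name, and it assumes the remotes are called `origin` and `hf` as configured in this guide.

```bash
# Hypothetical helper: push one branch to both remotes, stopping at the first failure.
push_both() {
    push_branch=${1:-main}
    git push origin "$push_branch" || {
        echo "push to origin failed" >&2
        return 1
    }
    GIT_LFS_SKIP_SMUDGE=1 git push hf "$push_branch" || {
        echo "push to hf failed" >&2
        return 1
    }
}

# Usage:
#   push_both main
```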
## Troubleshooting
### Diverged Branches
If `hf` and `origin` have diverged:
```bash
# Pull from HuggingFace first (skipping LFS downloads)
GIT_LFS_SKIP_SMUDGE=1 git pull hf main --rebase
# Then push back
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
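Before rebasing, it can help to see how far the two remotes have drifted apart. The sketch below counts commits that exist on only one side, based on the local remote-tracking refs, so run `git fetch origin && git fetch hf` first. The helper name `divergence` is hypothetical.

```bash
# Hypothetical helper: print two tab-separated counts for a branch:
# commits only on origin, then commits only on hf.
divergence() {
    div_branch=${1:-main}
    git rev-list --left-right --count "origin/$div_branch...hf/$div_branch"
}

# Usage:
#   git fetch origin && git fetch hf
#   divergence main
```

Two zeros mean the remotes agree; a non-zero value on both sides means the branches have truly diverged and a rebase or merge is needed.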
### LFS Object Not Found
If you see "Object does not exist on the server":
```bash
# Skip smudge filter to avoid downloading missing objects
GIT_LFS_SKIP_SMUDGE=1 git pull hf main
```
### Force Push (Use with Caution)
If history was rewritten and you need to force sync:
```bash
# Force push to GitHub
git push origin main --force-with-lease
# Force push to HuggingFace
GIT_LFS_SKIP_SMUDGE=1 git push hf main --force-with-lease
```
## Verification
After publishing, verify on both platforms:
```bash
# Check GitHub latest commit
git log origin/main --oneline -3
# Check HuggingFace latest commit
git log hf/main --oneline -3
# Both should show the same commits
```
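`git log origin/main` and `git log hf/main` read local remote-tracking refs, which can be stale if someone else pushed in the meantime. As a sketch of a stricter check, `git ls-remote` asks each server for its current branch head directly. The helper names `remote_head` and `in_sync` are hypothetical.

```bash
# Hypothetical helper: print the commit hash a remote currently has for a branch.
remote_head() {
    git ls-remote "$1" "refs/heads/${2:-main}" | cut -f1
}

# Hypothetical helper: succeed only if origin and hf report the same, non-empty head.
in_sync() {
    origin_head=$(remote_head origin "$1")
    hf_head=$(remote_head hf "$1")
    [ -n "$origin_head" ] && [ "$origin_head" = "$hf_head" ]
}

# Usage:
#   in_sync main && echo "origin and hf match"
```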
## Model Files Reference
| File | Size | Purpose | Platform |
|------|------|---------|----------|
| `namer_model.pt` | 3.6 MB | PyTorch checkpoint (training/inference) | GitHub (LFS) |
| `model.safetensors` | 3.5 MB | Safetensors format (HF compatible) | HuggingFace (Xet) |
## Commands Cheat Sheet
```bash
# View remotes
git remote -v
# View LFS tracked files
git lfs ls-files
# View LFS files with sizes
git lfs ls-files --size
# Check status on both platforms
git fetch origin && git fetch hf
git log --oneline --graph --decorate --all -5
# Push to both
git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main
```
## Badges (Cross-Platform Links)
The README maintains badges linking both platforms:
```markdown
[](https://huggingface.co/edwinhere/namer)
[](https://github.com/edwinhere/namer)
```
Keep both badges in the README on both platforms, so that a reader on either site can always reach the other.