namer / AGENTS.md
Edwin Jose Palathinkal
Add AGENTS.md with publishing procedures for GitHub and HuggingFace
d46a06a

Agent Publishing Guide

This document explains how to publish changes to both GitHub and HuggingFace repositories.

Repository Structure

This project is mirrored on two platforms:

Platform Remote Name URL Purpose
GitHub origin git@github.com:edwinhere/namer.git Source code, issues, development
HuggingFace hf https://huggingface.co/edwinhere/namer Model distribution, inference API

Initial Setup

# Clone from GitHub (primary development repo)
git clone git@github.com:edwinhere/namer.git
cd namer

# Add HuggingFace as a second remote
git remote add hf https://huggingface.co/edwinhere/namer

# Verify remotes
git remote -v

File Storage Configuration

Different files use different storage backends:

File GitHub HuggingFace
*.py, *.md, *.json Git Git
namer_model.pt Git LFS Git LFS
model.safetensors Git LFS Xet (via LFS pointer)

Git Attributes (.gitattributes)

# For GitHub: namer_model.pt uses git-lfs
*.pt filter=lfs diff=lfs merge=lfs -text

# For HuggingFace: model.safetensors uses Xet for faster downloads
model.safetensors filter=lfs diff=lfs merge=lfs -text

Publishing Workflow

1. Make Changes

Edit files normally, then commit:

git add <files>
git commit -m "Description of changes"

2. Push to GitHub (Origin)

git push origin main

This uploads:

  • All code files to GitHub
  • LFS objects (namer_model.pt, model.safetensors)

3. Push to HuggingFace

GIT_LFS_SKIP_SMUDGE=1 git push hf main

Why GIT_LFS_SKIP_SMUDGE=1?

  • HuggingFace uses Xet storage for model.safetensors (faster than LFS)
  • Without this flag, git tries to download LFS objects from HF that may not exist
  • The flag skips the smudge filter, pushing only the LFS pointer file
  • HF's Xet backend then serves the actual file content

4. Upload Safetensors to HuggingFace (if updated)

If model.safetensors changed, use the HF CLI for Xet upload:

# Upload via HF CLI (uses Xet for fast transfers)
hf upload edwinhere/namer model.safetensors model.safetensors \
    --commit-message "Update model weights vX.Y"

Complete Publishing Example

# 1. Make changes to code
cd /big/home/edwin/dev/namer
vim namer/data.py

# 2. Commit
git add namer/data.py
git commit -m "Fix edge case handling for numbers with many zeros"

# 3. Push code to both platforms
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main

# 4. If model weights changed, upload safetensors
hf upload edwinhere/namer model.safetensors model.safetensors \
    --commit-message "Update model weights with improved training"

One-Line Push to Both

For convenience, push to both in one command:

git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main

Or with explicit checks:

# Push to GitHub
git push origin main

# Push to HuggingFace (skip LFS smudge to avoid download issues)
GIT_LFS_SKIP_SMUDGE=1 git push hf main

Troubleshooting

Diverged Branches

If hf and origin have diverged:

# Pull from HuggingFace first (skipping LFS downloads)
GIT_LFS_SKIP_SMUDGE=1 git pull hf main --rebase

# Then push back
git push origin main
GIT_LFS_SKIP_SMUDGE=1 git push hf main

LFS Object Not Found

If you see "Object does not exist on the server":

# Skip smudge filter to avoid downloading missing objects
GIT_LFS_SKIP_SMUDGE=1 git pull hf main

Force Push (Use with Caution)

If history was rewritten and you need to force sync:

# Force push to GitHub
git push origin main --force-with-lease

# Force push to HuggingFace
GIT_LFS_SKIP_SMUDGE=1 git push hf main --force-with-lease

Verification

After publishing, verify on both platforms:

# Check GitHub latest commit
git log origin/main --oneline -3

# Check HuggingFace latest commit
git log hf/main --oneline -3

# Both should show the same commits

Model Files Reference

File Size Purpose Platform
namer_model.pt 3.6 MB PyTorch checkpoint (training/inference) GitHub (LFS)
model.safetensors 3.5 MB Safetensors format (HF compatible) HuggingFace (Xet)

Commands Cheat Sheet

# View remotes
git remote -v

# View LFS tracked files
git lfs ls-files

# View LFS files with sizes
git lfs ls-files --size

# Check status on both platforms
git fetch origin && git fetch hf
git log --oneline --graph --decorate --all -5

# Push to both
git push origin main && GIT_LFS_SKIP_SMUDGE=1 git push hf main

Badges (Cross-Platform Links)

The README maintains badges linking both platforms:

[![HuggingFace](https://img.shields.io/badge/🤗_HuggingFace-Model_Card-yellow)](https://huggingface.co/edwinhere/namer)
[![GitHub](https://img.shields.io/badge/🐙_GitHub-Source_Code-blue)](https://github.com/edwinhere/namer)

These should always point to each other regardless of which platform the user is viewing from.