Quick summary — recommended approach
Use Git LFS to track large binary files (models, datasets, etc.). It replaces file contents in the Git history with small pointers and stores the real file content separately.
If the large file is already committed to history, use git lfs migrate to rewrite history and convert those file entries to LFS pointers (requires force-push).
Alternatives: GitHub Releases (for single large assets), external object storage (S3, Azure Blob) with the URLs stored in the repo, or tools like git-annex or BFG for special workflows.
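To make the "small pointer" idea concrete, here is a minimal sketch that prints the text Git LFS keeps in Git in place of a large file: a spec version line, the SHA-256 of the real content, and its size in bytes. The function name make_lfs_pointer is hypothetical and only for illustration; Git LFS generates pointers itself, so you never need to do this by hand. It assumes sha256sum from GNU coreutils is available.

```shell
# Hypothetical helper (not part of Git LFS): print the pointer text that
# would stand in for a given file in the Git history.
make_lfs_pointer() {
  local file="$1"
  local oid size
  oid=$(sha256sum "$file" | cut -d' ' -f1)   # SHA-256 of the real content
  size=$(wc -c < "$file" | tr -d ' ')        # size of the real content in bytes
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}
```

For example, make_lfs_pointer animal-classification/INPUT_model_path/animal-cnn/model.keras prints three short lines regardless of how large the model is, which is why the repository stays small.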
Below are step-by-step commands you can run in your environment (Linux, bash). Replace remote names and branch names as appropriate.
1) Install Git LFS
On Ubuntu/Debian:
sudo apt update
sudo apt install git-lfs
git lfs install
git lfs version
2) Track the file types you want in LFS
Decide which patterns to track. For your repo, track model files:
cd /root/MlOpsUbuntu/1_ai_model_deployment/1_3_Inference/HF/CTAIAnimalClassifierFastApi
git lfs track "animal-classification/**/model.keras"
git lfs track "*.h5"
git lfs track "*.pt"
git lfs track "*.onnx"
This creates or updates .gitattributes with the tracked patterns. Commit it:
git add .gitattributes
git commit -m "Add Git LFS tracking for model files"
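Each tracked pattern ends up in .gitattributes as a line of the form `*.h5 filter=lfs diff=lfs merge=lfs -text`. Before adding a large file, it can be worth confirming that its path really matches one of those patterns. The sketch below uses plain `git check-attr` for that; the function name is_lfs_tracked is hypothetical, and it assumes you run it from inside the repository.

```shell
# Hypothetical helper: succeed if Git's attributes route the given path
# through the LFS filter. The path does not have to exist yet.
is_lfs_tracked() {
  # `git check-attr filter -- <path>` prints "<path>: filter: lfs" for LFS paths
  # and "<path>: filter: unspecified" for everything else.
  git check-attr filter -- "$1" | grep -q ': filter: lfs$'
}
```

Usage: is_lfs_tracked animal-classification/INPUT_model_path/animal-cnn/model.keras && echo "will be stored in LFS". If the check fails, the file would be committed as a regular Git blob.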
3) Add and push new large files (not already in history)
If the large file isn't yet committed:
git add animal-classification/INPUT_model_path/animal-cnn/model.keras
git commit -m "Add model to LFS"
git push origin main
After this, model.keras content will be stored in Git LFS and the Git history will only contain a small pointer.
Note: GitHub has an LFS bandwidth and storage quota for free accounts; see GitHub LFS docs. For large models, consider external storage (S3) if quotas are insufficient.
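To judge whether the quota is likely to be a problem, a rough estimate of how many bytes the tracked files occupy in the working tree can help. The following is a sketch only: the function name lfs_candidate_bytes is hypothetical, it matches by file name (model.keras, *.h5, *.pt, *.onnx) rather than by the exact .gitattributes patterns, and it assumes GNU find for -printf.

```shell
# Hypothetical helper: total size in bytes of model files under a directory,
# using the same file names and extensions tracked in step 2.
lfs_candidate_bytes() {
  local dir="${1:-.}"
  find "$dir" -path '*/.git' -prune -o -type f \
    \( -name 'model.keras' -o -name '*.h5' -o -name '*.pt' -o -name '*.onnx' \) \
    -printf '%s\n' |
    awk '{ total += $1 } END { printf "%.0f\n", total }'   # prints 0 if nothing matched
}
```

Running lfs_candidate_bytes . from the repository root gives a single number to compare against the storage quota. Keep in mind that every new version of a model adds its full size again.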
4) Migrate already-committed large files into LFS
If model.keras (or another large file) is already committed and in your history, you must rewrite history so that the earlier commits use LFS pointers too.
Important: This rewrites git history. Coordinate with teammates; everyone must re-clone or reset their local clones after a force-push.
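Before rewriting anything, it helps to know which large blobs actually exist in the history, so the --include patterns cover them. A minimal sketch using only standard Git plumbing is shown here; the function name large_blobs_in_history and the default threshold of 10,000,000 bytes are assumptions for illustration.

```shell
# Hypothetical helper: list blobs reachable from any ref that are at least
# <min_bytes> in size, largest first, one "<size> <path>" per line.
large_blobs_in_history() {
  local min_bytes="${1:-10000000}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v min="$min_bytes" '$1 == "blob" && $2 + 0 >= min + 0 { print substr($0, 6) }' |
    sort -rn
}
```

Usage: large_blobs_in_history 50000000 lists everything of 50 MB or more. Any path that shows up here and is not matched by your --include patterns will stay in the history as a regular blob after the migration.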
Example commands:
# Make sure you have a backup or are on a clone you can force-push from
git lfs install
# Convert all files that match the pattern to LFS across all branches and tags:
git lfs migrate import --include="animal-classification/**/model.keras,*.h5,*.onnx,*.pt" --everything
# Or narrower: only the main branch
git lfs migrate import --include="animal-classification/**/model.keras" --include-ref=refs/heads/main
# Force-push all rewritten branches and tags (to push only main: git push --force origin main)
git push --force --all origin
git push --force --tags origin
What this does: the migrate/import step rewrites commits locally, replacing large blobs with LFS pointers and storing the real content in the local LFS object store (.git/lfs). The objects are uploaded to LFS storage when you push.
Verification:
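# Check that the committed file is now a pointer (the first line should start with "version https://git-lfs.github.com/spec/v1")
git cat-file -p HEAD:animal-classification/INPUT_model_path/animal-cnn/model.keras | head -n 3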
# See LFS-managed objects
git lfs ls-files
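If you want the pointer check in a script rather than by eye, a small sketch along these lines may be enough. The function name is_lfs_pointer is hypothetical. It reads content on stdin and only inspects the first line, so piping a large binary into it is safe.

```shell
# Hypothetical helper: succeed if stdin starts with a Git LFS pointer header.
is_lfs_pointer() {
  # Only the first 200 bytes are read; -a makes grep treat binary input as text.
  head -c 200 | head -n 1 |
    grep -a -q -x 'version https://git-lfs\.github\.com/spec/v1'
}
```

Usage: git cat-file -p HEAD:animal-classification/INPUT_model_path/animal-cnn/model.keras | is_lfs_pointer && echo "stored as LFS pointer". To check an older commit, replace HEAD with that commit's hash.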