# AMADEUS Project Open‑Source Documentation
## 📘 Notion documentation example
### Project Overview
**Project Name – AMADEUS (AI‑based Morphological Analysis & Design Engine for Unique Shoes)**
- **Team composition:** P‑Practical Project third semester AI team 3 (Kim Taeryang, Park Hyundong, Park Chanwoo, Bang Hojun). Within the team, **Hyundong and Taeryang** implemented and tested the core functions, while **Hojun and Chanwoo** managed the open‑source materials on GitHub and Notion/Hugging Face.
- **Project goal:** Unlike traditional mass‑production footwear manufacturing, the aim is to leverage AI to **reconstruct a foot’s 3D shape from a smartphone video** and produce a custom LAST (shoe mold).
- **Motivation:** Existing sizing systems are stuck in the 20th century and are unsuited to the era of **hyper‑personalisation**. Advances in AI and growing demand for custom products have sharpened the tension between mass production and personalisation.
- **Key objectives:** Generate a **high‑precision 3D mesh** from 2D images to shorten the handmade shoe production process. This improves artisans’ productivity and reduces the time needed to create a LAST from several days to **a few hours**, enabling customisation without physical visits.
### Market analysis & differentiation
- Overseas companies such as Aetrex (USA) and 3DOE (China) offer foot‑scanning services built around dedicated 3D cameras, and domestic competitors like Gabenyang and DearFoot also rely on expensive scanners. **These specialised devices raise costs and limit accessibility.**
- AMADEUS performs 3D reconstruction and LAST production **using only smartphone footage**. Whereas traditional methods require a customer visit → measurement → mold making → LAST carving/modification and take several days, this project compresses the sequence to **smartphone capture → AI processing → 3D printing**, greatly reducing turnaround time and automating production.
### Pipeline overview (eight‑stage process)
The public GitHub repository provides a detailed README that outlines the current pipeline as eight interdependent stages. This pipeline differs slightly from the six‑step summary in the original report, so adapt your Notion/Hugging Face documentation accordingly:
1. **Pre‑processing (Step 1)** — isolate the foot region and remove background noise.
   *Goal*: improve data quality by generating masked images.
   *Technologies*: **YOLOv11** (detects the foot’s bounding box) and **SAM (Segment Anything)** for pixel‑level segmentation (a minimal sketch of this step appears after this list).
   *Operation*: each frame from `raw_images/` is processed by YOLO and SAM; the resulting masks are saved to `data/masked_images/`.
2. **3D Reconstruction (Step 2)** — estimate camera poses and generate a sparse point cloud using **COLMAP**.
   *Technologies*: feature extraction, sequential matching and bundle adjustment.
   *Outputs*: `colmap_work/sparse/0/` (camera parameters and initial 3D point cloud).
3. **Undistortion & Alignment (Step 3)** — remove lens distortion and orient the model’s coordinate system.
   *Technologies*: COLMAP’s image undistorter and the `auto_align_colmap.py` script.
   *Outputs*: undistorted images saved in `sugar_ready/`, ready for 3DGS training.
4. **3D Gaussian Splatting Training (Step 4)** — train a field of 3D Gaussians to produce a high‑fidelity point cloud.
   *Technologies*: the **3DGS algorithm** (from graphdeco‑inria/gaussian‑splatting).
   *Outputs*: `output/vanilla_3dgs_point_cloud/` (high‑density point cloud).
5. **Meshing & Healing (Step 5)** — convert the point cloud into a watertight mesh and clean it.
   *Technologies*: **SuGaR (Surface‑Aligned Gaussian Splatting)**, a `keep_largest_cluster` step to remove floating artifacts, and hole‑filling/healing operations.
   *Outputs*: `output/final_foot_mesh.obj` (refined mesh ready for 3D printing).
6. **Real‑world scaling & shoe last design (Step 6)** — scale the mesh from arbitrary units to millimetres and design the shoe last.
   *Goal*: convert the mesh to real‑world dimensions and model the last’s geometric features.
   *Tools*: scale helpers and CAD scripts; details are still under development.
7. **3D printing & validation (Step 7)** — slice and print the last on a 3D printer.
   *Tools*: Bambu Studio or similar slicing software; remote monitoring ensures print quality.
8. **Troubleshooting & iteration (Step 8)** — handle known issues and refine the system.
   *Known issues*: resolution mismatch between masked images and raw images; auto‑alignment failures due to insufficient point‑cloud density.
   *Resolution*: modify `segment_foot.py` to preserve the original image resolution, collect higher‑density data and continue iterating on the pipeline.
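To make Step 1 concrete, here is a minimal sketch of the masking stage. The weight paths (`models/yolo11n-seg.pt`, `models/sam_vit_h.pth`) are placeholders, and the repository’s `segment_foot.py` remains the authoritative implementation:

```python
# Minimal sketch of Step 1: YOLO bounding box -> SAM pixel mask.
# Checkpoint paths are placeholders; see segment_foot.py for the real logic.
from pathlib import Path

import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry
from ultralytics import YOLO

detector = YOLO("models/yolo11n-seg.pt")  # fine-tuned foot detector
sam = sam_model_registry["vit_h"](checkpoint="models/sam_vit_h.pth")
predictor = SamPredictor(sam)

out_dir = Path("data/masked_images")
out_dir.mkdir(parents=True, exist_ok=True)

for frame_path in sorted(Path("data/raw_images").glob("*.jpg")):
    # 1) YOLO proposes a bounding box around the foot.
    result = detector.predict(str(frame_path), verbose=False)[0]
    if len(result.boxes) == 0:
        continue  # no foot detected in this frame
    box = result.boxes.xyxy[0].cpu().numpy()  # [x0, y0, x1, y1]

    # 2) SAM refines the box prompt into a pixel-level mask.
    image = cv2.cvtColor(cv2.imread(str(frame_path)), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)
    masks, _, _ = predictor.predict(box=box, multimask_output=False)

    # 3) Zero out the background and save the masked frame.
    masked = np.where(masks[0][..., None], image, 0).astype(np.uint8)
    cv2.imwrite(str(out_dir / frame_path.name), cv2.cvtColor(masked, cv2.COLOR_RGB2BGR))
```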
### Expected challenges & solutions
- Feet have few natural features, making **feature extraction and automatic scaling** difficult; the team plans to explore artificial feature generation.
- Methods are needed to minimise **noise and distortion** in the 3D reconstruction.
- A user‑friendly **shooting guide** will be developed, and extensive tests under varied conditions will drive improvements.
### Expected impact
- The most time‑consuming steps in bespoke shoemaking (measurement → mold creation) are reduced by **up to 70 percent**, and automating processes once dependent on artisans’ skill dramatically lowers costs.
- Introducing **3DGS technology** to shoemaking eliminates the need for costly scanners and widens access to the bespoke market.
### Development environment & installation
The following information summarises both the original development environment used by the team and the recommended environment described in the public repository. Include this information in the **Development Guide** section of your Notion page.
- **Operating system:** Ubuntu 22.04 (team’s environment) or Windows 10/11 with **WSL2** enabled. The public README recommends Linux or Windows with WSL2.
- **Hardware:** NVIDIA RTX 3090 24 GB GPU (used during development); the README suggests a consumer GPU such as an **RTX 4060 8 GB or higher**.
- **CUDA runtime:** 11.8 (team setup).
- **Python environment:** Python 3.11.13 (via conda).
- **Key libraries:** PyTorch 2.0.1+cu118, torchvision 0.15.2+cu118, torchaudio 2.0.2+cu118, and other dependencies listed in `requirements.txt` (e.g., plyfile, open3d, pymeshlab, trimesh, ultralytics, segment_anything, tqdm, requests and roboflow).
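A quick sanity check (not part of the repository) to confirm that the GPU build of PyTorch is active inside the conda environment:

```python
# Verify the PyTorch/CUDA setup described above.
import torch

print(torch.__version__)          # expected: 2.0.1+cu118
print(torch.cuda.is_available())  # expected: True on a correctly configured GPU host
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 3090"
```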
#### Recommended execution (Docker)
To avoid complex library dependency issues, the GitHub repository recommends using Docker. The steps below summarise the instructions from the README:
1. **Build the image**
   ```bash
   # Run in the repository root (where the Dockerfile is located)
   docker build -t amadeus .
   ```
2. **Run the container**
   ```bash
   docker run --gpus all -it --rm \
     -v $(pwd)/data:/app/data \
     -v $(pwd)/output:/app/output \
     amadeus
   ```
   This mounts the `data/` and `output/` folders into the container so that input images and results persist.
3. **Execute the pipeline**
   ```bash
   chmod +x run_pipeline.sh
   xvfb-run -a ./run_pipeline.sh
   ```
   The script `run_pipeline.sh` runs all steps (masking, COLMAP, undistortion, 3DGS training, SuGaR meshing and post‑processing) sequentially.
#### Source code structure
```
data/              # raw_images/ (original foot photos) and masked_images/ (background‑removed images)
models/            # AI model files (YOLOv11 weights, SAM ViT-H checkpoints)
src/               # Python source code
  preprocessing/   # scripts for automatic segmentation (mask generation)
  postprocessing/  # mesh cutting, healing and smoothing scripts
  utils/           # miscellaneous utility functions
colmap_work/       # intermediate outputs from COLMAP (sparse model, undistorted images)
output/            # results: vanilla_3dgs_point_cloud/, final_foot_mesh.obj, etc.
Dockerfile         # defines the Docker environment
requirements.txt   # additional Python dependencies (installed during the Docker build)
run_pipeline.sh    # executes the full pipeline end‑to‑end
```
#### Installation & execution summary
The manual approach below is retained for completeness, but the **recommended way** to run the system is to use the Docker commands described above (`run_pipeline.sh` automates these steps).
1. **Clone repositories & install dependencies**
   ```bash
   git clone https://github.com/graphdeco-inria/gaussian-splatting.git
   cd gaussian-splatting && git submodule update --init --recursive
   cd ..
   git clone https://github.com/Anttwo/SuGaR.git
   cd SuGaR && git submodule update --init --recursive
   ```
   Install Python packages from each submodule in editable mode as needed (`pip install -e .`).
2. **Prepare COLMAP**
   - Place foot images and checkerboard masks in `/app/src/colmap/input`.
   - Run `feature_extractor`, `sequential_matcher` and `mapper` to build the sparse point cloud. For example:
   ```bash
   xvfb-run -a colmap feature_extractor \
     --database_path /app/src/colmap_work/database.db \
     --image_path /app/src/colmap/input \
     --ImageReader.camera_model SIMPLE_RADIAL \
     --ImageReader.single_camera 1 \
     --SiftExtraction.use_gpu 1
   ```
3. **Train 3DGS & post‑process**
   - After aligning the COLMAP output (e.g., via RANSAC), start 3DGS training in `gaussian-splatting`.
   - Example: `python train.py -s /app/src/colmap_work/sugar_ready -m /app/src/colmap_work/output_3dgs_15k --iterations 15000 ...`
   - Once training completes, perform **noise removal** and **largest‑cluster extraction**.
4. **SuGaR & mesh post‑processing**
   - Ensure SuGaR submodule versions are consistent, then use the `sugar_ready` folder as input to generate the surface mesh.
   - Use Blender or PyMeshLab to fill holes and ensure the mesh is watertight.
5. **Scale adjustment & printing**
   - Use a scale helper to match the mesh to actual foot dimensions (a minimal sketch of this step follows this list). Slice the final mesh in **Bambu Studio** and print it on a 3D printer.
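As a rough illustration of the scale step, the sketch below rescales the mesh so that a known real‑world length matches the corresponding distance in mesh units. The file names, the `real_length_mm` value and the use of the mesh’s longest bounding‑box axis as the foot length are all simplifying assumptions; the project’s own scale helper is the authoritative tool:

```python
# Hypothetical scale-adjustment sketch (not the project's scale helper).
# Assumes the mesh's longest axis-aligned extent corresponds to foot length.
import trimesh

real_length_mm = 260.0  # measured foot length; replace with the customer's value

mesh = trimesh.load("output/final_foot_mesh.obj")
mesh_length = mesh.extents.max()      # foot length in arbitrary mesh units

scale = real_length_mm / mesh_length
mesh.apply_scale(scale)               # uniform scaling into millimetres
mesh.export("output/final_foot_mesh_mm.obj")
print(f"scale factor: {scale:.4f}, new extents (mm): {mesh.extents}")
```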
### Dockerization & runtime environment
The project provides a `Dockerfile` that uses `nvidia/cuda:11.8.0-devel-ubuntu22.04` as its base image. It installs system packages such as Git, FFmpeg and COLMAP, installs **PyTorch 2.0.1+cu118**, `torchvision` and `torchaudio`, copies the code into `/app` and installs additional Python dependencies from `requirements.txt`. A typical snippet is shown below:
```Dockerfile
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04

# install system packages (Python/pip are needed for the pip commands below)
RUN apt-get update && apt-get install -y git ffmpeg colmap python3-pip ...

# install PyTorch and CUDA-matched companions
RUN pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 \
    --extra-index-url https://download.pytorch.org/whl/cu118

# copy project and install Python packages
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt

CMD ["/bin/bash"]
```
A `docker-compose.yaml` defines a service named `ai-shoefitter` that enables GPU access, mounts the project directory, configures `shm_size` for PyTorch data loaders and exposes port 8080. Running `docker-compose up` builds and starts the container. The `requirements.txt` file lists Python libraries beyond PyTorch; they are installed automatically during the Docker build.
### Postprocessing scripts
After training the 3DGS model and extracting a mesh with SuGaR, several postprocessing scripts refine the outputs:
- **`auto_align_colmap.py`** aligns the COLMAP sparse model and point cloud to a consistent orientation using Open3D. It backs up the original sparse folder and writes an aligned point cloud as `aligned_debug.ply`.
```bash
python auto_align_colmap.py
```
- **`clean_3dgs_ply_2.py`** performs Statistical Outlier Removal (SOR) and optional radius‑based filtering on a 3DGS point cloud. It removes noise and reports the number of retained points (a sketch of the underlying Open3D calls appears after this list).
```bash
python clean_3dgs_ply_2.py --input_ply point_cloud.ply --output_ply cleaned.ply \
  --nb_neighbors 20 --std_ratio 2.0 \
  --use_radius --radius_ratio 0.01 --min_radius_neighbors 16
```
- **`heal_mesh.py`** uses PyMeshLab to remove small components, fill holes and smooth a mesh. It reports vertex and face counts before and after healing.
```bash
python heal_mesh.py --input_mesh mesh.ply --output_mesh healed_mesh.ply
```
- **`cut_sugar_with_pcd_mask.py`** trims a SuGaR mesh using a reference point cloud. It builds a KD‑tree, keeps vertices within a distance threshold, selects the largest connected component and applies Taubin smoothing.
```bash
python cut_sugar_with_pcd_mask.py --ref_pcd foot_mask.ply --input_mesh sugar_mesh.ply \
  --output_mesh cut_mesh.ply --dist_thresh 0.02 --voxel_size 0.005
```
These scripts generate clean point clouds and meshes ready for scale calibration and final printing.
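For reference, here is a minimal sketch of the statistical‑outlier‑removal and largest‑cluster logic that a script like `clean_3dgs_ply_2.py` implements. The SOR parameters mirror the example invocation above; the DBSCAN `eps`/`min_points` values are illustrative assumptions, not the script’s actual defaults:

```python
# Minimal sketch of SOR + largest-cluster extraction with Open3D.
# Approximates what clean_3dgs_ply_2.py does; parameters are illustrative.
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("point_cloud.ply")

# Statistical Outlier Removal: drop points whose mean neighbour distance
# deviates more than std_ratio standard deviations from the global mean.
pcd, kept = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
print(f"retained {len(kept)} points after SOR")

# Keep only the largest spatial cluster to remove floating artifacts
# (assumes at least one cluster is found; label -1 marks noise).
labels = np.asarray(pcd.cluster_dbscan(eps=0.02, min_points=10))
largest = np.bincount(labels[labels >= 0]).argmax()
pcd = pcd.select_by_index(np.where(labels == largest)[0])

o3d.io.write_point_cloud("cleaned.ply", pcd)
```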
### Schedule & roles
- **Data collection & pre‑processing:** until ~11/29
- **3D reconstruction & optimisation:** until ~12/13
- **Mesh extraction & post‑processing / scale algorithm implementation:** mid‑December
- **System integration, prototyping & validation:** January–March 2026
- **Final validation & production:** May–June 2026

State role assignments at the top of your Notion page and manage progress/issues via a **checklist or daily log**. Role assignments: Hyundong organised the entire pipeline; Taeryang built the COLMAP→3DGS→SuGaR pipeline and handled noise removal; Hojun and Chanwoo managed the open‑source documentation. All four members contributed to dataset video capture, mesh construction and noise removal, as well as to the slide deck and presentation video.
---
## 🤗 Hugging Face model card
This project will release its code and trained models as **open source** and provide a model card on Hugging Face. The following structure is recommended for that model card.
### Model Name
**AMADEUS 3D Foot Reconstruction Pipeline** — a pipeline that produces LASTs from a 3D mesh reconstructed from smartphone footage.
### Model Description
This pipeline combines **YOLO11n‑seg**, **COLMAP**, **3D Gaussian Splatting (3DGS)** and **SuGaR** to generate high‑resolution 3D meshes from 2D images. Unlike mass‑production approaches, it creates personalised foot models using only smartphone video so that a **LAST can be 3D printed**. The stages include data collection and labelling, SfM‑based point‑cloud generation, 3DGS optimisation, SuGaR mesh extraction, scale adjustment and 3D printing.
### Intended Uses & Applications
- **Custom shoe production:** Customers can film their feet with a smartphone to create their own LAST; bespoke workshops can use this to enhance productivity.
- **Medical & research:** Useful for analysing foot disorders and tracking changes in foot morphology over time.
- **Education & demonstration:** Suitable for teaching 3D reconstruction and footwear design, or for interactive demos.
### Limitations
- Because the foot has few distinctive features, SfM may struggle with **feature extraction**, leading to noisy reconstructions.
- Accurate scaling requires a checkerboard or scale helper, and the foot and marker must be clearly visible in the video.
- The current pipeline relies on a GPU, and training/reconstruction times are long, so it is not suited to real‑time applications.
### Training data
- **Data source:** More than 1,000 foot images and checkerboard masks captured by the team.
- **Labelling:** Polygon masks of the foot and scale markers were drawn manually and used to fine‑tune the YOLO11n‑seg model.
- **Data augmentation:** Rotations, flips and brightness adjustments were applied to improve generalisation.
### Model architecture & pipeline
1. **Segmentation:** A fine‑tuned **YOLO11n‑seg** model segments the foot and scale marker.
2. **SfM reconstruction:** COLMAP extracts features, matches them and performs bundle adjustment to build a sparse point cloud.
3. **3D Gaussian Splatting:** A 3D Gaussian splat field is initialised from the point cloud, and positions, colours and sizes are optimised against the input images.
4. **Mesh extraction:** The SuGaR algorithm converts the Gaussian representation into a surface mesh and performs normalisation and hole filling.
5. **Scale adjustment:** A scaling algorithm adjusts the mesh to actual foot dimensions and produces the final output.
### Evaluation
- **Accuracy metrics:** Reconstruction accuracy is measured by distance error in millimetres, the average distance between the point cloud and the mesh (see the sketch below), and fit quality when the LAST is used.
- **Current results:** According to the project report, 3D reconstruction time was shortened from days to **hours**, and the overall production process was reduced by about **70 percent**.
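A minimal sketch of how the average point‑to‑mesh distance could be computed, assuming a ground‑truth scan and the reconstructed mesh are already aligned in the same millimetre coordinate frame (the file names and the use of trimesh are assumptions for illustration):

```python
# Hypothetical evaluation sketch: mean point-to-mesh distance in millimetres.
# Assumes both inputs are aligned and already scaled to millimetres.
import numpy as np
import trimesh

mesh = trimesh.load("output/final_foot_mesh_mm.obj")
scan = trimesh.load("gt_scan.ply")  # reference point cloud

points = np.asarray(scan.vertices)
# closest_point returns, for each query point: the closest surface point,
# its distance and the index of the triangle it lies on.
_, distances, _ = trimesh.proximity.closest_point(mesh, points)

print(f"mean error: {distances.mean():.3f} mm")
print(f"95th pct:   {np.percentile(distances, 95):.3f} mm")
```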
### How to use
```bash
# Sample usage (summary)
# 1. Perform segmentation and prepare input for COLMAP
python yolo_inference.py --input video.mp4 --output /app/src/colmap/input

# 2. Run COLMAP (feature extraction / matching / mapping)
xvfb-run -a colmap feature_extractor --database_path db.db --image_path /app/src/colmap/input \
  --ImageReader.camera_model SIMPLE_RADIAL --ImageReader.single_camera 1 --SiftExtraction.use_gpu 1
xvfb-run -a colmap sequential_matcher --database_path db.db --SiftMatching.use_gpu 1 \
  --SequentialMatching.overlap 20
xvfb-run -a colmap mapper --database_path db.db --image_path /app/src/colmap/input \
  --output_path /app/src/colmap_work/sparse

# 3. Train 3DGS
python train.py -s /app/src/colmap_work/sugar_ready -m /app/src/colmap_work/output_3dgs_15k --iterations 15000

# 4. Run SuGaR and scale adjustment
python run_sugar.py --input /app/src/colmap_work/sugar_ready --output /app/src/final_mesh
```
### License & citation
- **License:** The source code for this project is released under the MIT licence and may be used freely for research, education and commercial purposes.
- **Citation:** If you use this project, please cite the AMADEUS report below.
```bibtex
@techreport{amadeus2025,
  title  = {AI-based Morphological Analysis \& Design Engine for Unique Shoes},
  author = {Kim Taeryang and Park Hyundong and Park Chanwoo and Bang Hojun},
  year   = {2025},
  note   = {P-Practical Project third semester (AI) team 3 report}
}
```
### Contributors
- **Park Hyundong:** Organised the entire pipeline and coordinated processes.
- **Kim Taeryang:** Developed the COLMAP→3DGS→SuGaR pipeline and performed noise removal.
- **Bang Hojun & Park Chanwoo:** Curated the open‑source documentation (README/Notion/Hugging Face) and environment setup.
- **All members:** Participated collectively in dataset video capture, mesh construction, noise removal, slide deck creation and the presentation video.
### Integration with GitHub, Notion & Hugging Face
Your project uses Git for version control, but you may also want to connect documentation and models across services. Here is what each platform supports and how to configure it:
#### Notion ↔ GitHub
Notion’s GitHub integration is designed to **link and visualise GitHub content**, not to import an entire repository. The Notion help centre explains that you can:
- **Embed code previews:** copy a GitHub permalink or a specific file/line selection and paste it into a Notion page; choose *Paste as preview* to embed a live view of the snippet. This keeps documentation up to date but does not copy the repository into Notion.
- **Create synced databases:** paste the link to a pull request or issue into Notion and select *Paste as database*. Notion will create a table view of fields such as title, description, state and assignees, and automatically sync updates. This helps track tasks and link them to your GitHub project.
- **Map identities:** map GitHub identities to Notion profiles so that assignees and reviewers in the synced database link to the correct team member.
> **Note:** Notion cannot pull source code into its workspace. Use GitHub as the source of truth and reference it from Notion for documentation and project management.
#### GitHub ↔ Hugging Face
Hugging Face repositories (for Spaces, models or datasets) are Git repositories, so you can synchronise your GitHub repo with Hugging Face using Git remotes or GitHub Actions:
- **Add your HF Space as a remote:** after creating a Space, add it as an extra remote in your local Git repo and push your code:
```bash
git remote add space https://huggingface.co/spaces/HF_USERNAME/SPACE_NAME
git push --force space main
```
The first command registers the Space as a remote; the second pushes your `main` branch to it.
- **Set up a GitHub Action:** create a workflow file in your GitHub repository that pushes to the Hugging Face Space whenever `main` updates. For example:
```yaml
name: Sync to Hugging Face hub
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
          lfs: true
      - name: Push to hub
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: git push https://HF_USERNAME:$HF_TOKEN@huggingface.co/spaces/HF_USERNAME/SPACE_NAME main
```
In this workflow, `HF_TOKEN` is stored as a secret in your GitHub repository; the action pushes the `main` branch to the Space.
These integration options let you keep your codebase synchronised between GitHub and Hugging Face and monitor issues and pull requests from Notion without duplicating code.
---
The Notion and Hugging Face documents above are examples based on the PDF report and team discussions. Fill in the detailed information (performance figures, code blocks, dataset links, etc.) from your actual code and experiments.