Leacb4 committed
Commit 6ef43da · verified · 1 Parent(s): f2f5c64

Upload README.md with huggingface_hub

Files changed (1): README.md (+310 -37)
README.md CHANGED
@@ -1,6 +1,57 @@
- # Fashion Search Model - GAP-CLIP
-
- Multimodal search model for fashion, combining color embeddings, categorical hierarchy embeddings, and a main CLIP model for fashion item search.

  ## 📋 Description

@@ -57,24 +108,63 @@ Where:

  ### Prerequisites

- - Python 3.8+
- - PyTorch 2.0+
- - CUDA (optional, for GPU)

- ### Installing Dependencies

  ```bash
  pip install -r requirements.txt
  ```

  ### Main Dependencies

- - `torch>=2.0.0` : PyTorch for deep learning
- - `transformers>=4.30.0` : Hugging Face Transformers for CLIP
- - `huggingface-hub>=0.16.0` : To download/upload models
- - `pillow>=9.0.0` : Image processing
- - `pandas>=1.5.0` : Data manipulation
- - `scikit-learn>=1.3.0` : Evaluation metrics

  ## 📁 Project Structure

@@ -113,51 +203,91 @@ pip install -r requirements.txt

  │ ├── optuna_param_importances.png # Parameter importance plot
  │ └── optuna_guide.md # Optuna usage guide
  ├── upload_hf/ # HuggingFace Hub upload utilities
- │ ├── upload_to_huggingface.py # Upload script
- │ └── GUIDE_UPLOAD_HF.md # Upload guide
- ├── requirements.txt # Python dependencies
  └── README.md # This documentation
  ```

  ### Key Files Description

  **Core Model Files**:
- - `color_model.py`: ResNet18-based color embedding model (16 dims)
  - `hierarchy_model.py`: ResNet18-based hierarchy classification model (64 dims)
- - `main_model.py`: GAP-CLIP implementation with enhanced contrastive loss
- - `train_main_model.py`: Training with Optuna-optimized hyperparameters
-
- **Configuration**:
- - `config.py`: Central configuration for all paths, dimensions, and device settings
  - `tokenizer_vocab.json`: Vocabulary for color model's text encoder

  **Evaluation Suite**:
  - `main_model_evaluation.py`: Comprehensive evaluation across Fashion-MNIST, KAGL, and local datasets
- - Other evaluation scripts provide specialized analysis (color, hierarchy, search, etc.)

  **Training Data**:
  - `data_with_local_paths.csv`: Main training dataset with text, color, hierarchy, and image paths
  - `fashion-mnist_test.csv`: Evaluation dataset for zero-shot generalization testing

  ## 🔧 Configuration

- Main parameters are defined in `config.py`:

  ```python
- # Embedding dimensions
- color_emb_dim = 16 # Color embedding dimension (dims 0-15)
- hierarchy_emb_dim = 64 # Hierarchy embedding dimension (dims 16-79)
-
- # Device configuration
- device = torch.device("mps") # Device (cuda, mps, cpu)
-
- # Column names for dataset
- text_column = 'text' # Description column
- color_column = 'color' # Color label column
- hierarchy_column = 'hierarchy' # Hierarchy category column
- column_local_image_path = 'local_image_path' # Image path column
  ```

  ### Model Paths

  Default paths configured in `config.py`:
@@ -704,14 +834,157 @@ model.load_state_dict(checkpoint['model_state_dict'])

  # Continue training with your domain-specific data
  ```

  ## 🤝 Contributing

- Contributions are welcome! Feel free to open an issue or a pull request.

- ## 📧 Contact

- Lea Sarfati lea.attia@gmail.com

  ---

- **Note**: This project is under active development. For any questions or issues, please open an issue on the repository.

+ # GAP-CLIP: Guaranteed Attribute Positioning in CLIP Embeddings
+
+ [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
+ [![PyTorch 2.0+](https://img.shields.io/badge/pytorch-2.0+-ee4c2c.svg)](https://pytorch.org/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Hugging Face](https://img.shields.io/badge/🤗-Hugging%20Face-yellow)](https://huggingface.co/Leacb4/gap-clip)
+
+ **Advanced multimodal fashion search model combining specialized color embeddings, hierarchical category embeddings, and CLIP for intelligent fashion item retrieval.**
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Installation (< 1 minute)
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/Leacb4/gap-clip.git
+ cd gap-clip
+
+ # Install the package with pip
+ pip install -e .
+
+ # Or just install the dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### Try It Now (< 2 minutes)
+
+ ```python
+ import torch
+ from example_usage import load_models_from_hf
+
+ # Load pre-trained models from Hugging Face
+ models = load_models_from_hf("Leacb4/gap-clip")
+
+ # Search with text
+ text_query = "red summer dress"
+ text_inputs = models['processor'](text=[text_query], padding=True, return_tensors="pt")
+ text_inputs = {k: v.to(models['device']) for k, v in text_inputs.items()}
+
+ with torch.no_grad():
+     text_features = models['main_model'](**text_inputs).text_embeds
+
+ # Extract the specialized embeddings
+ color_emb = text_features[:, :16]       # Color (dims 0-15)
+ category_emb = text_features[:, 16:80]  # Category (dims 16-79)
+ general_emb = text_features[:, 80:]     # General CLIP (dims 80-511)
+
+ print("✅ Successfully extracted embeddings!")
+ print(f"   Color: {color_emb.shape}, Category: {category_emb.shape}, General: {general_emb.shape}")
+ ```
+
+ ---
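The fixed 16/64/432 split shown in the quick start means similarity can be scored per attribute subspace and reweighted at query time. A minimal sketch of that idea, with random vectors standing in for real GAP-CLIP embeddings; the `cosine` helper and the 0.3/0.3/0.4 weights are illustrative, not part of the released API:

```python
import numpy as np

def cosine(query_vec, item_mat):
    # Cosine similarity between one query vector and a batch of item vectors.
    query_vec = query_vec / np.linalg.norm(query_vec)
    item_mat = item_mat / np.linalg.norm(item_mat, axis=1, keepdims=True)
    return item_mat @ query_vec

rng = np.random.default_rng(0)
query = rng.standard_normal(512)         # stand-in for a text embedding
items = rng.standard_normal((100, 512))  # stand-in for 100 image embeddings

# Score each subspace separately: color (0-15), category (16-79), general (80-511),
# then blend with query-time weights that emphasize the attribute you care about.
score = (
    0.3 * cosine(query[:16], items[:, :16])
    + 0.3 * cosine(query[16:80], items[:, 16:80])
    + 0.4 * cosine(query[80:], items[:, 80:])
)
top5 = np.argsort(-score)[:5]  # indices of the 5 best-matching items
print(top5)
```

Raising the color weight, for example, biases retrieval toward color matches without touching the model.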

  ## 📋 Description


  ### Prerequisites

+ - Python 3.8 or higher
+ - PyTorch 2.0+ (with CUDA for GPU support, optional but recommended)
+ - 16GB RAM minimum (32GB recommended for training)
+ - ~5GB disk space for models and data
+
+ ### Method 1: Install as Package (Recommended)
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/Leacb4/gap-clip.git
+ cd gap-clip
+
+ # Install in development mode
+ pip install -e .
+
+ # Or install with optional dependencies
+ pip install -e ".[dev]"    # With development tools
+ pip install -e ".[optuna]" # With hyperparameter optimization
+ pip install -e ".[all]"    # With all extras
+ ```
+
+ ### Method 2: Install Dependencies Only

  ```bash
  pip install -r requirements.txt
  ```

+ ### Method 3: From Hugging Face (Model Only)
+
+ ```python
+ from example_usage import load_models_from_hf
+ models = load_models_from_hf("Leacb4/gap-clip")
+ ```
+
  ### Main Dependencies

+ | Package | Version | Purpose |
+ |---------|---------|---------|
+ | `torch` | ≥2.0.0 | Deep learning framework |
+ | `transformers` | ≥4.30.0 | Hugging Face CLIP models |
+ | `huggingface-hub` | ≥0.16.0 | Model download/upload |
+ | `pillow` | ≥9.0.0 | Image processing |
+ | `pandas` | ≥1.5.0 | Data manipulation |
+ | `scikit-learn` | ≥1.3.0 | ML metrics & evaluation |
+ | `tqdm` | ≥4.65.0 | Progress bars |
+ | `matplotlib` | ≥3.7.0 | Visualization |
+
+ ### Verify Installation
+
+ ```python
+ # Test that everything works
+ import config
+ config.print_config()
+
+ # Check device
+ print(f"Using device: {config.device}")
+ ```

  ## πŸ“ Project Structure
170
 
 
203
  β”‚ β”œβ”€β”€ optuna_param_importances.png # Parameter importance plot
204
  β”‚ └── optuna_guide.md # Optuna usage guide
205
  β”œβ”€β”€ upload_hf/ # HuggingFace Hub upload utilities
206
+ β”‚ β”œβ”€β”€ upload_to_huggingface.py # Professional upload script (rewritten)
207
+ β”‚ └── README_UPLOAD.md # Complete upload guide
208
+ β”œβ”€β”€ requirements.txt # Python dependencies (organized)
209
+ β”œβ”€β”€ setup.py # Package installation (NEW)
210
+ β”œβ”€β”€ __init__.py # Package initialization (NEW)
211
+ β”œβ”€β”€ .gitignore # Git ignore rules (NEW)
212
  └── README.md # This documentation
213
  ```
214
 
215
  ### Key Files Description
216
 
217
  **Core Model Files**:
218
+ - `color_model.py`: ResNet18-based color embedding model (16 dims) - Bug fixed ✨
219
  - `hierarchy_model.py`: ResNet18-based hierarchy classification model (64 dims)
220
+ - `main_model.py`: GAP-CLIP implementation with enhanced contrastive loss - Bug fixed ✨
221
+ - `train_main_model.py`: Training with Optuna-optimized hyperparameters - Improved ✨
222
+
223
+ **Configuration & Setup** (✨ New/Improved):
224
+ - `config.py`: ✨ Completely rewritten with type hints, auto device detection, validation utilities
225
+ - `setup.py`: ✨ NEW - Professional package installer with CLI entry points
226
+ - `__init__.py`: ✨ NEW - Package initialization for easy imports
227
+ - `.gitignore`: ✨ NEW - Comprehensive Git ignore rules
228
+ - `requirements.txt`: ✨ Improved - Organized with comments and categories
229
  - `tokenizer_vocab.json`: Vocabulary for color model's text encoder
230
 
231
+ **Upload Tools** (✨ Rewritten):
232
+ - `upload_hf/upload_to_huggingface.py`: ✨ Complete professional rewrite with:
233
+ - Object-oriented design
234
+ - Multiple authentication methods
235
+ - Category-based uploads (models, code, docs, etc.)
236
+ - Progress tracking
237
+ - Automatic model card generation
238
+ - Detailed error handling
239
+ - `upload_hf/README_UPLOAD.md`: ✨ NEW - Complete upload guide
240
+
241
  **Evaluation Suite**:
242
  - `main_model_evaluation.py`: Comprehensive evaluation across Fashion-MNIST, KAGL, and local datasets
243
+ - `evaluation/run_all_evaluations.py`: ✨ NEW - Automated evaluation runner with reports
244
+ - Other scripts provide specialized analysis (color, hierarchy, search, t-SNE, etc.)
245
 
246
  **Training Data**:
247
  - `data_with_local_paths.csv`: Main training dataset with text, color, hierarchy, and image paths
248
  - `fashion-mnist_test.csv`: Evaluation dataset for zero-shot generalization testing
249
 
250
+ **CLI Commands** (✨ New):
251
+ After installation with `pip install -e .`, you can use:
252
+ ```bash
253
+ gap-clip-train # Start training
254
+ gap-clip-example # Run usage examples
255
+ ```
256
+
257
  ## πŸ”§ Configuration
258
 
259
+ Main parameters are defined in `config.py` (✨ completely rewritten with improvements):
260
 
261
  ```python
262
+ import config
 
 
263
 
264
+ # Automatic device detection (CUDA > MPS > CPU)
265
+ device = config.device # Automatically selects best available device
266
 
267
+ # Embedding dimensions
268
+ color_emb_dim = config.color_emb_dim # 16 dims (0-15)
269
+ hierarchy_emb_dim = config.hierarchy_emb_dim # 64 dims (16-79)
270
+ main_emb_dim = config.main_emb_dim # 512 dims total
271
+
272
+ # Default training hyperparameters
273
+ batch_size = config.DEFAULT_BATCH_SIZE # 32
274
+ learning_rate = config.DEFAULT_LEARNING_RATE # 1.5e-5
275
+ temperature = config.DEFAULT_TEMPERATURE # 0.09
276
+
277
+ # Utility functions
278
+ config.print_config() # Print current configuration
279
+ config.validate_paths() # Validate that all files exist
280
  ```
281
 
282
+ ### New Features in config.py ✨
283
+
284
+ - **Automatic device detection**: Selects CUDA > MPS > CPU automatically
285
+ - **Type hints**: Full type annotations for better IDE support
286
+ - **Validation**: `validate_paths()` checks all model files exist
287
+ - **Print utility**: `print_config()` shows current settings
288
+ - **Constants**: Pre-defined default hyperparameters
289
+ - **Documentation**: Comprehensive docstrings for all settings
290
+
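The CUDA > MPS > CPU fallback described above can be sketched as a small helper. The function name `pick_device` is illustrative (the actual `config.py` implementation may differ), and this version degrades to `"cpu"` when PyTorch is not installed at all:

```python
def pick_device() -> str:
    """Return the best available compute device: CUDA, then MPS, then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, fall back to CPU
    if torch.cuda.is_available():
        return "cuda"
    # MPS is Apple's Metal backend; guard the attribute for older torch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(f"Selected device: {pick_device()}")
```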

  ### Model Paths

  Default paths configured in `config.py`:

  # Continue training with your domain-specific data
  ```

+ ## 📦 Upload to Hugging Face
+
+ The project includes a **professional upload script** (✨ completely rewritten) for easy deployment:
+
+ ```bash
+ cd upload_hf
+
+ # Authenticate (first time only)
+ huggingface-cli login
+
+ # Upload everything
+ python upload_to_huggingface.py --repo-id your-username/gap-clip --categories all
+
+ # Or upload specific categories
+ python upload_to_huggingface.py --repo-id your-username/gap-clip --categories models code
+
+ # Create a private repository
+ python upload_to_huggingface.py --repo-id your-username/gap-clip --private
+ ```
+
+ **Features**:
+ - ✨ Object-oriented design with a `HuggingFaceUploader` class
+ - ✨ Multiple authentication methods (token, saved, interactive)
+ - ✨ Category-based uploads: models, code, docs, data, optuna, evaluation
+ - ✨ Progress tracking with tqdm
+ - ✨ Automatic model card generation
+ - ✨ Detailed error handling and recovery
+ - ✨ Upload statistics and summary
+
+ See `upload_hf/README_UPLOAD.md` for complete documentation.
+
+ ## 🧪 Testing & Evaluation
+
+ ### Quick Test
+
+ ```bash
+ # Test the configuration
+ python -c "import config; config.print_config()"
+
+ # Test model loading
+ python example_usage.py --repo-id Leacb4/gap-clip --text "red dress"
+ ```
+
+ ### Full Evaluation Suite
+
+ ```bash
+ # Run all evaluations
+ cd evaluation
+ python run_all_evaluations.py --repo-id Leacb4/gap-clip
+
+ # Results will be saved to evaluation_results/ with:
+ # - summary.json: detailed metrics
+ # - summary_comparison.png: visual comparison
+ ```
+
+ ## 🐛 Known Issues & Fixes
+
+ ### Fixed Issues ✨
+
+ 1. **Color model image loading bug** (fixed in `color_model.py`)
+    - Previous: `Image.open(config.column_local_image_path)`
+    - Fixed: `Image.open(img_path)` - now correctly gets the path from the dataframe
+
+ 2. **Function naming in training** (fixed in `main_model.py` and `train_main_model.py`)
+    - Previous: `train_one_epoch_enhanced`
+    - Fixed: `train_one_epoch` - consistent naming
+
+ 3. **Device compatibility** (improved in `config.py`)
+    - Now automatically detects and selects the best device (CUDA > MPS > CPU)
+
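The first fix above boils down to reading the image path from the dataframe row instead of passing the config *column name* to `Image.open`. A toy reproduction with a hypothetical two-row dataframe (the PIL call is left as a comment so nothing is opened):

```python
import pandas as pd

# Config constant: this is a column NAME, not a file path.
column_local_image_path = "local_image_path"

df = pd.DataFrame({column_local_image_path: ["imgs/a.jpg", "imgs/b.jpg"]})

# Buggy:  Image.open(column_local_image_path) tried to open the literal
#         string "local_image_path" rather than an image file.
# Fixed:  look the path up in the current row first, then open it.
img_path = df.iloc[0][column_local_image_path]
# Image.open(img_path) would now receive "imgs/a.jpg"
print(img_path)  # -> imgs/a.jpg
```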
+ ## 🎓 Learning Resources
+
+ ### Documentation Files
+
+ - **README.md** (this file): Complete project documentation
+ - **upload_hf/README_UPLOAD.md**: Upload guide for Hugging Face
+ - **evaluation/**: Multiple evaluation examples
+
+ ### Code Examples
+
+ - **example_usage.py**: Basic usage with the Hugging Face Hub
+ - **evaluation/fashion_search.py**: Interactive search examples
+ - **evaluation/tsne_images.py**: Visualization examples
+
  ## 🤝 Contributing

+ We welcome contributions! Here's how:
+
+ 1. **Report bugs**: Open an issue with a detailed description
+ 2. **Suggest features**: Describe your idea in an issue
+ 3. **Submit a PR**: Fork, create a branch, commit, and open a pull request
+ 4. **Improve docs**: Help make the documentation clearer
+
+ ### Development Setup
+
+ ```bash
+ # Install with dev dependencies
+ pip install -e ".[dev]"
+
+ # Run tests (if available)
+ pytest
+
+ # Format and lint code
+ black .
+ flake8 .
+ ```

+ ## 📊 Project Statistics

+ - **Language**: Python 3.8+
+ - **Framework**: PyTorch 2.0+
+ - **Models**: 3 specialized models (color, hierarchy, main)
+ - **Embedding size**: 512 dimensions
+ - **Training data**: 20,000+ fashion items
+ - **Lines of code**: 5,000+ (including documentation)
+ - **Documentation**: Comprehensive docstrings and guides
+
+ ## 🔗 Links
+
+ - **Hugging Face Hub**: [Leacb4/gap-clip](https://huggingface.co/Leacb4/gap-clip)
+ - **GitHub**: [github.com/Leacb4/gap-clip](https://github.com/Leacb4/gap-clip)
+ - **Contact**: lea.attia@gmail.com
+
+ ## 📧 Contact & Support
+
+ **Author**: Lea Attia Sarfati
+ **Email**: lea.attia@gmail.com
+ **Hugging Face**: [@Leacb4](https://huggingface.co/Leacb4)
+
+ For questions, issues, or suggestions:
+ - 🐛 **Bug reports**: Open an issue on GitHub
+ - 💡 **Feature requests**: Open an issue with the [Feature Request] tag
+ - 📧 **Direct contact**: lea.attia@gmail.com
+ - 💬 **Discussions**: Hugging Face Discussions
+
+ ---
+
+ ## 📜 License
+
+ This project is licensed under the MIT License; see the LICENSE file for details.
+
+ ## 🙏 Acknowledgments
+
+ - The LAION team for the base CLIP model
+ - Hugging Face for the transformers library and model hosting
+ - The PyTorch team for the deep learning framework
+ - The Fashion-MNIST dataset creators
+ - All contributors and users of this project

  ---

+ **⭐ If you find this project useful, please consider giving it a star on GitHub!**
+
+ **📢 Version**: 1.0.0 | **Status**: Production Ready ✅ | **Last Updated**: December 2024