PeterDAI committed on
Commit
c1e0dfe
·
verified ·
1 Parent(s): f5b8f28

Update README.md

Files changed (1)
  1. README.md +77 -1
README.md CHANGED
@@ -3,4 +3,80 @@ license: mit
3
  language:
4
  - en
5
  pipeline_tag: image-segmentation
6
- ---
+ ---
7
+
8
+ # LogoCleaner: Logo Detection and Removal Models
9
+
10
+ This repository hosts the model weights required by the LogoCleaner software, developed as a course project for **COMP4432** at the Department of Computing, The Hong Kong Polytechnic University. The implementation code and detailed usage instructions can be found in the [LogoCleaner GitHub Repository](https://github.com/hiteacherIamhumble/LogoCleaner).
11
+
12
+ ## Model Details
13
+
14
+ ### Model Description
15
+
16
+ This repository contains two key model weights:
17
+
18
+ 1. **sam_vit_b_01ec64.pth**: The Segment Anything Model (SAM) with ViT-B backbone developed by Meta. SAM is a powerful foundation model for image segmentation tasks that can identify objects in images based on prompts.
19
+
20
+ 2. **best_model.pth**: A custom selector module that works on top of the SAM model, specifically trained by our team to identify and select logo regions in images.
21
+
22
+ Together, these models form the backbone of the LogoCleaner application, which can automatically detect and remove logos from images.
23
+
24
+ ### Model Architecture
25
+
26
+ - **SAM (ViT-B)**: A vision transformer-based architecture that serves as a powerful segmentation foundation model.
27
+ - **Selector Module**: A custom neural network that takes SAM's outputs and specializes in logo identification.
28
+
29
+ ## Intended Uses & Limitations
30
+
31
+ ### Intended Uses
32
+
33
+ - Automatic logo detection in images
34
+ - Logo removal and inpainting for privacy or copyright reasons
35
+ - Educational purposes for computer vision and image processing
36
+
37
+ ### Limitations
38
+
39
+ - Performance may vary depending on logo complexity and image quality
40
+ - The model works best with clear, distinct logos rather than heavily stylized or distorted ones
41
+
42
+ ## Training Data
43
+
44
+ The selector module was trained on the [FlickrLogos-32](https://www.uni-augsburg.de/en/fakultaet/fai/informatik/prof/mmc/research/datensatze/flickrlogos/) dataset, released in ICMR11 and updated in ICML2017. It contains photos showing brand logos and is meant for the evaluation of logo retrieval and multi-class logo detection/recognition systems on real-world images.
45
+
46
+ Note that although the dataset is open-source and well-known, we cannot provide a direct download link, because the owner requires an (informal) email to request_flickrlogos@informatik.uni-augsburg.de in order to get the dataset. Our team obtained the original dataset the same way, and we apologize for the inconvenience.
47
+
48
+ ## Training Procedure
49
+
50
+ We trained on a single RTX 4090 graphics card for 50 epochs (the selector module converges at around epoch 45). The training source code is available in our [GitHub Repository](https://github.com/hiteacherIamhumble/LogoCleaner).
51
+
52
+ ## Evaluation Results
53
+
54
+ We use the classic fusion loss (BCE plus Dice loss) for evaluation, and our model outperforms the classic U-Net by 2% in Dice loss. For details, please refer to the report in our [GitHub Repository](https://github.com/hiteacherIamhumble/LogoCleaner).
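
For readers unfamiliar with the fusion loss mentioned above, the following is a minimal pure-Python sketch of a combined BCE and Dice loss over a flattened mask. It is an illustration only, not the project's training code: the function name `bce_dice_loss`, the equal weighting (`bce_weight=0.5`), and the smoothing term `eps` are assumptions, and the actual weighting used by the authors is documented in their report.

```python
import math


def bce_dice_loss(probs, targets, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and Dice loss.

    probs:   predicted foreground probabilities in [0, 1], one per pixel.
    targets: ground-truth labels (0 or 1), same length as probs.
    bce_weight and eps are illustrative defaults (assumptions).
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal in length")

    # Clamp probabilities so that log() never receives 0.
    clamped = [min(max(p, eps), 1.0 - eps) for p in probs]

    # Binary cross-entropy, averaged over pixels.
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for p, t in zip(clamped, targets)
    ) / len(probs)

    # Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|).
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = 1.0 - (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)

    return bce_weight * bce + (1.0 - bce_weight) * dice
```

A perfect prediction yields a loss close to zero, while an inverted mask is penalized by both terms; the Dice term keeps the loss informative when logo pixels are a small fraction of the image.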
55
+
56
+ ## Usage
57
+
58
+ ### Direct Download
59
+
60
+ Both model files can be downloaded directly from this repository's files section.
61
+
62
+ ### Using the Hugging Face Hub
63
+
64
+ ```python
65
+ from huggingface_hub import hf_hub_download
66
+
67
+ # Download the SAM model
68
+ sam_path = hf_hub_download(
69
+     repo_id="PeterDAI/LogoCleaner",
70
+     filename="sam_vit_b_01ec64.pth"
71
+ )
72
+
73
+ # Download the selector model
74
+ selector_path = hf_hub_download(
75
+     repo_id="PeterDAI/LogoCleaner",
76
+     filename="best_model.pth"
77
+ )
78
+ ```
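
After downloading, it can be useful to confirm that both weight files are actually present and non-empty before starting the application. The helper below is a small sketch using only the standard library; `find_missing` is a hypothetical name introduced here and is not part of the LogoCleaner code or of `huggingface_hub`.

```python
from pathlib import Path


def find_missing(paths):
    """Return the paths that do not point to a non-empty regular file."""
    missing = []
    for p in paths:
        path = Path(p)
        # A partially downloaded or absent checkpoint fails this check.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(path))
    return missing
```

With the variables from the snippet above, `find_missing([sam_path, selector_path])` returns an empty list when both checkpoints are in place.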
79
+
80
+ ## Contact
81
+
82
+ If you are the professor or TA assessing our project and encounter any problems, feel free to contact us by [email](mailto:22097845d@connect.polyu.hk).