alexmaks02 committed on
Commit 7d62b66 · verified · 1 Parent(s): d6cf20b

Update README.md

Files changed (1)
  1. README.md +4 -13
README.md CHANGED
@@ -19,23 +19,15 @@ Official page for the paper:
 
  Published in Expert Systems with Applications (Elsevier)
 
- ---
- ## Pipeline
- ![Pipeline](./pipeline.png)
-
- ---
-
- ## Paper
-
  DOI: https://doi.org/10.1016/j.eswa.2026.131646
 
- ---
-
- ## Code
  GitHub repository:
  https://github.com/AlexMaks02/VC-SCMAE
 
  ---
+ ## Pipeline
+ ![Pipeline](./pipeline.png)
+
  ## Highlights
  - Proposes a self-supervised pre-train framework for vehicle-centric visual tasks.
  - Extends CGD-MAE with richer data analysis and an enhanced pre-training design.
@@ -43,11 +35,10 @@ https://github.com/AlexMaks02/VC-SCMAE
  - Ablation and qualitative results validate the proposed design.
  - Improves state-of-the-art vehicle-centric benchmarks in fine-tuning and linear-probe.
 
- ---
  ## Abstract
  In this work, we present VC-SCMAE, a Vehicle-Centric Semantic Contrastive-Guided Masked Autoencoder framework that distills knowledge from multimodal foundational models. Our approach extends MAE pre-training with contrastive guidance, combining masked image modeling with instance-level discrimination to produce more robust and transferable representations. On top of this discriminative backbone, we apply CLIP-style semantic distillation, leveraging a large-scale vehicle dataset (Automobile1M) and a visually grounded unpaired text corpus. Unlike conventional vision–language models that rely on aligned image–text pairs, our method transfers semantic knowledge from a pre-trained CLIP model without requiring explicit alignment. We further introduce specialized distillation losses that enhance open-vocabulary logits during vision-language distillation, thereby strengthening semantic alignment across modalities. Experiments demonstrate that VC-SCMAE effectively transfers to vehicle-specific downstream tasks via both linear probing and fine-tuning, unifying structural, discriminative, and semantic understanding within a single pre-training framework.
 
- ---
+
  ## Citation
  ```bibtex
  @article{MARQUES2026131646,