Commit 891e05c (verified) by xxwyyds · 1 Parent(s): abffab7

Upload 86 files

This view is limited to 50 files because it contains too many changes.
.gitattributes CHANGED
@@ -49,3 +49,19 @@ loupe/ffhq/ffhq-0142.png filter=lfs diff=lfs merge=lfs -text
  loupe/ffhq/ffhq-0154.png filter=lfs diff=lfs merge=lfs -text
  loupe/ffhq/ffhq-0155.png filter=lfs diff=lfs merge=lfs -text
  loupe/ffhq/ffhq-0505.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0001.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0009.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0032.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0041.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0055.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0062.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0085.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0096.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0100.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0112.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0136.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0138.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0142.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0154.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0155.png filter=lfs diff=lfs merge=lfs -text
+ ffhq/ffhq-0505.png filter=lfs diff=lfs merge=lfs -text
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 kamichanw
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,14 +1,100 @@
- ---
- title: Loupe
- emoji: 🌖
- colorFrom: pink
- colorTo: blue
- sdk: gradio
- sdk_version: 5.36.2
- app_file: app.py
- pinned: false
- license: mit
- short_description: deepfake image detection and localization.
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Loupe
+
+ The first-place solution to IJCAI 2025 Challenge Track 1: Forgery Image Detection and Localization. The top five of the final leaderboard:
+
+ | User | Overall Score |
+ |:---:|:---:|
+ | Loupe (ours) | **0.846** |
+ | Rank 2 | 0.8161 |
+ | Rank 3 | 0.8151 |
+ | Rank 4 | 0.815 |
+ | Rank 5 | 0.815 |
+
+ ## Setup
+ ### 1. Create environment
+ ```bash
+ conda create -y -n loupe python=3.11
+ conda activate loupe
+ pip install -r requirements.txt
+ mkdir -p ./pretrained_weights/PE-Core-L14-336
+ ```
+
+ ### 2. Prepare pretrained weights
+ Download [Perception Encoder](https://github.com/facebookresearch/perception_models) following its original instructions, and place `PE-Core-L14-336.pt` in `./pretrained_weights/PE-Core-L14-336`. This can be done with `huggingface-cli`:
+ ```bash
+ export HF_ENDPOINT=https://hf-mirror.com  # optional: only needed if huggingface.co is unreachable
+ huggingface-cli download facebook/PE-Core-L14-336 PE-Core-L14-336.pt --local-dir ./pretrained_weights/PE-Core-L14-336
+ ```
+
+ ### 3. Prepare datasets
+ Download the dataset to any location of your choice, then use the [`dataset_preprocess.ipynb`](./dataset_preprocess.ipynb) notebook to preprocess it. This converts the dataset into a directly loadable `DatasetDict` and saves it in `parquet` format.
+
+ After preprocessing, you will obtain a dataset with three splits: `train`, `valid`, and `test`. Each item in these splits has the following structure:
+
+ ```python
+ {
+     "image": "path/to/image",  # loaded as an actual PIL.Image.Image object
+     "mask": "path/to/mask",    # set to None for real images without masks
+     "name": "basename_of_image.png"
+ }
+ ```
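Given this schema, the binary real/fake label falls out of the `mask` field directly. A minimal sketch of that convention (the `forgery_label` helper is illustrative, not part of the repo):

```python
# Hypothetical helper: derive a binary forgery label from a dataset item,
# relying on the convention above that real images have mask=None.
def forgery_label(item: dict) -> int:
    return 0 if item["mask"] is None else 1

real = {"image": "a.png", "mask": None, "name": "a.png"}
fake = {"image": "b.png", "mask": "b_mask.png", "name": "b.png"}
print(forgery_label(real), forgery_label(fake))  # 0 1
```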
+
+ > [!NOTE]
+ > You can also adapt Loupe to a new dataset. In that case, modify [`dataset_preprocess.ipynb`](./dataset_preprocess.ipynb) for your own use.
+
+ After preparation, the last step is to specify `path/to/your/dataset` in [dataset.yaml](configs/dataset.yaml).
+
+ ## How to train
+ Loupe employs a two- or three-stage training process. The first stage trains the classifier and can be run with:
+
+ ```bash
+ python src/train.py stage=cls
+ ```
+
+ During training, two directories will be created automatically:
+
+ * `./results/checkpoints` — contains the DeepSpeed-format checkpoint with the highest AUC on the validation set (when using the default training strategy, which can be configured in `./configs/base.yaml`).
+ * `./results/{stage.name}` — contains logs in TensorBoard format. You can monitor training progress by running:
+
+ ```bash
+ tensorboard --logdir=./results/cls
+ # or `tensorboard --logdir=./results/seg`, etc.
+ ```
+
+ After training completes, the best checkpoint is saved in the directory `./checkpoints/cls-auc=xxx.ckpt`. This directory contains several config files as well as `model.safetensors`, which stores the best weights in the safetensors format.
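A `model.safetensors` file follows the standard safetensors container layout: an 8-byte little-endian header length, then a JSON header describing each tensor. That means the tensor inventory of a checkpoint can be inspected with the stdlib alone; a sketch on a tiny in-memory file (the reader function is illustrative only; for real use, prefer the `safetensors` library):

```python
import io
import json
import struct

# Read the safetensors JSON header: 8-byte little-endian length N,
# followed by N bytes of JSON metadata, then the raw tensor data.
def read_safetensors_header(buf):
    (n,) = struct.unpack("<Q", buf.read(8))
    return json.loads(buf.read(n))

# Build a tiny in-memory "file" holding one fp32 tensor named "w".
header = {"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
header_bytes = json.dumps(header).encode()
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + struct.pack("<f", 1.0)

meta = read_safetensors_header(io.BytesIO(blob))
print(meta)  # {'w': {'dtype': 'F32', 'shape': [1], 'data_offsets': [0, 4]}}
```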
+
+ The second stage trains the segmentation head. To run it, simply replace `stage=cls` with `stage=seg` in the stage 1 command.
+
+ The third stage is optional: it jointly trains the backbone, classifier head, and segmentation head. By default, a portion of the validation set is used as training data, while the remainder is reserved for validation. The validation set serves as extra training data because the competition's test set is slightly out-of-distribution (OOD); continuing to train on the original training set resulted in overfitting. However, if you prefer to train the whole network from scratch directly on the training set, you can do so with:
+ ```bash
+ python src/train.py stage=cls_seg \
+     ckpt.checkpoint_paths=[] \
+     model.freeze_backbone=true \
+     stage.train_on_trainset=true
+ ```
+
+ All training configurations can be adjusted in the `configs/` directory; detailed comments are provided there to make configuration quick and clear.
+
+ ## How to test or predict
+ By default, testing is performed on the full validation set. This makes it unsuitable for evaluating a model trained in the third stage, since that stage trains on part of the validation set itself (see above). Alternatively, if you make a slight modification to the [data loading process](./src/data_module.py) so that Loupe trains on the training set instead, this limitation disappears.
+
+ To evaluate a trained model, run:
+ ```bash
+ python src/infer.py stage=test ckpt.checkpoint_paths=["checkpoints/cls/model.safetensors","checkpoints/seg/model.safetensors"]
+ ```
+
+ The `ckpt.checkpoint_paths` configuration is defined under `configs/ckpt`. It is a list specifying the checkpoints to load sequentially during execution.
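Presumably, sequential loading means later checkpoints in `ckpt.checkpoint_paths` override any overlapping keys from earlier ones; that assumed semantics can be sketched with plain dicts standing in for state dicts (not the repo's actual loader):

```python
# Assumption: checkpoints are applied in list order, later entries
# overriding earlier ones that share a key. Plain dicts stand in for
# the real state dicts here.
cls_ckpt = {"backbone.w": 1, "cls_head.w": 2}
seg_ckpt = {"backbone.w": 10, "seg_head.w": 3}

merged = {}
for ckpt in [cls_ckpt, seg_ckpt]:  # order follows ckpt.checkpoint_paths
    merged.update(ckpt)

print(merged)  # {'backbone.w': 10, 'cls_head.w': 2, 'seg_head.w': 3}
```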
+
+ The prediction step is essentially the same as the test step; you only need one additional parameter specifying the output directory for predictions. For example:
+
+ ```bash
+ python src/infer.py stage=test \
+     ckpt.checkpoint_paths=["checkpoints/cls/model.safetensors","checkpoints/seg/model.safetensors"] \
+     stage.pred_output_dir=./pred_outputs
+ ```
+
+ The classification predictions will be saved in `./pred_outputs/predictions.txt`, and the mask outputs will be stored in `./pred_outputs/masks`. For more details on the available parameters, see `configs/stage/test.yaml`.
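If you need to post-process the classification output, a tiny parser helps; this assumes a hypothetical one-`name probability`-pair-per-line layout for `predictions.txt`, so check the actual format written by `src/infer.py` before relying on it:

```python
# Hypothetical parser for predictions.txt, assuming one
# "<image_name> <forgery_probability>" pair per line. Adjust the split
# logic to match whatever src/infer.py actually writes.
def parse_predictions(text):
    preds = {}
    for line in text.strip().splitlines():
        name, prob = line.split()
        preds[name] = float(prob)
    return preds

sample = "img_0001.png 0.93\nimg_0002.png 0.04"
print(parse_predictions(sample))  # {'img_0001.png': 0.93, 'img_0002.png': 0.04}
```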
+
+ ## Code reading guide
+ Nobody seems to care about this work yet, so this section is left blank for now.
app.py ADDED
@@ -0,0 +1,341 @@
+ import sys
+ sys.path.insert(0, "./src")  # make the src/ modules importable before importing from them
+
+ import gradio as gr
+ import os
+ import tempfile
+ import numpy as np
+ from PIL import Image
+ from src.predict import process_single_image
+
+ def get_example_images(folder_path="/home/xxw/Loupe/ffhq"):
+     return [os.path.join(folder_path, f) for f in os.listdir(folder_path)
+             if f.lower().endswith(('.png', '.jpg', '.jpeg'))]
+
+ def safe_extract_prob(cls_probs):
+     """Safely extract a probability value from cls_probs."""
+     try:
+         if cls_probs is None:
+             return 0.0
+         elif isinstance(cls_probs, (list, np.ndarray)) and len(cls_probs) > 0:
+             return float(cls_probs[0])
+         elif hasattr(cls_probs, '__getitem__'):
+             return float(cls_probs[0])
+         else:
+             return float(cls_probs)
+     except (TypeError, IndexError, ValueError) as e:
+         print(f"Error extracting probability: {e}")
+         return 0.0
+
+ with gr.Blocks(title="Loupe Image Forgery Detection System", theme=gr.themes.Soft()) as demo:
+     gr.Markdown("""
+     # Loupe 🕵️‍♂️ Image Forgery Detection System
+     ### Upload an image or pick an example; the system will detect forged regions in it
+     """)
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             with gr.Tab("Upload Image"):
+                 image_input = gr.Image(type="pil", label="Original image")
+                 upload_button = gr.Button("Detect Forgery", variant="primary")
+
+             with gr.Tab("Choose Example"):
+                 example_images = get_example_images()
+                 example_dropdown = gr.Dropdown(
+                     choices=example_images,
+                     label="Select an example image",
+                     value=example_images[0] if example_images else None
+                 )
+                 example_button = gr.Button("Detect Example", variant="secondary")
+
+             with gr.Accordion("Advanced Options", open=False):
+                 threshold = gr.Slider(0, 1, value=0.5, label="Detection threshold")
+
+         with gr.Column(scale=1):
+             gr.Markdown("### Detection Results")
+             with gr.Tabs():
+                 with gr.Tab("Processed Image"):
+                     output_image = gr.Image(label="Forgery detection result", interactive=False)
+
+                 with gr.Tab("Comparison"):
+                     with gr.Row():
+                         original_display = gr.Image(label="Original image", interactive=False)
+                         processed_display = gr.Image(label="Processed image", interactive=False)
+
+             with gr.Group():
+                 with gr.Row():
+                     fake_prob = gr.Number(label="Forgery probability", precision=4)
+                     # Simplified to just show the probability as text
+                     result_text = gr.Textbox(label="Verdict", interactive=False)
+
+             save_button = gr.Button("Save Result", variant="secondary")
+
+     gr.Markdown("""
+     ---
+     ### About
+     - **Technology**: Forgery Image Detection and Localization.
+     - **Version**: 1.0.0
+     """)
+
+     def process_image(image, threshold_value):
+         # note: threshold_value is currently unused by the inference path
+         with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
+             image_path = tmp_file.name
+             image.save(image_path)
+
+         try:
+             processed_img, cls_probs = process_single_image(image_path)
+
+             # Safely extract the probability value
+             prob = safe_extract_prob(cls_probs)
+             print(f"Classification probability: {prob:.4f}" if prob is not None else "No cls output")
+
+             return {
+                 output_image: processed_img,
+                 original_display: image,
+                 processed_display: processed_img,
+                 fake_prob: prob,
+                 result_text: f"Forgery probability: {prob:.4f}"
+             }
+         except Exception as e:
+             print(f"Error in processing: {e}")
+             return {
+                 output_image: None,
+                 original_display: image,
+                 processed_display: None,
+                 fake_prob: 0.0,
+                 result_text: "Processing error"
+             }
+         finally:
+             if os.path.exists(image_path):
+                 os.unlink(image_path)
+
+     def process_example(image_path, threshold_value):
+         try:
+             processed_img, cls_probs = process_single_image(image_path)
+
+             # Safely extract the probability value
+             prob = safe_extract_prob(cls_probs)
+             print(f"Classification probability: {prob:.4f}" if prob is not None else "No cls output")
+
+             original_img = Image.open(image_path)
+
+             return {
+                 image_input: original_img,
+                 output_image: processed_img,
+                 original_display: original_img,
+                 processed_display: processed_img,
+                 fake_prob: prob,
+                 result_text: f"Forgery probability: {prob:.4f}",
+                 threshold: threshold_value
+             }
+         except Exception as e:
+             print(f"Error in processing example: {e}")
+             return {
+                 image_input: None,
+                 output_image: None,
+                 original_display: None,
+                 processed_display: None,
+                 fake_prob: 0.0,
+                 result_text: "Processing error",
+                 threshold: threshold_value
+             }
+
+     upload_button.click(
+         process_image,
+         [image_input, threshold],
+         [output_image, original_display, processed_display, fake_prob, result_text]
+     )
+
+     example_button.click(
+         process_example,
+         [example_dropdown, threshold],
+         [image_input, output_image, original_display, processed_display, fake_prob, result_text, threshold]
+     )
+
+     save_button.click(
+         lambda img: img.save("result.jpg") if img else None,
+         [output_image],
+         None
+     )
+
+ if __name__ == "__main__":
+     demo.launch(server_name="0.0.0.0", server_port=7860)
app3.py ADDED
@@ -0,0 +1,591 @@
+ import sys
+ sys.path.insert(0, "./src")  # make the src/ modules importable before importing from them
+
+ import gradio as gr
+ import os
+ import tempfile
+ import numpy as np
+ from PIL import Image
+ from src.predict import process_single_image
+
+ # Custom theme: colorful and modern
+ custom_theme = gr.themes.Default(
+     primary_hue="purple",
+     secondary_hue="pink",
+     neutral_hue="slate",
+     font=[gr.themes.GoogleFont("Poppins"), gr.themes.GoogleFont("Inter"), "Arial", "sans-serif"]
+ ).set(
+     button_primary_background_fill="linear-gradient(45deg, #667eea 0%, #764ba2 100%)",
+     button_primary_background_fill_hover="linear-gradient(45deg, #764ba2 0%, #667eea 100%)",
+     button_primary_text_color="white",
+     button_secondary_background_fill="linear-gradient(45deg, #f093fb 0%, #f5576c 100%)",
+     button_secondary_background_fill_hover="linear-gradient(45deg, #f5576c 0%, #f093fb 100%)",
+     button_secondary_text_color="white"
+ )
+
+ def get_example_images(folder_path="ffhq"):
+     """Get the list of example images."""
+     return sorted([os.path.join(folder_path, f) for f in os.listdir(folder_path)
+                    if f.lower().endswith(('.png', '.jpg', '.jpeg'))])
+
+ def safe_extract_prob(cls_probs):
+     """Safely extract a probability value from cls_probs."""
+     try:
+         if cls_probs is None:
+             return 0.0
+         elif isinstance(cls_probs, (list, np.ndarray)) and len(cls_probs) > 0:
+             return float(cls_probs[0])
+         elif hasattr(cls_probs, '__getitem__'):
+             return float(cls_probs[0])
+         else:
+             return float(cls_probs)
+     except (TypeError, IndexError, ValueError) as e:
+         print(f"Error extracting probability: {e}")
+         return 0.0
+
+ # Create the main interface
+ with gr.Blocks(
+     title="Loupe - AI Image Forgery Detection System",
+     theme=custom_theme,
+     css="""
+     /* Global styles */
+     body {
+         background: linear-gradient(-45deg, #ee7752, #e73c7e, #23a6d5, #23d5ab);
+         background-size: 400% 400%;
+         animation: gradientBG 15s ease infinite;
+         min-height: 100vh;
+     }
+
+     @keyframes gradientBG {
+         0% { background-position: 0% 50%; }
+         50% { background-position: 100% 50%; }
+         100% { background-position: 0% 50%; }
+     }
+
+     /* Main container */
+     .gradio-container {
+         background: rgba(255, 255, 255, 0.95);
+         backdrop-filter: blur(10px);
+         border-radius: 20px;
+         box-shadow: 0 20px 40px rgba(0, 0, 0, 0.1);
+         margin: 20px;
+         padding: 20px;
+     }
+
+     /* Title */
+     .title-box {
+         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+         padding: 30px;
+         border-radius: 15px;
+         margin-bottom: 30px;
+         box-shadow: 0 15px 35px rgba(102, 126, 234, 0.3);
+         position: relative;
+         overflow: hidden;
+     }
+
+     .title-box::before {
+         content: '';
+         position: absolute;
+         top: -50%;
+         left: -50%;
+         width: 200%;
+         height: 200%;
+         background: linear-gradient(45deg, transparent, rgba(255, 255, 255, 0.1), transparent);
+         animation: shine 3s infinite;
+     }
+
+     @keyframes shine {
+         0% { transform: translateX(-100%) translateY(-100%) rotate(45deg); }
+         100% { transform: translateX(100%) translateY(100%) rotate(45deg); }
+     }
+
+     .title-text {
+         font-weight: 700;
+         font-size: 32px;
+         color: white;
+         margin-bottom: 8px;
+         text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
+         background: linear-gradient(45deg, #fff, #f0f8ff);
+         -webkit-background-clip: text;
+         -webkit-text-fill-color: transparent;
+         background-clip: text;
+     }
+
+     .subtitle-text {
+         color: rgba(255, 255, 255, 0.9);
+         font-size: 18px;
+         font-weight: 300;
+         text-shadow: 1px 1px 2px rgba(0, 0, 0, 0.2);
+     }
+
+     /* Input and result boxes */
+     .input-box, .result-box {
+         background: linear-gradient(145deg, rgba(255, 255, 255, 0.9), rgba(248, 250, 252, 0.9));
+         padding: 25px;
+         border-radius: 15px;
+         margin-bottom: 20px;
+         border: 1px solid rgba(255, 255, 255, 0.3);
+         box-shadow: 0 10px 30px rgba(0, 0, 0, 0.1);
+         backdrop-filter: blur(10px);
+         transition: all 0.3s ease;
+     }
+
+     .input-box:hover, .result-box:hover {
+         transform: translateY(-5px);
+         box-shadow: 0 20px 40px rgba(0, 0, 0, 0.15);
+     }
+
+     .input-title, .result-title {
+         font-weight: 700;
+         background: linear-gradient(45deg, #667eea, #764ba2);
+         -webkit-background-clip: text;
+         -webkit-text-fill-color: transparent;
+         background-clip: text;
+         margin-bottom: 15px;
+         font-size: 20px;
+         text-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
+     }
+
+     /* Buttons */
+     .btn-primary {
+         background: linear-gradient(45deg, #667eea 0%, #764ba2 100%);
+         border: none;
+         border-radius: 25px;
+         padding: 12px 30px;
+         font-weight: 600;
+         text-transform: uppercase;
+         letter-spacing: 1px;
+         box-shadow: 0 10px 20px rgba(102, 126, 234, 0.3);
+         transition: all 0.3s ease;
+     }
+
+     .btn-primary:hover {
+         transform: translateY(-3px);
+         box-shadow: 0 15px 30px rgba(102, 126, 234, 0.4);
+         background: linear-gradient(45deg, #764ba2 0%, #667eea 100%);
+     }
+
+     .btn-secondary {
+         background: linear-gradient(45deg, #f093fb 0%, #f5576c 100%);
+         border: none;
+         border-radius: 25px;
+         padding: 10px 25px;
+         font-weight: 600;
+         box-shadow: 0 8px 16px rgba(240, 147, 251, 0.3);
+         transition: all 0.3s ease;
+     }
+
+     .btn-secondary:hover {
+         transform: translateY(-2px);
+         box-shadow: 0 12px 24px rgba(240, 147, 251, 0.4);
+     }
+
+     /* Image upload area */
+     #upload_image {
+         min-height: 350px;
+         border: 3px dashed rgba(102, 126, 234, 0.3);
+         border-radius: 15px;
+         background: linear-gradient(45deg, rgba(102, 126, 234, 0.05), rgba(118, 75, 162, 0.05));
+         transition: all 0.3s ease;
+     }
+
+     #upload_image:hover {
+         border-color: rgba(102, 126, 234, 0.6);
+         background: linear-gradient(45deg, rgba(102, 126, 234, 0.1), rgba(118, 75, 162, 0.1));
+         transform: scale(1.02);
+     }
+
+     /* Probability display */
+     #probability input {
+         font-weight: bold;
+         background: linear-gradient(45deg, #667eea, #764ba2);
+         -webkit-background-clip: text;
+         -webkit-text-fill-color: transparent;
+         background-clip: text;
+         font-size: 1.2em;
+     }
+
+     #result_text input {
+         font-size: 1.1em;
+         font-weight: 600;
+         background: linear-gradient(45deg, rgba(102, 126, 234, 0.1), rgba(118, 75, 162, 0.1));
+         border-radius: 10px;
+         border: 2px solid rgba(102, 126, 234, 0.2);
+     }
+
+     /* Gallery */
+     .gallery-item {
+         border-radius: 12px !important;
+         transition: all 0.3s ease;
+         box-shadow: 0 5px 15px rgba(0, 0, 0, 0.1);
+     }
+
+     .gallery-item:hover {
+         transform: scale(1.05);
+         box-shadow: 0 10px 25px rgba(0, 0, 0, 0.2);
+     }
+
+     /* Example button */
+     .example-btn {
+         margin-top: 15px;
+         width: 100%;
+         background: linear-gradient(45deg, #23a6d5 0%, #23d5ab 100%);
+         border-radius: 20px;
+         font-weight: 600;
+         box-shadow: 0 8px 16px rgba(35, 166, 213, 0.3);
+         transition: all 0.3s ease;
+     }
+
+     .example-btn:hover {
+         transform: translateY(-2px);
+         box-shadow: 0 12px 24px rgba(35, 166, 213, 0.4);
+     }
+
+     /* Tabs */
+     .tab-nav button {
+         border-radius: 15px 15px 0 0;
+         background: linear-gradient(45deg, rgba(102, 126, 234, 0.8), rgba(118, 75, 162, 0.8));
+         color: white;
+         font-weight: 600;
+         transition: all 0.3s ease;
+     }
+
+     .tab-nav button:hover {
+         background: linear-gradient(45deg, rgba(118, 75, 162, 0.9), rgba(102, 126, 234, 0.9));
+         transform: translateY(-2px);
+     }
+
+     /* Slider */
+     .gr-slider input[type="range"] {
+         background: linear-gradient(45deg, #667eea, #764ba2);
+         border-radius: 10px;
+     }
+
+     /* Accordion */
+     .gr-accordion {
+         background: linear-gradient(145deg, rgba(255, 255, 255, 0.8), rgba(248, 250, 252, 0.8));
+         border-radius: 15px;
+         border: 1px solid rgba(102, 126, 234, 0.2);
+         box-shadow: 0 5px 15px rgba(0, 0, 0, 0.1);
+     }
+
+     /* Colorful loading animation */
+     @keyframes rainbow {
+         0% { background-position: 0% 50%; }
+         50% { background-position: 100% 50%; }
+         100% { background-position: 0% 50%; }
+     }
+
+     .processing {
+         background: linear-gradient(-45deg, #ee7752, #e73c7e, #23a6d5, #23d5ab);
+         background-size: 400% 400%;
+         animation: rainbow 2s ease infinite;
+     }
+
+     /* Responsive design */
+     @media (max-width: 768px) {
+         .title-text { font-size: 24px; }
+         .subtitle-text { font-size: 16px; }
+         .input-box, .result-box { padding: 20px; }
+     }
+     """
+ ) as demo:
292
+
293
+ # 标题部分 - 炫彩渐变设计
294
+ with gr.Column(elem_classes="title-box"):
295
+ gr.Markdown("""
296
+ <div class="title-text">🔍 Loupe 图像伪造检测系统</div>
297
+ <div class="subtitle-text">✨ 基于深度学习的图像伪造检测与定位技术</div>
298
+ """)
299
+
300
+ # 添加装饰性分割线
301
+ gr.HTML("""
302
+ <div style="height: 4px; background: linear-gradient(90deg, #667eea, #764ba2, #f093fb, #f5576c, #23a6d5, #23d5ab);
303
+ border-radius: 2px; margin: 20px 0; box-shadow: 0 2px 10px rgba(0,0,0,0.2);"></div>
304
+ """)
305
+
306
+ # 主界面组件
307
+ with gr.Row(equal_height=True):
308
+ with gr.Column(scale=1, min_width=300):
309
+ # 输入图像区域 - 炫彩设计
310
+ with gr.Column(elem_classes="input-box"):
311
+ gr.Markdown("""<div class="input-title">🎨 输入图像</div>""")
312
+ with gr.Tabs():
313
+ with gr.Tab("📤 上传图片", id="upload_tab"):
314
+ image_input = gr.Image(type="pil", label="", elem_id="upload_image")
315
+ upload_button = gr.Button("🚀 开始检测", variant="primary", size="lg", elem_classes="btn-primary")
316
+
317
+ with gr.Tab("🖼️ 示例图片", id="example_tab"):
318
+ example_images = get_example_images()
319
+ example_gallery = gr.Gallery(
320
+ value=example_images,
321
+ label="",
322
+ columns=4,
323
+ rows=None,
324
+ height="auto",
325
+ object_fit="contain",
326
+ allow_preview=True,
327
+ selected_index=None
328
+ )
329
+ # 添加炫彩检测按钮
330
+ example_button = gr.Button(
331
+ "✨ 检测选中的示例图片",
332
+ variant="primary",
333
+ elem_classes="example-btn"
334
+ )
335
+ # 隐藏组件用于存储选中索引
336
+ selected_index = gr.Number(visible=False)
337
+
338
+ with gr.Accordion("⚙️ 高级设置", open=False):
339
+ threshold = gr.Slider(0, 1, value=0.5, step=0.01, label="🎯 检测敏感度")
340
+ gr.HTML("""
341
+ <div style="background: linear-gradient(45deg, rgba(102,126,234,0.1), rgba(118,75,162,0.1));
342
+ padding: 10px; border-radius: 8px; margin-top: 10px;">
343
+ <small style="color: #667eea; font-weight: 500;">💡 调整数值可改变检测的严格程度</small>
344
+ </div>
345
+ """)
346
+
347
+ with gr.Column(scale=2, min_width=500):
348
+ # Detection result area - colorful design
349
+ with gr.Column(elem_classes="result-box"):
350
+ gr.Markdown("""<div class="result-title">🎯 检测结果</div>""")
351
+ with gr.Tabs():
352
+ with gr.Tab("🔍 检测效果", id="result_tab"):
353
+ output_image = gr.Image(label="伪造区域标记", interactive=False)
354
+
355
+ with gr.Tab("⚖️ 对比视图", id="compare_tab"):
356
+ with gr.Row():
357
+ original_display = gr.Image(label="原始图像", interactive=False)
358
+ processed_display = gr.Image(label="检测结果", interactive=False)
359
+
360
+ with gr.Group():
361
+ with gr.Row():
362
+ fake_prob = gr.Number(label="🎲 伪造概率", precision=2, elem_id="probability")
363
+ result_text = gr.Textbox(label="📝 检测结论", interactive=False, elem_id="result_text")
364
+
365
+ with gr.Row():
366
+ save_button = gr.Button("💾 保存结果", variant="secondary", elem_classes="btn-secondary")
367
+ clear_button = gr.Button("🧹 清除", variant="secondary", elem_classes="btn-secondary")
368
+
369
+ # About section - colorful design
370
+ with gr.Accordion("🌟 关于系统", open=False):
371
+ gr.HTML("""
372
+ <div style="background: linear-gradient(135deg, rgba(102,126,234,0.1), rgba(118,75,162,0.1), rgba(240,147,251,0.1));
373
+ padding: 20px; border-radius: 15px; border: 1px solid rgba(102,126,234,0.2);">
374
+ <h3 style="background: linear-gradient(45deg, #667eea, #764ba2); -webkit-background-clip: text;
375
+ -webkit-text-fill-color: transparent; margin-bottom: 15px;">
376
+ ✨ Loupe 伪造图像检测系统
377
+ </h3>
378
+ <div style="display: grid; grid-template-columns: repeat(auto-fit, minmax(200px, 1fr)); gap: 15px;">
379
+ <div style="background: rgba(102,126,234,0.1); padding: 15px; border-radius: 10px;">
380
+ <strong style="color: #667eea;">🚀 技术</strong><br>
381
+ 基于深度学习的图像伪造检测与定位
382
+ </div>
383
+ <div style="background: rgba(118,75,162,0.1); padding: 15px; border-radius: 10px;">
384
+ <strong style="color: #764ba2;">⭐ 特点</strong><br>
385
+ 高精度、实时处理、可解释性强
386
+ </div>
387
+ <div style="background: rgba(240,147,251,0.1); padding: 15px; border-radius: 10px;">
388
+ <strong style="color: #f093fb;">📱 版本</strong><br>
389
+ v2.0.0 炫彩版
390
+ </div>
391
+ <div style="background: rgba(245,87,108,0.1); padding: 15px; border-radius: 10px;">
392
+ <strong style="color: #f5576c;">👥 开发者</strong><br>
393
+ EVOL Lab (jyc, xxw)
394
+ </div>
395
+ </div>
396
+ <div style="margin-top: 20px; padding: 15px; background: linear-gradient(45deg, rgba(35,166,213,0.1), rgba(35,213,171,0.1));
397
+ border-radius: 10px; border-left: 4px solid #23a6d5;">
398
+ <strong style="color: #23a6d5;">💡 系统介绍</strong><br>
399
+ 本系统可检测多种图像篡改痕迹,包括复制-移动、拼接、擦除等操作。采用最新的深度学习算法,提供高精度的检测结果和直观的可视化分析。
400
+ </div>
401
+ </div>
402
+ """)
403
+
404
+ # Footer - colorful design
405
+ gr.HTML("""
406
+ <div style="margin-top: 40px; padding: 20px; text-align: center;
407
+ background: linear-gradient(135deg, rgba(102,126,234,0.1), rgba(118,75,162,0.1));
408
+ border-radius: 15px; border-top: 2px solid rgba(102,126,234,0.3);">
409
+ <div style="background: linear-gradient(45deg, #667eea, #764ba2); -webkit-background-clip: text;
410
+ -webkit-text-fill-color: transparent; font-weight: 600; margin-bottom: 10px;">
411
+ ✨ 感谢使用 Loupe 图像伪造检测系统 ✨
412
+ </div>
413
+ <div style="color: #64748b; font-size: 14px;">
414
+ © 2025 EVOL Lab | 让AI守护图像真实性 🛡️
415
+ </div>
416
+ <div style="margin-top: 10px;">
417
+ <span style="background: linear-gradient(45deg, #f093fb, #f5576c); -webkit-background-clip: text;
418
+ -webkit-text-fill-color: transparent; font-weight: 500;">
419
+ 🌟 科技点亮未来,智能守护真实 🌟
420
+ </span>
421
+ </div>
422
+ </div>
423
+ """)
424
+
425
+ def process_image(image, threshold_value):
426
+ """Process an uploaded image."""
427
+ if image is None:
428
+ return {
429
+ output_image: None,
430
+ fake_prob: 0.0,
431
+ result_text: "❌ 请上传有效图像"
432
+ }
433
+
434
+ with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
435
+ image_path = tmp_file.name
436
+ image.save(image_path)
437
+
438
+ try:
439
+ processed_img, cls_probs = process_single_image(image_path)
440
+ prob = safe_extract_prob(cls_probs)
441
+
442
+ # Generate a verdict based on the probability
443
+ if prob > threshold_value + 0.2:
444
+ conclusion = "🚨 高度疑似伪造"
445
+ emoji = "🔴"
446
+ elif prob > threshold_value:
447
+ conclusion = "⚠️ 可能伪造"
448
+ emoji = "🟡"
449
+ else:
450
+ conclusion = "✅ 未检测到伪造"
451
+ emoji = "🟢"
452
+
453
+ return {
454
+ output_image: processed_img,
455
+ original_display: image,
456
+ processed_display: processed_img,
457
+ fake_prob: prob,
458
+ result_text: f"{emoji} {conclusion} (概率: {prob:.2f})"
459
+ }
460
+ except Exception as e:
461
+ print(f"Error in processing: {e}")
462
+ return {
463
+ output_image: None,
464
+ original_display: image,
465
+ processed_display: None,
466
+ fake_prob: 0.0,
467
+ result_text: f"❌ 处理错误: {str(e)}"
468
+ }
469
+ finally:
470
+ if os.path.exists(image_path):
471
+ os.unlink(image_path)
472
+
473
+ def process_example(example_data, selected_idx, threshold_value):
474
+ """Process a selected example image."""
475
+ if not example_data or selected_idx is None:
476
+ return {
477
+ image_input: None,
478
+ output_image: None,
479
+ original_display: None,
480
+ processed_display: None,
481
+ fake_prob: 0.0,
482
+ result_text: "⚠️ 请先选择示例图片",
483
+ threshold: threshold_value
484
+ }
485
+
486
+ try:
487
+ selected_idx = int(selected_idx)
488
+ image_info = example_data[selected_idx]
489
+
490
+ # Handle different gallery data formats
491
+ if isinstance(image_info, (tuple, list)):
492
+ image_path = image_info[0] # (path, caption) format
493
+ elif isinstance(image_info, dict):
494
+ image_path = image_info.get("name", image_info.get("path"))
495
+ else:
496
+ image_path = image_info
497
+
498
+ print(f"Processing selected image (index {selected_idx}): {image_path}") # debug log
499
+
500
+ # Run detection on the image
501
+ processed_img, cls_probs = process_single_image(image_path)
502
+ prob = safe_extract_prob(cls_probs)
503
+ original_img = Image.open(image_path)
504
+
505
+ # Generate a verdict based on the probability
506
+ if prob > threshold_value + 0.2:
507
+ conclusion = "🚨 高度疑似伪造"
508
+ emoji = "🔴"
509
+ elif prob > threshold_value:
510
+ conclusion = "⚠️ 可能伪造"
511
+ emoji = "🟡"
512
+ else:
513
+ conclusion = "✅ 未检测到伪造"
514
+ emoji = "🟢"
515
+
516
+ return {
517
+ image_input: original_img,
518
+ output_image: processed_img,
519
+ original_display: original_img,
520
+ processed_display: processed_img,
521
+ fake_prob: prob,
522
+ result_text: f"{emoji} {conclusion} (概率: {prob:.2f})",
523
+ threshold: threshold_value
524
+ }
525
+ except Exception as e:
526
+ print(f"Error in processing example: {e}")
527
+ return {
528
+ image_input: None,
529
+ output_image: None,
530
+ original_display: None,
531
+ processed_display: None,
532
+ fake_prob: 0.0,
533
+ result_text: f"❌ 示例处理错误: {str(e)}",
534
+ threshold: threshold_value
535
+ }
536
+
537
+ def clear_all():
538
+ """Clear all inputs and outputs."""
539
+ return {
540
+ image_input: None,
541
+ output_image: None,
542
+ original_display: None,
543
+ processed_display: None,
544
+ fake_prob: 0.0,
545
+ result_text: "🧹 已清除所有数据"
546
+ }
547
+
548
+ def update_selected_index(evt: gr.SelectData):
549
+ """Update the index of the selected gallery image."""
550
+ return evt.index
551
+
552
+ # Event wiring
553
+ upload_button.click(
554
+ process_image,
555
+ [image_input, threshold],
556
+ [output_image, original_display, processed_display, fake_prob, result_text]
557
+ )
558
+
559
+ # Gallery selection event
560
+ example_gallery.select(
561
+ update_selected_index,
562
+ None,
563
+ selected_index
564
+ )
565
+
566
+ # Example detection button click event
567
+ example_button.click(
568
+ process_example,
569
+ [example_gallery, selected_index, threshold],
570
+ [image_input, output_image, original_display, processed_display, fake_prob, result_text, threshold]
571
+ )
572
+
573
+ save_button.click(
574
+ lambda img: (img.save("result.jpg"), "💾 结果已保存为 result.jpg")[1] if img else "❌ 没有图像可保存",
575
+ [output_image],
576
+ None,
577
+ api_name="save_result"
578
+ )
579
+
580
+ clear_button.click(
581
+ clear_all,
582
+ [],
583
+ [image_input, output_image, original_display, processed_display, fake_prob, result_text]
584
+ )
585
+
586
+ if __name__ == "__main__":
587
+ demo.launch(
588
+ server_name="0.0.0.0",
589
+ server_port=7864,
590
+ favicon_path="./favicon.ico" if os.path.exists("./favicon.ico") else None
591
+ )
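The verdict bands used in `process_image` and `process_example` above are duplicated; the logic can be sketched as one helper (the name `classify_verdict` is illustrative, not from the repo):

```python
def classify_verdict(prob: float, threshold: float) -> str:
    """Map a forgery probability to a verdict using the app's threshold bands."""
    if prob > threshold + 0.2:
        return "highly suspicious"   # red band in the UI
    if prob > threshold:
        return "possibly forged"     # yellow band
    return "no forgery detected"     # green band
```

With the default sensitivity of 0.5, probabilities above 0.7 fall in the highest band.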
configs/base.yaml ADDED
@@ -0,0 +1,6 @@
1
+ seed: 42
2
+ trainer:
3
+ precision: 16-mixed
4
+ strategy: deepspeed_stage_2_offload # options: "deepspeed_stage_2_offload" / "ddp"
5
+ fast_dev_run: false # set to true for debugging
6
+ hydra.output_subdir: null
configs/ckpt/base.yaml ADDED
@@ -0,0 +1,12 @@
1
+ # list of checkpoints to load, the latter will override the former
2
+ # must end with .safetensors or .pt/.pth
3
+ checkpoint_paths: []
4
+
5
+ saver:
6
+ _target_: "pytorch_lightning.callbacks.ModelCheckpoint"
7
+ dirpath: null # it will be set during runtime
8
+ filename: "loupe-{val_loss:.4f}"
9
+ monitor: val_loss
10
+ mode: min
11
+ save_top_k: 1
12
+ save_last: False
configs/ckpt/cls.yaml ADDED
@@ -0,0 +1,8 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ saver:
6
+ filename: "cls-{auc:.4f}"
7
+ monitor: auc
8
+ mode: max
configs/ckpt/cls_seg.yaml ADDED
@@ -0,0 +1,9 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ checkpoint_paths: ["checkpoints/cls/model.safetensors", "checkpoints/seg-f1=0.8804-iou=0.8866.ckpt/model.safetensors"]
6
+ saver:
7
+ filename: "cls_seg-{auc:.4f}-{f1:.4f}-{iou:.4f}"
8
+ monitor: overall # overall is mean of auc, f1, and iou, based on challenge requirements
9
+ mode: max
configs/ckpt/seg.yaml ADDED
@@ -0,0 +1,8 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ saver:
6
+ filename: "seg-{f1:.4f}-{iou:.4f}"
7
+ monitor: iou
8
+ mode: max
configs/ckpt/test.yaml ADDED
@@ -0,0 +1,7 @@
1
+ defaults:
2
+ - cls_seg
3
+ - _self_
4
+
5
+ # list of checkpoints to load, the latter will override the former
6
+ # must end with .safetensors or .pt/.pth
7
+ checkpoint_paths: ["checkpoints/cls/model.safetensors", "checkpoints/seg/model.safetensors"]
configs/dataset/base.yaml ADDED
@@ -0,0 +1,4 @@
1
+ data_dir: null
2
+
3
+ num_workers: 8
4
+ valid_size: 0.1 # float between 0 and 1 or int representing the number of samples
configs/dataset/custom.yaml ADDED
@@ -0,0 +1,5 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ data_dir: "/gemini/space/jyc/casia2/"
configs/dataset/ddl.yaml ADDED
@@ -0,0 +1,6 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ valid_size: 5000
6
+ data_dir: "/gemini/space/jyc/track1/"
configs/hparams/base.yaml ADDED
@@ -0,0 +1,6 @@
1
+ weight_decay: 1e-3
2
+ warmup_step: 0.1
3
+ decay_step: 0.1
4
+ grad_clip_val: 1.0
5
+ scheduler: "wsd" # Options: "cosine" / "wsd"
6
+ epoch: 1
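`wsd` stands for a warmup-stable-decay schedule. A minimal sketch, assuming `warmup_step` and `decay_step` are fractions of total training steps (which the 0.1 values above suggest), not the repo's exact scheduler:

```python
def wsd_lr(step: int, total_steps: int, base_lr: float,
           warmup_frac: float = 0.1, decay_frac: float = 0.1) -> float:
    """Warmup-Stable-Decay: linear warmup, constant plateau, linear decay to zero."""
    warmup_end = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup_end:
        return base_lr * step / max(warmup_end, 1)
    if step < decay_start:
        return base_lr
    return base_lr * max(total_steps - step, 0) / max(total_steps - decay_start, 1)
```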
configs/hparams/cls.yaml ADDED
@@ -0,0 +1,10 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ cls_lr: null # lr for cls head
6
+ backbone_lr: 1e-5 # sometimes we want to finetune backbone with a smaller learning rate
7
+ lr: 5e-4 # default lr for other not specified params
8
+
9
+ batch_size: 48
10
+ accumulate_grad_batches: 8
configs/hparams/cls_seg.yaml ADDED
@@ -0,0 +1,6 @@
1
+ defaults:
2
+ - cls
3
+ - seg
4
+ - _self_
5
+
6
+ batch_size: 32
configs/hparams/seg.yaml ADDED
@@ -0,0 +1,13 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ weight_decay: 5e-2
6
+
7
+ seg_lr: null # lr for seg head
8
+ backbone_lr: 1e-5 # sometimes we want to finetune backbone with a smaller learning rate
9
+ lr: 3e-4 # default lr for other not specified params
10
+
11
+ epoch: 1
12
+ batch_size: 40
13
+ accumulate_grad_batches: 3
configs/hparams/test.yaml ADDED
@@ -0,0 +1,7 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ batch_size: 96
6
+ lr: 1e-4
7
+ accumulate_grad_batches: 1
configs/infer.yaml ADDED
@@ -0,0 +1,11 @@
1
+ defaults:
2
+ - base
3
+ - dataset: ddl
4
+ - stage: test
5
+ - model: test
6
+ - hparams: test
7
+ - ckpt: test
8
+ - _self_
9
+
10
+ trainer:
11
+ enable_checkpointing: false
configs/model/base.yaml ADDED
@@ -0,0 +1,19 @@
1
+ # basic configs
2
+ hidden_act: "gelu"
3
+ hidden_dropout_prob: 0.1
4
+ initializer_range: 0.02
5
+
6
+ # backbone configs
7
+ backbone_name: PE-Core-L14-336
8
+ backbone_path: ./pretrained_weights/pe/PE-Core-L14-336.pt
9
+ freeze_backbone: False
10
+
11
+ # backbone overrides, you can set attr to '-' to use default value
12
+ # visit https://github.com/facebookresearch/perception_models/blob/main/core/vision_encoder/config.py for available overrides
13
+ backbone_overrides:
14
+ output_dim: null # set to null to use our own proj mlp
15
+ # NOTE: pool_type of PE-Spatial-G14-448 is none
16
+ # but loupe requires a pool_type, specify it to "attn" / "tok"
17
+ # pool_type: "attn" / "tok"
18
+ pool_type: "-"
19
+ use_cls_token: "-"
configs/model/cls.yaml ADDED
@@ -0,0 +1,15 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ # loupe configs
6
+ cls_mlp_ratio: 2 # 2 times of Perception Encoder output dim
7
+ cls_mlp_layers: 2
8
+ enable_patch_cls: True
9
+ enable_cls_fusion: True
10
+
11
+ freeze_cls: False
12
+ freeze_backbone: True
13
+
14
+ cls_forge_weight: 0.2
15
+ patch_forge_weight: 0.85
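`cls_forge_weight` and `patch_forge_weight` act as positive-class weights against class imbalance (see `dataset_preprocess.ipynb`, which estimates them from mask statistics). A minimal weighted binary cross-entropy sketch of the standard formulation, not necessarily the repo's exact loss:

```python
import math

def weighted_bce(p: float, y: int, forge_weight: float) -> float:
    """BCE with weight `forge_weight` on the forged (positive) class
    and `1 - forge_weight` on the real (negative) class."""
    eps = 1e-7
    p = min(max(p, eps), 1 - eps)
    return -(forge_weight * y * math.log(p)
             + (1 - forge_weight) * (1 - y) * math.log(1 - p))
```

A higher `forge_weight` penalizes missed forgeries more heavily, which compensates for forged samples being rarer.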
configs/model/cls_seg.yaml ADDED
@@ -0,0 +1,16 @@
1
+ defaults:
2
+ - cls
3
+ - seg
4
+ - _self_
5
+
6
+ freeze_backbone: False
7
+ freeze_cls: False
8
+ freeze_seg: False
9
+
10
+ cls_loss_weight: 2.0
11
+ seg_loss_weight: 1.0
12
+
13
+ # conditional_queries is typically used for test-time adaptation
14
+ # during segmentation training, conditional_queries will serve as an
15
+ # extra condition provided for pixel decoder.
16
+ enable_conditional_queries: True
configs/model/seg.yaml ADDED
@@ -0,0 +1,32 @@
1
+ defaults:
2
+ - base
3
+ - _self_
4
+
5
+ fpn_scales: [0.5, 2, 4] # rescale the last hidden states of backbone. for PE-Core-L14-336, rescale to 12x12, 48x48, 96x96
6
+ freeze_backbone: True
7
+ freeze_seg: False
8
+ # tversky alpha and beta control the weight of false positive and false negative, respectively
9
+ # the tversky beta is set to 1 - alpha
10
+ tversky_alpha: 0.3
11
+ # weight for forged pixels, set between 0 and 1
12
+ pixel_forge_weight: 0.8
13
+ # epsilon for poly1 focal loss
14
+ pixel_poly_epsilon: 1.0
15
+
16
+ # conditional_queries is typically used for test-time adaptation
17
+ # during segmentation training, conditional_queries will serve as an
18
+ # extra condition provided for pixel decoder.
19
+ enable_conditional_queries: True
20
+
21
+ # mask2former overrides, you can set attr to '-' to use default value
22
+ # visit https://huggingface.co/docs/transformers/main/model_doc/mask2former#transformers.Mask2FormerConfig for available overrides
23
+ mask2former_overrides:
24
+ num_queries: 20
25
+ mask_weight: 5
26
+ class_weight: 2
27
+ dice_weight: 5
28
+ id2label:
29
+ 0: "forgery"
30
+ label2id:
31
+ forgery: 0
32
+
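`tversky_alpha` weights false positives and `beta = 1 - alpha` weights false negatives in the Tversky loss. A sketch of the standard formulation on flat probability/label lists (not necessarily the repo's exact implementation):

```python
def tversky_loss(pred, target, alpha=0.3, eps=1e-6):
    """Tversky loss: alpha weights false positives, (1 - alpha) false negatives.
    alpha < 0.5 tolerates false positives to reduce missed forged pixels."""
    beta = 1.0 - alpha
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return 1.0 - tp / (tp + alpha * fp + beta * fn + eps)
```

With `alpha = 0.3`, a missed forged pixel (false negative) costs more than a spurious one (false positive).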
configs/model/test.yaml ADDED
@@ -0,0 +1,7 @@
1
+ defaults:
2
+ - cls_seg
3
+ - _self_
4
+
5
+ freeze_backbone: True
6
+ freeze_cls: True
7
+ freeze_seg: True
configs/stage/cls.yaml ADDED
@@ -0,0 +1 @@
1
+ name: cls
configs/stage/cls_seg.yaml ADDED
@@ -0,0 +1,8 @@
1
+ name: cls_seg
2
+
3
+ # whether to train on the training set.
4
+ # since the classifier and segmenter are sometimes trained separately,
5
+ # we default to training on most of the validation set to avoid overfitting.
6
+ # However, if you prefer to train from scratch directly on the training set,
7
+ # you can do so by setting this variable to true.
8
+ train_on_trainset: false
configs/stage/seg.yaml ADDED
@@ -0,0 +1 @@
1
+ name: seg
configs/stage/test.yaml ADDED
@@ -0,0 +1,10 @@
1
+ name: test
2
+
3
+ # if set to a specific path, model predictions will be saved to this path
4
+ # with the filename format:
5
+ # - predicted masks: {predict_output_dir}/{ckpt-name}/masks/{image-name}.png
6
+ # - predicted labels: {predict_output_dir}/{ckpt-name}/predictions.txt
7
+ # in predictions.txt, each line is {image-name}.png,{probability of being forged:.4f}
8
+ pred_output_dir: null
9
+
10
+ enable_tta: false
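The `predictions.txt` format described above (image name, a comma, then the forgery probability) can be parsed line by line; a small sketch (the helper name is illustrative):

```python
def parse_prediction_line(line: str) -> tuple[str, float]:
    """Parse one predictions.txt line of the form '{image-name}.png,{prob:.4f}'.
    rsplit guards against commas inside the image name."""
    name, prob = line.strip().rsplit(",", 1)
    return name, float(prob)
```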
configs/train.yaml ADDED
@@ -0,0 +1,11 @@
1
+ defaults:
2
+ - base
3
+ - dataset: ddl
4
+ - stage: null # options: "cls" / "seg" / "cls_seg"
5
+ - model: ${stage}
6
+ - hparams: ${stage}
7
+ - ckpt: ${stage}
8
+ - _self_
9
+
10
+ trainer:
11
+ enable_checkpointing: true
dataset_preprocess.ipynb ADDED
@@ -0,0 +1,334 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "531bfd29",
6
+ "metadata": {},
7
+ "source": [
8
+ "In this notebook, we convert the raw datasets to Parquet format for faster loading during training and evaluation.\n",
9
+ "\n",
10
+ "The raw format of released datasets is as follows:\n",
11
+ "```python\n",
12
+ "# train set\n",
13
+ "/train/real/...\n",
14
+ "/train/fake/...\n",
15
+ "/train/masks/...\n",
16
+ "# valid set\n",
17
+ "/valid/real/...\n",
18
+ "/valid/fake/...\n",
19
+ "/valid/masks/...\n",
20
+ "```"
21
+ ]
22
+ },
23
+ {
24
+ "cell_type": "code",
25
+ "execution_count": 2,
26
+ "id": "8bd7e9d5",
27
+ "metadata": {},
28
+ "outputs": [],
29
+ "source": [
30
+ "import os\n",
31
+ "from datasets import Dataset, DatasetDict\n",
32
+ "from datasets import Features, Image, Value\n",
33
+ "from typing import List, Optional\n",
34
+ "\n",
35
+ "\n",
36
+ "def load_images_from_dir(directory: str) -> List[str]:\n",
37
+ " return [\n",
38
+ " os.path.join(directory, fname)\n",
39
+ " for fname in os.listdir(directory)\n",
40
+ " if fname.endswith((\"jpg\", \"jpeg\", \"png\", \"tif\"))\n",
41
+ " ]\n",
42
+ "\n",
43
+ "\n",
44
+ "def create_split(root_dir: str, split: str) -> Optional[Dataset]:\n",
45
+ " fake_dir = os.path.join(root_dir, split, \"fake\")\n",
46
+ " masks_dir = os.path.join(root_dir, split, \"masks\")\n",
47
+ " real_dir = os.path.join(root_dir, split, \"real\")\n",
48
+ "\n",
49
+ " if all(not os.path.isdir(p) for p in [fake_dir, masks_dir, real_dir]):\n",
50
+ " return None\n",
51
+ "\n",
52
+ " print(f\"Split: {split},\", end=\" \")\n",
53
+ " fake_images, real_images, mask_images = [], [], []\n",
54
+ " if os.path.isdir(fake_dir):\n",
55
+ " fake_images = load_images_from_dir(fake_dir)\n",
56
+ " print(f\"Fake images: {len(fake_images)}\", end=\"\")\n",
57
+ " if os.path.isdir(masks_dir):\n",
58
+ " mask_images = load_images_from_dir(masks_dir)\n",
59
+ " print(f\", Masks: {len(mask_images)}\", end=\"\")\n",
60
+ " assert len(fake_images) == len(mask_images)\n",
61
+ " if os.path.isdir(real_dir):\n",
62
+ " real_images = load_images_from_dir(real_dir)\n",
63
+ " print(f\", Real images: {len(real_images)}\", end=\"\")\n",
64
+ " print()\n",
65
+ "\n",
66
+ " return Dataset.from_dict(\n",
67
+ " {\n",
68
+ " \"path\": fake_images + real_images,\n",
69
+ " \"image\": fake_images + real_images,\n",
70
+ " \"mask\": mask_images + [None] * len(real_images),\n",
71
+ " },\n",
72
+ " features=Features(\n",
73
+ " {\"path\": Value(dtype=\"string\"), \"image\": Image(), \"mask\": Image()}\n",
74
+ " ),\n",
75
+ " )\n",
76
+ "\n",
77
+ "\n",
78
+ "def create_dataset(root_dir: str) -> DatasetDict:\n",
79
+ " return DatasetDict(\n",
80
+ " {\n",
81
+ " split: d\n",
82
+ " for split in [\"train\", \"valid\", \"test\"]\n",
83
+ " if (d := create_split(root_dir, split)) is not None\n",
84
+ " }\n",
85
+ " )\n",
86
+ "\n",
87
+ "\n",
88
+ "# replace with your own dataset path\n",
89
+ "root_dir = \"/gemini/space/lye/track1\"\n",
90
+ "save_dir = \"/gemini/space/jyc/track1\""
91
+ ]
92
+ },
93
+ {
94
+ "cell_type": "markdown",
95
+ "id": "a1d6f1c7",
96
+ "metadata": {},
97
+ "source": [
98
+ "We merge `real/` and `fake/` into a single `image` column for simplicity. An image is real if it has no corresponding mask."
99
+ ]
100
+ },
101
+ {
102
+ "cell_type": "code",
103
+ "execution_count": 14,
104
+ "id": "07009f1e",
105
+ "metadata": {},
106
+ "outputs": [
107
+ {
108
+ "name": "stdout",
109
+ "output_type": "stream",
110
+ "text": [
111
+ "Split: train, Fake images: 798831, Masks: 798831, Real images: 156100\n",
112
+ "Split: valid, Fake images: 199708, Masks: 199708, Real images: 39025\n",
113
+ "Split: test, Images: 222847\n"
114
+ ]
115
+ },
116
+ {
117
+ "data": {
118
+ "text/plain": [
119
+ "DatasetDict({\n",
120
+ " train: Dataset({\n",
121
+ " features: ['path', 'image', 'mask'],\n",
122
+ " num_rows: 954931\n",
123
+ " })\n",
124
+ " valid: Dataset({\n",
125
+ " features: ['path', 'image', 'mask'],\n",
126
+ " num_rows: 238733\n",
127
+ " })\n",
128
+ " test: Dataset({\n",
129
+ " features: ['path', 'image'],\n",
130
+ " num_rows: 222847\n",
131
+ " })\n",
132
+ "})"
133
+ ]
134
+ },
135
+ "execution_count": 14,
136
+ "metadata": {},
137
+ "output_type": "execute_result"
138
+ }
139
+ ],
140
+ "source": [
141
+ "dataset = create_dataset(root_dir)\n",
142
+ "dataset"
143
+ ]
144
+ },
145
+ {
146
+ "cell_type": "markdown",
147
+ "id": "3aa7de84",
148
+ "metadata": {},
149
+ "source": [
150
+ "Then save processed datasets to parquet."
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": null,
156
+ "id": "cd6b20bc",
157
+ "metadata": {},
158
+ "outputs": [],
159
+ "source": [
160
+ "os.makedirs(save_dir, exist_ok=True)\n",
161
+ "for split in dataset:\n",
162
+ " dataset[split].to_parquet(os.path.join(save_dir, f\"{split}.parquet\"))\n",
163
+ " print(f\"Saved {split} split to {save_dir}/{split}.parquet\")"
164
+ ]
165
+ },
166
+ {
167
+ "cell_type": "markdown",
168
+ "id": "f63933c8",
169
+ "metadata": {},
170
+ "source": [
171
+ "Load from processed datasets to do whatever you want."
172
+ ]
173
+ },
174
+ {
175
+ "cell_type": "code",
176
+ "execution_count": 10,
177
+ "id": "4af7f346",
178
+ "metadata": {},
179
+ "outputs": [
180
+ {
181
+ "data": {
182
+ "text/plain": [
183
+ "Dataset({\n",
184
+ " features: ['path', 'image', 'mask'],\n",
185
+ " num_rows: 954931\n",
186
+ "})"
187
+ ]
188
+ },
189
+ "execution_count": 10,
190
+ "metadata": {},
191
+ "output_type": "execute_result"
192
+ }
193
+ ],
194
+ "source": [
195
+ "import os\n",
196
+ "from datasets import load_dataset\n",
197
+ "\n",
198
+ "trainset = load_dataset(\"parquet\", data_dir=save_dir, split=\"train\")\n",
199
+ "trainset"
200
+ ]
201
+ },
202
+ {
203
+ "cell_type": "markdown",
204
+ "id": "b3c84f0a",
205
+ "metadata": {},
206
+ "source": [
207
+ "Since the forged components are usually smaller in proportion compared to the real ones, this leads to class imbalance.\n",
208
+ "For optimal training performance, hyperparameters such as `pixel_forge_weight` and `cls_forge_weight` in `src.loupe.configuration_loupe.LoupeConfig` must be appropriately configured. These parameters control the weights of forged pixels and forged images.\n",
209
+ "\n",
210
+ "Once suitable parameters are found using the following code snippet, you can set them in `configs/model/cls.yaml` or `configs/model/seg.yaml`.\n"
211
+ ]
212
+ },
213
+ {
214
+ "cell_type": "code",
215
+ "execution_count": null,
216
+ "id": "40a5ec91",
217
+ "metadata": {},
218
+ "outputs": [
219
+ {
220
+ "data": {
221
+ "application/vnd.jupyter.widget-view+json": {
222
+ "model_id": "19d416f59f20464692ee95bddefdaded",
223
+ "version_major": 2,
224
+ "version_minor": 0
225
+ },
226
+ "text/plain": [
227
+ "Computing mask stats (num_proc=8): 0%| | 0/5000 [00:00<?, ? examples/s]"
228
+ ]
229
+ },
230
+ "metadata": {},
231
+ "output_type": "display_data"
232
+ },
233
+ {
234
+ "name": "stdout",
235
+ "output_type": "stream",
236
+ "text": [
237
+ "cls_forge_weight: 0.16920000000000002\n",
238
+ "patch_forge_weight: 0.9294853830073696\n",
239
+ "pixel_forge_weight: 0.9160308902282281\n"
240
+ ]
241
+ }
242
+ ],
243
+ "source": [
244
+ "import numpy as np\n",
245
+ "from PIL import Image\n",
246
+ "from tqdm.notebook import tqdm\n",
247
+ "\n",
248
+ "cls_forge_weight: float # the ratio of forged images to total images.\n",
249
+ "# the ratio of forged patches to total patches across all images.\n",
250
+ "patch_forge_weight: float\n",
251
+ "# the ratio of forged pixels to total pixels across fake images.\n",
252
+ "pixel_forge_weight: float\n",
253
+ "\n",
254
+ "num_subset_samples = min(5000, len(trainset))\n",
255
+ "subset = trainset.shuffle().select(range(num_subset_samples))\n",
256
+ "image_size, patch_size = 336, 14\n",
257
+ "\n",
258
+ "\n",
259
+ "def compute_mask_stats(example):\n",
260
+ "\n",
261
+ " if example[\"mask\"] is None:\n",
262
+ " return {\n",
263
+ " \"is_forge\": 0,\n",
264
+ " \"forge_pixel_sum\": 0.0,\n",
265
+ " \"total_pixel_count\": 0,\n",
266
+ " \"forge_patch_sum\": 0.0,\n",
267
+ " }\n",
268
+ "\n",
269
+ " mask = example[\"mask\"].convert(\"L\").resize((image_size, image_size), Image.NEAREST)\n",
270
+ " mask_np = np.array(mask, dtype=np.float32)\n",
271
+ "\n",
272
+ " if mask_np.max() != mask_np.min():\n",
273
+ " mask_np = (mask_np - mask_np.min()) / (mask_np.max() - mask_np.min())\n",
274
+ " else:\n",
275
+ " mask_np[:] = 0.0\n",
276
+ "\n",
277
+ " forged_pixel_sum = mask_np.sum()\n",
278
+ " total_pixels = mask_np.size\n",
279
+ "\n",
280
+ " reshaped = mask_np.reshape(\n",
281
+ " image_size // patch_size, patch_size, image_size // patch_size, patch_size\n",
282
+ " )\n",
283
+ " patches = reshaped.transpose(0, 2, 1, 3)\n",
284
+ " forged_patch_sum = (patches != 0).sum(axis=(2, 3)) / (patch_size * patch_size)\n",
285
+ " forged_patch_sum = forged_patch_sum.sum()\n",
286
+ "\n",
287
+ " return {\n",
288
+ " \"is_forge\": 1,\n",
289
+ " \"forge_pixel_sum\": forged_pixel_sum,\n",
290
+ " \"total_pixel_count\": total_pixels,\n",
291
+ " \"forge_patch_sum\": forged_patch_sum,\n",
292
+ " }\n",
293
+ "\n",
294
+ "\n",
295
+ "processed = subset.map(compute_mask_stats, num_proc=8, desc=\"Computing mask stats\")\n",
296
+ "\n",
297
+ "num_forge_images = sum(processed[\"is_forge\"])\n",
298
+ "num_forge_pixels = sum(processed[\"forge_pixel_sum\"])\n",
299
+ "num_total_pixels = sum(processed[\"total_pixel_count\"])\n",
300
+ "num_forge_patches = sum(processed[\"forge_patch_sum\"])\n",
301
+ "num_total_patches = len(processed) * (image_size // patch_size) ** 2\n",
302
+ "\n",
303
+ "cls_forge_weight = 1 - num_forge_images / len(processed)\n",
304
+ "patch_forge_weight = 1 - num_forge_patches / num_total_patches\n",
305
+ "pixel_forge_weight = 1 - num_forge_pixels / num_total_pixels\n",
306
+ "\n",
307
+ "print(\"cls_forge_weight:\", cls_forge_weight)\n",
308
+ "print(\"patch_forge_weight:\", patch_forge_weight)\n",
309
+ "print(\"pixel_forge_weight:\", pixel_forge_weight)"
310
+ ]
311
+ }
312
+ ],
313
+ "metadata": {
314
+ "kernelspec": {
315
+ "display_name": "loupe2",
316
+ "language": "python",
317
+ "name": "python3"
318
+ },
319
+ "language_info": {
320
+ "codemirror_mode": {
321
+ "name": "ipython",
322
+ "version": 3
323
+ },
324
+ "file_extension": ".py",
325
+ "mimetype": "text/x-python",
326
+ "name": "python",
327
+ "nbconvert_exporter": "python",
328
+ "pygments_lexer": "ipython3",
329
+ "version": "3.12.9"
330
+ }
331
+ },
332
+ "nbformat": 4,
333
+ "nbformat_minor": 5
334
+ }
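The reshape/transpose patchification inside `compute_mask_stats` is easy to get wrong; it can be checked on a toy mask (standalone NumPy sketch):

```python
import numpy as np

def forged_patch_fraction(mask: np.ndarray, patch_size: int) -> np.ndarray:
    """Fraction of nonzero pixels in each (patch_size x patch_size) patch,
    using the same reshape/transpose trick as compute_mask_stats."""
    h, w = mask.shape
    patches = mask.reshape(h // patch_size, patch_size,
                           w // patch_size, patch_size).transpose(0, 2, 1, 3)
    return (patches != 0).sum(axis=(2, 3)) / (patch_size * patch_size)

mask = np.zeros((4, 4), dtype=np.float32)
mask[:2, :2] = 1.0  # forge exactly the top-left 2x2 patch
per_patch = forged_patch_fraction(mask, 2)
```

Only the top-left patch should come out fully forged; every other patch fraction stays zero.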
ffhq/ffhq-0001.png ADDED

Git LFS Details

  • SHA256: 33ea1e008ec7b310b254b7dbe2275586f22ba6424de05590093aa06f8a785c2d
  • Pointer size: 132 Bytes
  • Size of remote file: 1.26 MB
ffhq/ffhq-0009.png ADDED

Git LFS Details

  • SHA256: 4b9c680fb4726a47c2b31a1acce1173de2e52ded122bd4b5195c051d83b84f43
  • Pointer size: 132 Bytes
  • Size of remote file: 1.37 MB
ffhq/ffhq-0032.png ADDED

Git LFS Details

  • SHA256: 1358253a3d3e1f79d276a9f94e61eadb9c57422f0ed5ba8cc26aec111f74022e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.25 MB
ffhq/ffhq-0041.png ADDED

Git LFS Details

  • SHA256: 74f687e40e8dc0c0284aba29b96af877606c22ea023fde1e1fb131a199e905e3
  • Pointer size: 132 Bytes
  • Size of remote file: 1.38 MB
ffhq/ffhq-0055.png ADDED

Git LFS Details

  • SHA256: 392c9579f25a345ec4e0f9e3e930e2c3d3ccc1f5af555fa426c83dacbada619d
  • Pointer size: 132 Bytes
  • Size of remote file: 1.15 MB
ffhq/ffhq-0062.png ADDED

Git LFS Details

  • SHA256: 4b784d66a1fde7d381910751dbbf6b2a1d5842909dbd31242cbd4f2329a49e39
  • Pointer size: 132 Bytes
  • Size of remote file: 1.52 MB
ffhq/ffhq-0085.png ADDED

Git LFS Details

  • SHA256: 9cfd4523806cab324c6cd5cc71a8c7d734209e9fa463b9967548c9ad36123fc0
  • Pointer size: 132 Bytes
  • Size of remote file: 1.26 MB
ffhq/ffhq-0096.png ADDED

Git LFS Details

  • SHA256: 73fe06affd7fe171b23f39434f4b2addf6e8e8d5fb71764e7ca499f3e7b39028
  • Pointer size: 132 Bytes
  • Size of remote file: 1.3 MB
ffhq/ffhq-0100.png ADDED

Git LFS Details

  • SHA256: 3620d6c5a3280dea0cf9e98b91ce5b505a6c6fee94a515c9cbff250f9d13f9c5
  • Pointer size: 132 Bytes
  • Size of remote file: 1.34 MB
ffhq/ffhq-0112.png ADDED

Git LFS Details

  • SHA256: b53b33b183caa169e0e64d2051e336f9bd454eac0eca962503558e0f28212bf5
  • Pointer size: 132 Bytes
  • Size of remote file: 1.35 MB
ffhq/ffhq-0136.png ADDED

Git LFS Details

  • SHA256: 16f24d2ef0fa499cdc18775682537d2c8976835253962d333567b62cc58b299b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.21 MB
ffhq/ffhq-0138.png ADDED

Git LFS Details

  • SHA256: 5e5633a6902d67fb156059ec20a0a534b8ca923e9c5ffc189d34229830f90b85
  • Pointer size: 132 Bytes
  • Size of remote file: 1.33 MB
ffhq/ffhq-0142.png ADDED

Git LFS Details

  • SHA256: 68cdd9f28187c41dce823804b7f88da999aa361b7196a80b2903af2b827ee3e4
  • Pointer size: 132 Bytes
  • Size of remote file: 1.38 MB
ffhq/ffhq-0154.png ADDED

Git LFS Details

  • SHA256: 7b98c4e8325ff7df1a4531224d66c9fb094a4984602faff341de7f64bf88a0eb
  • Pointer size: 132 Bytes
  • Size of remote file: 1.38 MB
ffhq/ffhq-0155.png ADDED

Git LFS Details

  • SHA256: 0da2b6c1ce5492f88e51fb5e9b560c3f3b9d0aeb925ede3454d81db5f0814b95
  • Pointer size: 132 Bytes
  • Size of remote file: 1.16 MB
ffhq/ffhq-0505.png ADDED

Git LFS Details

  • SHA256: 5b217a10af9855e4d4fa1dfabc39aa84eacb49a38a3ec6fd805e688f39303430
  • Pointer size: 132 Bytes
  • Size of remote file: 1.12 MB
predict_case.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
requirements.txt ADDED
@@ -0,0 +1,9 @@
1
+ lightning
2
+ lightning[extra]
3
+ torch==2.5.1
4
+ hydra-core
5
+ deepspeed
6
+ scikit-learn
7
+ evaluate
8
+ timm
9
+ transformers
runtime.txt ADDED
@@ -0,0 +1 @@
1
+ python-3.11