MMFusion: Combining Image Forensic Filters for Visual Manipulation Detection and Localization
Triaridis Kostas; Tsigos Konstantinos; Mezaris Vasileios

Abstract
Recent image manipulation localization and detection techniques typically leverage forensic artifacts and traces that are produced by a noise-sensitive filter, such as SRM or Bayar convolution. In this paper, we showcase that different filters commonly used in such approaches excel at unveiling different types of manipulations and provide complementary forensic traces. Thus, we explore ways of combining the outputs of such filters to leverage the complementary nature of the produced artifacts for performing image manipulation localization and detection (IMLD). We assess two distinct combination methods: one that produces independent features from each forensic filter and then fuses them (this is referred to as late fusion) and one that performs early mixing of different modal outputs and produces combined features (this is referred to as early fusion). We use the latter as a feature encoding mechanism, accompanied by a new decoding mechanism that encompasses feature re-weighting, for formulating the proposed MMFusion architecture. We demonstrate that MMFusion achieves competitive performance for both image manipulation localization and detection, outperforming state-of-the-art models across several image and video datasets. We also investigate further the contribution of each forensic filter within MMFusion for addressing different types of manipulations, building on recent AI explainability measures.
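The distinction between late and early fusion described in the abstract can be illustrated with a toy NumPy sketch. This is not the MMFusion implementation: the per-pixel linear "encoder", the random weights, and the arrays `srm_out` and `bayar_out` (random stand-ins for filter outputs, not real SRM or Bayar computations) are all assumptions made for illustration. It only shows where the mixing happens: after independent encoding (late) or before joint encoding (early).

```python
import numpy as np

rng = np.random.default_rng(0)


def pointwise_encoder(x, weight):
    """Toy encoder (assumption): a 1x1 convolution followed by ReLU.

    x: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)
    """
    return np.maximum(np.einsum("oc,chw->ohw", weight, x), 0.0)


def late_fusion(modalities, weights):
    """Encode each filter output independently, then concatenate the features."""
    feats = [pointwise_encoder(m, w) for m, w in zip(modalities, weights)]
    return np.concatenate(feats, axis=0)


def early_fusion(modalities, weight):
    """Concatenate the filter outputs first, then encode them jointly."""
    stacked = np.concatenate(modalities, axis=0)
    return pointwise_encoder(stacked, weight)


# Hypothetical stand-ins for two noise-sensitive filter outputs (random data).
srm_out = rng.normal(size=(3, 8, 8))
bayar_out = rng.normal(size=(3, 8, 8))
modalities = [srm_out, bayar_out]

late_weights = [rng.normal(size=(4, 3)) for _ in modalities]  # one encoder per filter
early_weight = rng.normal(size=(8, 6))                        # one joint encoder

late_feats = late_fusion(modalities, late_weights)
early_feats = early_fusion(modalities, early_weight)

# Perturb only the second filter output and recompute both feature sets.
perturbed = [srm_out, bayar_out + 1.0]
late_feats_perturbed = late_fusion(perturbed, late_weights)
early_feats_perturbed = early_fusion(perturbed, early_weight)
```

In this sketch, perturbing one filter output changes only that filter's block of late-fusion channels, whereas every early-fusion channel can depend on both filters.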
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-manipulation-detection-on-casia-v1 | Early Fusion | AUC: .929; Balanced Accuracy: .845 |
| image-manipulation-detection-on-casia-v1 | Late Fusion | AUC: .930; Balanced Accuracy: .860 |
| image-manipulation-detection-on-cocoglide | Early Fusion | AUC: .755; Balanced Accuracy: .660 |
| image-manipulation-detection-on-cocoglide | Late Fusion | AUC: .760; Balanced Accuracy: .677 |
| image-manipulation-detection-on-columbia | Late Fusion | AUC: .977; Balanced Accuracy: .822 |
| image-manipulation-detection-on-columbia | Early Fusion | AUC: .996; Balanced Accuracy: .962 |
| image-manipulation-detection-on-coverage | Early Fusion | AUC: .839; Balanced Accuracy: .770 |
| image-manipulation-detection-on-coverage | Late Fusion | AUC: .792; Balanced Accuracy: .720 |
| image-manipulation-detection-on-dso-1 | Late Fusion | AUC: .958; Balanced Accuracy: .830 |
| image-manipulation-detection-on-dso-1 | Early Fusion | AUC: .966; Balanced Accuracy: .935 |
| image-manipulation-localization-on-casia-v1 | Early Fusion | Average Pixel F1 (fixed threshold): .784 |
| image-manipulation-localization-on-casia-v1 | Late Fusion | Average Pixel F1 (fixed threshold): .775 |
| image-manipulation-localization-on-cocoglide | Late Fusion | Average Pixel F1 (fixed threshold): .574 |
| image-manipulation-localization-on-cocoglide | Early Fusion | Average Pixel F1 (fixed threshold): .553 |
| image-manipulation-localization-on-columbia | Early Fusion | Average Pixel F1 (fixed threshold): .888 |
| image-manipulation-localization-on-columbia | Late Fusion | Average Pixel F1 (fixed threshold): .864 |
| image-manipulation-localization-on-coverage | Late Fusion | Average Pixel F1 (fixed threshold): .641 |
| image-manipulation-localization-on-coverage | Early Fusion | Average Pixel F1 (fixed threshold): .663 |
| image-manipulation-localization-on-dso-1 | Late Fusion | Average Pixel F1 (fixed threshold): .899 |
| image-manipulation-localization-on-dso-1 | Early Fusion | Average Pixel F1 (fixed threshold): .869 |
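
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how pixel-level F1 at a fixed threshold and image-level balanced accuracy are commonly computed. The threshold value of 0.5, the convention of returning 0 when no pixels are predicted or labeled as manipulated, and the function names are assumptions; the page does not state the exact evaluation protocol behind the reported numbers.

```python
import numpy as np


def pixel_f1(pred_prob, gt_mask, threshold=0.5):
    """Pixel-level F1 for one image; threshold=0.5 is an assumed fixed value."""
    pred = np.asarray(pred_prob) >= threshold
    gt = np.asarray(gt_mask).astype(bool)
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when there are no positives at all.
    return float(2 * tp / denom) if denom > 0 else 0.0


def average_pixel_f1(pred_probs, gt_masks, threshold=0.5):
    """Mean of per-image pixel F1 scores over a dataset."""
    return float(np.mean([pixel_f1(p, g, threshold) for p, g in zip(pred_probs, gt_masks)]))


def balanced_accuracy(scores, labels, threshold=0.5):
    """Mean of true-positive rate and true-negative rate for image-level detection."""
    pred = np.asarray(scores) >= threshold
    y = np.asarray(labels).astype(bool)
    tpr = np.sum(pred & y) / np.sum(y)
    tnr = np.sum(~pred & ~y) / np.sum(~y)
    return float(0.5 * (tpr + tnr))


# Tiny made-up example: a 2x2 localization map and four image-level scores.
example_prob = np.array([[0.9, 0.2], [0.6, 0.1]])
example_mask = np.array([[1, 0], [0, 0]])
example_f1 = pixel_f1(example_prob, example_mask)  # 1 TP, 1 FP, 0 FN -> 2/3

example_ba = balanced_accuracy([0.9, 0.4, 0.3, 0.8], [1, 1, 0, 0])  # TPR 0.5, TNR 0.5
```

Unlike AUC, both of these metrics depend on the chosen threshold, which is why the table specifies that the localization F1 uses a fixed one.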