DetectiumFire Multimodal Fire Understanding Dataset
License: Non-Commercial
DetectiumFire is a dataset released in 2025 by Tulane University in collaboration with Aalto University, designed for tasks such as flame detection, visual reasoning, and multimodal generation. The related research paper, "DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding", has been included in the NeurIPS 2025 Datasets and Benchmarks Track. The dataset aims to provide a unified training and evaluation resource for computer vision and vision-language models.
The dataset contains over 145,000 high-quality real-world fire images and 25,000 fire-related videos. In addition to the real data, it includes 8,000 synthetic fire images generated with a diffusion model and 12,000 carefully selected preference pairs from the RLHF process, intended to improve model alignment. Both fire and non-fire samples are covered, and the annotations include flame intensity, environmental information, text descriptions, and human preferences. The dataset has four parts: real images, real videos, synthetic fire images generated by the diffusion model, and human preference data based on pairwise comparisons. The synthetic images come with YOLO-format detection annotations, and the preference data records human judgments of generation quality.
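The YOLO-format annotations mentioned above can be read with a few lines of Python. The sketch below assumes the common YOLO text convention, where each line is `class_id x_center y_center width height` with coordinates normalized to the image size. The class IDs used by DetectiumFire are not stated here, and the `PixelBox` and `parse_yolo_labels` names are illustrative, not part of the dataset.

```python
from dataclasses import dataclass


@dataclass
class PixelBox:
    """One bounding box in pixel coordinates (illustrative helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> list[PixelBox]:
    """Convert YOLO label lines into pixel-space corner boxes.

    Assumes each line is: class_id x_center y_center width height,
    with the four coordinates normalized to the range [0, 1].
    """
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        boxes.append(PixelBox(
            class_id,
            (xc - w / 2) * img_w,  # left edge
            (yc - h / 2) * img_h,  # top edge
            (xc + w / 2) * img_w,  # right edge
            (yc + h / 2) * img_h,  # bottom edge
        ))
    return boxes
```

In practice the label text would come from the `.txt` file that sits next to each image, and the image width and height from the image itself.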
Dataset composition:
- Real images
  - fire: Real fire images with YOLO-format annotations
  - non_fire: Hard negative examples that contain no flames but are easily confused with fire (such as bright light, smoke, or a sunset)
- Real videos (real_video)
  - fire: Real video footage containing visible flames
  - non_fire: Scenes without fire, used for robustness testing
- Synthetic images
  - stable_diff_v15/train: Images generated by the SFT fine-tuned model, with YOLO annotations
  - dpo_stable_diff_v15/train: Images generated by the DPO fine-tuned model, with YOLO annotations
- Preference data (preference_dataset)
  - preference.json: Human preference comparisons and explanations for paired generated images, used for RLHF/DPO training
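As a rough illustration of how preference.json might be consumed for DPO-style training, the sketch below turns a list of records into (chosen, rejected, explanation) tuples. The file's actual schema is not described here, so the field names `chosen`, `rejected`, and `explanation`, the assumption of a top-level JSON array, and both function names are hypothetical and should be adjusted after inspecting the file.

```python
import json
from typing import Iterator


def iter_preference_pairs(
    records: list[dict],
    chosen_key: str = "chosen",        # hypothetical field name
    rejected_key: str = "rejected",    # hypothetical field name
    reason_key: str = "explanation",   # hypothetical field name
) -> Iterator[tuple[str, str, str]]:
    """Yield (chosen, rejected, reason) for each record that has both images."""
    for rec in records:
        if chosen_key in rec and rejected_key in rec:
            # A missing explanation becomes an empty string.
            yield rec[chosen_key], rec[rejected_key], rec.get(reason_key, "")


def load_preference_pairs(path: str, **keys: str) -> list[tuple[str, str, str]]:
    """Read a preference file and return its pairs as a list."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)  # assumes a top-level JSON array of objects
    return list(iter_preference_pairs(records, **keys))
```

Passing the key names as parameters keeps the loader usable if the real fields are named differently, for example `load_preference_pairs(path, chosen_key="preferred")`.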
