Predicting Ground-Level Scene Layout from Aerial Imagery
Menghua Zhai, Zachary Bessinger, Scott Workman, Nathan Jacobs

Abstract
We introduce a novel strategy for learning to extract semantically meaningful features from aerial imagery. Instead of manually labeling the aerial imagery, we propose to predict (noisy) semantic features automatically extracted from co-located ground imagery. Our network architecture takes an aerial image as input, extracts features using a convolutional neural network, and then applies an adaptive transformation to map these features into the ground-level perspective. We use an end-to-end learning approach to minimize the difference between the semantic segmentation extracted directly from the ground image and the semantic segmentation predicted solely based on the aerial image. We show that a model learned using this strategy, with no additional training, is already capable of rough semantic labeling of aerial imagery. Furthermore, we demonstrate that by finetuning this model we can achieve more accurate semantic segmentation than two baseline initialization strategies. We use our network to address the task of estimating the geolocation and geo-orientation of a ground image. Finally, we show how features extracted from an aerial image can be used to hallucinate a plausible ground-level panorama.
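The training objective described in the abstract — minimizing the difference between the segmentation extracted from the ground image and the segmentation predicted from the aerial image — can be sketched as a pixelwise cross-entropy over the predicted ground-view label distribution. This is a minimal NumPy illustration under assumed shapes and names (`cross_view_loss` and the toy dimensions are illustrative, not the authors' code):

```python
import numpy as np

def softmax(logits, axis=-1):
    # numerically stable softmax over the class axis
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_loss(pred_logits, ground_labels):
    """Pixelwise cross-entropy between the ground-view segmentation
    predicted from the aerial image (pred_logits, shape (H, W, C))
    and the (noisy) per-pixel class labels extracted from the
    co-located ground image (ground_labels, shape (H, W))."""
    probs = softmax(pred_logits, axis=-1)
    h, w, _ = pred_logits.shape
    # gather the predicted probability of the true class at each pixel
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], ground_labels]
    return -np.log(p_true + 1e-12).mean()

# toy example: a 4x4 "panorama" with 3 semantic classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 3))   # stand-in for the aerial branch output
labels = rng.integers(0, 3, size=(4, 4))  # stand-in for ground-image labels
loss = cross_view_loss(logits, labels)
```

In the paper's end-to-end setting, `pred_logits` would come from the CNN features of the aerial image after the adaptive transformation into the ground-level perspective; here it is just random data to show the loss mechanics.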
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| cross-view-image-to-image-translation-on-4 | CrossNet | SSIM: 0.4147 |