Conference Agenda

Please note that all times are shown in the time zone of the conference (CEST).

Session 4 - Artificial Intelligence/Machine Learning contribution to signal processing - part II

Time: Tuesday, 12/May/2026, 2:15pm - 3:30pm

Session Chair: Michela Corvino, European Space Agency
Session Chair: Thibault Taillade, European Space Agency

Location: Big Hall

Presentations

2:15pm - 2:30pm
ID: 170

Keynote - Earth Intelligence in the age of Foundation Models

Mikolaj Czerkawski

Co-founder and Partner Scientist at Asterisk Labs

2:30pm - 2:45pm
ID: 130

From Temporal Stacking to Single-Pass Inference: A Deep Learning Framework for SAR Denoising with Real Sentinel-1 References

Salvatore Prochilo, Andrea Cavallini, Vito Moliterni

Starion Group Italia S.p.A., Italy

Deep learning has become a powerful approach for improving Synthetic Aperture Radar (SAR) imagery. However, its ability to reduce speckle is limited by a persistent data bottleneck: the lack of clean reference images needed for supervised training. Existing methodologies rely on synthetically speckled optical images, multi-look approximations, or self-supervised strategies, which often lead to domain gaps or compromise reconstruction fidelity. To address this limitation, this study introduces a methodology to generate high-quality pseudo-clean reference images directly from genuine Sentinel-1 data, thereby enabling the construction of reliable training targets derived from authentic SAR observations. Building upon this data foundation, a dedicated deep learning framework is developed to perform noise reduction through supervised training of convolutional neural networks on authentic SAR imagery.

Reference images are constructed through a pipeline that collects temporally separated Sentinel-1 Ground Range Detected (GRD) acquisitions over the same geographic area, aligns them via subpixel coregistration, and aggregates them using temporal statistics. The aggregation suppresses spatially uncorrelated noise components (including speckle, thermal noise, and system-level disturbances) while preserving persistent structures and radiometric properties. Nonetheless, this multi-temporal process requires multiple acquisitions and significant computational resources to denoise each image. Consequently, aggregated references are employed exclusively as targets for offline training. Once trained on the resulting noisy-clean pairs, the network can denoise any individual SAR acquisition through a single forward pass, thereby combining the reconstruction quality of temporal averaging with the computational efficiency of neural inference.
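The temporal aggregation step can be illustrated with a minimal sketch. It assumes the acquisitions are already coregistered into a `(T, H, W)` intensity stack and models speckle as unit-mean gamma noise; the function names `temporal_reference` and `simulate_speckled_stack`, and the choice of mean or median as the temporal statistic, are illustrative assumptions and not the authors' pipeline.

```python
import numpy as np


def temporal_reference(stack, reducer="mean"):
    """Aggregate a coregistered (T, H, W) intensity stack along the time axis."""
    stack = np.asarray(stack, dtype=np.float64)
    if stack.ndim != 3:
        raise ValueError("expected a stack of shape (T, H, W)")
    if reducer == "mean":
        return stack.mean(axis=0)
    if reducer == "median":
        return np.median(stack, axis=0)
    raise ValueError(f"unknown reducer: {reducer!r}")


def simulate_speckled_stack(clean, n_dates, looks=1, seed=0):
    """Hypothetical test data: multiply a clean scene by unit-mean gamma speckle."""
    rng = np.random.default_rng(seed)
    clean = np.asarray(clean, dtype=np.float64)
    speckle = rng.gamma(shape=looks, scale=1.0 / looks,
                        size=(n_dates,) + clean.shape)
    return clean[None, :, :] * speckle


if __name__ == "__main__":
    clean = np.full((32, 32), 2.0)
    stack = simulate_speckled_stack(clean, n_dates=50)
    reference = temporal_reference(stack)
    print("single-date std:", stack[0].std())
    print("aggregated std: ", reference.std())
```

In this toy setting the pixel-wise standard deviation of the aggregated image drops roughly with the square root of the number of dates, which is why the aggregate can serve as a pseudo-clean target for each individual noisy acquisition.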

The proposed framework evaluates encoder-decoder, residual, and attention-based architectures. The training procedure involves patch sampling, data augmentation, and reconstruction loss functions. Preliminary results demonstrate a Peak Signal-to-Noise Ratio (PSNR) of approximately 30 dB and a Structural Similarity Index Measure (SSIM) of approximately 0.85 on Sentinel-1 test datasets. The assessment of the Equivalent Number of Looks (ENL) on real SAR scenes confirms effective noise suppression while preserving structural details. Visual inspection corroborates that edges, linear features, and fine textures are accurately maintained.
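For readers unfamiliar with the reported metrics, the following sketch shows how PSNR and ENL are commonly computed. It assumes images are floating-point arrays and that ENL is estimated over a homogeneous intensity region as the squared mean divided by the variance; the function names are illustrative, and the abstract does not state the exact conventions (data range, region selection) used by the authors. SSIM is omitted here because it is usually taken from an image-processing library.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB for a given dynamic range."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


def enl(intensity_region):
    """Equivalent Number of Looks of a homogeneous intensity region."""
    region = np.asarray(intensity_region, dtype=np.float64)
    return region.mean() ** 2 / region.var()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical homogeneous area with 4-look gamma speckle.
    region = 3.0 * rng.gamma(shape=4.0, scale=0.25, size=(300, 300))
    print("ENL estimate:", enl(region))
    print("PSNR:", psnr(np.zeros((8, 8)), np.full((8, 8), 0.1)))
```

A higher ENL over a flat area indicates stronger speckle suppression, while PSNR and SSIM require a reference image, which in this work is the temporally aggregated target.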

By using exclusively publicly accessible Sentinel-1 data, the methodology scales naturally across diverse geographical regions and acquisition conditions, benefiting downstream SAR applications such as target detection, land-cover classification, and change monitoring. While the current evaluation focuses on Sentinel-1 GRD data, the pipeline is sensor-independent and readily extensible to other SAR missions. This framework offers a practical and reproducible pathway for training high-performance deep learning denoisers on authentic SAR imagery, bridging the gap between synthetic benchmarks and real-world deployment.

2:45pm - 3:00pm
ID: 157

A Modular Motion-Aware Framework for AI Target Recognition in VHR SAR Imagery exploiting ISAR and Micro-Doppler Features

Federico Marmoreo¹, Massimo Zavagli¹, Francesco Vecchioli¹, Carmine Clemente², Michela Corvino³, Mario Costantini¹

¹B-Open, Rome, Italy; ²University of Strathclyde, Glasgow, UK; ³European Space Agency (ESA) Esrin, Frascati, Italy

Maritime Domain Awareness (MDA) increasingly relies on spaceborne Very High Resolution (VHR) Synthetic Aperture Radar (SAR) for all-weather monitoring. However, the complex dynamics of non-cooperative vessels severely challenge conventional classification algorithms. Target motion during long coherent integration times introduces severe defocusing and azimuth smearing. Concurrently, vibrational micro-motions induce micro-Doppler (m-D) effects. While traditionally viewed as a source of degradation, these m-D signatures contain highly discriminative information regarding a vessel’s mechanical characteristics that remains largely unexploited in standard intensity-based AI models.

To address these limitations and advance moving target analysis, we propose a modular, physics-aware processing framework. Rather than relying on monolithic end-to-end models, our pipeline explicitly exploits motion-related information. Our proposed dual-branch deep learning architecture integrates spatial recovery and dynamic feature extraction prior to AI classification.

The pipeline begins with robust target detection and segmentation using a Mask R-CNN architecture equipped with a Feature Pyramid Network. Trained on the HRSID dataset using custom motion-blur augmentation, the module achieves a Box mAP of 0.671, demonstrating performance comparable to state-of-the-art SAR vessel detectors.

Following detection, the framework applies a unified Doppler Phase Coherence (DPC) approach, building on recent maritime ISAR advancements. Applied to Single Look Complex (SLC) VHR SAR data, DPC estimates translational motion parameters to mitigate blur. This successful geometric recovery is confirmed in several tests.

Subsequently, the same DPC framework isolates dynamic m-D vibrational signatures from the refocused target. The time-frequency analysis successfully reveals clear harmonic patterns corresponding to the vessel's mechanical vibrations.
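As a generic illustration of time-frequency analysis of a micro-Doppler signature, the sketch below computes a short-time Fourier transform of a simulated vibrating scatterer. It is not the DPC method described in the abstract: the signal model (a sinusoidally phase-modulated complex tone), the sampling rate, and the names `vibrating_scatterer` and `stft_spectrogram` are all assumptions made for illustration.

```python
import numpy as np


def vibrating_scatterer(fs, duration, vib_freq, mod_index):
    """Complex return of a point scatterer with sinusoidal phase modulation."""
    t = np.arange(int(fs * duration)) / fs
    return np.exp(1j * mod_index * np.sin(2.0 * np.pi * vib_freq * t))


def stft_spectrogram(signal, fs, win_len=128, hop=16):
    """Return (freqs, times, power) of a Hann-windowed STFT."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    freqs = np.fft.fftshift(np.fft.fftfreq(win_len, d=1.0 / fs))
    times = (np.arange(n_frames) * hop + win_len / 2) / fs
    return freqs, times, np.abs(spectrum) ** 2


if __name__ == "__main__":
    fs = 1000.0  # assumed slow-time sampling rate in Hz
    sig = vibrating_scatterer(fs, duration=2.0, vib_freq=2.0, mod_index=20.0)
    freqs, times, power = stft_spectrogram(sig, fs)
    ridge = freqs[power.argmax(axis=1)]  # dominant Doppler per time frame
    print("Doppler ridge range (Hz):", ridge.min(), ridge.max())
```

For this signal model the instantaneous Doppler oscillates with amplitude `mod_index * vib_freq` (40 Hz here), so the ridge of the spectrogram traces a periodic pattern at the vibration frequency.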

Preliminary validation using real VHR SLC SAR data from Umbra, Capella Space, and TerraSAR-X confirms that restoring spatial resolution and isolating dynamic signatures provide highly reliable, multimodal inputs for the downstream fusion network. This motion-aware preprocessing establishes a robust foundation for operational maritime classification in complex MDA scenarios.

3:00pm - 3:15pm
ID: 110

Towards SAR–Optical Correspondence Without 3D Models: Learning from LIDAR Projections in Very High Resolution Urban Images

Elise COLIN, Aurélien PLYER

ONERA Palaiseau, DTIS, Université Paris Saclay, France

Very high resolution Synthetic Aperture Radar (SAR) imagery offers unparalleled sensitivity to geometric structure and material properties. It is a modality of choice for operational change detection, including rapid urban map updating and post-disaster monitoring, where cloud cover, smoke, or night-time conditions can severely limit optical acquisitions. Yet its interpretation in dense urban environments remains profoundly challenging. At sub-metric resolution, radar-specific geometric distortions (such as layover, shadowing, and multipath effects) induce complex radiometric signatures that hinder direct comparison with optical imagery. Consequently, high-resolution radar products are often underexploited, despite their unique all-weather, day–night acquisition capabilities.

This work proposes a supervised learning framework to establish a dense correspondence between very high resolution SAR and optical imagery in urban areas, using three-dimensional (3D) LIDAR point clouds as geometric priors. Airborne LIDAR data provided by the Institut national de l'information géographique et forestière (IGN) serve as a metrically accurate 3D reference. Each LIDAR point is first projected into TerraSAR-X Spotlight imagery and subsequently into orthorectified optical imagery. Through this dual projection, every 3D point is associated with zero, one, or multiple radar pixels within a local SAR patch, and similarly with one or several optical pixels in the corresponding optical patch. The resulting multi-view associations define a supervised training set encoding the intrinsic geometric relationships between modalities.
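The dual projection can be sketched in a deliberately simplified geometry. The code below assumes a flat Earth, a sensor flying along the y axis at constant height and looking towards +x, and an optical orthoimage indexed directly by ground coordinates; real TerraSAR-X geolocation relies on orbit data and range-Doppler equations, which are not reproduced here. All function names and parameters are hypothetical.

```python
import numpy as np
from collections import defaultdict


def project_to_sar(points, sensor_height, near_range, range_spacing, azimuth_spacing):
    """Map (x, y, z) points to (azimuth_row, slant_range_col) pixel indices."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    slant_range = np.hypot(x, sensor_height - z)  # distance to the flight line
    rows = np.rint(y / azimuth_spacing).astype(int)
    cols = np.rint((slant_range - near_range) / range_spacing).astype(int)
    return np.stack([rows, cols], axis=1)


def project_to_ortho(points, gsd):
    """Map (x, y, z) points to (row, col) in an orthoimage; height is ignored."""
    rows = np.rint(points[:, 1] / gsd).astype(int)
    cols = np.rint(points[:, 0] / gsd).astype(int)
    return np.stack([rows, cols], axis=1)


def link_pixels(sar_pixels, optical_pixels):
    """Group optical pixels by the SAR pixel that the same 3D point falls into."""
    links = defaultdict(set)
    for sar_px, opt_px in zip(sar_pixels, optical_pixels):
        key = tuple(int(v) for v in sar_px)
        links[key].add(tuple(int(v) for v in opt_px))
    return dict(links)


if __name__ == "__main__":
    # A roof point and a ground point at the same slant range (layover),
    # plus an unrelated ground point further away.
    points = np.array([[5000.0, 10.0, 30.0],
                       [4970.0, 10.0, 0.0],
                       [5100.0, 10.0, 0.0]])
    sar = project_to_sar(points, sensor_height=5000.0, near_range=7000.0,
                         range_spacing=1.0, azimuth_spacing=1.0)
    opt = project_to_ortho(points, gsd=1.0)
    for sar_px, opt_set in link_pixels(sar, opt).items():
        print(sar_px, "->", sorted(opt_set))
```

In this example the roof point and the nearer ground point land in the same radar pixel but in different optical pixels, which is the kind of one-to-many association that layover produces and that the training set is meant to encode.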

A learning model is then trained to infer pixel-level correspondences from these LIDAR-anchored mappings, implicitly capturing the nonlinear distortions induced by radar imaging geometry. The approach aims to learn a transferable representation capable of predicting SAR–optical correspondences even in the absence of an explicit 3D model. Ultimately, this methodology seeks to enhance the interpretability and valorization of radar imagery, facilitating multimodal data fusion, urban object reconstruction, and cross-sensor information transfer in complex environments.

3:15pm - 3:30pm
ID: 144

Beyond Static Imagery toward Burst Video Mode in Very High Resolution imaging.

Pierre-Olivier Vanberg, Adrien Descamps, Nicolas Dourt

aerospacelab, Belgium

Recent advances in Artificial Intelligence are transforming Earth Observation, with AI serving not only as an analytical tool but also as an integral component of the imaging instrument itself. This contribution presents an AI-enabled processing pipeline for Very High Resolution (VHR) optical satellites that rethinks traditional still-image acquisition.

Building on burst acquisition and Time Delay Integration (TDI), we introduce Burst Video Mode, a purely software-based capability that preserves the temporal dimension of VHR imaging. Rather than collapsing sequential frames into a single static image, the pipeline jointly delivers a radiometrically superior image and a temporally coherent short video.

Conceptually, this approach is analogous to “Live Photo” (iOS) or “Motion Photo” (Android) technologies in consumer photography, which capture short motion sequences around a single image, but here extended to provide significantly greater scientific and operational value.

Advanced AI models for video denoising, optical-flow-based alignment, and deblurring are leveraged to exploit inter-frame redundancy. This approach improves noise suppression, motion-aware deblurring, and geometric consistency beyond what is achievable with single-frame processing. The resulting video reveals dynamic scene information, parallax cues, and acquisition-time evolution that are otherwise lost. Importantly, the method requires no hardware changes and is compatible with existing push-frame VHR missions, subject only to SNR constraints.
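The benefit of exploiting inter-frame redundancy can be illustrated with a classical stand-in for the learned models mentioned above. The sketch below assumes frames differ only by integer circular shifts and additive Gaussian noise, estimates each shift by phase correlation, and averages the aligned frames; it is not the authors' pipeline, and the names `estimate_shift` and `fuse_burst` are hypothetical.

```python
import numpy as np


def estimate_shift(reference, frame):
    """Integer (dy, dx) to apply to `frame` with np.roll so it matches `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))
    cross_power /= np.abs(cross_power) + 1e-12  # keep phase only
    corr = np.fft.ifft2(cross_power).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    if dy > h // 2:  # wrap large positive offsets to negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)


def fuse_burst(frames):
    """Align every frame to the first one and average the result."""
    reference = frames[0]
    aligned = [np.roll(f, estimate_shift(reference, f), axis=(0, 1))
               for f in frames]
    return np.mean(aligned, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scene = rng.random((64, 64))  # hypothetical noise-free scene
    shifts = [(0, 0), (1, -2), (-3, 2), (2, 3)] * 4
    frames = [np.roll(scene, s, axis=(0, 1)) + rng.normal(0.0, 0.1, scene.shape)
              for s in shifts]
    fused = fuse_burst(frames)
    print("single-frame error:", np.std(frames[0] - scene))
    print("fused error:       ", np.std(fused - scene))
```

With 16 frames of independent noise, the residual error of the fused image is roughly a quarter of the single-frame error, which is the square-root-of-N gain that multi-frame processing builds on before any learned denoising or deblurring is applied.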