# CNRS Défi Imag’In Cavalieri

### CAlcul des Variations pour L'Imagerie, l'Edition et la Recherche
d'Images

Workshop
on Optimal Transport and Optimization in Imaging

### Tuesday 11th

*9h15-9h30: Welcoming*

*9h45-10h30: C.-B.
Schönlieb and **Y.
Boink.**
Combined modelling of optimal transport and segmentation.*
Abstract

For studying vascular structures in 4D biomedical imaging, it is of
great importance to automatically determine the velocity of flow in
video sequences, for example blood flow in vessel networks. In this
thesis new optimal transport models focusing on direction and
segmentation are investigated to find an accurate displacement between
two density distributions. By incorporating fluid dynamics
constraints, one can obtain a realistic description of the
displacement. With an a-priori given segmentation of the network
structure, transport models can be improved. However, a segmentation
is not always known beforehand. Therefore, in this work a joint
segmentation-optimal transport model has been described. Other
contributions are the ability of the model to allow for inflow or
outflow and the incorporation of anisotropy in the displacement cost.
For the problem, a convex variational method has been used and
primal-dual proximal splitting algorithms have been implemented.
Existence of a solution of the model has been proved. The framework
has been applied to synthetic vascular structures and real data,
obtained from a collaboration with the applied mathematics and the
hospital in Cambridge.

This is joint work with Yoeri Boink and Christoph Brune.

Coffee
break

* *

*11h00-11h45: ** **N.
Bonneel.**
Wasserstein Barycentric Coordinates.*
Abstract

*
*

This talk will expose a notion of barycentric coordinates for
histograms via optimal transport. This is formulated as the problem
of finding the Wasserstein barycenter of several basis histograms
that best fits an input histogram according to some arbitrary loss
function. A Wasserstein barycenter is a histogram in-between the
basis histograms according to the optimal transport metric ; we use
this geometry to project an input histogram onto the set of possible
Wasserstein barycenters. We developed an efficient and robust
algorithm from an automatic differentiation of the Sinkhorn
iterative procedure.
We illustrate our algorithm with applications in computer graphics,
such as color grading an input image using a database of
photographs, or filling-in missing values in captured reflectance
functions or geometries based on similar exemplars.

*
** 11h45-12h15: E.
Cazelles. **Regularized
Wasserstein barycenter.* Abstract

The concept of barycenter in the Wasserstein space allows the
dention of a notion of Frechet mean of a set of probability
measures. However, depending on the considered data, such
barycenters may be irregular. In this paper, we thus introduce a
convex regularization of Wasserstein barycenters for random measures
supported on ℝ^{d}. We prove
the existence and uniqueness of such barycenters for a large class
of regularizing functions. A stability result of regularized
barycenters in terms of Bregman divergence associated to the convex
regularization term is also given. Hence, we study the case of data
made of i.i.d. random probability measures. In particular, we prove
the convergence in Bregman divergence of the regularized empirical
barycenter of a set of n random probability measures towards its
population counterpart, and we discuss its rate of convergence. This
approach is shown to be appropriate for the study of discrete or
absolutely continuous random measures. We then focus on the analysis
of probability measures supported on the real line. In this setting,
we propose an ecient minimization algorithm based on accelerated
gradient descent for the computation of regularized Wasserstein
barycenters. This approach is nally applied to the statistical
analysis of simulated and real data sets.

Lunch
break

*14h00-14h45: ** F. Iutzeler. **Modified
fixed points iterations and applications to randomized and
accelerated optimization algorithms.*
Abstract

Monotone operators have shown to be very interesting mathematical
objects and enabled the derivation of efficient algorithms. In this
talk, we will rely on simple fixed points iterations of an operator
with some contraction property enabling monotone convergence to some
fixed point. Then, we will see how deterministic and stochastic
variations of these iterations enable to derive randomized or
accelerated versions of popular algorithms such as the proximal
gradient or ADMM. More precisely, we will focus on i) randomization
to develop stochastic optimization algorithms; and ii) inertia to
accelerate the convergence.

*14h45-15h30: **
C. Boyer. **Adapting
to unknown noise level in super-resolution**.*
Abstract

We study sparse spikes deconvolution over the space of
complex-valued measures when the input measure is a finite sum of
Dirac masses. We introduce a new procedure to handle the spike
deconvolution when the noise level is unknown. Prediction and
localization results will be presented for this approach. An insight
on the probabilistic tools used in the proofs could be briefly given
as well.

*15h30-16h00: **
C. Poon. * *Geometric
properties of solutions to the total variation denoising problem.*
Abstract

Total variation (TV) denoising has been extensively studied in
imaging sciences since its introduction by Rudin, Osher and Fatemi
in 1992. However, the majority of theoretical works TV
regularization are actually concerned with solutions to total
variation denoising in the absence of noise. For instance, it is
known that the jump set of the regularized solutions are contained
in the jump set of the original data, but this knowledge is fairly
meaningless in the presence of noise since the noisy function can
introduce arbitrarily many new discontinuities. Furthermore, works
that consider the impact of noise on TV regularized solutions
typically derive integral estimates such as L^2 error bounds or
bounds on the total variation of the solution. However, such
estimates do not inform on the proximity of the gradient support of
the regularized solution to that of the clean function.

This talk is concerned with the impact of noise on the regularized
solutions. In particular, we consider stability in the gradient
support of solutions and address the following question: Given a
small neighbourhood around the support of Df, when will the gradient
support of the regularized solution be contained inside this
neighbourhood if the noise function has sufficiently small L^2 norm?

Coffee
break

*16h30-17h15: ** B. Wirth.**
Optimal design of transport networks.*
Abstract

*
*

Several applications in biology and engineering are concerned with
the optimal transportation of substances from source locations to
sink locations. As opposed to classical optimal transport, models
for transport networks take into account that it is more efficient
to transport material in bulk. The resulting optimal transport
networks typically have a branching structure. We discuss
different model formulations and their equivalence as well as the
geometry of (almost) optimal networks, which can be analyzed by
proving energy scaling laws in the regime of small preference for
bulk transport.

*
**17H15-17h45: **L. Chizat. **Scaling
algorithm for unbalanced optimal transport.*
Abstract*
*

This talk has three aims: (i) to describe the so-called unbalanced
optimal transport problem, (ii) to introduce a scaling algorithm
which generalizes Sinkorn's algorithm and solves this problem with
a linear convergence rate, and (iii) to showcase applications
which range from shape interpolation and colour transfer to the
simulation of very degenerate evolution PDEs.

This is joint work with Gabriel Peyré, Bernhard Schmitzer and
François-Xavier Vialard.

##

### Wednesday 12th

##

* *

*9h45-10h30: B.
Levy. **
**Semi-discrete
Optimal Transport in 3D: A Numerical Algorithm and Some
Applications.* Abstract

Semi-discrete algorithms seem to be a promising avenue for
developing efficient numerical methods for solving the
Monge-Ampère equation: Optimal Transport between a piecewise
linear density and a sum of Dirac masses can be directly
translated into a numerical algorithm, that maximizes a concave
function [Alexandrov, Aurenhammer, Mc Cann & Gangbo, Merigot].
The gradient and Hessian of this objective function involve
integrals over the intersection between a Laguerre diagram and the
tetrahedral mesh that supports the input source piecewise linear
density. I will present an efficient algorithm to compute these
integrals, as well as its use by a Newton-type solver
[Kitagawa,Merigot,Thibert], and some applications in fluid
dynamics [Gallouet Merigot].

Coffee break

*11h00-11h45: **A.
Gramfort. **Optimal
transport for statistics problem in neuroimaging.*
Abstract

*
*

Knowing how the Human brain is anatomically and functionally
organized at the level of a group of healthy individuals or
patients is the primary goal of neuroimaging research. Yet a
number of statistical challenges emerge when processing
functional brain imaging data such as functional MRI (fMRI),
electroencephalography (EEG) or magnetoencephalography (MEG).

One of them is computing an average of brain imaging data
defined over a voxel grid or a triangulation in order to
estimate a prototypical brain activation map for the
population. The problem is that data are large, the geometry
of the brain is complex and the between subjects variability
leads to spatially or temporally non-overlapping activation
foci. To address the problem of variability, data are commonly
smoothed before averaging, but this goes against the objective
of brain research to achieve better spatiotemporal resolution.
In this talk, I will show how using optimal transport and
Wasserstein barycenters one can average fMRI and MEG brain
activations without blurring the results. A specifity of the
approach is to be able to handle non-normalized data.

In a second part of the talk, I will present some work in
progress where we use optimal transport with entropic
smoothing and some capacity constraint to address the problem
of domain adaptation (DA) in machine learning. In biomedical
applications shifts in distributions between train data and
test data are quite common. For example test data could be
coming from a different machine or from a very different
cohort of patients. This is the problem that DA aims to solve
and we propose to use optimal transport for this.

*
**11h45-12h15: ** K.
Modin*. *The
polar decomposition as the limit of a lifted entropy
gradient flow.* Abstract

The work of Yann Brenier shows that optimal mass transport
(OMT) in the L^{2} sense is intrinsically tied to
polar decompositions. As a toy problem, one may consider OMT
in the category of multivariate Gaussian densities and the
associated polar decomposition A=PQ of real matrices. In this
talk I show that the P component can be obtained as the limit
of a gradient flow with respect to a lifted relative entropy
functional, thereby providing a new, geometric proof of the
polar decomposition of matrices. I shall also present a
corresponding infinite-dimensional lifted entropy gradient
flow on the space of convex functions. Here, the first
component in the polar decomposition of maps is an
equilibrium. Furthermore, I give a result hinting that the
infinite-dimensional flow has a unique limit in the smooth
category if the destination marginal is log-convex.

Lunch
break

*14h00-14h45: ** E.
Chouzenoux. * *Convergence
analysis of a stochastic majorize-minimize memory gradient
algorithm.* Abstract

Stochastic approximation techniques play a prominent role in
solving many large scale problems encountered in machine
learning or image/signal processing. In these contexts, the
statistics of the data are often unknown a priori or their
direct computation is too intensive, and they have thus to be
estimated online from the observations. For batch optimization
of an objective function being the sum of a data fidelity term
and a penalization (e.g. a sparsity promoting function),
Majorize-Minimize (MM) methods have recently attracted much
interest since they are fast, highly flexible, and effective
in ensuring convergence. The goal of this work is to show how
these methods can be successfully extended to the case when
the data fidelity term corresponds to a least squares
criterion and the cost function is replaced by a sequence of
stochastic approximations of it. In this context, we propose
an online version of an MM subspace algorithm and we establish
its convergence by using suitable probabilistic tools. We also
provide new results on the convergence rate of such kind of
algorithm. Numerical results illustrate the good practical
performance of the proposed algorithm associated with a memory
gradient subspace, when applied to both non-adaptive and
adaptive linear system identification scenarios.

*
**14h45-15h15: **
P. Tan. * *Accelerated
Alternating Descent Methods for Dystra-like Problems.*
Abstract

In this talk, we present an alternating descent scheme for
problems involving nonsmooth objectives with a quadratic
coupling of the variables. Our algorithm performs in each
variable several descent steps, which improves the
performances compared to a single step. Thanks to a FISTA-like
trick, this scheme can also be accelerated, An application of
this work is the implementation of a fast parallel solver for
the proximity operator of the Total Variation for color
images. In this case, this method can be interpreted as an
inexact scheme.

*15h15-15h45: **
Q. Denoyelle. * *Support Recovery for Sparse Super-resolution of Positive Measures.*
Abstract

We study sparse spikes super-resolution over the space of Radon measures on the real line or the torus when the input measure is a finite sum of positive Dirac masses using the BLASSO convex program. We focus on the recovery properties of the support and the amplitudes of the initial measure in the presence of noise as a function of the minimum separation of the input measure (the minimum distance between two spikes).

*15H45-16h15: ** M.
Schmitz. **Feature learning using Optimal Transport.*
Abstract

Accounting for and correcting an instrument’s Point
Spread Function (PSF) is a key element in many astronomical
missions, including the upcoming Euclid survey where one of
the key objectives is to perform galaxy shape measurements for
weak lensing. Because of the non-linear way in which the PSF
varies across the field of view, and the way in which Optimal
Transport distances account for geometric warping, using them
for PSF-related problems is a very promising prospect. Using
recent numerical schemes to compute approximate Wasserstein
distances, and thus Wasserstein Barycentres (Sliced
Wasserstein Barycentres and entropic penalty), we perform
feature learning on images, and apply it in particular to PSFs
and galaxy images.

Coffee
break and conclusion of the Workshop

##

##

##