Deep probabilistic programming
My research group focuses on Deep Probabilistic Programming (DPP)
, which combines
the modelling scope of deep learning with the principled treatment of uncertainity
from Bayesian statistics. We apply DPP to problems in bioinformatics, notably
protein folding and protein evolution, and other areas. We also
develop new inference methods for DPP, notably based on Stein variational
inference. My group is mostly using the deep probabilistic programming languages
Deep generative models of protein structure
A recent comment on protein structure
prediction in Nature Methods (2023)
stated that "distributions of conformations are the
future of structural biology"
. We are exploring the use of probabilistic programming,
deep generative models and directional statistics to generate conformational ensembles.
Bifrost is a deep generative model of protein structure fragments
, implemented in Pyro. It is
based on a Deep Markov Model, using directional statistics to represent dihedral protein angles.
The models allows sampling of conformational ensembles for given protein sequence fragments. One
of the ongoing projects of our group is to extend such a deep generative model to model
the conformational space of entire proteins.
Bifrost was published in ICML in 2021
and is currently used by the Danish AI-driven biotech company
to design multivalent vaccines based on epitope grafting.
The figure below shows conformational ensembles of 9-residue fragments sampled from Bifrost,
conditional on the sequence. The true structure is shown in yellow.
Deep generative models of proteins evolution
Another project we work on is deep generative models of protein evolution.
Draupnir, also implemented in Pyro, is a deep probabilistic model of protein evolution
that can be used for ancestral sequence reconstruction
. The model makes use of a tree-structured
Ornstein-Uhlenbeck process, obtained from a given phylogenetic tree, as
an informative prior for a variational autoencoder and allows for modelling
co-evolution of amino acid position (which most current models can't do).
Draupnir was published in ICLR in 2022
The figure below shows a phylogenetic tree colored by clade to the left,
and a two-dimensional projection of the corresponding latent variables representing
sequences at each node. The latent variables inferred from the leaf sequences
by Draupnir rfelect the shape of the phylogenetic tree well.
Stein variational inference for deep probabilistic programming
A distinct family of variational inference methods, called Stein variational inference (Stein VI)
are subject of increasing interest by the ML community due to their
enhanced flexibility as non-parametric particle-based methods. Their main advantage is that such
methods can represent the uncertainty over the neural network parameters
, even for large data
sets and models. The core of Stein VI lies in the use of a set of particles
as an approximating
(i.e. variational) distribution. As such, Stein VI is particularly adept at capturing the rich correlations
between latent variables. By virtue of its improved flexibility, Stein VI can sufficiently deal with non-
Gaussianity and multi-modality, resulting in more accurate approximations of the posterior distribution,
while maintaining the scalability of conventional VI methods. Stein VI approximates the Bayesian posterior
using a mixture of point estimates represented by a set of particles
, whereas SVI relies purely on single
point estimates of the neural network parameters. The optimisation routine then involves minimising the KL
divergence between the approximating and true posterior distributions via iterative updates of the particles’
positions according to the so-called Stein forces
(see figure). The Stein forces consist of two terms,
comprising an attractive force (+) pushing the particles towards the modes of the true posterior distribution, and a repulsive
force (-) keeping the particles from each other so that they do not collapse to the same mode.
My group is working on
Bayesian inference based on Stein mixtures
. In Stein mixtures,
each Stein particle serves as the parameters of a distribution,
as in a conventional mixture model. PhD student Ola Rønning has contributed
an implementation of Stein mixtures to Numpyro
High-performance deep probabilistic programming
We are exploring the use of the FUTHARK
language to make the estimation of deep models
of protein folding and other demanding applications of deep probabilistic programming
tractable. Futhark is a functional data-parallel programming
. FUTHARK can be compiled to very efficient parallel code,
running on either a CPU or GPU, and is intended to be used for small, performance-sensitive parts
of larger applications, typically by compiling a Futhark program to a library that can be
imported and used by applications written in conventional languages (in our case Python
and Pyro/Numpyro). The language is developed at DIKU at the
University of Copenhagen, in the PLTC section (which hosts my research group).
This work is done in collaboration with
Assoc. prof. Cosmin Oancea
Prof. Fritz Henglein