Partial alignment of multislice spatially resolved transcriptomics data (PASTE2)

Final Notes - The idea is good but the execution much less so. The authors claim that alignment is improved by using histology images, which are not available to us. Next, while the method does allow for 3D reconstruction, it does not take into account any deformations. Furthermore, the paper does not provide any measurements in terms of performance. This will need to be tested to draw final usability conclusions in terms of scalability and performance.

Additional Material -

Partial Optimal Transport with applications on Positive-Unlabeled Learning

Optimal Transport for structured data with application on graphs

Code is available on GitHub.

Abstract - The main idea of PASTE(X) is to partially (?) align slices of ST experiments using a novel Gromov-Wasserstein OT formulation that is optimized with a conditional gradient algorithm.

⚠️ What immediately comes to mind is the scalability/performance of the algorithm. OT is expensive, even with regularized algorithms like Sinkhorn.

Introduction

The authors mention the need for more powerful ST analysis, breaking ground in three dimensions rather than two. This enables downstream tasks such as 3D spatial expression analysis, 3D cell-cell communication, and 3D clustering.

Alignment of 2D ST slices comes with a bunch of challenges, such as morphological differences.

The authors mention several different approaches:

Some of which use spatial information exclusively (ideally what we’re looking for) - I should take a look at:

Seurat
SCOT
Pamona

It is also mentioned that techniques from other branches such as fMRI alignment often do not work due to manual labour or difficulty in extending the methods.
✅ This paper provides a good trampoline to explain why certain approaches will NOT work.

Methods

An ST image is often described by a tuple $(X, Z)$ where $X \in \mathbb{R}^{n \times p}$ and $Z \in \mathbb{R}^{n \times 2}$. The matrix $X$ holds the expression profiles (over $p$ genes) of the $n$ observations (i.e. spots). The matrix $Z$ holds the 2D location of each spot. Together, they describe the spot location and its expression profile.

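We can extend this formulation by introducing a pairwise distance matrix $D \in \mathbb{R}^{n \times n}$ where $d_{ij} = \lVert z_i - z_j \rVert$ represents the distance between spots $i$ and $j$ ($z_i$ being the 2D location of spot $i$).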
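As a minimal sketch of this step, the distance matrix can be built directly from the spot coordinates with NumPy. The names `Z`, `D` and `pairwise_distances` are chosen here for illustration, and Euclidean distance is assumed.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distance between every pair of spots; Z has shape (n, 2)."""
    diff = Z[:, None, :] - Z[None, :, :]      # shape (n, n, 2)
    return np.sqrt((diff ** 2).sum(axis=-1))  # shape (n, n)

# Three toy spots on a line (made-up coordinates, not real data).
Z = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = pairwise_distances(Z)
```

The matrix is symmetric with a zero diagonal, and it needs O(n^2) memory, which is already a first scalability concern for large slices.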

There exists a distribution $g$ over all the spots in the slice.

Partial Pairwise Slice Alignment Problem

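We now introduce two ST images $(X, D, g)$ and $(X', D', g')$ with $n$ and $n'$ spots. The probabilistic mapping $\Pi = [\pi_{ij}] \in \mathbb{R}_{\geq 0}^{n \times n'}$ (note, the number of observations need not be the same!) describes the probability that spot $i$ in the first image is assigned to spot $j$ in the second.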

PASTE2’s precursor, PASTE, optimizes an objective function that is composed of an expression similarity summand and a spatial distance summand:

$$F(\Pi) = (1 - \alpha) \sum_{i,j} c(x_i, x'_j)\, \pi_{ij} + \alpha \sum_{i,j,k,l} (d_{ik} - d'_{jl})^2\, \pi_{ij}\, \pi_{kl}$$

($\alpha \in [0, 1]$ is a weighting term to favour either the expression profile or the Euclidean distance). The first summand is called the Wasserstein distance, representing the cost of moving one unit of probability mass from each spot $i$ to each spot $j$, with the cost $c(x_i, x'_j)$ being the gene expression dissimilarity between the two spots. The spatial summand is called the Gromov-Wasserstein distance, designed to preserve the pairwise spatial distances within each image. Together, they form the fused Gromov-Wasserstein (FGW) optimal transport objective (Vayer et al., Optimal Transport for structured data with application on graphs).
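To make the two summands concrete, here is a small sketch that evaluates the objective for a given transport plan. It is not the authors' implementation: the function name, the variable names, and the choice of squared Euclidean distance as expression dissimilarity are assumptions made for illustration. The dense 4-index tensor is only feasible for tiny inputs.

```python
import numpy as np

def fgw_objective(Pi, C, D1, D2, alpha):
    """(1 - alpha) * expression term + alpha * spatial term for a plan Pi.

    Pi: (n, n') plan, C: (n, n') expression dissimilarities,
    D1: (n, n) and D2: (n', n') within-image spot distances.
    """
    expression = np.sum(C * Pi)
    # diff2[i, j, k, l] = (D1[i, k] - D2[j, l]) ** 2
    diff2 = (D1[:, None, :, None] - D2[None, :, None, :]) ** 2
    spatial = np.einsum("ijkl,ij,kl->", diff2, Pi, Pi)
    return (1 - alpha) * expression + alpha * spatial

# Toy data: the same 3-spot image is compared with itself.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # expression profiles
Z = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])   # spot coordinates
D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
# Assumed dissimilarity: squared Euclidean distance between profiles.
C = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

Pi_identity = np.eye(3) / 3               # each spot mapped to itself
Pi_swapped = Pi_identity[:, [1, 0, 2]]    # spots 0 and 1 exchanged
F_identity = fgw_objective(Pi_identity, C, D, D, alpha=0.5)
F_swapped = fgw_objective(Pi_swapped, C, D, D, alpha=0.5)
```

Mapping every spot to itself costs nothing in either term, while the swapped plan is penalized both for mismatched expression and for distorted spatial distances.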

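The limitation of PASTE is the original constraint, which requires that every spot in one image is mapped to spots in the other, i.e. all probability mass is transported ($\Pi 1_{n'} = g$ and $\Pi^T 1_n = g'$). This is not always the case, and PASTE2 therefore reconsiders the necessary constraints. The authors consider the following set of constraints for the transport plan $\Pi$:

$$\Pi \geq 0, \quad \Pi 1_{n'} \leq g, \quad \Pi^T 1_n \leq g', \quad 1_n^T \Pi 1_{n'} = s$$

Here, $s \in [0, 1]$ is the overlap percentage (expressed as a fraction) and is effectively a hyperparameter (❌ This is bad for our use case).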
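The partial constraints are easy to check numerically. The sketch below assumes uniform spot distributions and uses illustrative names (`is_partial_plan`, `g1`, `g2`, `s`) that are not taken from the paper or its code.

```python
import numpy as np

def is_partial_plan(Pi, g1, g2, s, tol=1e-9):
    """Check Pi >= 0, row sums <= g1, column sums <= g2, total mass == s."""
    return bool(
        np.all(Pi >= -tol)
        and np.all(Pi.sum(axis=1) <= g1 + tol)
        and np.all(Pi.sum(axis=0) <= g2 + tol)
        and abs(Pi.sum() - s) <= tol
    )

n1, n2, s = 4, 6, 0.7            # image sizes and assumed overlap fraction
g1 = np.full(n1, 1.0 / n1)       # uniform distribution over spots (assumption)
g2 = np.full(n2, 1.0 / n2)

Pi_partial = s * np.outer(g1, g2)   # transports only a fraction s of the mass
Pi_full = np.outer(g1, g2)          # transports all mass, as PASTE requires

partial_ok = is_partial_plan(Pi_partial, g1, g2, s)
full_ok = is_partial_plan(Pi_full, g1, g2, s)
```

The scaled product plan is feasible for any `s` in [0, 1], whereas the full-mass plan violates the total-mass constraint as soon as `s < 1`. This is the point where the overlap has to be known or estimated in advance.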

An iterative conditional gradient algorithm for optimization

Mathematical formulation of the problem does not matter for the general understanding (only once the thesis has to be written).

Using histological image data in alignment

We do not have any additional information and thus do not substitute the gene expression (dis)similarity summand.

3D Reconstruction based on the partial alignment matrix

The authors compute the center of each image to account for partial alignments.
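A hedged sketch of how such a reconstruction step could look: it assumes the centers are means weighted by the alignment matrix, followed by a weighted Procrustes (Kabsch) rotation obtained from an SVD. The function and variable names are hypothetical and the details may differ from the authors' code.

```python
import numpy as np

def align_pair(Z1, Z2, Pi):
    """Center both images on their Pi-weighted means and rotate Z2 onto Z1."""
    mass = Pi.sum()
    c1 = Pi.sum(axis=1) @ Z1 / mass   # weighted center of image 1
    c2 = Pi.sum(axis=0) @ Z2 / mass   # weighted center of image 2
    A, B = Z1 - c1, Z2 - c2
    H = B.T @ Pi.T @ A                # 2x2 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # exclude reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return A, B @ R.T

# Toy example: image 2 is a rotated, shifted copy of image 1 plus one
# extra spot that has no counterpart (it receives zero mass).
theta = 0.6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
Z1 = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
Z2 = np.vstack([Z1 @ rot.T + np.array([5.0, -2.0]), [[100.0, -50.0]]])

Pi = np.zeros((4, 5))
Pi[np.arange(4), np.arange(4)] = 0.2  # total mass 0.8, i.e. partial overlap

Z1_centered, Z2_aligned = align_pair(Z1, Z2, Pi)
```

Because the unmatched spot carries no mass, it influences neither the center nor the rotation, so the four matched spots land on top of each other after alignment.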