Multiscale Analysis for N-dimensional Transcriptome Alignment
General TODOs
Working Conda env with all necessary packages
Fast way to create multi-scale voxels
Binning of transcriptome data into voxels (FAST)
- The goal is to do this really fast. This should absolutely not take more than a few seconds.
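A minimal sketch of the binning step, assuming cells arrive as a dense coordinate array plus a dense count matrix and that voxels are axis-aligned cubes. The function name `bin_to_voxels` and its arguments are hypothetical, not part of the existing package. It avoids Python-level loops over cells, which is what keeps it in the "few seconds" range for moderately sized data; for very large matrices a sparse assignment-matrix product would likely be faster than `np.add.at`.

```python
import numpy as np

def bin_to_voxels(coords, counts, voxel_size):
    """Sum per-cell expression into a regular voxel grid (hypothetical helper).

    coords: (n_cells, d) spatial coordinates
    counts: (n_cells, n_genes) expression matrix
    voxel_size: voxel edge length, in the units of coords
    """
    coords = np.asarray(coords, dtype=float)
    counts = np.asarray(counts, dtype=float)
    origin = coords.min(axis=0)
    # Integer grid index of every cell along each axis.
    idx = np.floor((coords - origin) / voxel_size).astype(np.int64)
    # Collapse identical grid indices into one voxel id per cell.
    voxel_index, cell_to_voxel = np.unique(idx, axis=0, return_inverse=True)
    cell_to_voxel = cell_to_voxel.ravel()
    voxel_counts = np.zeros((len(voxel_index), counts.shape[1]))
    # Unbuffered scatter-add: cells that share a voxel are accumulated.
    np.add.at(voxel_counts, cell_to_voxel, counts)
    return voxel_index, voxel_counts, cell_to_voxel
```

Calling it once per `voxel_size` gives the multi-scale stack; `cell_to_voxel` is the cell-to-voxel assignment that can be kept in metadata.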
Voxelization pipeline (already in package format!)
Voxel representation of transcriptome data (i.e. allow custom or built-in methods to collect and voxelize features in a modular way)
Find/discuss potential spatial embedding methods
- GraphPCA
- It is extremely slow (even though the authors claim linear complexity)
- Randomized Spatial PCA → ❌ Not yet tested
- STAMP - Topic Modeling
- Does not work on log-normalized counts 😠
- Novae - Foundation Model
- Uses a foundation model pre-trained per organism (ergo, for Drosophila we will have to retrain the entire foundation model on the collected data!)
Figure out rigid alignment of voxel representations
- Generalized Procrustes Analysis
- Affine transform (scikit-learn)
Mathematical derivation of all necessary models
- What types of models?
Discuss how we can extend the alignment to integrate better with gene-regulatory tasks:
- Integration with single-cell multiome data (post alignment) → this would allow mapping single-cell data onto the 4D spatiotemporal alignment and be a “cheap” way of obtaining both spatial single-cell RNA and ATAC.
- What gene-regulatory related questions can we answer?
Methods
TODO → 13/02/2025
Expand 2D-based RANSAC + Kabsch-Umeyama to 3D
- Does rotation matrix in 3D hold up? ⇒ YES
- Do we need to resort to quaternions? ⇒ NO
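A minimal sketch of the Kabsch-Umeyama step in arbitrary dimension, assuming matched point pairs are already available (for example, the inlier set of one RANSAC iteration). The function name `kabsch_umeyama` and its signature are hypothetical. The same SVD-based solution works in 2D and 3D, which is why a plain rotation matrix suffices and quaternions are not needed.

```python
import numpy as np

def kabsch_umeyama(src, dst, with_scale=True):
    """Least-squares similarity transform mapping src onto dst.

    src, dst: (n_points, d) arrays of matched points.
    Returns (R, s, t) such that dst ≈ s * src @ R.T + t.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n, d = src.shape
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    # d x d cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)
    # Reflection guard: flip the last axis if the SVD yields det = -1.
    D = np.eye(d)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[-1, -1] = -1.0
    R = U @ D @ Vt
    if with_scale:
        var_src = (src_c ** 2).sum() / n
        s = float(np.trace(np.diag(S) @ D) / var_src)
    else:
        s = 1.0
    t = mu_dst - s * R @ mu_src
    return R, s, t
```

Setting `with_scale=False` gives the purely rigid (Kabsch) solution; inside RANSAC this would be fitted on each random minimal sample and scored on the remaining pairs.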
Automatic voxelization determination
Proper error computation
Fix iterative refinement
Optimization
GIF of alignment
Voxel-match plotting
Start working on/discussing the non-rigid alignment
Thesis Part - Existing registration methods
Thesis Part - Introduction
Thesis Part - Benchmark
TODO → 06/02/2025
Find voxel correlation algorithm
- Find where voxels should be rigidly aligned to. Some ideas:
- Best correlator (cannot handle sparse voxelization!)
- Weighted average of k-best correlators
- Move voxel rigidly towards the found position
- Define maximal neighbourhood depending on voxelization scale (probably one-neighbours ⇒ 27 voxels to check)
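A rough sketch of the neighbourhood search with the weighted k-best idea, assuming the target is stored as a sparse mapping from voxel index to feature vector and that Pearson correlation is the similarity measure. Both are assumptions, and the names `neighbour_offsets` and `k_best_target` are hypothetical. Empty voxels are skipped, which is one way to cope with sparse voxelization.

```python
import itertools
import numpy as np

def neighbour_offsets(ndim=3, radius=1):
    """All integer offsets within `radius` per axis (27 for 3D, radius 1)."""
    return np.array(list(itertools.product(range(-radius, radius + 1), repeat=ndim)))

def k_best_target(source_feat, target_grid, voxel, k=3):
    """Correlation-weighted mean position of the k best-correlated neighbours.

    target_grid: dict mapping voxel index tuple -> feature vector.
    Returns None when no occupied neighbour has a finite correlation.
    """
    candidates, scores = [], []
    for off in neighbour_offsets(len(voxel)):
        pos = tuple(int(v) for v in np.asarray(voxel) + off)
        feat = target_grid.get(pos)
        if feat is None:
            continue  # empty voxel in the sparse grid
        r = np.corrcoef(source_feat, feat)[0, 1]
        if np.isfinite(r):
            candidates.append(pos)
            scores.append(r)
    if not candidates:
        return None
    order = np.argsort(scores)[::-1][:k]
    # Negative correlations get zero weight.
    w = np.clip(np.array(scores)[order], 0.0, None)
    if w.sum() == 0:
        return np.array(candidates[order[0]], dtype=float)
    pts = np.array(candidates, dtype=float)[order]
    return (pts * w[:, None]).sum(axis=0) / w.sum()
```

With `k=1` this reduces to the best-correlator variant; the returned position is the point the source voxel would be moved towards.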
See if this works better with the integration of a spatial embedding such as Novae.
- What should the number of Novae domains be?
- Can Novae be extended to 3D?
- What other limitations does Novae have?
TODO → 30/01/2025
Voxelization
- New AnnData object
- Keep center transform in metadata
- Assign cells to voxels (keep in metadata)
- Find good resolution range ⇒ not powers of two, voxels too big in this case!
Mouse Brain Segmentation
- Segment Mouse Brain into 2 datasets (→ Make 3D reconstruction of 200µm slices, 68 slices per dataset)
- Try interpolation in z-axis
- Linear interpolation
- SpatialZ
Manuscript
- Start creating custom Legrand-inspired template
- Make sure to have
- list of tables
- list of figures
- list of abbreviations
- custom TOC
- custom headers (chapter-wise) → full header?
- custom headers (section-wise) → number on side
- custom part indicator (no TOC, overdone!)
- use CMU-bright
- proper BibTeX
- Port existing manuscript over