Study Guide to Virtual Cell

virtual cell · foundation models · single-cell biology · study guide

The idea of a Virtual Cell has been gaining momentum: build foundation models trained on massive single-cell atlases so we can simulate cellular behavior in silico. It’s ambitious, and the literature is growing fast. This post is my attempt to organize the key papers and concepts into a coherent study path.

Regulatory Genomics Papers

Before diving into foundation models, it helps to understand the regulatory genomics landscape these models are trying to capture. These papers establish core concepts around how genetic variation shapes gene expression and disease.

Early Generative & Perturbation Attempts (2019 – 2021)

A key step toward virtual cells, taken before large foundation models existed, was learning to predict how cells respond to perturbations. These early works showed that generative models, especially variational autoencoders, could capture meaningful biological variation in latent space.

  • scGen predicts single-cell perturbation responses. Lotfollahi, Wolf & Theis (2019), Nature Methods. Pioneering work that combines a variational autoencoder (VAE) with latent space vector arithmetic to predict single-cell perturbation responses. The model learns a shared latent representation of cells, estimates the perturbation as a direction in that space from cell types where both conditions were measured, and applies the same shift to cell types where only unperturbed cells are available. This lets scGen extrapolate how unseen cell types would respond to a perturbation without matched perturbed/unperturbed data for every cell type. The paper demonstrated prediction across cell types, studies, and species.
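The latent space vector arithmetic behind scGen can be sketched in a few lines. This is a toy illustration, not the scGen implementation: the latent codes below are synthetic stand-ins for what a trained VAE encoder would produce, the decoder step back to gene expression is left out, and the names `latent_shift` and `predict_perturbed` are hypothetical.

```python
import numpy as np


def latent_shift(z_control, z_perturbed):
    """Perturbation direction: mean perturbed latent code minus mean control code."""
    return z_perturbed.mean(axis=0) - z_control.mean(axis=0)


def predict_perturbed(z_unseen_control, delta):
    """Shift control cells of a held-out cell type along the perturbation direction."""
    return z_unseen_control + delta


rng = np.random.default_rng(0)

# Synthetic perturbation effect in a 3-dimensional latent space (assumption).
true_effect = np.array([2.0, -1.0, 0.5])

# Cell type A: both conditions were measured.
z_a_ctrl = rng.normal(loc=0.0, scale=0.1, size=(200, 3))
z_a_pert = z_a_ctrl + true_effect

# Cell type B: only control cells were measured.
z_b_ctrl = rng.normal(loc=5.0, scale=0.1, size=(150, 3))

# Estimate the shift on cell type A, then apply it to cell type B.
delta = latent_shift(z_a_ctrl, z_a_pert)
z_b_pred = predict_perturbed(z_b_ctrl, delta)
```

In a real pipeline, `z_b_pred` would be passed through the VAE decoder to obtain predicted expression profiles. The sketch also assumes the perturbation acts as the same additive shift in every cell type, which is the simplifying assumption that makes the arithmetic work and the first thing to question when predictions look off.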

Under construction…