v2ecoli: Whole-Cell E. coli via Process-Bigraph¶
Available since v0.9.0
v2ecoli is a whole-cell model of
Escherichia coli built entirely on process-bigraph.
It composes ~55 biological processes — metabolism, transcription, translation,
replication, chromosome condensation, cell division, and more — into a single
runnable Composite. Atlantis provides a production path to run v2ecoli
simulations on HPC without any local installation.
Quick Start¶
Single cell¶
uv run atlantis compose ecoli \
  --duration 60 \
  --seed 0 \
  --poll \
  --base-url https://sms.cam.uchc.edu
This runs 60 seconds of biological time and polls until the SLURM job completes. Note your simulation ID from the output, then download results:
uv run atlantis compose results <SIM_ID> \
  --dest ./ecoli_output \
  --base-url https://sms.cam.uchc.edu
All options¶
| Option | Default | Description |
|---|---|---|
| --duration | | Biological simulation time (seconds) |
| --seed | | Random seed for stochastic processes |
| | | Execution timestep (seconds) |
| --features | | JSON list of feature modules, e.g. |
| | | Absolute path to ParCa cache inside the container |
| --poll | off | Wait for completion and print final status |
| --base-url | | API server URL |
Colony Simulations¶
A colony simulation runs the same E. coli model with multiple independent random seeds, producing an ensemble that captures cell-to-cell variability arising from stochastic processes (transcription bursting, division asymmetry, etc.).
Run a colony from the CLI¶
The simplest approach is a shell loop — each seed is an independent SLURM job that runs in parallel on the cluster:
BASE_URL=https://sms.cam.uchc.edu
DURATION=120
SEEDS=10
for SEED in $(seq 0 $((SEEDS - 1))); do
  SIM_ID=$(uv run atlantis compose ecoli \
    --duration "$DURATION" \
    --seed "$SEED" \
    --base-url "$BASE_URL" \
    2>/dev/null | grep "Simulation ID" | awk '{print $NF}')
  echo "Submitted seed $SEED → sim $SIM_ID"
done
All 10 jobs are submitted independently, so the cluster can run them concurrently (subject to scheduler availability). Poll any one of them:
uv run atlantis compose status <SIM_ID> --base-url $BASE_URL
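The same submission loop can be driven from Python. The sketch below only builds the argument list for each seed and extracts the ID from captured output. It assumes, as the grep/awk pipeline does, that the CLI prints a line containing "Simulation ID" whose last whitespace-separated token is the ID. The helper names are illustrative and not part of Atlantis.

```python
from typing import List, Optional


def build_submit_command(seed: int, duration: float, base_url: str) -> List[str]:
    """Argument list for one `atlantis compose ecoli` submission (flags as used above)."""
    return [
        "uv", "run", "atlantis", "compose", "ecoli",
        "--duration", str(duration),
        "--seed", str(seed),
        "--base-url", base_url,
    ]


def parse_simulation_id(cli_output: str) -> Optional[str]:
    """Return the last token of the first line mentioning 'Simulation ID', if any.

    Assumption: mirrors the grep/awk pipeline above; the real output format may differ.
    """
    for line in cli_output.splitlines():
        if "Simulation ID" in line:
            return line.split()[-1]
    return None


# One command per seed, 0 through 9, matching the shell loop.
commands = [build_submit_command(seed, 120, "https://sms.cam.uchc.edu") for seed in range(10)]
```

Each command can be passed to subprocess.run(cmd, capture_output=True, text=True), with the captured stdout fed to parse_simulation_id. Recording the seed next to the returned ID avoids guessing the mapping later.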
Download colony results¶
# Example IDs: substitute the ones printed at submission (here seed 0 → sim 101).
for SIM_ID in 101 102 103 104 105 106 107 108 109 110; do
  uv run atlantis compose results "$SIM_ID" \
    --dest ./colony/seed_$((SIM_ID - 101)) \
    --base-url "$BASE_URL"
done
Each seed_N/ directory will contain results.zip with the simulation output
(final_state.json, time-series data).
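Simulation IDs are not guaranteed to be consecutive, so deriving the seed from the ID by arithmetic is fragile. A minimal sketch, assuming you recorded a seed-to-ID mapping at submission time, builds one download command per seed. The function name and the example IDs are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def build_download_commands(
    seed_to_sim: Dict[int, str], base_url: str, root: str = "./colony"
) -> List[List[str]]:
    """One `atlantis compose results` argument list per seed, keyed by the recorded IDs."""
    commands = []
    for seed, sim_id in sorted(seed_to_sim.items()):
        dest = Path(root) / f"seed_{seed}"
        commands.append([
            "uv", "run", "atlantis", "compose", "results", sim_id,
            "--dest", str(dest),
            "--base-url", base_url,
        ])
    return commands


# Hypothetical IDs, as they might be recorded from the submission output.
example = build_download_commands({0: "101", 1: "102"}, "https://sms.cam.uchc.edu")
```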
What varies between seeds¶
Stochastic variation between seeds comes from:
| Process | Source of randomness |
|---|---|
| Transcription | Poisson-distributed mRNA synthesis events |
| Translation | Stochastic ribosome binding and elongation |
| Replication | Probabilistic origin firing |
| Cell division | Stochastic partitioning of molecules to daughters |
| Metabolism | Flux variability from stochastic enzyme availability |
Setting --seed 0 through --seed N-1 gives you reproducible, independently
varying trajectories for each simulated cell.
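The link between a seed and a reproducible trajectory can be illustrated with a toy version of molecule partitioning at division. This is not v2ecoli code: it assumes each molecule independently goes to either daughter with probability 0.5 and uses Python's standard random module.

```python
import random
from typing import Tuple


def partition_molecules(count: int, seed: int) -> Tuple[int, int]:
    """Split `count` molecules between two daughters, each molecule choosing a side with p = 0.5."""
    rng = random.Random(seed)  # private generator, so separate runs do not share state
    first = sum(1 for _ in range(count) if rng.random() < 0.5)
    return first, count - first


# The same seed reproduces the same split; different seeds draw independently.
same_a = partition_molecules(1000, seed=0)
same_b = partition_molecules(1000, seed=0)
splits = [partition_molecules(1000, seed=s) for s in range(5)]
```

Rerunning with the same seed yields an identical split, which is the property that makes a colony of N seeds reproducible as a whole.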
Feature flags¶
Enable biological sub-models that are off by default:
uv run atlantis compose ecoli \
  --duration 60 \
  --seed 0 \
  --features '["ppgpp_regulation", "rna_attenuation"]' \
  --poll \
  --base-url https://sms.cam.uchc.edu
Available features are defined in the v2ecoli source tree under
v2ecoli/experiments/.
How It Works Under the Hood¶
When you call atlantis compose ecoli, the following pipeline runs:
1. Process-bigraph document generation¶
The API generates a .pbg document that configures the v2ecoli Composite:
{
  "state": {
    "v2ecoli": {
      "_type": "process",
      "address": "local:v2ecoli.composite.make_composite",
      "config": {
        "seed": 0,
        "cache_dir": "/out/cache",
        "features": []
      },
      "interval": 1.0
    }
  }
}
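As a rough illustration of how such a document could be assembled from request parameters, the sketch below builds a dict with the same shape and serializes it. The function name is hypothetical and this is not the API's actual generator. Only the keys and values shown in the document above are taken from the source.

```python
import json
from typing import Any, Dict, List, Optional


def build_pbg_document(
    seed: int = 0,
    cache_dir: str = "/out/cache",
    features: Optional[List[str]] = None,
    interval: float = 1.0,
) -> Dict[str, Any]:
    """Return a dict with the same shape as the document shown above."""
    return {
        "state": {
            "v2ecoli": {
                "_type": "process",
                "address": "local:v2ecoli.composite.make_composite",
                "config": {
                    "seed": seed,
                    "cache_dir": cache_dir,
                    "features": list(features or []),
                },
                "interval": interval,
            }
        }
    }


document_text = json.dumps(build_pbg_document(seed=3, features=["ppgpp_regulation"]), indent=2)
```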
2. Container auto-generation¶
A Singularity .def file is generated by
pbest with v2ecoli injected as an
extra pip dependency:
Bootstrap: docker
From: ghcr.io/astral-sh/uv:python3.12-bookworm
%post
    pip install process-bigraph pbsim-common
    pip install git+https://github.com/vivarium-collective/v2ecoli.git
    # vEcoli pulled in transitively
The definition is content-hashed. If a container with the same hash already exists on HPC, the build is skipped entirely (~0s). Otherwise, a SLURM build job runs (~15 minutes for first build).
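The skip-if-built behavior follows from hashing the definition text: identical text always gives the same digest, so an existing container can be found by hash alone. A minimal sketch of the idea follows. SHA-256 and the in-memory set of known hashes are assumptions, since the source does not say which hash or lookup pbest uses.

```python
import hashlib
from typing import Set


def definition_hash(definition_text: str) -> str:
    """SHA-256 hex digest of the definition text (the hash function is an assumption)."""
    return hashlib.sha256(definition_text.encode("utf-8")).hexdigest()


def needs_build(definition_text: str, existing_hashes: Set[str]) -> bool:
    """True when no container with this definition's hash exists yet."""
    return definition_hash(definition_text) not in existing_hashes


definition = "Bootstrap: docker\nFrom: ghcr.io/astral-sh/uv:python3.12-bookworm\n"
built = {definition_hash(definition)}  # pretend this container was built earlier
```

Any change to the definition, including a new pip dependency, changes the digest and therefore triggers a fresh build.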
3. Runner script¶
A Python runner is generated and uploaded to the experiment directory:
from v2ecoli.composite import make_composite

composite = make_composite(
    cache_dir='/out/cache',
    seed=0,
    features=[],
)
composite.run(60.0)
make_composite is a factory function (not a class) — it assembles the full
bigraph of ~55 biological process instances and returns a runnable Composite.
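Generating a runner like the one above amounts to filling a small template with the request parameters. The sketch below is a hypothetical template function, not the generator Atlantis uses. It renders the same script and checks only that the result is syntactically valid Python.

```python
from typing import List


def render_runner(cache_dir: str, seed: int, features: List[str], duration: float) -> str:
    """Return runner source with the same shape as the script shown above."""
    return (
        "from v2ecoli.composite import make_composite\n"
        "composite = make_composite(\n"
        f"    cache_dir={cache_dir!r},\n"
        f"    seed={seed!r},\n"
        f"    features={features!r},\n"
        ")\n"
        f"composite.run({float(duration)!r})\n"
    )


source = render_runner("/out/cache", 0, [], 60)
compile(source, "v2ecoli_run.py", "exec")  # syntax check only; nothing is imported or run
```

Using repr for each value keeps strings quoted and lists bracketed, so the rendered script stays valid for any feature list.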
4. SLURM dispatch¶
singularity exec \
  --bind /experiment:/experiment \
  --bind /projects/SMS/sms_api/prod/compose/cache:/out/cache \
  /path/to/container.sif \
  python /experiment/v2ecoli_run.py
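The dispatch command is a fixed prefix plus one --bind pair per mounted directory. A small sketch of assembling it as an argument list is shown here. The helper is illustrative only, and the container path is the same placeholder used above.

```python
import shlex
from typing import Dict, List


def build_singularity_command(binds: Dict[str, str], image: str, script: str) -> List[str]:
    """Argument list for `singularity exec` with one --bind per host:container pair."""
    command = ["singularity", "exec"]
    for host_path, container_path in binds.items():
        command += ["--bind", f"{host_path}:{container_path}"]
    command += [image, "python", script]
    return command


command = build_singularity_command(
    {
        "/experiment": "/experiment",
        "/projects/SMS/sms_api/prod/compose/cache": "/out/cache",
    },
    "/path/to/container.sif",  # placeholder path, as in the command above
    "/experiment/v2ecoli_run.py",
)
printable = shlex.join(command)  # single shell-quoted string, e.g. for a batch script
```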
5. Results¶
Output is written to /experiment/output/ inside the container (bind-mounted
from the HPC filesystem) and zipped to results.zip for download. Primary outputs:
- final_state.json: complete cell state at end of simulation (~14 MB typical)
- Time-series data for observed quantities (if observables specified)
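Once results.zip is downloaded, the final state can be read without unpacking the archive. The sketch below uses only the standard library. It assumes final_state.json sits at the archive root, which the source does not confirm, so the member name is a parameter.

```python
import json
import zipfile
from typing import Any


def load_final_state(zip_path: str, member: str = "final_state.json") -> Any:
    """Read and parse one JSON member from a results archive.

    Assumption: the member sits at the archive root; adjust `member` if it is nested.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as handle:
            return json.load(handle)
```

For example, load_final_state("./ecoli_output/results.zip") would return the parsed state if the archive layout matches that assumption. Listing zipfile.ZipFile(path).namelist() first shows the actual member names.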
The Biological Processes¶
v2ecoli decomposes whole-cell behavior into 55 Process and Step instances.
Key subsystems:
| Subsystem | Processes | Biology |
|---|---|---|
| Gene expression | Transcription, Translation, RnaMaturation | mRNA synthesis, protein synthesis, RNA processing |
| Metabolism | Metabolism, PolypeptideElongation | Flux balance analysis, elongation rates |
| Regulation | TfBinding, TfUnbinding, EquilibriumModel | Transcription factor dynamics |
| Replication | ChromosomeCondensation, ChromosomeReplication | DNA replication fork tracking |
| Division | Division, BulkDivision | Cell division and molecule partitioning |
| Signaling | TwoComponentSystem | Histidine kinase signal transduction |
| Structure | Complexation | Protein complex assembly |
Each process declares typed ports and wires to shared stores via the
process-bigraph wiring layer. The allocate_core() function registers all
of them into a link_registry at API startup — which is also what powers
the Process Runtime endpoints.
ParCa Cache¶
v2ecoli requires a pre-computed Parameter Calculator (ParCa) cache:
| File | Contents |
|---|---|
| | Baseline cell state (molecule counts, growth rates) |
| | Serialized configs for all 55 biological processes |
| | Source file hash for staleness detection |
The cache is hosted on the HPC filesystem at
/projects/SMS/sms_api/prod/compose/cache/ and bind-mounted at /out/cache
inside the container.
Warning
The cache must be regenerated whenever the v2ecoli package is updated.
A stale cache causes a StaleCacheError on simulation startup. The
verify_cache_version() check in v2ecoli catches this and gives a clear
error message before the simulation attempts to run.
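Hash-based staleness detection generally works by comparing a digest stored with the cache against a digest of the current sources. The sketch below shows that pattern with hypothetical names. It is not the implementation of verify_cache_version(), and ToyStaleCacheError is a stand-in rather than v2ecoli's StaleCacheError.

```python
import hashlib
from typing import Iterable


class ToyStaleCacheError(RuntimeError):
    """Hypothetical stand-in; not the exception class defined by v2ecoli."""


def hash_sources(contents: Iterable[bytes]) -> str:
    """Digest over source file contents, in the order given."""
    digest = hashlib.sha256()
    for blob in contents:
        digest.update(blob)
    return digest.hexdigest()


def check_cache(recorded_hash: str, current_sources: Iterable[bytes]) -> None:
    """Raise if the hash stored with the cache no longer matches the sources."""
    current_hash = hash_sources(current_sources)
    if current_hash != recorded_hash:
        raise ToyStaleCacheError(
            f"cache built from {recorded_hash[:8]}, "
            f"sources now hash to {current_hash[:8]}; regenerate the cache"
        )
```

Running the check before any process is constructed is what lets a stale cache fail fast instead of producing a simulation with mismatched parameters.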
REST API Reference¶
| Method | Path | Description |
|---|---|---|
| | | Submit v2ecoli simulation |
| | | SLURM job status |
| | | Download results ZIP |
| | | Retrieve PBG document used |
Submit request body¶
{
  "duration": 60.0,
  "seed": 0,
  "interval": 1.0,
  "features": [],
  "cache_dir": "/out/cache"
}
Response¶
{
  "simulation_database_id": 42,
  "simulator_database_id": 7,
  "status": "submitted"
}
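A client can model the request and response bodies above with a small dataclass and a parser. This is a sketch under stated assumptions: the class and function names are hypothetical, the defaults are simply the example values shown above (not confirmed API defaults), and no endpoint path is assumed.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SubmitRequest:
    """Fields of the submit request body shown above, with the example values as defaults."""

    duration: float = 60.0
    seed: int = 0
    interval: float = 1.0
    features: List[str] = field(default_factory=list)
    cache_dir: str = "/out/cache"

    def to_json(self) -> str:
        """Serialize to the JSON object expected as the request body."""
        return json.dumps(asdict(self))


def simulation_id_from_response(body: str) -> int:
    """Extract simulation_database_id from a response body shaped like the example."""
    return int(json.loads(body)["simulation_database_id"])
```

The JSON string from SubmitRequest(seed=5).to_json() can be sent with any HTTP client. Whether simulation_database_id is the same ID the CLI prints is not stated in the source and should be verified.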