BNMC Site Banner

22 March 2006

BNMC Stochastics Meeting



Meeting Goals

On 22 March 2006, the BNMC organized a small workshop for Caltech and closely allied people involved in stochastic simulation. The goal was to have participants meet and discuss what they are doing, learn from others who may be working on similar problems, possibly start new projects and collaborations, and generally help the BNMC shape its goals with respect to stochastic simulation theory and development.

Summary of Meeting Presentations

Mike Hucka opened the meeting with a brief introduction focused on the Beckman Institute’s BNMC itself. [Slides available in PDF format.] The BNMC is a collaboration between multiple Caltech groups and individuals, and its aim is to foster the sharing of experience and infrastructure, with the ultimate goal of bringing together biology, theory and computation in the service of research in biology. The BNMC is jump-starting its research efforts by using existing engagements in the areas of developmental modeling in plants (especially the Computable Plant project), software infrastructure for systems biology (especially SBML, BioModels.net, SBGN), and multiscale stochastic modeling (with Dan Gillespie, Linda Petzold, John Doyle, CACR, and others). The Research Activities area of the BNMC web site provides more information about these efforts.

John Doyle then summarized his and his group’s efforts in the simulation area. Dynamical models produce behaviors over time, and a simulation produces a sample path through the space of all possible behaviors. Simulation is always going to be a central tool because it is needed to generate these sample paths that can then be compared to actual data from the system being modeled. Howerver, simulations suffer from issues of scalability (simulations can be expensive, and you may need to do many of them). Moreover, they are not a good means of obtaining information about rare and catastrophic events, which is actually what often a researcher is actually most interested in finding out. Rare events in a cell’s lifetime might be, for example, a cellular event that leads to the cell becoming cancerous. Using simulations to find such rare events can be prohibitively expensive, so we need more tractable ways of searching for them. This is actually a common challenge throughout technology and biology.

On the simulation front: Many systems are multiscale in nature. In biological simulations, this happens when there are chemical constituents that are in small numbers in a cell as well as other chemical constituents that are in large numbers. The ones in small numbers often need to be simulated in a way that takes into account their individual and stochastic interactions, whereas the ones in large numbers can be simulated in a way that glosses over the details of individual molecular interactions in favor of a continuous deterministic description. This requires new theory and software. To address the question of efficient simulations of multiscale systems, John got Dan Gillespie and Linda Petzold together. They have been steadily producing new advances.

On going beyond simulation: John’s approach is focused on the idea that failures of robust behaviors indicate a weakness (fragility) in the system. The question then becomes identifying these failures. The Doyle group is focusing on using proofs, with the idea that given a description of the system and the problem, a short proof indicates robustness and a long proof may indicate fragility.


Grant Jensen then discussed his group’s work on three-dimensional imaging of cellular environments. He would like to be able to use simulation as a plausibility check of hypotheses about the systems’ construction. For example, they have been studying carboxysomes, which are structures in cyanobacteria that are believed to concentrate carbon, contributing to cyanobacteria’s ability to fixate carbon from the environment. Grant would like to model these carboxysomes in order to understand their structure and how they function. Pyruvate dehydrogenase is another cellular feature they study. Pyruvate dehydrogenase is an enzyme that catalyzes oxidative decarboxylation of pyruvate and is involved in mitochondrial metabolism. In different organisms, it is organized differently. For example, in some organisms diffusion is important in the process, whereas in others like E. coli, the enzymes are linked together and tethered to a structure in the cell. How do these differences affect the function?

Dan Gillespie then presented a summary of the stochastic simulation algorithm (SSA), as well as a framework that links stochastic discrete models to continuous deterministic formulations. His very lucid slides are available in PDF format.

Linda Petzold discussed her collaborative work with Dan and her group on stochastic simulation methods, starting with tau-leaping and moving towards hybrid methods. Tau-leaping is an approximate algorithm: whereas in SSA every reaction event is simulated, tau-leaping trades off some of that exactness to speed up execution by skipping some reaction events. Tau-leaping is actually exact when the propensity functions are constant, but in real models the propensity functions are usually not. The tau interval used in leaping is selected so that the propensity functions do not change “appreciably” in the time interval chosen. The tau-leaping algorithm tends towards the forward Euler method in the ODE and SDE regimes, but the explicit nature of the scheme makes it inefficient for stiff systems.

In stiff systems, there is a mixture of fast and slow time scales of the variables, with the fast time scales being stable. It is the slow reactions that determine the trajectory of the system, but if done naively, a simulation will have to use extremely short time steps (=> long simulation times) because of the fast time scales. The way this is approached in ODE solvers is to use implicit methods, and this is the basis of the implicit tau-leap algorithm. In this algorithm, the distribution of the fast species can still be recovered by taking a few small time steps whenever the information is needed. So far, implicit tau-leaping appears to represent the solutions of the slow time scales accurately, although they are still working on providing more theoretical justification.

Their ultimate goal is a hybrid method, in which slow reactions present in small numbers would be simulated using the SSA, and reactions where the constituents are all large populations would be simulated with a continuous deterministic approach. Reactions involving species present in moderate numbers could be treated with tau-leaping. Fast reactions involving species present in small numbers could be treated with a stochastic partial equilibrium approximation.

Linda’s group has been implementing their tests and making the code available in their StochKit stochastic simulation toolkit.

Eric Mjolness discussed his work on Stochastic Dynamical Grammars. The background for this is that biochemical reaction networks provide a paradigm for many dynamical systems in biology. The paradigm can be generalized to describe “variable-structure systems” in which objects larger than molecules (such as cells) also change in number and in their relationships over time. In the course of building mathematical and software tools for understanding such networks and dynamical systems, we have identified some interesting applied mathematical problems whose reformulation and solution would be very useful in current computational biology. For example, we can identify partial differential equations whose solution would be especially instructive for enzyme kinetics. These problems arise at the level of small reaction networks, multimolecular complexes, and the development of multicellular tissues. Developmental examples include modeling the shoot meristem of a plant.

A common mathematical framework for models at these different spatial scales can be given in terms of “dynamical grammars”. In a dynamical grammar, an input/output syntax for an elementary chemical or biological processes is mapped to an operator algebra expression for the generator of the temporal dynamics associated with that process. Many processes act simultaneously (in parallel) if their generators are summed. Contingent spatial relationships are expressed in terms of dynamical graph grammars, whose formulation could perhaps be improved by use of ideas from topology and differential geometry. By solving such problems, Eric’s group hopes to construct a useful modeling language of sufficient generality to describe multiscale, variable-structure dynamical systems that arise naturally in biology.

Philip DuToit described work he has been doing in Jerry Marsden’s group. They have been using variational integrators to study unraveling of DNA.


Mary Kennedy and her group have been studying studying biochemical signal transduction systems in central nervous system synapses. They have focused on a complex of signaling proteins, called the postsynaptic density (PSD), located just underneath excitatory receptors in the central nervous system. They have showed that it contains signal transduction molecules that can control the sensitivity of transmitter receptors, the size of receptor clusters, and perhaps the integrity of the adhesion junction that holds presynaptic terminals in place. Employing a combination of microchemical and recombinant DNA methods, they have also determined the structure of several proteins associated with the PSD. They are currently studying the associations of these proteins with each other and their specific roles in controlling synaptic transmission, with the ultimate goal being to illuminate the ways that synapses change their strength to encode memories.

Because many (they think most) of the important signaling molecules in the PSD have now been identified and sequenced, Mary’s group has turned to the question of how they work together to create the delicate mechanisms of synaptic plasticity. An aspect of their approach to this involves development of computer simulations of synaptic signaling to help understand how the large number of signaling molecules present at the synapse may work together. They have been using MCell, a software package for stochastic simulation of biochemical processes in vivo. MCell was originally developed to model the events at the neuromuscular junction: acetylcholine release, diffusion, binding to acetylcholine receptors and the nicotinic acetylcholine channel kinetics. However, in collaboration with the MCell group at the Salk Institute, Mary’s group has been extending it so that it can be used for modeling of other biochemical processes. MCell uses a stochastic approach to simulating diffusion of individual molecules (ions) and their reactions in an arbitrarily shaped space. It requires the number and positions of all molecules, diffusion constants, binding and unbinding constants for all reactions and a description of the geometry where the process that is simulated takes place. Diffusion is performed in MCell as a random walk. During one time step each diffusing molecule moves in a randomly chosen direction and travels a distance that is randomly chosen from the theoretical diffusion distribution for the given diffusion coefficient of the molecule and the time step. Therefore, contrary to the majority of random walk algorithms, MCell does not use a lattice for diffusion. Reactions are also treated as stochastic events. The probability that two individual molecules bind is calculated from the binding rate. Only the molecules that come close enough during diffusion, as determined by the ray tracing algorithm, can possibly bind. Dissociation probability is determined from the dissociation rate. Virtually any geometry within which diffusion and reaction take place can be specified, which gives MCell a significant advantage in simulation of in vivo processes over the traditional approaches based on partial differential equations.

Discussions

After the informal presentations, free-ranging discussions ensued. Some points raised include the following:

  • There seem to be two perspectives in this area: (1) assume that chemical substances can be treated as billiard balls, like a dilute gas, and (2) assume that the spatial environment is highly constrained spatially and take structure/space into account.
  • It is surely a mistake to assume that all spatially homogeneous problems can be described using the same language or models. Eric Mjolsness suggests we may be able to semantically integrate different descriptions.
  • We may want to organize work around some representative case studies, to focus discussions and continued research.
  • There are other stochastic simulation packages available and people working in the area, and also involving combining stochastic simulation with spatial geometry (like MCell does). These people were not really well represented here. Perhaps it would be a good idea to organize a workshop on the topic and invite others, such as Steve Andrews (Smoldyn), Steve Plimpton (ChemCell), Tom Bartol and Joel Stiles (MCell), and others.