
To understand biology and cure disease, we must study the complex dynamical systems created by interactions between gene products, which collectively generate phenotype. We need to reason about the dynamical processes by which cells transform environmental cues into internal signals and convert them into actions that determine cell fate.
Two challenges stand out:
The research program we pursue in partnership with Vincent Danos (Edinburgh) revolves around an agent-based (or rule-based) view of biological molecules and their actions (*). This approach is analogous in spirit to how reactions are represented in organic chemistry, but more attuned to the needs of molecular biologists. In chemistry, the internal structure of molecules is expressed in a formal language, and chemical reactions are codified as rules describing how functional groups engage in specific transformations. A rule describes only the structural context required for an interaction and leaves the rest unspecified (for example, the side chain of an amino acid).
We use a formal language - Kappa, originally proposed by Vincent Danos and Cosimo Laneve - to express proteins in terms of "sites" that represent interaction capabilities. Such capabilities carry "state", such as binding (as in complex formation), any number of post-translational modifications, or information about localization. Rules then formally express empirically obtained facts about protein-protein interactions. They specify the state of sites only to the extent necessary to stipulate the conditions for interaction.
Similar approaches have been taken before, most notably in BioNetGen, with the intent of using rules as an aid in automatically generating large systems of differential equations. Our philosophy differs in that we specify a system as a set of rules, analyze that set directly (deploying techniques from abstract interpretation), and use it to drive a stochastic simulation without ever writing an equation. The system of rules replaces the system of equations as the formal object to be analyzed. Indeed, preliminary representations of EGF signaling in terms of 300+ interaction rules would, if expanded, yield a number of differential equations exceeding Avogadro's number!
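The idea that a rule mentions only the sites it depends on can be illustrated with a minimal Python sketch. This is not Kappa syntax and not the implementation of the Kappa tools; the names `Agent`, `Rule`, and `matches`, the receptor "R", and its sites "l", "Y1", and "Y2" are all hypothetical, and binding is simplified to a plain state label rather than an explicit bond between two agents.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A protein as a named bundle of sites, each holding a state."""
    name: str
    sites: dict = field(default_factory=dict)


def matches(agent, name, pattern):
    """True if the agent has the given name and agrees with the pattern
    on every site the pattern mentions; unmentioned sites are ignored."""
    return agent.name == name and all(
        agent.sites.get(site) == state for site, state in pattern.items()
    )


@dataclass
class Rule:
    """Rewrite a single agent: lhs is the required context, rhs the change."""
    agent_name: str
    lhs: dict
    rhs: dict

    def apply(self, agent):
        """Apply the rule in place if the context matches; report success."""
        if not matches(agent, self.agent_name, self.lhs):
            return False
        agent.sites.update(self.rhs)
        return True


# Hypothetical rule: a receptor whose site "l" is bound and whose site "Y1"
# is unphosphorylated ("u") becomes phosphorylated ("p") at "Y1".
# The rule says nothing about site "Y2".
phosphorylate = Rule("R", lhs={"l": "bound", "Y1": "u"}, rhs={"Y1": "p"})

r1 = Agent("R", {"l": "bound", "Y1": "u", "Y2": "u"})
r2 = Agent("R", {"l": "bound", "Y1": "u", "Y2": "p"})  # differs only at Y2
r3 = Agent("R", {"l": "free", "Y1": "u", "Y2": "p"})   # context not satisfied

results = [phosphorylate.apply(a) for a in (r1, r2, r3)]
print(results)
```

In this sketch a single rule covers both `r1` and `r2`, which differ only in a site the rule never mentions. An equation-based description would need a separate variable, and a separate reaction, for each of those fully specified states.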
The current implementation of Kappa tools includes:
We apply these tools in the study of large and combinatorially complex signaling systems (such as EGF or mTOR), and exploit them in more theoretical studies aimed at understanding principles of molecular information processing.
This framework is being developed further, particularly by Jérôme Feret and Jean Krivine. Its robust implementation was made possible by bringing together a world-class team of computer scientists - Vincent Danos, Jérôme Feret, Jean Krivine, Russ Harmer, and many others - in the context of a venture-backed company, Plectix BioSystems, that WF founded in 2005.
(*)
The main conceptual significance of an agent-based view is the ability to track "agent lineages". For example, in an agent-based view, an enzyme-substrate complex is explicitly represented as consisting of an enzyme agent and a substrate agent. In contrast, within the framework of differential equations, the compositional structure of an enzyme-substrate complex is not represented at all. The kinetic equations refer to the complex simply through a variable (holding a concentration value) whose name is arbitrary and of no formal significance. If variable names have any structure, it serves only as a mnemonic device. The practical significance of agent-based models lies in (i) breaking through the combinatorial barrier and (ii) codifying empirical facts as executable (because formal) rules that describe protein behaviors.
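Both points can be made concrete with a small Python sketch. The functions and names below (`species_count`, `rule_count_independent`, `Agent`, the identifiers 1 and 7, and the concentration value) are illustrative assumptions; the counts assume sites that are modified independently of one another and are not taken from any actual model.

```python
from dataclasses import dataclass

AVOGADRO = 6.02214076e23


def species_count(n_sites, states_per_site=2):
    """Number of fully specified molecular states for n independent sites;
    an equation-based model needs one variable per such state."""
    return states_per_site ** n_sites


def rule_count_independent(n_sites):
    """Rules needed if each site is modified independently of the others:
    one rule for the modification and one for its reversal, per site."""
    return 2 * n_sites


# Smallest number of two-state sites whose state count exceeds Avogadro's number.
n = 1
while species_count(n) <= AVOGADRO:
    n += 1
print(n, species_count(n), rule_count_independent(n))


@dataclass(frozen=True)
class Agent:
    """An individual molecule with a kind and a hypothetical identity tag."""
    kind: str
    uid: int


# Agent-based view: the complex is a collection of identifiable agents,
# so its composition (and each agent's history) remains accessible.
enzyme = Agent("E", 1)
substrate = Agent("S", 7)
complex_es = frozenset({enzyme, substrate})
kinds_in_complex = sorted(a.kind for a in complex_es)

# Equation-based view: the complex is a concentration under an opaque name.
# Renaming "ES" to "x3" would change nothing in the model.
ode_state = {"ES": 1.0e-9}
```

The state count grows exponentially with the number of sites while the rule count, under the independence assumption, grows linearly. The second half of the sketch shows that the agent-based complex can be asked which agents it contains, whereas the key "ES" in the equation-based state carries no formal structure.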