Thursday, March 29, 2012

Thermodynamic State Ensemble Models of cis-Regulation

Thermodynamic State Ensemble Models of cis-Regulation:
by Marc S. Sherman, Barak A. Cohen



A major goal in computational biology is to develop models that accurately predict a gene's expression from its surrounding regulatory DNA. Here we present one class of such models, thermodynamic state ensemble models. We describe the biochemical derivation of the thermodynamic framework in simple terms, and lay out the mathematical components that comprise each model. These components include (1) the possible states of a promoter, where a state is defined as a particular arrangement of transcription factors bound to a DNA promoter, (2) the binding constants that describe the affinity of the protein–protein and protein–DNA interactions that occur in each state, and (3) whether each state is capable of transcribing. Using these components, we demonstrate how to compute a cis-regulatory function that encodes the probability of a promoter being active. Our intention is to provide enough detail so that readers with little background in thermodynamics can compose their own cis-regulatory functions. To facilitate this goal, we also describe a matrix form of the model that can be easily coded in any programming language. This formalism has great flexibility, which we show by illustrating how phenomena such as competition between transcription factors and cooperativity are readily incorporated into these models. Using this framework, we also demonstrate that Michaelis-like functions, another class of cis-regulatory models, are a subset of the thermodynamic framework with specific assumptions. By recasting Michaelis-like functions as thermodynamic functions, we emphasize the relationship between these models and delineate the specific circumstances representable by each approach. Application of thermodynamic state ensemble models is likely to be an important tool in unraveling the physical basis of combinatorial cis-regulation and in generating formalisms that accurately predict gene expression from DNA sequence.

No comments:

Post a Comment