Properties of Boolean networks and methods for their tests
- Johannes Georg Klotz^{1},
- Ronny Feuer^{2},
- Oliver Sawodny^{2},
- Martin Bossert^{1},
- Michael Ederer^{2} and
- Steffen Schober^{1}
https://doi.org/10.1186/1687-4153-2013-1
© Klotz et al.; licensee Springer. 2013
- Received: 6 January 2012
- Accepted: 26 November 2012
- Published: 11 January 2013
Abstract
Transcriptional regulation networks are often modeled as Boolean networks. We discuss certain properties of Boolean functions (BFs) that are considered important in such networks, namely membership in the classes of unate or canalizing functions. Of further interest is the average sensitivity (AS) of functions. In this article, we discuss several algorithms to test the properties of interest. To test canalizing properties of functions, we apply spectral techniques, which can also be used to characterize the AS of functions as well as the influences of variables in unate BFs. Further, we provide and review upper and lower bounds on the AS of unate BFs based on the spectral representation. Finally, we apply these methods to a transcriptional regulation network of Escherichia coli, which controls central parts of the E. coli metabolism. We find that all functions are unate. The analysis of the AS of the network also reveals an exceptional robustness against transient fluctuations of the binary variables.^{a}
Keywords
- Regulatory Boolean networks
- Boolean networks
- Linear threshold functions
- Unate functions
- Canalizing function
- Sensitivity
- Average sensitivity
- Restricted functions
- Escherichia coli
1 Introduction
Boolean modeling is often used to describe signal transduction and regulatory networks [1–3]. In recent years, random Boolean models have received much attention as a way to find generic properties that characterize regulatory networks. In addition to the study of topological features (e.g., [4]), the choice of Boolean functions in such networks is an important question to consider. Many results indicate the importance of functions with a low average sensitivity. For example, it is well known that a low expected average sensitivity is a prerequisite for non-chaotic behavior of random Boolean networks (e.g., [5, 6]). Further, so-called canalizing functions have been conjectured to be characteristic for biological networks [7]. These functions have a stabilizing effect on the network dynamics [1], and many functions occurring in (non-random) regulatory networks are canalizing [7].
In this work we follow a non-random approach to find properties characterizing regulatory networks. Namely, we focus on the properties of Boolean functions in a large scale Boolean regulatory network model. Our goal is also to provide efficient algorithms to test these properties.
First, we consider the membership of the regulatory functions in certain classes of functions. We begin with unate functions, which are monotone in each of their variables and were shown to be implied by a biochemical model [2].
Next, we present a test using Fourier analysis to test canalizing properties of functions. Canalizing functions are used in signal processing for certain classes of filters [8] and play an important role in random and regulatory Boolean networks, as already mentioned. Interestingly, it has been shown in [9] that a subclass of canalizing functions, namely the nested canalizing functions, is identical to the class of unate-cascade functions, a subclass of the unate functions. The test presented in this work is inspired by [10], where the so-called forcing transform was introduced to test the membership of a function in the class of canalizing functions. Here, we generalize this approach to the Fourier transform, which is more intuitive and natural; furthermore, some spectral properties of canalizing functions have already been investigated in [11].
It is well known that the average sensitivity can be directly obtained from the Fourier spectral coefficients. Further, the Fourier transform turns out to be useful to prove bounds on the average sensitivity. We derive an upper bound for unate functions similar to known results for monotone functions and recall a well-known lower bound on the average sensitivity.
Finally, we apply our tests to a large-scale Boolean model of the transcriptional network of Escherichia coli. We extended the network model of the transcriptional network of E. coli (Covert et al. [3]) by mapping genes to their corresponding fluxes in the flux-balance model presented by [12]. The network has a layered feed-forward structure and shows characteristic topological features, such as a long-tail like out-degree distribution.
Throughout this article we use Fourier analysis to investigate the mentioned properties. In particular, we use the concept of restricted functions. To this end, we derive two-way relations between the Fourier coefficients of a Boolean function and those of its restriction. A very general one-way version of this relation can be found in [13].
The remainder of this article is organized as follows: In Section 2, we give a short introduction to Boolean functions and networks, discuss some fundamentals of Fourier analysis and investigate the spectra of restricted functions. In Section 3, we discuss certain classes and properties of Boolean functions and show efficient ways to check these properties. We also introduce the average sensitivity and prove an upper bound on it for unate functions. In Section 4, we finally introduce Boolean networks and apply our methods and tests to the regulatory network of E. coli. Some final remarks are given in Section 5.
2 BFs
A BF f:{−1,1}^{ n }→{−1,1} maps n-ary binary input tuples to a binary output. In general, not all variables of a function f are relevant. A variable i is called relevant if there exists at least one argument x∈{−1,1}^{ n } such that f(x)≠f(x⊕e_{ i }), where the argument x⊕e_{ i } is obtained from x by changing its i-th entry. In the following, we denote the number of relevant variables by k.
For the sake of simplicity, we assume throughout this article that k=n, i.e., all variables are relevant, but note that the expositions in Section 2.1 are valid in general. The assignment of +1 and −1 chosen to represent the binary inputs and outputs is somewhat arbitrary. One can interpret the value −1 as “ON” or “TRUE” and +1 as “OFF” or “FALSE”.
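The definition of a relevant variable can be checked by brute force on small functions. The following is a minimal sketch, assuming a BF is given as a Python callable on tuples with entries in {−1, 1}; the names `relevant_variables` and `and_ignoring_x2` are illustrative and not part of the original work.

```python
from itertools import product

def relevant_variables(f, n):
    """Return the indices i for which some x has f(x) != f(x with entry i flipped)."""
    relevant = []
    for i in range(n):
        for x in product((-1, 1), repeat=n):
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            if f(x) != f(flipped):
                relevant.append(i)
                break  # one witness x is enough for variable i
    return relevant

# Hypothetical example: AND of x0 and x1 in the "-1 = TRUE" convention; x2 is ignored.
def and_ignoring_x2(x):
    return -1 if (x[0] == -1 and x[1] == -1) else 1

print(relevant_variables(and_ignoring_x2, 3))  # [0, 1]
```

The search costs on the order of n·2^n function evaluations, so it is only meant for functions with a small number of inputs.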
2.1 Fourier analysis
which directly follows from the definition of Φ_{ U }(Equation 3).
2.2 Restricted functions
The following lemma gives a relation between the Fourier coefficients of the original function and its restriction.
where U⊆[n]∖{i} and ${\Phi}_{\left\{i\right\}}\left({a}_{i}\right)=\frac{{a}_{i}-{\mu}_{i}}{{\sigma}_{i}}$.
due to ${\Phi}_{\left\{i\right\}}\left({a}_{i}\right)=\frac{{a}_{i}-{\mu}_{i}}{{\sigma}_{i}}=\frac{{\sigma}_{i}}{{a}_{i}+{\mu}_{i}}$.
which is the definition of the Fourier coefficients from Equation (4) and concludes the proof. □
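The identity for Φ_{{i}}(a_{i}) used in the proof can be verified in one line. The sketch below assumes only that a_{i}∈{−1,1}, that μ_{i} and σ_{i} are the mean and standard deviation of the ±1-valued variable x_{i}, and that |μ_{i}|<1 so that no denominator vanishes.

```latex
\sigma_i^2 = \mathbb{E}[x_i^2] - \mu_i^2 = 1 - \mu_i^2 = a_i^2 - \mu_i^2
           = (a_i - \mu_i)(a_i + \mu_i)
\quad\Longrightarrow\quad
\frac{a_i - \mu_i}{\sigma_i} = \frac{\sigma_i}{a_i + \mu_i}.
```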
A closely related property is given by the following proposition. Please note that this result for uniformly distributed input variables can also be obtained using ([13], Lemma 2.17).
by definition, hence, the proposition follows from Equation (1). □
For the general case in which a BF is restricted in more than one input, the following corollary to Proposition 1 applies:
where U contains the indices for the Fourier coefficients of the restricted functions, i.e., U⊆[n]∖K and a is a vector containing all a_{ i },i∈K.
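For uniformly distributed inputs (μ_{i}=0, σ_{i}=1), Φ_{{i}}(a_{i}) reduces to a_{i}, and the relation between a function and its restriction takes the familiar form that the coefficient of the restricted function at U equals f̂(U) + a_{i}·f̂(U∪{i}). The following sketch checks this special case numerically. It assumes a truth-table encoding in which bit i of the table index is set when x_{i}=−1; this encoding and all function names are illustrative choices, not the authors' implementation.

```python
def walsh_spectrum(table):
    """Normalised fast Walsh-Hadamard transform of a +/-1 truth table of length 2**n."""
    coeffs = list(table)
    size, h = len(coeffs), 1
    while h < size:
        for start in range(0, size, 2 * h):
            for j in range(start, start + h):
                a, b = coeffs[j], coeffs[j + h]
                coeffs[j], coeffs[j + h] = a + b, a - b
        h *= 2
    return [c / size for c in coeffs]

def expand(index, i):
    """Insert a zero bit at position i of index (maps reduced to full indices)."""
    low = index & ((1 << i) - 1)
    return ((index >> i) << (i + 1)) | low

def restrict(table, i, a):
    """Truth table of f with x_i fixed to a (bit i set in an index means x_i = -1)."""
    bit = (1 << i) if a == -1 else 0
    return [table[expand(j, i) | bit] for j in range(len(table) // 2)]

def restriction_identity_holds(table, i, a):
    """Check that the restricted coefficient at U equals f^(U) + a * f^(U + {i})."""
    full = walsh_spectrum(table)
    reduced = walsh_spectrum(restrict(table, i, a))
    return all(
        reduced[u] == full[expand(u, i)] + a * full[expand(u, i) | (1 << i)]
        for u in range(len(reduced))
    )

# Majority of three inputs: output -1 (TRUE) when at least two inputs are -1.
majority3 = [1, 1, 1, -1, 1, -1, -1, -1]
print(all(restriction_identity_holds(majority3, i, a)
          for i in range(3) for a in (-1, 1)))  # True
```

All coefficients are multiples of a power of two, so the floating-point comparison is exact for tables of this size.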
3 Classes and properties of functions
In this section, we present and discuss some classes of BFs, namely unate and canalizing functions. Further, we discuss properties of functions that characterize their robustness, such as the AS.
3.1 Unate functions
A BF is unate if it is monotone (either increasing or decreasing) in each of its variables; a precise definition is given below. The class of unate functions is a simple extension of the class of monotone functions, which is defined as follows.
Definition 1. A BF f:{−1,1}^{ n }→{−1,1} is called monotone, if for each i∈{1,…,n} it holds that f(x_{1},…,x_{ i }=−1,…,x_{ n })≤f(x_{1},…,x_{ i }=1,…,x_{ n }).
Now unate functions can be defined as follows.
Definition 2. A BF f is unate, if there exists a vector a∈{−1,1}^{ n } such that the function f(a_{1}·x_{1},…,a_{ n }·x_{ n }) is monotone.
The class of unate functions is closed with respect to restriction, since every restriction of a unate (i.e., locally monotone) function again yields a unate function.
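Definitions 1 and 2 translate directly into a brute-force test: a function is unate exactly when no variable is both increasing for one assignment of the other variables and decreasing for another. The following is a minimal sketch under the assumption that the BF is a Python callable on ±1 tuples; the name `unate_signs` is hypothetical.

```python
from itertools import product

def unate_signs(f, n):
    """Return a sign vector a as in Definition 2 if f is unate, else None."""
    signs = []
    for i in range(n):
        increases = decreases = False
        for rest in product((-1, 1), repeat=n - 1):
            lo = f(rest[:i] + (-1,) + rest[i:])  # x_i = -1
            hi = f(rest[:i] + (1,) + rest[i:])   # x_i = +1
            increases |= lo < hi
            decreases |= lo > hi
        if increases and decreases:
            return None  # f is neither increasing nor decreasing in x_i
        signs.append(-1 if decreases else 1)
    return signs

print(unate_signs(lambda x: max(x[0], -x[1]), 2))  # [1, -1]
print(unate_signs(lambda x: x[0] * x[1], 2))       # None (parity is not unate)
```

For a unate function, the returned vector a makes f(a_{1}·x_{1},…,a_{n}·x_{n}) monotone in the sense of Definition 1.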
To test whether a function is unate, it is sufficient to use the definition. However, a necessary condition for a function to be unate is given by the following proposition:
3.2 Canalizing functions
for all x_{1},…,x_{i−1},x_{i+1},…,x_{ n }, where b_{ i }∈{−1,1} is a constant. If the restricted function, which is obtained by setting x_{ i }=−a_{ i }, is again canalizing, and so on, the function is called nested canalizing.
The following propositions give a relation between the Fourier coefficients and the canalizing property.
where μ_{ i } is the expected value of x_{ i } and σ_{ i } the corresponding standard deviation.
and the proposition follows from Equation (11). □
A similar result, namely the calculation of the Fourier coefficients of a canalizing BF from the coefficients of the restricted functions $\widehat{f}{|}_{{x}_{i}={a}_{i}}\left(\mathbf{U}\right)$, is addressed in [11]. These results can also be obtained using Proposition 2.
Proposition 4 can easily be extended for nested canalizing functions:
Proof. The proof follows from Corollary 1 and Proposition 4. □
From Proposition 4 it is clear that the canalizing property can be tested by considering all Fourier coefficients of order one. Using the Fast Walsh Transform [17], this test is as fast as the one presented in [10]; however, once we have retrieved the spectrum of a function, we can easily compute other properties, such as the AS (see next section).
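A first-order spectral test of this kind can be sketched as follows for uniformly distributed inputs. The sketch relies on the standard identity E[f | x_{i}=a] = f̂(∅) + a·f̂({i}): the restriction of f to x_{i}=a is constant exactly when this conditional mean equals ±1. The truth-table encoding (bit i of the index set when x_{i}=−1) and the function names are assumptions made for illustration, not the authors' implementation.

```python
def walsh_spectrum(table):
    """Normalised fast Walsh-Hadamard transform of a +/-1 truth table of length 2**n."""
    coeffs = list(table)
    size, h = len(coeffs), 1
    while h < size:
        for start in range(0, size, 2 * h):
            for j in range(start, start + h):
                a, b = coeffs[j], coeffs[j + h]
                coeffs[j], coeffs[j + h] = a + b, a - b
        h *= 2
    return [c / size for c in coeffs]

def canalizing_inputs(table):
    """List (i, a, b): fixing x_i = a forces the output to b (uniform inputs assumed)."""
    n = len(table).bit_length() - 1
    spec = walsh_spectrum(table)
    found = []
    for i in range(n):
        for a in (-1, 1):
            # Mean of f restricted to x_i = a, from the order-0 and order-1 coefficients.
            mean_restricted = spec[0] + a * spec[1 << i]
            if abs(mean_restricted) == 1:
                found.append((i, a, int(mean_restricted)))
    return found

and2 = [1, 1, 1, -1]   # AND in the "-1 = TRUE" convention
xor2 = [1, -1, -1, 1]  # parity, f(x) = x0 * x1
print(canalizing_inputs(and2))  # [(0, 1, 1), (1, 1, 1)]
print(canalizing_inputs(xor2))  # []
```

Only the coefficients of order zero and one are inspected after the transform; the remaining spectrum is available for other quantities such as the AS.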
3.3 AS of functions
The AS [18] measures the influence of random disturbances at the inputs on the output of a BF. It can be interpreted as an indicator of the robustness of the BF and, ultimately, of the whole Boolean network.
Together with a lower bound as presented in [19, 20] and since $1-\widehat{f}{\left(\varnothing \right)}^{2}=1-\mathbb{E}{\left[f\right]}^{2}=\text{Var}\left(f\right)$ we obtain the following proposition.
where Var(f) denotes the variance of f.
It can be shown that some functions get close to the upper bound. Assuming uniform distribution the upper bound in Equation (18) is smaller than $\sqrt{n}$. But it is well known that the AS of the majority function behaves like $O\left(\sqrt{n}\right)$ (see for example [21]).
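The growth of the AS of the majority function can be illustrated numerically for small n. The sketch below assumes uniformly distributed inputs and computes the AS directly as the mean number of single-entry flips that change the output; the function names are illustrative.

```python
from itertools import product

def average_sensitivity(f, n):
    """Mean number of single-entry flips that change f, over uniform x in {-1,1}^n."""
    total = 0
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            total += f(x) != f(flipped)
    return total / 2 ** n

def majority(x):
    """Majority vote of an odd number of +/-1 entries."""
    return 1 if sum(x) > 0 else -1

# Compare the AS of majority with sqrt(n) for a few odd n (n = 3 gives 1.5).
for n in (1, 3, 5, 7):
    print(n, average_sensitivity(majority, n), n ** 0.5)
```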
4 Application to a regulatory network of E. coli
In the previous sections, we only considered BFs. Now we will focus on Boolean networks (BNs). A synchronous BN of N nodes can be described by a graph G=G(V,E) with nodes V⊆[N], |V|=N, and edges E⊆V×V, and an ordered set of BFs F=(f_{1},f_{2},…,f_{ N }), where we also allow a dummy function (see below). Each f_{ i } has n_{ i }=k_{ i }=in-deg(i) relevant variables, where in-deg(i) is the in-degree of node i, i.e., the number of edges (j,i) with j∈V. In this case, node j is called a controlling node of i. If a node i has in-degree zero, the dummy function is attached and we call it an in-node. Correspondingly, the number of edges emerging from i is called the out-degree of node i. Usually, a binary state variable is assigned to each node, i.e., for node i we assign x_{ i }(t)∈{−1,+1}. For in-nodes, the state can be set by some external process at some time t_{0}. The state of every other node at time t depends on its BF and on the states of all its controlling nodes at time instant t−1.
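The synchronous update rule can be sketched in a few lines. The example assumes a hypothetical four-node network (two in-nodes, one AND node, one NOT node) and represents the dummy function of an in-node by simply leaving its state unchanged; the names `step`, `controllers` and `functions` are illustrative.

```python
def step(state, controllers, functions):
    """One synchronous update: each node reads its controlling nodes' states at time t-1."""
    new_state = dict(state)  # in-nodes (no entry in `functions`) keep their state
    for node, f in functions.items():
        inputs = tuple(state[j] for j in controllers[node])
        new_state[node] = f(inputs)
    return new_state

# Hypothetical network: nodes 0 and 1 are in-nodes, node 2 = AND(0, 1), node 3 = NOT(2).
# In the "-1 = TRUE" convention, AND is the maximum and NOT is the sign flip.
controllers = {2: (0, 1), 3: (2,)}
functions = {2: lambda inputs: max(inputs), 3: lambda inputs: -inputs[0]}

state = {0: -1, 1: -1, 2: 1, 3: 1}
state = step(state, controllers, functions)
print(state)  # {0: -1, 1: -1, 2: -1, 3: -1}
state = step(state, controllers, functions)
print(state)  # {0: -1, 1: -1, 2: -1, 3: 1}
```

Because all nodes read the old state before any node is overwritten, a change at an in-node needs one time step per layer to reach the bottom of a feed-forward network.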
4.1 Structural properties
We applied the tests described in the previous sections to the regulatory network of E. coli [3]. The model provides Boolean formulas that describe how environmental conditions act on gene expression via a transcriptional regulatory network. We extended this network by mapping the genes to their corresponding fluxes in the flux-balance model [12]. The network as described in the literature contains functions with irrelevant variables, i.e., redundant edges, which were removed. A list of the affected nodes and the removed edges can be found in Additional file 1.
We found that all functions attached to the nodes are unate. Furthermore, 2499 functions (98.8%) are canalizing. An overview of the functions that are not canalizing can be found in Additional file 1.
4.2 Robustness
Obviously, functions with a strong bias, i.e., with a high probability to be either −1 or 1, have a low AS. Further it can be seen that the average sensitivities of all functions are very close to the lower bound. The mean value of the AS is 0.918874. Hence, it can be stated that the AS of this network is rather low. Similar results can be obtained considering the network without the extension as originally defined by Covert et al. [3] and Samal and Jain [23].
In a second step, we want to take the topology of the network into account. Therefore, we now assume that only the in-nodes of the network are uniformly distributed. However, the outputs of the functions they feed will most certainly not be uniform, i.e., these functions have a nonzero bias. Since the outputs of these functions serve as inputs of the functions of the next layer, we assume that the input distributions of the next layer follow the output distributions of the first-layer functions. The output distributions of the second-layer functions then serve as input distributions of the third layer, and so on. Obviously, this has an impact on the AS of the functions.
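The layer-by-layer propagation of output distributions can be sketched as follows. The sketch makes an additional simplifying assumption that is not stated in the text, namely that the inputs of each function are statistically independent, so that each input is fully described by its mean; the name `output_mean` and the two-layer AND example are hypothetical.

```python
from itertools import product

def output_mean(f, input_means):
    """E[f(x)] for independent +/-1 inputs with the given means (assumed independence)."""
    mean = 0.0
    for x in product((-1, 1), repeat=len(input_means)):
        prob = 1.0
        for xi, mu in zip(x, input_means):
            prob *= (1 + xi * mu) / 2  # P(x_i = xi) when E[x_i] = mu
        mean += prob * f(x)
    return mean

def and_gate(x):
    """AND in the "-1 = TRUE" convention."""
    return max(x)

layer1 = output_mean(and_gate, [0.0, 0.0])        # uniformly distributed in-nodes
layer2 = output_mean(and_gate, [layer1, layer1])  # fed by two first-layer outputs
print(layer1, layer2)  # 0.5 0.875
```

In this toy example the bias grows from layer to layer, which is the kind of effect the propagated input distributions are meant to capture.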
4.3 Comparison with random ensembles
The network appears to be more robust against transient errors than, for example, certain randomly constructed networks. The in-degree distribution of all controlled nodes (in-degree larger than zero) is shown in Figure 3. For all nodes with in-degree k, we choose a random function out of the set of functions with k relevant variables. For k=1 this results in $\mathbb{E}\left[\text{as}(f)\right]=1$; for k>1 we can at least state that $\mathbb{E}\left[\text{as}(f)\right]>\frac{k}{2}$, as it is well known that a function chosen uniformly at random from all functions of k variables has an expected AS of $\frac{k}{2}$. Taking the in-degree distribution into account, this implies that the expectation of the AS of all BFs chosen in this way is larger than 1.25.
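The values k/2 and the strict inequality for functions with k relevant variables can be confirmed by exhaustive enumeration for small k. The following sketch assumes uniformly distributed inputs and a truth-table encoding in which flipping variable i corresponds to toggling bit i of the table index; the function names are illustrative.

```python
from itertools import product

def sensitivities(table, k):
    """Per-variable flip counts and the average sensitivity of a truth table."""
    flips = [sum(table[j] != table[j ^ (1 << i)] for j in range(2 ** k))
             for i in range(k)]
    return flips, sum(flips) / 2 ** k

def mean_as(k, only_all_relevant):
    """Mean AS over all functions of k variables, optionally only those with k relevant ones."""
    values = []
    for table in product((-1, 1), repeat=2 ** k):
        flips, avg = sensitivities(table, k)
        if only_all_relevant and min(flips) == 0:
            continue  # some variable never changes the output: not relevant
        values.append(avg)
    return sum(values) / len(values)

for k in (1, 2, 3):
    print(k, mean_as(k, False), mean_as(k, True))
```

For k=1 the two means are 0.5 and 1.0, and for k=2 they are 1.0 and 1.2, in line with the statements above.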
Fraction of functions with in-degree k, the mean of the AS of all functions with in-degree k, and the expectation of an accordingly chosen random function with the same in-degree and the same bias distribution (see text and Equation 19)
k | Fraction of functions | av(f) | $\mathbb{E}(\text{av}(f))$ |
---|---|---|---|
1 | 0.579905 | 1.000000 | 1.000000 |
2 | 0.179984 | 1.000000 | 1.000000 |
3 | 0.063291 | 0.887500 | 0.985714 |
4 | 0.143987 | 0.572115 | 0.623077 |
5 | 0.015427 | 0.491987 | 0.659895 |
6 | 0.006725 | 0.933824 | 1.737920 |
7 | 0.001187 | 0.796872 | 1.423026 |
8 | 0.004747 | 0.760416 | 1.641421 |
9 | 0.001187 | 0.300781 | 0.547935 |
10 | 0.000791 | 0.312500 | 0.587713 |
11 | 0.001187 | 1.009441 | 2.984577 |
12 | 0.000396 | 1.318360 | 3.481815 |
13 | 0.001187 | 0.003174 | 0.003174 |
It should be noted that in random BNs the expectation of the AS is an order parameter [5, 6]. That is, if the expectation is less than or equal to one, many random networks show the so-called ordered behavior: single transient errors introduced in network nodes (by flipping their state) do not, with high probability, spread through the network. This ordered behavior is in sharp contrast to the so-called disordered behavior of random networks, which is characterized by an expectation of the AS larger than one. Indeed, it has been conjectured that biologically relevant networks should be ordered (or critical) but not disordered [27]. A further investigation of how canalizing and nested canalizing functions influence the average sensitivity can be found in [7, 11].
4.4 Impact of mutations on the metabolism
When investigating a regulatory network, the impact of the network on the metabolism is of major interest. Hence, only the stability of nodes in the bottom layer, i.e., the output of the network, is relevant. In regulatory networks, mutations are a source of errors. We consider two possible types of mutations. First, we assume that part of the promoter region of a gene is mutated or deleted. In terms of our network, this means that an edge is removed and the corresponding input is set to false (+1). The gene may still be transcribed; hence, the node itself remains functional. The second type of mutation is the deletion of a gene or a mutation that leads to a dysfunctional gene. In this case, the node is constantly set to false. In both cases, the value of one node may change (error). This error is then propagated through the outgoing edges of this node to other nodes. However, due to the low sensitivity of all functions in the network, the error has no impact on many nodes and, therefore, will in most cases not reach the bottom layer, which is, as mentioned above, the only part of the network whose stability is crucial. From that point of view, it can be stated that these permanent errors behave similarly to the transient errors described above and that networks with a low mean AS are robust against such errors.
5 Summary
It is an important problem to characterize BFs that appear in Boolean models of regulatory networks. This will help to understand the constraints underlying such networks, but can, for example, also help to improve network inference algorithms (see, for example, [28, 29] for algorithms that utilize the membership in the class of unate functions). In this study, we focused on several properties that have been shown to be of interest in the context of Boolean regulatory networks. Namely, we discussed different classes of BFs such as unate and canalizing functions. Further, sensitivity measures of BFs, such as the influence of variables or the AS, were considered. We devised simple algorithms to test these properties. To test canalizing properties of BFs, we applied the Fourier representation of BFs, where functions are represented as multivariate, multilinear real polynomials. To this end, we introduced two spectral relationships between the so-called restricted BFs and their unrestricted counterparts. The Fourier representation is further useful as many interesting properties, such as the influence of variables in unate functions or the AS of BFs, can easily be characterized in the spectral domain. For example, we showed how to obtain theoretical upper bounds on the AS for unate functions using spectral techniques.
As an application of our results, we analyzed an extended [30] regulatory Boolean network model of the central metabolism of E. coli. It turned out that all functions are within the class of unate functions. Further, the AS of most functions is close to a theoretical lower bound and far from the new upper bound. In particular, functions with large in-degree have a low AS even if their so-called bias is close to 0.5 (see Figure 5). We compared our findings to random BNs with similar parameters and found that the investigated network has an even lower AS. From that we conclude that the whole network is stable and robust to small changes, e.g., mutations.
6 Endnote
^{a}Preliminary results of this study have been presented at the 8th International Workshop on Computational Systems Biology (WCSB 2011) and the 3rd International Conference on Bioinformatics and Computational Biology (BICoB 2011).
Declarations
Acknowledgements
The authors would like to thank Georg Sprenger and Katrin Gottlieb from the Institute for Microbiology at the University of Stuttgart for fruitful collaboration and discussions. Further we thank Reinhard Heckel for creating large parts of the software. This study was supported by the German Research Foundation “Deutsche Forschungsgemeinschaft” (DFG) under Grants Bo 867/25-1 and Sa 847/11-1.
Authors’ Affiliations
References
- Kauffman S, Peterson C, Samuelsson B, Troein C: Genetic networks with canalyzing Boolean rules are always stable. Proc. Natl Acad. Sci. USA 2004, 101(49):17102-17107. 10.1073/pnas.0407783101
- Grefenstette J, Kim S, Kauffman S: An analysis of the class of gene regulatory functions implied by a biochemical model. Biosystems 2006, 84(2):81-90. http://www.sciencedirect.com/science/article/B6T2K-4HWXP4R-3/2/61a3092f98470e99a2c33786416697d0 10.1016/j.biosystems.2005.09.009
- Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO: Integrating high-throughput and computational data elucidates bacterial networks. Nature 2004, 429(6987):92-96. http://dx.doi.org/10.1038/nature02456 10.1038/nature02456
- Aldana M: Boolean dynamics of networks with scale-free topology. Physica D 2003, 185: 45-66. 10.1016/S0167-2789(03)00174-X
- Shmulevich I, Kauffman SA: Activities and sensitivities in Boolean network models. Phys. Rev. Lett 2004, 93(4):048701.
- Mahdavi K, Culshaw R, Boucher J (Eds): Dynamics of random Boolean networks. World Scientific Publishing Co, Singapore; 2007.
- Harris SE, Sawhill BK, Wuensche A, Kauffman S: A model of transcriptional regulatory networks based on biases in the observed regulation rules. Complexity 2002, 7(4):23-40. 10.1002/cplx.10022
- Gabbouj M, Yu PT, Coyle EJ: Convergence behavior and root signal sets of stack filters. Circuits Syst. Signal Process 1992, 11: 171-193. 10.1007/BF01189226
- Jarrah AS, Raposa B, Laubenbacher R: Nested Canalyzing, Unate Cascade, and Polynomial Functions. Physica D 2007, 233(2):167-174. 10.1016/j.physd.2007.06.022
- Shmulevich I, Lahdesmaki H, Egiazarian K: Spectral methods for testing membership in certain post classes and the class of forcing functions. Signal Process. Lett. IEEE 2004, 11(2):289-292.
- Kesseli J, Rämö P, Yli-Harja O: Analyzing dynamics of Boolean networks with canalyzing functions using spectral methods. In Proceedings of the 2005 International TICSP Workshop on Spectral Methods and Multirate Signal Processing (SMMSP 2005). (Riga, Latvia, 20-22 June 2005); 151-158.
- Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol 2007, 3: 121. 10.1038/msb4100155
- Bernasconi A: Mathematical techniques for the analysis of Boolean functions. PhD thesis. University of Pisa, Italy; 1998.
- Bahadur RR: A representation of the joint distribution of responses to n dichotomous items. In Studies on Item Analysis and Prediction, no. 6 in Stanford Mathematical Studies in the Social Sciences. Edited by: Solomon H. Stanford University Press, Stanford, CA; 1961:158-176.
- Furst ML, Jackson JC, Smith SW: Improved learning of AC0 functions. In Proceedings of the Fourth Annual Workshop on Computational Learning Theory. Morgan Kaufmann Publishers Inc., Santa Cruz; 1991:317-325.
- Bshouty NH, Tamon C: On the Fourier spectrum of monotone functions. J. ACM 1996, 43(4):747-770. 10.1145/234533.234564
- Shanks J: Computation of the fast Walsh-Fourier transform. IEEE Trans. Comput 1969, C-18(5):457-459.
- Benjamini I, Kalai G, Schramm O: Noise sensitivity of Boolean functions and applications to percolation. Publications mathematiques de l’IHES 1999, 90: 5-43.
- Kahn J, Kalai G, Linial N: The influence of variables on Boolean functions. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science. White Plains, (New York, USA, 24-26 Oct 1988); 68-80.
- Friedgut E: Boolean functions with low average sensitivity depend on few coordinates. Combinatorica 1998, 18: 27-35. 10.1007/PL00009809
- O’Donnell R: Some topics in analysis of boolean functions. In Proceedings of the 40th annual ACM symposium on Theory of computing. ACM, Victoria; 2008:569-578. http://portal.acm.org/citation.cfm?id=1374458
- Heckel R, Schober S, Bossert M: Harmonic analysis of Boolean networks: determinative power and perturbations. 2011.
- Samal A, Jain S: The regulatory network of E. coli metabolism as a Boolean dynamical system exhibits both homeostasis and flexibility of response. BMC Syst. Biol 2008, 2: 21. http://dx.doi.org/10.1186/1752-0509-2-21 10.1186/1752-0509-2-21
- Derrida B, Pomeau Y: Random networks of automata: a simple annealed approximation. Europhys. Lett 1986, 2: 45-49. 10.1209/0295-5075/2/1/007
- Schober S: Analysis and identification of Boolean networks using harmonic analysis. Dissertation, Ulm University, Ulm, Germany; 2011.
- Schober S, Bossert M: Analysis of random Boolean networks using the average sensitivity. 2007.
- Kauffman SA: Metabolic stability and epigenesis in randomly constructed nets. J. Theor. Biol 1969, 22: 437-467. 10.1016/0022-5193(69)90015-0
- Schober S, Kracht D, Heckel R, Bossert M: Detecting controlling nodes of Boolean regulatory networks. EURASIP J. Bioinf. Syst. Biol 2011, 27(11):1529-1536. http://www.ncbi.nlm.nih.gov/pubmed/21989141
- Maucher M, Kracher B, Kühl M, Kestler HA: Inferring Boolean network structure via correlation. Bioinformatics 2011. http://bioinformatics.oxfordjournals.org/content/early/2011/04/05/bioinformatics.btr166.abstract
- Feuer R, Gottlieb K, Klotz JG, Schober S, Bossert M, Sawodny O, Sprenger G, Ederer M: Model-based analysis of adaptive evolution. In Proceedings of the 8th International Workshop on Computational Systems Biology (WCSB). Zuerich, Switzerland; 2011:108-111.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.