Detecting controlling nodes of boolean regulatory networks
 Steffen Schober^{1},
 David Kracht^{1},
 Reinhard Heckel^{1, 2} and
 Martin Bossert^{1}
https://doi.org/10.1186/1687-4153-2011-6
© Schober et al; licensee Springer. 2011
Received: 1 November 2010
Accepted: 11 October 2011
Published: 11 October 2011
Abstract
Boolean models of regulatory networks are assumed to be tolerant to perturbations. Qualitatively, this implies that each function can only depend on a few nodes. Biologically motivated constraints further show that the functions found in Boolean regulatory networks belong to certain classes of functions, for example, the unate functions. It turns out that these classes have specific properties in the Fourier domain. This motivates us to study the problem of detecting controlling nodes in classes of Boolean networks using spectral techniques. We consider networks with unbalanced functions and functions of an average sensitivity less than $\frac{2}{3}k$, where k is the number of controlling variables of a function. Further, we consider the class of 1-low networks, which includes unate networks, linear threshold networks, and networks with nested canalyzing functions. We show that the application of spectral learning algorithms leads to both better time and sample complexity for the detection of controlling nodes compared with algorithms based on exhaustive search. For a particular algorithm, we state analytical upper bounds on the number of samples needed to find the controlling nodes of the Boolean functions. Further, improved algorithms for detecting controlling nodes in large-scale unate networks are given and numerically studied.
1 Introduction
The reconstruction of genetic regulatory networks from (possibly noisy) expression data is a contemporary problem in systems biology. Modern measurement methods, for example, the so-called microarrays, allow measuring the expression levels of thousands of genes under particular conditions. A major problem is to predict the structure of the underlying regulatory network. The overall goal is to understand the processes in cells, for example, how cells execute and control the operations required for their functions. In the Boolean model, this implies that, based on a given set of observed state-transition pairs (samples), the Boolean function attached to each node needs to be identified. In general, this problem is quite hard due to the large number of possible Boolean functions. First results for the noiseless case appeared in 1998 in the work of Liang et al. [1]. Their Reverse Engineering Algorithm (REVEAL) tries in a first step to find the controlling nodes of each node by estimating the mutual information between possible variables and the regulatory function's output. After the inputs have been identified, the truth table of the Boolean function can be determined from the samples. If the number of variables of each function is at most K, the REVEAL algorithm considers each of the $\binom{n}{K}$ combinations of variables, where n is the number of nodes in the network.
The numerical results in [1] suggest that it is possible to identify a Boolean network using a small number of samples. Akutsu et al. [2] gave an analytical and constructive proof that it is possible to identify the network using only $\mathcal{O}\left(\log n\right)$ samples with high probability. For constant values of K, the given algorithm, BOOL, has time complexity $\mathcal{O}\left({n}^{K+1}\cdot m\right)$, where m is the number of samples. Later it was shown that a similar algorithm also works in the presence of (low-level^{a}) noise [3]. These algorithms are based on exhaustive search in two ways. First, they search through all $\binom{n}{K}$ possible combinations of controlling nodes. Second, they search through all of the ${2}^{{2}^{K}}$ possible Boolean functions. Lähdesmäki et al. [4] overcame the need to search through all possible Boolean functions, reducing the double exponential factor to roughly 2^{K}. But their algorithm still searches through all $\binom{n}{K}$ possible variable combinations and hence runs roughly in time n^{K}. If n is large, applying such an algorithm is prohibitive even for moderate values of K.
The algorithms above implicitly solve two distinct problems. First, the controlling nodes of all nodes have to be detected, and second, each function has to be determined. This paper is dedicated to algorithms for detecting controlling nodes in Boolean networks. In general, this problem can be solved by exhaustive search in time n^{ K } . By exploiting structural properties of certain classes of functions, the time and sample complexity of the algorithms can be reduced. The sample complexity of an algorithm is the number of samples needed to detect the controlling nodes with a predefined probability. In fact, one can readily apply methods stemming from the area of PAC (probably approximately correct) learning theory [5], as the network identification problem can be reduced to the problem of learning Boolean juntas, i.e., Boolean functions that depend^{b} only on a small number of their arguments. This problem was studied by Arpe and Reischuk [6] extending earlier work of Mossel et al. [7, 8].
where ${\mathbf{X}}_{l}^{\prime}$ and ${Y}_{l}^{\prime}$ denote noisy observations of two successive network states X_{ l } and Y_{ l } at times t_{ l } and t_{ l } + 1, respectively. The network state X_{ l } at time t_{ l } is modeled as a uniformly distributed random variable X.
The task of detecting the controlling nodes can be reduced to the problem of finding the essential variables of the Boolean functions. This problem is easier to solve for some classes of functions, namely for nearly all unbalanced functions and functions of an average sensitivity less than $\frac{2}{3}k$, where k is the number of controlling variables of a function. Further, the class of 1-low networks, which includes unate networks, linear threshold networks, and networks with nested canalyzing functions, is considered. The application of spectral learning algorithms leads to both better time and sample complexity for the detection of controlling nodes compared with exhaustive search. In particular, a slight improvement of the algorithm given in [6] is presented, for which analytical bounds on the number of samples needed to find the controlling nodes are derived. It is notable that for the class of 1-low networks, the time complexity of the resulting algorithms is roughly n^{2}. The algorithm is further improved, with the main focus on the identification of controlling nodes in large-scale unate networks.
Finally, the performance of the improved algorithms is evaluated for large-scale unate networks with 500 nodes using numerical simulations. Further, the problem is studied in a Boolean network model of a control network of the central metabolism of Escherichia coli with 583 nodes [9]. Preliminary results of this work were presented in [10, 11].
The outline of the paper is as follows. In Section 2, Boolean networks are defined and the detection problem is formally stated. The two classes of functions considered here are introduced and discussed. Section 3 gives a brief introduction to the Fourier analysis of Boolean functions and discusses the spectral properties of the two classes of functions. Further, the algorithms are stated and analyzed in Sections 3.3 and 3.4. Simulation results are presented in Section 3.5.
2 Regulatory networks and inference
2.1 Boolean regulatory networks
i.e., given by the pre-state of the network x(t) and the Boolean functions f_{ i } .
hence, var(f) is called the set of essential variables of f. If |var(f)| ≤ k, a function f of n variables is usually called an (n, k)-junta.
2.2 The detection problem
Assume that there exists an unknown BN that is an appropriate description of an underlying dynamical process, for example, a regulatory network. An experiment generates state-transition pairs by observing the process, but in general, the measurements of the state transitions are noisy. The challenge is to detect the functional dependencies between the nodes of the network.
is obtained. In the following, it is assumed that X is uniformly distributed. Some comments on choosing X uniformly distributed will be given in the last section. Given a set of samples, the task is to detect the set of essential variables of f. This should be achieved in an efficient way, since the number of nodes can be very large in realistic problems. Further, the probability of a detection error should be as small as possible.
2.3 Classes of regulatory functions
Different classes of functions have been proposed to model regulatory functions. The authors do not attempt to settle this discussion. Rather, the approach taken here is to show that many of the proposed functions fall into two classes for which Fourier-based algorithms provide an advantage in running time over algorithms based on exhaustive search. A precise definition is given later. Two classes of functions that may be reasonable models of functions in genetic regulatory networks are presented. For both of these classes, it is assumed that the number of essential variables is less than or equal to k. The first class, denoted by $\mathcal{C}_{\lceil\frac{2}{3}k\rceil}$, includes
- functions with average sensitivity less than $\frac{2}{3}k$, and
- unbalanced functions,
where it is assumed that for any function f, any restriction f′ on k′ > 1 of its essential variables has an average sensitivity less than or equal to $\frac{2}{3}{k}^{\prime}$ or is an unbalanced function (or both). Note that a restriction f′ is obtained from f by setting some of its variables to fixed values. The second class $\mathcal{C}_{1}$ includes
- unate functions, which further include
  - nested canalyzing functions, and
  - linear threshold functions.
Basically, low average sensitivity is a prerequisite for non-chaotic behavior in random Boolean networks (RBNs); in particular, the expectation of the average sensitivity has to be less than or equal to 1 [13]. This motivates studying the class $\mathcal{C}_{\lceil\frac{2}{3}k\rceil}$, as it is widely assumed that Boolean models of biological networks are tolerant to perturbations. Unbalanced functions^{c} are of interest for a similar reason; namely, it is well known that the average sensitivity of balanced functions is lower bounded by 1 [14]. Hence, a function that has average sensitivity less than 1 is necessarily unbalanced.
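The two quantities discussed above can be checked by exhaustive enumeration for small k. The following Python sketch is our own illustration (the names `avg_sensitivity`, `is_balanced`, and the two example functions are not from the paper); it represents a function as a callable on {−1, +1}^k:

```python
from itertools import product

def avg_sensitivity(f, k):
    """Average sensitivity of f: {-1,+1}^k -> {-1,+1}: the expected
    number of coordinates whose flip changes the output, under a
    uniformly distributed input."""
    total = 0
    for x in product((-1, 1), repeat=k):
        y = f(x)
        for i in range(k):
            flipped = list(x)
            flipped[i] = -flipped[i]      # flip coordinate i
            if f(tuple(flipped)) != y:
                total += 1
    return total / 2 ** k

def is_balanced(f, k):
    """f is balanced if it outputs +1 and -1 equally often."""
    return sum(f(x) for x in product((-1, 1), repeat=k)) == 0

# Parity (XOR) of two variables: balanced and maximally sensitive.
xor2 = lambda x: x[0] * x[1]
# AND of two variables: unbalanced, average sensitivity 1 <= (2/3)*2.
and2 = lambda x: 1 if x == (1, 1) else -1

print(avg_sensitivity(xor2, 2), is_balanced(xor2, 2))  # 2.0 True
print(avg_sensitivity(and2, 2), is_balanced(and2, 2))  # 1.0 False
```

Consistent with the discussion above, the balanced XOR has average sensitivity 2 ≥ 1, while the unbalanced AND stays at 1.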
where w_{ i } ∈ ℝ. For n < 4, the classes of unate and linear threshold functions coincide [20].
3 Learning essential variables of regulatory functions
3.1 Fourier analysis and learning
see, for example, [22]. As the number of samples m grows, the estimator in Equation 8 converges to its expected value, namely $\widehat{f}\left(U\right)$.
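As an illustration of this estimator (a Python sketch under the noiseless, uniform-sampling assumption; `estimate_coefficient` and the dictator example are our own, not the paper's), the empirical Fourier coefficient is simply the sample mean of $y\cdot\chi_U(\mathbf{x})$:

```python
import random
from math import prod

def chi(U, x):
    """Character chi_U(x): the product of the coordinates x_i, i in U."""
    return prod(x[i] for i in U)

def estimate_coefficient(samples, U):
    """Empirical Fourier coefficient: the mean of y * chi_U(x) over the
    observed (x, y) pairs; converges to f^(U) as m grows."""
    return sum(y * chi(U, x) for x, y in samples) / len(samples)

# Example: the "dictator" function f(x) = x_0, whose only nonzero
# coefficient is f^({0}) = 1.
random.seed(0)
f = lambda x: x[0]
samples = [(x, f(x)) for x in
           (tuple(random.choice((-1, 1)) for _ in range(3))
            for _ in range(2000))]
print(estimate_coefficient(samples, (0,)))  # exactly 1.0 here
print(estimate_coefficient(samples, (1,)))  # near 0.0
```

For U = {0} the summand y · x_0 is identically 1, so the estimate is exact; for U = {1} it fluctuates around zero with standard deviation $1/\sqrt{m}$.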
3.2 Spectral properties of specific regulatory functions
The Boolean functions mentioned in Section 2.3 can be categorized according to their lowness [6].
Clearly, any function that is τ-low is also τ′-low for τ′ > τ. The notion of lowness allows defining the following families of classes.
Definition 2. $\mathcal{C}_{\tau}$ is the set of functions that are τ-low.
[23], and the fact that for any Boolean function, the influence of an essential variable is larger than zero. Hence, if the i th variable of a unate function f is essential, the Fourier coefficient $\widehat{f}\left(\left\{i\right\}\right)$ is nonzero.
Now the class $\mathcal{C}_{\lceil\frac{2}{3}k\rceil}$ is discussed; first, the following definition is needed.
Correlation immune functions were considered by Siegenthaler [24] who used a different definition. The definition in terms of the Fourier coefficients as used here is due to Xiao and Massey [25]. These functions are of interest in cryptography, for example, to design combining functions of stream ciphers.
Unbalanced correlation immune functions cannot exist for too large m as the next theorem shows.
Theorem 1 (Mossel et al. [8]). Let f : {−1, +1}^{ n } → {−1, +1} be an unbalanced, m-th-order correlation immune function. Then $m\le \frac{2}{3}\cdot n$.
A similar proposition holds for functions with low average sensitivity.
Proposition 1. Let f : {−1, +1}^{ n } → {−1, +1} be an m-th-order correlation immune function such that $\mathsf{as}\left(f\right)\le \frac{2}{3}n$, where the input X ∈ {−1, +1}^{ n } is uniformly distributed. Then $m\le \frac{2}{3}\cdot n$.
which contradicts the assumption of the proposition. □
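To illustrate the Xiao-Massey characterization used here (a brute-force Python sketch of ours; the Fourier coefficients are computed by full enumeration, so it is only feasible for small n): a function is m-th-order correlation immune exactly when all Fourier coefficients of weight 1 through m vanish. Note that the parity function below is balanced, so it is not constrained by Theorem 1:

```python
from itertools import combinations, product
from math import prod

def ci_order(f, n):
    """Largest m such that every Fourier coefficient of weight 1..m is
    zero, i.e., f is m-th-order correlation immune (Xiao-Massey)."""
    def coeff(U):
        return sum(f(x) * prod(x[i] for i in U)
                   for x in product((-1, 1), repeat=n)) / 2 ** n
    for w in range(1, n + 1):
        if any(abs(coeff(U)) > 1e-12 for U in combinations(range(n), w)):
            return w - 1
    return n

# Parity of 3 variables: balanced, correlation immune of order n - 1.
parity3 = lambda x: x[0] * x[1] * x[2]
# A dictator function has a weight-1 coefficient, so its order is 0.
dictator = lambda x: x[0]
print(ci_order(parity3, 3))   # 2
print(ci_order(dictator, 3))  # 0
```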
Proposition 2. Let f be a function with k ≥ 2 essential variables (out of n) such that any restriction f′ on k′ of its essential variables, where 1 < k′ ≤ k, has an average sensitivity less than or equal to $\frac{2}{3}{k}^{\prime}$ or is an unbalanced function (or both). Then f is $\lceil\frac{2}{3}k\rceil$-low.
Proof. First note that if k = 2 the proposition is true. Now consider a function with k > 2. By assumption there is a variable i ∈ var(f) with a "low" coefficient,
1 Input: $\mathcal{X}$, n, d
2 Output: $\tilde{R}$, the essential variables
3 Global Parameters: τ, ϵ
4 begin
5 $\tilde{R}\leftarrow \varnothing$;
6 foreach U ⊆ [n] with 1 ≤ |U| ≤ τ do
7 $\hat{h}\left(U\right)\leftarrow {\left(1-2\epsilon \right)}^{-|U|-1}\cdot {m}^{-1}\cdot {\sum}_{\left(\mathbf{x},y\right)\in \mathcal{X}}y\cdot {\chi}_{U}\left(\mathbf{x}\right)$;
8 if $|\hat{h}\left(U\right)| \ge {2}^{-d}$ then
9 $\tilde{R}\leftarrow \tilde{R}\cup U$;
10 end
11 end
12 end
Algorithm 1: τ-NOISYFOURIER_{ d }
This argument can now be repeated recursively (applying Eq. (12) to f_{−1} and f_{+1}), showing the proposition. □
3.3 The τ-NOISYFOURIER_{ d } algorithm
A simple algorithm to find the essential variables of τ-low (n, k)-juntas directly follows from Equations 6 and 7. First, all Fourier coefficients up to weight τ are estimated. The absolute value of each estimated coefficient $\hat{h}\left(U\right)$ is compared with a threshold. If a coefficient $\widehat{f}\left(U\right)$ is nonzero, its absolute value cannot be smaller than $2^{-k+1}$, see Equation 7. Hence, if $|\hat{h}\left(U\right)|$ is larger than $2^{-k}$, the variables corresponding to U are classified as essential. The algorithm was given in [6], but there $2^{-d-1}$ was used as threshold (see line 8).
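The procedure can be sketched in a few lines of Python (our own rendering, not the paper's implementation; the noise-correction exponent follows our reading of the estimator, and with ϵ = 0 the correction factor is simply 1):

```python
from itertools import combinations, product
from math import prod

def tau_noisy_fourier(samples, n, d, tau, eps):
    """Sketch of tau-NOISYFOURIER_d: estimate all Fourier coefficients
    of weight 1..tau, rescale for the noise rate eps, and declare the
    variables of any coefficient with magnitude >= 2**-d essential."""
    m = len(samples)
    R = set()
    for w in range(1, tau + 1):
        for U in combinations(range(n), w):
            h = sum(y * prod(x[i] for i in U) for x, y in samples) / m
            h *= (1 - 2 * eps) ** (-(w + 1))  # noise correction
            if abs(h) >= 2 ** (-d):
                R |= set(U)
    return R

# Noiseless check: majority of the first 3 of 4 variables, with the
# full truth table as the sample set, so the estimates are exact.
maj3 = lambda x: 1 if x[0] + x[1] + x[2] > 0 else -1
samples = [(x, maj3(x)) for x in product((-1, 1), repeat=4)]
print(tau_noisy_fourier(samples, n=4, d=3, tau=1, eps=0.0))
```

In this example the weight-1 coefficients of the majority function are 1/2 for each essential variable and 0 for the dummy variable, so the threshold $2^{-3}$ recovers exactly {0, 1, 2}.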
The following theorem appeared first in [6] but with a different bound.
Then Algorithm 1 identifies all essential variables with probability 1 − δ.
The bound even holds if ϵ is only an upper bound on the noise rate. The theorem follows from applying standard Hoeffding bounds. Note that the bound above differs from the one in [6]. If τ = 1, the number of samples required to reach a predefined probability of error is smaller by a factor of 4; this follows directly from the different threshold used here. If τ > 1, it was claimed in [6] that n^{ τ } can be replaced by n, but simulation results of the authors (not shown) contradict this claim; hence, we rely here on the weaker result shown in Theorem 2. This issue will be discussed in future work.
3.4 Improved algorithms
In the following section, two algorithms are discussed that lead to better numerical results than Algorithm 1, especially for small k. The first algorithm is a straightforward modification of the τ-NOISYFOURIER algorithm and is discussed in Section 3.4.1. The second algorithm requires a further assumption on the functions to which it is applied: suppose that f is τ-low. If a variable of f is set to a particular fixed value, i.e., −1 or +1, a restricted version of f is obtained (this will be discussed in more detail later on). It has to be assumed that the restricted function is still τ-low, i.e., the functions have to be recursively τ-low. While it is possible to define such classes, only unate functions are considered here. On the one hand, they naturally fulfill the constraint defined above, as any restriction of a unate function is again a unate function. On the other hand, they seem to be the most important class of functions, as discussed earlier. Nevertheless, the following algorithms will be formulated in a way that makes clear how to apply them to recursively τ-low functions.
3.4.1 A modification of τ-NOISYFOURIER_{ d }
3.4.2 The K-JUNTA algorithm
The second algorithm is based on the original idea of Mossel et al. [8], who recursively applied their algorithm to restrictions of the original function. While they did so for other reasons, a slight modification of their approach can be used to reduce the number of samples needed. In exchange, the running time of the algorithm acquires an exponential dependency on k.
1 Input: $\mathcal{X}$, n, d
2 Output: $\tilde{R}$, the essential variables
3 Global Parameters: τ, ϵ
4 begin
5 $\tilde{R}\leftarrow \varnothing$;
6 foreach U ⊆ [n] with |U| ≤ τ do
7 $\hat{h}\left(U\right)\leftarrow {\left(1-2\epsilon \right)}^{-|U|-1}\cdot {m}^{-1}\cdot {\sum}_{\left(\mathbf{x},y\right)\in \mathcal{X}}y\cdot {\chi}_{U}\left(\mathbf{x}\right)$;
8 end
9 ${U}_{i}: |\hat{h}\left({U}_{1}\right)| \ge |\hat{h}\left({U}_{2}\right)| \ge \cdots \ge |\hat{h}\left({U}_{l}\right)|$ // mod: sorted index;
10 for i = 1 to l do
11 if $|\tilde{R}| < d$ then // mod: limiting condition
12 if $|\hat{h}\left({U}_{i}\right)| \ge {2}^{-d}$ then $\tilde{R}\leftarrow \tilde{R}\cup {U}_{i}$;
13 end
14 end
15 end
Algorithm 2: τ-NOISYFOURIER_{ d }^{mod}
The algorithm is now described as follows. Suppose there exists a procedure IDENTIFY that can identify at least one essential variable of a function f given a number of samples. If no essential variables exist, i.e., if f is constant, the procedure returns the empty set Ø.
Given an (n, k)-junta f with k > 0, let I ⊆ R = var(f) be a set of essential variables that are already known, and assume that there is a restriction ρ that fixes exactly the variables in I. The function f_{ ρ } can either be constant or depend on some of the variables that are not fixed yet. In the latter case, suppose that at least one new variable can be identified using the procedure IDENTIFY, and denote the set of newly identified variables by I′. The procedure is then continued with all of the 2^{|I′|} new restrictions that additionally fix the variables in I′, until all these sub-restrictions are constant. The resulting algorithm in recursive form is given as Algorithm 3. Initially, the algorithm is started as $\mathsf{\text{K}}\mathsf{\text{J}}\mathsf{\text{UNTA}}\left(\mathcal{X},n,d\right)$, where the global parameters (τ = 1, ϵ) are fixed.
Most of the algorithm has been explained already. First note that passing n as an argument is not necessary, because it is an implicit parameter of the
1 Input: $\mathcal{X}$, n, d
2 Output: $\tilde{R}$, the essential variables
3 Global Parameters: τ, ϵ
4 begin
5 $\tilde{R}\leftarrow \varnothing$;
6 $I\leftarrow \mathsf{\text{IDENTIFY}}\left(\mathcal{X},d\right)$;
7 if (d > |I| > 0) then
8 ${\tilde{R}}^{\prime}\leftarrow \varnothing$;
9 foreach restriction ρ do
10 ${\tilde{R}}^{\prime}\leftarrow {\tilde{R}}^{\prime}\cup \mathsf{\text{K}}\mathsf{\text{J}}\mathsf{\text{UNTA}}\left({\mathcal{X}}_{\rho},\, n - |I|,\, d - |I|\right)$;
11 end
12 $\tilde{R}\leftarrow \mathsf{\text{COMBINE}}\left(\tilde{R},\, {\tilde{R}}^{\prime},\, \rho \right)$;
13 end
14 end
Algorithm 3: K-JUNTA
1 Input: $\mathcal{X}$, n, d
2 Output: I, the variables found
3 Global Parameters: τ, ϵ
4 begin
5 I ← ∅;
6 foreach U ⊆ [n] with |U| ≤ τ do
7 $\hat{h}\left(U\right)\leftarrow {\left(1-2\epsilon \right)}^{-|U|-1}\cdot {m}^{-1}\cdot {\sum}_{\left(\mathbf{x},y\right)\in \mathcal{X}}y\cdot {\chi}_{U}\left(\mathbf{x}\right)$;
8 end
9 $M\leftarrow \mathrm{arg}\,{\mathrm{max}}_{U:0<|U|\le \tau}|\hat{h}(U)|$;
10 if $(\mathsf{\text{CONST}}(\hat{h}(M),\hat{h}(\varnothing ),d)=\mathit{true})$ then I ← M;
11 end
Algorithm 4: IDENTIFY
samples. Line 9 deserves further comment: the foreach loop is executed for each of the 2^{|I|} possible restrictions of the variables contained in I. For each restriction, the corresponding restricted sample set is calculated and passed in a new call to K-JUNTA. Each of these calls runs on a smaller problem, namely finding the variables of an (n − |I|, d − |I|)-junta. Notably, each of these runs is independent of the others. The variables found are then combined with $\tilde{R}$ in line 12 using the procedure COMBINE. This is not just a union of sets, since one has to take care of the labeling of the variables. For example, if $\tilde{R}=\left\{1\right\}$ and a subsequent call of K-JUNTA returns the variables ${\tilde{R}}^{\prime}=\left\{1,3\right\}$ (labeled in the restricted index space), combining both leads to $\tilde{R}=\left\{1,2,4\right\}$.
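The relabeling step can be sketched as follows (a Python illustration of the example above; `combine` and its signature are our invention, using the paper's 1-based labels, with `fixed` listing the original labels removed by the restriction ρ):

```python
def combine(known, found, fixed):
    """Map variable labels returned by a recursive call (numbered in the
    shrunken index space left after removing `fixed`) back to original
    labels, and merge them with the already-known set."""
    # Original labels that survived the restriction, in order; they were
    # renumbered 1, 2, 3, ... inside the recursive call.
    limit = max(fixed) + max(found) + 1
    remaining = [i for i in range(1, limit + 1) if i not in fixed]
    return set(known) | {remaining[j - 1] for j in found}

# Example from the text: variable 1 is known and fixed; the recursive
# call returns {1, 3}, i.e., variables 2 and 4 in the original labels.
print(combine({1}, {1, 3}, [1]))  # {1, 2, 4}
```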
The IDENTIFY procedure The question remains how to identify some of the essential variables, or how to decide whether the function is constant. For τ-low functions, it is sufficient to estimate all coefficients $\widehat{f}\left(U\right)$ with |U| ≤ τ. In [7], it was proposed to search for the first coefficient that lies above a certain threshold. The approach here is different: all coefficients of weight less than or equal to τ are computed, and the coefficient with the maximum absolute value is compared with the zero coefficient to distinguish between a constant and a nonconstant function. How this can be done is discussed below. The resulting procedure is given as Algorithm 4. In line 10, the procedure CONST is called, which tries to distinguish between a constant and a nonconstant function. If a nonconstant function is found, the variables in M are returned, otherwise the empty set.
The CONST procedure In the following, it is discussed how a constant function can be distinguished from a nonconstant one, given that the function depends on at most k variables. This is done based on the zero coefficient $\widehat{f}\left(\varnothing \right)$ and the coefficient with the largest absolute value, denoted by $\widehat{f}\left(M\right)$. Note that f is constant if and only if $|\widehat{f}\left(\varnothing \right)|=1$ and $\widehat{f}\left(U\right)=0$ for every U ≠ ∅, by Parseval's theorem. If f is nonconstant, $|\widehat{f}\left(\varnothing \right)|<1$ and there exists at least one coefficient with $|\widehat{f}\left(U\right)|>0$ for some U ≠ ∅; hence, $|\widehat{f}\left(M\right)|>0$.
Different procedures exist to distinguish between a constant and a nonconstant function. The simplest one, proposed by Mossel et al., will be denoted by CONST1. There, the function is declared constant if $|\hat{h}\left(\varnothing \right)| > 1-{2}^{-d}$ or $|\hat{h}\left(M\right)| < {2}^{-d}$.
where dist (·,·) denotes the Euclidean distance.
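The CONST1 test above can be written directly from its two thresholds (a Python sketch; the argument names are ours, with `h_empty` and `h_max` denoting the estimated zero coefficient and the largest-magnitude coefficient of nonzero weight):

```python
def const1(h_empty, h_max, d):
    """CONST1 (after Mossel et al.): declare f constant if the zero
    coefficient is within 2**-d of +/-1, or if the largest remaining
    coefficient falls below the 2**-d detection threshold."""
    return abs(h_empty) > 1 - 2 ** (-d) or abs(h_max) < 2 ** (-d)

print(const1(0.98, 0.01, 3))  # True: looks like a constant function
print(const1(0.20, 0.50, 3))  # False: a large coefficient was found
```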
A note on the computational complexity As mentioned, Algorithm 3 has a higher complexity than Algorithm 1. In the worst case, the algorithm is called $2^k$ times, but each time on a smaller problem. If it is assumed that $\hat{h}\left(U\right)$ can be computed in time $\mathcal{O}\left(n\cdot m\right)$, the algorithm runs in $\mathcal{O}\left({2}^{k}\cdot {n}^{2}\cdot m\right)$ for 1-low functions. Obviously, for constant k, this reduces to $\mathcal{O}\left({n}^{2}\cdot m\right)$.
3.5 Simulation results for unate networks
To compare the performance of the different algorithms, the following procedure is used. A Boolean function f is chosen uniformly at random from a class $\mathcal{F}\subseteq {\mathcal{F}}^{n}$ of n-ary τ-low functions, where τ and n are known. For each function f, a set of m noisy state-transitions ${\mathcal{X}}^{m}=\left\{\left({\mathbf{X}}_{l}^{\prime},\,{Y}_{l}^{\prime}\right) \mid l=1,\dots,m\right\}$ is generated as described in Section 2.2. The noise rate is fixed to ϵ = 0.05.
is an a priori indicator of the algorithm's performance. It should be mentioned that if there exists a function f with |var(f)| > d, the detection error probability ${P}_{\mathcal{E}}$ does not vanish, even for large m.
In a network, this can be interpreted as the fraction of edges that have not been detected. The definitions above are consistent with Zhao et al. [27], who defined the type-1 error as the event that a node i is classified as a controlling node of some node j although this is not the case. Consequently, the type-2 error is defined as the event $\left\{i\notin \tilde{R} \mid i\in \mathsf{\text{var}}\left(f\right)\right\}$.
3.5.1 τ-NOISYFOURIER_{ d } versus τ-NOISYFOURIER_{ d }^{mod}
3.5.2 τ-NOISYFOURIER_{ d }^{mod} versus K-JUNTA
Again, a subset of unate functions with exactly k essential variables is used to compare the τ-NOISYFOURIER_{ d }^{mod} algorithm with the K-JUNTA algorithm. The parameter d is always set to k. The results are shown in Figure 2. For functions with a small number of essential variables, K-JUNTA using the procedure CONST1 outperforms the τ-NOISYFOURIER_{ d } algorithm, but this advantage vanishes as the number of variables increases.
3.5.3 τ-NOISYFOURIER_{ d } versus K-JUNTA on an E. coli network
In-degree distribution of the Boolean network (see text).
|var(f)|    0     1     2     3     4    5    6    7    8
#          12   293   159    66    38    9    4    0    2
Remarkable results: In the previous simulations, the parameter d was always set to k, and only functions with exactly k essential variables were chosen. Here, the parameter d is usually smaller than k, which implies that not all variables can be found: only variables with influence larger than or equal to $2^{-d}$ can be detected, as implied by Equations 10 and 7. On the other hand, even if d < k for some function f, the algorithm can still detect some of the essential variables of f.
4 Conclusion
In this paper, the problem of detecting controlling nodes in Boolean networks was discussed. Boolean functions that are relevant for modeling genetic networks seem to belong to classes of functions for which spectral-based algorithms provide an efficient solution, both in computational complexity and in the data needed. Especially the algorithms for unate functions are highly efficient in both running time and the number of samples needed to identify controlling nodes. Furthermore, analytical bounds on the probability of a detection error can be stated.
If the samples are chosen according to a uniform distribution, the results are promising. Applying the methods to the E. coli control network with 583 nodes shows that, using approximately 200 samples, it is possible to find nearly 40% of all edges in the network with a precision rate close to one. On the other hand, a wrong selection of the parameter d can have a dramatic effect on the precision. For example, if under the same conditions d = 4 is chosen, the precision drops below 0.5. Fortunately, the choice of the parameter can be guided by the available analytical bounds on the detection error probability. The latter is dominated by the probability that the estimator $\hat{h}\left(\left\{i\right\}\right)$ deviates from $\widehat{f}\left(\left\{i\right\}\right)$ by more than $\pm 2^{-d}$. But this also determines the precision of the algorithm. Suppose that 200 samples are obtained from the E. coli network. The analytical bounds shown in Figure 1 suggest choosing d = 1, which indeed leads to a high precision (see Figure 3).
Clearly, our assumption of uniformly distributed samples is too optimistic. Fortunately, known results from PAC learning [6] show that it is possible to use similar algorithms for product-distributed samples, i.e., in a random vector X, each X_{ i } is chosen independently of the others such that $-1<E\left\{{X}_{i}\right\}={\mu}_{i}<1$. But there is a major problem: if μ_{max} = max_{1≤i≤n} |μ_{ i }| gets close to 1, the number of samples needed increases roughly as (1 − μ_{max})^{−2k}. In unate networks, this coincides with the fact that the influences of the variables can become very small. Hence, further investigations in this direction are necessary. This would be a major step toward the application of spectral algorithms in a real-world scenario.
Endnotes
^{a}The theoretical analysis requires the noise level to be bounded below a small value. ^{b}This will be defined more precisely later. ^{c}A function is unbalanced if the numbers of +1 and −1 entries in its truth table differ. ^{d}Using a better implementation of Algorithm 2, this can be reduced to 2τ log N. ^{e}A detailed table of the functions used can be found in the supplementary material.
References
 Liang S, Fuhrman S, Somogyi R: REVEAL, a general reverse engineering algorithm for inference of genetic network architectures. Proceedings of the Pacific Symposium on Biocomputing 1998, 18–29.
 Akutsu T, Miyano S, Kuhara S: Identification of genetic networks from a small number of gene expression patterns under the Boolean network model. Proceedings of the Pacific Symposium on Biocomputing 1999, 17–28.
 Akutsu T, Miyano S, Kuhara S: Inferring qualitative relations in genetic networks and metabolic pathways. Bioinformatics 2000, 16(8):727–734. 10.1093/bioinformatics/16.8.727
 Lähdesmäki H, Shmulevich I, Yli-Harja O: On learning gene regulatory networks under the Boolean network model. Mach Learn 2003, 52(1–2):147–167.
 Valiant LG: A theory of the learnable. Commun ACM 1984, 27(11):1134–1142. 10.1145/1968.1972
 Arpe J, Reischuk R: Learning juntas in the presence of noise. Theor Comput Sci 2007, 384(1):2–21. 10.1016/j.tcs.2007.05.014
 Mossel E, O'Donnell R, Servedio RP: Learning juntas. In Proceedings of the ACM Symposium on Theory of Computing. ACM, San Diego, CA, USA; 2003:206–212.
 Mossel E, O'Donnell R, Servedio RA: Learning functions of k relevant variables. J Comput Syst Sci 2004, 69(3):421–434. 10.1016/j.jcss.2004.04.002
 Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO: Integrating high-throughput and computational data elucidates bacterial networks. Nature 2004, 429(6987):92–96. 10.1038/nature02456
 Schober S, Mir K, Bossert M: Reconstruction of Boolean genetic regulatory networks consisting of canalyzing or low sensitivity functions. Proceedings of International ITG Conference on Source and Channel Coding (SCC'10) 2010.
 Schober S, Heckel R, Kracht D: Spectral properties of a Boolean model of the E. coli genetic network and their implications for network inference. In Proceedings of International Workshop on Computational Systems Biology. Luxembourg; 2010.
 Ben-Or M, Linial N: Collective coin flipping, robust voting schemes and minima of Banzhaf values. Proceedings of IEEE Symposium on Foundations of Computer Science 1985, 408–416.
 Lynch JF: Dynamics of random Boolean networks. In Current Developments in Mathematical Biology: Proceedings of the Conference on Mathematical Biology and Dynamical Systems. Edited by: Culshaw R, Mahdavi K, Boucher J. World Scientific Publishing Co; 2007:15–38.
 Kahn J, Kalai G, Linial N: The influence of variables on Boolean functions. Proceedings of IEEE Symposium on Foundations of Computer Science 1988, 68–80.
 Grefenstette J, Kim So, Kauffman S: An analysis of the class of gene regulatory functions implied by a biochemical model. Biosystems 2006, 84(2):81–90. 10.1016/j.biosystems.2005.09.009
 Kauffman SA, Peterson C, Samuelsson B, Troein C: Genetic networks with canalyzing Boolean rules are always stable. PNAS 2004, 101(49):17102–17107. 10.1073/pnas.0407783101
 Samal A, Jain S: The regulatory network of E. coli metabolism as a Boolean dynamical system exhibits both homeostasis and flexibility of response. BMC Syst Biol 2008, 2(1):21. 10.1186/1752-0509-2-21
 Li F, Long T, Lu Y, Ouyang Q, Tang C: The yeast cell-cycle network is robustly designed. PNAS 2004, 101(14):4781–4786. 10.1073/pnas.0305937101
 Davidich MI, Bornholdt S: Boolean network model predicts cell cycle sequence of fission yeast. PLoS ONE 2008, 3(2):e1672. 10.1371/journal.pone.0001672
 McNaughton R: Unate truth functions. IRE Trans Electron Comput 1961, 10:1–6.
 Linial N, Mansour Y, Nisan N: Constant depth circuits, Fourier transform, and learnability. J ACM 1993, 40(3):607–620. 10.1145/174130.174138
 Bshouty NH, Jackson JC, Tamon C: Uniform-distribution attribute noise learnability. Inf Comput 2003, 187(2):277–290. 10.1016/S0890-5401(03)00135-4
 Gotsman C, Linial N: Spectral properties of threshold functions. Combinatorica 1994, 14(1):35–50. 10.1007/BF01305949
 Siegenthaler T: Correlation-immunity of nonlinear combining functions for cryptographic applications. IEEE Trans Inf Theory 1984, 30(5):776–780. 10.1109/TIT.1984.1056949
 Xiao GZ, Massey JL: A spectral characterization of correlation-immune combining functions. IEEE Trans Inf Theory 1988, 34(3):569–571. 10.1109/18.6037
 Knuth DE: The Art of Computer Programming, Volume 3: Sorting and Searching. 2nd edition. Addison-Wesley Professional, Reading, MA; 1998.
 Zhao W, Serpedin E, Dougherty ER: Inferring connectivity of genetic regulatory networks using information-theoretic criteria. IEEE/ACM Trans Comput Biol Bioinf 2008, 5(2):262–274.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.