Boolean networks are an important class of computational models for molecular interaction networks. Boolean canalization, a type of hierarchical clustering of the inputs of a Boolean function, has been extensively studied in the context of network modeling where each layer of canalization adds a degree of stability in the dynamics of the network. Recently, dynamic network control approaches have been used for the design of new therapeutic interventions and for other applications such as stem cell reprogramming. This work studies the role of canalization in the control of Boolean molecular networks. It provides a method for identifying the potential edges to control in the wiring diagram of a network for avoiding undesirable state transitions. The method is based on identifying appropriate input-output combinations on undesirable transitions that can be modified using the edges in the wiring diagram of the network. Moreover, a method for estimating the number of changed transitions in the state space of the system as a result of an edge deletion in the wiring diagram is presented. The control methods of this paper were applied to a mutated cell-cycle model and to a p53-mdm2 model to identify potential control targets.

Introduction

A gene regulatory network (GRN) is a representation of the intricate relationships among genes, proteins, and other substances that are responsible for the expression levels of mRNA and proteins. The amount of these gene products and their temporal patterns characterize specific cell states or phenotypes. Thus, GRNs play a key role in the understanding of the various functions of cells and cellular components and ultimately might help to design intervention strategies for the control of biological systems. Recently, practical applications in cancer systems biology such as the identification of new therapeutic targets have stimulated the development of computational tools that can help to identify new intervention targets. Experimentally, the interventions are realized by manipulating the wiring diagram of a system with the use of drugs or by gene knockouts to impact the dynamics of the system so that it is directed towards a desired state [5, 7, 19, 32, 33]. From the modeling perspective, the identification of intervention targets amounts to finding a set of relevant nodes and edges that can be used for performing interventions in silico.

Many dynamic systems theory approaches have been used over the last decades to develop computational tools for analyzing the dynamics of GRNs. As a result, a large variety of models exists today. Boolean networks are a class of computational models in which genes can only be in one of two states: ON or OFF. Boolean networks (BNs), and more general discrete models in which genes can take on more than two states, have been effectively used to model biological systems such as the yeast cell-cycle network [20], the Th regulatory network [22], the lac operon [30], the p53-mdm2 complex [1, 5, 25], A. thaliana [3], and many other systems [2, 6, 10, 11, 27, 34].

BNs as models for GRNs were introduced by S. Kauffman [16] and R. Thomas [29]. BNs have been proposed as a framework that does not rely on kinetic constants and therefore requires fewer parameters to estimate, which simplifies analysis. Boolean canalizing rules were introduced by S. Kauffman and collaborators [14] and reflect the concept of canalization in evolutionary biology that Waddington pioneered in 1942 [31]. Boolean canalization has been intensively studied from the network dynamic perspective [13, 15, 21, 24]. It has been shown that networks that use only nested canalizing rules exhibit more stable dynamics compared to networks using random rules [15, 23]. Furthermore, it has been shown that each additional layer of canalization provides a degree of stability [18, 21]. The Boolean functions in published models tend to have many canalizing variables [14, 23].

Boolean networks

A Boolean network can be defined as a dynamical system that is discrete in time as well as in variable states. More formally, consider a collection x_{1},…,x_{n} of variables, each of which can take on values in the binary set {0,1}. A Boolean network in the variables x_{1},…,x_{n} is a function

\[\mathbf {F}=(f_{1},\ldots,f_{n}):\{0,1\}^{n}\rightarrow \{0,1\}^{n},\]

where each coordinate function f_{i} is a Boolean function on a subset of {x_{1},…,x_{n}} which represents how the future value of the i-th variable depends on the present values of the variables.

The dynamical properties of a Boolean network are given by the difference equation x(t+1)=F(x(t)); that is, the dynamics is generated by iteration of F. More precisely, the dynamics of F is given by the state space graph S, defined as the graph with vertices in \(\mathbb {K}^{n}=\{0,1\}^{n}\) which has an edge from x∈{0,1}^{n} to y∈{0,1}^{n} if and only if y=F(x). In this context, the problem of finding the states x∈{0,1}^{n} where the system will get stabilized is of particular importance. These special points of the state space are called attractors of a Boolean network, and these may include steady states (fixed points), where F(x)=x, and cycles, where F^{r}(x)=x for some integer number r>1. Attractors in Boolean network modeling might represent cell types [16] or cellular states such as apoptosis, proliferation, or cell senescence [12, 28].
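The iteration x(t+1)=F(x(t)) and the resulting attractors can be illustrated with a small brute-force sketch. The three-node network below is a hypothetical toy example (it is not one of the models studied in this paper), and the function names `step` and `attractors` are chosen here for illustration only.

```python
from itertools import product

def step(state):
    """One synchronous update of a hypothetical 3-node Boolean network."""
    x1, x2, x3 = state
    return (x2,        # f1 = x2
            x1,        # f2 = x1
            x1 & x2)   # f3 = x1 AND x2

def attractors(update, n):
    """Return the attractors of x(t+1) = update(x(t)) as tuples of states."""
    found = set()
    for start in product((0, 1), repeat=n):
        seen = []
        x = start
        while x not in seen:           # iterate until a state repeats
            seen.append(x)
            x = update(x)
        cycle = seen[seen.index(x):]   # periodic part of the trajectory
        k = cycle.index(min(cycle))    # rotate to a canonical starting state
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found

if __name__ == "__main__":
    for attractor in sorted(attractors(step, 3)):
        print(attractor)
```

For this toy network the state space contains two steady states, (0,0,0) and (1,1,1), and one cycle of length two between (0,1,0) and (1,0,0). Exhaustive enumeration is only feasible for small n, since the state space has 2^{n} vertices.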

Canalizing functions

A Boolean function \(f(x_{1},\ldots,x_{n}):\{0,1\}^{n}\rightarrow \{0,1\}\) is canalizing in the variable x_{i} with canalizing input value a and canalizing output value b if f(x_{1},…,x_{i}=a,…,x_{n})=b for all values of the remaining variables. That is, once x_{i} gets its canalizing input, it by itself determines the output of the function regardless of the values of the other variables. The variable x_{i} is called a canalizing variable.
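The definition can be checked directly on a truth table. The following is a minimal brute-force sketch; the helper name `canalizing_pairs` and the two example functions are assumptions made for illustration, and variables are indexed from 0 in the code.

```python
from itertools import product

def canalizing_pairs(f, n, i):
    """Return all (a, b) such that fixing x_i = a forces f to output b."""
    pairs = []
    for a in (0, 1):
        # Collect every output of f over the inputs with x_i = a.
        outputs = {f(*x) for x in product((0, 1), repeat=n) if x[i] == a}
        if len(outputs) == 1:          # the output does not depend on the rest
            pairs.append((a, outputs.pop()))
    return pairs

def f_and_or(x1, x2, x3):
    return x1 & (x2 | x3)   # x1 is canalizing: x1 = 0 forces output 0

def f_xor(x1, x2):
    return x1 ^ x2          # no variable is canalizing

if __name__ == "__main__":
    print(canalizing_pairs(f_and_or, 3, 0))
    print(canalizing_pairs(f_and_or, 3, 1))
    print(canalizing_pairs(f_xor, 2, 0))
```

In the first example x1 is canalizing with input 0 and output 0, while x2 alone never determines the output; the XOR function has no canalizing variables at all.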

Nested canalizing functions

Let σ be a permutation on the set {1,2,…,n}. The function \(f(x_{1},\ldots,x_{n}):\{0,1\}^{n}\rightarrow \{0,1\}\) is a nested canalizing function (NCF) in the variable order x_{σ(1)},…,x_{σ(n)} with canalizing input values a_{1},…,a_{d}∈{0,1} and canalizing output values b_{1},…,b_{d}∈{0,1} if it can be represented in the form

\[
f(x_{1},\ldots,x_{n})=\begin{cases}
b_{1} & x_{\sigma(1)}=a_{1},\\
b_{2} & x_{\sigma(1)}\neq a_{1},\ x_{\sigma(2)}=a_{2},\\
\ \vdots & \\
b_{d} & x_{\sigma(1)}\neq a_{1},\ldots,x_{\sigma(d-1)}\neq a_{d-1},\ x_{\sigma(d)}=a_{d},\\
g(x_{\sigma(d+1)},\ldots,x_{\sigma(n)}) & x_{\sigma(1)}\neq a_{1},\ldots,x_{\sigma(d)}\neq a_{d},
\end{cases}\tag{1}
\]

where either d=n, in which case x_{σ(d)} is a terminal canalizing variable and g is a constant, or d<n, in which case g(x_{σ(d+1)},…,x_{σ(n)}) is a non-constant function and none of the variables x_{σ(d+1)},…,x_{σ(n)} is canalizing for g. The integer d is called the nested canalizing depth of f. Such Boolean functions are called partially nested canalizing functions (PNCFs), see [18] for more details.

Layers of canalization

A Boolean function can be represented in different forms as a nested canalizing function. A unique representation of the function is obtained by grouping the variables in layers of canalization [21]. Every Boolean function can be uniquely written as

\[
f(x_{1},\ldots,x_{n})=M_{1}\left(M_{2}\left(\cdots\left(M_{r-1}\left(M_{r}P_{c}+1\right)+1\right)\cdots\right)+1\right)+b,\tag{2}
\]

where \(M_{i} = \prod _{j=1}^{k_{i}}\left (x_{i_{j}}+a_{i_{j}}\right)\), P_{c} is a polynomial with no canalizing variables, and k=k_{1}+⋯+k_{r} is the canalizing depth. Each variable x_{i} appears in exactly one of the M_{1},M_{2},…,M_{r},P_{c}. The proof of this property is given in [9]. The number r in Eq. 2 is called the layer number of f.
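One plausible way to recover the layers of a small function from its truth table is to peel off canalizing variables repeatedly: all variables that are canalizing for the current function form one layer, they are then fixed to their non-canalizing values, and the process is repeated on the restricted function. The sketch below follows this idea by brute force; the function name `layers` and the 0-based indexing are assumptions made for illustration, and this is not necessarily the procedure used in [9] or [21].

```python
from itertools import product

def layers(f, n):
    """Peel off canalizing layers of f.

    Returns (layer_list, core_variables), where each layer is a list of
    (variable index, canalizing input, canalizing output) and core_variables
    are the variables left in the non-canalizing remainder.
    """
    fixed = {}      # variable index -> non-canalizing value chosen so far
    result = []
    while True:
        free = [i for i in range(n) if i not in fixed]
        # Inputs consistent with the non-canalizing values fixed so far.
        points = [x for x in product((0, 1), repeat=n)
                  if all(x[i] == v for i, v in fixed.items())]
        if len({f(*x) for x in points}) == 1:    # remainder is constant
            return result, []
        layer = []
        for i in free:
            for a in (0, 1):
                outs = {f(*x) for x in points if x[i] == a}
                if len(outs) == 1:
                    layer.append((i, a, outs.pop()))
                    break
        if not layer:                            # no canalizing variable left
            return result, free
        result.append(layer)
        for i, a, _ in layer:
            fixed[i] = 1 - a                     # follow the non-canalizing branch

def f_nested(x1, x2, x3):
    return x1 & (x2 | x3)

def f_partial(x1, x2, x3):
    return x1 & (x2 ^ x3)

if __name__ == "__main__":
    print(layers(f_nested, 3))
    print(layers(f_partial, 3))
```

For x1 AND (x2 OR x3) the sketch reports two layers, {x1} followed by {x2, x3}; for x1 AND (x2 XOR x3) it reports a single layer {x1} and a remainder in x2 and x3 with no canalizing variables.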

Example1.1.

Consider the Boolean functions f_{1}, f_{2}, and f_{3} with truth tables given in Table 1. The layers representation for f_{1} is

Thus f_{1} has layer number equal to 1, f_{2} has layer number equal to 2, and f_{3} has layer number 1. The polynomial P_{c} does not have canalizing variables.

Definition of control actions

This paper considers two types of control action: deletion of edges and constant expression of edges. An edge deletion represents the experimental intervention that prevents a regulation from happening. This action can be achieved by the use of therapeutic drugs that target a specific gene interaction; see [5], where this type of control has been experimentally applied. A node deletion can be represented by the deletion of all of its outgoing edges. A constant expression or constitutive activation of a node might result in aberrant cell proliferation and cancer; see [5], where the constant expression of cyclin G in the signaling pathway of p53 is reported as a signature of abnormal gene expression that leads to cancer. But constant expression could also help to drive the system into a more desirable state; see [27], where constant expression of nodes has been proposed as a potential control. As a proof of principle, this paper will consider the constant expression of an edge as a potential control action.

Definition1.2 (Edge Control).

Consider the edge \(x_{i}\rightarrow x_{j}\) in the wiring diagram \(\mathcal {W}\). For \(u_{i,j}\in \mathbb F_{2}\), the control of the edge \(x_{i}\rightarrow x_{j}\) consists of manipulating the input variable x_{i} for f_{j} in the following way:

\[\mathcal {F}_{j}(x,u_{i,j}) = f_{j}\left (x_{j_{1}},\ldots,(u_{i,j}+1)x_{i},\ldots,x_{j_{m}}\right).\]

For each value of u_{i,j} we have the following control settings:

When u_{i,j}=0, \(\mathcal {F}_{j}(x,u_{i,j}) = f_{j}(x_{j_{1}},\ldots,x_{i},\ldots,x_{j_{m}})\). That is, the control is not active.

When u_{i,j}=1, \(\mathcal {F}_{j}(x,u_{i,j}) = f_{j}(x_{j_{1}},\ldots,x_{i}=0,\ldots, x_{j_{m}})\). This is the case when the control is active and the action represents the removal of the edge \(x_{i}\rightarrow x_{j}\).

For simplicity, in Definition 1.2 we considered only edge deletions. To include both the deletion and constant expression of an edge we could consider the following control function

\[\mathcal {F}_{j}\left (x,u^{-}_{i,j},u^{+}_{i,j}\right) = f_{j}\left (x_{j_{1}},\ldots,\left (u^{-}_{i,j}+u^{+}_{i,j}+1\right)x_{i}+u^{+}_{i,j},\ldots,x_{j_{m}}\right).\]

Then for each combination of \(u^{-}_{i,j}\) and \(u^{+}_{i,j}\) we have the following control settings:

For \(u^{-}_{i,j}=0, u^{+}_{i,j}=0\), \(\mathcal {F}_{j}(x,0,0) = f_{j}\left (x_{j_{1}},\ldots,x_{i},\ldots,x_{j_{m}}\right)\). That is, the control is not active.

For \(u^{-}_{i,j}=1, u^{+}_{i,j}=0\), \(\mathcal {F}_{j}(x,1,0) = f_{j}\left (x_{j_{1}},\ldots,x_{i}=0,\ldots,x_{j_{m}}\right)\). This action represents the deletion of the edge \(x_{i}\rightarrow x_{j}\).

For \(u^{-}_{i,j}=0, u^{+}_{i,j}=1\), \(\mathcal {F}_{j}(x,0,1) = f_{j}\left (x_{j_{1}},\ldots,x_{i}=1, \ldots,x_{j_{m}}\right)\). This action represents the constant expression of the edge \(x_{i}\rightarrow x_{j}\).

For \(u^{-}_{i,j}=1, u^{+}_{i,j}=1\), \(\mathcal {F}_{j}(x,1,1) = f_{j}\left (x_{j_{1}},\ldots,x_{i}+1, \ldots,x_{j_{m}}\right)\). This action changes the variable x_{i} to its negative value and might not be a relevant case of control.
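The four control settings can be reproduced with a small wrapper around an update function. The sketch below assumes that the control acts on the input x_{i} as (u^{-}+u^{+}+1)x_{i}+u^{+} over \(\mathbb F_{2}\), which is one expression that yields exactly the four cases listed above; the names `controlled`, `u_minus`, and `u_plus` are hypothetical.

```python
def controlled(f_j, pos):
    """Wrap f_j so that its input at position `pos` (the edge x_i -> x_j) can be controlled."""
    def F_j(inputs, u_minus=0, u_plus=0):
        x = list(inputs)
        # Arithmetic over F_2: (u^- + u^+ + 1) * x_i + u^+
        x[pos] = ((u_minus + u_plus + 1) * x[pos] + u_plus) % 2
        return f_j(*x)
    return F_j

def f_and(x_i, x_other):
    """Hypothetical update rule for x_j with two regulators."""
    return x_i & x_other

if __name__ == "__main__":
    F = controlled(f_and, 0)
    print(F((1, 1)))               # control inactive
    print(F((1, 1), u_minus=1))    # edge deleted: x_i replaced by 0
    print(F((0, 1), u_plus=1))     # constant expression: x_i replaced by 1
    print(F((1, 1), 1, 1))         # x_i replaced by its negation
```

Keeping the two control parameters as arguments of the wrapped function mirrors the notation \(\mathcal {F}_{j}(x,u^{-}_{i,j},u^{+}_{i,j})\), so the uncontrolled dynamics are recovered by leaving both at 0.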

Methods

Eliminating state transitions through edge deletion and constant expression

We avoid undesirable state transitions in the state space graph of a system of canalizing functions by means of edge deletion in the system’s wiring diagram.

Let \(\mathbf {F}=(f_{1},\ldots,f_{n}):\{0,1\}^{n}\to \{0,1\}^{n}\) be a Boolean network and S=(V_{s},E_{s}) be the state space graph of F, where \(V_{s}\subseteq \{0,1\}^{n}\) is the vertex set of S and \(E_{s}\subseteq \{0,1\}^{n}\times \{0,1\}^{n}\) is its edge set. Suppose for u,v∈V_{s} there is a directed edge {u,v}∈E_{s} which represents an undesirable transition. We eliminate the transition by deleting appropriate edges from the wiring diagram of the system, W=(V_{w},E_{w}), where V_{w}={x_{1},…,x_{n}} and E_{w}⊆V_{w}×V_{w}. The following is a sufficient condition for eliminating a transition from S through deleting an edge in E_{w}.

Method2.1.

Suppose x_{t}∈V_{w} takes input from x_{k}∈V_{w}, i.e. {x_{k},x_{t}}∈E_{w} (we will also use the notation x_{k}→x_{t}). Let also x_{k} be a canalizing variable in f_{t}, the function that determines the state of x_{t} in S. If the following four conditions are met, then deleting the edge {x_{k},x_{t}} from E_{w} results in eliminating the transition {u,v} from E_{s}:

1. No variable in a more dominant layer assumes its canalizing input in u.

2. x_{k} has canalizing input 0.

3. The k-th entry of u is 1, i.e. [u]_{k}=1.

4. x_{k} has canalizing output that is the negation of the t-th entry of v, that is \(\overline {[\!\mathbf {v}]_{t}}\).

The reason behind the first condition is that if any variable whose layer is more dominant than x_{k}’s layer has assumed its canalizing input in u, then replacing x_{k} with 0 will have no effect on f_{t}’s output. Deleting an edge has to change the network in order to act as a control, and so the third condition is needed: if [u]_{k}=0 already, then deleting the edge x_{k}→x_{t} will have no effect on the transition out of u. The second condition has a similar explanation: deleting the edge sets x_{k} to 0, which forces the output of f_{t} only if 0 is the canalizing input of x_{k}. The fourth condition then ensures that the forced output differs from the t-th entry of v.

Similar sufficient conditions can be stated for eliminating a state space transition through constant expression of an edge, simply by replacing 0 with 1 and 1 with 0.

Node deletion can also be used for control through canalization. In that case, node deletion corresponds to deleting the outgoing edges from the deleted node, and Method 2.1 can be applied to each one of them.
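The four conditions of Method 2.1 are straightforward to test once the layer structure of f_{t} is known. The sketch below assumes that the layers are supplied explicitly as lists of (variable index, canalizing input, canalizing output), most dominant layer first, with 0-based indices; the function name `method_2_1_applies` and the example rule are hypothetical.

```python
def method_2_1_applies(layers, k, u, v_t):
    """Check the four conditions of Method 2.1 for deleting the edge x_k -> x_t.

    layers: canalizing layers of f_t, most dominant first; each layer is a
            list of (variable index, canalizing input, canalizing output).
    u:      source state of the undesirable transition (tuple of 0/1).
    v_t:    the t-th entry of the target state v.
    """
    for depth, layer in enumerate(layers):
        for var, a, b in layer:
            if var == k:
                # Condition 1: no variable in a more dominant layer
                # takes its canalizing input in u.
                dominant_free = all(u[w] != a_w
                                    for upper in layers[:depth]
                                    for w, a_w, _ in upper)
                return (dominant_free
                        and a == 0          # condition 2
                        and u[k] == 1       # condition 3
                        and b == 1 - v_t)   # condition 4
    return False  # x_k is not a canalizing variable of f_t

# Hypothetical example: f_t = x0 OR (x1 AND x2).
# Layer 1: x0 with canalizing input 1 and output 1.
# Layer 2: x1 and x2 with canalizing input 0 and output 0.
LAYERS = [[(0, 1, 1)], [(1, 0, 0), (2, 0, 0)]]

if __name__ == "__main__":
    print(method_2_1_applies(LAYERS, k=1, u=(0, 1, 1), v_t=1))
    print(method_2_1_applies(LAYERS, k=1, u=(1, 1, 1), v_t=1))
```

In the first call all four conditions hold, and indeed replacing x1 by 0 changes the output of the rule at u=(0,1,1) from 1 to 0. In the second call x0 already takes its canalizing input in u, so condition 1 fails and deleting x1→x_{t} cannot change the output.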

Effect of edge deletion and constant expression on the state space

We count the maximum number of state space transitions that can be changed as a result of deleting a single edge.

Method2.2.

Let \(\mathbf {F}=(f_{1},\ldots,f_{n}):\{0,1\}^{n}\to \{0,1\}^{n}\) be a Boolean network where f_{t} is a PNCF of depth d in m variables in canalizing variable order 1,2,…,d. The deletion of the edge x_{k}→x_{t} results in

(a) up to \(2^{n-\ell _{1}-\ell _{2}-\ldots -\ell _{r}}\) changes in the state space if k≤d and x_{k} is in the r-th layer of f_{t}, where ℓ_{1},…,ℓ_{r} are the numbers of variables in layers 1,…,r, respectively; that is, the probability that any transition will be removed from the state space upon deletion of x_{k}→x_{t} is at most \(2^{n-\ell _{1}-\ell _{2}-\ldots -\ell _{r}}/2^{n}=\left (\frac {1}{2}\right)^{\ell _{1}+\ell _{2}+\ldots +\ell _{r}}\);

(b) up to 2^{n−d−1} changes in the state space if d<k≤m, i.e., x_{k} is not canalizing; thus, the probability that a particular transition will be removed from the state space upon deletion of x_{k}→x_{t} is at most \(2^{n-d-1}/2^{n}=\left (\frac {1}{2}\right)^{d+1}\).

To see how the bound is calculated, notice that when x_{k}→x_{t} is deleted, half of the transition table can potentially change (the other half had x_{k}=0 already). Of the remaining half, half contains the canalizing input of a variable in the most dominant layer and so x_{k} cannot cause change. Now half of the half only can possibly change but half of that has the canalizing input of another variable from a more or equally dominant layer to the one where x_{k} is, thus preventing x_{k} from causing change, etc.

This upper bound remains the same when instead of deleting an edge, an edge is constantly expressed.
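The bound in part (a) can be compared with an exact count on small examples. In the sketch below, f_{t} is written as a function of all n network variables for simplicity, the layer sizes are supplied by hand, and the names `count_changes` and `bound` are hypothetical; the example rule is a toy function, not one of the models in this paper.

```python
from itertools import product

def count_changes(f_t, n, k):
    """Number of states whose f_t value changes when x_k is replaced by 0."""
    changed = 0
    for x in product((0, 1), repeat=n):
        y = x[:k] + (0,) + x[k + 1:]     # same state with the edge x_k -> x_t deleted
        changed += f_t(*x) != f_t(*y)
    return changed

def bound(n, layer_sizes, r):
    """Bound 2^(n - l_1 - ... - l_r) for a variable in the r-th layer (r is 1-based)."""
    return 2 ** (n - sum(layer_sizes[:r]))

def f_t(x0, x1, x2):
    """Toy rule: layer 1 = {x0}, layer 2 = {x1, x2}."""
    return x0 | (x1 & x2)

if __name__ == "__main__":
    sizes = [1, 2]
    print(count_changes(f_t, 3, 0), "<=", bound(3, sizes, 1))
    print(count_changes(f_t, 3, 1), "<=", bound(3, sizes, 2))
```

For this toy rule, deleting the edge from the first-layer variable changes 3 transitions against a bound of 4, while deleting the edge from a second-layer variable changes 1 transition, which meets the bound of 1.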

General procedure for identifying control edges

Below we provide a general procedure for identifying control edges based on the two methods we developed. Figure 1 further illustrates it.

Given a Boolean network model for a biological system:

1. Formulate a goal in terms of the part of the state space you wish to avoid, e.g., a fixed point or a cycle. If it contains more than one state, consider all states that are part of it. Choose to begin with one of them, v.

2. For control via edge deletion, identify all 1’s in the state u preceding v; for control via constant expression, identify all 0’s.

3. Begin with, say, the leftmost 1 (or 0) in u. Observe the canalizing structure of the functions in your model. Then check if the conditions of Method 2.1 are satisfied.

4. If some of the conditions of Method 2.1 are not met or you are looking for other control options, proceed to the next 1 (or 0) in u.

5. If you wish to avoid a cycle or other trajectory that contains several states, you can repeat the above steps on all states in order to find all control options.

6. If you find multiple edges as candidates for control, you may want to choose to delete the ones that have the smallest impact on the state space, thus minimizing the side effects of edge manipulation.
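For small models, the outcome of the procedure above can be cross-checked by exhaustive search. The sketch below does not use the canalizing structure at all: it simply tries every edge deletion, keeps those that change the transition out of u, and ranks them by the number of transitions they alter (step 6). The three-node network and all function names are hypothetical, and indices are 0-based.

```python
from itertools import product

def delete_edge(F, k, t):
    """Network obtained from F (a list of functions of the full state) by deleting x_k -> x_t."""
    def f_t_deleted(x):
        return F[t](x[:k] + (0,) + x[k + 1:])
    return F[:t] + [f_t_deleted] + F[t + 1:]

def step(F, x):
    """Synchronous update of the whole state."""
    return tuple(f(x) for f in F)

def candidate_edges(F, edges, u):
    """Edges (k, t) whose deletion changes the transition out of u, fewest side effects first."""
    n = len(F)
    v = step(F, u)
    ranked = []
    for k, t in edges:
        G = delete_edge(F, k, t)
        if step(G, u) != v:
            # Side effects: how many transitions differ in the whole state space.
            impact = sum(step(G, x) != step(F, x)
                         for x in product((0, 1), repeat=n))
            ranked.append((impact, (k, t)))
    return [edge for _, edge in sorted(ranked)]

# Hypothetical toy network: f0 = x1, f1 = x0, f2 = x0 OR (x1 AND x2).
F = [lambda x: x[1],
     lambda x: x[0],
     lambda x: x[0] | (x[1] & x[2])]
EDGES = [(1, 0), (0, 1), (0, 2), (1, 2), (2, 2)]

if __name__ == "__main__":
    print(candidate_edges(F, EDGES, (0, 1, 1)))
```

In this toy example the edges x1→x2 and x2→x2 each alter a single transition, whereas x1→x0 alters four, so the first two would be preferred if side effects are to be kept small. The search is exponential in n and is meant only as a sanity check on small models.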

Results

We apply the control methods we developed to the Boolean models of two networks: a model of the human tumor suppressor gene p53 pathways [17] and a mammalian cell-cycle network [8].

p53-mdm2 model

In [17], a Boolean model, Eq. (3), of the widely studied p53 pathway is built, where the external signal is dna_dsb, the DNA damage input.

The other variables are ATM, p53, Wip1, and Mdm2. When dna_dsb =0, the state space has a single fixed point, (0000), corresponding to no stress. However, when dna_dsb =1, i.e., the DNA damage input turns on, the state space contains a single cycle of length seven (Fig. 3) and no fixed points. The cycle represents cyclic variation in the expression patterns of all the four genes. We want to prevent this cycle from taking place through removing one or multiple transitions from it.

The wiring diagram of the model is presented in Fig. 2. By Method 2.1, we identify that deleting the edge p53 →Wip1 (which also happens to correspond to deleting the node Wip1) has the effect of removing the following four transitions from the undesirable cycle in Fig. 3:

The bold entries in the states correspond to the entries where conditions 3 and 4 of Method 2.1 apply. As a result of the deletion, the system has only a single steady state, (1100). Unfortunately, not all four proteins are inactive in it, but this can be achieved through constant expression of the edge Wip1 →ATM, as discussed below.

We can also count the number of changes that deleting p53 →Wip1 from the wiring diagram induces on the state space transitions: by Method 2.2, there can be up to 2^{4−1}=8 changes, and in fact we observe exactly as many, demonstrating that the bound from Method 2.2 is sharp: out of the 16 states in the state space, 8 contain p53 =1. When applying the original update rules, the value of p53 in these eight states remains 1 (and so Wip1 becomes 1), while after deleting p53 →Wip1, Wip1 becomes 0 since now the output of its update rule is 0, thus causing a change in the state space.

Constant expression of an edge is another strategy for removal of transitions in a state space graph. There are analogous conditions to Method 2.1 for constant expression, obtained by simply replacing 0 with 1 and vice versa. For example, Wip1 is a canalizing variable in the function of ATM with canalizing input 1 and canalizing output 0. Therefore, we can set the edge Wip1 →ATM to constant expression in order to remove the following transitions from the undesirable cycle: (10001)→(11001),(11001)→(11101), and (00011)→(10001). The result is a state space with fixed point (0000), corresponding to no stress as when dna_dsb =0. Another option for control via constant expression is the edge Mdm2 →p53 which also results in a single steady state, although this time it is (1000).

Mutated cell-cycle network model

As a second application, we consider Fauré et al. [8] who proposed a Boolean model of the cell-cycle progression. We focus on the scenario when the tumor suppressor retinoblastoma protein Rb is absent as reflected in Eq. (4). The wiring diagram for that case is given in Fig. 4. Fauré et al. [8] assume that the expression of CycD changes independently of the cell’s content and reflects the state of the growth factor. According to their model, the mammalian cell-cycle with a mutated phenotype will cycle through the eight states (Fig. 5) even when CycD is inactive.

We propose four edges from the wiring diagram in Fig. 4 that can be used for control in order to avoid the cycle in Fig. 5. These edges were identified following Method 2.1 applied on transitions in the cycle with the objective of eliminating the cycle and also leading the system to fixed point(s) where p27=Cdh1=1 as in a normal cell. The results are summarized in Table 2. Other attempts at control produced fixed points and/or cycles where p27=Cdh1=0.

Discussion

In the Results section, we observed that the bound from Method 2.2 is sharp, as demonstrated using the p53 model. In general, the exact bound from part (a) of Method 2.2 is achieved when \(\mathcal {F}_{t}(x,u_{t,k})=x_{k}\) or \(\overline {x_{k}}\) (as a function), where \(\mathcal {F}_{t}(x,u_{t,k})\) is the function obtained from f_{t} by plugging in the canalizing input of the variables that are more or equally dominant to x_{k}.

The bound from Method 2.2 can also help choose which edge to delete or constantly express when there is more than one option with the purpose of controlling the side effects resulting from an edge manipulation. If it is desirable to minimize the impact on the state space, thus avoiding possible negative side effects on the system, one should choose for control an edge whose input variable is in the least dominant layer possible in the target function. That is, if x_{i} and x_{j} are both canalizing variables in f_{t} and x_{j} is in a less dominant layer than x_{i}, then one should choose to delete or constantly express the edge x_{j}→x_{t} since according to the bound of Method 2.2, the maximum impact of this control is smaller than if x_{i}→x_{t} is manipulated.
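As a numerical illustration with hypothetical values (not taken from either model in this paper), suppose n=4 and f_{t} has two layers of sizes ℓ_{1}=1 and ℓ_{2}=2, with x_{i} in the first layer and x_{j} in the second. The bound of Method 2.2(a) then gives

\[
2^{n-\ell _{1}}=2^{4-1}=8 \quad \text{for } x_{i}\rightarrow x_{t}, \qquad 2^{n-\ell _{1}-\ell _{2}}=2^{4-1-2}=2 \quad \text{for } x_{j}\rightarrow x_{t},
\]

so manipulating the edge from the less dominant layer can alter at most a quarter as many transitions as manipulating the edge from the most dominant one.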

It is important to point out that Method 2.1 only guarantees that a certain transition will be avoided and one may be able to use this to remove a cycle from a state space as we did for the p53 model. However, the method does not guarantee that the system will not contain other cycles since removing transitions from a cycle destroys the cycle but may also create a different one, nor does it guarantee that the resulting fixed point will be exactly the desired one as it was observed in both applications. To find controllers that give the desired fixed points one could use the algebraic methods described in [26].

This paper considers edge manipulations as potential control actions to avoid undesirable attractors. Control through edge manipulations in the wiring diagram of a network has been previously considered in [4]. However, the authors of [4] consider edge additions in the wiring diagram as control actions, that is, they add new regulators to the existing set of regulators to help the system transition into a desirable attractor.

Conclusions

The structure of the canalizing variables in a biologically relevant Boolean rule plays an important role in the control of Boolean networks. Special combinations of canalizing inputs can help identify network controllers, and the canalizing structure of a Boolean function makes it possible to estimate the number of transitions that change after using the type of controllers proposed in this paper. Moreover, the hierarchy of the canalizing variables can be used for assessing the impact on the network dynamics as a result of a given control. This paper exploits the canalizing properties of Boolean rules to derive a method that can be useful for identifying control targets for avoiding undesirable states. Additionally, it provides a method for assessing the impact of the controllers on the dynamics of the uncontrolled network. Thus these two complementary methods can help in the selection of appropriate controllers. Method 2.1 gives a practical way of identifying the potential edges to control in the wiring diagram of a network for avoiding undesirable state transitions. Method 2.2, on the other hand, provides a measure of the impact of an edge deletion on the state space of a model and establishes that this impact differs significantly based on the canalizing properties of the nodes involved: an edge coming from a node with stronger canalization, represented in the model by a variable in a more dominant layer, has an exponentially higher probability of changing the state space than an edge from a node with weaker canalization, represented by a variable in a less dominant layer. Therefore, Method 2.2 is a useful tool for assessing the impact of the controllers identified by Method 2.1 on the dynamics of the system, providing a way of selecting desirable controllers.

References

Abou-Jaoudé, W, Ouattara, DA, Kaufman, M (2009). From structure to dynamics: frequency tuning in the p53-mdm2 network i. logical approach. J. Theor. Biol, 258, 561–577.

Albert, R, & Othmer, HG (2003). The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in drosophila melanogaster. J. Theor. Biol, 223, 1–18.

Balleza, E, Alvarez-Buylla, ER, Chaos, A, Kauffman, S, Shmulevich, I, Aldana, M (2008). Critical dynamics in genetic regulatory networks: examples from four kingdoms. PLoS ONE, 3, e2456.

Choi, M, Shi, J, Jung, SH, Chen, X, Cho, K-H (2012). Attractor landscape analysis reveals feedback loops in the p53 network that control the cellular response to dna damage. Sci. Signal, 5, ra83.

Fauré, A, Naldi, A, Chaouiya, C, Thieffry, D (2006). Dynamical analysis of a generic boolean model for the control of the mammalian cell cycle. Bioinforma, 22, e124–e131.

Helikar, T, Kochi, N, Kowal, B, Dimri, M, Naramura, M, Raja, SM, Band, V, Band, H, Rogers, JA (2013). A comprehensive, multi-scale dynamical model of erbb receptor signal transduction in human mammary epithelial cells. PLoS ONE, 8, e61757.

Helikar, T, Konvalina, J, Heidel, J, Rogers, JA (2008). Emergent decision-making in biological signal transduction networks. Proc. Natl. Acad. Sci. USA, 105, 1913–1918.

Huang, S (1999). Gene expression profiling, genetic networks, and cellular states: an integrating concept for tumorigenesis and drug discovery. J. Mol. Med. (Berl), 77, 469–480.

Kauffman, S, Peterson, C, Samuelsson, B, Troein, C (2003). Random boolean network models and the yeast transcriptional network. Proc. Natl. Acad. Sci, 100, 14796–14799.

Kauffman, S, Peterson, C, Samuelsson, B, Troein, C (2004). Genetic networks with canalyzing boolean rules are always stable. Proc. Natl. Acad. Sci. USA, 101, 17102–17107.

Murrugarra, D, Veliz-Cuba, A, Aguilar, B, Laubenbacher, R. (2015): Identification of control targets of boolean molecular network models via computational algebra. Under review. Link to manuscript: http://arxiv.org/abs/1508.05317.

Saadatpour, A, Wang, R-S, Liao, A, Liu, X, Loughran, TP, Albert, I, Albert, R (2011). Dynamical and structural analysis of a t cell survival network identifies novel candidate therapeutic targets for large granular lymphocyte leukemia. PLoS Comput. Biol, 7, e1002267.

Shmulevich, I, & Dougherty, ER. (2010). Probabilistic Boolean Networks - The Modeling and Control of Gene Regulatory Networks: SIAM. ISBN: 978-0-89871-692-4.

The authors declare that they have no competing interests.

Authors’ contributions

Both authors contributed equally to all aspects of the paper. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.