Template-based intervention in Boolean network models of biological systems

Motivation
A grand challenge in the modeling of biological systems is the identification of key variables which can act as targets for intervention. Boolean networks are among the simplest of models, yet they have been shown to adequately model many of the complex dynamics of biological systems. In our recent work, we utilized a logic minimization approach to identify quality single variable targets for intervention from the state space of a Boolean network. However, as the number of variables in a network increases, it becomes more likely that a successful intervention strategy will require multiple variables. Thus, for larger networks, a multiple-variable approach is required in order to identify more complex intervention strategies while working within the limited view of the network’s state space. Specifically, we address three primary challenges for the large network arena: the first is how to consider many subsets of variables, the second is to design clear methods and measures to identify the best targets for intervention in a systematic way, and the third is to work with an intractable state space through sampling.

Results
We introduce a multiple variable intervention target called a template and show through simulation studies of random networks that these templates are able to identify top intervention targets in increasingly large Boolean networks. We first show that, when other methods show drastic loss in performance, template methods show no significant performance loss between fully explored and partially sampled Boolean state spaces. We also show that, when other methods show a complete inability to produce viable intervention targets in sampled Boolean state spaces, template methods maintain significantly consistent success rates even as state space sizes increase exponentially with larger networks. Finally, we show the utility of the template approach on a real-world Boolean network modeling T-LGL leukemia.

Conclusions
Overall, these results demonstrate how template-based approaches now effectively take over for our previous single variable approaches and produce quality intervention targets in larger networks requiring sampled state spaces.

Electronic supplementary material
The online version of this article (doi:10.1186/s13637-014-0011-4) contains supplementary material, which is available to authorized users.


Robustness of Single Variable Measures
We claim that our small network measures of popularity and power identify targets for intervention in biological interaction networks modeled in the Boolean network framework. Furthermore, we claim that the targets identified by our measures are better than those identified by related approaches for finding key network nodes, including two measures specific to Boolean networks. To justify these claims, we present the following simulation experiment.

Problem Statement and Hypothesis
Are the popularity and power measures superior to related network measures for finding targets for intervention? We hypothesize that the popularity and power measures will lead to desired intervention outcomes with a greater rate of success than other methods and design the following experiment to test this prediction.

Experiment
Abstractly, an intervention should shift the steady behavior of a system to a different (usually more desirable) state. In the context of the Boolean network formalism, this is represented by shifting the steady behavior of the system, represented by an attractor state or cycle, into a different basin of attraction. A successful intervention needs only to shift the state from the starting attractor state into any state in the basin of the goal attractor, as the network dynamics will naturally bring the state to the attractor itself. Thus, we define a successful intervention as a single-variable modification to an attractor state which generates a state in the goal attractor basin.
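The definition above can be illustrated with a minimal sketch. The three-variable network, its update rules, and all function names below are hypothetical and chosen only for illustration, and a synchronous update scheme is assumed; this is not one of the networks used in the experiment.

```python
from itertools import product

def step(state):
    """Synchronous update of a hypothetical 3-variable Boolean network."""
    x0, x1, x2 = state
    return (x1, x0, x0 & x1)

def find_attractor(state):
    """Follow the trajectory until a state repeats; return the cycle reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])

def basins():
    """Map each attractor to the set of states that flow into it."""
    result = {}
    for state in product((0, 1), repeat=3):
        result.setdefault(find_attractor(state), set()).add(state)
    return result

def intervene(state, var, value):
    """Return a copy of `state` with variable `var` forced to `value`."""
    perturbed = list(state)
    perturbed[var] = value
    return tuple(perturbed)

def is_successful(attractor_state, var, value, goal_basin):
    """A single-variable intervention succeeds if it lands in the goal basin."""
    return intervene(attractor_state, var, value) in goal_basin

all_basins = basins()
goal = all_basins[frozenset({(1, 1, 1)})]
print(is_successful((1, 0, 0), 1, 1, goal))  # True: (1, 1, 0) flows to (1, 1, 1)
print(is_successful((0, 0, 0), 0, 1, goal))  # False: (1, 0, 0) stays on the 2-cycle
```

Note that the perturbed state only has to fall somewhere in the goal basin; the network dynamics carry it the rest of the way to the attractor.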

Restrictions and Omitted Interventions
We make a few restrictions on our experiment to ensure a fair outlook on the results. First, attractor states are extremely stable and difficult to break out of with only a single variable perturbation. Nonetheless, we restrict interventions to a single variable to induce a challenge and to simplify the experiment. Second, a "self-intervention" where the source and goal basins of attraction are the same could be considered a "maintenance" intervention if the basin is a desirable one; however, this is a special case at best, and the robustness of attractor states will generate higher intervention success rates, inflating the overall performance. Thus we do not include interventions where the source and goal basins are the same, which also means that we discard networks with only one attractor basin. Third, our artificial intelligence planning work [2] showed that reaching the smallest, rarest basins often requires a series of interventions. Attempting these interventions with a single variable perturbation would unfairly bring scores down, so we omit interventions with a goal basin occupying less than 15% of the total state space.
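The second and third restrictions amount to a filter over (source basin, goal basin) pairs. The following is a minimal sketch of that filter; the function name, the basin labels, and the basin sizes are hypothetical, and it assumes the basin sizes sum to the full state space.

```python
def eligible_interventions(basin_sizes, min_goal_fraction=0.15):
    """List (source, goal) basin pairs that survive the stated restrictions.

    basin_sizes maps a basin label to its number of states (hypothetical input).
    """
    if len(basin_sizes) < 2:
        return []  # networks with a single basin are discarded
    total = sum(basin_sizes.values())
    pairs = []
    for source in basin_sizes:
        for goal, size in basin_sizes.items():
            if goal == source:
                continue  # no "self-interventions"
            if size / total < min_goal_fraction:
                continue  # goal basin too rare for a single perturbation
            pairs.append((source, goal))
    return pairs

# Hypothetical 7-variable network (128 states) with three basins.
sizes = {"A": 100, "B": 20, "C": 8}
print(eligible_interventions(sizes))
```

In this example basin C holds 8 of 128 states (about 6%), so it is never used as a goal, although its attractor states can still serve as starting points.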

Types of Interventions Performed
To create a simple comparison across a variety of approaches, we perform a single-variable perturbation upon the top variable candidate from each measure. If more than one variable shares the top rank for a particular measure, we choose between them randomly. If the measure provides a value for the perturbation, we use the given value; otherwise, we choose a value at random to be shared across all measures requiring one. In this way, if more than one measure perturbs the same variable without specifying a value, they all perturb the variable with the same random value, so that no one method is advantaged within a particular instance.
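One way to read this selection procedure is sketched below. The function names, measure names, and scores are hypothetical; the sketch assumes one random value is drawn per intervention instance and shared by every measure that does not supply its own.

```python
import random

def pick_top(scores, rng):
    """Return the top-ranked variable, breaking ties uniformly at random."""
    best = max(scores.values())
    tied = sorted(var for var, score in scores.items() if score == best)
    return rng.choice(tied)

def choose_perturbations(measures, suggested_values, rng):
    """Pick one (variable, value) perturbation per measure.

    measures: measure name -> {variable: score}
    suggested_values: measure name -> {variable: value}, only for measures
        that specify a value themselves.
    """
    shared_value = rng.randint(0, 1)  # one random value for the whole instance
    plan = {}
    for name, scores in measures.items():
        var = pick_top(scores, rng)
        value = suggested_values.get(name, {}).get(var, shared_value)
        plan[name] = (var, value)
    return plan

rng = random.Random(0)
measures = {
    "degree": {"x": 3, "y": 1},
    "influence": {"x": 0.9, "y": 0.2},
    "popularity": {"x": 0.1, "y": 0.8},
}
suggested = {"popularity": {"x": 1, "y": 0}}
plan = choose_perturbations(measures, suggested, rng)
print(plan["degree"] == plan["influence"])  # True: same variable, same shared value
```

Because degree and influence both select x and neither supplies a value, they receive the identical perturbation, which is the fairness property described above.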
We offer 14 methods for selecting an intervention target for comparison, with some of those being combinations of related measures. These methods are listed in Table 1 and are described here. First we offer our popularity and power measures. Empirical observations have shown us that sometimes the best variable can be identified by a collaboration between related measures. For example, let variable x have the highest popularity score (but a low power score), and let variable y have a high power score (but a low popularity score). Another variable z may not score higher in popularity or power than x or y, but it may have somewhat high values for both. In this case the harmonic mean of popularity and power may reveal z to be the top variable over x or y; thus we include the harmonic mean of popularity and power as the third way to identify an intervention target. These three measures are unique in that they have a specific value for each variable to be used in the intervention. The value used is based on the frequency of ON values in the reduced sum-of-product terms for the particular basin of attraction: if the frequency is greater than or equal to 0.5, a value of 1 is used; otherwise, a value of 0 is used.
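The x, y, z example and the value rule can be made concrete with a short sketch. The scores below are invented for illustration and are not taken from any experiment; the function names are likewise hypothetical.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two non-negative scores (0 if both are 0)."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)

def intervention_value(on_frequency):
    """Value to force: 1 if the ON frequency is at least 0.5, else 0."""
    return 1 if on_frequency >= 0.5 else 0

# Hypothetical (popularity, power) scores for three variables.
scores = {"x": (0.9, 0.1), "y": (0.1, 0.9), "z": (0.6, 0.6)}
combined = {var: harmonic_mean(pop, pw) for var, (pop, pw) in scores.items()}
top = max(combined, key=combined.get)
print(top)  # z
```

Here x and y each score 0.18 under the harmonic mean, while z scores about 0.6, so z becomes the top candidate even though it leads neither individual measure.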
The next six methods for identifying an intervention target are variations of the influence and sensitivity measures, which are designed specifically for Boolean networks [4]. We use influence, sensitivity, and their harmonic mean to identify the best target for intervention. Next, we use basin-specific versions of influence and sensitivity (and their harmonic mean) to offer a basin-specific advantage akin to our popularity and power measures. The remaining methods are related to the original network topology and common methods for identifying seemingly important network variables. First, we use the node with the highest degree, including both inbound and outbound edges, but not double-counting for self-regulation. Next, we use the nodes with the top centroid value, eccentricity value, and betweenness centrality value [3]. We also include an intervention using a randomly chosen target for comparison purposes.
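The degree rule (inbound plus outbound edges, with a self-regulating edge counted once) can be sketched as follows. The edge list and function name are hypothetical, and treating duplicate edges as a single edge is an assumption of this sketch.

```python
from collections import Counter

def total_degree(edges):
    """Inbound plus outbound degree per node; a self-loop counts once.

    edges is an iterable of (regulator, target) pairs (hypothetical input).
    """
    degree = Counter()
    for source, target in set(edges):  # assumption: duplicate edges count once
        degree[source] += 1
        if target != source:
            degree[target] += 1
    return degree

# Hypothetical wiring: a and b regulate each other, a regulates itself, c regulates a.
edges = [("a", "b"), ("b", "a"), ("a", "a"), ("c", "a")]
degree = total_degree(edges)
print(degree.most_common(1))  # [('a', 4)]
```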

Methodology of the Experiment
We use Algorithm 1 to automate our experiments, repeating it for three different small network sizes. Each network is randomly generated using the biologically inspired connectivity probability distribution [1]. Rather than intervening upon randomly selected states, we intervene only upon attractor states. This is because we expect to find the system in a steady state, so it is the logical subset of states upon which to intervene. While some basins have a singleton attractor state, many basins have cyclic attractors with many states. To maximize the efficiency of the simulation, we will attempt an intervention from every attractor state in every basin in the state space to every other basin of significant size. This will allow us to attempt many interventions for each random network generated.
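Algorithm 1 is not reproduced in this section, so the following is only a plausible sketch of the enumeration it describes: every attractor state of every basin is used as a starting point toward every other basin of significant size. The basin data, the function names, and the `choose_target` callback are all hypothetical.

```python
def intervention_trials(basins, min_fraction=0.15):
    """Yield (attractor_state, goal_attractor) pairs to attempt.

    basins maps each attractor (a frozenset of states) to its basin (a set of states).
    """
    total = sum(len(states) for states in basins.values())
    for source_attractor in basins:
        for goal_attractor, goal_states in basins.items():
            if goal_attractor == source_attractor:
                continue  # skip self-interventions
            if len(goal_states) / total < min_fraction:
                continue  # skip very rare goal basins
            for state in sorted(source_attractor):
                yield state, goal_attractor

def success_counts(basins, choose_target):
    """Count successful single-variable interventions for one selection method."""
    successes = trials = 0
    for state, goal in intervention_trials(basins):
        var, value = choose_target(state, goal)
        perturbed = state[:var] + (value,) + state[var + 1:]
        successes += perturbed in basins[goal]
        trials += 1
    return successes, trials

# Hypothetical 3-variable state space with one cyclic and two singleton attractors.
basins = {
    frozenset({(0, 0, 0)}): {(0, 0, 0), (0, 0, 1)},
    frozenset({(0, 1, 0), (1, 0, 0)}): {(0, 1, 0), (1, 0, 0), (0, 1, 1), (1, 0, 1)},
    frozenset({(1, 1, 1)}): {(1, 1, 0), (1, 1, 1)},
}
# Toy selector for illustration only: always flip variable 0.
flip_first = lambda state, goal: (0, 1 - state[0])
print(success_counts(basins, flip_first))  # (4, 8)
```

A cyclic attractor contributes one trial per state on the cycle for each eligible goal basin, which is why a single random network can yield many intervention attempts.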

Results of Experiment
Overall we observed the dominance of the popularity and power measures (along with their harmonic mean) in identifying targets for intervention over other strategies, which is shown in Table 1 and in Fig. 1. The requested interventions were increasingly difficult with network size due to their restriction to a single variable. The benefits of counting easy "self-interventions" (i.e. staying in the same basin) were not included, and the detriments of attempting interventions to very rare basins were also omitted to get a fair idea of performance. The influence measure showed the best behavior after the popularity and power measures, and the topological measures performed poorly.
On the smaller networks, our measures dominate with a great deal of separation in performance. As the network size increases, however, the performance of the various methods begins to converge. This is the primary motivation for our introduction of template-based approaches. We note that, besides our methods, the Boolean-network-specific measure of influence (and its basin-specific counterpart) performs the best on average of the related measures.
Sensitivity is a measure such that variables with low sensitivity can maintain their states better than others with higher sensitivity [5]. However, this attribute does not imply the potential for regulatory control as the influence measure does. Thus it is not surprising to see influence dominate the sensitivity measure. It has been said that a variable with a low sensitivity but a high influence would make the best intervention target [5]; however, while our results certainly show that the harmonic mean of influence and (low) sensitivity improves over sensitivity alone, the combination of the two was not able to outperform influence alone in either the whole-network or basin-specific cases. Of the topological measures, only node degree and betweenness exceeded, on average, the performance of a random intervention target. Centroid and eccentricity failed completely, oftentimes being outperformed by a random intervention.
Intervention success counts and numbers of trials were used to calculate confidence intervals according to the Binomial test, since our interventions qualified as Bernoulli trials. While we can visually see separation in the figure and elevated average and individual performances, the confidence intervals of our small network measures did indeed overlap with some of the best-scoring related measures. However, while topological measures can certainly lend insight at the top level of networks, the underlying complexities of the Boolean model are clearly better elucidated by measures designed specifically for the Boolean model. Of those, the robustness of the popularity and power measures is clearly demonstrated, despite their inability to separate statistically in the given trial. We hypothesize that if the interventions were to begin from random states, and not just from attractor states, we would gain the separation needed for statistical significance; however, our motivation to move to larger networks and introduce template-based approaches renders this point moot.
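The exact interval construction used for the reported results is not specified here. As a minimal sketch of how a confidence interval for a binomial success rate can be computed and compared, the code below uses the Wilson score interval, a standard approximation that is swapped in for illustration and is not necessarily the authors' procedure. The counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half

def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical counts for two methods over the same number of trials.
method_a = wilson_interval(50, 100)
method_b = wilson_interval(45, 100)
print(intervals_overlap(method_a, method_b))  # True: the two methods do not separate
```

Overlapping intervals, as in this example, mean that the observed difference in success rates cannot be called statistically significant at the chosen confidence level.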

Conclusions
We believe the dominating success of the popularity and power measures is due to the inherent inclusion of a specific value to use in the intervention, and to the fact that both the target variables and their values are specific to the goal basin. In each intervention instance, both the target and the value were chosen from the popularity and power measures specific to the goal basin. These advantages, absent in other methods, clearly resulted in the higher performance. It is also not surprising to see the influence measures appearing next in performance after our measures, for two reasons. First, like our measures, they are designed within, and thus take advantage of, the Boolean network model. Second, the basin-specific influence we designed to add a degree of fairness demonstrated a benefit from that modification. The topological measures, without either basin or value specificity, naturally performed the most poorly.
Despite a lack of complete separation of confidence intervals, and based on the consistent top performance of the popularity and power measures across various network sizes and over thousands of interventions, we fail to reject our hypothesis and conclude that the popularity and power measures lead to desired intervention outcomes with a greater rate of success than other topological, centrality-based, and even Boolean-network-specific methods.

Table 1: Single Variable Network Measures Compared: We have shown the ability of the single variable measures of popularity and power to identify intervention targets missed by other related measures. In this table we report average success estimates for intervening with a single variable chosen by a variety of methods and their combinations. Random networks with between 7 and 16 variables were generated and those networks with between 2 and 7 basins of attraction were intervened upon. Overall, we observed popularity, power, and their harmonic mean as supplying the most effective intervention targets. After these, the Boolean-network-specific measures of influence and sensitivity performed best, especially their basin-specific versions invented for this study. Thus we conclude that popularity, power and the harmonic mean of popularity and power produce the best single variable intervention targets.

Table 1. We observe the dominance of popularity, power, and their harmonic mean over other related measures.