Plan
Comptes Rendus

Biological modelling / Biomodélisation
Bio-array images processing and genetic networks modelling
[Traitement d'images bio-array et modélisation de réseaux génétiques]
Comptes Rendus. Biologies, Volume 326 (2003) no. 5, pp. 487-500.

Résumés

The new tools available for gene expression studies are essentially the bio-array methods using a large variety of physical detectors (isotopes, fluorescent markers, ultrasounds...). Here we present first rapidly an image-processing method independent of the detector type, dealing with the noise and with the peaks overlapping, the peaks revealing the detector activity (isotopic in the presented example), correlated with the gene expression. After this primary step of bio-array image processing, we can extract information about causal influence (activation or inhibition) a gene can exert on other genes, leading to clusters of genes co-expression in which we extract an interaction matrix M and an associated interaction graph G explaining the genetic regulatory dynamics correlated to the studied tissue function. We give two examples of such interaction matrices and graphs (the flowering genetic regulatory network of Arabidopsis thaliana and the lytic/lysogenic operon of the phage Mu) and after some theoretical rigorous results recently obtained concerning the asymptotic states generated by the genetic networks having a given interaction matrix and reciprocally concerning the minimal (in the sense of having a minimal number of non-zero coefficients) matrices having given stationary stable states.

Cet article décrit d'abord rapidement quels sont les nouveaux outils utilisés pour étudier l'expression des gènes, essentiellement les bio-arrays, qui mettent en œuvre un grand nombre de détecteurs physiques (isotopes, marqueurs fluorescents, ultra-sons...). Nous présentons une méthode de traitement d'images indépendante du type de détecteur, traitant le problème du bruit et des superpositions de pics, ces derniers révélant l'activité du détecteur (isotopique dans le cas choisi ici) corrélée avec l'expression des gènes correspondants. Après ce premier stade de traitement d'images bio-array, on peut extraire l'information relative à l'influence (activation ou inhibition) qu'un gène peut exercer sur les autres gènes, conduisant ainsi à l'apparition de groupes de co-expression, d'où l'on peut extraire une matrice d'interaction M et un graphe d'interaction associé G, susceptibles d'expliquer la dynamique de la régulation génétique corrélée avec la fonction tissulaire associée. Nous donnons quelques exemples de telles matrices et de tels graphes d'interaction (en particulier dans le cas du réseau de régulation génétique de la floraison d'Arabidopsis thaliana et dans celui de l'opéron lytique/lysogénique du phage Mu), et ensuite quelques résultats théoriques rigoureux récemment obtenus sur les états asymptotiques générés par des réseaux génétiques ayant une matrice d'interaction donnée. Réciproquement, nous décrirons les matrices minimales (au sens du nombre de leurs coefficients non nuls) ayant des états stationnaires stables donnés.

Métadonnées
Reçu le :
Accepté le :
Publié le :
DOI : 10.1016/S1631-0691(03)00114-8
Keywords: bio-array images, genetic network, inter-genic interaction matrix, positive loops, operons
Mot clés : images bio-array, réseau génétique, matrice d'interaction inter-génique, boucles positives, opérons

Jacques Demongeot 1 ; Florence Thuderoz 1 ; Thierry Pascal Baum 1 ; François Berger 2 ; Olivier Cohen 1

1 TIMC–IMAG, CNRS 5525, Faculty of Medicine, 38700 La Tronche, France
2 INSERM U 318, University Hospital of Grenoble, 38700 La Tronche, France
@article{CRBIOL_2003__326_5_487_0,
     author = {Jacques Demongeot and Florence Thuderoz and Thierry Pascal Baum and Fran\c{c}ois Berger and Olivier Cohen},
     title = {Bio-array images processing and genetic networks modelling},
     journal = {Comptes Rendus. Biologies},
     pages = {487--500},
     publisher = {Elsevier},
     volume = {326},
     number = {5},
     year = {2003},
     doi = {10.1016/S1631-0691(03)00114-8},
     language = {en},
}
TY  - JOUR
AU  - Jacques Demongeot
AU  - Florence Thuderoz
AU  - Thierry Pascal Baum
AU  - François Berger
AU  - Olivier Cohen
TI  - Bio-array images processing and genetic networks modelling
JO  - Comptes Rendus. Biologies
PY  - 2003
SP  - 487
EP  - 500
VL  - 326
IS  - 5
PB  - Elsevier
DO  - 10.1016/S1631-0691(03)00114-8
LA  - en
ID  - CRBIOL_2003__326_5_487_0
ER  - 
%0 Journal Article
%A Jacques Demongeot
%A Florence Thuderoz
%A Thierry Pascal Baum
%A François Berger
%A Olivier Cohen
%T Bio-array images processing and genetic networks modelling
%J Comptes Rendus. Biologies
%D 2003
%P 487-500
%V 326
%N 5
%I Elsevier
%R 10.1016/S1631-0691(03)00114-8
%G en
%F CRBIOL_2003__326_5_487_0
Jacques Demongeot; Florence Thuderoz; Thierry Pascal Baum; François Berger; Olivier Cohen. Bio-array images processing and genetic networks modelling. Comptes Rendus. Biologies, Volume 326 (2003) no. 5, pp. 487-500. doi : 10.1016/S1631-0691(03)00114-8. https://comptes-rendus.academie-sciences.fr/biologies/articles/10.1016/S1631-0691(03)00114-8/

Version originale du texte intégral

1 Introduction

The total mRNAs of genes to test are extracted from the studied tissue (in the present case a glioblastome tissue). DNAs are synthesized by reverse transcription from these mRNAs including bases labelled with the isotope P33. Resulting DNAs are then tested against identified complementary DNAs (cDNA targets), previously amplified by PCR and fixed on a nylon gel. The hybridisation results are revealed in a phospho-imager and yield a digital image coming from the radioactive hybridisation plate, called the bio-array image or shortly the bio-image. cDNA hybridised with a P33DNA means that the complementary sequence of the P33DNA is present in the related mRNA, proving that the corresponding gene is expressed in the studied tissue.

2 Peak segmentation

The first encountered problem is the fact that the bio-images are extremely noisy and that we have to low-pass them in order to suppress the high-frequency noise. The result of this pre-treatment (Fig. 1) is a better separation of the isotopic activity peaks, allowing a watershed separation and contouring [1,2], but it often leads to over-estimating the peak activity.

Fig. 1

Raw data (left), watershed segmenting (top right), and contouring (bottom right).

Then we will apply a more accurate segmentation and contouring method called the potential-Hamiltonian or ‘Gaussian stamping’ method: let us remark that the peaks are about Gaussian, with a relatively weak kurtosis and skewness allowing in particular the respect of the conservation ‘law’: 2/3 of the peak activity are concentrated into the set of points (x,y) where the Gaussian curvature H(x,y) vanishes, i.e. inside the maximum gradient line of the peak. By exploiting this property, it is possible to neglect the part of the peak outside the projection of this remarkable line, called in the following the characteristic line, its equation being:

H(x,y)=2g/x22g/y2-(2g/xy)2=0
where g(x,y) is the grey function at the pixel (x,y).

We are thus led to consider the new grey function H(x,y) instead of the function g(x,y) and its level line H(x,y)=0. We display after a plane differential system of which the characteristic line is a limit cycle. Let H′(x,y) be the function defined by: H′(x,y)=|H(x,y)|. Vanishing of H′(x,y) occurs on the characteristic line (see Fig. 2 for the visualization of g and H′) and if we consider the following crude system:

dx/dt=-αH'/x+βH'/y
dy/dt=-αH'/y-βH'/x
where α and β are real parameters, then the first part of this differential system is of steepest descent potential nature and along this flow, the orbits converge to the set of zeros of H′(x,y), on which the second part of convective Hamiltonian type becomes preponderant [1]. Parameters α and β are used to tune the speed of convergence to the limit cycle. To cope with random noise and numeric instabilities, we modify slightly the system into:
dx/dt=-αH'/xH(x,y)/G(x,y)+βH'/y
dy/dt=-αH'/yH(x,y)/G(x,y)-βH'/x, where G(x,y)= grad (g)2
The added term H(x,y)/G(x,y) speeds up the descent to the vanishing of H(x,y) and forces the stability. The usual discretization of Runge–Kutta yields ultimately the algorithm, which is quite easy to implement. On each pixel (i,j) – boundary effects being neglected –, the function H(i,j) reads:
H(i,j)=g(i+2,j)-2g(i+1,j)+g(i,j)g(i,j+2)-2g(i,j+1)+g(i,j)-g(i+1,j+1)-g(i,j+1)-g(i+1,j)+g(i,j)2
We have seen an important property of the characteristic line, i.e. in the case of a Gaussian peak, it delimits a volume equal to 2/3 of the total volume of the peak. This property remains about exact in case of kurtosis and skewness of the peak. Hence by multiplying by 3/2 this volume, we get a good estimation of the gene activity and this value is better than that obtained by a watershed method due to over-segmentation (Fig. 1). This approach is interesting because the lower part of the peak is often noisy. The method seems particularly efficient when the peaks are well separated. If they are close (Fig. 3), then we need to tune the parameters α and β (Fig. 4). In further developments of the method, we look for a dynamical calculation of these parameters from the data. Finally, we can standardize the estimated activity in terms of a bio-image with small squares symbolizing in grey levels the degree of hybridisation of the cDNAs (Fig. 5). From such bio-images acquired at different times of the cell cycle in cells from the same tissue, we can study the interactions between genes by estimating an interaction matrix.

Fig. 2

Representation of g, G, H′ (with indication of potential p and Hamiltonian h parts) for a Gaussian peak and the result of the Hamiltonian segmentation (below left).

Fig. 3

g (top) and H′ (bottom) for close Gaussian peaks.

Fig. 4

Treated bio-image (left), succeeding limit cycle (middle) and false contour (right).

Fig. 5

Standardized bio-image (the treated sub-image of Fig. 4 is inside the black rectangle).

3 Interaction matrix M

A major problem a geneticist has presently to face since the introduction of the bio-array imaging is the estimation of the intergenic interaction matrix M that rules the observed genes expression in operons and genetic regulatory networks [3–7]. This interaction matrix is similar to the synaptic weight matrix, which rules the relationships between neurons in a neural network. Hence, it is in general of a great biological interest and relevance to determine matrices having characteristics like: (i) a minimal number of non-zero coefficients for a given set of stationary behaviours (fixed points or cycles), (ii) a minimal number of positive or negative circuits, controlling the number of attractors and their stability (cf. [8–21] for both the discrete and the continuous case). In this paper, we give some general results about the relationships between the positive and negative circuits in the graph of the interaction matrix M and the existence of fixed points. This permits us to characterize minimal matrices, given dynamical behaviours, and therefore partly solve the first problem. Finally, we constructed a bound for the number of fixed points in terms of the number of positive circuits in the graph of the interaction matrix M. So we partly solve the second problem too.

In general, it is very difficult to have exhaustively the interaction matrices: in the genetic literature and also by observing co-expressions through bio-arrays imaging, it is possible to qualitatively or even quantitatively estimate the inhibitory (in case of repression by a protein obtained from the expression of a gene) or activatory (in case of induction or promotion) coefficients of the interaction matrix. If we have no information, we can randomly choose the matrix by respecting certain basic rules, e.g., by respecting certain proportions of activatory or inhibitory interactions. We can, for example, obtain the location density of expressed ubiquitory genes (calculated from http://www.citi2.fr/GENATLAS/) and then randomly simulate the interaction matrix and the initial conditions of the gene expression, by sampling them 100 000 times, the interaction matrices respecting the constraint to have 10% (resp. 10%) of negative (resp. positive) interactions, like in the Arabidopsis thaliana genome [3]. Fig. 6 below gives the distribution of the co-expression of the ubiquitory genes calculated from the expected stationary behaviour corresponding to a random choice of the interaction matrix and of the initial conditions: we have systematically calculated attractors (fixed points or limit cycles) corresponding to an initial condition and an interaction matrix, and then we have calculated the frequency of observing the expression of each ubiquitory gene in these attractors. In the absence of complementary information about the localization of the inhibitory or activatory interactions between ubiquitory genes, the obtained co-expression distribution is just a reflect of the spatial distribution of these ubiquitory (domestic or housekeeping) genes along the human chromosome 3 showing that it is related to the rearrangements (due to physiological crossing-over or to pathological translocations) distributions, this parenthood being proved by the analogy between their signatures—i.e. their succession of monotony intervals of increase (+) or decrease (−)—as shown in Table 1.

Fig. 6

Distributions of domestic (bottom) and all genes (middle right), crossing-over (male red and female blue top) and translocations (middle left) along chromosome 3.

Table 1

Second column shows the distribution signatures and third the number of differences (− or + being not taken different from =) between a given distribution and the crossing-over one as reference

Ubiquitory genes distribution + + = + = = + + = + + + + + 3/21
All genes distribution + = + + + + = + + = + + + 9/21
Crossing-over distribution = = = = + + = + + = = = + + + 0/21
Translocation distribution = + = = + + = = + + = = + + 3/21

By comparing the distribution signatures given in Table 1 above, it is easy to prove that we must reject the hypothesis that ubiquitory genes and translocation distributions are different from the crossing-over distribution (p<0.001), but we cannot reject the hypothesis of a difference between all genes and the crossing-over distribution.

The general coefficient mij of the interaction matrix M is equal to +1 if the gene Gj activates the gene Gi, equal to −1 if the gene Gj inhibits the gene Gi and equal to 0 if Gj and Gi have no interaction, Gi being equal to +1 (resp. −1), if it is (resp. not) expressed. Then the change of state xi of the gene Gi between t and t+1 obeys a threshold rule: xi(t+1)=H(∑k=1,nmikxk(t)−bi) or x(t+1)=H(Mx(t)-b), where H is the sign-step function (H(y)=1, if y⩾0 and H(y)=−1, if y<0) and the bis are threshold values. In the case of small regulatory genetic systems (the smallest being called operons), the knowledge of such a matrix M permits to make explicit all possible stationary behaviours of the organisms having the corresponding genome: for example, in the genetic regulatory network that rules the Arabidopsis thaliana flower morphogenesis (Fig. 7), the interaction matrix is a (11,11)-matrix, with only 22 non-zero coefficients. This matrix presents a certain number of positive or negative circuits and only four observed attractors [3].

Fig. 7

Interaction graph of the flowering genetic regulatory network of Arabidopsis thaliana (right) and an attractor of its Boolean dynamics (left).

For each operon, we can define an interaction matrix M, which just expresses that if its coefficient mij is positive (resp. negative), the gene j is a promoter or activator (resp. repressor or inhibitor) of the gene i. If mij is null, then the gene Gj has no influence on the expression of the gene Gi. The interaction graph can be built from the interaction matrix M by drawing an edge + (resp. −) between the vertices representing the genes j and i, if mij>0 (resp.<0). In order to calculate the mijs, we can either determine the s-directional correlation ρij(s) between the state vector {xj(t-s)}tC,ts of gene j at time ts and the state vector {xj(t)}tC,ts of gene i at times t,t varying during the cell cycle C, or identify the system with a Boolean neural network.

We define the connectivity K(M) of the interaction matrix M by the ratio between the numbers m of edges of the interaction graph and n of vertices: in general, for known operons and genetic regulatory networks (lactose operon [5,6], Cro operon of the phage λ, lysogenic/lytic operon [4,7] of the phage infecting E. coli, gastrulation regulatory network...), K(M) is between 1.5 and 3. The observed induction proportion (number of positive edges divided by m) is between 1/3 and 1/2. If mijs are unknown, we can take them randomly by respecting connectivity and induction proportion.

4 Genetic networks dynamics

If we consider the interaction graph of the flowering genetic regulatory network of Arabidopsis thaliana (Fig. 7) [3], then we can easily define from it a Boolean dynamics with threshold 0: the gene i has the state 1 if it is expressed and −1 if not. The change of state of gene i between the times t and t+1 obeys a majority rule, i.e. we calculate the numbers of its neighbours in state 1 with positive interaction and with negative interaction: if these two numbers are equal, then the new state of i is 1; if the activatory (resp. inhibitory) neighbours dominate, then the new state of i is 1 (resp. −1). When the time t is increasing, the configuration of gene states reaches a stable set of configuration (either a fixed configuration or a cycle of configurations), called an attractor of the genetic network dynamics. In Fig. 7 (left), an example of such an attractor is given, with final states (in black boxes) different from the initial conditions.

We will present now first some qualitative results from the human genome observation, and after some theoretical corresponding statements recently proved:

  • – in 1949, Delbrück [17] conjectured that the presence of positive loops (i.e. paths from a gene i to itself having an even number of inhibitions [11]) in the interaction graph was a necessary condition for the cell differentiation; this conjecture has been more precisely written in a good mathematical context by Thomas in 1980 [16];
  • – in 1993, Kauffman [22] conjectured that the mean number of attractors for a Boolean genetic network with n genes and with connectivity 2, was equal to √n. This conjecture is supported by real observations: we have about 30 000 genes in the human genome and about 200 different tissues, which can be considered as different attractors of the same dynamics. For Arabidopsis thaliana, K(M)=22/11=2 and there is 4≈√11 different tissues (sepals, petals, stamens, carpels) [3] and for Cro operon [18] of phage λ,K(M)=14/5=2.8 and there is 2≈√5 observed (lytic and lysogenic) attractors.

Recently [8–15], these conjectures have been partially proved.

Proposition 1

If all loops of the interaction graph are positive, then there exists a state vector x=(x1,…,xn) in {−1,1}n, such that x and −x=(−x1,…,−xn) are fixed configurations of the network dynamics.

Proposition 2

If all loops of the interaction graph are negative, then there is no fixed configuration.

Proposition 3

Let a network having n genes and n interactions, then a necessary and sufficient condition of existence of a fixed configuration x is the existence of a positive loop and −x is also fixed.

Proposition 4

Given a state vector x, the set of minimal matrices M having x as fixed configuration is given by the following conditions:

  • (1) m ij =a ij xixj, where aij⩾0 and, for all i, there exists j(i) such that aij(i)>0;
  • (2) aij(i)<biaij(i), where aijs and bis are weights and thresholds of the genetic network [14].

Proposition 5

If m is the total number of positive loops C, then the number of fixed configurations is less than or equal to 2m, and this upper bound is reached, if and only if for any positive loop C, there is no edge (x,y)∣x∉C, y∈C.

Proposition 6

If the genetic network has n genes and 2n interactions, then the expectation of the number of its fixed configurations is √n, if n is sufficiently large [15].

Proposition 7

If the interaction graph G is a connected digraph without loops having n genes and let suppose that, for any i,bi > 0 and all mijs are positive and verify either (∑jmijbi and, for any k, ∑jkmij<bi)=AND rule, or (for each j, mijbi)=OR rule, then the number of fixed configurations is 2(n−1)/2 for n odd, and 2(n−2)/2+1 for n even [14].

Numerous applications of the results above are possible in various regulatory systems [23–40], but we will focus here in the following on a very simple operon governing the choice between the lytic and lysogenic stationary states for Escherichia coli infected by the bacteriophage Mu.

5 An example of operon: the lytic/lysogenic operon of the bacteriophage Mu

Understanding the behaviour of the tempered bacteriophage Mu, considered as a transposon, constitutes a real progress in the knowledge of transposition mechanisms [41].

A Boolean model of the interactions between the expression products of the Mu genome shows the necessity for the removal of the auto-inhibition of the protein Ner (negative loop), which allows the constitutive expression of the phage promoter Pe (positive loop) to demonstrate a bi-stability [11,42]. The presence of the prophage state instability is necessary for the induction of prophages. A review of Escherichia coli factors influencing the Mu behaviour allowed us to propose the Integration Host Factor (IHF) and the Inversion Stimulating Factor (ISF) as being responsible for the removal of auto-inhibition and for the prophagic state instability. The modelling of their effects by a differential system clearly shows the same lytic/lysogenic proportions than those experimentally obtained during the exponential bacterial growth phase and the stationary bacterial growth phase.

These encouraging results demonstrate the necessity of quantitative measures for lytic/lysogenic proportions and intra-cellular concentrations of different Escherichia coli factors during the entire length of the growth cycle, in order to better understand the induction context of the bacteriophage Mu.

The bacteriophage Mu has been discovered in 1963 by L. Taylor [43]. During the infection step, Mu incorporates its DNA and proteins at random locations in the host bacterial chromosome [44]. That induces mutations and new auxotrophies and Mu belongs to the large family of transposable elements [45]. There is two developmental cycles for Mu: after integration in the bacterial chromosome, the Mu DNA is multiplied by a series of replicative transpositions, giving between 50 and 100 new viruses after lysis of the bacterial host (lytic cycle). A weak proportion of infected bacteria becomes lysogenic, the Mu DNA remaining inactive (lysogenic cycle). The induction rate represents the proportion of such lysogenic bacteria entering in the lytic cycle.

Mu infects enterobacteria and in particular E. coli [46] and the integration of its genome into the chromosome of these bacteria involves the phagic transposase pA and its activator pB [47] (cf. Fig. 8). In general, this integrated DNA is amplified through a series of replicative transpositions [48] (involving the transposase pA and its activator pB), then the late phagic functions are synthesized, leading to assembling numerous viral particles dispatched in the extra-cellular medium after the lysis of the host cell. Among a population of infected bacteria, a small proportion will give birth to lysogenic cells, in which the phagic DNA remains passive; this DNA can be replicated during the further mitoses of the lysogenic cell or can enter in a lytic cycle. After integration, the choice between the two possible developmental cycles (lytic or lysogenic) is regulated at the levels of transcription and replicative transposition of the Mu genome.

  • Transcriptional regulation. The viral DNA has a size of 37 kb [49] and possesses two promoters Pe and Pc constitutively expressed [50] (Fig. 8A). The early transcript from Pe codes for the transposase pA, for its activator pB and for a protein Mor activating (via the promoter P and the protein C [51,52]) the cascade of the late functions of the lytic cycle. The transcript coming from Pc codes for the repressor c preferentially fixed on the operators O1 and O2 (Fig. 8B) for repressing Pe and stabilizing the Mu DNA in its inactive form (the prophage DNA). At high concentration, c is also fixed on O3, repressing Pc and hence regulating its own synthesis [53]. The Ner protein is fixed between Pe and Pc and inhibits the transcription from these two promoters [47,54].
  • Transpositional regulation. During the transposition, the transposase pA transiently fix the extremities of the Mu DNA, attL and attR, and the operators O1 and O2 in stabilizing a tetrameric structure, which leads the Mu DNA to a specific configuration, the transposome. The repressor c binds also the extremities of the Mu DNA with a weak affinity [55], entering in competition with pA and then impeaching it to bind the two operators and the two extremities [56]. Hence repressor c inhibits the lytic cycle both at the transcriptional and transpositional levels, and the transposase pA being physically and functionally unstable in vivo [57], the viral genome amplification implies a non-transient expression of the promoter Pe.

Fig. 8

Simplified scheme for the viral DNA of Mu; (A) shows different genes in transcriptional units and the actions of their expression products on the promoters' activity. The fixation sites of the different phagic (e.g., Ner on O2) or bacterial (e.g., IHF on O1) proteic factors are represented in (B).

5.1 Discrete logical (Boolean) model

Experiments about the phage Mu are done during the exponential growth phase of the bacteria for increasing the homogeneity of the observed bacterial states. The lysogenic frequency is highly depending on the used protocol (1% to 50%). We will base our model on lysogenic induction rates observed in exponential phase (0.001%) and in stationary phase (about 100%) (cts 62, temperature 30 °C). For interpreting these experimental data, we used first Thomas' logical approach [16,18,20,21], in which x represents the repressor c and the entity y denotes the group of proteins Pe, Ner, pA, pB, Kill and Mor; their interactions can be designed as in Fig. 9.

Fig. 9

Simplified scheme of the bacteriophage Mu operon (the blue arrow symbolizes the presence of a positive loop).

Variables x and y take values 1 (expression/presence in the cell) or 0 (non-expression/absence from the cell). With this notation, the state Ex, x=1 and y=0 corresponds to the lysogenic state and the state Ey, x=0 and y=1 to the lytic cycle. Let us notice that it is not necessary to explicitly represent the auto-inhibition of the repressor c, because c inhibits Pe and auto-inhibits itself at high concentrations, hence maintaining its concentration between two thresholds θ1 and θ2:

Below θ1 (e.g., x=0), the repressor c has insufficient concentration to be fixed to the operators O1 and O2, hence x does not inhibit y. Between θ1 and θ2, x inhibits y but is not sufficient for fixing O3; hence there is no auto-inhibition of x. Above θ2, x inhibits y and inhibits itself until going to a level less than θ2. Because this negative loop causes a homeostasis, we consider for x only the values x=0 and x=1, and the self-inhibition is not explicit in the model.

Because the two promoters Pe and Pc are constitutively expressed, x takes the value 1 if it is not inhibited by y: if y=0, then x=1 and if y=1, then x=0. Reciprocally, if x=0, y=1 and if x=1, y=0, leading to the following logic equations:

X=¬y
Y=¬x׬y
where x and y represent the values of the two logic variables X and Y.

We consider the four possible initial states Ex (x=1, y=0), Ey (x=0, y=1), Ez (x=0, y=0) and Et (x=1, y=1); if two variables ‘enlightened’ tend to become ‘off’ (noted 1) or conversely if two variables ‘off’ tend to become ‘on’ (noted 0+), they are not changing at the same time. For example, 0+0+ gives 10 or 01. The behaviour of this system (cf. Fig. 10A) has only a stable stationary state (sss) Ex (x=1, y=0), which corresponds to the lysogenic state. But Mu is a bi-stable system and we have to put another information in the model to get a second sss.

Fig. 10

Dynamical behaviour of Mu alone (A) and with a factor cancelling auto-inhibition of y (B). The stationary steady states are represented in bold and superscripts are corresponding to state changes.

For rendering stable the lytic state Ey (x=0, y=1), we need an activation of y cancelling its self-inhibition proportionally to its concentration [11]. In stationary phase, the proportion of lysogenes is near 100%, despite the fact that c is still present. Then we need two factors participating either to the maintenance of the passive phage state, or to the lytic cycle induction (both at the transcriptional and transpositional levels). A review of such possible cellular factors leads to the following list.

  • – IHF (Integration Host Factor). IHF is a histone which presents a high affinity for the operator part of the Mu DNA (Fig. 8B), increasing its curvature, hence facilitating the fixation of the repressor c on the operators [58,59] and activating the transcription from the promoter Pe passing over the retro-inhibition by Ner, if c is absent [60]. The absence of IHF impeaches any significative production of phage Mu [61]. The cell concentration of IHF is inversely proportional to the mitosis rate (from 12 000 mol/cell in exponential phase to 52 000 mol/cell in stationary phase [61]).
  • – HU (Histone U). It causes the same effects than IHF on transcription and transposition [62], but its concentration does not vary following the bacterial growth rate [61].
  • – ISF (Inversion Stimulating Factor). ISF helps c in repressing the lytic cycle: it increases the transcriptional repression and inhibits the transposition [63]; its absence multiplies by about 600 until 800 times the viral particles production in vivo [64]. Cell concentration of ISF is maximal at the start of the exponential growth phase (80 000 mol/cell) and decreases until a rate zero when the growth diminishes [65,66].
  • – 8-proteins system. For its DNA replication Mu depends also on host enzymes, from which eight have been identified [67] and for the degradation of the transposase pA and catabolism of c on 2 proteases [41].

5.2 Continuous differential model

IHF is the only bacterial factor activating the transcription of Pe in absence of c. For 01 (Ey) being a stable stationary state (sss), the activation of Pe by IHF has to equilibrate the repression by Ner, leaving the constitutive expression of Pe to run. In stationary growth phase, the induction of prophages is total, despite the presence of a measurable activity of c. The factor ISF helping the action of the repressor c, present in exponential phase but absent in stationary phase, is responsible for that behaviour. Let us now consider a continuous differential model, where the variables are the four states previously considered (Ex, Ey, Et and Ez), IHF and ISF being parameters whose value changes can favour the passage from a state where one of these variables dominates to another state where it is another dominating. The differential system can be written as:

dEx/dt=-k3Ex/ ISF +(1/3+ ISF /k2)Et+(1/3- IHF /k1)Ez
dEy/dt=-k4Ey/ IHF +(1/3- ISF /k2)Et+(1/3+ IHF /k1)Ez
dEt/dt=-Et+Ez/3
dEz/dt=k3Ex/ ISF +k4Ey/ IHF +Et/3-Ez
We assume from experimental data that k3/k4=k2/k1 ∼6, where k4 and k1 are fixed in such a way that Ey(∞)=105 and Ez(∞)=1 (stationary phase condition). Then the non-zero steady state with Ez(∞)=1 verifies: Et(∞)=1/3, Ex(∞)=ISF(4/9−IHF/k1+ISF/3k2)/k3, Ey(∞)=IHF(4/9+IHF/k1−ISF/3k2)/k4.

5.3 Results

We have simulated (Mapple Runge–Kutta 4 with constant steps) the dynamical behaviour of 100 000 phages Mu initially in lysogenic state (Ex(0)=105, and Ey(0)=Ez(0)=Et(0)=0), in exponential growth phase ( IHF =12000 mol/cell, ISF =80000 mol/cell) and in stationary growth phase ( IHF =52000 mol/cell, ISF=1 mol/cell) with parameters values equal to k1=100, k2=700, k3=1800, k4=300.

In stationary phase (Fig. 11), the lytic state (Ey) is increasing and reaching a plateau of 100 000 lytic cells, which well corresponds to the experimental data and give an estimation of the duration of the time unit (0.004∼10 days).

Fig. 11

Simulation of the behaviour of 100 000 Mu lysogenes when bacteria are in stationary phase.

In exponential phase (Fig. 12) the lysogenic state (Ex) is practically constant with only a loss of 155 bacteria after 200 days.

Fig. 12

Simulation of the behaviour of 100 000 Mu lysogenes when bacteria are in exponential phase.

To estimate the robustness of these dynamical behaviours with respect to the values of the k1 and k4 parameters, we show that in exponential phase k4 has no incidence and k1 can vary between 10 and 5000 (Fig. 13), without changing the agreement between predicted and observed dynamical behaviours.

Fig. 13

Evolution of the asymptotic values of Ex and Ey in the exponential phase with respect to the k1 values.

The stability study in the neighbourhood of the stationary states is done by calculating the eigenvalues of the Jacobian matrix of the differential system above: 0 is always an eigenvalue that causes an asymptotic instability (but the system is trajectorially Lyapunov-stable) and, in stationary phase, among all other eigenvalues, two are complex, with negative real part, whereas one is real negative; in exponential phase, two are real negative and one is real positive.

5.4 Discussion

The Boolean model has shown the necessity to suppress the self-inhibition by Ner in order to get a bi-stability. The continuous version explains in the lytic phase the reaching a plateau behaviour for the 01 state after 10 days and, in the lysogenic phase, the long transient of the 10 state (1% of decrease after 3 months). In order to get a more predictable model, we need measures of the proportion lytic/lysogenic bacteria on a wider interval of growth rates (the local growth rate R(t) coming from a logistic or from a Monod saturation model [68]). Then the system would become non-linear with ISF and IHF being proportionally respectively to R(t) and dR(t)/dt.

6 Open problems

A frequent criticism made about the Boolean models is their unrealistic number of levels (two); it is possible to easily relax this constraint by considering either a multi-threshold network (for which most of the results of Section 4 are still available), or a non-linear automaton, like the following:

xi(t+1)= sup -1, inf 1,xi(t)- sg jm ij 1+xj(t)1-xj(t)+ sg jm ij 1-xj(t)+ sg jm ij 1+xj(t)
where sg(y)=1, if y>0, sg(y)=0, if y=0 and sg(y)=−1, if y<0, or, like in Section 5, a differentiable system. We can also merge the Boolean and the differentiable approach in a hybrid system [69,70].

Another perspective exists for the interaction matrices introduced above, i.e. the ability to calculate the barycentre between two matrices by using classical (spectral or L2) distances between matrices, then to build phylogenic trees among a set of species avoiding the complex problems coming from the non-unicity of L1 (Hamming or Manhattan) barycentres met in the sequence based phylogenic trees. The interaction-based phylogenic trees could reflect more the genomic function than the genomic anatomy and hence could explain more deeply the evolution trends, e.g., by distinguishing the evolution of domestic genetic regulatory networks (involved in rearrangements [12,13]) and that of high-level genetic regulatory networks (corresponding for example to brain signalling or hormonal regulatory systems), the complexity of their interaction matrices being probably of the same order of magnitude than the complexity of the metabolic interaction matrices corresponding to their expressed proteins (carriers, receptors or enzymes) [71,72].

A last important open problem concerns the relationship between the number F of fixed configurations and the number S of interaction loops of the interaction matrix M: the problem is in fact to find the best upper bound for F given an interaction matrix M. This question is the discrete translation of the famous 16th Hilbert's problem of determining an efficient upper bound for the number of limit cycles of a polynomial differential system. Let us summarize the role of the architecture of positive and negative (with an odd number of inhibitions) loops of M on the occurrence of multiple stationary behaviours as obtained in [8–15]: if the number of genes and the number of interactions are the same, there is only one isolated loop in M and either this loop is negative and the lowest bound (0) for F is reached, or this loop is positive and the upper bound (21) for F is reached. If the numbers of genes and interactions are respectively n and n+1, there are two interaction loops with the following structure; if both loops are negative, F=0; if there are a positive loop and a negative loop disjoint, F=0; if there is a positive loop intersecting a negative loop, F=1; if there are two disjoint positive loops, F=22. If, more generally, the number S of loops is m, then: if all loops are negative, F=0; if all loops are positive, then: 2⩽F⩽2m, and if and only if all loops are positive and disjoint, F=2m. An interesting open problem is now to make exhaustive the determination of F and S and, in particular, to find the circumstances (related to the loops structure) in which we can relate the number of intersecting and isolated loops to F. The approach for solving this open problem could consist first in finding coherent relationships between analogous properties discovered for continuous versions of the regulatory networks and for general Boolean networks.

7 Conclusion

A geneticist could exploit the results given in the above paper as follows: we have shown in Section 4 that it would be possible to characterize the minimal interaction matrices having certain state vectors as fixed configurations. The determination of these matrices is not unique, but permits to focus on certain important equivalence classes, in which the expected matrix has to belong. This considerably restricts the choice of the possible interaction matrices compatible with observed fixed configurations, when it is impossible to directly get from experiments all interaction coefficients, but also when it is only possible to observe the phenomenology of fixed or cyclic configurations. This corresponds in genetics to the phenotypic observation of stationary expression behaviours without experimental measure of the inhibitory and activatory coefficients of promoters and repressors. The possibility to obtain (even in an equivalence class) a sketch of the interaction matrix permits to construct (by randomising the unknown coefficients of M) more complicated interaction matrices, then to test if they still have the observed states as fixed configurations, finally keep or reject these tested matrices and propose further experimental strategies, using bio-arrays for refining the knowledge about the genetic network interaction structure.

Acknowledgements

We have done this work thanks to the support of the National Network for Technology Research RNTS ‘Technologies for Health’ from the French Ministry of Research. We are also indebted to A. Toussaint and C. Ranquet for their helpful discussions concerning the bacteriophage Mu. We thank also B. Hess and A. Winfree (in memoriam) for introducing us to many aspects of the dynamics of life.


Bibliographie

[1] J. Demongeot; J.-P. Françoise; M. Richard; F. Senegas; T.P. Baum A differential geometry approach for biomedical image processing, C. R. Biologies, Volume 325 (2002), pp. 367-374

[2] J. Mattes; M. Richard; J. Demongeot Tree representation for image matching and object recognition, Lect. Notes Comput. Sci., Volume 1568 (1999), pp. 298-309

[3] L. Mendoza; E.R. Alvarez-Buylla Dynamics of the genetic regulatory network for Arabidopsis thaliana flower morphogenesis, J. Theoret. Biol., Volume 193 (1998), pp. 307-319

[4] F. Thuderoz, DEA Report, UJF, Grenoble, France, 2000

[5] J. Demongeot A stochastic model for the cellular metabolism (J.R. Barra et al., eds.), Recent Developments in Statistics, North-Holland, Amsterdam, 1977, pp. 655-662

[6] P.J. Goss; J. Peccoud Quantitative modeling of stochastic systems in molecular biology using stochastic Petri nets, Proc. Natl Acad. Sci. USA, Volume 95 (1998), pp. 6750-6755

[7] P.J. Goss; J. Peccoud Analysis of the stabilizing effect of ROM on the genetic network controlling ColE1 plasmid replication (R.B. Altam et al., eds.), Pacific Symposium on Biocomputing'99, World Scientific, Singapore, 1999, pp. 65-76

[8] O. Cinquin; J. Demongeot Positive and negative feedback: striking a balance between necessary antagonists, J. Theoret. Biol., Volume 216 (2002), pp. 229-241

[9] O. Cinquin; J. Demongeot Positive and negative feedback: mending the ways of sloppy systems, C. R. Biologies, Volume 325 (2002), pp. 1085-1095

[10] J. Demongeot, F. Berger, T.-P. Baum, F. Thuderoz, O. Cohen, Bio-array images processing and genetic networks modelling, in: M. Unser, Z.P. Liang (Eds.), ISBI 2002, IEEE EMB, IEEE Proceedings, Piscataway, 2002, pp. 50–54

[11] J. Demongeot; M. Kaufman; R. Thomas Interaction matrices, regulation circuits and memory, C. R. Acad. Sci. Paris, Ser. III, Volume 323 (2000), pp. 69-80

[12] J. Demongeot; J. Aracena; S. Ben Lamine; M.A. Mermet; O. Cohen Hot spots in chromosomal breakage: from description to etiology (D. Sankoff; J.-H. Nadeau, eds.), Comparative Genomics, Kluwer, Amsterdam, 2000, pp. 71-85

[13] J. Demongeot; J. Aracena; S. Ben Lamine; S. Meignen; A. Tonnelier; R. Thomas Dynamical systems and biological regulations (E. Goles; S. Martinez, eds.), Complex Systems, Kluwer, Amsterdam, 2000, pp. 107-151

[14] J. Aracena, J. Demongeot, E. Goles, Fixed points and maximal independent sets on AND-OR networks, Discr. Appl. Math. (in press)

[15] J. Aracena; S. Ben Lamine; M.A. Mermet; O. Cohen; J. Demongeot Mathematical modelling in genetic networks: relationships between the genetic expression and both chromosomic breakage and positive circuits (N. Bourbakis, ed.), BIBE 2000, IEEE, Piscataway, 2000, pp. 141-149

[16] R. Thomas On the relation between the logical structure of systems and their ability to generate multiple steady states or sustained oscillations, Springer Ser. Synerget., Volume 9 (1980), pp. 1-23

[17] M. Delbrück Discussion, Unités biologiques douées de continuité génétique, Colloques internationaux CNRS, Volume 8 (1949), pp. 33-35

[18] R. Thomas; D. Thieffry; M. Kaufman Dynamical behavior of biological regulatory networks. I. Biological role and logical analysis of feedback loops, Bull. Math. Biol., Volume 57 (1995), pp. 328-339

[19] E.H. Snoussi; R. Thomas Logical identification of all steady states: the concept of feedback loop characteristic states, Bull. Math. Biol., Volume 55 (1993), pp. 973-991

[20] D. Thieffry; M. Colet; R. Thomas Formalization of regulatory networks: a logical method and its automatization, Math. Modelling Sci. Comput., Volume 2 (1993), pp. 144-151

[21] R. Thomas; R. D'Ari Biological Feedback, CRC Press, Boca Raton, 1990

[22] S. Kauffman The Origins of Order, Oxford University Press, Oxford, England, 1993

[23] C. Berge Graphes et Hypergraphes, Dunod, Paris, 1974

[24] E. Goles; S. Martinez Neural and Automata Networks, Maths. and Appl. Series, 58, Kluwer, Amsterdam, 1991

[25] F. Plouraboué; H. Atlan; G. Weisbuch; J.-P. Nadal A network model of the coupling of ion channels with secondary messenger in cell signaling, Network Computation in Neural Networks Systems, Volume 3 (1992), pp. 393-406

[26] B. Bollobas Random Graphs, Academic Press, London, 1985

[27] J. Aracena, Modèles mathématiques discrets associés à des systèmes biologiques. Application aux réseaux de régulation génétique, PhD Thesis, U. Chile & UJF, Santiago & Grenoble, 2001

[28] M. Leptin Gastrulation in Drosophila: the logic and the cellular mechanisms, EMBO J., Volume 18 (1999), pp. 3187-3192

[29] N. Rashevsky Mathematical Biophysics, Cambridge University Press, London, 1948

[30] A. Turing The mathematical basis of morphogenesis, Phil. Trans. Roy. Soc. B, Volume 237 (1952), pp. 37-47

[31] A.N. Kolmogorov; I. Petrowski; N. Piskounov Étude de l'équation de la diffusion avec croissance de la quantité de matière et son application à un problème biologique, Mosc. Univ. Bull. Math., Volume 1 (1937), pp. 1-25

[32] M. Thellier; L. Le Sceller; V. Norris; M.-C. Verdus; C. Ripoll Long-distance transport, storage and recall of morphogenetic information in plants. The existence of a sort of primitive plant ‘memory’, C. R. Acad. Sci. Paris, Ser. III, Volume 323 (2000), pp. 81-91

[33] J. Demongeot; M. Thellier; R. Thomas A mathematical model for storage and recall functions in plants, C. R. Acad. Sci., Ser. III, Volume 323 (2000), pp. 93-97

[34] J. Demongeot; M. Laurent Sigmoidicity in allosteric models, Math. Biosci., Volume 67 (1983), pp. 1-17

[35] R. Thomas La logique des systèmes vivants, Bull. Cl. Sci. Acad. R. Belg., Volume 74 (1988), pp. 432-442

[36] O. Cinquin, J. Demongeot, Inhibitory n-switch dynamics and applications (submitted)

[37] O. Cohen; M.A. Mermet; J. Demongeot HC Forum®: a web site based on an international human cytogenetic data base, Nucleic Acids Res., Volume 29 (2001), pp. 305-307

[38] O. Cohen; M.A. Mermet; J. Demongeot HC Forum: toward a tele-expertise plat-form in medical genetics, Lect. Notes Med. Inf., Volume 13 (2002), pp. 97-104

[39] O. Cohen; C. Cans; M. Cuillel; J.-L. Gilardi; H. Roth; M.-A. Mermet; P. Jalbert; J. Demongeot Cartographic study: breakpoints in 1574 families carrying human reciprocal translocations, Hum. Genet., Volume 97 (1996), pp. 659-667

[40] O. Cohen; C. Cans; M.-A. Mermet; J. Demongeot; P. Jalbert Viability thresholds for partial trisomies and monosomies. A study of 1159 viable unbalanced reciprocal translocations, Hum. Genet., Volume 93 (1994), pp. 188-194

[41] A. Toussaint; M.-J. Gama; J. Laachouch; G. Maenhaut-Michel Regulation of bacteriophage Mu transposition, Genetica, Volume 93 (1994), pp. 27-39

[42] H. McAdams; L. Shapiro Circuit simulation of Genetic Networks, Science, Volume 269 (1995), pp. 650-656

[43] A.L. Taylor Bacteriophage-induced mutation in E. coli, Proc. Natl Acad. Sci. USA, Volume 50 (1963), pp. 1043-1051

[44] A.I. Bukhari; D. Zipser Random insertion of Mu-1 DNA within a single gene, Nat. New Biol., Volume 236 (1972), pp. 240-243

[45] B. McClintock Controlling elements and the gene, Cold Spring Harbor Symp. Quant. Biol., Volume 21 (1956), pp. 197-216

[46] N. Symonds; A. Toussaint; P. van de Putte; M.M. Howe Phage Mu, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, USA, 1987

[47] C. Wijffelman; M. Gassler; W.F. Stevens; P. Van de Putte On the control of transcription of bacteriophage Mu, Mol. Gen. Genet., Volume 131 (1974), pp. 85-96

[48] K. Mizuuchi In vivo transposition of bacteriophage Mu: a biochemical approach to a novel replication reaction, Cell, Volume 35 (1983), pp. 785-794

[49] C.F. Marrs; M.M. Howe AvaII and Bgl I restriction maps of bacteriophage Mu, Virology, Volume 126 (1983), pp. 563-575

[50] H.M. Krause; M.R. Rothwell; N.P. Higgins The early promoter of bacteriophage Mu: definition of the site of transcript initiation, Nucleic Acids Res., Volume 11 (1983), pp. 5483-5495

[51] M. Giusti; G. Di Lallo; P. Ghelardini; L. Paolozzi The bacteriophage Mu Ner gene: a positive regulator of the C operon required for normal levels of late transcription, Virology, Volume 179 (1990), pp. 694-700

[52] K. Mathee; M.M. Howe Identification of a positive regulator of the Mu middle operon, J. Bacteriol., Volume 172 (1990), pp. 6641-6650

[53] D.Y. Kwoh; D. Zipser Specific binding of Mu repressor to DNA, Nature, Volume 277 (1979), pp. 489-491

[54] N. Gossen; P. van de Putte Role of Ner protein in bacteriophage Mu transposition, J. Bacteriol., Volume 167 (1986), pp. 503-507

[55] R. Craigie; M. Mizuuchi; K. Mizuuchi Site specific recognition of the bacteriophage Mu ends by the MuA protein, Cell, Volume 39 (1984), pp. 387-394

[56] M. Mizuuchi; R.A. Weisberg; K. Mizuuchi DNA sequence of the control region of phage D108: the N-terminal amino acid sequences of repressor and transposase are similar both in phage D108 and its relative, phage Mu, Nucleic Acids Res., Volume 14 (1986), pp. 3813-3825

[57] M.L. Pato; C. Reich Instability of transposase activity: evidence from bacteriophage Mu DNA replication, Cell, Volume 29 (1982), pp. 219-225

[58] R. Alazard; M. Bétermier; M. Chandler E. coli IHF stabilises bacteriophage Mu repressor interactions with operator DNA in vitro, Mol. Microbiol., Volume 6 (1992), pp. 1707-1714

[59] M.-J. Gama; A. Toussaint; N.P. Higgins Stabilisation of bacteriophage Mu repressor-operator complexes by the E. coli IHF protein, Mol. Microbiol., Volume 6 (1992), pp. 1715-1722

[60] G. Kukolj; M.S. Du Bow IHF activates the Ner-repressed early promoter of transposable Mu-like phage D108, J. Biol. Chem., Volume 267 (1992), pp. 17827-17835

[61] M.D. Ditto; D. Roberts; R.A. Weisberg Growth phase variation of Integration Host Factor level in E. coli, J. Bacteriol., Volume 176 (1994), pp. 3738-3748

[62] M.G. Surette; S.J. Buch; G. Chaconas Transposomes: stable protein–DNA complexes involved in the in vitro transposition of bacteriophage Mu DNA, Cell, Volume 49 (1987), pp. 253-262

[63] M. Bétermier; I. Poquet; R. Alazard; M. Chandler Involvement of E. coli FIS protein in maintenance of bacteriophage Mu lysogeny by the repressor, J. Bacteriol., Volume 175 (1993), pp. 3798-3811

[64] M. Bétermier; C. Lefrère; C. Koch; R. Alazard; M. Chandler The E. coli protein Fis: specific binding to the ends of phage Mu DNA and modulation of phage growth, Mol. Microbiol., Volume 3 (1989), pp. 459-468

[65] C.A. Ball; R. Osuna; K.C. Ferguson; R. Johnson Dramatic changes in Fis level upon nutrient upshift in E. coli, J. Bacteriol., Volume 174 (1992), pp. 8043-8056

[66] J.F. Thomson; L. Moitoso de Vargas; C. Koch; R. Kahmann; A. Landy Cellular factors couple recombination with growth phase, Cell, Volume 50 (1987), pp. 901-908

[67] R. Kruklitis; H. Nakai Participation of bacteriophage Mu A protein and host factors in initiation of Mu DNA synthesis in vitro, J. Biol. Chem., Volume 269 (1994), pp. 16469-16477

[68] J. Monod Recherches sur la croissance des cultures bactériennes, Actualités scientifiques et industrielles, Hermann, Paris, 1942

[69] J. Demongeot, J. Aracena, F. Thuderoz, T.P. Baum, O. Cohen, Genetic regulation networks: circuits, regulons and attractors, C. R. Biologies (in press)

[70] (a) M. Thellier, J. Demongeot, J. Guespin, C. Ripoll, V. Norris, R. Thomas, Storage and recall of environmental signals in a plant: modelling by use of a logical (discrete) formulation (submitted); (b) J. Demongeot, M. Thellier, J. Guespin, C. Ripoll, V. Norris, R. Thomas, Storage and recall of environmental signals in a plant: modelling by use of a differential (discrete) formulation (submitted)

[71] D. Fell; A. Wagner The small world of metabolism, Nat. Biotechnol., Volume 18 (2000), pp. 1121-1122

[72] N. Guelzim; S. Bottani; P. Bourgine; F. Képès Topological and causal structure of the yeast transcriptional regulatory network, Nat. Genet., Volume 31 (2002), pp. 60-63


Commentaires - Politique