Structure, Function and Evolution of Metabolic Networks (I)

- Jing Zhao
- College of Pharmacy, Second Military Medical University
- Shanghai Center for Bioinformation and Technology
- 2009.5.25

Spring school on multiscale methods and modeling in biophysics and system biology, Shanghai, China

- Outline
- Reconstruction of metabolic networks
- Network metrics and topological features
- Modularity and network decomposition
- Topological diversity of networks with a given degree sequence

I. Reconstruction of metabolic networks

Zhao J, Yu H, Luo J, Cao Z, Li Y: Complex networks theory for analyzing metabolic networks. Chinese Science Bulletin 2006, 51(13):1529-1537.

What is a network?

Examples: Internet

Examples: Scientific collaborations

Examples: protein-protein interaction network

Metabolism

Examples: metabolic network

How to get genome-specific metabolic reactions?

- (i) Identifying ORFs from the genomic sequence
- (ii) Predicting all the enzyme genes of this organism by sequence similarity alignment
- (iii) Comparing the predicted enzymes within this organism against the collection of known reference pathways to determine all the reactions of this organism.

- Two refined, manually reconstructed metabolism databases for human
- BiGG database
- Duarte, N. C., Becker, S. A., Jamshidi, N., Thiele, I., Mo, M. L., Vo, T. D., Srivas, R., Palsson, B. O., Global reconstruction of the human metabolic network based on genomic and bibliomic data. PNAS 2007, 104(6), 1777-1782.
- The Edinburgh human metabolic network
- Ma, H., Sorokin, A., Mazein, A., Selkov, A., Selkov, E., Demin, O., Goryanin, I., The Edinburgh human metabolic network reconstruction and its functional analysis. Molecular Systems Biology 2007, 3, 135.

Statistics for BiGG database

Process for reconstructing the Edinburgh human metabolic network

Different graph representations of a simple metabolic network

Currency metabolites

Ma H, Zeng A-P: Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics 2003, 19(2):270-277.

Currency metabolites

- Definition
- currency metabolites have high degree
- they create shortcuts that are not biologically meaningful
- i.e. they tie together distant parts of the network
- i.e. they tie different modules together

Algorithm: Remove vertices in order of (currently) highest degree. The set of removed vertices that gives the network the highest modularity is the set of currency metabolites.

Huss M, Holme P: Currency and commodity metabolites: their identification and relation to the modularity of metabolic networks. IET Systems Biology 2007, 1:280-285.

Human currency metabolites

Huss M, Holme P: Currency and commodity metabolites: their identification and relation to the modularity of metabolic networks. IET Systems Biology 2007, 1:280-285.

- Steps for reconstructing a metabolic network
- Get reaction list
- Generate substrate - product pair list
- Delete currency metabolites
- Generate metabolic network
- Useful tool
- txt2pajek.exe
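
The reconstruction steps listed here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (a list of `(substrates, products, reversible)` tuples), the function name `substrate_product_pairs`, and the toy reaction list are all hypothetical and are not taken from the slides or from any tool mentioned in them.

```python
def substrate_product_pairs(reactions, currency=frozenset()):
    """Return the set of directed (substrate, product) arcs.

    `reactions` is a list of (substrates, products, reversible) tuples.
    This input layout is an assumption made for the example.
    Any pair that involves a currency metabolite is skipped.
    """
    arcs = set()
    for substrates, products, reversible in reactions:
        for s in substrates:
            for p in products:
                if s in currency or p in currency or s == p:
                    continue
                arcs.add((s, p))
                if reversible:
                    # A reversible reaction contributes arcs in both directions.
                    arcs.add((p, s))
    return arcs


# Toy reaction list (simplified, for illustration only).
reactions = [
    (["glucose", "ATP"], ["G6P", "ADP"], False),
    (["G6P"], ["F6P"], True),
    (["F6P", "ATP"], ["FBP", "ADP"], False),
]
currency = {"ATP", "ADP"}

arcs = substrate_product_pairs(reactions, currency)
for s, p in sorted(arcs):
    print(s, "->", p)
```

Without the `currency` argument the same reaction list produces arcs such as ATP -> G6P, which link metabolites only through a shared cofactor; removing currency metabolites avoids these shortcuts.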

II. Network metrics and topological features

Zhao J, Yu H, Luo J, Cao Z, Li Y: Complex networks theory for analyzing metabolic networks. Chinese Science Bulletin 2006, 51(13):1529-1537.

Network science

- Measures of network structure. What does a network that is too large to draw "look" like? Real-world networks have both randomness and structure. How can we quantify network structure?
- Models of evolving networks. How do networks get their structure? What "microscopic" properties are responsible for the macro-structure of the network?
- Models of network-changing events. Malicious attacks, overload breakdowns.
- Classification and functional prediction. How can we classify vertices and predict their function in the network?
- How does the network structure affect dynamic systems on the network? Run dynamic simulations on top of the network and see how dynamic properties correlate with the network structure.

- As for biochemical networks, what questions can we ask?
- how can the large-scale organization be characterized?
- are there any universal features over different species?
- do the differences tell us something about evolution?
- can we identify functional modules?
- ... the functions of molecules?

Degree distribution vs. scale-free networks

Degree distribution p(k): the occurrence frequency of nodes with degree k (k = 1, 2, ...).
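
As a minimal sketch of this definition, the degree distribution of an undirected graph can be computed from an edge list as follows. The function name `degree_distribution` and the toy edge list are illustrative assumptions; nodes with no edges do not appear in an edge list, which matches the range k = 1, 2, ... used here.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: p(k)} for an undirected simple graph given as an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                      # number of nodes that have at least one edge
    counts = Counter(degree.values())    # how many nodes have each degree k
    return {k: counts[k] / n for k in sorted(counts)}


# Toy graph: a hub with four neighbours, plus one edge between two of them.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
p = degree_distribution(edges)
for k, pk in p.items():
    print(k, pk)
```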

Random network

Scale-free network

hub

Barabasi, A.L., Albert, R., Emergence of scaling in random networks, Science, 1999, 286:509-512

- BA model for network evolution
- (1) Growth: the continuous addition of new nodes.
- (2) Preferential attachment: the "rich get richer" principle.
- The high-degree nodes should appear in the earlier stage of network formation.
- Thirteen hub metabolites in the E. coli metabolic network

Wagner, A., Fell, D.A., The small world inside large metabolic networks, Proc R Soc Lond B, 2001, 268:1803-1810.

- Performance of scale-free networks
- error tolerance: high resistance to random perturbations
- attack vulnerability: the removal of a few hub nodes will destroy the whole network.

Albert, R., Jeong, H., Barabasi, A.-L., Error and attack tolerance of complex networks, Nature, 2000, 406:378-382.

Jeong, H., Mason, S.P., Barabasi, A.L., Oltvai, Z.N., Lethality and centrality in protein networks, Nature, 2001, 411:41-42.

- Notice: computation of the exponent
- cumulative distribution: if $p(k) \sim k^{-\gamma}$, then $P(k) = \sum_{k' \ge k} p(k') \sim k^{-(\gamma - 1)}$

Log-log plot of the degree distribution (A) and cumulative degree distribution (B) for a network of 20000 nodes constructed by the Barabasi-Albert preferential attachment model.

- Clustering coefficient vs. hierarchical modular networks

How many triangles are there in the network?

N(v): the number of links between neighbours of node v; d(v): the degree of node v
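
A short sketch of how these two quantities combine, assuming the usual definition C(v) = 2N(v) / (d(v)(d(v) - 1)), which is not spelled out in the transcript. The adjacency-dictionary layout and the function name are illustrative choices; node labels are assumed to be orderable so that each neighbour pair is counted once.

```python
def clustering_coefficient(adj, v):
    """C(v) = 2 N(v) / (d(v) (d(v) - 1)); taken to be 0 when d(v) < 2.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    neighbours = adj[v]
    d = len(neighbours)
    if d < 2:
        return 0.0
    # N(v): links between neighbours of v, each unordered pair counted once.
    links = sum(1 for a in neighbours for b in neighbours
                if a < b and b in adj[a])
    return 2.0 * links / (d * (d - 1))


# Toy graph: triangle 1-2-3 with a pendant node 4 attached to node 1.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}

for v in sorted(adj):
    print(v, clustering_coefficient(adj, v))

mean_c = sum(clustering_coefficient(adj, v) for v in adj) / len(adj)
print("mean clustering coefficient:", mean_c)
```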

Ravasz E, Somera A L, Mongru D A, Oltvai Z N, Barabasi A L, Hierarchical organization of modularity in metabolic networks, Science, 2002, 297:1551-1555

Complex systems usually have a hierarchical structure, the entities of one level being compounded into new entities at the next higher level, as cells into tissues, tissues into organs, and organs into functional systems. The whole is greater than the sum of its parts! At each new level of complexity in biology new and unexpected qualities appear, qualities which apparently cannot be reduced to the properties of the component parts.

Life's complexity pyramid: from the particular to the universal

Oltvai, Z.N., Barabási, A.-L., Life's Complexity Pyramid, Science, 2002, 298:763-764.

- Mean path length vs. small-world networks

Small-world network: small mean path length and high clustering coefficient

Small-world cell networks => the cell may react quickly to changes of the surroundings

Watts, D.J., Strogatz, S.H., Collective dynamics of 'small-world' networks, Nature, 1998, 393:440-442.
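
The mean path length of an unweighted network can be obtained with one breadth-first search per node. The sketch below assumes an undirected graph stored as an adjacency dictionary and averages only over pairs that are actually connected; both choices, and the function names, are assumptions for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_path_length(adj):
    """Average shortest-path length over all ordered pairs of distinct, connected nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy graphs: a path 1-2-3-4 and the ring obtained by adding the edge 4-1.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ring = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}

print(mean_path_length(path))
print(mean_path_length(ring))
```

Adding the single edge that closes the ring shortens the mean path length, which is the kind of shortcut effect behind the small-world property.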

- Assortativity coefficient vs. degree-degree correlation
- Are high-degree vertices connected to other high-degree vertices? Or are these vertices primarily connected to low-degree vertices?
- $j_i$, $k_i$: the degrees of the nodes at the ends of the i-th edge
- M: number of edges in the network
- r > 0: assortative network
- r < 0: disassortative network
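
The formula itself did not survive in the transcript. The sketch below assumes the edge-based form from Newman (2002), r = [M^-1 Σ j_i k_i - (M^-1 Σ (j_i + k_i)/2)^2] / [M^-1 Σ (j_i^2 + k_i^2)/2 - (M^-1 Σ (j_i + k_i)/2)^2], with sums over the M edges. The function name and the toy graphs are illustrative.

```python
def assortativity(edges):
    """Degree assortativity r of an undirected graph given as an edge list.

    Assumes Newman's edge-based formula; raises ValueError when all edge
    endpoints have the same degree, because r is then undefined (0/0).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    m = len(edges)
    mean_jk = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_half = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    mean_sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m
    denominator = mean_sq - mean_half ** 2
    if denominator == 0:
        raise ValueError("assortativity is undefined for this degree sequence")
    return (mean_jk - mean_half ** 2) / denominator


# A star is maximally disassortative: the hub only touches degree-1 leaves.
star = [("hub", leaf) for leaf in ("a", "b", "c", "d")]
# A four-node path mixes degree-1 ends with degree-2 inner nodes.
path = [(1, 2), (2, 3), (3, 4)]

print(assortativity(star))
print(assortativity(path))
```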

Newman, M.E.J., Assortative mixing in networks, Phys Rev Lett, 2002, 89:208701.

The average connectivity $\langle k_{nn} \rangle$ of the nearest neighbors of a node depending on its connectivity k for the 1998 snapshot of the Internet, the generalized BA model and the fitness model.

Romualdo Pastor-Satorras, Alexei Vázquez, and Alessandro Vespignani, Dynamical and Correlation Properties of the Internet, Physical Review Letters, Volume 87, Number 25 (2002)

Correlation profiles of the protein interaction network in yeast. Z-scores for connectivity correlations: $Z(K_0,K_1) = (P(K_0,K_1) - P_r(K_0,K_1)) / \sigma_r(K_0,K_1)$, where $\sigma_r(K_0,K_1)$ is the standard deviation of $P_r(K_0,K_1)$ in 1000 realizations of a randomized network.

Maslov, S., Sneppen, K., Specificity and Stability in Topology of Protein Networks, Science, 2002, 296:910-913.

Rich-club coefficient and rich-club phenomenon

Rich-club coefficient

Notice: rich-club vs. assortative mixing

Colizza V, Flammini A, Serrano MA, Vespignani A: Detecting rich-club ordering in complex networks. Nat Phys 2006, 2(2):110-115.

Centrality: Which nodes are important for communication on the network?

Assumption: Information transmission or material transportation on the network is along shortest paths.

Betweenness centrality

- Node betweenness: measures the degree to which a vertex is participating in the communication between pairs of other vertices

$\sigma_{st}$: the number of shortest paths from s to t; $\sigma_{st}(v)$: the number of shortest paths from s to t with v as an inner vertex

Holme P, Kim BJ, Yoon CN, Han SK: Attack vulnerability of complex networks. Phys Rev E 2002, 65:056109.
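
Node betweenness can be computed without enumerating every shortest path. The sketch below uses Brandes' accumulation scheme, which is a substitution on my part and is not named in the slides, and assumes the definition C_B(v) = Σ σ_st(v) / σ_st summed over ordered pairs (s, t) with s ≠ v ≠ t on an unweighted graph. For an undirected graph every unordered pair is therefore counted twice.

```python
from collections import deque


def node_betweenness(adj):
    """Betweenness of every node of an unweighted graph (ordered pairs s, t)."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb


# Toy graphs: a three-node path and a four-node ring.
path = {1: {2}, 2: {1, 3}, 3: {2}}
ring = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}

print(node_betweenness(path))
print(node_betweenness(ring))
```

In the ring, each node lies on one of the two shortest paths between its two neighbours, so it collects 0.5 for each direction of that pair.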

Edge betweenness: measures the degree to which an edge is participating in the communication between pairs of vertices

$\sigma_{st}$: the number of shortest paths from s to t; $\sigma_{st}(e)$: the number of shortest paths from s to t that pass through edge e

Holme P, Kim BJ, Yoon CN, Han SK: Attack vulnerability of complex networks. Phys Rev E 2002, 65:056109.

- Nodes and edges of high betweenness centrality could be bottlenecks of the network, and thus could be important enzymes or metabolites.
- Edges of high betweenness centrality could be bridges between modules.

Rahman, S.A., Schomburg, D., Observing local and global properties of metabolic pathways: 'load points' and 'choke points' in the metabolic networks, Bioinformatics, 2006, 22:1767-1774.

Girvan M, Newman MEJ: Community structure in social and biological networks. Proc Natl Acad Sci 2002, 99(12):7821-7826.

Closeness centrality

Closeness centrality measures the degree to which a vertex is close to other vertices on average.

Service facility locating problem: Find the location of a shopping mall such that the average driving distance to the mall is minimal.

Solution: the nodes which have the biggest closeness centrality

Center

Emergency facility locating problem: find the optimal location of a firehouse such that the worst-case response distance of a fire engine is minimal.
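
The two facility problems can give different answers on the same network. The sketch below assumes a connected, unweighted, undirected graph, takes closeness as the reciprocal of the mean distance to all other nodes (one common convention, assumed here), and takes the center as the node with the smallest worst-case distance (eccentricity). The toy graph and node names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """Reciprocal of the mean distance from v to the other reachable nodes."""
    dist = bfs_distances(adj, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def eccentricity(adj, v):
    """Worst-case distance from v to any other reachable node."""
    return max(bfs_distances(adj, v).values())


# Toy graph: hub "h" with three leaves, plus a chain h - p1 - p2 - p3.
adj = {
    "h": {"l1", "l2", "l3", "p1"},
    "l1": {"h"}, "l2": {"h"}, "l3": {"h"},
    "p1": {"h", "p2"},
    "p2": {"p1", "p3"},
    "p3": {"p2"},
}

mall = max(adj, key=lambda v: closeness(adj, v))          # best average distance
firehouse = min(adj, key=lambda v: eccentricity(adj, v))  # best worst-case distance
print("shopping mall at:", mall)
print("firehouse at:", firehouse)
```

Here the hub minimises the average distance, while the neighbouring chain node minimises the worst-case distance.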

- k-core

1-, 2- and 3-core. Two basic properties of cores: first, cores may be disconnected subgraphs; second, cores are nested: for i > j, an i-core is a subgraph of a j-core of the same graph.

=> The probability of nodes being both essential and evolutionarily conserved successively increases toward the innermost cores.

Wuchty, S., Almaas, E., Peeling the yeast protein network, Proteomics, 2005, 5:444-449.
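
A k-core can be found by repeatedly peeling away nodes with fewer than k remaining neighbours. The sketch below assumes the standard definition (the largest subgraph in which every node has degree at least k) and an undirected graph stored as an adjacency dictionary; the function name and the toy graph are illustrative.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    to_remove = [v for v in adj if degree[v] < k]
    while to_remove:
        v = to_remove.pop()
        if v not in alive:
            continue  # already peeled (a node can be queued more than once)
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    to_remove.append(w)
    return alive


# Toy graph: a 4-clique {1,2,3,4}; node 5 attached to 1 and 2; node 6 attached to 5.
adj = {
    1: {2, 3, 4, 5},
    2: {1, 3, 4, 5},
    3: {1, 2, 4},
    4: {1, 2, 3},
    5: {1, 2, 6},
    6: {5},
}

for k in range(1, 5):
    print(k, sorted(k_core(adj, k)))
```

The printed sets shrink as k grows, which illustrates the nesting property stated above.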

Reciprocity metric

$a_{ij} = 1$ if there is an arc from node i to node j, $a_{ij} = 0$ otherwise; L: the total number of arcs in the network; N: the total number of nodes in the network

$\rho = -1$ for purely unidirectional networks; $\rho = 1$ for purely bidirectional networks

Network null models

- Network structures are always relative
- Network structure: how the network differs from a random network, or a null model
- One has to be clear about which null model to compare with
- Null model 1: random graphs (Poisson random graphs, Erdos-Renyi graphs)
- Null model 2: random graphs constrained to the set of degrees of the original graph

Null models: random rewiring

Maslov, S., Sneppen, K., Specificity and Stability in Topology of Protein Networks, Science, 2002, 296:910-913.

Maslov S, Sneppen K, Zaliznyak A: Detection of topological patterns in complex networks: correlation profile of the internet. Physica A: Statistical and Theoretical Physics 2004, 333:529-540.
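
A degree-preserving null model can be generated by repeatedly swapping the end points of two randomly chosen edges. The sketch below is a minimal version of that idea for undirected simple graphs; the function names, the `seed` parameter, and the cap on the number of attempts are assumptions for illustration, not details from the cited papers.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Degree of every node, as a Counter."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def rewire(edges, n_swaps, seed=None):
    """Randomize an undirected simple graph while keeping every node's degree.

    Each step picks two edges (a, b) and (c, d) and replaces them with
    (a, d) and (c, b), unless that would create a self-loop or a duplicate edge.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # undirected edges have no orientation, so pick one at random
        if len({a, b, c, d}) < 4:
            continue  # a shared node would produce a self-loop or no change
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # the swap would create a multiple edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i] = (a, d)
        edges[j] = (c, b)
        done += 1
    return edges


# Toy graph: a ring of 10 nodes with two chords.
original = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
randomized = rewire(original, n_swaps=20, seed=1)
print(degree_sequence(randomized) == degree_sequence(original))
```

A network property measured on the real network can then be compared with its mean and standard deviation over many such randomized copies.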

Z-score

Graph analysis and visualization software

- Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/ (txt2pajek.exe, pajek.exe)
- UCINET: http://www.analytictech.com/downloaduc6.htm
- NetMiner: http://www.netminer.com/NetMiner/home_01.jsp

III. Modularity and network decomposition

Zhao J, Yu H, Luo J, Cao Z, Li Y: Complex networks theory for analyzing metabolic networks. Chinese Science Bulletin 2006, 51(13):1529-1537.

2.1 Modularity

From functional view

Modularity: the system can be decomposed into parts (modules), such that each part has its own relatively independent function, while different parts have some communications with each other.

From topological view

Assumption: a densely connected subnetwork corresponds to a "part with complex function."

Modularity: the network can be divided into groups of vertices that have a high density of edges within them, with a lower density of edges between groups.

Hartwell LH, Hopfield JJ, Leibler S, Murray AW: From molecular to modular cell biology. Nature 1999, 402:C47-C52.

Papin JA, Reed JL, Palsson BO: Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends in Biochemical Sciences 2004, 29:641-647.

For a given decomposition of a network, the modularity metric is defined as

$Q = \sum_i \left[ e_{ii} - \left( \sum_j e_{ij} \right)^2 \right]$

where the sum is over the clusters of the partition and $e_{ij}$ is the fraction of edges that lead between vertices of clusters i and j.

The modularity metric of a network is defined as the largest modularity metric of all possible partitions of the network. The modularity of networks must always be compared to the null case of a random graph.
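
For one fixed partition, the modularity metric can be evaluated directly from the edge list. The sketch below assumes Newman's form Q = Σ_i (e_ii - a_i^2) with a_i = Σ_j e_ij, where an edge between two different clusters contributes half of its weight to e_ij and half to e_ji. The function name and the toy graph are illustrative; finding the partition that maximises Q is a separate and much harder problem that this sketch does not attempt.

```python
def modularity(edges, membership):
    """Modularity Q of an undirected graph for a given partition.

    membership maps every node to a cluster label.
    """
    m = len(edges)
    labels = list(dict.fromkeys(membership.values()))  # unique labels, in order
    index = {label: n for n, label in enumerate(labels)}
    size = len(labels)
    e = [[0.0] * size for _ in range(size)]
    for u, v in edges:
        i, j = index[membership[u]], index[membership[v]]
        if i == j:
            e[i][i] += 1.0 / m
        else:
            # Split an inter-cluster edge evenly so that e stays symmetric.
            e[i][j] += 0.5 / m
            e[j][i] += 0.5 / m
    return sum(e[i][i] - sum(e[i]) ** 2 for i in range(size))


# Toy graph: two triangles joined by a single bridge (3-4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
by_triangle = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
all_in_one = {v: "A" for v in range(1, 7)}

print(modularity(edges, by_triangle))
print(modularity(edges, all_in_one))
```

Splitting the graph at the bridge gives a clearly positive Q, while putting every node in one cluster gives Q = 0.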

Newman M: Detecting community structure in networks. Eur Phys J B 2004, 38:321-330.

Guimera R, Sales-Pardo M, Amaral LAN: Modularity from fluctuations in random graphs and complex networks. Physical Review E 2004, 70:025101.

2.2 Simulated annealing method

Guimera R, Nunes Amaral LA: Functional cartography of complex metabolic networks. Nature 2005, 433(7028):895-900.

2.3 Hierarchical clustering method

Similarity index (or dissimilarity index): signifies the extent to which two nodes are likely to be in the same cluster.

Agglomerative method: start off with each node being its own cluster. At each step, combine the two most similar clusters to form a new larger cluster, until all nodes have been combined into one cluster.

Divisive method: begin with one cluster including all the nodes, and attempt to find the splitting point at which the two clusters are as dissimilar as possible.

Topological overlap algorithm: substrate graph

$J_n(i,j)$ denotes the number of nodes to which both i and j are linked (plus 1 if there is a direct link between i and j); $k_i$, $k_j$ are the degrees of i and j, respectively. Agglomerative method.
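
The overlap formula itself is missing from the transcript. The sketch below assumes the form used by Ravasz et al., O_T(i, j) = J_n(i, j) / min(k_i, k_j), on an undirected graph stored as an adjacency dictionary. The function name and the toy graph are illustrative, and nodes are assumed to have at least one neighbour so that the denominator is non-zero.

```python
def topological_overlap(adj, i, j):
    """O_T(i, j) = J_n(i, j) / min(k_i, k_j) for two distinct nodes i and j."""
    common = len(adj[i] & adj[j])        # nodes linked to both i and j
    direct = 1 if j in adj[i] else 0     # plus 1 for a direct i-j link
    return (common + direct) / min(len(adj[i]), len(adj[j]))


# Toy graph: two triangles {1,2,3} and {4,5,6} joined by the bridge 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}

print(topological_overlap(adj, 1, 2))  # same triangle
print(topological_overlap(adj, 3, 4))  # the two ends of the bridge
print(topological_overlap(adj, 1, 5))  # different triangles, no common neighbour
```

Pairs inside a triangle score higher than pairs that span the bridge, which is what lets an agglomerative clustering on this index recover the two modules.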

Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL: Hierarchical Organization of Modularity in Metabolic Networks. Science 2002, 297(5586):1551-1555.

Shortest path algorithm: enzyme graph

d(i, j) is the number of arcs in the shortest directed path from i to j. Agglomerative method.

Ma H-W, Zhao X-M, Yuan Y-J, Zeng A-P: Decomposition of metabolic network into functional modules based on the global connectivity structure of reaction graph. Bioinformatics 2004, 20(12):1870-1876.

Betweenness method: substrate-reaction bipartite graph

$\sigma_{st}(r)$ is the number of shortest paths between s and t that pass through r, $\sigma_{st}$ is the total number of shortest paths between s and t, $k_{in}(r)$ is the in-degree of node r. Divisive method.

Holme P, Huss M, Jeong H: Subnetwork hierarchies of biochemical pathways. Bioinformatics 2003, 19(4):532-538.

Corrected Euclidean-like dissimilarity algorithm: substrate graph

d(i, j) is the number of arcs in the shortest directed path from i to j. Agglomerative method.

Zhao J, Yu H, Luo J, Cao Z, Li Y: Hierarchical modularity of nested bow-ties in metabolic networks. BMC Bioinformatics 2006, 7:386.

IV. Topological diversity of networks with a given degree sequence

Zhao J, Tao L, Yu H, Luo J-H, Cao Z-W, Li Y-X: The effects of degree correlations on network topologies and robustness. Chinese Physics 2007, 16.

- Seed networks
- Seed network A: the hierarchically modular network constructed by Ravasz et al. (RB model) in the 3rd iteration.
- Seed network B: a model network constructed by the BA preferential attachment model.
- Seed network C: the biggest connected cluster of the E. coli metabolic network
- Seed network D: the biggest connected cluster of the protein interaction network CCSB-HI1

Extreme networks of degree correlation

The Smax graph (A) and Smin graph (B) for a small seed network. Nodes with different degrees are shown in different colours.

Graphs with the same degree sequence can have significant topological diversity.

Constructing a network ensemble from the extreme networks

Assortativity coefficient (r) as a function of the randomization fraction (p).

Relationship between mean path length (L) and assortativity coefficient (r). The data shown in the figures are averaged over 10 random realizations of the rewiring process.

Relationship between clustering coefficient (C) and assortativity coefficient (r). The data shown in the figure are averaged over 10 random realizations of the rewiring process.

Relationship between modularity (M) and assortativity coefficient (r). The data shown in the figures are averaged over 10 random realizations of the rewiring process.

The effect of degree correlation on network robustness. Figures in the first and second row depict the robustness under attacks and failures, respectively, as a function of assortativity. The data shown in the figures are averaged over 10 random realizations of the rewiring process.

Holme P, Zhao J: Exploring the assortativity-clustering space of a network's degree sequence. Phys Rev E 2007, 75:046111.

- Thanks!