
Microscopic Evolution of Social Networks

Zheng Jiangchuan

Outline

1. Motivation
2. Introduction
3. Method Overview
4. Evaluation

Motivation

- Conventional social network study
- Primarily focuses on the static structure of the social network
- Reveals statistical network properties observed in real-world data: power-law degree distribution, small-world property, community structure

Motivation

- What is missing or rarely studied
- How does the social network we have observed come about?
- What force drives the social network to exhibit the noted static macroscopic structural properties?
- How will the social network evolve in the future?

Introduction

- Basic Intuition
- The answer lies in the laws governing the temporal evolution of the social network
- Since the social network is a self-organized network, these laws are primarily hidden in the temporal behaviors of individual nodes

Introduction

- Temporal Behaviors of an Individual Node
- At what rate does a new node arrive?
- How long will a new node stay active during its lifetime?
- When a node creates a new edge, which target will it most likely connect to?
- How long will a node sleep before creating a new edge?

These questions are answered by the Node Arrival Process and the Edge Initiation and Selection Process.

Introduction

- Individual node behavior overview
- Nodes arrive at some rate
- A newly arrived node decides its active lifetime
- The node initiates its first edge to a specific target node
- The node goes to sleep for some time
- The node wakes up; if its lifetime has not expired, it selects a target node to connect to and then sleeps again
- All nodes carry out this process simultaneously, collectively producing the macroscopic evolution of the social network
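The per-node loop described above can be sketched in a few lines. This is only an illustration: `node_edge_times`, `sample_lifetime`, and `sample_gap` are hypothetical names, and the exponential samplers in the usage lines are placeholders rather than the distributions fitted in the talk.

```python
import random


def node_edge_times(arrival_time, sample_lifetime, sample_gap, rng):
    """Return the times at which a single node initiates edges.

    sample_lifetime(rng) and sample_gap(degree, rng) are hypothetical
    callables standing in for the fitted lifetime and gap distributions.
    sample_gap should return positive values so the loop terminates.
    """
    deadline = arrival_time + sample_lifetime(rng)
    times = [arrival_time]  # the first edge is created on arrival
    degree = 1              # counts only edges this node initiated
    now = arrival_time
    while True:
        now += sample_gap(degree, rng)  # the node sleeps
        if now > deadline:              # lifetime expired while asleep
            break
        times.append(now)               # the node wakes up and adds an edge
        degree += 1
    return times


# Placeholder samplers (assumptions, not the fitted models).
rng = random.Random(0)
times = node_edge_times(
    arrival_time=0.0,
    sample_lifetime=lambda r: r.expovariate(0.1),
    sample_gap=lambda degree, r: r.expovariate(degree),
    rng=rng,
)
print(len(times), "edges initiated")
```

Passing the samplers in as arguments keeps the loop independent of whichever lifetime and gap models are eventually selected.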

Introduction

- Basic Task
- Develop a generative model capable of describing the evolution of a social network
- At an intuitive level, this model is specified by the process on the previous slide
- Every step in this process needs to be quantified mathematically from empirical observations

Introduction

- What can this generative model be used for?
- Provide mathematical insight into how a social network with the observed static properties comes about
- Predict the future evolution of the current social network
- Explain the temporal behaviors of humans in the social domain

Method Overview

- How to quantify each step in the generative process mathematically?
- Determine the mathematical expression for each step by estimating it from empirical temporal social network data, based on the Maximum Likelihood Estimation (MLE) principle
- For each step, select the model and parameters that maximize the likelihood of the observed data
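As a minimal sketch of the MLE principle, the snippet below scores a one-parameter model on a small sample and keeps the parameter with the highest log-likelihood. The function names, the toy data, and the choice of an exponential density are all assumptions made for illustration.

```python
import math


def log_likelihood(log_pdf, data, param):
    """Sum of log-densities of the observations under one parameter value."""
    return sum(log_pdf(x, param) for x in data)


def fit_by_grid(log_pdf, data, grid):
    """Return (best_param, best_log_likelihood) over a grid of candidates."""
    best = max(grid, key=lambda p: log_likelihood(log_pdf, data, p))
    return best, log_likelihood(log_pdf, data, best)


def exp_log_pdf(x, rate):
    """Log-density of an exponential distribution with the given rate."""
    return math.log(rate) - rate * x


# Toy data (made up for the example); the sample mean is 1.25.
data = [0.5, 1.0, 1.5, 2.0]
grid = [0.1 * i for i in range(1, 31)]
rate, ll = fit_by_grid(exp_log_pdf, data, grid)
print(rate, ll)
```

Comparing the best log-likelihood reached by different candidate models on the same data is then one way to choose among them.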

Data Set

- Four data sets with temporal information
- Flickr (03/2003-09/2005)
- Delicious (05/2006-02/2007)
- Answers (03/2007-06/2007)
- LinkedIn (05/2003-10/2006)

Edge Attachment Process

- Basic method
- Evolve the network edge by edge and, for every edge arriving in the network, measure the likelihood that its particular endpoints would be chosen under a given model
- Pick the model and associated parameters that maximize the sample likelihood
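The edge-by-edge procedure can be sketched as follows for one hypothetical model in which the target is chosen with probability proportional to degree raised to a power `tau`. The function name, the decision to skip edges whose target is new to the network, and the toy edge list are assumptions of this sketch.

```python
import math


def degree_attachment_log_likelihood(edges, tau):
    """Log-likelihood of observed targets under P(v) proportional to degree(v)**tau.

    edges is a list of (source, target) pairs in arrival order. Edges whose
    target has no prior edges are skipped, since this model cannot score
    them (an assumption of this sketch).
    """
    degree = {}
    total = 0.0
    for source, target in edges:
        if degree.get(target, 0) > 0:
            # Normalize over every existing node except the source itself.
            norm = sum(d ** tau for node, d in degree.items() if node != source)
            total += tau * math.log(degree[target]) - math.log(norm)
        # Only after scoring is the edge added to the evolving network.
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    return total


# Toy edge sequence in which every newcomer links to node 1.
edges = [(0, 1), (2, 1), (3, 1), (4, 1)]
taus = [0.0, 0.5, 1.0, 1.5]
best_tau = max(taus, key=lambda t: degree_attachment_log_likelihood(edges, t))
print(best_tau)
```

Recomputing the normalizer for every edge is quadratic in the worst case; a real experiment would maintain it incrementally.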

Edge Attachment Process

- Candidate models
- Before proceeding to the MLE experiments, we need to propose some candidate models
- Edge attachment by degree
- Edge attachment by age of the node
- Carry out some simple experiments to check qualitatively whether the proposed models are plausible

Edge Attachment Process

- Edge attachment by degree
- The probability that a new edge connects to a specific node is proportional to that node's current degree
- This agrees with intuition: people are more likely to know influential individuals

Edge Attachment Process

- Edge attachment by degree
- Simple experiments for justification, not MLE
- Plot the probability that a new edge connects to a node with a given degree

The experimental results match the degree preferential attachment model well

Edge Attachment Process

- Edge attachment by age of the node
- The probability that a new edge connects to a specific node is proportional to that node's current age
- The intuition is that older, more experienced users of a social network are also more engaged and thus absorb more edges

Edge Attachment Process

- Edge attachment by age of the node
- Plot the number of edges absorbed by nodes of a given age, normalized by the number of nodes that have reached that age

The experimental results do not match the proposed model well, but it remains a possible choice

Edge Attachment Process

- Maximum likelihood estimation
- Four models
- D: degree preferential attachment
- DR: combination of degree preferential attachment and uniformly random attachment
- A: age preferential attachment
- DA: combination of degree preferential attachment and age preferential attachment
- For each model and each data set, plot the sample likelihood with respect to the model parameters
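The slides do not spell out how the four models are parameterized. One plausible single-parameter form for each, assumed here purely for illustration, is given below, with d_v the degree of node v, a_v its age, N the current number of nodes, and tau the free parameter.

```latex
% Assumed parameterizations, not taken from the slides.
\begin{align*}
\text{D:}  \quad & P(v) \propto d_v^{\tau} \\
\text{DR:} \quad & P(v) = \tau \, \frac{d_v}{\sum_u d_u} + (1 - \tau) \, \frac{1}{N} \\
\text{A:}  \quad & P(v) \propto a_v^{\tau} \\
\text{DA:} \quad & P(v) \propto d_v \, a_v^{\tau}
\end{align*}
% Sample log-likelihood of the observed targets v_1, ..., v_T,
% where P_t is evaluated on the network just before edge t arrives.
\[
\log L(\tau) = \sum_{t=1}^{T} \log P_t(v_t)
\]
```

Under this reading, plotting log L against tau for each model and data set gives the curves the slide refers to.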

Edge Attachment Process

- Maximum likelihood estimation

Conclusion: model D performs reasonably well compared to the more sophisticated variants based on degree and age

Locality of edge attachment

- Basic Intuition
- While degree preferential attachment appears to be a reasonable model, it fails to take into account the locality of edge attachment
- Intuitively, people are more likely to connect to people with whom they share friends; that is, a new edge tends to span a small number of hops

Locality of edge attachment

- Experiments that empirically justify this intuition
- Plot the probability that a newly created edge spans a given number of hops

In the real data, this probability decreases with the number of hops at a rate that does not match the preferential attachment (PA) model
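The hop measurement can be sketched with a breadth-first search run just before each edge is added. The function names and the toy edge list are illustrative assumptions; edges whose endpoints are not yet connected are simply left out of the histogram.

```python
from collections import deque


def hops_before_edge(adj, source, target):
    """Shortest-path length from source to target, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def hop_histogram(edges):
    """Count, per hop distance, how many new edges spanned that distance."""
    adj = {}
    counts = {}
    for u, v in edges:
        hops = hops_before_edge(adj, u, v)  # measured before adding the edge
        if hops is not None:
            counts[hops] = counts.get(hops, 0) + 1
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return counts


# Toy edge sequence: edges (0, 2) and (0, 3) each close a two-hop path.
print(hop_histogram([(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]))
```

Dividing each count by the total number of counted edges gives the empirical probability of spanning a given number of hops.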

Locality of edge attachment

- Insight from the experiments
- The double exponential decrease of this probability with the number of hops suggests that newly created edges are very likely to span only a small number of hops, forming triangles
- So the degree preferential attachment model should be replaced by triangle-closing models, in which each new edge connects to a node two hops away

Locality of edge attachment

- Mathematical model of triangle-closing
- The edge creation process can be decomposed into two steps: select a neighbor by some random rule, then select one of that neighbor's neighbors by a possibly different rule
- There are many possible triangle-closing models, depending on how the neighbor is selected at each step
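A minimal sketch of the two-step selection, using a uniform random choice at both steps (the "random-random" variant), might look like this. The function name and the adjacency-dictionary representation are assumptions; the sketch does not exclude targets that are already direct neighbors of the source.

```python
import random


def close_triangle_random_random(adj, source, rng):
    """Pick a target two hops from source by two uniform random choices.

    adj maps each node to the set of its neighbors. Returns None when the
    source has no neighbors or the chosen neighbor has no other neighbors.
    """
    neighbors = sorted(adj[source])  # sorted for reproducible sampling
    if not neighbors:
        return None
    middle = rng.choice(neighbors)                      # step 1: a neighbor
    candidates = sorted(w for w in adj[middle] if w != source)
    if not candidates:
        return None
    return rng.choice(candidates)                       # step 2: its neighbor


# Path 0 - 1 - 2: the only node two hops from 0 is node 2.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
print(close_triangle_random_random(adj, 0, random.Random(0)))
```

Other triangle-closing variants follow by swapping either `rng.choice` call for a different selection rule, for example one weighted by degree.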

Locality of edge attachment

- Select the best triangle-closing model using MLE

On average, the random-random triangle-closing model performs relatively well and is used to describe the evolution

Node life time

- Selected Model
- Similar maximum likelihood estimation experiments show that node lifetimes are best modeled by an exponential distribution
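For an exponential lifetime model the maximum likelihood estimate has a closed form. The short derivation below uses assumed notation (lifetimes l_1, ..., l_n and rate lambda) and ignores nodes that are still active when the data ends.

```latex
\begin{align*}
p(\ell \mid \lambda) &= \lambda e^{-\lambda \ell}, \qquad \ell \ge 0 \\
\log L(\lambda) &= \sum_{i=1}^{n} \log p(\ell_i \mid \lambda)
                 = n \log \lambda - \lambda \sum_{i=1}^{n} \ell_i \\
\frac{d \log L}{d \lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} \ell_i = 0
\;\Longrightarrow\;
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} \ell_i}
\end{align*}
```

In words, the estimated rate is the reciprocal of the mean observed lifetime.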

Time gap between edges

- Selected Model
- Intuitively, individuals with more friends are likely to make new friends in a shorter time, so the gap distribution should differ for nodes of different degrees
- More precisely, the gap distribution should be conditional on the degree of the node

Time gap between edges

- Selected Model
- For a specific data set, by estimating a and k for each degree d using maximum likelihood estimation, we can find how a and k each depend on d

While a is a constant independent of d, k is a linear function of d, although the linear coefficient b varies among data sets
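The functional form of the gap distribution is not recoverable from the slide text. The sketch below assumes a power law with exponential cutoff over integer gaps, with weight proportional to delta**(-a) * exp(-k * delta) and k = b * d, which is consistent with a being constant and k linear in d. The form, the function names, and the parameter values are assumptions.

```python
import math


def gap_log_pmf(delta, degree, a, b, max_gap=1000):
    """Log-probability of an integer gap delta for a node of given degree.

    Assumed form: weight(delta) = delta**(-a) * exp(-k * delta), k = b * degree,
    normalized numerically over delta = 1..max_gap.
    """
    k = b * degree

    def log_weight(x):
        return -a * math.log(x) - k * x

    log_norm = math.log(sum(math.exp(log_weight(x)) for x in range(1, max_gap + 1)))
    return log_weight(delta) - log_norm


def mean_gap(degree, a, b, max_gap=1000):
    """Expected gap under the assumed distribution."""
    return sum(
        x * math.exp(gap_log_pmf(x, degree, a, b, max_gap))
        for x in range(1, max_gap + 1)
    )


# Made-up parameter values: higher-degree nodes should wait less on average.
print(mean_gap(1, 1.0, 0.01, max_gap=200), mean_gap(10, 1.0, 0.01, max_gap=200))
```

Summing `gap_log_pmf` over the observed gaps of all nodes with a given degree gives the log-likelihood that would be maximized over a and k.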

Complete Model

Evaluation

- Very novel and rigorous evaluation method
- Basic idea: for a specific data set, if the evolution model is correct, then the static properties of the final network derived mathematically from the model should be close to those observed in the data set

Evaluation

- Very novel and rigorous evaluation method
- 1. Based on the proposed evolution model, analytically derive a mathematical expression for the degree distribution of the final network as a function of the model parameters
- 2. For a specific temporal social network data set, estimate the parameters needed by the evolution model
- 3. Substitute the estimated parameters into the mathematical expression for the degree distribution of the final network
- 4. Compare the result with the true degree distribution observed in the final snapshot of this data set

Evaluation

- Analytic derivation

Estimate the model parameters from the temporal social network data set

Compare the predicted value with the degree distribution parameter estimated directly from the real data set
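For the "estimated directly from the real data set" side of the comparison, one common choice is the continuous maximum likelihood estimator of a power-law exponent. The sketch below assumes the degree distribution is a power law above a cutoff `d_min`; the function names and sample values are illustrative, and the continuous formula is only an approximation for integer degrees.

```python
import math


def power_law_exponent(degrees, d_min=1):
    """Continuous MLE of the exponent: 1 + n / sum(log(d / d_min))."""
    tail = [d for d in degrees if d >= d_min]
    log_sum = sum(math.log(d / d_min) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one degree above d_min")
    return 1.0 + len(tail) / log_sum


def relative_gap(predicted, observed):
    """Relative difference between a predicted and an observed exponent."""
    return abs(predicted - observed) / abs(observed)


# Made-up degree sample and a made-up model prediction.
observed = power_law_exponent([1, 1, 2, 3, 5, 8, 13, 40])
print(observed, relative_gap(2.0, observed))
```

A small relative gap between the exponent predicted from the fitted evolution parameters and the one measured on the final snapshot is what the evaluation looks for.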

Evaluation

- Results

Surprisingly Similar!

Thank you!

Q & A