IST 511 Information Management Information and

Technology Networks and Social Networks

- Dr. C. Lee Giles
- David Reese Professor, College of Information

Sciences and Technology - Professor of Computer Science and Engineering
- Professor of Supply Chain and Information Systems
- The Pennsylvania State University, University

Park, PA, USA - giles_at_ist.psu.edu
- http//clgiles.ist.psu.edu

Special thanks to P. Tsaparas, P. Baldi, P.

Frasconi, P. Smyth, Michael Kearns, James Moody,

Anna Nagurney

Last Time

- What is AI
- Definitions
- Theories/hypotheses
- Why do we care
- Impact on information science
- Techniques used in information science
- Wide variety of subfields

Today

- What are networks
- Definitions
- Theories
- Social networks
- Why do we care
- Impact on information science

Tomorrow

- Topics used in IST
- Machine learning
- Text Information retrieval
- Linked information and search
- Encryption
- Probabilistic reasoning
- Digital libraries
- Others?

Theories in Information Sciences

- Enumerate some of these theories in this course.
- Issues
- Unified theory?
- Domain of applicability
- Conflicts
- Theories here are mostly algorithmic
- Quality of theories
- Occams razor
- Subsumption of other theories
- Theories of networks

Networked Life

- Physical, social, biological, etc
- Hybrids
- Static vs dynamic
- Local vs global
- Measurable and reproducible

- Points power stations
- Operated by companies
- Connections embody business relationships
- Food for thought
- 2003 Northeast blackout

North American Power Grid

- Purely biological network
- Links are physical
- Interaction is electrical

The Human Brain

The Premise of Networked Life

- It makes sense to study these diverse networks

together. - The Commonalities
- Formation (distributed, bottom-up, organic,)
- Structure (individuals, groups, overall

connectivity, robustness) - Decentralization (control, administration,

protection,) - Strategic Behavior (economic, free riding,

Tragedies of the Common) - An Emerging Science
- Examining apparent similarities between many

human and technological systems organizations - Importance of network effects in such systems
- How things are connected matters greatly
- Details of interaction matter greatly
- The metaphor of viral spread
- Dynamics of economic and strategic interaction
- Qualitative and quantitative can be very subtle
- A revolution of measurement, theory, and breadth

of vision

Whos Doing All This?

- Computer Information Scientists
- Understand and design complex, distributed

networks - View competitive decentralized systems as

economies - Social Scientists, Behavioral Psychologists,

Economists - Understand human behavior in simple settings
- Revised views of economic rationality in humans
- Theories and measurement of social networks
- Physicists and Mathematicians
- Interest and methods in complex systems
- Theories of macroscopic behavior (phase

transitions) - All parties are interacting and collaborating

Examples

- Theories
- Apps in all areas

The Networked Nature of Society

- Networks as a collection of pairwise relations
- Examples of (un)familiar and important networks
- social networks
- content networks
- technological networks
- biological networks
- economic networks
- The distinction between structure and dynamics

A network-centric overview of modern society.

- Points are still machines but are associated

with people - Links are still physical but may depend on

preferences - Interaction content exchange
- Food for thought free riding

Gnutella Peers

- A purely technological network?
- Points are physical machines
- Links are physical wires
- Interaction is electronic
- What more is there to say?

Internet, Router Level

- Points sovereign nations
- Links exchange volume
- A purely virtual network

Foreign Exchange

Contagion, Tipping and Networks

- Epidemic as metaphor
- The three laws of Gladwell
- Law of the Few (connectors in a network)
- Stickiness (power of the message)
- Power of Context
- The importance of psychology
- Perceptions of others
- Interdependence and tipping
- Paul Revere, Sesame Street, Broken Windows, the

Appeal of Smoking, and Suicide Epidemics

Graph Network Theory

- Networks of vertices and edges
- Graph properties
- cliques, independent sets, connected components,

cuts, spanning trees, - social interpretations and significance
- Special graphs
- bipartite, planar, weighted, directed, regular,
- Computational issues at a high level

What is a network?

- Network a collection of entities that are

interconnected with links. - people that are friends
- computers that are interconnected
- web pages that point to each other
- proteins that interact

Graphs

- In mathematics, networks are called graphs, the

entities are nodes, and the links are edges - Graph theory starts in the 18th century, with

Leonhard Euler - The problem of Königsberg bridges
- Since then graphs have been studied extensively.

Academic genealogy

Networks in the past

- Graphs have been used in the past to model

existing networks (e.g., networks of highways,

social networks) - usually these networks were small
- network can be studied visual inspection can

reveal a lot of information

Networks now

- More and larger networks appear
- Products of technological advancement
- e.g., Internet, Web
- Result of our ability to collect more, better,

and more complex data - e.g., gene regulatory networks
- Networks of thousands, millions, or billions of

nodes - impossible to visualize

The internet map

Understanding large graphs

- What are the statistics of real life networks?
- Can we explain how the networks were generated?

Measuring network properties

- Around 1999
- Watts and Strogatz, Dynamics and small-world

phenomenon - Faloutsos, On power-law relationships of the

Internet Topology - Kleinberg et al., The Web as a graph
- Barabasi and Albert, The emergence of scaling in

real networks

Real network properties

- Most nodes have only a small number of neighbors

(degree), but there are some nodes with very high

degree (power-law degree distribution) - scale-free networks
- If a node x is connected to y and z, then y and z

are likely to be connected - high clustering coefficient
- Most nodes are just a few edges away on average.
- small world networks
- Networks from very diverse areas (from internet

to biological networks) have similar properties - Is it possible that there is a unifying

underlying generative process?

Generating random graphs

- Classic graph theory model (Erdös-Renyi)
- each edge is generated independently with

probability p - Very well studied model but
- most vertices have about the same degree
- the probability of two nodes being linked is

independent of whether they share a neighbor - the average paths are short

Modeling real networks

- Real life networks are not random
- Can we define a model that generates graphs with

statistical properties similar to those in real

life? - a flurry of models for random graphs

Processes on networks

- Why is it important to understand the structure

of networks? - Epidemiology Viruses propagate much faster in

scale-free networks - Vaccination of random nodes does not work, but

targeted vaccination is very effective - Random sampling can be dangerous!

The basic random graph model

- The measurements on real networks are usually

compared against those on random networks - The basic Gn,p (Erdös-Renyi) random graph model
- n the number of vertices
- 0 p 1
- for each pair (i,j), generate the edge (i,j)

independently with probability p

Degree distributions

frequency

fk fraction of nodes with degree k p(k)

probability of a randomly selected node to

have degree k

fk

degree

k

- Problem find the probability distribution that

best fits the observed data

Power-law distributions

- The degree distributions of most real-life

networks follow a power law - Right-skewed/Heavy-tail distribution
- there is a non-negligible fraction of nodes that

has very high degree (hubs) - scale-free no characteristic scale, average is

not informative - In stark contrast with the random graph model!
- Poisson degree distribution, znp
- highly concentrated around the mean
- the probability of very high degree nodes is

exponentially small

p(k) Ck-a

Power-law signature

- Power-law distribution gives a line in the

log-log plot - a power-law exponent (typically 2 a 3)

log p(k) -a logk logC

a

log frequency

frequency

log degree

degree

Examples of degree distribution for power laws

Taken from Newman 2003

A random graph example

Exponential distribution

- Observed in some technological or collaboration

networks - Identified by a line in the log-linear plot

p(k) le-lk

log p(k) - lk log l

log frequency

?

degree

Average/Expected degree

- For random graphs z np
- For power-law distributed degree
- if a 2, it is a constant
- if a lt 2, it diverges

Maximum degree

- For random graphs, the maximum degree is highly

concentrated around the average degree z - For power law graphs

Collective Statistics (M. Newman 2003)

Clustering coefficient

- In graph theory, a clustering coefficient is a

measure of degree to which nodes in a graph tend

to cluster together. - Evidence suggests that in most real-world

networks, and in particular social networks,

nodes tend to create tightly knit groups

characterized by a relatively high density of

ties (Holland and Leinhardt, 19711 Watts and

Strogatz, 19982). - In real-world networks, this likelihood tends to

be greater than the average probability of a tie

randomly established between two nodes (Holland

and Leinhardt, 1971 Watts and Strogatz, 1998).

Clustering (Transitivity) coefficient

- Measures the density of triangles (local

clusters) in the graph - Two different ways to measure it, C1 C2
- The ratio of the means

Exampleundirected graph

1

4

3

2

5

Triangles one each centered at nodes, 1, 2,

3 Triples none centered for nodes 4, 5 node 1

213 node 2 123 node 3 134, 135, 234, 235,

132, 231

Clustering (Transitivity) coefficient

- Clustering coefficient for node i
- The mean of the ratios

Example

- The two clustering coefficients give different

measures - C(2) increases with nodes with low degree

1

4

3

2

5

Collective Statistics (M. Newman 2003)

Clustering coefficient for random graphs

- The probability of two of your neighbors also

being neighbors is p, independent of local

structure - clustering coefficient C p
- when z is fixed C z/n O(1/n)

The C(k) distribution

- The C(k) distribution is supposed to capture the

hierarchical nature of the network - when constant no hierarchy
- when power-law hierarchy

C(k) average clustering coefficient of nodes

with degree k

C(k)

degree

k

Millgrams small world experiment

- Letters were handed out to people in Nebraska to

be sent to a target in Boston - People were instructed to pass on the letters to

someone they knew on first-name basis - The letters that reached the destination followed

paths of length around 6 - Six degrees of separation (play of John Guare)
- Also
- The Kevin Bacon game
- The Erdös number
- Small world project http//smallworld.columbia.ed

u/index.html

Measuring the small world phenomenon

- dij shortest path between i and j
- Diameter
- Characteristic path length
- Harmonic mean
- Also, distribution of all shortest paths

Collective Statistics (M. Newman 2003)

Is the path length enough?

- Random graphs have diameter
- dlogn/loglogn when z?(log n)
- Short paths should be combined with other

properties - ease of navigation
- high clustering coefficient

Degree correlations

- Do high degree nodes tend to link to high degree

nodes? - Pastor Satoras et al.
- plot the mean degree of the neighbors as a

function of the degree

Degree correlations

- Newman
- compute the correlation coefficient of the

degrees of the two endpoints of an edge - assortative/disassortative

Collective Statistics (M. Newman 2003)

Connected components

- For undirected graphs, the size and distribution

of the connected components - is there a giant component?
- For directed graphs, the size and distribution of

strongly and weakly connected components

Network Resilience

- Study how the graph properties change when

performing random or targeted node deletions

Social Networks

- A social network is a social structure of people,

related (directly or indirectly) to each other

through a common relation or interest - Social network analysis (SNA) is the study of

social networks to understand their structure and

behavior

(Source Freeman, 2000)

Social Network Theory

- Metrics of social importance in a network
- degree, closeness, between-ness, clustering
- Local and long-distance connections
- SNT universals
- small diameter
- clustering
- heavy-tailed distributions
- Models of network formation
- random graph models
- preferential attachment
- affiliation networks
- Examples from society, technology and fantasy

The Web as a Network

- Empirical web structure and components
- Web and blog communities
- Web search
- hubs and authorities
- the PageRank algorithm
- The Main Streets and dark alleys of the web

The algorithmic and social implications of

network structure.

Towards RationalityEmergence of Global from

Local

- Beyond the dynamics of transmission
- Context, motivation and influence
- The madness/wisdom of crowds
- thresholds and cascades
- mathematical models of tipping
- the market for lemons
- private preferences and global segregation

Interdependent Security and Networks

- Security investment and Tragedies of the Commons
- Catastrophic events you can only die once
- Fire detectors, airline security, Arthur

Anderson,

Blending network, behavior and dynamics.

Network Economics

- Buying and selling on a network
- Modeling constraints on trading partners
- Local imbalances of supply and demand
- Preferential attachment, price variation, and the

distribution of wealth

The effects of network structure on economic

outcomes.

Modern Financial Markets

- Stock market networks
- correlation of returns
- Market microstructure
- limit and market orders
- order books and electronic crossing networks
- network, connectivity and data issues
- Quantitative trading
- VWAP trading, market making
- limit order power laws
- Herd behavior in trading
- Economic theory and financial markets
- Behavioral economics and finance
- Impacts of the Internet on financial markets

A study of the network that runs the world.

Definition of Social Networks

- A social network is a set of actors that may

have relationships with one another. Networks can

have few or many actors (nodes), and one or more

kinds of relations (edges) between pairs of

actors. (Hannemann, 2001)

History (based on Freeman, 2000)

- 17th century Spinoza developed first model
- 1937 J.L. Moreno introduced sociometry he also

invented the sociogram - 1948 A. Bavelas founded the group networks

laboratory at MIT he also specified centrality

Social Networking

- Large number of sites available throughout the

world

History (based on Freeman, 2000)

- 1949 A. Rapaport developed a probability based

model of information flow - 50s and 60s Distinct research by individual

researchers - 70s Field of social network analysis emerged.
- New features in graph theory more general

structural models - Better computer power analysis of complex

relational data sets

Foundations Theory

Structural Analysis from method and metaphor to

theory and substance.

H. White The presently existing, largely

categorical descriptions of social structure have

no solid theoretical grounding furthermore,

network concepts may provide the only way to

construct a theory of social structure. (p.25)

Integration of large-scale social systems

Form Vs. Content

Introduction

- Social network analysis is
- a set of relational methods for systematically

understanding and identifying connections among

actors. SNA - is motivated by a structural intuition based on

ties linking social actors - is grounded in systematic empirical data
- draws heavily on graphic imagery
- relies on the use of mathematical and/or

computational models. - Social Network Analysis embodies a range of

theories relating types of observable social

spaces and their relation to individual and group

behavior.

Introduction

What are social relations?

A social relation is anything that links two

actors. Examples include Kinship Co-membership

Friendship Talking with Love Hate Exchang

e Trust Coauthorship Fighting

Introduction

What properties relations are studied?

The substantive topics cross all areas of

sociology. But we can identify types of

questions that social network researchers

ask 1) Social network analysts often study

relations as systems. That is, what is of

interest is how the pattern of relations among

actors affects individual behavior or system

properties.

Introduction

High Schools as Networks

(No Transcript)

(No Transcript)

Introduction

Why do Networks Matter?

Local vision

Introduction

Why do Networks Matter?

Local vision

Representation of Social Networks

- Matrices
- Graphs

Ann

Sue

Nick

Rob

Graphs - Sociograms (based on Hanneman, 2001)

- Labeled circles represent actors
- Line segments represent ties
- Graph may represent one or more types of

relations - Each tie can be directed or show co-occurrence
- Arrows represent directed ties

Graphs Sociograms (based on Hanneman, 2001)

- Strength of ties
- Nominal
- Signed
- Ordinal
- Valued

Visualization Software Krackplot

Connections

- Size
- Number of nodes
- Density
- Number of ties that are present vs the amount of

ties that could be present - Out-degree
- Sum of connections from an actor to others
- In-degree
- Sum of connections to an actor
- Diameter
- Maximum greatest least distance between any actor

and another

Some Measures of Distance

- Walk (path)
- A sequence of actors and relations that begins

and ends with actors - Geodesic distance (shortest path)
- The number of actors in the shortest possible

walk from one actor to another - Maximum flow
- The amount of different actors in the

neighborhood of a source that lead to pathways to

a target

Some Measures of Power (based on Hanneman, 2001)

- Degree (indegree, outdegree)
- Sum of connections from or to an actor
- Closeness centrality
- Distance of one actor to all others in the

network - Betweenness centrality
- Number that represents how frequently an actor is

between other actors geodesic paths

Cliques and Social Roles (based on Hanneman,

2001)

- Cliques
- Sub-set of actors
- More closely tied to each other than to actors

who are not part of the sub-set - Social roles
- Defined by regularities in the patterns of

relations among actors

SNA applications

- Many new unexpected applications plus many of the

old ones - Marketing
- Advertising
- Economic models and trends
- Political issues
- Organization
- Services to social network actors
- Travel guides
- Jobs
- Advice
- Human capital analysis and predictions
- Medical
- Epidemiology
- Defense (terrorist networks)

Examples of Applications (based on Freeman, 2000)

- Visualizing networks
- Studying differences of cultures and how they can

be changed - Intra- and interorganizational studies
- Spread of illness, especially HIV

Foundations Data

The unit of interest in a network are the

combined sets of actors and their relations. We

represent actors with points and relations with

lines. Actors are referred to variously

as Nodes, vertices, actors or

points Relations are referred to variously

as Edges, Arcs, Lines, Ties

Example

b

d

a

c

e

Foundations Data

- Social Network data consists of two linked

classes of data - Nodes Information on the individuals (actors,

nodes, points, vertices) - Network nodes are most often people, but can be

any other unit capable of being linked to another

(schools, countries, organizations,

personalities, etc.) - The information about nodes is what we usually

collect in standard social science research

demographics, attitudes, behaviors, etc. - Often includes dynamic information about when the

node is active - b) Edges Information on the relations among

individuals (lines, edges, arcs) - Records a connection between the nodes in the

network - Can be valued, directed (arcs), binary or

undirected (edges) - One-mode (direct ties between actors) or two-mode

(actors share membership in an organization) - Includes the times when the relation is active
- Graph theory notation G(V,E)

Foundations Data

In general, a relation can be (1) Binary or

Valued (2) Directed or Undirected

The social process of interest will often

determine what form your data take. Almost all

of the techniques and measures we describe can be

generalized across data format.

Foundations Data and social science

Global-Net

Foundations Data

We can examine networks across multiple levels

1) Ego-network - Have data on a respondent (ego)

and the people they are connected to (alters).

Example terrorist networks - May include

estimates of connections among alters

2) Partial network - Ego networks plus some

amount of tracing to reach contacts of contacts

- Something less than full account of

connections among all pairs of actors in the

relevant population - Example CDC Contact

tracing data

Foundations Data

We can examine networks across multiple levels

- 3) Complete or Global data
- - Data on all actors within a particular

(relevant) boundary - - Never exactly complete (due to missing data),

but boundaries are set - Example Coauthorship data among all writers in

the social sciences, friendships among all

students in a classroom

Foundations Graphs

Working with pictures. No standard way to draw a

sociogram which are equal?

Foundations Graphs

Network visualization helps build intuition, but

you have to keep the drawing algorithm in mind

Spring-embeder layouts

Tree-Based layouts

Most effective for very sparse, regular graphs.

Very useful when relations are strongly directed,

such as organization charts, internet connections,

Most effective with graphs that have a strong

community structure (clustering, etc). Provides

a very clear correspondence between social

distance and plotted distance

Two images of the same network

Foundations Graphs

Network visualization helps build intuition, but

you have to keep the drawing algorithm in mind

Spring-embeder layouts

Tree-Based layouts

Two images of the same network

Foundations Graphs

Network visualization helps build intuition, but

you have to keep the drawing algorithm in

mind. Hierarchy Tree models Use optimization

routines to add meaning to the Y-axis of the

plot. This makes it possible to easily see who

is most central because of who is on the top of

the figure. Usually includes some routine for

minimizing line-crossing. Spring Embedder

layouts Work on an analogy to a physical system

ties connecting a pair have springs that pull

them together. Unconnected nodes have springs

that push them apart. The resulting image

reflects the balance of these two features. This

usually creates a correspondence between physical

closeness and network distance.

Foundations Graphs

Foundations Graphs

Using colors to code attributes makes it simpler

to compare attributes to relations. Here we can

assess the effectiveness of two different

clustering routines on a school friendship

network.

Foundations Graphs

As networks increase in size, the effectiveness

of a point-and-line display diminishes - run out

of plotting dimensions. Insights from the

overlap that results in from a space-based

layout as information. Here you see the

clustering evident in movie co-staring for about

8000 actors.

Foundations Graphs

This figure contains over 29,000 social science

authors. The two dense regions reflect different

topics.

Foundations Graphs

As networks increase in size, the effectiveness

of a point-and-line display diminishes, because

you simply run out of plotting dimensions. Ive

found that you can still get some insight by

using the overlap that results in from a

space-based layout as information. This figure

contains over 29,000 social science authors. The

two dense regions reflect different topics.

Foundations Graphs and time

Adding time to social networks is also

complicated, run out of space to put time in most

network figures. One solution animate the

network - make a movie! Here we see streaming

interaction in a classroom, where the teacher

(yellow square) has trouble maintaining

order. The SoNIA software program (McFarland and

Bender-deMoll)

Foundations Methods

Graphs are cumbersome to work with analytically,

though there is a great deal of good work to be

done on using visualization to build network

intuition. Recommendation use layouts that

optimize on the feature you are most interested

in.

A graph is vertices and edges

- A graph is vertices joined by edges
- i.e. A set of vertices V and a set of edges E
- A vertex is defined by its name or label
- An edge is defined by the two vertices which it

connects, plus optionally - An order of the vertices (direction)
- A weight (usually a number)
- Two vertices are adjacent if they are connected

by an edge - A vertexs degree is the no. of its edges

Directed graph (digraph)

- Each edge is an ordered pair of vertices, to

indicate direction - Lines become arrows
- The indegree of a vertex is the number of

incoming edges - The outdegree of a vertex is the number of

outgoing edges

E

210

M

450

190

60

B

200

130

L

P

Traversing a graph (1)

- A path between two vertices exists if you can

traverse along edges from one vertex to another - A path is an ordered list of vertices
- length the number of edges in the path
- cost the sum of the weights on each edge in the

path - cycle a path that starts and finishes at the

same vertex - An acyclic graph contains no cycles

Traversing a graph (2)

- Undirected graphs are connected if there is a

path between any pair of vertices - Digraphs are usually either densely or sparsely

connected - Densely the ratio of number of edges to number

of vertices is large - Sparsely the above ratio is small

E

M

B

L

P

Two graph representationsadjacency matrix and

adjacency list

- Adjacency matrix
- n vertices need a n x n matrix (where n V,

i.e. the number of vertices in the graph) - can

store as an array - Each position in the matrix is 1 if the two

vertices are connected, or 0 if they are not - For weighted graphs, the position in the matrix

is the weight - Adjacency list
- For each vertex, store a linked list of adjacent

vertices - For weighted graphs, include the weight in the

elements of the list

Representing an unweighted, undirected graph

(example)

0E

1M

2B

3L

4P

Representing a weighted, undirected graph

(example)

0E

210

1M

450

190

60

2B

200

130

3L

4P

Representing an unweighted, directed graph

(example)

0E

1M

2B

3L

4P

Comparing the two representations

- Space complexity
- Adjacency matrix is O(V2)
- Adjacency list is O(V E)
- E is the number of edges in the graph
- Static versus dynamic representation
- An adjacency matrix is a static representation

the graph is built in one go, and is difficult

to alter once built - An adjacency list is a dynamic representation

the graph is built incrementally, thus is more

easily altered during run-time

Algorithms involving graphs

- Graph traversal
- Shortest path algorithms
- In an unweighted graph shortest length between

two vertices - In a weighted graph smallest cost between two

vertices - Minimum Spanning Trees
- Using a tree to connect all the vertices at

lowest total cost

Graph traversal algorithms

- When traversing a graph, we must be careful to

avoid going round in circles! - We do this by marking the vertices which have

already been visited - Breadth-first search uses a queue to keep track

of which adjacent vertices might still be

unprocessed - Depth-first search keeps trying to move forward

in the graph, until reaching a vertex with no

outgoing edges to unmarked vertices

Shortest path (unweighted)

- The problem Find the shortest path from a vertex

v to every other vertex in a graph - The unweighted path measures the number of edges,

ignoring the edges weights (if any)

Shortest unweighted pathsimple algorithm

For a vertex v, dv is the distance between a

starting vertex and v

- 1 Mark all vertices with dv infinity
- 2 Select a starting vertex s, and set ds 0, and

set shortest 0 - 3 For all vertices v with dv shortest, scan

their adjacency lists for vertices w where dw is

infinity - For each such vertex w, set dw to shortest1
- 4 Increment shortest and repeat step 3, until

there are no vertices w

Foundations Build a socio-matrix

From pictures to matrices

Undirected, binary

Directed, binary

Foundations Methods

From matrices to lists

Arc List

Adjacency List

a b b a b c c b c d c e d c d e e c e d

Foundations Basic Measures

Basic Measures For greater detail,

see http//www.analytictech.com/networks/graphth

eory.htm

Volume

The first measure of interest is the simple

volume of relations in the system, known as

density, which is the average relational value

over all dyads. Under most circumstances, it is

calculated as

1???0

Foundations Basic Measures

Volume

At the individual level, volume is the number of

relations, sent or received, equal to the row and

column sums of the adjacency matrix.

Node In-Degree Out-Degree a

1 1 b 2 1 c

1 3 d 2 0 e

1 2 Mean 7/5 7/5

Foundations Data

Basic Measures

Reachability

Indirect connections are what make networks

systems. One actor can reach another if there is

a path in the graph connecting them.

a

b

d

a

c

e

f

Foundations Basic Matrix Operations

One of the key advantages to storing networks as

matrices is that we can use all of the tools from

linear algebra on the socio-matrix. Some of the

basics matrix manipulations that we use are as

follows

- Definition
- A matrix is any rectangular array of numbers. We

refer to the matrix dimension as the number of

rows and columns

(5 x 5)

(5x2)

(5x1)

Foundations Basic Matrix Operations

Matrix operations work on the elements of the

matrix in particular ways. To do so, the

matrices must be conformable. That means the

sizes allow the operation. For addition (),

subtraction (-), or elementwise multiplication

(), both matrices must have the same number of

rows and columns. For these operations, the

matrix value is the operation applied to the

corresponding cell values.

-1 0 -3 6 2 1

3 6 11 8 2 9

1 3 4 7 2 5

2 3 7 1 0 4

A-B

AB

A

B

2 9 28 7 0 20

3 9 12 21 6 15

AB

Multiplication by a scalar 3A

Matrix properties

- Addition contributes to the actors relations
- Multiplication sums over a trait.
- Negative values can occur
- (friend, dont care, enemy) (1,0,-1)
- Interpret operations carefully

Foundations Basic Matrix Operations

The transpose ( or T) of a matrix reverses the

row and column dimensions. AtijAji So a M x

N matrix becomes an N x M matrix.

T

a b c d e f

a c e b d f

Foundations Basic Matrix Operations

The matrix multiplication (x) of two matrices

involves all elements of the matrix, and will

often result in a matrix of new dimensions. In

general, to be conformable, the inner dimension

of both matrices must match. So A3x2 x B2x3

C3 x 3 But A3x3 x B2x3 is not defined

(actually a tensor) Substantively, adding

names to the dimensions will help us keep track

of what the resulting multiplications mean So

multiplying (send x receive)x (send x receive)

(send x receive), giving us the two-step

distances (the senders recipient's receivers).

Foundations Basic Matrix Operations

The multiplication of two matrices Amxn and Bnxq

results in Cmxq

a b c d

e f g h

aebg afbh cedg cfdh

a b c d e f

agbj ahbk aibl cgdj chdk cidl egfg

ehfk eifl

g h i j k l

(3x2) (2x3)

(3x3)

Foundations Basic Matrix Operations

The powers (square, cube, etc) of a matrix are

just the matrix times itself that many

times. A2 AA or A3 AAA We often use

matrix multiplication to find types of people one

is tied to, since the 1 in the adjacency matrix

effectively captures just the people each row is

connected to.

Foundations Data

Basic Measures

Reachability

The distance from one actor to another is the

shortest path between them, known as the geodesic

distance. If there is at least one path

connecting every pair of actors in the graph, the

graph is connected and is called a component.

Two paths are independent if they only have the

two end-nodes in common. If a graph has two

independent paths between every pair, it is

biconnected, and called a bicomponent. Similarly

for three paths, four, etc.

Foundations Data

Calculate reachability through matrix

multiplication. (see p.162 of WF)

Total of directed walks for power n

Minimal distance from one node to another

Foundations Data

Mixing patterns

Matrices make it easy to look at mixing patterns

connections among types of nodes. Simply

multiply an indicator of category by the

adjacency matrix.

e

d

c

f

B 4 to selves B 2 to G G 2 to B G 6 to selves

b

a

Foundations Data

Matrix manipulations allow you to look at

direction of ties, and distinguish symmetric

from asymmetric ties.

To transform an asymmetric graph to a symmetric

graph, add it to its transpose.

X 0 1 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0

0 1 1 0

XT 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1

0 0

Max Sym MIN Sym 0 1 0 0 0 0 1 0 0

0 1 0 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0

0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0

1 0 0

0 2 0 0 0 2 0 1 0 0 0 1 0 1 2 0 0 1 0 1 0 0 2 1 0

Interpretation?

Graphs / matrices

- Analysis of social structure
- Visualization tools
- Other methods
- Statistics
- Power laws
- Bayesian graphs

Global web Winners take all

- Pr(page has k inlinks) ? k-? ??2.1
- Popular few receive disproportionate share of

links - ? traffic, ? prob SE indexing, ? SE ranking

Category-specific webWinners dont (quite)

take all

- All US company homepages
- hist w/ exp ? buckets (const on log scale)
- Strong deviation from pure power law
- Unimodal (?l.n.) body,power law tail
- Less skewed many fare well against mode

Pennock, Giles, et.al PNAS 2002

Applications of Networked Life

- Social structure in organizations
- Economic and business behavior
- Epidemiology
- Information discovery
- Design and robustness of networks

SNA disciplines

- More diverse than expected!
- Sociology
- Political Science
- Business
- Economics
- Sciences
- Computer science
- Information science
- Others?

Citation Graph - social network

Bibliographic coupling

cites

Paper

cocitation

is cited by

Note that journal citations nearly always refer

to earlier work.

Graphical Analysis of Hyperlinks on the Web -

social network?

This page links to many other pages (hub)

2

1

4

Many pages link to this page (authority)

3

6

5

PageRank as social network analysis

- PageRank (social rank) flowing from pages to

the pages they cite.

.1

.09

SN on the web - services

- A social network service uses software to build

online social networks for communities of people

who share interests and activities or who are

interested in exploring the interests and

activities of others. - wikipedia - Friending
- MySpace
- Second life
- This will only increase!
- Large complex, heterogeneous networks
- Latours actor-network model
- Different entities connect actors
- Coauthorship network connected by papers

example networks of people and articles (e.g.,

citation and co-authorship networks)

this image is from the system ReferalWeb by

Henry Katz et al. at ATT Research http//foraker.r

esearch.att.com/refweb/version2/RefWeb.html

SNA and the Web 2.0

- Wikis
- Blogs
- Folksonomies
- Collaboratories
- What next?

Computational SNA Models

- New models are emerging
- Very large network analysis is possible!
- Deterministic - algebraic
- Early models still useful
- Statistical
- Descriptive using many features
- Diameter, betweeness,
- Probabilistic graphs
- Generative
- Creates SNA based on agency, documents,

geography, etc. - Community discovery and prediction

Graphical models

- Modeling the document generation

Existing three generative models. Three

variables in the generation of documents are

considered (1) authors (2) words and (3)

topics (latent variable)

Theories used in SNA

- Graph/network
- Heterogeneous graphs
- Hypergraphs
- Probabilistic graphs
- Economics/game theory
- Optimization
- Visualization/HCI
- Actor/Network
- Many more

Future of social networks?

- Top End User Predictions for 2010 - Gartner
- By 2012, Facebook will become the hub for social

networks integration and Web socialization. - Internet marketing will be regulated by 2015,

controlling more than 250 billion in Internet

marketing spending worldwide. - By 2014, more than three billion of the worlds

adult population will be able to transact

electronically via mobile and Internet

technology. - By 2015, context will be as influential to mobile

consumer services and relationships as search

engines are to the Web. - By 2013, mobile phones will overtake PCs as the

most common Web access device worldwide.

Open questions

- Scalability
- Data acquisition and data rights
- Search (socialnetworkrank?)
- CollabSeer
- Trust
- Heterogeneous network analysis
- Business models!

What next?

- More personal search
- Mobile search
- Specialty search
- Freshness search
- ?

Search as a problem is only 5 solved Udi

Manber, 1st Yahoo, 2nd Amazon, now Google

Social networks vs social networking

- Social networks are links of actors and their

relationships usually represented as a graph or

network - Social networking is the actual implementation of

social networks in the digital world or media - A social network service focuses on building and

reflecting of social networks or social relations

among people, e.g., who share interests and/or

activities. A social network service essentially

consists of a representation of each user (often

a profile), his/her social links, and a variety

of additional services. Most social network

services are web based and provide means for

users to interact over the internet, such as

e-mail and instant messaging.

Facebook vs Google

Web 2.0

- A perceived second generation of web development

and design, that aims to facilitate

communication, secure information sharing,

interoperability, and collaboration on the World

Wide Web. - Web 2.0 concepts have led to the development and

evolution of web-based communities, hosted

services, and applications such as

social-networking sites, video-sharing sites,

wikis, blogs, and folksonomies.

Social Media

- Information content created by people using

highly accessible and scalable publishing

technologies that is intended to facilitate

communications, influence and interaction with

peers and with public audiences, typically via

the Internet and mobile communications networks. - The term most often refers to activities that

integrate technology, telecommunications and

social interaction, and the construction of

words, pictures, videos and audio. - Businesses also refer to social media as

user-generated content (UGC) or

consumer-generated media (CGM).

Social Media on Web 2.0

- Multimedia
- Photo-sharing Flickr
- Video-sharing YouTube
- Audio-sharing imeem
- Entertainment
- Virtual Worlds Second Life
- Online Gaming World of Warcraft
- News/Opinion
- Social news Digg, Reddit
- Reviews Yelp, epinions
- Communication
- Microblogs Twitter, Pownce
- Events Evite
- Social Networking Services
- Facebook, LinkedIn, MySpace

Top 10 Websites

- Googe
- Microsoft
- Yahoo!
- AOL
- Wikimedia
- eBay
- CBS
- Fox Interactive Media
- Amazon Sites
- Source comScore as of August 2008

Top 10 Websites - 2009

Top 10 Websites - Google

Feb 2011

But not everyone agrees

But not everyone agrees

Top 10 Social Media Websites

Other opinions

Top Websites MPM

Social Network Service

- A social network service focuses on building

online communities of people who share interests

and/or activities, or who are interested in

exploring the interests and activities of others.

- Most social network services are web based and

provide a variety of ways for users to interact,

such as e-mail and instant messaging services.

Once Popular Social Networking Sites by Location

- North America
- MySpace and Facebook, Nexopia (mostly in Canada)
- South and Central America
- Orkut, Facebook and Hi5
- Europe
- Bebo,Facebook, Hi5, MySpace, Tagged, Xing and

Skyrock - Asia and Pacific
- Friendster, Orkut, Xiaonei and Cyworld

Usage of Social Network

Types of Social Networking Services

- Profile-based
- Content-based
- White-label
- Multi-user Virtual environments
- Mobile
- Micro-blogging
- Social Search

Profile-based

- Primarily organized around members profile pages

pages that mainly consist of information about

an individual member, including the persons

picture and details of interests, likes and

dislikes. - Bebo, Facebook and MySpace are all good examples

of profile-based services.

Content-based

- In these services, the users profile remains an

important way of organizing connections, but

plays a secondary role to the posting of content.

- Examples of content-based communities
- YouTube.com for video sharing
- Last.fm, in which the content is arranged by

software that monitors and represents the music

that users listen to. - Photo-sharing site like Flickr.com.

White-label

- Offer some group-building functionality, which

allows users to form mini-communities within

sites. - Example sites
- PeopleAggregator
- Ning
- Colexn
- Seem to be losing popularity

Multi-user Virtual Environment

- allow users to interact with each others avatars

- users have profile cards, their functional

profiles are the characters they customize or

build and control. Friends lists are usually

private and not publicly shared or displayed. - Example Sites
- Second Life
- World of Warcraft

Mobile Social Networking

- Many social networking sites, for example MySpace

and Twitter, offer mobile phone versions of their

services, allowing members to interact with their

friends via their phones. - mobile-led and mobile-only communities include

profiles and media-sharing just as with web-based

social networking services. - MYUBO, for example, allows users to share and

view video over mobile networks - MobiSNA, mobile video social networking

Micro-blogging

- Allows you to publish short (140 characters,

including spaces) messages publicly or within

contact groups. - These services are designed to work as mobile

services, but are popularly used on the web as

well. - Example Services
- Jaiku

Social Search

- Social search engines are an important web

development which utilise the popularity of

social networking services. - There are various kinds of social search engine,

but sites like Wink and Spokeo generate results

by searching across the public profiles of

multiple social networking sites, allowing the

creation of web-based dossiers on individuals. - This type of people search cuts across the

traditional boundaries of social networking site

membership, although any data retrieved should

already be in the public domain.

Things you can do in a Social Network

- Communicating with existing networks, making and

developing friendships/contacts - Represent themselves online, create and develop

an online presence - Viewing content/finding information
- Creating and customizing profiles
- Authoring and uploading content
- Adding and sharing content
- Posting messages public private
- Collaborating with other people

Future of social networks?

- Tribes - Seth Godin
- Internet mobbing
- Will there be social networking wars?
- MySpace
- Friendster
- Build your own Ning
- Borg

What weve covered

- Networks
- Physical, social, biological, etc
- Hybrids
- Static vs dynamic
- Local vs global
- Measurable and reproducible
- Social networking

Questions

- Role of networks in information science?
- Is certain social networking bad for society?
- Hurting our culture
- Future of social networking?