Compiling and
Analyzing Network Data using a Sample Infrastructure Network.
Tool Used: Gephi
Tool Used: Gephi
Dataset Used: GML
(Power Grid): An undirected, unweighted network representing the topology of the
Western States Power Grid of the United States.
Citation: D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998).
Original Source:
Gephi Source: https://github.com/gephi/gephi/wiki/Datasets
Basic Description:
? nodes
? edges
Undirected
Static
Unweighted
Summary:
An undirected,
unweighted network representing the topology of the Western States Power Grid
of the United States. Data compiled by D. Watts and S. Strogatz.
Name N E
Directed Description
power 4941 6594 False Power grid: An undirected, unweighted network
representing the topology of the Western States Power Grid of the United
States. Data compiled by D. Watts and S. Strogatz and made available on the web
here. Please cite D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998).
Retrieved from Mark Newman’s website.
Notes:
A Brief Description about GML by The University of Passau
A Brief Description about GML by The University of Passau
GML (Graph Modelling Language): There are many different programs
that work with graphs but almost all of them use their own file format. As a
consequence, exchanging graphs between different programs is almost impossible.
Simple tasks like exchange of data, externally reproducible results or a common
benchmark suite are much harder than neccessary.
Therefore, we have developed a new file format for the Graphlet system: GML. GML supports attaching arbitrary information to graphs, nodes and edges, and is therefore able to emulate almost every other format.
Therefore, we have developed a new file format for the Graphlet system: GML. GML supports attaching arbitrary information to graphs, nodes and edges, and is therefore able to emulate almost every other format.
In the following
the documentation, the parser and a technical report with respect to GML:
Terminologies used:
Distance: The average graph-distance
between all pairs of nodes. Connected nodes have graph distance 1.
Diameter: The diameter is the longest graph
distance between any two nodes in the network. (i.e. How far apart are the two
most distant nodes)
Betweenness Centrality: Measures how often a node appears on shortest paths between nodes in the network.
Closeness Centrality: The average distance from a given starting node to all other nodes in the network
Eccentricity: The distance from a given starting node to the farthest node from it in the network
Density:
Measure how close
the network is to complete. A complete graph has all possible edges and density
equal to 1
HITS:
Computes two
separates values from each node. The first value (called Authority) measures
how valuable information stored at that node is. The second value (called Hub)
measures the quality of the nodes links.
Modularity:
Community
detection algorithm
PageRank: Ranks nodes according to how often a user following links will non-randomly reach the node "page".
Connected Components:
Determines the
number of connected components in the network.
Clustering Coefficient:
The clustering
coefficient, along with the mean shortest path, can indicate a
"small-world" effect. It indicates how nodes are embedded in their
neighborhood. The Average give an overall indication of the clustering in the
network.
Eigenvector Centrality:
A measure of node
importance in a network based on a node's connections.
Generated Reports:
1. Import Report
1. Import Report
2. Overall report
3. Heat Map
Detailed Reports:
A: Network Overview::
1. Average degree-distribution
(Degree
Report) Results: Average Degree: 2.669
2. Average weighted-degree-distribution
(Weighted Degree Report)
(Weighted Degree Report)
Results: Average Weighted Degree: 2.669
3. Network Diameter
(Graph Distance Report)
Parameters:
Network
Interpretation: undirected
Results:
Diameter:
46
Radius:
23
Average
Path length: 18.989185424445708
Number
of shortest paths: 24408540
Algorithm: Ulrik Brandes, A Faster Algorithm for Betweenness Centrality, in Journal of Mathematical Sociology 25(2):163-177, (2001)
4. Graph Density
Graph Density Report
Parameters:
Network
Interpretation: undirected
Results:
Density: 0.001
5. HITS
HITS Metric Report
Parameters:
Ε =
1.0E-4
Algorithm: Jon M. Kleinberg, Authoritative Sources in a Hyperlinked Environment, in Journal of the ACM 46 (5): 604–632 (1999)
Results:
6. Modularity
Modularity Report
Parameters:
Randomize:
On
Use
edge weights: On
Resolution:
1.0
Results:
Modularity:
0.932
Modularity
with resolution: 0.932
Number
of Communities: 38
Algorithm:
Vincent
D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre, Fast
unfolding of communities in large networks, in Journal of Statistical
Mechanics: Theory and Experiment 2008 (10), P1000
Resolution:
R.
Lambiotte, J.-C. Delvenne, M. Barahona Laplacian Dynamics and Multiscale
Modular Structure in Networks 2009
7. PageRank
PageRank Report
PageRank Report
Parameters:
Epsilon
= 0.001
Probability
= 0.85
Algorithm:
Results:
8. Connected Components
Connected Components Report
Connected Components Report
Parameters:
Network
Interpretation: undirected
Results:
Number
of Weakly Connected Components: 1
Algorithm:
Robert
Tarjan, Depth-First Search and Linear Graph Algorithms, in SIAM Journal on
Computing 1 (2): 146–160 (1972)
B: Node Overview::
1.
Average Clustering Coefficient
Clustering Coefficient Metric Report
Clustering Coefficient Metric Report
Parameters:
Network Interpretation: undirected
Results:
Average Clustering Coefficient: 0.107
Total triangles: 651
The Average Clustering Coefficient is
the mean value of individual coefficients.
Algorithm: Matthieu Latapy, Main-memory Triangle
Computations for Very Large (Sparse (Power-Law)) Graphs, in Theoretical
Computer Science (TCS) 407 (1-3), pages 458-473, 2008
2. Eigenvector
Centrality
Eigenvector Centrality Report
Eigenvector Centrality Report
Parameters:
Network Interpretation: undirected
Number of iterations: 100
Sum change: 0.6473580145499725
Results:
C: Edge Overview::
1. Average Path Length
1. Average Path Length
Graph Distance Report
Parameters:
Network
Interpretation: undirected
Results:
Diameter: 46
Radius: 23
Average Path
length: 18.989185424445708
Number of
shortest paths: 24408540
Algorithm:
Ulrik Brandes, A Faster Algorithm for
Betweenness Centrality, in Journal of Mathematical Sociology 25(2):163-177,
(2001)
Download/View Data Table (Power Nodes)