
Book Review: Analyzing Social Media Networks with NodeXL – Appendix


By Gerd Waloszek, SAP User Experience – November 10, 2010

This page provides an overview of the book Analyzing Social Media Networks with NodeXL by Derek Hansen, Ben Shneiderman, and Marc Smith, offers some summary information extracted from the book, and presents graphs that the author of the review created with NodeXL.

 

Definitions

Short Definitions

Here are a few short definitions, taken from the book or adapted accordingly:

  • Network: A collection of things and their relationships to one another
    • Things: Nodes, vertices, entities, people, …
    • Relationships: Edges, ties, links, connections
  • Social networks: Are created whenever people interact, directly or indirectly, with other people, institutions, and artifacts.
  • Social network analysis: Helps explore and visualize patterns found within collections of linked entities that include people.
  • Social media: A set of online tools that supports social interaction between users (see extended definition below).
  • Social media networks: Are created whenever people interact, directly or indirectly, with other people, institutions, and artifacts using social media, that is, online tools that support social interaction between users.
  • Social media network analysis: Helps explore and visualize patterns found within social media networks.

Social media is a catchall phrase intended to describe the many novel online sociotechnical systems that have emerged in recent years, including services like email, discussion forums, blogs, microblogs, texting, chat, social networking sites, wikis, photo and video sharing sites, review sites, and multiplayer gaming communities. Related terms that describe many of these systems include Web 2.0, the read/write web, social computing, social software, collective action tools, sociotechnical systems, computer-mediated communication, groupware, computer supported cooperative work (CSCW), virtual or online communities, user-generated content, and consumer-generated media.

 

Tabular Overview of Network Building Blocks, Types, and Metrics

Building Blocks of Networks

Vertices

Synonyms: nodes, agents, entities, items

Description: Building blocks of networks. Vertices can represent many things:

  • people or social structures such as workgroups, teams, organizations, institutions, states, or even countries
  • content: web pages, keyword tags, or videos
  • physical or virtual locations or events

Further characteristics: Attribute data may describe demographic characteristics of a person (age, gender, race), the person's use of a system (number of logins, messages posted, edits made), or other characteristics such as income or location.

In network visualization tools such as NodeXL, attribute data can be mapped to visual properties such as the size, color, or opacity of the vertices.

Edges

Synonyms: links, ties, connections, relationships

Description: Building blocks of networks; an edge connects two vertices. Edges can represent many different types of relationships:

  • proximity
  • collaborations
  • kinship, friendship, trade partnership
  • citations
  • investments
  • hyperlinking
  • transactions
  • shared attributes

Types of edges:

  • Undirected or directed
    • Undirected (or symmetric) edges simply exist between two people or things
    • Directed (or asymmetric) edges have a clear origin and destination
  • Unweighted or weighted
    • Unweighted edge or binary edge: Indicates only whether an edge exists
    • Weighted edge: Includes values associated with each edge that indicate the strength or frequency of a tie

Further characteristics: Directed edges are represented on a graph as a line with an arrow pointing from the source vertex to the recipient vertex. Undirected edges are represented on a graph as a line connecting two vertices with no arrows.

Weighted edges are often represented visually as thicker or darker lines or as more or less opaque lines.
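The edge types described above can be made concrete with a small sketch. It is not taken from the book or from NodeXL. The sample messages and all function names are hypothetical, and it simply assumes that each e-mail is recorded as a (sender, recipient) pair.

```python
from collections import Counter

# Hypothetical sample data: one (sender, recipient) pair per e-mail.
messages = [
    ("ann", "bob"), ("ann", "bob"), ("bob", "ann"),
    ("ann", "cat"), ("cat", "bob"),
]

def weighted_directed_edges(pairs):
    """Directed, weighted edges: count messages per (source, target) pair."""
    return Counter(pairs)

def weighted_undirected_edges(pairs):
    """Undirected, weighted edges: direction is ignored by using an unordered key."""
    return Counter(frozenset(p) for p in pairs if p[0] != p[1])

def binary_edges(weighted):
    """Unweighted (binary) edges: keep only the fact that an edge exists."""
    return set(weighted)

directed = weighted_directed_edges(messages)
undirected = weighted_undirected_edges(messages)
```

In this representation a weight is just a count of messages, so a filter such as "more than 5 e-mails" would amount to keeping only the edges whose count exceeds that threshold.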

Network Types

Classification: From an Individual Member's Point of View

Egocentric (ego): Only include individuals who are connected to a specified ego. More generally, egocentric networks can extend out any number of "degrees" from ego. Types of egocentric networks:

  • 1-degree ego network: Consists of the ego and their alters.
  • 1.5-degree ego network: Extends the 1-degree network by including connections between all of the alters.
  • 2-degree ego network: Extends the 1.5-degree network by including all of the alters' own alters (i.e., friends of friends), some of whom may not be connected to ego.
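The three ego network types can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how NodeXL computes subgraphs: the function names are hypothetical, edges are treated as undirected and free of self-loops, and the 2-degree variant adds only ties that touch at least one alter.

```python
def build_adjacency(edges):
    """Undirected adjacency sets built from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def ego_network(edges, ego, degree):
    """Return the edges of the 1-, 1.5-, or 2-degree ego network."""
    adj = build_adjacency(edges)
    alters = adj.get(ego, set())
    # 1-degree: only the ties between ego and its alters.
    result = {frozenset((ego, a)) for a in alters}
    if degree >= 1.5:
        # 1.5-degree: add the ties among the alters themselves.
        result |= {frozenset((a, b)) for a in alters for b in adj[a] if b in alters}
    if degree >= 2:
        # 2-degree: add the alters' own ties (friends of friends).
        result |= {frozenset((a, b)) for a in alters for b in adj[a]}
    return result

# Hypothetical example: "c" is a friend of a friend, "d" is three steps away.
edges = [("ego", "a"), ("ego", "b"), ("a", "b"), ("b", "c"), ("c", "d")]
```

With the example edges, each step up in degree adds exactly one tie: first the tie between the two alters, then the tie from an alter to the friend of a friend.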
Full (complete): Contain all the people or entities of interest and the connections among them. All egos are treated equally.

Partial: Created by selecting a sample or slice of the full network.

Classification: Type of Entity

Unimodal: Connect the same type of entity, that is, include only one type of vertex.

Multimodal: Include different types of vertices.

Bimodal: Include exactly two types of vertices.

Affiliation (bimodal subtype): Bimodal networks that include individuals and some event, activity, or content with which they are affiliated. Bimodal affiliation networks can be transformed into two separate unimodal networks: a user-to-user network and an affiliation-to-affiliation network.

Classification: Type of Connection

Standard: Networks with one type of connection.

Multiplex: Networks with multiple types of connections.
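
The transformation of a bimodal affiliation network into two unimodal networks can be sketched as follows. The function name and the sample data are hypothetical. The sketch assumes that two users are linked once for every affiliation they share, and two affiliations once for every user they share, which is one common convention rather than something the book prescribes.

```python
from collections import defaultdict
from itertools import combinations

def project_affiliations(memberships):
    """Turn (user, affiliation) pairs into two weighted unimodal edge dicts."""
    by_affiliation = defaultdict(set)
    by_user = defaultdict(set)
    for user, affiliation in memberships:
        by_affiliation[affiliation].add(user)
        by_user[user].add(affiliation)

    def co_occurrence(groups):
        # Every pair that appears together in a group gains one unit of weight.
        weights = defaultdict(int)
        for members in groups.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
        return dict(weights)

    user_to_user = co_occurrence(by_affiliation)  # weight = shared affiliations
    affiliation_to_affiliation = co_occurrence(by_user)  # weight = shared users
    return user_to_user, affiliation_to_affiliation

# Hypothetical example: three users and two affiliations.
memberships = [
    ("ann", "wiki"), ("bob", "wiki"),
    ("ann", "forum"), ("bob", "forum"), ("cat", "forum"),
]
users, affiliations = project_affiliations(memberships)
```

Here ann and bob share two affiliations, so their user-to-user tie is weighted more heavily than their ties to cat.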

Network Analysis Metrics

Aggregate Network Metrics

This set of metrics describes entire networks.

Density

A metric that describes the level of interconnectedness of the vertices.

Density is the number of relationships observed in a network divided by the total number of possible relationships that could be present.
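
As a minimal sketch of this ratio: a network with n vertices has n(n-1)/2 possible undirected edges, or n(n-1) possible directed edges. The function below is a hypothetical helper, not a NodeXL function, and it assumes that duplicate edges and self-loops should not be counted.

```python
def density(num_vertices, edges, directed=False):
    """Observed unique edges divided by the number of possible edges."""
    unique = {
        tuple(e) if directed else frozenset(e)
        for e in edges
        if e[0] != e[1]  # ignore self-loops
    }
    possible = num_vertices * (num_vertices - 1)
    if not directed:
        possible //= 2
    return len(unique) / possible if possible else 0.0

# Hypothetical example: a chain of four vertices.
chain = [("a", "b"), ("b", "c"), ("c", "d")]
```

For the chain, three of the six possible undirected ties are present, so half of the network's potential connections are realized.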

Centralization

A metric that characterizes the extent to which the network is centered on one or a few important nodes.

Centralized networks have many edges that emanate from a few important vertices, whereas decentralized networks have little variation between the numbers of edges each vertex possesses.
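
The summary above does not give a formula for centralization. One widely used option is Freeman's degree centralization, which sums the differences between the largest degree and every other degree and divides by the maximum that sum can reach, (n-1)(n-2), which is attained by a star. The sketch below uses that measure as an assumption. The function name is hypothetical, and vertices without any edge are not counted because they never appear in the edge list.

```python
def degree_centralization(edges):
    """Freeman degree centralization of an undirected network (0 to 1)."""
    degrees = {}
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        degrees[u] = degrees.get(u, 0) + 1
        degrees[v] = degrees.get(v, 0) + 1
    n = len(degrees)
    if n < 3:
        return 0.0
    top = max(degrees.values())
    return sum(top - d for d in degrees.values()) / ((n - 1) * (n - 2))

# Hypothetical examples: a star is fully centralized, a ring not at all.
star = [("hub", leaf) for leaf in "abcd"]
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

The star and the ring illustrate the two extremes described above: all edges emanating from one vertex versus no variation in the number of edges per vertex.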

Metrics that integrate attribute data with network data

Homophily: Looks at the similarity of people who are connected.

Vertex-Specific Network Metrics

This set of metrics identifies individuals' positions within a network.

Degree centrality

A simple count of the total number of (unique) connections linked to a vertex (a kind of popularity measure, but a crude one that does not distinguish between quantity and quality).

Betweenness centrality: Bridge scores for boundary spanners

The distance between people who are not neighbors is measured by the smallest number of neighbor-to-neighbor hops from one to the other. The length of the shortest path between two people is called the "geodesic distance" and is used in many centrality metrics.

Betweenness centrality is a measure of how often a given vertex lies on the shortest path between two other vertices. This can be thought of as a kind of "bridge" score, a measure of how much removing a person would disrupt the connections between other people in the network (idea of brokering). A "structural hole" is a missing bridge. Wherever two or more groups fail to connect, one can argue that there is a structural hole, a gap waiting to be filled.
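
One way to compute such bridge scores is Brandes' algorithm, which runs a breadth-first search from every vertex and then distributes credit back along the shortest paths. The sketch below assumes an undirected, unweighted network stored as a dict of neighbor sets and returns unnormalized scores. The function name is hypothetical, and this is not necessarily how NodeXL implements the metric.

```python
from collections import deque

def betweenness(adj):
    """adj maps each vertex to the set of its neighbours (undirected, unweighted)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected network every pair was counted from both ends.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical example: in the chain a-b-c-d, only b and c act as bridges.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

In the chain, the two inner vertices each lie on two shortest paths between other vertices, while the end points lie on none.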

Closeness centrality: Distance scores for broadly connected people

The average shortest distance between a vertex and every other vertex in the network (measures how close a person is to every other person in the network; closeness is paradoxically a "distance" score).

In some cases the inverse of the average distance to others in the network is used as a measure of closeness centrality. In that case, higher values indicate a more central position.
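
Both variants can be sketched with a breadth-first search that yields the geodesic distances from one vertex. The function names are hypothetical, the network is assumed to be undirected and unweighted, and vertices that cannot be reached are simply left out of the average, which is a simplification.

```python
from collections import deque

def distances_from(adj, source):
    """Geodesic distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_distance(adj, vertex):
    """Distance form: lower values indicate a more central position."""
    others = [d for v, d in distances_from(adj, vertex).items() if v != vertex]
    return sum(others) / len(others) if others else float("inf")

def closeness(adj, vertex):
    """Inverse form: higher values indicate a more central position."""
    others = [d for v, d in distances_from(adj, vertex).items() if v != vertex]
    return len(others) / sum(others) if others else 0.0

# Hypothetical example: the chain a-b-c-d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

In the chain, the inner vertex b has a smaller average distance than the end point a, and therefore a larger inverse score.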

Eigenvector centrality: Influence scores for strategically connected people

Eigenvector centrality allows for connections to have a variable value, so that connecting to some vertices has more benefit than connecting to others (takes into consideration not only how many connections a vertex has, but also the degree of the vertices that it is connected to; the PageRank algorithm used by Google's search engine is a variant of eigenvector centrality).
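
A common way to approximate eigenvector centrality is power iteration: every vertex repeatedly receives the scores of its neighbors until the values stop changing. The sketch below is an illustration under assumptions, not the NodeXL implementation. The function name is hypothetical, the network is undirected and unweighted, and each vertex also keeps its own score in every round, a small shift that avoids oscillation on networks such as stars or chains.

```python
import math

def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """adj maps each vertex to the set of its neighbours."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # New score = own score plus the sum of the neighbours' scores.
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in nxt.values())) or 1.0
        nxt = {v: val / norm for v, val in nxt.items()}
        if sum(abs(nxt[v] - x[v]) for v in adj) < tol:
            return nxt
        x = nxt
    return x

# Hypothetical example: a hub with three leaves.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
scores = eigenvector_centrality(star)
```

The scores are normalized to unit length, so only their relative sizes are meaningful.
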
Clustering coefficient

A measure of the density of a 1.5-degree egocentric network (measures how connected a vertex's neighbors are to one another: the number of edges connecting a vertex's neighbors divided by the total number of possible edges between the vertex's neighbors).
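
The ratio in parentheses translates almost directly into code. The function name and the example network are hypothetical. The sketch assumes an undirected network stored as a dict of neighbor sets and returns 0 for vertices with fewer than two neighbors, where the ratio is undefined.

```python
from itertools import combinations

def clustering_coefficient(adj, vertex):
    """Edges among the vertex's neighbours divided by the possible edges among them."""
    neighbours = adj[vertex] - {vertex}
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

# Hypothetical example: ego has three neighbours, of which only a and b know each other.
network = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b"},
    "b": {"ego", "a"},
    "c": {"ego"},
}
```

In the example, one of the three possible ties among ego's neighbors exists, so a third of ego's neighborhood is connected.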

 

Overview of the Book

Preface (and more): Provides some details about NodeXL.

Part I: Getting Started with Analyzing Social Media Networks

1. Introduction to Social Media and Social Networks

2. Social Media: New Technologies of Collaboration

3. Social Network Analysis: Measuring, Mapping, and Modeling Collections of Connections

Provides grounding in the history and core concepts of social media and social network analysis.

Part II: NodeXL Tutorial – Learning by Doing

4. Getting Started with NodeXL, Layout, Visual Design, and Labeling

5. Calculating and Visualizing Network Metrics

6. Preparing Data and Filtering

7. Clustering and Grouping

Focuses on the practical details of operating the free and open source NodeXL extension of the familiar Microsoft Excel spreadsheet application used for all exercises in the book.

Part III: Social Media Network Analysis Case Studies

8. Email: The Lifeblood of Modern Communication

9. Thread Networks: Mapping Message Boards and Email Lists

10. Twitter: Conversation, Entertainment, and Information, All in One Network! (by Vladimir Barash and Scott Golder)

11. Visualizing and Interpreting Facebook Networks (by Bernie Hogan)

12. WWW Hyperlink Networks (by Robert Ackland)

13. Flickr: Linking People, Photos, and Tags (by Eduarda Mendes Rodrigues and Natasa Milic-Frayling)

14. YouTube: Contrasting Patterns of Content, Interaction, and Prominence (by Dana Rotman and Jennifer Golbeck)

15. Wiki Networks: Connections of Creativity and Collaboration (by Howard T. Welser, Patrick Underwood, Dan Cosley, Derek Hansen, and Laura W. Black)

Each chapter focuses on one form of social media by describing each system, the nature of the networks that are created when people interact through it, and the kinds of analysis that can be performed to identify key people, documents, groups, and events.

Appendix: NodeXL for Programmers (by Tony Capone)

Shows how users with a programming background can customize the NodeXL Excel 2007 template to import network graph data from any data source and create their own graphing applications using the NodeXL class libraries.

Index

 

Some Graph Examples

My imported e-mails from 2010, an experiment with NodeXL:

[Figures: the e-mail network drawn as eight graphs with different NodeXL layouts: Circle; Spiral; Grid; Fruchterman-Reingold; Harel-Koren; Harel-Koren after refresh; Harel-Koren, filtered (>5 e-mails); Circle, filtered (>5 e-mails)]
