Visualization: Geospatial and Other

Introduction

Choosing the right visualization is an important part of presenting data, yet many users struggle to identify the most effective one for their data. Each type of visualization serves a different purpose, and selecting the appropriate one requires an understanding of the data, the audience, and the message you wish to convey.

This article aims to make the process of data visualization easier to understand. It will highlight the different types of graphs and their typical use cases. Additionally, it will provide you with the dataset used for each visualization, along with the Python and R code involved in creating the graph. You can see the full article here.

Acknowledgements
Each dataset used in this document (unless otherwise stated) can be found on vincentarelbundock.github.io, a large repository of datasets that can be used in R. I would like to thank the people responsible for making this information openly accessible. Links to the Google Sheets will be provided throughout the document.

How the Guide is Formatted

The guide is organized by general group (e.g., comparison charts, correlation charts), with each group followed by the visualizations that belong to it. For example, bar and column charts are a type of comparison chart. Each visualization opens with a short introduction and a figure, and the R and Python code used to generate the figure is listed directly beneath it. For all visualizations, make sure that you upload the file when you start the chat, as some of the code does not reflect that initial step.

Geospatial & Other

These visualizations represent data with a spatial component, such as GPS coordinates (longitude and latitude), and help reveal geographic patterns and relationships.

Flowcharts and network diagrams show how ideas or concepts relate to one another by connecting them with lines.

1. Geographic Heat Map

A geographic heat map illustrates where points are most concentrated within a specific geographic location by using colours to represent density. This type of map is useful for highlighting patterns, trends, and hotspots in spatial data.

For this visualization, we will use a dataset that includes the locations of 1000 seismic events near Fiji since 1964. It is one of the Harvard PRIM-H project datasets, which in turn obtained it from Dr. John Woodhouse of the Department of Geophysics at Harvard University. This dataset can be accessed here.

R Example

# Load necessary libraries
library(readxl)
library(ggplot2)
library(maps)

# Load the earthquake data from the Excel file
quakes <- read_excel('quakes.xlsx', sheet = 'Sheet1')

# Display the first few rows of the data to verify
print(head(quakes))

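# Get Pacific-centred world map data; "world2" spans 0-360 degrees longitude,
# which matches the quake longitudes that run past 180
world <- map_data("world2")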

# Create the density plot
p <- ggplot() +
  geom_map(data = world, map = world,
           aes(long, lat, map_id = region),
           color = "darkgrey", fill = "lightgrey", linewidth = 0.1) +
  stat_density_2d(data = quakes, aes(x = long, y = lat, fill = after_stat(level)), 
                  geom = "polygon", alpha = 0.7) +
  scale_fill_viridis_c() +
  coord_fixed(1.3, xlim = c(165, 190), ylim = c(-40, -10)) +
  theme_minimal() +
  labs(title = "Density of Fiji Earthquakes",
       x = "Longitude", y = "Latitude",
       fill = "Density") +
  theme(legend.position = "right")

# Print the plot
print(p)
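
For readers following along in Python, the counting idea behind a density map can be sketched without any plotting library. This is only a rough illustration and not what stat_density_2d does in the R example above (that function fits a smoothed kernel density estimate). The helper name grid_counts, the 5-degree cell size, and the sample coordinates are all assumptions made for this sketch.

```python
from collections import Counter
import math


def grid_counts(points, cell_deg=5.0):
    """Count (lon, lat) points per square grid cell of width cell_deg degrees.

    Returns a Counter keyed by the lower-left (lon, lat) corner of each cell.
    """
    counts = Counter()
    for lon, lat in points:
        corner = (math.floor(lon / cell_deg) * cell_deg,
                  math.floor(lat / cell_deg) * cell_deg)
        counts[corner] += 1
    return counts


# Hypothetical coordinates for illustration only (not rows from the quakes data)
sample_points = [(181.5, -21.0), (182.0, -22.5), (184.0, -27.0),
                 (181.0, -18.0), (167.0, -12.0)]

counts = grid_counts(sample_points)

# Cells with the highest counts are the ones a heat map would colour most strongly
for corner, n in counts.most_common():
    print(corner, n)
```

A colour scale would then be applied to these counts, so the busiest cells stand out as hotspots.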

2. Choropleth Map

A choropleth map is a thematic map where areas are shaded or patterned based on the values of a variable, such as population density, income level, or election results.

For this visualization, we will use data from the 2017 American Community Survey, which is conducted by the U.S. Census Bureau. This dataset can be accessed here.

R Example

# Load necessary libraries
library(ggplot2)
library(dplyr)
library(maps)
library(mapdata)
library(readxl)

# Read the data from the Excel file
alabama_df <- read_excel('choropleth.xlsx', sheet = 'Sheet1')

# Load the map data for Alabama counties
alabama_map <- map_data("county", region = "alabama")

# Prepare the data to join with the map
alabama_df <- alabama_df %>%
  mutate(subregion = tolower(gsub(" County", "", County)))

# Join the map data with our county data
alabama_data_map <- left_join(alabama_map, alabama_df, by = "subregion")

# Create a choropleth map of population
p_pop <- ggplot(data = alabama_data_map, aes(x = long, y = lat, group = group, fill = TotalPop)) +
  geom_polygon(color = "black") +
  coord_fixed(1.3) +
  scale_fill_viridis_c(option = "plasma", name = "Population") +
  theme_minimal() +
  labs(title = "Alabama County Population")

# Display the population map
print(p_pop)
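
The join step in the R example depends on the county names in the data matching the region names in the map. Any region without a match gets no value and is drawn without a fill colour. The Python sketch below shows one way to check for such mismatches before plotting. The helper county_key, the county rows, the population figures, and the map key spellings are all hypothetical and are not taken from the dataset or the maps package.

```python
import re


def county_key(name):
    """Lower-case a county name and drop a trailing ' County' suffix."""
    return re.sub(r"\s+county$", "", name.strip(), flags=re.IGNORECASE).lower()


# Hypothetical data rows: county name -> population (made-up numbers)
data_rows = {"Autauga County": 100, "Baldwin County": 200, "St. Clair County": 300}

# Hypothetical region keys as they might appear in the map data
map_keys = ["autauga", "baldwin", "st clair"]

# Left join: every map region keeps its place, with None where no data matches
values_by_key = {county_key(name): value for name, value in data_rows.items()}
joined = {key: values_by_key.get(key) for key in map_keys}

unmatched = [key for key, value in joined.items() if value is None]
print(joined)
print("Unmatched map regions:", unmatched)
```

In this made-up case the punctuation in "St. Clair" prevents a match, so the key would need extra cleaning (for example, removing periods) before the join.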

3. Network Diagram

A network diagram is a visualization that shows how entities (nodes) are connected to one another by drawing lines (edges) between them.

For this visualization, we will use a document that outlines the sequence of tasks in a project. It defines the nodes (tasks), dependencies, and gives a short description of the dependencies. This document can be accessed here. You can simply copy and paste the description directly into the chat box to generate the image.

R Example

#R CODE
library(igraph)
library(ggplot2)
library(grid)
library(gridExtra)

# Create a directed graph
nodes <- data.frame(
  id = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  label = c("Define Requirements", "Design Website Layout", "Develop Front-End", 
            "Develop Back-End", "Set Up Database", "Integrate Front-End and Back-End", 
            "Test Website", "Deploy Website", "Launch Marketing Campaign")
)

edges <- data.frame(
  from = c("A", "B", "B", "D", "C", "D", "E", "F", "G", "H"),
  to = c("B", "C", "D", "E", "F", "F", "F", "G", "H", "I")
)

# Combine id and label for vertex labels
nodes$combined_label <- paste(nodes$id, nodes$label, sep = ": ")

# Create the graph
g <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

# Open a PNG device first: igraph's plot() draws with base graphics, so
# ggsave() (which only saves ggplot objects) cannot capture it
png("network_diagram_with_combined_labels.png", width = 16, height = 12,
    units = "in", res = 300)

# Plot the graph with combined labels
plot(g, vertex.label = V(g)$combined_label, vertex.size = 30, vertex.label.cex = 0.8, 
     vertex.label.dist = 1.5, edge.arrow.size = 0.5, layout = layout_with_fr)

# Build the description text in parts
description_part1 <- "Description:\nA: Define Requirements - Initial phase where project requirements are gathered and documented.\nB: Design Website Layout - Creating wireframes and design mockups based on requirements.\nC: Develop Front-End - Development of the user interface and client-side logic.\nD: Develop Back-End - Development of server-side logic and APIs.\nE: Set Up Database - Establishing the database schema and setting up the database server."

description_part2 <- "\nF: Integrate Front-End and Back-End - Connecting the front-end interface with the back-end functionality.\nG: Test Website - Comprehensive testing of the website for functionality, usability, and performance.\nH: Deploy Website - Launching the website on a live server.\nI: Launch Marketing Campaign - Initiating marketing activities to promote the website."

description <- paste(description_part1, description_part2, sep = "")

# Draw the description in the bottom-right corner of the page
# (grid.text() draws text only, so no background fill is set)
grid.text(description, x = 0.98, y = 0.01, just = c("right", "bottom"), 
          gp = gpar(fontsize = 8, col = "black"))

# Close the device to write the file
dev.off()

# Print confirmation
print("Combined labels added and plot saved successfully.")

Python Example

#PYTHON CODE
import networkx as nx
import matplotlib.pyplot as plt
import io
import base64

# Create a directed graph
G = nx.DiGraph()

# Add nodes (tasks)
tasks = {
    'A': 'Define Requirements',
    'B': 'Design Website Layout',
    'C': 'Develop Front-End',
    'D': 'Develop Back-End',
    'E': 'Set Up Database',
    'F': 'Integrate Front-End and Back-End',
    'G': 'Test Website',
    'H': 'Deploy Website',
    'I': 'Launch Marketing Campaign'
}

G.add_nodes_from(tasks.keys())

# Add edges (dependencies)
edges = [
    ('A', 'B'), ('B', 'C'), ('B', 'D'), ('D', 'E'),
    ('C', 'F'), ('D', 'F'), ('E', 'F'), ('F', 'G'),
    ('G', 'H'), ('H', 'I')
]

G.add_edges_from(edges)

# Set up the plot
plt.figure(figsize=(16, 12))
pos = nx.spring_layout(G, k=0.9, iterations=50)

# Draw the graph
nx.draw(G, pos, node_color='lightblue', 
        node_size=4000, arrows=True, edge_color='gray')

# Add labels to nodes
for node, (x, y) in pos.items():
    plt.text(x, y+0.05, node, ha='center', va='center', fontsize=12, fontweight='bold')
    plt.text(x, y-0.02, tasks[node], ha='center', va='center', fontsize=8, wrap=True)

# Add a title
plt.title("Website Development Project Network Diagram", fontsize=16)

# Add description box
description = (
    "Description:\n"
    "A: Define Requirements - Initial phase where project requirements are gathered and documented.\n"
    "B: Design Website Layout - Creating wireframes and design mockups based on requirements.\n"
    "C: Develop Front-End - Development of the user interface and client-side logic.\n"
    "D: Develop Back-End - Development of server-side logic and APIs.\n"
    "E: Set Up Database - Establishing the database schema and setting up the database server.\n"
    "F: Integrate Front-End and Back-End - Connecting the front-end interface with the back-end functionality.\n"
    "G: Test Website - Comprehensive testing of the website for functionality, usability, and performance.\n"
    "H: Deploy Website - Launching the website on a live server.\n"
    "I: Launch Marketing Campaign - Initiating marketing activities to promote the website."
)

# Create a text box for the description
props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
plt.text(0.98, 0.02, description, transform=plt.gca().transAxes, fontsize=8,
         verticalalignment='bottom', horizontalalignment='right', bbox=props)

# Save the plot to a bytes buffer
buf = io.BytesIO()
plt.savefig(buf, format='png', dpi=300, bbox_inches='tight')
buf.seek(0)

# Encode the image to base64
img_base64 = base64.b64encode(buf.getvalue()).decode('utf-8')

# Create HTML with the embedded image
html_content = f'<img src="data:image/png;base64,{img_base64}" alt="Website Development Project Network Diagram">'

# Save the HTML content to a file
with open('network_diagram_with_right_aligned_description.html', 'w') as f:
    f.write(html_content)

print("Updated network diagram with right-aligned description has been created and saved as 'network_diagram_with_right_aligned_description.html'.")
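
Because the edges in this diagram are task dependencies, the same edge list can also be used to work out a valid order in which to carry out the tasks. The sketch below is an optional addition that uses graphlib from the Python standard library (Python 3.9 or later); it is not part of the original workflow. Several orders can be valid (for example, C may come before or after D and E), so the exact output may differ between valid orderings.

```python
from graphlib import TopologicalSorter

# Same dependency edges as in the diagram: (prerequisite, dependent task)
edges = [
    ('A', 'B'), ('B', 'C'), ('B', 'D'), ('D', 'E'),
    ('C', 'F'), ('D', 'F'), ('E', 'F'), ('F', 'G'),
    ('G', 'H'), ('H', 'I')
]

sorter = TopologicalSorter()
for prerequisite, task in edges:
    # add(node, *predecessors): the task can only start after its prerequisite
    sorter.add(task, prerequisite)

# static_order() raises graphlib.CycleError if the dependencies form a loop
order = list(sorter.static_order())
print(" -> ".join(order))
```

Every task appears after all of the tasks it depends on, which is a quick way to confirm that the dependency list contains no circular references.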

4. Flowchart

A flowchart is a visual representation of a process, workflow, or system. It uses symbols and arrows to signify a sequence of steps, decisions, or actions.

For this example, we will create a flowchart outlining the process of an online purchase. The Google document, which contains all the information you need to create the flowchart, can be accessed here.

Python Example

#PYTHON CODE
from graphviz import Digraph

# Create a new directed graph
flowchart = Digraph(comment='Online Purchase Process')

# Add nodes for each step with detailed descriptions
flowchart.node('A', 'Start: User visits the e-commerce website')
flowchart.node('B', 'Browse Products: The user browses the list of available products\n'
                    'Input: Product catalog data is displayed\n'
                    'Output: Displayed list of products')
flowchart.node('C', 'Select Product: The user selects a product they are interested in')
flowchart.node('D', 'Add to Cart: The selected product is added to the shopping cart')
flowchart.node('E', 'View Cart: The user views the contents of their cart\n'
                    'Input: Cart contents are displayed\n'
                    'Output: Displayed cart items')
# Decision steps are drawn as diamonds, the usual flowchart symbol for a decision
flowchart.node('F', 'Is Cart Correct?', shape='diamond')
flowchart.node('G', 'Proceed to Checkout: The user decides to proceed with the purchase')
flowchart.node('H', 'Enter Shipping Information: The user provides their shipping details')
flowchart.node('I', 'Enter Payment Information: The user enters payment details (e.g., credit card information)')
flowchart.node('J', 'Is Payment Successful?', shape='diamond')
flowchart.node('K', 'Confirm Order: The order is confirmed, and a confirmation message is displayed\n'
                    'Output: Order confirmation message')
flowchart.node('L', 'End: The process ends with the order being placed')

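# Add edges to represent the flow
flowchart.edges(['AB', 'BC', 'CD', 'DE', 'EF', 'GH', 'HI', 'IJ', 'KL'])

# Label both branches of each decision
flowchart.edge('F', 'G', label='Yes')
flowchart.edge('F', 'B', label='No')
flowchart.edge('J', 'K', label='Yes')
flowchart.edge('J', 'I', label='No')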

# Save the flowchart to a file
flowchart.render('detailed_online_purchase_process_flowchart', format='png')

print("Detailed flowchart created and saved as 'detailed_online_purchase_process_flowchart.png'")
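
To check that a flowchart's loops behave as intended, it can help to trace a path through it. The sketch below is a hypothetical helper, not part of the graphviz code above. It reuses the same node letters and assumes that the "Yes" answers lead from F to G and from J to K.

```python
# Next step for ordinary nodes
NEXT = {"A": "B", "B": "C", "C": "D", "D": "E", "E": "F",
        "G": "H", "H": "I", "I": "J", "K": "L"}

# Decision nodes map each answer to the node that follows it
# (the "Yes" targets are an assumption based on the flow described above)
DECISIONS = {"F": {"Yes": "G", "No": "B"},
             "J": {"Yes": "K", "No": "I"}}


def trace(answers, start="A", end="L"):
    """Follow the flowchart from start to end, using one answer per decision."""
    answers = iter(answers)
    path = [start]
    node = start
    while node != end:
        if node in DECISIONS:
            node = DECISIONS[node][next(answers)]
        else:
            node = NEXT[node]
        path.append(node)
    return path


# Cart is correct, the first payment attempt fails, the second succeeds
path = trace(["Yes", "No", "Yes"])
print(" -> ".join(path))
```

In this trace the failed payment sends the user back to step I before the order is confirmed, which matches the "No" edge drawn from J to I.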

This post is part of a multi-part series. You can find the other posts below:

Visualization: Data Over Time (Temporal)

Visualization: Distribution Charts

Visualization: Part-to-Whole Charts

Visualization: Correlation Charts

Visualization: Comparison Charts

Happy graphing!
