Interactive Data Visualisations with R

2nd WARG event, Perth, Australia

Andrey Kostenko, June 25, 2015

Agenda

  • Intro

  • Package htmlwidgets

  • Package rCharts

  • Package googleVis

  • Package plotly

  • Coda

R

R is a programming language and software environment for statistical computing and graphics. R and its libraries implement a variety of statistical and graphical techniques. Dynamic and interactive graphics are available through R packages.

Data Visualisation

Data visualisation is the presentation of data in a pictorial or graphical format so that information can be understood more easily and quickly.

Interactive

Interactive charts. What does that mean?

INTERACTIVE: tooltipable, clickable, transformable

INTERACTIVE: pan-zoomable aka zoom-panable

INTERACTIVE: info-legends and series highlighting

JavaScript libraries make it possible

D3.js - Data-Driven Documents

Some of the key technologies

0. Standalone R wrappers for JS libraries

1. htmlwidgets framework to bridge R and JS

2. rCharts makes use of many JS libraries

3. Google Chart API

4. Plotly API

Technology 1: htmlwidgets

htmlwidgets is one of the latest interactive data visualisation technologies, developed specifically for R users and developers. It aims to make it easy to:

  • produce a D3 graphic or Leaflet map with a few lines of R code

  • use JS visualization libraries at the R console, just like plots

  • embed widgets in R Markdown docs and Shiny web apps

  • develop new widgets using a framework that bridges R and JS

There are already several R packages based on the htmlwidgets framework.

R packages based on htmlwidgets

  • dygraphs for time series plots
  • leaflet for geo-spatial mapping
  • MetricsGraphics - D3 scatterplots, line charts, and histograms
  • networkD3 - D3 network graphs
  • d3heatmap - interactive heatmaps with D3
  • DT - HTML tables with filtering, pagination, sorting
  • threejs - 3D scatterplot and 3D globe
  • DiagrammeR - diagrams & flowcharts using Graphviz & Mermaid
  • sparkline - small inline charts

More examples: http://www.htmlwidgets.org/
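
As a minimal sketch of the "few lines of R code" idea, the snippet below builds a widget with DT and writes it to a standalone HTML page. It assumes the DT and htmlwidgets packages are installed; the output file name is hypothetical.

```r
# Minimal sketch, assuming DT and htmlwidgets are installed.
library(DT)
library(htmlwidgets)

# datatable() returns an htmlwidget object; printing it at the console
# shows it in the viewer, much like a plot.
w <- datatable(iris, options = list(pageLength = 5))

# Write the widget to an HTML file (hypothetical file name).
# selfcontained = FALSE keeps the JS/CSS in a side folder and avoids
# the pandoc dependency of a single-file export.
out <- file.path(tempdir(), "iris_table.html")
saveWidget(w, out, selfcontained = FALSE)
```

The same object `w` can be printed inside an R Markdown chunk or returned from a Shiny render function without changes.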

A dygraph tool in action

R code for the previous plot

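library(dygraphs)  # also provides the %>% pipe

# Holt-Winters forecasts of monthly lung-disease deaths, 36 months ahead,
# under different smoothing settings
fcast1 <- predict(HoltWinters(ldeaths), n.ahead = 36)
fcast2 <- predict(HoltWinters(ldeaths, alpha = 0.1), n.ahead = 36)
fcast3 <- predict(HoltWinters(ldeaths, alpha = 0.9), n.ahead = 36)
fcast4 <- predict(HoltWinters(ldeaths, alpha = 0.1, beta = 0.1), n.ahead = 36)
fcast5 <- predict(HoltWinters(ldeaths, gamma = FALSE), n.ahead = 36)
# Automatic ARIMA forecast for comparison
fcast6 <- forecast::forecast(forecast::auto.arima(ldeaths), h = 36)$mean
# Bind observations and forecasts into one multivariate series
# (named `series` to avoid masking stats::ts)
series <- cbind(ldeaths, fcast1, fcast2, fcast3, fcast4, fcast5, fcast6)

dygraph(series, "Deaths from Lung Disease (UK)") %>%
  dySeries("ldeaths", label = "Deaths") %>%
  dyLegend(show = "onmouseover")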

A threejs tool in action (load vs temp vs rhum)

R code for the previous plot

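library(threejs)

# M is the data frame behind the plot (not shown here); HMAX, MAX and Load
# hold humidity, temperature and load, and Date holds the observation date
HUM  <- round(M$HMAX, 2)
TEMP <- round(M$MAX, 2)
LOAD <- round(M$Load, 2)

# Colour points by temperature quantile
pal <- leaflet::colorQuantile("YlOrRd", NULL, n = 9)

scatterplot3js(x = HUM, y = TEMP, z = LOAD, color = pal(TEMP),
               labels = sprintf("HUM=%.2f, TEMP=%.2f, LOAD=%.2f, DATE=%s",
                                HUM, TEMP, LOAD, M$Date), renderer = "canvas",
               size = LOAD/30, num.ticks = c(6, 6, 6),
               x.ticklabs = paste(seq(min(HUM), max(HUM), length.out = 6), "%"),
               y.ticklabs = paste(seq(max(TEMP), min(TEMP), length.out = 6), "C"),
               z.ticklabs = paste(seq(min(LOAD), max(LOAD), length.out = 6), "MW"),
               signif = 6, bg = "#fffaf0")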

A leaflet tool in action (number of R users by suburb)

R code for the previous plot

library(rgdal)    # readOGR()
library(leaflet)

# Unzip the WA locality shapefile and read it as a SpatialPolygonsDataFrame
tmp <- tempdir()
file <- "walocalitypolygon.zip"
unzip(file, exdir = tmp)
wa <- readOGR(dsn = tmp, layer = "WA Locality Polygon", encoding = "UTF-8")
# Simulated data: a random number of R users (1 to 100) per locality
wa@data$num <- sample.int(100L, nrow(wa@data), TRUE)

pal <- colorQuantile("YlGn", NULL, n = 10)
state_popup <- paste0("<strong>LOC_PID: </strong>", wa$LOC_PID,
        "<br><strong>Suburb/Locality: </strong>", wa$WA_LOCAL_2,
        "<br><strong>Number of R users :) </strong>", wa$num)

mb_tiles <- "http://a.tiles.mapbox.com/v3/kwalkertcu.l1fc0hab/{z}/{x}/{y}.png"
mb_attribution <- 'Mapbox <a href="http://mapbox.com/about/maps" 
target="_blank">Terms &amp; Feedback</a>'

leaflet(data = wa) %>% addTiles(urlTemplate = mb_tiles,
  attribution = mb_attribution) %>% addPolygons(fillColor = ~pal(num),
  fillOpacity = 0.8, color = "#BDBDC3", weight = 1, popup = state_popup)

Technology 2: rCharts

rCharts is an R package to create, customize and publish interactive JavaScript visualizations from R, using a familiar lattice style plotting interface. It aims to

  • make the process of creating, customizing and sharing interactive visualizations easy
  • access the power of many different JavaScript libraries, each with its own strengths
  • follow sensible conventions, such as the hPlot() function for the Highcharts library and mPlot() for Morris.js
  • be easily embeddable into Shiny apps, rmarkdown docs, HTML5 slides etc.

rCharts builds on earlier projects like rHighcharts, rVega, rNVD3 and incorporates many other names that a good web developer would be familiar with.
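
To illustrate the naming convention mentioned above, here is a minimal sketch that draws a Highcharts scatter plot with hPlot() using the same lattice-style formula interface. It assumes rCharts is installed (at the time of writing it is distributed via GitHub, e.g. devtools::install_github("ramnathv/rCharts"), rather than CRAN); the choice of the iris data is purely illustrative.

```r
# Minimal sketch, assuming the rCharts package is installed from GitHub.
library(rCharts)

# hPlot() is the rCharts entry point for the Highcharts library.
# The formula reads y ~ x; `group` creates one series per species.
h1 <- hPlot(Sepal.Length ~ Sepal.Width, data = iris,
            type = "scatter", group = "Species")

# Printing the object opens the chart in the viewer or browser
# when run interactively.
if (interactive()) print(h1)
```

Swapping hPlot() for nPlot() or mPlot() targets a different JavaScript library, although the accepted `type` values differ between libraries.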

JavaScript libraries behind R package rCharts

  • jquery.dataTables.js - Table plug-in for jQuery JavaScript library
  • d3.js - JavaScript library for Data-Driven Documents
  • dimple.js - a simple charting API for d3 data visualisations
  • highcharts.js - easy interactive charts for web projects
  • leaflet.js - for mobile-friendly interactive maps
  • morris.js - pretty time-series line graphs
  • raphael.js - simplifies work with vector graphics on the web
  • nv.d3.js - re-usable charts for d3.js
  • polychart.js - combines data, layers, guides and interactions to create charts
  • rickshaw.js - JavaScript toolkit for creating interactive real-time graphs
  • timeline.js - timelines that are easy and intuitive to use
  • uvcharts.js - charting library based on d3.js
  • vega.js - interactive views using either HTML5 Canvas or SVG
  • xcharts.js - D3-based library for building custom charts and graphs

More details and examples: http://rcharts.io/

An nvd3 tool in action

R code for the previous plot

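library(rCharts)

# NVD3 scatter chart of the iris data, one series per species
p2 <- nvd3Plot(Sepal.Length ~ Sepal.Width, group = 'Species',
               data = iris, type = 'scatterChart')
p2$show('inline', include_assets = TRUE)  # embed the chart and its JS assets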

A Highcharts tool in action

R code for the previous plot

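library(rCharts)

# Reshape the US personal expenditure matrix to long format
x <- data.frame(USPersonalExpenditure)
colnames(x) <- substr(colnames(x), 2, 5)  # "X1940" -> "1940"
x$industry <- rownames(x)
xx <- reshape2::melt(x, id = "industry")
# Note: nPlot() draws with NVD3; hPlot() is the rCharts function for Highcharts
nPlot(value ~ variable, group = 'industry', data = xx, type = 'lineChart')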

Technology 3: googleVis

R package googleVis is an interface to the Google Charts API for creating interactive charts based on data frames (see examples). Google Chart tools are powerful, simple to use, and free. Note some facts:

  • data are visualised using a large number of ready-to-use chart types, from simple line charts to complex hierarchical tree maps
  • charts are rendered using Flash/HTML5/SVG technology to provide cross-browser compatibility and cross platform portability
  • chart types are either Flash based (Motion Charts, Annotated Time Lines, Geo Maps) or HTML5/SVG based (Maps, Geo Charts, Intensity Maps, Tables, Gauges, Tree Maps, Line-, Bar-, Column-, Area- and Combo Charts, Scatter-, Bubble-, Candlestick-, Pie- and Org Charts)
  • five vignettes have been written on how to use googleVis
  • a Slidify presentation about the details and usage of googleVis is available on GitHub

gvisMotionChart() in action

library(googleVis)
plot(gvisMotionChart(Fruits, "Fruit", "Year", options = list(width = 600, height = 400)))

gvisPieChart() in action
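
The data behind the pie chart on this slide is not given, so the following is only a sketch of a typical gvisPieChart() call. It assumes googleVis is installed and uses CityPopularity, a small example data set shipped with the package; the chart title and size are illustrative choices.

```r
# Minimal sketch, assuming the googleVis package is installed.
library(googleVis)

# CityPopularity has two columns: City (labels) and Popularity (values).
pie <- gvisPieChart(CityPopularity,
                    labelvar = "City", numvar = "Popularity",
                    options = list(title = "City popularity",
                                   width = 600, height = 400))

# plot() opens the chart in the default browser, so only call it
# in an interactive session.
if (interactive()) plot(pie)
```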

gvisGeoChart() in action

library(googleVis)

# Counts of R users per Australian state (ISO 3166-2 codes) and New Zealand
df <- data.frame(
  state   = c("AU-WA", "AU-VIC", "AU-NT", "AU-NSW", "AU-SA", "AU-QLD", "AU-TAS", "NZ"),
  R_users = c(323, 425, 154, 486, 201, 195, 87, 123))
Geo <- gvisGeoChart(df, locationvar = "state", colorvar = "R_users",
                    options = list(region = "AU", dataMode = "regions",
                                   resolution = "provinces",
                                   width = 500, height = 400))
plot(Geo)

Technology 4: plotly

Plotly is an online data visualization tool, with scientific graphing libraries available for Python, R, MATLAB, Perl, Julia and other languages. Plotly was built using Python and the Django framework, with a front end using JavaScript and the visualization library D3.js, HTML and CSS. Files are hosted on Amazon S3.

With plotly,

  • all plots are online and editable by you and your collaborators
  • you can analyse and visualize data, together
  • you can publish your ggplot2 figures to the web with one line
  • your Shiny application would have plots one can zoom, pan and more

Details and examples: https://plot.ly/r/user-guide/
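
The "one line" conversion of a ggplot2 figure can be sketched as follows. This assumes a recent version of the plotly R package, in which ggplotly() renders the figure locally as an htmlwidget; the hosted workflow available at the time of the talk may have required a Plotly account and different calls. The iris scatter plot is an illustrative choice.

```r
# Minimal sketch, assuming recent versions of ggplot2 and plotly are installed.
library(ggplot2)
library(plotly)

# An ordinary static ggplot2 figure
p <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
  geom_point()

# The one-line conversion: the result supports tooltips, zoom and pan
ip <- ggplotly(p)

if (interactive()) print(ip)
```

Because `ip` is an htmlwidget, it can also be placed in an R Markdown document or a Shiny app.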

Too many static ggplots suggest a Shiny app

Adding some interactivity is another good idea

The end. Take-home messages

htmlwidgets | rCharts | googleVis | plotly - all are good but the first seems best.

Start your own exploration of htmlwidgets and associated packages.

Try them in various environments, e.g. web apps, rmd docs, HTML5 presentations.

Develop your own widget, either to learn by doing or start a really useful project.

An R package for animated heatmaps for weather data with turf.js sounds cool.

What is the best theme for the 3rd WARG event?

I am sure you will be able to deductively conclude that the correct answer is ...

  1. Dashboards in R with Shiny & plotly

  2. Exploring Hadley's recent R packages (readr, readxl, rvest, etc)

  3. Building R packages in the modern way (with RStudio and roxygen2)

  4. Twitter's AnomalyDetection vs. Hyndman's anomalous R packages

  5. S4 vs. S3, object.size(), gc() and other advanced topics

  6. Maps and spatial data analysis using R

Well, right now, I personally would like to learn more about 2, 4 and maybe 6, or even 1.

That's right! I love time series, especially if it's about detecting anomalies.