Basis and principle of D3partitionR

D3partitionR is an R package for plotting sequential and hierarchical data as treemaps (including circle treemaps), sunbursts, partition charts, and collapsible trees (indented or not).

The package has a single all-in-one function, D3partitionR(…), to create a partition chart. Two other functions, renderD3partitionR(…) and D3partitionROutput(…), are used to render partition charts in Shiny.

The goal of this tutorial is to create a Shiny app plotting the evolution of Japan's trade from 1988 to 2015. The finished app looks like this:

[Image: japantradedemo]

You can find all the code here:

https://gitlab.com/ant-guillot/ExploringJapanTradeApp_Kaggle/tree/master

Exploring Japan Trade from 1988 to 2015: downloading the data

The data we are going to use are available on Kaggle.

These data describe Japan's imports and exports by country and area and by type of goods (with several classifications).
Download and unzip the data at the root of your project directory; we are going to need them later. You should have the following files in the directory:

[Screenshot: 2016-11-20-19_34_36-japantrade_article]

Creating the project.

First, let's create an RStudio Shiny project, then install the packages we'll need:

  • shinydashboard
  • highcharter
  • data.table
  • D3partitionR

Creating the layout:

[Image: japan_trade]

This is what we want the final application to look like. It contains:

  • A tabBox with two tabs, each containing:
    • A box with a time interval input and a radio button to switch between import and export
    • A partition chart giving a global overview of the period
    • A time series showing the evolution for the different areas and segments (linked with the partition chart).

To create the layout, we need to:

  1. create a ui.R and a server.R file
  2. edit ui.R as follows:
    1. load the packages we need (also do this in server.R):
      library(shiny)
      library(D3partitionR)
      library(data.table)
      library(highcharter)
      require(shinydashboard)
      
    2. Create the dashboard layout (the body will be created later)
      dashboardPage(
       dashboardHeader(disable = TRUE),
       dashboardSidebar(disable = TRUE),
       body
      )
      
    3. creating the body and the boxes
      body=dashboardBody(fluidPage(
        h2("Exploring Japan trade from 1988 to 2015",align="center",style="font-variant: small-caps;"),
        tabBox(width = 12,
          tabPanel(
            "Export and import by country",
            fluidRow(
              box(width = 12,title = "Options",solidHeader = TRUE,status = "primary"),
              box(width = 6,height = 800),
              box(width = 6,height = 800)
            )
          ),
          tabPanel(
            "Export and import by type of product",
            fluidRow(
              box(width = 12,title = "Options",solidHeader = TRUE,status = "primary"),
              box(solidHeader = TRUE,width = 6,height = 700),
              box(solidHeader = TRUE,width = 6)
            )
          )
        )
      ))

Your ui.R should look like this (the first tab already contains the outputs for the partition chart and the time series):


library(shiny)
library(D3partitionR)
library(data.table)
library(highcharter)
require(shinydashboard)
#Body which contains the boxes
body=dashboardBody(fluidPage(
  h2("Exploring Japan trade from 1988 to 2015",align="center",style="font-variant: small-caps;"),
  tabBox(width = 12,
    tabPanel(
      "Export and import by country",
      fluidRow(
        #Options box where our slider input and the import/export switch will be put
        box(width = 12,title = "Options",solidHeader = TRUE,status = "primary"),
        box(D3partitionROutput("D3Part1"),width = 6,height = 800),
        box(highchartOutput("Graph",height = "600px"),width = 6)
      )
    ),
    tabPanel(
      "Export and import by type of product",
      fluidRow(
        box(width = 12,title = "Options",solidHeader = TRUE,status = "primary"),
        box(solidHeader = TRUE,width = 6,height = 700),
        box(solidHeader = TRUE,width = 6)
      )
    )
  )
))
dashboardPage(
dashboardHeader(disable = TRUE),
dashboardSidebar(disable = TRUE),
body
)

When running the app, you should get:

[Image: step1_app]

Well, that's not very useful yet. Let's add some data and visualisation.

Data processing (server.R)

The data from Kaggle need some preprocessing before being used.

  • Replace the hs2, hs4, hs6, area and country codes with their names and save the data.table as an .RDS file:
year_latest = data.table(read.csv("year_latest.csv"))
hs2_eng = data.table(read.csv("hs2_eng.csv"))
hs4_eng = data.table(read.csv("hs4_eng.csv"))
hs6_eng = data.table(read.csv("hs6_eng.csv"))
country_eng = data.table(read.csv("country_eng.csv"))
year_latest = merge(year_latest,hs2_eng,by="hs2")
year_latest = merge(year_latest,hs4_eng,by="hs4")
year_latest = merge(year_latest,hs6_eng,by="hs6")
year_latest = merge(year_latest,country_eng,by="Country")
  • Put the data into the format expected by D3partitionR (data for the first tab):
#Selecting variable we want to plot
year_latest_proc=year_latest[,.(hs2_name,hs4_name,hs6_name,Country_name,Area,Year,VY,exp_imp)]
#Summing the value of exchanges
year_latest_proc_year=year_latest[,.(Value=sum(VY)),by=c("Country_name","Area","Year","hs2_name","exp_imp")]
year_latest_proc_year[,tot_value:=sum(Value),by=c("Country_name","Area","hs2_name","exp_imp")]
year_latest_proc_year[,prev_value:=sum(Value),by=c("Country_name","Area","exp_imp")]
#Grouping small exchanges into "Other" (to keep the visualisation fluid)
year_latest_proc_year[tot_value/prev_value<0.02,hs2_name:="Other"]
year_latest_proc_year=unique(year_latest_proc_year[,.(Value=sum(Value)),by=c("Country_name","Area","hs2_name","Year","exp_imp")])
#Path construction: the path needs to be a list with the different steps
#All steps, including hs2_name, are joined with "/" so that strsplit() separates them
year_latest_proc_year[,path_str:=paste("World",Area,Country_name,hs2_name,sep="/")]
year_latest_proc_year[,path:=strsplit(path_str,"/")]
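
To make the "Other" grouping step above more concrete, here is a minimal sketch on a made-up table. The country names, product names and values are invented for illustration and are not taken from the Kaggle data; the column country_total is a hypothetical name. It computes each product's share of its country's total, relabels the products below the 2% threshold as "Other", and re-aggregates.

```r
library(data.table)

# Toy data: invented values, only to illustrate the steps
toy <- data.table(
  Country_name = c("A", "A", "A", "B", "B"),
  hs2_name     = c("Cars", "Steel", "Toys", "Cars", "Fish"),
  Value        = c(90, 9, 1, 50, 50)
)

# Total per country, used as the denominator of each product's share
toy[, country_total := sum(Value), by = "Country_name"]

# Relabel the products that weigh less than 2% of the country's total
toy[Value / country_total < 0.02, hs2_name := "Other"]

# Re-aggregate so that all "Other" rows of a country collapse into one row
toy_agg <- toy[, .(Value = sum(Value)), by = c("Country_name", "hs2_name")]
print(toy_agg)
```

Here only "Toys" in country A falls under the threshold (1 out of 100), so it is the only row renamed to "Other".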

Adding the top inputs: time range and import/export switch

We want the users of our app to be able to select a specific time range in order to understand Japan's trade during that period. Our input will be a slider:

  • The maximum and the minimum will be the ones from our data
  • The step size will be one since we only have yearly data.
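#Line to add in our options box
#step = 1 makes the yearly step explicit; sep = "" displays years as 1988 rather than 1,988
column(3, sliderInput("DateRange1", "Time selection:",min = 1988, max = 2015, value = c(1988, 2015), step = 1, sep = ""))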

We also want to add a switch input to let the user choose between import and export:

#Line to add in our options box, just after the previous line
column(width=3,radioButtons("exchangeToShow",label="Exchanges to show",choices=c("import","export")))
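
For reference, this is roughly what the options box of the first tab could look like once both inputs have been added. It is only a sketch: the variable name options_box is hypothetical, and the step and sep arguments of sliderInput are optional additions.

```r
library(shiny)
library(shinydashboard)

# Options box of the first tab, with both inputs placed side by side
options_box <- box(
  width = 12, title = "Options", solidHeader = TRUE, status = "primary",
  # Time range slider: yearly step, years shown without a thousands separator
  column(3, sliderInput("DateRange1", "Time selection:",
                        min = 1988, max = 2015, value = c(1988, 2015),
                        step = 1, sep = "")),
  # Import/export switch
  column(width = 3, radioButtons("exchangeToShow", label = "Exchanges to show",
                                 choices = c("import", "export")))
)
```

In ui.R, this box would replace the empty "Options" box of the first tabPanel.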

This input needs to be converted into a condition that will be used for subsetting.

#Converting the input into a "string condition"
#To be added to the server
exchangesType=reactive({
  switch(input$exchangeToShow,
    "import"="exp_imp==2",
    "export"="exp_imp==1"
  )
})
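
Outside of Shiny, the "string condition" idea can be tried on a small table. The sketch below uses invented values and a hypothetical helper, exchange_condition, that mirrors the switch above. It also shows a possible alternative that maps the choice to the numeric code directly, which avoids parsing text.

```r
library(data.table)

# Toy data: exp_imp follows the tutorial's coding (1 = export, 2 = import)
toy <- data.table(exp_imp = c(1, 2, 1, 2), Value = c(10, 20, 30, 40))

# Same mapping as in exchangesType(), written as a plain function
exchange_condition <- function(choice) {
  switch(choice,
         "import" = "exp_imp==2",
         "export" = "exp_imp==1")
}

# The string is parsed into an expression and evaluated inside the table
cond <- exchange_condition("import")
imports <- toy[eval(parse(text = cond))]

# Alternative without parsing text: map the choice to the numeric code
code <- c(import = 2, export = 1)[["export"]]
exports <- toy[exp_imp == code]
```

Both approaches return the same kind of subset; the second one is simply easier to debug because no code is built from strings.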

Selecting the data according to our input

Now that the inputs are built, the data need to be subset accordingly.

year_latest_proc_noyear_reac=reactive({
#Selecting data in the time range
year_latest_proc_noyear=unique(year_latest[Year>=input$DateRange1[1] & Year <=input$DateRange1[2],.(Value=sum(VY)),by=c("Country_name","Area","hs2_name","exp_imp")])
#Summing accordingly (we need to sum to have non timed data for the partition plot)
year_latest_proc_noyear[,prev_value:=sum(Value),by=c("Country_name","Area","exp_imp")]
year_latest_proc_noyear[Value/prev_value<0.02,hs2_name:="Other"]
year_latest_proc_noyear=unique(year_latest_proc_noyear[,.(Value=sum(Value)),by=c("Country_name","Area","hs2_name","exp_imp")])
#Building the path for the partition chart
year_latest_proc_noyear[,path_str:=paste("World",Area,Country_name,hs2_name,sep="/")]
year_latest_proc_noyear[,path:=strsplit(path_str,"/")]
#Subsetting to get only the import or export
#Keeping only the value and path columns that are needed by the partition chart
year_latest_proc_noyear[eval(parse(text=exchangesType())),.(Value=sum(Value),path),by=c("Country_name","Area","hs2_name","path_str")]
})

Building the partition chart:

Once the data have been pre-processed, building the partition chart is easy.

output$D3Part1 = renderD3partitionR(
  D3partitionR(
    data = list(path=year_latest_proc_noyear_reac()$path,
                value=year_latest_proc_noyear_reac()$Value),
    Input = list(enabled=T,Id="D3Part1",clickedStep=T,currentPath=T,visiblePaths=T,visibleLeaf=T,visibleNode=T),
    width = 600,height = 600))

Some comments on the parameters:

  • The data parameter takes a list with a value item and a path item, the latter holding the path linked to each value.
  • The Input parameter lets us use the partition chart as a Shiny input:
    • clickedStep sends the name of the clicked step
    • currentPath sends back the current position in the chart
    • visiblePaths, visibleLeaf and visibleNode send all the paths/leaves/nodes visible from the current point of view.
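
To see what the data parameter described above receives, here is a small sketch in base R that builds such a list from invented path strings and values (the area, country and product names are placeholders). It does not call D3partitionR itself; it only shows the shape of the list.

```r
# Toy aggregated data: invented names and values for illustration
path_str <- c("World/AreaA/CountryA/ProductX",
              "World/AreaA/CountryA/Other",
              "World/AreaB/CountryC/ProductY")
value <- c(120, 15, 40)

# strsplit() returns a list with one character vector per row:
# each vector holds the successive steps from the root to the leaf
chart_data <- list(path = strsplit(path_str, "/"), value = value)

str(chart_data)
```

Each element of chart_data$path lines up with the element of chart_data$value at the same position.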

When running the app, you should now see the partition chart summarizing Japan's trade over your selected range.

Building the evolution over time plot:

output$Graph<-renderHighchart(
hchart(unique(year_latest_proc_year[path_str%like%input$D3Part1$clickedStep & eval(parse(text=exchangesType())) & Year>=input$DateRange1[1] & Year <=input$DateRange1[2],
.(Value=sum(Value)),
by=c("Year","hs2_name")][order(Year)]), "line", x = Year, y = Value, group = hs2_name)
)

As you can see, you can directly access the input from the partition chart using input$D3Part1. The inputs are sent to Shiny as a list, so you need to select the ones you're interested in. Here, we want to see the evolution of trade between Japan and our selected area/country. Using:

path_str%like%input$D3Part1$clickedStep

we select all the paths that our clicked step belongs to.
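
The matching can be tried outside the app. In data.table, %like% is a thin wrapper around grepl(), so the sketch below uses grepl() directly on invented path strings; clicked_step is a hypothetical stand-in for input$D3Part1$clickedStep.

```r
# Toy path strings: invented names for illustration
path_str <- c("World/AreaA/CountryA/ProductX",
              "World/AreaA/CountryB/ProductY",
              "World/AreaB/CountryC/ProductX")

clicked_step <- "AreaA"   # stands in for input$D3Part1$clickedStep

# Pattern matching as done by %like%: the clicked step is treated as a
# regular expression and matched anywhere in the path string
matches_regex <- grepl(clicked_step, path_str)        # TRUE TRUE FALSE

# Literal matching avoids surprises when a name contains characters
# such as "(" or "." that have a special meaning in regular expressions
matches_fixed <- grepl(clicked_step, path_str, fixed = TRUE)  # TRUE TRUE FALSE
```

Note that this is substring matching: a step name that is contained in another name would select both.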

Important comment:

The app is now working; however, the data preprocessing may take a long time (around 1 to 2 minutes on my computer). To avoid that, you can run the preprocessing step once and save the aggregated data as an RDS file. You then just have to load this file, which is much faster!

year_latest= readRDS("data_aggr.RDS")
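
The save step itself is not shown above; a minimal sketch of the round trip could look like the following. It uses a small invented data frame and a temporary file so that it can be run anywhere; in the app the file would be "data_aggr.RDS" and the object would be the preprocessed table.

```r
# Stand-in for the aggregated table (invented values)
aggregated <- data.frame(Year = c(1988, 1989), Value = c(10, 12))

# Run once, after the preprocessing: write the result to disk.
# A temporary file is used here; the tutorial uses "data_aggr.RDS".
rds_file <- tempfile(fileext = ".RDS")
saveRDS(aggregated, rds_file)

# In server.R: load the saved object instead of recomputing it
reloaded <- readRDS(rds_file)
identical(aggregated, reloaded)
```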

Conclusion and comments:

With all this code, you should get a first working tab. The process to create the second one is pretty much the same.
You can also try changing the type of partition visualisation using type="treeMap", type="partitionChart" or type="sunburst".

Thanks for reading, and I hope you enjoyed this walk-through!

Antoine
