I created my first Shiny app. It creates some nice images with a few clicks.

I tried making a Shiny app about 2 years ago, but I never really got it online for some reason. After talking to a friend I tried again, and it was surprisingly easy! Maybe I just got a lot better at using R, or the server infrastructure got a lot easier to use.

Anyway, here you can play around with it.

I could still improve a few things UI-wise and maybe add more options. But honestly, for a proof of concept it is good enough, and I am quite happy with how well it works. And I really like the results.

Visualizing my podcast listening habits

I listen to podcasts all the time. But I didn’t really have a clue how much time I spend on them or what kind of podcasts I am even listening to. So I started to track my habits. To do so I switched my podcast app to Podcast Addict, which tracks how much time you spend on each podcast. Unfortunately there is no way to properly export that data, so I had to type in the times manually once a month. I did that for a year and a half, of which I used one year. (My phone broke, so there was a 10-day gap, and I thought one year is prettier anyway.)

The main problem was that the interesting information was which podcasts I listen to and how much. But there were 75 of them, too many to give space to all of them, because it would look too crowded. I tried anyway, and the result is the graph “Podcasts by Category”. It definitely is too crowded, but I am still quite happy with it, because it manages to show a lot of information at once without being too confusing.

An alternative was to summarize the podcasts into categories, but this would be mainly interesting for myself and no one else, also because the categorization is a bit arbitrary at times. For me, the result was interesting: I could clearly recognize the time I was working, the time I started my master’s thesis, and the time in between. In the month with the huge spike I went back to university, but most of my friends weren’t back yet. I was also listening to the Versailles Anniversary Project (100 years), which followed the making of the Versailles treaties nearly day by day, and because I started a month too late I had a lot of catching up to do. (Amazing work, by the way. Check it out.)

One goal of this visualization was to try to properly apply one theme to all the graphs, so they look like they belong together. I think that worked quite well besides some minor errors. I also tried to get better at using colours. I wasn’t too successful there; the colours I used in the end for the languages are quite ugly, and I should have played around with them a little longer.

Generating beautiful patterns with R

Motivated by my last experiments, I decided to look a bit more into generating images with R. One of my favorite musicians, Max Cooper, released another absolutely gorgeous music video (https://www.youtube.com/watch?v=O7bKq03bAsg). Check it out, the animation is fantastic. I was intrigued by the simple basic structure: just a rectangle divided into rectangles divided into rectangles. Something I can absolutely do with my R skills. So I tried. The pictures are the result of that.

My tactic was to create a data frame starting with just the first rectangle, defined by its start and end coordinates, and then to keep splitting it. I filled the rectangles randomly with colors. Honestly quite simple. Doing this and other experiments, I got a lot better at making code run faster in R. Of course there is still a lot of room for improvement. The code in my last post, for example, was super inefficient and I made a lot of basic mistakes (I have since improved it and now it runs a lot faster).

Here is the code if anyone is interested:

library(tidyverse)

rectanglesplitter = function(data, i) {
  # data is one line of the total dataset, i is the number of the loop.
  # Because I don't want super long rectangles, I always first check which is the longer side.
  yaxsplit = ifelse((data$z.x - data$a.x)^2 >= (data$z.y - data$a.y)^2, F, T)
  # Create the divider, a value which defines the proportions of the two new rectangles.
  divider = 1 / (random[i] + 1)

  # rectangle 1
  ########
  # ifelse is necessary to handle whether it is a vertical split or not.
  # Here the new x.a1 and y.a1 are created. (The naming was a bit stupid, I admit.)
  data[2, 1] = ifelse(yaxsplit == T, data[1, 1], data[1, 1] + (data[1, 3] - data[1, 1]) * divider)
  data[2, 2] = ifelse(yaxsplit == T, data[1, 2] + (data[1, 4] - data[1, 2]) * divider, data[1, 2])

  # These points stay the same no matter the orientation.
  # x.z1 and y.z1 are created.
  data[2, 3] = data[1, 3]
  data[2, 4] = data[1, 4]

  # rectangle 2
  # The start points are the same as in the first created rectangle (x.a and y.a).
  data[3, 1] = data[2, 1]
  data[3, 2] = data[2, 2]
  # x.z2 and y.z2
  data[3, 3] = ifelse(yaxsplit == T, data[1, 3], data[1, 1])
  data[3, 4] = ifelse(yaxsplit == T, data[1, 2], data[1, 4])

  # Add the level (for potential animations).
  data[2, 5] = i
  data[3, 5] = i
  # Add a color. One of the rectangles keeps the color of the bigger rectangle; not strictly necessary.
  data[2, 6] = random[i]
  data[3, 6] = data[1, 6]

  # This changes which one of the rectangles is saved first. It should mix things up
  # and make sure there aren't more splits on one side.
  if (random[i] %% 2 == 0) {
    data[3:2, ]
  } else {
    data[2:3, ]
  }
}

# a list of color palettes
pallist = list(
  palette = c("black", "#CDCFE2", "#423E6E", "#FF352E"),
  palette = c("#233142", "#455d7a", "#f95959", "#e3e3e3", NA)
)

# how many splits should be done?
loops = 50000
# create an empty dataframe
df = data.frame(a.x = rep(NA, loops * 2),
                a.y = NA,
                z.x = NA,
                z.y = NA,
                level = NA,
                color = NA,
                alpha = NA)
# fill the first row
df[1, ] = c(0, 0, 100, 100, 1, 1, 1)


# pre-create the random vector used for proportions and colors
random = sample(1:4, loops, replace = T)

i = 1
while (i < loops) {
  # fill up the dataframe with a simple loop and the splitter function
  df[(2 * i):((2 * i) + 1), ] = rectanglesplitter(df[i, ], i)

  # this skips a row every few iterations, so a few bigger rectangles remain
  i = ifelse(i %% 17 == 0 & i > 881, i + 2, i + 1)
}


# choose one of the palettes in the list
farbe = 1
ggplot(df) +
  geom_rect(aes(xmin = a.x, ymin = a.y, xmax = z.x, ymax = z.y),
            alpha = 1, show.legend = F,
            fill = pallist[[farbe]][df$color], col = pallist[[farbe]][5]) +
  coord_fixed() +
  theme_void()

Trying to create Glitch Aesthetics with R and failing in a beautiful way

This newest project doesn’t have much to do with data visualization, besides the fact that I used R for it. I have been into glitch art for some time already. I usually used apps or plugins to create my own, and I was interested in whether I could write anything like this myself. R is probably far from the optimal tool for working on images, but it is the best “programming” language I know, so it must do. The blog Fronkonstin did something with images in R recently and inspired me to try it too. He actually has a clue what he is doing with R and also knows some math, so he is worth checking out.

Here is my first try: I basically selected a random square of the image and randomly shuffled the color channels (RGB) around or inverted them. Then I repeated the process between 10 and 80 times. It is quite basic, but I love the look of it. And I just love that I can churn them out automatically, randomly creating an infinite number of variants. The whole thing is quite slow and basically unusable with bigger images; I probably have to look into that or start learning some Python.

I also wrote a little tool to create scanlines randomly, which basically involved creating a gap in the image and then filling it up with a single repeated line. I didn’t care too much about those results, so I just let them be. I will keep working on ideas, so I can probably bring it back with something else.
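The scanline idea can be sketched roughly like this. This is a hypothetical reconstruction, not my original tool: it works on a data frame with x, y and one value column per pixel (the shape imager produces for a single channel), picks a random row, and overwrites a gap of rows below it with copies of that one line.

```r
# Rough sketch of the scanline idea (a reconstruction, not the original tool).
# data: a data frame with columns x, y, value; gap: height of the scanline band.
scanline = function(data, gap = 20) {
  # pick a random row that still leaves room for the gap below it
  cut = sample(seq_len(max(data$y) - gap), 1)
  line = data[data$y == cut, ]
  # every row inside the gap gets the pixel values of the cut line
  for (k in seq_len(gap)) {
    data[data$y == cut + k, "value"] = line$value
  }
  data
}
```

Applied to a real channel, this produces a band of identical rows, which reads as a glitchy scanline.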

 

Code

library(imager)
library(tidyverse)

setwd("path")

# load the image (replace with your own file)
image = load.image("myimage.jpg")

# convert the image to a dataframe; add additional color channels, which will be changed
img = as.data.frame(image, wide = "c") %>%
  rename(red1 = c.1, green1 = c.2, blue1 = c.3) %>%
  mutate(red = red1, blue = blue1, green = green1)

# this function randomly changes the color channels of random squares;
# it can also use the negative of a channel

colorswitches = function(data, negative = T) {

  # create coordinates for the square: choose two random points on the x and y axis
  liney = sample(1:max(data$y), 2) %>% sort()
  linex = sample(1:max(data$x), 2) %>% sort()

  # There are 6 color variables: three are the originals, three are the ones that get
  # changed. fromcolor defines where the color is picked from; it chooses from both the
  # original and the changeable variables. It is important that it also picks from the
  # originals from time to time, because otherwise everything eventually becomes grey.
  fromcolor = sample(3:8, 1)
  # randomly select one of the 3 changeable variables
  tocolor = sample(6:8, 1)

  # with a 1-in-6 chance, the negative of the color is used
  minuscolor = ifelse(sample(1:6, 1) == 1, T, F)

  # this is just so the for counter doesn't have to start at one; a small speedup
  startbla = max(data$x) * (liney[1] - 2)
  startbla = ifelse(startbla < 1, 1, startbla)

  for (i in startbla:nrow(data)) {

    # check if x and y are inside the defined square
    if (data$y[i] > liney[1]) {
      if (data$x[i] < linex[2] & data$x[i] > linex[1]) {

        # two versions of changing the color value of the selected channel:
        # one negative, one normal
        if (minuscolor == T & negative == T) {
          data[i, tocolor] = 1 - data[i, fromcolor]
        } else {
          data[i, tocolor] = data[i, fromcolor]
        }
      }
    }
    # if y is past the selected square, stop the loop
    if (data$y[i] > liney[2]) {
      break
    }
  }
  data
}


# repeat the colorswitches function
for (i in 1:50) {
  img = colorswitches(img)
}


# create a proper RGB code from the three color channels
img = img %>% mutate(rgb = rgb(red, green, blue))

# display it with ggplot
p <- ggplot(img, aes(x, y)) +
  geom_raster(aes(fill = rgb)) +
  scale_fill_identity() +
  scale_y_reverse() +
  theme_void()
p



Where have I been in 2018

Last year I made a post about where I had been in 2017. I was worse at R then, so I solved it with a combination of cleaning up the data in R, importing it into QGIS, and finally editing it in HitFilm.

This time I just did it all in R. The difficult part this time was getting the Google API running for the import of Google Maps. (You need to enable billing to get the whole thing working.) It was also the first time I used the new gganimate. It is great and easy to use, and less of a hassle than the last times I used it.

I could reuse some of last year’s code, so I was done quite fast. (Not the greatest code though.)


library(tidyverse)
library(jsonlite)
library(ggplot2)
library(ggmap)
library(gganimate)
library(gifski)
library(zoo)
library(lubridate)

register_google(key = "YOUR_API_KEY")  # insert your own API key here

system.time(x <- fromJSON("GoogleLoc.json"))

# extracting the locations dataframe
loc = x$locations

# converting the time column from POSIX milliseconds into a readable time scale
loc$time = as.POSIXct(as.numeric(x$locations$timestampMs) / 1000, origin = "1970-01-01")

# converting longitude and latitude from E7 to GPS coordinates
loc$lat = loc$latitudeE7 / 1e7
loc$lon = loc$longitudeE7 / 1e7

# extract date, year and month for each data point
loc$date <- as.Date(loc$time)
loc$year <- year(loc$date)
loc$month_year <- as.yearmon(loc$date)

# new dataframe with the important columns
maps <- data.frame(loc$lat, loc$lon, loc$date, loc$time, loc$year)

# filter for the year (longitude was already converted above)
maps1 <- maps %>% filter(loc.year == 2018)

# choose the 10th measurement of each day; not very elegant, but good enough
maps2 <- maps1 %>% group_by(loc.date) %>%
  summarise(long = loc.lon[10],
            lat = loc.lat[10])

# get the background map (set center, zoom and kind of map)
mamap <- get_map(location = c(mean(maps2$long, na.rm = T), mean(maps2$lat, na.rm = TRUE) + 3),
                 maptype = "satellite", zoom = 5)

# put it all together
m <- ggmap(mamap) +
  geom_point(data = maps2, aes(x = long, y = lat), size = 4, col = "red") +
  geom_label(data = maps2, x = 1.5, y = 56,
             aes(label = format(as.Date(loc.date), format = "%d.%m")),
             size = 10, col = "black") +
  theme_void() +
  # the animation part
  transition_time(loc.date) +
  shadow_trail(alpha = 0.3, colour = "#ff695e", size = 2, max_frames = 6)

a <- animate(m, renderer = ffmpeg_renderer(), duration = 20)
anim_save(filename = "my2018/2018video.mp4", animation = a)

American Cities named after big German, Austrian or Swiss Cities

Recently the Swiss ambassador to the United States posted an interesting tweet. It showed a map of places in the US which might have Swiss roots. A little later I found the same map on Reddit. Reading the discussion and checking some things myself, I noticed that there were mistakes, or that some of the cities didn’t exist anymore. Unfortunately, the ambassador didn’t share any source.

The map made me curious, and I wanted to check for myself. I downloaded a list of places in the United States and Canada from www.geonames.org, imported it into R, and filtered it down to just the cities and towns.

I then created three lists of search terms for Germany, Switzerland and Austria. For Switzerland I used the names of the cantons, for Germany the names of the 30 biggest cities, and for Austria the 10 biggest cities plus the names of some of the regions. I also translated the names of some of the more famous places into English. I had to do some filtering, because a name like “Uri” just gives a ton of wrong results. I then used the search terms to look through the cities in the US and Canada.

The first result was that a big part of the cities were dead and had a population of zero; I think it was more than half. I had no clue that there were so many empty towns in the US. I decided to filter those out, because they cluttered everything. Then I had to filter some more for names like “BERNard”, and in the end I went manually through the list to remove the remaining false positives.
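A rough sketch of this matching-and-filtering step might look like the following. This is a hypothetical reconstruction on toy data, not my actual code; the column names and search terms are placeholders, not the real geonames layout.

```r
library(tidyverse)

# toy stand-in for the geonames export; the real columns differ
places <- tibble(
  name = c("New Bern", "Bernard", "Lucerne Valley", "Springfield"),
  population = c(29000, 0, 5600, 116000)
)

# example search terms (cantons, big cities, translated names)
searchterms <- c("Bern", "Lucerne", "Zurich", "Geneva")

matches <- places %>%
  filter(population > 0) %>%   # drop the dead towns
  filter(str_detect(name, paste(searchterms, collapse = "|")))
# "Bernard" happens to be caught by the population filter here; a populated
# "BERNard"-style match would still need manual removal afterwards.
```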

After that I just had to visualize it. I used the packages ggplot2, ggrepel and ggmap to create the map and finished it in GIMP. If anyone is interested in the code, let me know.

The Reign of Roman Emperors

I visualized the reigns and ends of Roman emperors. I am quite a fan of history, so when the task in the /r/dataisbeautiful DataViz Battle was to visualize the reigns of the Roman Empire, I was excited.


To which European countries do Europeans migrate?

 


Migration is a huge topic in Europe, and I wanted to know where people go when they leave the country they grew up in. Luckily, Eurostat has some data about that.

There is the problem of huge population differences between the countries, so I couldn’t just use the absolute numbers. Instead I created two graphs: one shows the migrant population relative to the host country, the other relative to the origin country.
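The two normalisations boil down to dividing the same migrant counts by two different population totals. A minimal sketch with toy numbers (the column names and figures are my own placeholders, not the Eurostat layout; only `sharehostpop` matches the plotting code below):

```r
library(tidyverse)

# toy data; the real Eurostat tables look different
mig <- tibble(
  Origin     = c("PL", "RO"),
  Host       = c("DE", "IT"),
  migrants   = c(2000000, 1200000),
  pop_host   = c(83000000, 60000000),
  pop_origin = c(38000000, 19500000)
)

mig1 <- mig %>%
  mutate(sharehostpop   = 100 * migrants / pop_host,    # relative to the host country
         shareoriginpop = 100 * migrants / pop_origin)  # relative to the origin country
```

The same migration flow can look tiny from the host’s perspective and huge from the origin’s, which is why both graphs are worth drawing.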


I created the graphs with the help of R and ggplot2. Code for the second graph:

ggplot(mig1, aes(y = Host, x = Origin, fill = sharehostpop)) +
  geom_raster() +
  theme_gray() +
  coord_equal() +
  scale_fill_distiller(palette = "YlOrRd", direction = 1, na.value = NA, trans = 'log1p') +
  theme(
    axis.text.x = element_text(angle = 45, hjust = .1),
    legend.position = "bottom") +
  scale_x_discrete(position = "top") +
  scale_y_discrete(limits = rev(levels(droplevels(mig1$Host)))) +
  labs(y = "Host-Country", x = "Origin-Country",
       fill = "Share of Population in Host-Country (%)",
       title = "Biggest groups of European immigrants in Europe (2017)",
       caption = "Note: Missing countries had no data available or were so small that they distorted the scale.
Source: Eurostat")