* This post is a repost of "Making Sense of My Data (Climate Data)", which I originally wrote on LinkedIn.
As I promised in my previous article, I am back with more to say about geo-data science. This time it's about visualizing the data that we extracted. Being an independent researcher (climate change) and a guy who loves data, I would love to show how beautiful (though rather worrying for a climate scientist) a visualization of data can be using Tableau. As I said in the previous post, we extracted 15 years of land surface temperature (LST), NDVI and snow extent data. Although the input was MODIS EOS-HDF files, we got a CSV file as the final output. If you want to know how we got the CSV, please refer to https://www.linkedin.com/pulse/geo-data-science-shivaprakash-yaragal.
The data we extracted is binned by elevation, class and time, so a lot of trend analysis can be done with it. The data can be aggregated, grouped and regrouped based on elevation. It is basically multidimensional data. Tableau is one of the most powerful tools the market has to offer. For students, Tableau provides a free one-year subscription. Thanks to my dear friend Savan Kumar for introducing me to Tableau a year back. The Tableau tutorials are also free. If you are not a student, you can still get Tableau Public. The tutorial for Tableau Public is here: https://public.tableau.com/en-us/s/resources.
Having talked about Tableau, let's move on to what we came up with. As this project involves other researchers as well, I have to restrict myself and cannot put up all the data. For the purpose of this article, there are enough visualizations to show. From our data we have done trend analyses, which are quite interesting for a climate change researcher as well as for a data science guy. Below is one such graph. I have reduced the granularity in these so that people can get a broad picture of what's happening. The picture below shows the month-wise trend (e.g. every January across the 15 years) at an elevation of 4000m to 4500m. This is just for 4 months. We have done it for elevations between 2000m and 6000m, in bins of 500m. The data covers 15 years, and the monthly values come from grouped averages. It's quite interesting to see that at 4000m to 4500m the temperature shows a decreasing trend during winter and the start of summer. There is much more to infer if NDVI and snow extent at the same elevation are analyzed too (which we have done). I leave the inference to readers.
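For readers who would rather script this step than click through Tableau, the month-wise grouping can be sketched in plain Python. This is only a minimal sketch under assumptions, not the workflow we actually used: the record layout `(year, month, elevation_m, lst_kelvin)`, the function names and the sample numbers are all made up for illustration. Only the 2000m to 6000m range and the 500m bins come from the article.

```python
from collections import defaultdict

BIN_WIDTH_M = 500  # bin width described in the article


def elevation_bin(elevation_m, low=2000, high=6000, width=BIN_WIDTH_M):
    """Return the (start, end) bin holding elevation_m, or None if out of range."""
    if not (low <= elevation_m < high):
        return None
    start = low + ((elevation_m - low) // width) * width
    return (start, start + width)


def monthly_means(records):
    """Average LST per (elevation bin, month, year).

    records: iterable of (year, month, elevation_m, lst_kelvin) tuples.
    This layout is hypothetical, not the project's real CSV schema.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for year, month, elevation_m, lst in records:
        b = elevation_bin(elevation_m)
        if b is None:
            continue  # skip pixels outside the studied elevation range
        acc = sums[(b, month, year)]
        acc[0] += lst
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def trend_slope(points):
    """Least-squares slope of (x, y) points, e.g. Kelvin per year.

    Needs at least two distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


# Made-up sample values, only to show the call pattern.
sample = [
    (2001, 1, 4100, 270.0),
    (2001, 1, 4400, 272.0),
    (2002, 1, 4200, 270.0),
    (2003, 1, 4300, 269.0),
]
means = monthly_means(sample)

# All January means in the 4000m to 4500m bin, ordered by year.
january = [(year, value)
           for (b, month, year), value in sorted(means.items())
           if b == (4000, 4500) and month == 1]
slope = trend_slope(january)  # negative slope = cooling trend in this toy data
```

A negative slope for a given month and bin would correspond to the kind of decreasing trend described above. A straight-line fit is just one simple way to summarize a trend, chosen here for brevity.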
The picture below is a beautiful visualization of how snow and no-snow pixels vary month by month. Each line represents a single year, so we have 15 lines showing how snow and no-snow pixels vary over the Gregorian calendar. The snow graph clearly shows in which months we have a high snow pixel count (starting from December and decreasing in March). It also shows the low number of snow pixels in the summer months. If the granularity is increased, there is a lot more to infer. We will not go into it. These could be beautified even more, like a bride, if you know how to do the makeup ;-) ;-)
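The "one line per year" layout can also be prepared outside Tableau. The following is a small sketch assuming a hypothetical row layout of `(year, month, snow_pixels, no_snow_pixels)`. The function names and the sample counts are invented for illustration and are not our real data.

```python
from collections import defaultdict


def yearly_series(rows):
    """Group rows into one 12-value series per year for each pixel class.

    rows: iterable of (year, month, snow_pixels, no_snow_pixels) tuples
    (hypothetical layout). Months with no data stay None.
    """
    series = defaultdict(lambda: {"snow": [None] * 12, "no_snow": [None] * 12})
    for year, month, snow, no_snow in rows:
        series[year]["snow"][month - 1] = snow
        series[year]["no_snow"][month - 1] = no_snow
    return dict(series)


def peak_snow_month(snow_values):
    """Return the 1-based month with the highest snow pixel count."""
    known = [(count, month)
             for month, count in enumerate(snow_values, start=1)
             if count is not None]
    return max(known)[1]


# Made-up counts, only to show the call pattern.
rows = [(2001, 1, 900, 100), (2001, 7, 50, 950), (2002, 1, 800, 200)]
lines = yearly_series(rows)
peak = peak_snow_month(lines[2001]["snow"])
```

Each entry of `lines` is then one line of the plot: twelve monthly values for a single year.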
I love multidimensional data, and visualizing it is a big task for the analytics industry. I have to appreciate what Tableau can do with this kind of data. The image below is one of the loveliest visualizations (of course, this graph is my own). She is mine to the core ;-) :-) She was my first love, my first love of climate data ;-) ;-) It is because of this beautiful graph that I started loving visualization. A lot of things can be inferred from it. It's a scatter plot of the number of snow pixels vs. the number of no-snow pixels, with months encoded by color and size. Size is reversed in order to highlight the negative aspect, which may be a low numerical value, a low average etc. Basically it's reverse symbolism: giving a big symbol to a numerically smaller entity in order to highlight that entity. From this graph we can infer that it is only in the months of June, July and August that a large number of pixels don't follow the usual trend.
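The reverse symbolism idea is easy to write down as a formula: scale the values linearly, but flip the direction so the smallest value gets the biggest marker. Here is a minimal sketch. The function name `reversed_sizes` and the size range are my own assumptions for illustration, and this is not how Tableau computes sizes internally.

```python
def reversed_sizes(values, min_size=20.0, max_size=200.0):
    """Map each value to a marker size so the smallest value gets the largest marker."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is nothing to distinguish.
        return [max_size] * len(values)
    scale = (max_size - min_size) / (hi - lo)
    return [max_size - (v - lo) * scale for v in values]


# Made-up values: the smallest (10) gets the biggest marker.
sizes = reversed_sizes([10, 55, 100])  # [200.0, 110.0, 20.0]
```

The resulting list could be passed as marker sizes to whatever plotting tool you prefer.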
The graph at the top, i.e. at the start of the article, is derived from the snow data. It shows the variation over the 15 years of data. All our effort goes into just that one graph, it's as simple as that. I love it even though not much can be inferred from it. As people say, "the devil is in the details": all the other graphs are derived from detailed, granular data rather than aggregated data.
I will explain another 2 graphs, both tree maps, which are quite good. A tree map shows the distribution of the data from left to right, with cells of varying size. The red tree map shows the 10 hottest months of the study area. The blue tree map shows the 10 coldest months. It feels awesome to come up with such tree maps with just the click of a button in Tableau.
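The ranking behind such tree maps is just a top-N selection. Here is a hedged sketch using Python's standard `heapq` module. The dictionary layout `(year, month) -> mean LST in Kelvin` and the function name are assumptions for illustration. The June 2009 and February 2008 values are the ones reported in this article, and the other two entries are made-up placeholders.

```python
import heapq


def hottest_and_coldest(monthly_mean_lst, n=10):
    """Return the n hottest and n coldest months.

    monthly_mean_lst: dict mapping (year, month) -> mean LST in Kelvin
    (hypothetical layout).
    """
    items = monthly_mean_lst.items()
    hottest = heapq.nlargest(n, items, key=lambda kv: kv[1])
    coldest = heapq.nsmallest(n, items, key=lambda kv: kv[1])
    return hottest, coldest


monthly = {
    (2009, 6): 291.059,  # value reported in this article
    (2008, 2): 267.010,  # value reported in this article
    (2005, 7): 289.0,    # made-up placeholder
    (2011, 1): 268.5,    # made-up placeholder
}
hottest, coldest = hottest_and_coldest(monthly, n=2)
```

In a tree map, each selected month becomes a cell whose area is driven by its value.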
From these tree maps it's clear that June 2009 was the hottest month (of course, 291.059 Kelvin is the overall average over a large geographical area), and February 2008 was the coldest at 267.010 Kelvin. Please note that I haven't given you the elevation of the data ;-) ;-) I leave it to the imagination of the readers. There are numerous such visualizations that we are able to do. I am not a great climate guy or data mining guy, but what I do know is "what, how and where" to extract data and how to visualize it. My concept is simple: "Anyone can become a data science guy".
I have been working on other data visualizations as well. I will post one of them (a PM2.5 visualization of an industrial city) in the coming days. Till then, happy visualizing. And please ignore any mistakes ;-) ;-) If you have any suggestions or want to talk about this or anything else, please contact me on LinkedIn or mail me at shivaprakash.ssy@gmail.com.