Sunday 19 November 2017

Nehru, The City Designer

(1955) A simple suggestion by Nehru to Nikita Khrushchev, to make entry to public parks free, created a silent social revolution within the communist government. Free entry diversified the pool of dating couples, broke the monotonous dating pattern, and let more young couples enjoy romantic time in overcrowded cities like Moscow. It also increased happiness among children and the elderly, as they were no longer charged. It also increased the number of picnic spots and made picnics themselves more affordable.

Fiscally, it did burden the city's budget, but it revolutionized the concept of "City Design" in the Soviet Union as we know it now. It also revolutionized the concept of "Public Good" in city design. Urban designers now include this (affordability of parks and public spaces) in the "Livability Index" and the "Walkability Index". Who knew that even Nehru could influence the "City Design" policies of the USSR!
#NehruTheCityDesigner
#Nehruvian
#MoscowFiles
#Parks

Wednesday 15 November 2017

Should We Hate Nehru?


The history of modern India wouldn't be complete without Nehru. Why should we forget Nehru? Should we forget him because he introduced the Hindu Marriage and Succession Acts, which transformed Hindu society as we know it today, even though the conservative forces which are in power today opposed them vehemently? Or should we forget him because he followed a socialist model of economic development in line with the "Bombay Plan"? Or should we forget him because he didn't allow India to slip into revolutionary communism when our neighbours slipped into either dictatorship or communism? Or should we forget him because he stood against imperialism and the West and asked for complete independence not only for India but also for African and Asian countries?

Should we forget him because he gave the idea of self-development and non-alignment when every country wanted to align after WW2? Should we forget him because he vehemently opposed Gandhian softness towards the British in the matter of Dominion status for India?

Should we forget him because he declared Purna Swaraj? Or should we hate him because he, along with Patel, didn't accept the balkanisation plan for India? Or should we hate him for not listening to Patel on the Kashmir matter and listening to Mountbatten instead, which brought about the current mess?

Or should we forget him because he supported Ambedkar, who designed a modern marvel in law, "The Indian Constitution", when so many constitutions born in the world during that period have crumbled?

Or should we hate him because he listened to Field Marshal Cariappa about the capability of the Indian army to intervene in neighbours' conflicts? Or should we hate him for being optimistic and idealistic towards a young communist China which behaved like North Korea?

Or should we hate him because he didn't become authoritarian even though most world leaders he met were? Or should we forget him because he created states based on language? Or should we hate him because he taught us how to be secular even though it was not part of the constitution? Or should we hate him because he was not militarily aggressive towards our smaller neighbours?

Thursday 4 May 2017

Drought Analytics of Karnataka Using MODIS Satellite Level 2 Data (2001-2015) with the ArcPy Python Package and Tableau Public


Humans have been changing the face of the earth since they emerged out of Africa. Most of the recent changes have happened since the 1750s. Be it the industrial revolution, the green revolution or the white revolution, all have happened in this short period of time. However, since the 1950s the change has accelerated in such a way that this period is being called the Anthropocene: The Age of Humans.

The Age of Humans also brought the complication of guiding the whole of mankind through the problems that came along with the Anthropocene. Drought is one of the biggest challenges that mankind has faced, is facing and will face. There are a lot of indices that define different types of drought; that is a topic for some other day. In the present article I am going to come up with a simple drought index and show how to extract it from MODIS Satellite Level 2 data for the whole of India using ArcGIS tools, and how to visualize the data using Tableau Public.

In a previous article I explained point extraction from MODIS Level 3 data. Here we concentrate on MODIS Level 2 data. Thanks to our work at NIIT University's GIS Lab, we are able to extract and process MODIS Level 2 data for the whole of India. I am also happy to announce that NIIT University GIS Lab now has the capability to work on Level 2 data of 4 MODIS products at large scale, running into 100s of GBs. The drought index used here is a simple ratio of NDVI to LST (in Kelvin). It does have its own flaws, but it is the simplest and easiest way to conduct a preliminary analysis on huge spatio-temporal data. As this project has other stakeholders, I will abstain from disclosing the full methodology and will discuss it only in brief.

Method:

As said previously, we have the capability to extract all years of MODIS Level 2 satellite data for the whole of India (South Asia) or any part of the world. The resolution of these products varies: 250 meters, 500 meters and 1 km. The NDVI product that has been derived for the whole of India has a resolution of 500 meters. LST, which is in Kelvin, has 1 km resolution. LST is 8-day composite data and NDVI is 16-day composite data. We have code to convert 8-day LST to 16-day LST and 500 m NDVI data to 1 km resolution. Hence the end product has 1 km resolution and is a 16-day composite. The Drought Index is calculated pixel-wise. Once we have the index, we want to know how many pixels fall in each defined range of the index. The Drought Index has both a minimum and a maximum, and all other values vary between these: the minimum is -9 and the maximum is 36. The index is just a number, but it is a powerful number that can give a lot of insight into drought.
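As a rough illustration of the pipeline described above (not the lab's actual scripts, whose full methodology is not disclosed here), the following pure-Python sketch averages two 8-day LST composites into one 16-day value, aggregates a 500 m NDVI grid to 1 km by averaging 2x2 blocks, and takes the pixel-wise ratio. The function names, the tiny sample grids, and the assumption that NDVI is stored as a scaled integer are all made up for illustration.

```python
def merge_8day_lst(lst_a, lst_b):
    """Average two 8-day LST grids (Kelvin) into one 16-day grid.
    None marks a missing pixel; if only one period is valid, use it."""
    merged = []
    for row_a, row_b in zip(lst_a, lst_b):
        out_row = []
        for a, b in zip(row_a, row_b):
            valid = [v for v in (a, b) if v is not None]
            out_row.append(sum(valid) / len(valid) if valid else None)
        merged.append(out_row)
    return merged


def aggregate_2x2(grid):
    """Resample a 500 m grid to 1 km by averaging each 2x2 block."""
    out = []
    for r in range(0, len(grid) - 1, 2):
        out_row = []
        for c in range(0, len(grid[r]) - 1, 2):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            out_row.append(sum(block) / 4.0)
        out.append(out_row)
    return out


def drought_index(ndvi_1km, lst_1km):
    """Pixel-wise ratio NDVI / LST; None where LST is missing."""
    return [[None if t is None else n / t for n, t in zip(n_row, t_row)]
            for n_row, t_row in zip(ndvi_1km, lst_1km)]


# Made-up sample values, for illustration only.
ndvi_500m = [[6000, 6200, 1000, 1200],
             [5800, 6000, 800, 1000]]   # scaled-integer NDVI (an assumption)
lst_first = [[300.0, 310.0]]            # first 8-day LST composite, Kelvin
lst_second = [[304.0, None]]            # second 8-day LST composite, Kelvin

ndvi_1km = aggregate_2x2(ndvi_500m)
lst_16day = merge_8day_lst(lst_first, lst_second)
index = drought_index(ndvi_1km, lst_16day)
print(index)
```

In a real workflow the same three steps would run on rasters rather than nested lists; the sketch only shows the order of operations.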

If the index is low, NDVI is low and LST is high; if the index is high, NDVI is high and LST is low. Everything else varies between these two interpretations. We have created numerical classes by putting all pixels with an index less than 0 in one class and then defining 7 classes with a gap of 5: 8 classes in all, Class 0 to Class 7. This is still not standard; with more research, more parameters could be added to come up with a brand new index. Here we are more interested in having the capability to extract and analyse raster data district-wise than in the index itself. We have also developed scripts that can take a shapefile (GIS) of a whole state, nation or region containing 1 level of subdivision and extract the raster data based on the subdivision. The extracted data is in the form of a CSV file. It is just a simple conversion of the attribute table of each raster file that has been clipped based on the given shapefile. This is done using ArcPy, a Python package provided by ESRI. The CSV file is fed into Tableau software for both GIS and non-GIS visualization. I am glad to tell readers that this integration of ArcPy and Tableau software for analyzing raster data (MODIS) was uniquely developed by us at the NIIT GIS Lab. This method uses the processing power of ArcPy on ArcGIS Server and the visualization power of Tableau.
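The class scheme can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the exact class boundaries (and where the maximum of 36 lands) are my assumption, since the post only says "less than 0" plus 7 classes with a gap of 5.

```python
from collections import Counter


def drought_class(index_value):
    """Map a drought index value to a class number 0..7.

    Class 0 holds every value below 0; classes 1..7 each span 5 index
    units starting at 0. Values of 35 and above are clamped into class 7
    (an assumption, not something stated in the post).
    """
    if index_value < 0:
        return 0
    return min(int(index_value // 5) + 1, 7)


def count_pixels_per_class(index_values):
    """Count how many pixels fall in each class, skipping missing (None)."""
    counts = Counter(drought_class(v) for v in index_values if v is not None)
    return {cls: counts.get(cls, 0) for cls in range(8)}


# Made-up pixel values for one district, for illustration only.
pixels = [-9.0, -0.5, 0.0, 4.9, 5.0, 19.9, 34.0, 36.0, None]
print(count_pixels_per_class(pixels))
```

A table of such counts per district and per 16-day period is the kind of CSV that can then be handed to Tableau.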

 Link 1
Dry Areas (here you can interact with the data by changing years and months)

Link 1: Please Click above Link to see interactive graphs
Our data processing is done for the whole of India. Just the extraction of data from the drought raster files took more than 24 hours of processing for the whole of India. I now have district-wise Drought Index data for all districts of India. In the coming days LST for all districts will be generated. Here in Tableau Public I have presented data for my state, i.e. Karnataka state in the Republic of India. The same can be done for any part of the world. Tableau Public provides a unique opportunity to share data publicly. The figures (links) given above are from Tableau Public. These are interactive graphs; users can change the selections given in Tableau. Tableau Public is free. Tableau also comes with a GIS feature, which I have tried to showcase. A lot of interpretation can be drawn out of these graphs. Figure 1 (Link 1) gives the spatio-temporal variation of 15 years of the Drought Index for all districts of Karnataka, for all months and all years. Users can filter the visualization. Figure 2 (Link 2) gives the plots. Figure 3 (Link 3) gives the GIS visualization of dry spells for the entire data set. The dry spell data is for the post-Rabi season. The northern part of Karnataka state is prominently affected by drought-like situations. Reader, I would encourage you to use the filters to see how this index changes year-wise and month-wise.

Link 2: DroughtAnalytics of Karnataka Using MODIS Satellite Data 2001-2015

There could be a lot of discussion on the visualization alone, but typing is a constraint. I promise readers that I will come back with more such interesting ESDA (Earth Science Data Analytics) work. Next time it could be LST for the districts of Karnataka, or it could be at sub-division level or at Panchayat level. I have scripts which can work down to Panchayat level, and I am eager to work on this; only processing capability is blocking the move to Panchayat level. Anyway, till next time, happy learning, happy ESDA.

Link 3: Here the reader can select the graph based on district and month. Interpretation of the graph will give the rainfall pattern of each district of Karnataka.
Readers, please do contact me if you wish to derive data for your own area. I would be more than happy to help you. Mail me at shivaprakash.ssy@gmail.com or contact me via LinkedIn. I would also encourage readers to download the Tableau workbook of these analytics so you can work on my data. We also provide assistance for those who want to extract data for their region.

Sunday 2 April 2017

Extraction and Visualization of MODIS level 3 AOD data using ArcGIS and Tableau


It has been a month since I wrote "Making Sense of My Climatic Data". In order to spread the word that "Anyone can become a data science guy", I need to keep writing about the work that I have done or am doing as part of my projects. The present post is about a project that we (Arun Kale, Manoj Divakaran and I) took up under the guidance of Dr Parul Srivastav at NIIT University. The project investigated CO (Carbon Monoxide) and NO2 (Nitrogen Dioxide) concentrations over industrial cities near NIIT University using remotely sensed data from the MODIS sensor on board the Terra satellite. We also looked into AOD (Aerosol Optical Depth) over these industrial cities. Here we have used level 3 data, which can be downloaded from one of NASA's websites (https://neo.sci.gsfc.nasa.gov/).
The main objective of this project was to extract CO, NO2 and AOD data over Neemrana City (RICCO Industrial Area), Bawal and Behror. I followed the Earth Science Data Analytics methodology, but with my own taste added to it. I included ArcGIS and Tableau software in the project. ArcGIS Desktop was used to extract the data from the TIFF files and convert the extraction into a CSV file, which was then fed to Tableau software for visualization in the form of temporal graphs.
The figure shows the methodology that was followed. It's a simple methodology. The steps are as follows:
  1. Data in TIFF format has to be downloaded. Here we have taken monthly data for 15 years, from 2001 to 2015.
  2. A point shapefile representing the cities was prepared. Using the ArcMap tool (Extract Multi Values to Points), we used the point shapefile to extract the values present at each X,Y in the TIFF files.
  3. This gave us 3 shapefiles: 1 for CO, 1 for NO2 and another for AOD. The attribute tables of these shapefiles contain the data; each table should be exported using the Table to Table or Table to Excel conversion tool in ArcMap. The output of this conversion is either a CSV file or an Excel file.
  4. This CSV or Excel file is the input to Tableau, which is used for visualization of the data. However, there is a little modification to be done to the CSV or Excel files. In order to get a temporal visualization curve, the time period must be in a single column. The output of the extraction doesn't come in that format, so the whole extracted data has to be transposed using the Transpose function in Excel.
  5. The extraction creates a column of extracted values for all points, with the column name the same as the file name. So the naming convention of the files must define the month and year of each file, so that this information can be used for temporal visualization in Tableau.
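Steps 4 and 5 can also be done in a script instead of Excel. Below is a minimal sketch, assuming a hypothetical naming convention in which each value column is called `<PRODUCT>_<YEAR>_<MONTH>` (for example `AOD_2001_01`) and the first column, here called `City`, holds the point name. Neither name comes from the project, so adapt them to your own files.

```python
import csv
import io


def wide_to_long(csv_text, id_column="City"):
    """Turn one-column-per-file extraction output into one row per
    (city, year, month), so that time sits in a single column."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        for column, value in record.items():
            if column == id_column:
                continue
            # Hypothetical convention: PRODUCT_YEAR_MONTH, e.g. AOD_2001_01
            product, year, month = column.split("_")
            rows.append({
                "city": record[id_column],
                "product": product,
                "year": int(year),
                "month": int(month),
                "value": float(value),
            })
    return rows


# Made-up sample in the wide layout produced by the extraction step.
sample = """City,AOD_2001_01,AOD_2001_02
Neemrana,0.41,0.38
Bawal,0.44,0.40
"""

long_rows = wide_to_long(sample)
for row in long_rows:
    print(row)
```

The long layout gives Tableau a year and a month field to build the time axis from, which is the whole point of the transpose step.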
Two things that need to be kept in mind are the file naming convention and the transposing of the data. Once the data is ready, Tableau will take care of the rest. The above method involved extraction by ArcGIS; there is no data preparation involved as such, but downloading the TIFF data and renaming it using a Python script could be considered data preparation. Data cleaning is not necessary, as level 3 data doesn't have irregular data. If 0 or NoData exists in the data, please verify it before using it in Tableau. For AOD, NoData would be extracted as 0, as the gray-scale value of NoData would be 0. Hence, if a 0 value exists in the data, it should be converted into NoData. This could be done either at image level, before extraction, or at Excel level, after extraction.
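The zero-to-NoData conversion after extraction can be sketched as follows. The helper names and sample values are made up, and the sketch assumes, as described above for AOD, that an extracted 0 always means NoData.

```python
def zero_to_nodata(values):
    """Replace extracted 0 values with None (NoData).
    Assumes that 0 only ever means NoData, as described for AOD."""
    return [None if v == 0 else v for v in values]


def mean_ignoring_nodata(values):
    """Average the valid values; return None if nothing is valid."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


# Made-up monthly AOD values for one point, for illustration only.
aod = [0.5, 0, 0.25, 0.75]
cleaned = zero_to_nodata(aod)
print(cleaned)
print(mean_ignoring_nodata(cleaned))
```

If the zeros were left in, they would pull the average down and show up as false dips in the temporal graph.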
Earth Science Data Analytics (ESDA) doesn't mention at what stage data cleaning or extraction should happen. This can be adapted based on the project. Once data preparation, data extraction and data cleaning are done, it's time for visualization and analysis of the extracted data.
This graph is a temporal visualization of AOD over the RICCO area near Neemrana City. There are a lot of spikes, i.e. high values of AOD, which naturally occur in the months of November, June and July.
The June and July spikes are due to intense sand storms, which increase the AOD of the area, and the November and December spikes are due to temperature inversion in the area, compounded by the burning of straw in the fields. There is a change in the recent trend: high values of AOD have started occurring consistently in November and December only after 2009, from when industries began operating in the area. This needs to be investigated further. It could be highlighted using a tree map of some of the highest AOD values and their months. These graphs were obtained for Behror and Bawal also.
Carbon Monoxide (CO) is also an important measure for an industrial area. Below are the graphs for the 3 cities.
There is a clear declining trend in both RICCO Neemrana and Bawal. Behror also shows a similar trend. Why there is a decreasing trend needs to be investigated further. Another interesting thing that came out of these data is that CO is highest in the summer months, i.e. April, May and June, when it varies between 120 ppbv and 160 ppbv.
NO2 concentration is directly linked with industrialization, i.e. as industries increase there will be an increase in NO2 gas. We saw this clearly in the temporal graph of NO2 over these cities, even though data was available only from 2005.

Even in this (2005 to 2015) data there is a clear trend: the curve remained steady till 2009-2010, only to increase from 2010. This is consistent in the other cities as well.
The tree map generated for Neemrana RICCO shows that most of the highest NO2 values are from the years 2011 to 2015. Only a few of the highest values are from the years before 2010. From the above data visualizations we can draw some conclusions.
  • The highest AOD values of the 3 cities (RICCO Neemrana, Behror, and Bawal) before 2010 were in the months of June and July. This is consistent with the environment, as Rajasthan experiences sand storms and strong winds carry dust at this time. But this trend changes from 2010. The highest recorded AOD from 2010 onwards has been in November. This is consistent in 2010, 2012, 2013 and 2015. It is due to the smog that gets created during winter, which starts from November. Burning of crop residue in Punjab and Haryana during November creates particulate matter, which is the main reason for the smog in this area, along with the addition of airborne particles from the industries. This gives a clear indication that from 2010 there has been a huge increase in pollution levels relative to the previous years.
  • Nitrogen Dioxide in the RICCO Neemrana, Behror and Bawal area has been on the rise since 2005. The slope till 2009 is very small, but after 2009 there is a rapid rise in Nitrogen Dioxide levels, which is a clear indication of the addition of Nitrogen Dioxide from industrial sources. Nitrogen Dioxide values also increase during winter months due to temperature inversion.
  • Carbon Monoxide in the RICCO Neemrana, Behror, and Bawal area shows a constant decreasing trend from 2001. The highest values in most years have been in the month of May. This decreasing trend and the peaks in May need to be studied further.
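The "small slope till 2009, rapid rise after" reading of the NO2 curve was made by eye from the Tableau graphs. One hedged way to put a number on it is to fit a least-squares slope to each period separately. The sketch below uses made-up yearly values, not the project's data, and the function names are hypothetical.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def slopes_around(years, values, break_year):
    """Slope up to and including break_year, and slope from break_year on."""
    before = [(y, v) for y, v in zip(years, values) if y <= break_year]
    after = [(y, v) for y, v in zip(years, values) if y >= break_year]
    return (slope(*zip(*before)), slope(*zip(*after)))


# Made-up yearly means, NOT the project's data: flat until 2009, then rising.
years = list(range(2005, 2016))
values = [10, 10, 10, 10, 10, 12, 14, 16, 18, 20, 22]
print(slopes_around(years, values, 2009))
```

Comparing the two slopes for each city would make the change around 2009 easier to state and to compare across Neemrana, Behror and Bawal.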
This type of data extraction and visualization can be done using this simple methodology. The study could be extended to all cities in order to know the trends of AOD, NO2 and CO. I need to thank Manoj Diwakaran and Arun Kale for generating these graphs and data. If readers have doubts, please mail me at shivaprakash.ssy@gmail.com; I would be more than happy to help. Happy learning and happy ESDA.

Friday 3 March 2017

Making Sense of Climate Data using ArcGIS and Tableau software


 * This post is a replication of Making Sense of My Data (Climate Data), which I wrote on LinkedIn

As I promised in my previous article, I am back with more to say about Geo-Data science. This time it's about visualization of the data that we have extracted. Being an independent researcher (Climate Change) and a guy who loves data, I would love to show how beautiful (or rather worrying, for a climate scientist) visualization of data can be using Tableau software. As I said in the previous post, we have extracted 15 years of LST, NDVI and snow extent data. Although the input was EOS-HDF files from the MODIS satellite, we got a CSV file as the final output. If you want to know how we got the CSV, please refer to https://www.linkedin.com/pulse/geo-data-science-shivaprakash-yaragal.

The data that we extracted has elevation, class and time as bins, so a lot of trend analysis can be done with it. The data can be aggregated, grouped and regrouped based on elevation. It's basically multidimensional data. Tableau software is one of the most powerful tools that the market can provide us. For students, Tableau provides a 1-year free subscription to the software. Thanks to my dear friend Savan Kumar for introducing me to Tableau a year back. The tutorials for Tableau are also free. If you are not a student you can still get Tableau Public. The tutorial for Tableau Public is here: https://public.tableau.com/en-us/s/resources.

Having talked about Tableau, let's move on to what we came up with. As this project involves other researchers too, I should restrict myself and not put out all the data. For the purpose of this article there are enough visualizations to go through. From our data we have done trend analyses, which are quite interesting for a climate change researcher as well as a data science guy. Below is one such graph. I have reduced the granularity in these so that people can get a broad picture of what's happening. The picture below shows the month-wise trend (e.g. all Januaries of the 15 years) at an elevation of 4000 m to 4500 m. This is for just 4 months. We have done it for elevations between 2000 m and 6000 m in bins of 500 m. This data is for 15 years. The monthly data comes from grouped averages. It's quite interesting to see that at 4000 m to 4500 m there is a decrease in temperature during winter and the start of summer. There is much more to infer if NDVI and snow extent at the same elevation are analyzed (which we have done). I leave the inference to readers.
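For readers who want to try the elevation binning and grouped averages themselves, here is a minimal sketch in plain Python. It is not the project's ArcPy code; the function names, the record layout and the sample values are assumptions for illustration.

```python
from collections import defaultdict


def elevation_bin(elevation_m, low=2000, high=6000, width=500):
    """Return the (start, end) of the 500 m bin holding elevation_m,
    or None if it lies outside the 2000 m to 6000 m study range."""
    if not low <= elevation_m < high:
        return None
    start = low + ((elevation_m - low) // width) * width
    return (start, start + width)


def monthly_mean_by_bin(records):
    """Average value per (elevation bin, year, month).
    records: iterable of (elevation_m, year, month, value)."""
    sums = defaultdict(lambda: [0.0, 0])
    for elevation_m, year, month, value in records:
        bin_ = elevation_bin(elevation_m)
        if bin_ is None:
            continue
        acc = sums[(bin_, year, month)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Made-up pixel records: (elevation in m, year, month, LST in Kelvin).
records = [
    (4100, 2001, 1, 260.0),
    (4400, 2001, 1, 262.0),
    (4600, 2001, 1, 258.0),
    (1500, 2001, 1, 280.0),   # below the study range, ignored
]
means = monthly_mean_by_bin(records)
print(means)
```

Plotting one bin's January means against the year gives the kind of month-wise trend line shown in the graph.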
The picture below is a beautiful visualization of how snow and no-snow pixels vary month-wise. Each line represents a single year, so we have 15 lines showing how snow and no-snow pixels vary over the Gregorian calendar. The snow graph clearly shows in which months we have a high number of snow pixels (starting from December and decreasing in March). It also shows the low number of snow pixels in the summer months. If the granularity is increased there is a lot more to infer. We will not go into it. These could be beautified further, like a bride, if you know how to do makeup ;-) ;-)

I love multidimensional data; visualizing it is a big task for the analytics industry. I need to appreciate what Tableau can do with this kind of data. The image below is one of the most lovely visualizations (of course, this graph is my own). She is mine to the core ;-) :-) She was my first love, my first love of climate data ;-) ;-) It is because of this beautiful graph that I started loving visualization. A lot of things can be inferred from it. It's a scatter plot of the number of snow pixels vs no-snow pixels, categorized by month using color and size. Size is reversed in order to highlight the negative aspect, which may be a low numerical value, a low average etc. Basically it's reverse symbolism: giving a big symbol to a numerically smaller entity to highlight that entity. From this graph it can be inferred that only in the months of June, July and August are there large numbers of pixels that don't follow the usual trend.
The graph at the top, i.e. at the start of the article, is derived from the snow data. It shows the variation over 15 years of data. All our effort is in just that graph; it's as simple as that. I love it even though not much can be inferred from it. As people say, "the devil is in the details": all the other graphs are derived from detailed and granular data rather than aggregated data.

I will explain another 2 graphs (tree maps) which are quite good. These give the distribution of data from left to right along with varying cell sizes. The red tree map shows the 10 hottest months of the study area. The blue tree map shows the 10 coldest months. It feels awesome to come up with such tree maps with just the click of a button in Tableau.

From these tree maps it's clear that June 2009 was the hottest month (of course, 291.059 Kelvin is the overall average over a large geographical area). And February 2008 was the coldest, with 267.010 Kelvin. Please note that I haven't given you the elevation of the data ;-) ;-) . I leave it to the imagination of the readers. There are numerous such visualizations that we are able to do. I am not a great climate guy or data mining guy, but what I know is "what, how and where" to extract data from and how to visualize it. My concept is simple: "Anyone can become a data science guy".

I have been doing, or have done, other data visualizations as well. I will post one of them (PM2.5 visualization of an industrial city) in the coming days; till then, happy visualization. And please ignore any mistakes ;-) ;-) . If you have any suggestions, or want to talk about these or any other thing, please do contact me on LinkedIn or mail me at shivaprakash.ssy@gmail.com.

Friday 7 October 2016

Geo Data Science

I have always wondered why and how people establish new nomenclature in industry. The geo-spatial industry is not new, and neither is data science. I was one of the few lucky guys who tasted both at the same time, way back in 2011-12, while I was still a trainee with Tata Consultancy Services. Although I was trained as a Java developer, I was given the opportunity to work on data mapping using Informatica. Before I even knew it was part of BI, I had one of my feet in GIS, maintaining the spatial repository for a GIS application. It was an accidental entry, as none of the existing team were interested in working on GIS.

Between 2012 and 2015 it was an altogether different experience. It was all about understanding sociology, political science, anthropology, history, environment etc. This has given me a strong belief that complexity exists in problem solving; a reductionist attitude doesn't provide solutions, but clarity about the complexity could. Clarity about complexity simply means having high-resolution data over a large area. I always dreamt of working in a domain that integrates all that I have learnt. It has been 1 year into my master's in GIS, and we have been working on 2 projects on climate change:
  • Detecting Greening effect in Uttarakhand area of India.
  • AOD study of cities of India
Both of these projects are the first of their kind in terms of scale, both spatially and temporally. We use both level 2 and level 3 data from the MODIS Terra satellite. The tasks in both were conceptually similar:
  • compile 15 years of temporal data with 500m spatial resolution and 16 days temporal resolution
  • 3 parameters were used in Uttarakhand project and 3 other parameters in AOD project
For the two projects we applied different methodologies, but we used the same data science approach, which included the following steps:
  1. Data preparation
  2. Data extraction
  3. Data visualizations
  4. Data interpretation
For data preparation and data extraction we used an Ubuntu server and bash programming. A huge amount of data was downloaded from LPDAAC. In GIS, or in the geo-spatial domain, data preparation also involves
  • creating vector data, which includes points and polygons over the study area. This included creating a point cloud of 150 cities.
  • downloading or FTPing a huge amount of data from the servers
Data Extraction

We used ArcGIS 10.4.1 software to extract the data. As we had a large number of files to process, we used Python scripting. In the Uttarakhand project we developed a series of scripts which not only performed better than the MRT tool given by LPDAAC (USGS, NASA) but also ran on the local machine. All of this was possible by using the ArcPy package in ArcGIS for data processing. The processed data, which was in image format, was visualized using ArcGIS itself, which is Geo-visualization software as well. We were also able to create a lot of classified images, which were extracted to .csv files. These images numbered in the thousands. Working on thousands of images was a huge challenge.

The .csv files that were generated from the classified images were fed into Tableau 9.1 (now Tableau 10) for visualization. I am happy to announce that we finally found a greening effect in the Himalayas of Uttarakhand. We had another huge task of creating bins for 3 parameters: NDVI, Temperature, and Snow Extent & Days. These bins were made on the basis of elevation. The bin width was 500 meters. It was computationally heavy, as each image of a single parameter needed to be classified into 10 bins. This data, as previously said, was converted into .csv files. All of this was done using Python code in ArcGIS.
I am also happy to tell viewers that there is an increase in NDVI pixels at elevations above 5000 meters in the Himalayan mountains of Uttarakhand. This is clearly visible from the temporal graphs that were the outcome of the Tableau visualization. This gives us the confidence to put forward another aspect of Data Science, i.e.
Data Visualization 

As i explained we have used
Data Preparation--->Data Extraction---->Data Visualization
Interpretation is our result of the greening effect, and the question of whether and how these parameters can be interrelated, and to what extent.

The same logic was followed while carrying out the Aerosol Optical Depth (AOD) study for more than 150 cities of India. It included

Data Preparation
  • using ArcGIS software
  • FME desktop
  • Ubuntu Server
Data Extraction
  • Python script(ArcPy)
Data Visualization
  • Tableau
  • ArcMap
Data Interpretation: We were able to show the following
  • there exists a correlation between temperature and AOD. This correlation is a geographical correlation.
  • AOD of most Indian cities, except some South Indian cities, is greater than 0.3 for most months of the year
  • Brown cloud (AOD) exists all along the Indo-Gangetic plain, starting from Punjab and ending in West Bengal. Hence most cities that come under this region don't have clean air to breathe.
We are conducting similar projects, on which I will post updates in the coming months. From the above projects it's clear that data science methods are, and have been, part of Geo-Data or geo-spatial data. Geo Data Science is an inherent part of the GIS or geo-spatial world.




Monday 18 January 2016

GI

Geographical Intelligence 


The concept of AI (Artificial Intelligence) seems to be an old one. The Turing Test was proposed way back in 1950. GI is a similar concept. AIs and GIs were always going to become reality; people seemed to know how, but few knew when. Human consciousness is a function of time. Machine consciousness is a function of computing, search and time. We have reached a critical juncture where the first 2 are met, and now it's only a matter of time. We now have AIs which can pass the Turing test in one component, like speech etc. Till now we haven't been able to disprove Moore's law, but I hope the world is close to at least challenging it. The world is waiting for the next big thing. There are things out there, but it's the gradual reorganization of these related elements that can create AI. AI could just be an abstraction of human consciousness, but what about an abstraction of nature? Will we be able to create an abstraction of that? We really don't know. What form might it take? No one knows. What should it be called? A God, or Natural Intelligence, or GI, i.e. Geographical Intelligence, etc. It's a very intuitive way of answering the question "What's next after AI?". If you ask me, I would prefer a cup of coffee with GI, if at all it's in human form :-)
