Sunday, 2 April 2017

Extraction and Visualization of MODIS level 3 AOD data using ArcGIS and Tableau


It has been a month since I wrote "Making Sense of My Climatic Data". In order to spread the word that "anyone can become a data science guy", I need to keep writing about the work that I have done or am doing as part of my projects. The present post is about a project that we (Arun Kale, Manoj Divakaran and I) took up under the guidance of Dr Parul Srivastav at NIIT University. The project investigated CO (Carbon Monoxide) and NO2 (Nitrogen Dioxide) concentrations over industrial cities near NIIT University using remotely sensed satellite data. We also looked into AOD (Aerosol Optical Depth) over these cities, which is measured by the MODIS sensor on board the Terra satellite. We used level 3 data, which can be downloaded from one of NASA's websites (https://neo.sci.gsfc.nasa.gov/).
The main objective of this project was to extract CO, NO2 and AOD data over Neemrana City (RICCO industrial area), Bawal and Behror. I followed the Earth Science Data Analytics (ESDA) methodology, but with my own twist. I used ArcGIS and Tableau for the project: ArcGIS Desktop extracted the values from the TIFF files and converted them into a CSV file, which was then fed to Tableau for visualization as temporal graphs.
The figure shows the methodology that was followed. It is a simple methodology, and the steps are as follows:
  1. Download the data in TIFF format. Here we took monthly data for 15 years, from 2001 to 2015.
  2. Prepare a point shapefile representing the cities. Using the ArcMap tool Extract Multi Values to Points, we used this point shapefile to extract the values present at each X,Y location in the TIFF files.
  3. This gave us 3 shapefiles: 1 for CO, 1 for NO2 and 1 for AOD. The attribute tables of these shapefiles contain the extracted data, which should be exported using the Table to Table or Table To Excel conversion tool in ArcMap. The output of this conversion is either a CSV file or an Excel file.
  4. This CSV or Excel file is the input to Tableau, which is used to visualize the data. However, a little modification has to be done to the file first. To get a temporal curve, the time period must be in a single column, but the output of the extraction does not come in that format (each time period is a separate column). So the whole extracted table has to be transposed using the Transpose function in Excel.
  5. The extraction creates one column of extracted values (for all points) per TIFF file, with the column name the same as the file name. The file naming convention must therefore encode the month and year of each file, so that this information can be used for temporal visualization in Tableau.
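For readers who prefer a script to Excel's Transpose, steps 4 and 5 can be sketched in a few lines of Python. This is only a minimal sketch, not what we actually ran: the `City` column, the `AOD_YYYYMM` column pattern and the sample values are all assumptions made for illustration (the pattern is kept at 10 characters because shapefile field names cannot be longer than that).

```python
import csv
import io
from datetime import date


def parse_period(column_name):
    """Turn a column name such as 'AOD_200103' into date(2001, 3, 1).

    The PREFIX_YYYYMM pattern is an assumed naming convention.
    Returns None for columns that do not match (for example 'City').
    """
    parts = column_name.split("_")
    if len(parts) != 2:
        return None
    stamp = parts[1]
    if len(stamp) != 6 or not stamp.isdigit():
        return None
    return date(int(stamp[:4]), int(stamp[4:]), 1)


def wide_to_long(rows, id_field="City"):
    """Reshape one-row-per-city records into one row per (city, month)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            period = parse_period(column)
            if period is None:
                continue  # not a data column
            long_rows.append(
                {"City": row[id_field], "Date": period.isoformat(), "Value": float(value)}
            )
    return long_rows


# Made-up sample standing in for the exported attribute table.
sample_csv = """City,AOD_200101,AOD_200102,AOD_200103
Neemrana,0.31,0.28,0.35
Bawal,0.29,0.27,0.33
"""

records = wide_to_long(csv.DictReader(io.StringIO(sample_csv)))
for record in records:
    print(record)
```

The result has a single Date column and a single Value column, which is the shape Tableau needs for a time axis.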
The 2 things to keep in mind are the file naming convention and the transposing of the data. Once the data is ready, Tableau will take care of the rest. In the above method the extraction is done by ArcGIS and there is no separate data preparation step, but downloading the TIFF data and renaming the files using a Python script could be considered data preparation. Data cleaning is mostly unnecessary, as level 3 data does not have irregular values. If 0 or NoData exists in the data, please verify it before using it in Tableau. For AOD, NoData is extracted as 0, because the NoData pixels have a gray-scale value of 0. Hence, any 0 value in the data should be converted into NoData. This can be done either at the image level, before extraction, or at the Excel level, after extraction.
Earth Science Data Analytics (ESDA) does not prescribe at what stage data cleaning or extraction should happen, so this can be adapted to the project. Once data preparation, data extraction and data cleaning are done, it is time for visualization and analysis of the extracted data.
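The 0-to-NoData conversion after extraction can also be scripted. The sketch below is an illustration under assumptions, not our original workflow: the helper name `clean_value` and the sample values are made up, and it treats every exact 0 as missing, which only makes sense for a quantity like AOD where 0 is the NoData marker.

```python
def clean_value(raw):
    """Convert a raw CSV cell to a float, or None when it marks NoData.

    Empty cells and exact zeros are both treated as missing.
    """
    text = raw.strip()
    if text == "":
        return None
    value = float(text)
    return None if value == 0 else value


def to_csv_cell(value):
    """Write missing values as an empty cell so they load as nulls."""
    return "" if value is None else str(value)


# Made-up extracted AOD values for one city; "0" stands for NoData.
raw_cells = ["0.31", "0", "0.35", "", "0.0"]
cleaned = [clean_value(cell) for cell in raw_cells]
print(cleaned)
print([to_csv_cell(v) for v in cleaned])
```

Leaving the cell empty rather than writing 0 keeps the gaps out of the averages and the curve.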
This graph is the temporal visualization of AOD over the RICCO area near Neemrana City. There are a lot of spikes, which are high values of AOD, and they typically fall in November, June and July.
The June and July peaks are due to intense sand storms, which increase the AOD of the area, while the November and December peaks are due to temperature inversion compounded by the burning of straw in the fields. There is a change in the recent trend: high AOD values have started appearing consistently in November and December only after 2009, which is when industries began operating in the area. This needs to be investigated further. It can be highlighted using a tree map of some of the highest AOD values and their months. The same graphs were obtained for Behror and Bawal as well.
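The question "in which month does each year peak?" can also be checked directly on the long-format table, outside Tableau. The following is a small sketch with made-up numbers, not our extracted values; the function name `peak_month_by_year` is hypothetical.

```python
from collections import Counter


def peak_month_by_year(records):
    """records: iterable of (year, month, value); returns {year: month of the maximum}.

    Missing values (None) are skipped.
    """
    best = {}
    for year, month, value in records:
        if value is None:
            continue
        if year not in best or value > best[year][1]:
            best[year] = (month, value)
    return {year: month for year, (month, _) in best.items()}


# Made-up monthly AOD values, only to show the mechanics.
sample = [
    (2008, 1, 0.3), (2008, 6, 0.9), (2008, 11, 0.6),
    (2012, 3, None), (2012, 6, 0.7), (2012, 11, 1.1),
]

peaks = peak_month_by_year(sample)
print(peaks)
print(Counter(peaks.values()))  # how often each month holds the yearly maximum
```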
Carbon Monoxide (CO) is also an important measure for an industrial area. Below are the graphs for the 3 cities.
There is a clear declining trend in both RICCO Neemrana and Bawal, and Behror shows a similar trend. Why CO is decreasing needs to be investigated further. Another interesting thing that came out of this data is that CO is highest in the summer months, i.e. April, May and June, when it varies between 120 ppbv and 160 ppbv.
NO2 concentration is directly linked with industrialization, i.e. as industries increase, NO2 increases. We could see this clearly in the temporal graph of NO2 over these cities, even though data was available only from 2005.

Even in this (2005 to 2015) data there is a clear trend: the curve remained steady until 2009-2010 and only started to increase from 2010. This is consistent across the other cities as well.
The tree map generated for Neemrana RICCO shows that most of the highest NO2 values are from the years 2011 to 2015. Only a few of the highest values are from before 2010.
From the above data visualization we can draw some conclusions.
  • The highest AOD values of the 3 cities (RICCO Neemrana, Behror and Bawal) before 2010 were in June and July. This is consistent with the environment, as Rajasthan experiences sand storms and strong winds that carry dust at this time of year. But the trend changes from 2010: the highest recorded AOD from 2010 onwards has been in November. This holds in 2010, 2012, 2013 and 2015. It is due to the smog that forms during winter, which starts from November. Burning of crops in Punjab and Haryana during November creates particulate matter, which is the main reason for the smog in this area, along with the addition of airborne particles from the industries. This gives a clear indication that from 2010 there has been a large increase in pollution levels relative to the previous years.
  • Nitrogen Dioxide in the RICCO Neemrana, Behror and Bawal area has been on the rise since 2005. The slope is very small until 2009, but after 2009 there is a rapid rise in Nitrogen Dioxide levels, which is a clear indication of Nitrogen Dioxide being added by industrial sources. Nitrogen Dioxide values also increase during the winter months due to temperature inversion.
  • Carbon Monoxide in the RICCO Neemrana, Behror and Bawal area shows a steady decreasing trend from 2001. The highest values in most years have been in May. This decreasing trend and the peaks in May need to be studied further.
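The "small slope until 2009, rapid rise afterwards" observation was read off the graphs, but it can be put into numbers by fitting a straight line to the yearly means before and after a chosen break year. Below is a minimal sketch using ordinary least squares; the yearly values are made up for illustration and the break year of 2009 is an assumption taken from the discussion above.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def split_slopes(yearly, break_year):
    """Return (slope up to and including break_year, slope after it)."""
    before = [(y, v) for y, v in yearly if y <= break_year]
    after = [(y, v) for y, v in yearly if y > break_year]
    return (
        slope([y for y, _ in before], [v for _, v in before]),
        slope([y for y, _ in after], [v for _, v in after]),
    )


# Made-up yearly mean values (arbitrary units), NOT the extracted data.
yearly_means = [
    (2005, 1.0), (2006, 1.0), (2007, 1.1), (2008, 1.0), (2009, 1.1),
    (2010, 1.2), (2011, 1.4), (2012, 1.6), (2013, 1.8), (2014, 2.0), (2015, 2.2),
]

slope_before, slope_after = split_slopes(yearly_means, break_year=2009)
print(round(slope_before, 3), round(slope_after, 3))
```

A clearly larger second slope would support the visual impression, although with so few points per segment it should be read as a rough indication only.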
This type of data extraction and visualization can be done using this simple methodology. The study could be extended to other cities in order to learn the trends of AOD, NO2 and CO there. I need to thank Manoj Diwakaran and Arun Kale for generating these graphs and data. If any reader has doubts, please mail me at shivaprakash.ssy@gmail.com; I would be more than happy to help. Happy learning and happy ESDA.