Data analysis is the process of cleaning, transforming, and modeling data to discover useful information for business decision-making. Its purpose is to extract useful information from data and to make decisions based on that analysis.
Analyzing meteorological data helps us predict future weather conditions. Being able to predict and forecast the weather also allows data to be gathered to build a more detailed picture of a nation’s climate and the trends within it.
Dataset:
The dataset contains hourly temperature readings recorded over roughly 10 years, from 2006-04-01 00:00:00.000 +0200 to 2016-09-09 23:00:00.000 +0200. It corresponds to Finland, a country in Northern Europe.
Download the weather dataset from this Google Drive link.
Null Hypothesis (H0)
“Do the apparent temperature and humidity, compared monthly across 10 years of data, indicate an increase due to global warming?”
H0 means we need to find out whether the average apparent temperature for a given month, say April, and the average humidity for the same month have increased from 2006 to 2016. This monthly analysis has to be done for all 12 months over the 10-year period. So you are basically resampling the data from hourly to monthly, then comparing the same month across the 10-year period. Support your analysis with appropriate visualizations using the matplotlib and/or seaborn libraries.
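The resampling step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the original notebook code: the column names `Apparent Temperature (C)` and `Humidity` and all values are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data standing in for the real dataset; the values and the
# column names below are assumptions for illustration only.
index = pd.date_range("2006-01-01 00:00", "2008-12-31 23:00", freq="h", tz="UTC")
rng = np.random.default_rng(0)
hourly = pd.DataFrame(
    {
        "Apparent Temperature (C)": rng.normal(5.0, 10.0, len(index)),
        "Humidity": rng.uniform(0.4, 1.0, len(index)),
    },
    index=index,
)

# Resample from hourly to monthly by averaging all readings within each month.
# "MS" labels each monthly bin with the first day of that month.
monthly = hourly.resample("MS").mean()

# Keep only the April rows, one per year, to compare the same month over time.
april = monthly[monthly.index.month == 4]
print(april)
```

Changing the `4` in the last filter to any other month number gives the corresponding year-by-year comparison for that month.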
Data Analysis with Python:
Step 1: Importing the Necessary Libraries & Data
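A minimal sketch of this step, assuming the data arrives as a CSV file. The inline sample, its column names, and the file name `weather.csv` are hypothetical stand-ins so the snippet runs on its own.

```python
import io

import pandas as pd

# Tiny inline sample standing in for the downloaded CSV. The column names and
# values are assumptions for illustration, not taken from the real file.
sample_csv = io.StringIO(
    "Formatted Date,Apparent Temperature (C),Humidity\n"
    "2006-04-01 00:00:00.000 +0200,7.39,0.89\n"
    "2006-04-01 01:00:00.000 +0200,7.23,0.86\n"
    "2006-04-01 02:00:00.000 +0200,9.38,0.89\n"
)

# With the real file, pass its path instead, e.g. pd.read_csv("weather.csv")
# (hypothetical file name).
df = pd.read_csv(sample_csv)

# Quick sanity checks: number of rows and columns, and the inferred dtypes.
print(df.shape)
print(df.dtypes)
```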
Step 2: Data Cleaning
2.1 Find all missing values in the dataset and fill them accordingly
Here I’ve changed all missing values to ‘N/A or na or NA or n/a’ to avoid any complications during analysis.
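One possible reading of this step is that the four spellings above are markers for missing entries. Under that assumption, pandas can treat them as missing while loading, and the gaps can then be counted per column. The sample data and column names are hypothetical.

```python
import io

import pandas as pd

# Inline sample with two deliberately missing entries (hypothetical data).
sample_csv = io.StringIO(
    "Formatted Date,Apparent Temperature (C),Humidity\n"
    "2006-04-01 00:00:00.000 +0200,7.39,0.89\n"
    "2006-04-01 01:00:00.000 +0200,na,0.86\n"
    "2006-04-01 02:00:00.000 +0200,9.38,N/A\n"
    "2006-04-01 03:00:00.000 +0200,5.94,0.83\n"
)

# Spellings to treat as missing, in addition to the pandas defaults.
missing_markers = ["N/A", "na", "NA", "n/a"]
df = pd.read_csv(sample_csv, na_values=missing_markers)

# Count the missing values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

How the gaps are filled afterwards (dropping rows, interpolating, and so on) depends on the data and is not shown here.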
2.2 Change the format of the data for better analysis
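A sketch of what this reformatting might look like, assuming it means parsing the timestamp strings into a datetime index. The column name `Formatted Date` is an assumption. The timestamp layout follows the dataset description, and converting to UTC keeps rows with different offsets comparable.

```python
import pandas as pd

# Hypothetical rows using the timestamp layout from the dataset description.
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-01 00:00:00.000 +0200",
            "2006-04-01 01:00:00.000 +0200",
            "2006-10-29 02:00:00.000 +0100",
        ],
        "Apparent Temperature (C)": [7.39, 7.23, 4.10],
        "Humidity": [0.89, 0.86, 0.93],
    }
)

# Parse the strings into timezone-aware timestamps. utc=True converts every
# row to UTC, which avoids mixing +0100 and +0200 offsets in one column.
df["Formatted Date"] = pd.to_datetime(
    df["Formatted Date"], format="%Y-%m-%d %H:%M:%S.%f %z", utc=True
)

# A sorted DatetimeIndex is what time-based resampling expects.
df = df.set_index("Formatted Date").sort_index()
print(df.index)
```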
2.3 Remove unwanted columns (all except temperature and humidity)
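Selecting the columns to keep is often simpler than dropping the others one by one. The sketch below assumes the two columns of interest are named `Apparent Temperature (C)` and `Humidity`; the other column names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with a few extra columns that are not needed.
df = pd.DataFrame(
    {
        "Summary": ["Partly Cloudy", "Mostly Cloudy"],
        "Apparent Temperature (C)": [7.39, 7.23],
        "Humidity": [0.89, 0.86],
        "Wind Speed (km/h)": [14.1, 14.3],
    }
)

# Keep only the columns used in the analysis; everything else is discarded.
columns_to_keep = ["Apparent Temperature (C)", "Humidity"]
df = df[columns_to_keep]
print(df.columns.tolist())
```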
Step 3: Plotting of Data
3.1 Plot the whole dataset for all months
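A minimal matplotlib sketch of such a plot, using synthetic monthly averages in place of the real ones. The column names and values are assumptions. The two quantities are drawn on separate axes because their scales differ.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly averages with a seasonal cycle (made-up values).
index = pd.date_range("2006-01-01", periods=36, freq="MS")
month_angle = 2 * np.pi * (index.month.to_numpy() - 1) / 12
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": 8.0 - 12.0 * np.cos(month_angle),
        "Humidity": 0.75 + 0.10 * np.cos(month_angle),
    },
    index=index,
)

# One panel per quantity, sharing the time axis.
fig, (ax_temp, ax_hum) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))
ax_temp.plot(monthly.index, monthly["Apparent Temperature (C)"], color="tab:red")
ax_temp.set_ylabel("Apparent temperature (C)")
ax_hum.plot(monthly.index, monthly["Humidity"], color="tab:blue")
ax_hum.set_ylabel("Humidity")
ax_hum.set_xlabel("Month")
fig.suptitle("Monthly averages (synthetic data)")
```

In a notebook, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure.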
3.2 Monthly analysis for all 12 months over the 10-year period
1. January
2. February
3. March
4. April
5. May
6. June
7. July
8. August
9. September
10. October
11. November
12. December
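The twelve per-month plots can be produced with one small helper instead of repeating the plotting code. The sketch below uses synthetic monthly averages; `plot_month` is a hypothetical helper name, and the column names and values are assumptions.

```python
import calendar

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly averages covering four years (made-up values).
index = pd.date_range("2006-01-01", periods=48, freq="MS")
rng = np.random.default_rng(1)
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": rng.normal(5.0, 8.0, len(index)),
        "Humidity": rng.uniform(0.6, 0.9, len(index)),
    },
    index=index,
)


def plot_month(monthly, month, ax):
    """Plot one calendar month's averages across all years on the given axes."""
    one_month = monthly[monthly.index.month == month]
    years = one_month.index.year
    ax.plot(years, one_month["Apparent Temperature (C)"], marker="o",
            label="Apparent temperature (C)")
    ax.plot(years, one_month["Humidity"], marker="o", label="Humidity")
    ax.set_title(calendar.month_name[month])
    ax.set_xticks(list(years))
    return one_month


# One panel per month, January through December.
fig, axes = plt.subplots(3, 4, figsize=(16, 9), sharex=True)
per_month = {}
for month, ax in zip(range(1, 13), axes.ravel()):
    per_month[month] = plot_month(monthly, month, ax)
axes.ravel()[0].legend()
fig.tight_layout()
```

Each entry in `per_month` holds the filtered values for that month, which makes it easy to inspect the numbers behind a panel.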
Conclusion:
- There is almost no change in average humidity over the 10 years for any month.
- The average apparent temperature increased in 2009, dropped again in 2010, rose slightly in 2011, and dropped significantly in 2015. Hence we can conclude that global warming has caused uncertainty in temperature over the past 10 years, while the average humidity has remained constant throughout.
- The null hypothesis (H0), that both temperature and humidity increase due to global warming, is contradicted by these results, so the null hypothesis is rejected.