I’m interested in Estonia’s renewable energy generation, using this dataset: https://andmed.stat.ee/en/stat/majandus__energeetika__energia-tehususe-naitajad/KE36
My questions are:
What is the average percentage of electricity generated from renewable energy from 1999 to 2020?
Which year had the lowest percentage of electricity generated from renewable energy?
Which year had the highest percentage of electricity generated from renewable energy?
What is the distribution of the percentage of electricity generated from renewable energy from 1999 to 2020?
What trend in electricity generated from renewable energy is likely after 2020?
I downloaded the CSV data and imported it into R as a tibble.
library(tidyverse) # Provides read_csv(), as_tibble() and ggplot() used below
rawData <- read_csv("KE36_20221019-135752.csv", show_col_types = FALSE)
rawData
## # A tibble: 1 × 24
## Indica…¹ `1999` `2000` `2001` `2002` `2003` `2004` `2005` `2006` `2007` `2008`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Electri… 0 0.1 0.1 0.5 0.6 0.5 1.1 1.4 1.4 2
## # … with 13 more variables: `2009` <dbl>, `2010` <dbl>, `2011` <dbl>,
## # `2012` <dbl>, `2013` <dbl>, `2014` <dbl>, `2015` <dbl>, `2016` <dbl>,
## # `2017` <dbl>, `2018` <dbl>, `2019` <dbl>, `2020` <dbl>, `2021` <chr>, and
## # abbreviated variable name ¹​Indicator
The raw data is not easy to use in this wide form. I want to transpose the table, swapping rows and columns, and clean up the unused data. I will remove the indicator column because there is only one indicator: electricity generated from renewable energy sources.
I treat year as discrete numerical data, since the interval between consecutive years can be treated as equal. This also allows us to plot a line chart to see the change from year to year.
I treat percentage as continuous numerical data because it can take any value between 0 and 100, including decimals.
cleanedData <- as_tibble(cbind(names(rawData), t(rawData))) # Transpose table
colnames(cleanedData) <- c('year', 'percentage') # Rename column
cleanedData <- cleanedData[2:23,] # Keep rows 2-23: drop the first row (indicator label) and the last row (2021, read as text)
cleanedData$year <- as.integer(cleanedData$year) # Change year type to integer
cleanedData$percentage <- as.double(cleanedData$percentage) # Change percentage type to double
cleanedData
## # A tibble: 22 × 2
## year percentage
## <int> <dbl>
## 1 1999 0
## 2 2000 0.1
## 3 2001 0.1
## 4 2002 0.5
## 5 2003 0.6
## 6 2004 0.5
## 7 2005 1.1
## 8 2006 1.4
## 9 2007 1.4
## 10 2008 2
## # … with 12 more rows
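The same reshaping can also be written with `pivot_longer()` from tidyr, which avoids going through a character matrix. The following is only a sketch of that alternative, not the code used above. `rawSample` is a hypothetical stand-in with the same one-row wide shape, filled with the 1999-2003 values printed earlier, and the `".."` text in the `2021` column is an assumption about why that column was read as `<chr>`.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical stand-in for rawData: one wide row, values for 1999-2003 taken
# from the printed output. The ".." placeholder for 2021 is an assumption.
rawSample <- tibble(
  Indicator = "Electricity generated from renewable energy sources",
  `1999` = 0, `2000` = 0.1, `2001` = 0.1, `2002` = 0.5, `2003` = 0.6,
  `2021` = ".."
)

tidySample <- rawSample %>%
  select(-Indicator) %>%                            # only one indicator, so drop it
  mutate(across(everything(), as.character)) %>%    # unify column types before stacking
  pivot_longer(everything(), names_to = "year", values_to = "percentage") %>%
  mutate(
    year = as.integer(year),
    percentage = suppressWarnings(as.double(percentage)) # non-numeric text becomes NA
  ) %>%
  filter(!is.na(percentage))                        # drop years without a numeric value

print(tidySample)
```

With this approach the rows to drop are chosen by content (no numeric value) rather than by position, so the code would not need changing if a later download contained more years.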
First, I want to find the mean and median of the percentage. The mode is not that useful since percentage is continuous data.
mean(cleanedData$percentage)
## [1] 8.954545
median(cleanedData$percentage)
## [1] 8.15
Q1 The average percentage of electricity generated from renewable energy from 1999 to 2020 is 8.95%.
Next, I want to get the rows with the minimum and maximum percentage.
cleanedData[cleanedData$percentage == min(cleanedData$percentage),] # Record with min percentage
## # A tibble: 1 × 2
## year percentage
## <int> <dbl>
## 1 1999 0
cleanedData[cleanedData$percentage == max(cleanedData$percentage),] # Record with max percentage
## # A tibble: 1 × 2
## year percentage
## <int> <dbl>
## 1 2020 28.3
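Filtering with `== min(...)` returns every row that ties for the extreme value. If only a single row is wanted, `which.min()` and `which.max()` are a possible alternative, since they return the index of the first extreme. A minimal sketch follows, where `sampleData` is a hypothetical data frame holding only the 1999-2008 values printed in the cleaned table, so its maximum is not the maximum of the full series.

```r
# Hypothetical subset: the first ten rows printed in the cleaned table (1999-2008)
sampleData <- data.frame(
  year = 1999:2008,
  percentage = c(0, 0.1, 0.1, 0.5, 0.6, 0.5, 1.1, 1.4, 1.4, 2)
)

# which.min()/which.max() return the position of the first minimum/maximum
lowest <- sampleData[which.min(sampleData$percentage), ]
highest <- sampleData[which.max(sampleData$percentage), ]

print(lowest)
print(highest)
```

Which form is preferable depends on whether tied years should all be reported. The `==` filter used above is the safer choice when ties matter.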
Q2 Estonia generated the lowest percentage of electricity from renewable energy in 1999, with a value of 0%.
Q3 Estonia generated the highest percentage of electricity from renewable energy in 2020, with a value of 28.3%.
sd(cleanedData$percentage)
## [1] 8.713685
Next, I plot a histogram of the percentage to see the distribution.
ggplot(cleanedData, aes(x=percentage)) +
geom_histogram(binwidth = 5, boundary = 0) +
ggtitle("Distribution of annual renewable electricity generated")
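The counts behind such a histogram can also be tabulated directly, which makes it easier to quote exact numbers per bin. The sketch below uses `cut()` with 5-point-wide bins starting at 0, which should mirror `binwidth = 5, boundary = 0` under the default right-closed bins. `samplePct` is a hypothetical vector holding only the ten 1999-2008 values printed earlier, so the counts are for that subset, not for all 22 years.

```r
# Hypothetical subset: percentages for 1999-2008 as printed in the cleaned table
samplePct <- c(0, 0.1, 0.1, 0.5, 0.6, 0.5, 1.1, 1.4, 1.4, 2)

# Bins of width 5 from 0 to 30; include.lowest = TRUE keeps the 0% value in the first bin
bins <- cut(samplePct, breaks = seq(0, 30, by = 5), include.lowest = TRUE)

# Number of years falling into each bin
binCounts <- table(bins)
print(binCounts)
```

Applied to the full series, the same call would be `table(cut(cleanedData$percentage, breaks = seq(0, 30, by = 5), include.lowest = TRUE))`.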
Q4 From the histogram, we can see that in the largest group of years (10 out of 22) Estonia produced less than 5% of its electricity from renewable energy, while in only 2 years did it produce 20% or more. This makes the distribution strongly right-skewed.
Finally, I plot a line chart, which allows us to easily see the change from year to year and the trend for the future.
ggplot(cleanedData, aes(x=year, y=percentage)) +
geom_line(color='blue') +
ggtitle("Annual percentage of electricity generated from renewable energy sources")
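A line chart shows the direction of the trend only visually. One way to put a number on it, which is not something this analysis does, is to fit a straight line with `lm()` and read off the slope in percentage points per year. The sketch below is an illustration under two assumptions: that a linear trend is a reasonable description, and that `sampleData`, a hypothetical data frame with only the 1999-2008 values printed earlier, stands in for the full series.

```r
# Hypothetical subset: the first ten rows printed in the cleaned table (1999-2008)
sampleData <- data.frame(
  year = 1999:2008,
  percentage = c(0, 0.1, 0.1, 0.5, 0.6, 0.5, 1.1, 1.4, 1.4, 2)
)

# Ordinary least squares fit of percentage against year
fit <- lm(percentage ~ year, data = sampleData)

# Slope: average change in percentage points per year over the sample
slope <- coef(fit)[["year"]]
print(slope)
```

On the full data the call would be `lm(percentage ~ year, data = cleanedData)`. A positive slope supports an upward trend, but a straight line may fit poorly if growth accelerates, so any projection beyond 2020 from such a fit should be read with caution.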
Q5 From the line chart, we can see that the proportion of electricity generated from renewable energy is likely to keep increasing in the future.
The percentage of electricity generated from renewable energy in Estonia is increasing, which is a good sign for both people and the environment. But this percentage alone is not enough; answering the following questions should allow us to see the bigger picture.