y <- read.csv("KK032_20221029-163824.csv")
y
##   Year                                         Indicator Wastewater.management
## 1 2010 National expenditure for environmental protection                  97.2
## 2 2014 National expenditure for environmental protection                 125.2
## 3 2015 National expenditure for environmental protection                 138.1
## 4 2016 National expenditure for environmental protection                 126.8
## 5 2017 National expenditure for environmental protection                 126.9
## 6 2018 National expenditure for environmental protection                 138.9
## 7 2019 National expenditure for environmental protection                 132.2
y <- y[,-2]
y
##   Year Wastewater.management
## 1 2010                  97.2
## 2 2014                 125.2
## 3 2015                 138.1
## 4 2016                 126.8
## 5 2017                 126.9
## 6 2018                 138.9
## 7 2019                 132.2
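The cleanup step above drops the second column by position. As a hedged alternative, the sketch below drops it by name, which keeps working if the column order in the CSV ever changes. The data frame `raw` is rebuilt by hand from the printed output so the example runs without the CSV file; the names `raw` and `cleaned` are illustrative assumptions, not part of the original script.

```r
# Rebuild the raw data by hand (values copied from the printed output above)
raw <- data.frame(
  Year = c(2010L, 2014L, 2015L, 2016L, 2017L, 2018L, 2019L),
  Indicator = rep("National expenditure for environmental protection", 7),
  Wastewater.management = c(97.2, 125.2, 138.1, 126.8, 126.9, 138.9, 132.2)
)

# Confirm that Indicator carries no information: it has a single unique value
length(unique(raw$Indicator))

# Keep every column except Indicator, selecting by name rather than position
cleaned <- raw[, setdiff(names(raw), "Indicator")]
names(cleaned)
```

Selecting by name makes the intent of the cleanup explicit, at the cost of a slightly longer line than `y[,-2]`.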
a) What does the data represent?
ans: The data show national expenditure on wastewater management for 2010 and for each year from 2014 to 2019.
b) Does my data need cleanup?
ans: Yes. The Indicator column holds the same value in every row, so I removed it with y <- y[,-2] and kept only Year and Wastewater.management.
c) How many rows and columns are present in my data?
ans: 7 rows and 2 columns after cleanup (7 rows and 3 columns before).
d) In which year was national expenditure on wastewater management highest between 2010 and 2019?
ans: 2018 (138.9)
e) What data types are represented in my data?
ans: integer (Year), character (Indicator, removed during cleanup), numeric/double (Wastewater.management)
f) Check the first 6 rows
g) Check the last 6 rows
To check the first 6 rows of my data, I use the function head().
head(y)
##   Year Wastewater.management
## 1 2010                  97.2
## 2 2014                 125.2
## 3 2015                 138.1
## 4 2016                 126.8
## 5 2017                 126.9
## 6 2018                 138.9
To check the last 6 rows of my data, I use the function tail().
tail(y)
##   Year Wastewater.management
## 2 2014                 125.2
## 3 2015                 138.1
## 4 2016                 126.8
## 5 2017                 126.9
## 6 2018                 138.9
## 7 2019                 132.2
To check the structure of my data (the number of observations and the type of each column), I use the function str().
str(y)
## 'data.frame': 7 obs. of 2 variables:
## $ Year : int 2010 2014 2015 2016 2017 2018 2019
## $ Wastewater.management: num 97.2 125.2 138.1 126.8 126.9 ...
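The answers to c), d) and e) can also be checked directly in R instead of being read off the printed table. The following is a minimal sketch under the assumption that the cleaned data look exactly like the output shown; `y_check` and `peak_row` are illustrative names, and the data frame is rebuilt by hand so the example runs without the CSV file.

```r
# Rebuild the cleaned data by hand (values copied from the printed output above)
y_check <- data.frame(
  Year = c(2010L, 2014L, 2015L, 2016L, 2017L, 2018L, 2019L),
  Wastewater.management = c(97.2, 125.2, 138.1, 126.8, 126.9, 138.9, 132.2)
)

# c) number of rows and columns
dim(y_check)

# d) year with the highest expenditure:
#    which.max() returns the row index of the largest value
peak_row <- which.max(y_check$Wastewater.management)
y_check$Year[peak_row]

# e) class of each column
sapply(y_check, class)
```

Using `which.max()` avoids scanning the table by eye, which matters more once a data set has more than a handful of rows.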
3. Provide a brief descriptive statistical analysis of your data set (such as measures of central tendency and dispersion).
mean(y[, 2])
## [1] 126.4714
median(y[,2])
## [1] 126.9
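The mean and median cover central tendency, but the question also asks for dispersion. A short sketch of the usual base R measures follows, assuming the expenditure values are exactly those printed earlier; the vector name `x` is an illustrative assumption and stands in for `y[, 2]`.

```r
# Expenditure values copied from the printed output (equivalent to y[, 2])
x <- c(97.2, 125.2, 138.1, 126.8, 126.9, 138.9, 132.2)

# Smallest and largest value, and the distance between them
range(x)
diff(range(x))

# Sample variance and sample standard deviation (both use n - 1)
var(x)
sd(x)

# Interquartile range: spread of the middle 50% of the values
IQR(x)

# Five-number summary plus the mean in one call
summary(x)
```

Because the 2010 value (97.2) lies well below the others, the interquartile range is likely a more robust description of spread here than the standard deviation.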
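# Histogram of the expenditure values
hist(y[, 2], main = "Wastewater management expenditure", xlab = "Expenditure")
# Boxplot to show the spread and any outliers
boxplot(y[, 2], ylab = "Expenditure")
# Bar plot of expenditure, one bar per year
barplot(y[, 2], names.arg = y$Year, xlab = "Year", ylab = "Expenditure")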