group by - Misconception regarding the group_by and summarize functions in R [dplyr package] -


I have to plot a graph of fatalities per year. I extracted the year from the date, grouped by it, and summarized the fatalities per year. But when I run the code, it gives me the total fatalities for the whole dataset instead.

I don't understand why. Is there another way to get fatalities per year?

In the dataset, fatalities are given per incident, and many incidents happened every year.

    > library(dplyr)
    > crash_data <- read.csv("https://raw.githubusercontent.com/gluque/analytics_task2/master/airplane_crashes_and_fatalities_since_1908.csv")
    > # parse the date column, then keep only the four-digit year
    > crash_data$Date <- as.Date(crash_data$Date, "%m/%d/%Y")
    > crash_data$Date <- format(crash_data$Date, '%Y')
    > cd <- subset(crash_data, select = c(Fatalities, Date))
    > ab <- group_by(cd, Date)
    > ef <- summarize(ab, Fatalities = sum(Fatalities, na.rm = TRUE))
    > ef
      Fatalities
    1     105479

    > group_by(cd, Date) %>% summarize(Fatalities = sum(Fatalities, na.rm = TRUE))
    #
    # A tibble: 98 x 2
    #       Date Fatalities
    #      <chr>      <int>
    #  1   1908          1
    #  2   1912          5
    #  3   1913         45
    #  4   1915         40
    #  5   1916        108
    #  6   1917        124
    #  7   1918         65
    #  8   1919          5
    #  9   1920         24
    #  10  1921         68
    # ... with 88 more rows
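As for the other way to get fatalities per year that the question asks about, here is a minimal sketch in base R using `aggregate()`, with no dplyr involved. The small `cd` data frame below is made up for illustration and only mimics the `Date` and `Fatalities` columns of the real data; it is not taken from the crash dataset.

```r
# Toy stand-in for the crash data (hypothetical values, not from the real CSV).
cd <- data.frame(
  Date = c("1908", "1912", "1912", "1913"),
  Fatalities = c(1, 5, NA, 45),
  stringsAsFactors = FALSE
)

# Sum Fatalities within each Date (year).
# na.action = na.pass keeps rows containing NA so that na.rm = TRUE,
# which is passed on to sum(), is what actually handles the missing values.
per_year <- aggregate(
  Fatalities ~ Date,
  data = cd,
  FUN = sum,
  na.rm = TRUE,
  na.action = na.pass
)

print(per_year)
```

On the real data the same call should work with `data = cd` as built in the question. If the dplyr version still collapses to a single row, one possible cause (an assumption, the post does not confirm it) is that `summarize` is being picked up from another attached package such as plyr; writing `dplyr::summarize(...)` makes the call unambiguous.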
