Data Science Based on R Studio Discussion
Lab: Airplane!
The dataset
The dataset “Airplane_Crashes_and_Fatalities_Since_1908.csv” includes airplane crash data for all incidents
since 1908. It was downloaded from here: https://www.kaggle.com/cgurkan/airplane-crash-data-since-1908
The dataset will be loaded into a data frame called airplane.
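A minimal sketch of the loading step, assuming the CSV sits in the working directory under the file name given above. The helper name read_crashes and the choice to treat empty strings as missing values are assumptions made for illustration, not part of the original lab.

```r
# File name as given in the text; assumed to be in the working directory.
csv_path <- "Airplane_Crashes_and_Fatalities_Since_1908.csv"

# Hypothetical helper: reads the CSV and treats empty fields as NA.
read_crashes <- function(path) {
  read.csv(path, stringsAsFactors = FALSE, na.strings = c("", "NA"))
}

# Only attempt the load if the file is actually present.
if (file.exists(csv_path)) {
  airplane <- read_crashes(csv_path)
  str(airplane)  # quick check of column names and types
}
```

Calling str() right after loading is a quick way to confirm which columns arrived as text and which as numbers before deriving new variables.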
Having done this, we can create a new categorical variable, time of day, that takes the values day or night.
airplane$dayornight = … 6 & airplane$Time … "1985-01-01") %>%
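The idea behind the day/night variable can be sketched on a small made-up data frame. This assumes Time holds the hour of the crash as a number from 0 to 23, and it assumes that "day" means from 6 up to (but not including) 18. Only the lower cutoff of 6 appears in the text, so the upper cutoff of 18 is a guess, and the toy values are invented.

```r
# Toy data (invented): hour of the crash, with one unknown value.
toy <- data.frame(Time = c(5, 6, 12, 17, 18, 23, NA))

# Assumed rule: 6 <= hour < 18 is "day", anything else is "night".
# A missing hour makes the comparison NA, so ifelse() returns NA too.
toy$dayornight <- ifelse(toy$Time >= 6 & toy$Time < 18, "day", "night")

toy
```

Because comparisons with NA yield NA, rows with an unknown hour end up with an unknown day/night value rather than being silently labelled "night".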
We can now look at what we have.
Company dayornight Time Aboard Fatalities very_deadly
## 1 Boeing
## 2 Boeing
## 3 Boeing
Notice that wherever the Time (hour) is missing/unknown, the day/night variable is also unknown. We can
see that this happens 18 times.
##  18
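A count like the one above could be obtained along the following lines. The vector below is invented for illustration; on the real data the same calls would be applied to airplane$dayornight.

```r
# Toy day/night values (invented), two of them unknown.
dayornight <- c("day", NA, "night", NA, "day")

# is.na() flags the unknown entries; sum() counts the TRUE values.
n_unknown <- sum(is.na(dayornight))
n_unknown

# table() can also report the NA count alongside the known categories.
table(dayornight, useNA = "ifany")
```

By default table() drops missing values, so useNA = "ifany" is needed to see them in the tally.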
Two-way tables: Does the company matter?
Now, let’s look at a two-way table of the plane company and whether the crash was “very deadly” according
to the given criterion (more than 10 people aboard died). We can use table() as follows. Note that we
will store this table for use later on.
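For readers who want to try the idea on something small first, here is a sketch on made-up data. The company names other than Boeing, the fatality counts, and the object name company_deadly are all invented for illustration. The very_deadly flag follows the criterion stated above, more than 10 deaths.

```r
# Toy data (invented): manufacturer and number of fatalities per crash.
toy <- data.frame(
  Company    = c("Boeing", "Boeing", "Airbus", "Airbus", "Boeing", "Douglas"),
  Fatalities = c(120, 4, 11, 0, 35, 10)
)

# Criterion from the text: "very deadly" means more than 10 deaths.
toy$very_deadly <- toy$Fatalities > 10

# Two-way table of company by very_deadly, stored for later use.
company_deadly <- table(Company = toy$Company, very_deadly = toy$very_deadly)
company_deadly
```

Naming the arguments to table() labels the two dimensions, which makes the printed table and any later indexing easier to read.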