Anu Anu - 8 days ago 4
R Question

T-tests in R: unable to run together

I have an airline dataset from stat-computing.org (http://stat-computing.org/dataexpo/2009/the-data.html), which I am trying to analyse.

There are variables DepTime and ArrDelay (departure time and arrival delay). I am trying to analyse how arrival delay varies across chunks of departure time. My objective is to find which time chunks a person should avoid when booking tickets in order to avoid arrival delays.

My understanding: if a one-tailed t-test between arrival delays for DepTime > 1800 and arrival delays for DepTime > 1900 shows high significance, it means that one should avoid flights between 1800 and 1900 (please correct me if I am wrong). I want to run such tests for all departure hours.

I am totally new to programming and data science. Any help would be much appreciated.

Data looks like this. The highlighted columns are the ones I am analysing

[image: screenshot of the data]

Answer

Sharing an image of the data is not the same as providing the data for us to work with...

That said, I went and grabbed one year of data and worked this up.

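# Read one year of the airline data (adjust the path to your download).
flights <- read.csv("~/Downloads/1995.csv", header = TRUE)

flights <- flights[, c("DepTime", "ArrDelay")]
# DepTime is in hhmm format; subtracting 30 before rounding to the nearest
# hundred gives the departure hour (e.g. 1859 -> 1800).
flights$Dep <- round(flights$DepTime - 30, digits = -2)
head(flights, n = 25)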

# This tests each hour of departures against the entire day. 
# Alternative is set to "less" because we want to know if a given hour
# has less delay than the day as a whole.

pVsDay <- tapply(flights$ArrDelay, flights$Dep, 
                 function(x) t.test(x, flights$ArrDelay, alternative = "less"))

# This tests each hour of departures against every other hour of the day. 
# Alternative is set to "less" because we want to know if a given hour
# has less delay than the other hours.
pAllvsAll <- tapply(flights$ArrDelay, flights$Dep, function(x)
  tapply(flights$ArrDelay, flights$Dep, function(z)
    t.test(x, z, alternative = "less")))

I'll let you figure out multiple hypothesis testing and the like.
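As a starting point for the multiple-testing part, here is a minimal sketch of one way to go about it: pull the p-value out of each test object and adjust the whole set with p.adjust. The data below is simulated (made-up numbers, only there so the block runs without the CSV), and Holm's method is just one reasonable choice, not the only one.

```r
# Simulated stand-in for the real data (made-up numbers, for illustration only).
set.seed(1)
flights <- data.frame(
  Dep      = rep(c(600, 1200, 1800), each = 200),
  ArrDelay = c(rnorm(200, mean = 2,  sd = 10),
               rnorm(200, mean = 8,  sd = 10),
               rnorm(200, mean = 15, sd = 10))
)

# Same test as above: each hour against the whole day, one-sided.
pVsDay <- tapply(flights$ArrDelay, flights$Dep,
                 function(x) t.test(x, flights$ArrDelay, alternative = "less"))

# Each element is an "htest" object; keep only the p-values.
rawP <- sapply(pVsDay, function(res) res$p.value)

# Adjust for the number of tests run (Holm's method; "BH" is another option).
adjP <- p.adjust(rawP, method = "holm")

print(data.frame(hour = names(rawP), raw = rawP, holm = adjP))
```

With the real data you would skip the simulated data frame and run the last three steps on the pVsDay object you already have.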

[image: test results]

All vs All

[image: all-vs-all test results]
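The nested tapply returns a list of lists of test objects, which is awkward to read. If all you want is the p-values, something like the following sketch may be easier to work with: it builds a matrix where row i, column j holds the p-value for "hour i has less delay than hour j". Again the data is simulated (made-up numbers), and byHour and pMat are names I picked for illustration.

```r
# Simulated stand-in for the real data (made-up numbers, for illustration only).
set.seed(1)
flights <- data.frame(
  Dep      = rep(c(600, 1200, 1800), each = 200),
  ArrDelay = c(rnorm(200, mean = 2,  sd = 10),
               rnorm(200, mean = 8,  sd = 10),
               rnorm(200, mean = 15, sd = 10))
)

# One vector of arrival delays per departure hour.
byHour <- split(flights$ArrDelay, flights$Dep)

# Outer loop fills the columns (z), inner loop fills the rows (x), so
# pMat[i, j] is the p-value for "hour i has less delay than hour j".
pMat <- sapply(byHour, function(z)
  sapply(byHour, function(x) t.test(x, z, alternative = "less")$p.value))

# Comparing an hour with itself is not informative.
diag(pMat) <- NA

# Adjust all remaining p-values together (NAs are left as NA).
pMat[] <- p.adjust(pMat, method = "holm")

print(round(pMat, 4))
```

Small adjusted values in a row suggest that hour tends to have less delay than the hour in the column. Note that t.test will stop with an error for any hour that has fewer than two non-missing delays, so very sparse hours may need to be dropped first.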