There are 66 and 65 destination airports in the delay and cancellation datasets, respectively. For efficiency in the statistical analysis, we do not include destination airport as a predictor in our models.
However, it is still an interesting factor to explore.


Overall Trend

First, we check whether delay and cancellation counts differ across destination airports.


Flights from JFK to LAX have the most delays (2293), and flights to BGR have the fewest (6).


Flights from JFK to SFO have the most cancellations (76), and flights to BZN have the fewest (1).
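The highest and lowest counts quoted above can be obtained by counting rows per destination. The following is a minimal sketch on made-up toy data, not the actual delay dataset; it assumes one row per delayed flight and reuses the `destination_airport` column name from our plotting functions.

```r
library(dplyr)

# Toy stand-in for the delay data: one row per delayed flight (made-up values).
delay_toy <- tibble(
  destination_airport = c("LAX", "LAX", "LAX", "SFO", "SFO", "BGR")
)

# Count rows per destination and sort from most to fewest.
dest_counts <- delay_toy %>%
  count(destination_airport, name = "count") %>%
  arrange(desc(count))

# Destinations with the highest and lowest counts.
most_delayed  <- slice_max(dest_counts, count, n = 1)
least_delayed <- slice_min(dest_counts, count, n = 1)
```

The same pattern would apply to the cancellation data by swapping in the cancellation table.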


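library(tidyverse)  # dplyr verbs, forcats::fct_reorder(), stringr::str_c()
library(plotly)     # plot_ly(), layout()

# Bar chart of delay counts per destination airport for a single airline.
# `selected_airline` is matched against the `airline_name` column of `delay`.
delay_dest = function(selected_airline){
  delay %>%
    filter(
      airline_name == selected_airline
    ) %>%
    group_by(destination_airport) %>%
    summarize(
      count = n()
    ) %>%
    mutate(
      # Order bars by count and build the hover label
      destination_airport = fct_reorder(destination_airport, count),
      text_label = str_c("Airport: ", destination_airport, "\nCount: ", count)
    ) %>%
    plot_ly(x = ~destination_airport, y = ~count, text = ~text_label, hoverinfo = "text",
            color = ~destination_airport, type = "bar", alpha = .5) %>%
    layout(
      xaxis = list(title = "Destination Airport"),
      yaxis = list(title = "Count"),
      title = "Distribution of Delay by Destination Airport")
}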

# Bar chart of cancellation counts per destination airport for a single airline.
# `selected_airline` is matched against the `airline_name` column of `cancel`.
cancel_dest = function(selected_airline){
  cancel %>%
    filter(
      airline_name == selected_airline
    ) %>%
    group_by(destination_airport) %>%
    summarize(
      count = n()
    ) %>%
    mutate(
      # Order bars by count and build the hover label
      destination_airport = fct_reorder(destination_airport, count),
      text_label = str_c("Airport: ", destination_airport, "\nCount: ", count)
    ) %>%
    plot_ly(x = ~destination_airport, y = ~count, text = ~text_label, hoverinfo = "text",
            color = ~destination_airport, type = "bar", alpha = .5) %>%
    layout(
      xaxis = list(title = "Destination Airport"),
      yaxis = list(title = "Count"),
      title = "Distribution of Cancellation by Destination Airport")
}


Stratification by Airline

We can also examine whether delay and cancellation counts across destination airports differ by airline.

Alaska

American

Delta

Endeavor

JetBlue

Republic

United


LAX and SFO

We found that LAX and SFO have notably high delay and cancellation counts, so we decided to take a closer look at the underlying factors behind those delays and cancellations.


We can clearly observe that most delay times fall below 180 minutes. For the following explorations, we restrict the data to delays between 0 and 180 minutes.
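A minimal sketch of this restriction is shown here on made-up toy data. The column name `delay_minutes` is hypothetical, and treating both endpoints (0 and 180) as included is an assumption.

```r
library(dplyr)

# Toy stand-in for the LAX/SFO delay data (made-up values).
# `delay_minutes` is a hypothetical column name.
delay_toy <- tibble(
  destination_airport = c("LAX", "LAX", "SFO", "SFO", "SFO"),
  delay_minutes       = c(12, 240, 0, 180, 95)
)

# Keep delays from 0 to 180 minutes; between() includes both endpoints.
delay_trimmed <- delay_toy %>%
  filter(between(delay_minutes, 0, 180))
```

Here the 240-minute row is dropped and the other four rows are kept.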


Delay by Airline

The airlines that depart from JFK to the two airports are different, and there is no distinct trend in delay time (in minutes) among airlines between the two airports.

Delay by Month

Both airports show an increasing trend in delay minutes from November to January. There is no distinct difference in delay minutes across months between the two airports.
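One way to compare the monthly pattern numerically is to summarize delay minutes per airport and month. The sketch below uses made-up toy values; the column names `month` and `delay_minutes` are hypothetical, and the median is chosen here only as an illustrative summary.

```r
library(dplyr)

# Toy stand-in for the LAX/SFO delay data (made-up values).
# `month` and `delay_minutes` are hypothetical column names.
delay_toy <- tibble(
  destination_airport = rep(c("LAX", "SFO"), each = 6),
  month = factor(rep(c("Nov", "Dec", "Jan"), times = 4),
                 levels = c("Nov", "Dec", "Jan")),  # chronological order
  delay_minutes = c(10, 20, 30, 14, 24, 40,
                    12, 22, 28, 16, 26, 36)
)

# Median delay per airport and month, one row per combination.
monthly_delay <- delay_toy %>%
  group_by(destination_airport, month) %>%
  summarize(median_delay = median(delay_minutes), .groups = "drop")
```

Storing the month as a factor with explicit levels keeps November, December, and January in chronological rather than alphabetical order.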

Cancellation by Hour

We can observe a distinct difference in cancellation counts by scheduled hour between the two airports.
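The hourly counts behind this comparison can be tabulated as follows. This is a sketch on made-up toy data, assuming one row per cancelled flight; `scheduled_hour` is a hypothetical column name.

```r
library(dplyr)

# Toy stand-in for the LAX/SFO cancellation data (made-up values).
# `scheduled_hour` is a hypothetical column name.
cancel_toy <- tibble(
  destination_airport = c("LAX", "LAX", "LAX", "SFO", "SFO", "SFO"),
  scheduled_hour      = c(8, 8, 17, 8, 20, 20)
)

# Number of cancellations per airport and scheduled hour.
hourly_cancel <- cancel_toy %>%
  count(destination_airport, scheduled_hour, name = "count")
```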