Skytrax Data #1
%matplotlib inline
Skytrax ratings¶
Following up on this (http://www.quangn.com/exploring-reviews-of-airline-services/) investigation into Skytrax ratings, which are a collection of ratings by made by airline passengers, I wanted to answer some interesting questions I had of my own.
Specifically, what I was interested in is the relative contribution of each of the different metrics (e.g., ground staff, seat comfort, value) to (a) a passenger's overall rating of the airline, and (b) their decision to recommend the airline to another person. Consequently, which criteria should airlines focus on to improve customers' overall perceptions of them (the equivalent of model lift in logit regression)?
Along the way, I thought I'd sate my own curiousity about the following:
- How do passenger ratings differ between types of aircraft?
- How do passenger ratings differ by cabin class?
- How do passenger artings differ depending on where they come from?
- How have ratings changed over time?
But first, let's import the data:
import psycopg2
import pandas as pd
import numpy as np
from sqlalchemy import create_engine
from matplotlib import pyplot as plt
import math
I first loaded the data (obtained from https://github.com/quankiquanki/skytrax-reviews-dataset) into a postgres database. Pandas has a very helpful .to_sql() method that makes this easy:
#This snippet of code loads all the .csv files into a postgres database that I'd set up.
engine = create_engine("postgres://skytraxadmin:skytraxadmin@localhost/skytrax")
databases = ['airline', 'airport', 'seat', 'lounge'] #csv files
for x in databases:
data = pd.read_csv("{0}.csv".format(x))
data.to_sql(x, engine, index = False)
#This connects to the postgres database above.
connection = psycopg2.connect("dbname = skytrax user=skytraxadmin password=skytraxadmin")
cursor = connection.cursor()
Ratings by airline¶
The first thing I was interested in doing was a sanity check to just make sure that I was looking at the same data that the original blog post analyzed (http://www.quangn.com/exploring-reviews-of-airline-services/).
This was the case, although it appears that the original author of that post took only airlines where there were more than 100 ratings, which is what I happened to do.
First, we select the average rating of each airline, as long as that airline has more than 100 ratings. 100 is arbitrary, but if an airline has only a few ratings, the average rating is likely to be extremely variable across samples:
connection.rollback()
cursor.execute("SELECT airline_name, count(airline_name), avg(overall_rating) FROM airline \
GROUP BY airline_name HAVING count(airline_name) > 100 ORDER BY avg(overall_rating) desc;")
airline_name_data = cursor.fetchall()
column_names = [desc[0] for desc in cursor.description] #column names are stored in the .description attribute
airline_name_data = pd.DataFrame(airline_name_data)
airline_name_data.columns = column_names
Next, we use matplotlib to plot a simple chart of the top 10 rated airlines:
top_ten_airlines = airline_name_data.loc[range(0,9), ['airline_name','avg']]
figure = plt.figure(figsize=(13,8))
bar_x = range(0,9)
plt.bar(range(0,9), top_ten_airlines['avg'], align = 'center')
plt.xticks(range(0,9),top_ten_airlines['airline_name'], rotation ='vertical', fontsize=15)
plt.xlabel('Airline Name',fontsize=15)
plt.ylabel('Rating out of 10', fontsize=15)
plt.yticks(fontsize=15)
print top_ten_airlines #the bar graph does look a little pointless when you can just print out a list of the top 10 airlines.
#but whatever.
Ratings by aircraft type¶
The next thing I wanted to do was to see if ratings differed as a function of what aircraft the passenger took.
connection.rollback()
cursor.execute("SELECT aircraft, count(aircraft), avg(overall_rating) AS ovr, \
avg(seat_comfort_rating) AS cmft, avg(inflight_entertainment_rating) AS inft FROM airline \
WHERE aircraft != 'None' GROUP BY aircraft HAVING count(aircraft) >25 ORDER BY avg(overall_rating) asc;")
aircraft_type_data = cursor.fetchall()
column_names = [d[0] for d in cursor.description]
aircraft_type_data = pd.DataFrame(aircraft_type_data)
aircraft_type_data.columns = column_names
aircraft_type_data
From the above there are only 11 aircraft types that have more than 25 ratings. A couple of aircraft have multiple listings. These appear to be the updated versions of the same aircraft (737-800 vs. 737 and A330-300 vs A330; I assume the generic types are the older models).
There can be many reasons for this, of course. Newer aircraft tend to be more comfortable and reliable. Furthermore, an airline that runs older aircraft may also be associated with other negative experiences that arise because they're struggling. Alternatively, older aircraft may also be associated with "value" airlines, in which case, if customer expectations are commensurately low, these older aircraft may actually be rated more positively. That seems like a future question worth investigaing.
Boeing vs. Airbus¶
Just for fun, let's compare the average ratings of Boeing vs. Airbus aircraft. We can use regex to select the different rows:
airbus_aircraft = aircraft_type_data.loc[aircraft_type_data['aircraft'].str.contains(r'A.*')]
boeing_aircraft = aircraft_type_data.loc[aircraft_type_data['aircraft'].str.contains(r'Boeing.*')]
print airbus_aircraft.mean()
print boeing_aircraft.mean()
Interestingly, even though Boeing aircraft have comparable (or even slightly higher) ratings on "comfort" and "inflight entertainment", Airbus aircraft have higher overall ratings.
Ratings by cabin type¶
Finally, let's take a look at whether the flying experience is different for the top 1% than it is for the rest of us. As an aside, if you go on Youtube and listen to air traffic controller recordings at JFK, the ATC uses "Top 1%" as a colloquialism for private jets. We go through the same process to import the data:
connection.rollback()
cursor.execute("SELECT avg(overall_rating) AS ovr, avg(seat_comfort_rating) AS seat, avg(cabin_staff_rating) AS cstaff,\
avg(food_beverages_rating) AS food, avg(inflight_entertainment_rating) AS inflight, avg(ground_service_rating) AS gstaff,\
avg(wifi_connectivity_rating) AS wifi, avg(value_money_rating) AS value, cabin_flown, count(*)\
FROM airline GROUP BY cabin_flown;")
data = cursor.fetchall()
column_names = [desc[0] for desc in cursor.description]
data = pd.DataFrame(data)
data.columns = column_names
data['ovr'] = data['ovr']/2 #rescaling the overall rating (out of 10) to be out of 5
data
The last thing we're going to do for now is to compare ratings between passengers who end up recommending the airline to friends vs. those who do not:
connection.rollback()
cursor.execute("SELECT avg(overall_rating)/2 AS ovr, avg(seat_comfort_rating) AS seat, avg(cabin_staff_rating) AS cstaff,\
avg(food_beverages_rating) AS food, avg(inflight_entertainment_rating) AS inflight, avg(ground_service_rating) AS gstaff,\
avg(wifi_connectivity_rating) AS wifi, avg(value_money_rating) AS value, cabin_flown, count(*), recommended \
FROM airline GROUP BY (cabin_flown, recommended) ORDER BY recommended, cabin_flown;")
cabin_recommend_data = cursor.fetchall()
column_names = [desc[0] for desc in cursor.description]
cabin_recommend_data = pd.DataFrame(cabin_recommend_data)
cabin_recommend_data.columns = column_names
The data below suggest a high degree of polarization. One might imagine that people have an implicit "threshold" value that, once crossed, results in a recommendation. Instead, what we find, perhaps unsurprisingly, is that flights that are not recommended are rated a lot worse than those that are recommended.
This is consistent with, but not exclusively because of, a process whereby people judge their flights as "good" or "bad", and then use the rating system to express their satisfaction/disatisfaction with each of the criteria.
cabin_recommend_data