Thursday, April 4, 2019

Data Mining Techniques in Airline Industry

Introduction and Scope

All around the world, the airline industry can be described in a few words: intensely competitive and dynamic. The industry generates billions of dollars every year but still has a cumulative net profit margin of less than 1% [1]. Many airlines are trying to recover from deep debt. The reasons are manifold: low prices, high cyclicality and seasonality, fierce competition, high fixed costs, and many other issues related to security and passenger safety.

To secure the best economic outcome, airline companies are trying their most creative approaches. Data used in conjunction with data mining techniques allows a comprehensive, intelligent management and decision-making system. Achieving these benefits in a timely and intelligent manner may result in lower operating costs, better customer service, market competitiveness, increased profit margin, and shareholder value gain.

The purpose of this paper is to demonstrate the applications of data mining techniques to multiple aspects of the airline business.
For example, data mining can be used to predict the number of domestic and international airline passengers from a specific city or airport, to price tickets dynamically depending on seasonality and demand, to explore the frequent flyer database in preparation for a CRM implementation, to make operational decisions about catering, personnel, and gate traffic flow, and to assist security agencies in keeping flights secure and safe for passengers, especially after the 9/11 incident.

Predict the Number of Passengers by Applying Data Mining Technique

Forecasting is critical to any business for planning and revenue management, especially in the airline industry, where a lot of planning is required to buy or lease new aircraft, to hire crew members, to find new slots in busy airports, and to get approvals from many aviation authorities.

Air travel involves a lot of seasonality and cyclicality. Passengers are more likely to fly to some destinations depending on the time of the year. Business travelers are more likely to travel on weekdays than on weekends. Early morning and evening flights are preferred by business travelers who want to accomplish a day's work at their destination and return the same day.

To forecast the number of passengers, an artificial neural network (ANN) can be used. The purpose of a neural network is to learn to recognize patterns in given data. Once the network has been trained on samples of the data, it can make predictions by detecting similar patterns in future data.

The growth factors that might influence air travel demand depend on several things.
Mauro Calvano [2], in his study of the Transport Canada aviation forecast 2002-2016, considered 12 major socio-economic factors, as follows:

1. GDP
2. Personal disposable income
3. Adult population
4. US economic outlook
5. Airline yield
6. Fleet/route structure/average aircraft size
7. Passenger load factors
8. Labor cost and productivity
9. Fuel cost/fuel efficiency
10. Airline cost other than fuel and labor
11. Passenger traffic allocation assumptions
12. New technology

Factors 1 to 5 relate to the demand side of the forecast. Factors 6 to 10 relate to operations and the supply side. Factors 11 and 12 represent structural changes.

The historical data used to build the forecast model is called the estimation set. A fraction of the overall available data is reserved for validating the accuracy of the developed forecast model. This reserved data set is called the forecasting set because no information contained in it is used in any form during the development of the forecast model. The data in the forecasting set are used for testing the true extrapolative properties of the developed model. The estimation set is further divided into a training set and a testing set. Information in the training set is used directly to determine the forecast model, whereas information in the testing set is used indirectly for the same purpose.

Figure 1: Forecasting Process Model

For a given ANN architecture and training set, the basic mechanism behind most supervised learning rules is the updating of the weights and bias terms until the mean squared error (MSE) between the output predicted by the network and the desired output (the target) is less than a pre-specified tolerance.

Neural networks can be represented as layers of functional nodes. The most general form of a neural network model used in forecasting can be written as

$$Y = F[H_1(x), H_2(x), \ldots, H_n(x)] + u$$

where $Y$ is a dependent or output variable, $x$ is a set of input (influencing) variables, $F$ and the $H_i$ are network functions, and $u$ is a model error. The input layer is connected to a hidden layer. The $H_i$ are the hidden layer nodes and represent different nonlinear functions. Each node in a layer receives its input from the preceding layer through weighted connections, and the weights are adjusted using an appropriate learning algorithm and the information contained in the training set.

Figure 2: ANN Architecture

Abdullah Omer BaFail [3] carried out a study to forecast the number of airline passengers in Saudi Arabia. He selected the most influential factors to forecast the number of domestic passengers in the different cities of Saudi Arabia. For Dhahran he selected factors such as oil gross domestic product for the last 6 years, private non-oil gross domestic product, import of goods and services for the last 10 years, and population size for the last 2 years.

The actual and forecasted numbers of domestic and international passengers for the city of Dhahran for the years 1993 through 1998 are shown below. The forecasts underestimated actual travel. The Mean Absolute Percentage Error (MAPE) is about 10% for domestic travel and about 3% for international travel.

Figure 3: Forecasting results from Abdullah Omer BaFail [3]

The takeaway from BaFail [3] for me is that an efficient forecasting model can be built using an ANN if we use the right influencing indicators. In this study, some of the influential indicators are oil gross domestic product and per capita income in the domestic and international sectors. In view of the fluctuating nature of passenger usage of airline services in Saudi Arabia, certain suggestions were made. Most of these recommendations were intended to improve the flexibility of the system in responding to fluctuations in demand and supply.
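As a rough illustration of the training and validation procedure described in this section, the Python sketch below fits a single linear neuron by gradient descent until the mean squared error drops below a tolerance, then measures MAPE on a held-out forecasting set. It is only a minimal sketch: the data values, the learning rate, the tolerance, and the function names `train_linear_neuron` and `mape` are assumptions made for illustration and do not come from the BaFail [3] study, which used a multi-layer ANN with several indicators.

```python
def train_linear_neuron(xs, ys, lr=0.1, tol=1e-6, max_epochs=10_000):
    """Gradient descent on one linear neuron y = w*x + b until MSE < tol."""
    w, b = 0.0, 0.0
    n = len(xs)
    mse = float("inf")
    for _ in range(max_epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / n
        if mse < tol:  # stop once the pre-specified tolerance is reached
            break
        # Gradients of the MSE with respect to the weight and the bias.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, mse


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up, already-scaled data: x is one demand indicator, y is passenger volume.
xs = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
ys = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

# Estimation set (used for fitting) and forecasting set (never seen while fitting).
est_x, est_y = xs[:5], ys[:5]
hold_x, hold_y = xs[5:], ys[5:]

w, b, final_mse = train_linear_neuron(est_x, est_y)
holdout_forecast = [w * x + b for x in hold_x]
holdout_mape = mape(hold_y, holdout_forecast)
print(f"w={w:.3f} b={b:.3f} MSE={final_mse:.2e} MAPE={holdout_mape:.3f}%")
```

A real forecasting network would add hidden-layer nodes with nonlinear functions, but the stopping rule and the use of an untouched forecasting set stay the same.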
A hub-and-spoke model was also suggested in the Saudi Arabia study as a solution in certain sectors, to increase flexibility in adjusting capacity allocations across markets as new information about demand conditions becomes available.

Application of Data Mining Technique to Predict Airline Passenger No-show Rates

Airlines overbook flights based on the expectation that some percentage of booked passengers will not show up for each flight. Accurate forecasts of the expected number of no-shows for each flight can increase airline revenue by reducing the number of spoiled seats (empty seats that might otherwise have been sold) and the number of involuntary denied boardings at the departure gate. Typically, the simplest way is to use average no-show rates of historically similar flights, without using passenger-specific information.

Lawrence, Hong, and Cherrier [4], in their research paper, predicted no-show rates using specific information on the individual passengers booked on each flight.

Airlines offer multiple fares in different booking classes. The number of seats allocated to each booking class is driven by demand for each class, such that revenue is maximized. For example, a few seats can be held for last-minute travelers paying high fares, while a number of seats are sold in lower-fare classes earlier in the booking process. Terms and conditions for cancellation and no-show also vary by class.

No-shows result in lost revenue if the flight departs with empty seats that might otherwise have been sold.
Near-accurate forecasts of the expected number of no-shows for each flight are highly desirable, because under-prediction of no-shows leads to loss of potential revenue from empty seats, while over-prediction can produce a significant cost penalty associated with denied boardings at the departure gate and also create customer dissatisfaction.

In the simplest model, the overbooking limit is taken as the capacity plus the estimated number of no-shows. Bookings are offered up to this level. No-show numbers are predicted using time-series methods, such as taking the seasonally weighted moving average of no-shows for previous instances of the same flight.

Figure 4: No-show trend over days to departure
Source: Lawrence, Hong, Cherrier [4]

The simple model does not take account of specific characteristics of the passengers. Lawrence, Hong, and Cherrier [4] used classification methods in their study; similarly, Kalka and Weber [5] at Lufthansa used induction trees to compute passenger-level no-show probabilities, and compared their accuracy with conventional, historical-based methods. I have tried to summarize the approach and results of Lawrence, Hong, and Cherrier [4] briefly below.

Whenever a ticket is booked, a Passenger Name Record (PNR) is generated and all the passenger information is recorded. The PNR data includes, for each passenger, specifics of all flights in the itinerary, the booking class, and passenger-specific information such as frequent-flier membership, ticketing status, and the agent or channel through which the booking originated. Each PNR also specifies whether the passenger was a no-show for the specified flight.

In the simplest model, the mean no-show rate over a group of similar historical flights is computed. The mean is in turn used to predict the number of no-shows over all booking classes.

The passenger-level model can be implemented using any classification method capable of generating normalized probabilities.
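The simple historical model can be sketched in a few lines of Python. This is an assumed illustration, not the method of Lawrence, Hong, and Cherrier [4]: the weights, the flight history, and the helper names `weighted_no_show_rate` and `overbooking_limit` are hypothetical, and expected no-shows are approximated as capacity times the weighted average no-show rate.

```python
def weighted_no_show_rate(history, weights):
    """Weighted moving average of no-show rates.

    history: list of (booked, no_shows) for earlier instances of the same
             flight, oldest first.
    weights: one weight per instance (for example, heavier for recent or
             same-season flights). Both are illustrative inputs.
    """
    total_weight = sum(weights)
    weighted_sum = sum(w * ns / booked for w, (booked, ns) in zip(weights, history))
    return weighted_sum / total_weight


def overbooking_limit(capacity, no_show_rate):
    """Capacity plus the estimated number of no-shows (simplest model)."""
    expected_no_shows = round(capacity * no_show_rate)
    return capacity + expected_no_shows


# Hypothetical history for one flight: (passengers booked, no-shows).
history = [(150, 12), (150, 15), (150, 18)]
weights = [1, 2, 3]  # assumed: the most recent instance counts most

rate = weighted_no_show_rate(history, weights)
limit = overbooking_limit(capacity=150, no_show_rate=rate)
print(f"no-show rate {rate:.3f}, accept bookings up to {limit}")
```

Because this model uses only flight-level averages, two flights with the same history get the same limit regardless of who is actually booked on them, which is the gap the passenger-level model addresses.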
The PNR records are partitioned into segments, and separate predictive models are developed for each segment. In passenger-level modeling, we characterize each passenger using the PNR details. Let $X_i$, $i = 1, \ldots, I$, denote the $I$ features associated with each passenger. Combining all features yields the feature vector

$$X = [X_1, \ldots, X_I]$$

Each passenger $n = 1, \ldots, N$ booked on flight $m$ is represented by the vector of feature values

$$x_{m,n} = [x_{m,n,1}, \ldots, x_{m,n,i}, \ldots, x_{m,n,I}]$$

We know the predicted no-show rate from the historical model, and it is assumed that the passenger inherits this no-show rate. The passenger-level predictive model is therefore stated as follows: given a set of class labels $c_{m,n}$, a set of feature vectors $x_{m,n}$, and a cabin-level historical prediction $m_{hist}$, predict the output class of passenger $n$ on flight $m$:

$$P(C = c_{m,n} \mid m_{hist}, X = x_{m,n})$$

We are specifically interested in the no-show probability, $c_{m,n} = NS$, and write this probability in the simplified form

$$P(NS \mid m_{hist}, x_{m,n})$$

The number of no-shows in the cabin is estimated as

$$\sum_{n=1}^{N} P(NS \mid m_{hist}, x_{m,n})$$

Summing the probabilities over every passenger in the cabin gives the expected number of no-shows for the cabin. An analogous approach can also be used to predict no-show rates at the fare-class level.

Lawrence, Hong, and Cherrier [4] compare results computed using the historical, passenger-level, and cabin-level models. The models were built using approximately 880,000 PNRs booked on 10,931 flights, and evaluated against 374,900 PNRs booked on 4,088 flights. The figure shows a conventional lift curve computed using the three different implementations of the passenger-level model.

Figure 5: Gain Charts
Source: Lawrence, Hong, Cherrier [4]

Each point on the lift curve shows the fraction of actual no-shows observed in a sample of PNRs selected in order of decreasing no-show probability. The diagonal line shows the baseline case, in which it is assumed that the probabilities are drawn from a random distribution.
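To make the cabin-level estimate and the lift curve concrete, here is a small Python sketch under stated assumptions. The probabilities and outcomes are invented, and `expected_no_shows` and `gain_at` are hypothetical helper names; any classifier that outputs a no-show probability per passenger could supply the scores.

```python
def expected_no_shows(probabilities):
    """Cabin-level estimate: sum of per-passenger no-show probabilities."""
    return sum(probabilities)


def gain_at(scored, fraction):
    """One point on the lift (gain) curve.

    scored:   list of (predicted_probability, was_no_show) pairs, where
              was_no_show is 1 or 0 and at least one passenger was a no-show.
    fraction: share of PNRs to inspect, taken in order of decreasing
              predicted probability.
    Returns the share of all actual no-shows found in that top slice.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    top_n = int(len(ranked) * fraction)
    captured = sum(actual for _, actual in ranked[:top_n])
    total = sum(actual for _, actual in ranked)
    return captured / total


# Invented scores for ten passengers in one cabin: (probability, outcome).
scored = [
    (0.9, 1), (0.8, 1), (0.6, 0), (0.5, 1), (0.3, 0),
    (0.2, 0), (0.2, 1), (0.1, 0), (0.1, 0), (0.05, 0),
]

cabin_estimate = expected_no_shows(p for p, _ in scored)
top_fifth_gain = gain_at(scored, 0.2)
print(f"expected no-shows: {cabin_estimate:.2f}")
print(f"share of no-shows in top 20% of PNRs: {top_fifth_gain:.0%}")
```

A model that ranks passengers no better than chance would capture roughly 20% of the no-shows in the top 20% of PNRs, which is the diagonal baseline on the chart.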
The three implementations of the passenger-level model identify approximately 52% of the actual no-shows in the first 10% of the sorted PNRs.

This is one way airlines can benefit from data mining: models incorporating specific information on individual passengers can produce more accurate predictions of no-show rates than conventional, historical-based statistical methods.

Application of Data Mining Technique to Customer Relationship Management Strategies

Most industries currently use frequency marketing programs as a strategy for retaining customer loyalty, in the form of points, miles, dollars, beans, and so on. Airlines are big fans of this: Kingfisher's Kingmiles, Jet Airways' Jet Privilege, American Airlines' AAdvantage, Japan Airlines' Mileage Bank, KrisFlyer Miles, etc. They all seem to have carved out their own identities.

A frequent flyer program presents an invaluable opportunity to gather customer information. It helps to understand behavioral patterns and to unveil new opportunities for customer acquisition and retention.
This helps airlines identify their most valuable customers and the appropriate strategies to use in developing one-to-one relationships with them.

The objectives of applying data mining to frequent flyer customer data could be many, but ideally they are as follows:

- Customer segmentation
- Customer satisfaction analysis
- Customer activity analysis
- Customer retention analysis

Some examples in each category are:

- Classify customers into groups based on sectors most frequently flown, class, period of the year, time of day, and purpose of the trip.
- Which types of customers are more valuable?
- Do the most valuable customers receive value for money?
- What are the attributes and characteristics of the most valuable customer segments?
- What type of campaign is appropriate for the best use of resources?
- What are the opportunities for up-selling and cross-selling, for example hotel booking, upgrade to the next class, credit card, etc.?
- Design packages or groupings of services.
- Customer acquisition.

Yoon [6] designed a database knowledge discovery process consisting of five steps: selecting the application domain, target data selection, pre-processing data, extracting knowledge, and interpretation and evaluation. This study refers to the Yoon process to deal with three mining phases for airlines (the pre-process, data-mining, and interpretation phases), as illustrated in the figure below.

Figure 6: Database knowledge discovery process
Source: Yoon [6]

Some straightforward solutions can be implemented that can also be scaled up in the future, such as K-means, Kohonen self-organizing networks, and classification trees.

The K-means algorithm is applied to customer data, assigning each customer to the closest existing cluster center. The K-means model is run with different numbers of clusters until the clusters are well separated.

In the case of classification trees (C5.0), we derive a simple rule set to uniquely classify the complete database.
Again, we have to generate the attributes resulting from the sequence of flight segments. The accuracy of the forecast for each segment is provided by balancing the training set according to equally sized clusters. We regulate the number of subsequent rules while determining a minimal number of records within each subgroup.

Maalouf and Mansour [7] based their study on 1,322,409 customer activity transactions and 79,782 passengers over a period of 6 years. They prepared the data using z-score normalization, ran multiple queries, and transformed the data to create the clustering input records. They used the K-means and O-Cluster algorithms. The result generated by clustering provides customer segmentation with respect to important dimensions of customer needs and value. The table below is a summary of the profiles produced by k-means clustering, which includes revenue mileage, number of services used, and customer membership period.

Figure 7: Clustering result on Airline Customer Data
Source: Maalouf and Mansour [7]

The results generated by k-means clustering are used as a basis for the association rules algorithm. Two different scenarios were applied. The first scenario is based on Financial, Flight, and Hotel activities, with 1,896 records. The second scenario is based on flight activities, especially the sectors, with 1,867 records.

Figure 8: Association rules for best customer activities
Source: Maalouf and Mansour [7]

Some of the takeaways from the Maalouf and Mansour [7] study:

- Clustering using the k-means algorithm generated 9 different clusters, with a specific profile for each one.
- From the cluster analysis it can be found which are the best customer clusters (higher mileage per passenger than other clusters). A retention strategy is needed for these clusters.
- Cross-selling strategies can be formulated between clusters (for example between 15 and 11, and between 13 and 17) because they are close in services value.
- The cluster analysis provides an opportunity for the airline to produce more revenue from a customer. For example, the airline could apply an up-selling strategy by selling a higher-fare seat depending on the cluster.
- From the cluster analysis, the airline may adopt an enhanced strategy for customers in certain clusters in order to increase services usage and revenue mileage per passenger.
- Marketing campaigns or special offers can be planned by analysis through association rules. For example, customers using the Flight and Financial services never use the Hotel services, and customers using the Flight and Hotel services never use the Financial services.
- By analyzing the services used in different clusters, the airline can characterize services integration. It enables the airline to serve a customer the way the customer wants to be served.

Application of Data Mining Technique to Understand the Impacts of Severe Weather

Severe weather has major impacts on air traffic and flight delays. Appropriate proactive strategies for different severe-weather days may reduce delays and cancellations. Thus, understanding en-route weather impacts on flight performance is an important step toward improving flight performance.

Zohreh and Jianping [8], in their study, proposed a framework for a data mining approach to the analysis of weather impacts on airspace system performance. This approach consists of three phases: data preparation, feature extraction, and data mining.
The data preparation phase includes the usual process of selecting data sources, data integration, and data formatting.

Figure 9: Framework proposed by Zohreh and Jianping [8]

They used three data sources: Airline Service Quality Performance (ASQP), Enhanced Traffic Management System (ETMS), and the National Convective Weather Forecast (NCWF) supplied by the National Center for Atmospheric Research. They used NCWF data from April through September 2000 to represent the severe weather season.

These data sets included the scheduled and actual departure and arrival times of each flight of ten reporting airlines, tail number, wheels off/on times, taxi times, cancellation and diversion information, planned and actual departure and arrival times, planned and actual flight routes, flight frequencies between two airports, mean flight routes between two airports, flight delays, flight cancellations, and flight diversions.

The image segmentation phase resulted in a set of severe-weather regions. For each of these regions, a set of weather features and a set of air traffic features are then extracted. A day is described by a set of severe-weather regions, each having a number of weather and traffic features.

The study found a strong correlation between flight performance and blocked flights, bad-weather regions, bad-weather airports, blocked distance, bad-weather longitude, bypass distance, bad-weather latitude, and bad-weather pixels.

Similarly, clustering algorithms (such as K-means) can be applied. The expectation is that days in the same cluster have similar weather impacts on flight performance. Zohreh and Jianping [8] generated clusters for the entire airspace. It was found that a cluster with worse weather almost always had bad performance. The clusters with a large percentage of blocked flights, bypass distance, and blocked distance had worse performance.
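The clustering step can be illustrated with a bare-bones k-means in Python. This is a sketch under assumptions, not the setup used by Zohreh and Jianping [8]: the six days, their two features (percentage of blocked flights and bypass distance), and the starting centroids are made up, and the features are left unscaled for brevity.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical days: (percent of blocked flights, bypass distance).
days = [
    (5.0, 10.0), (8.0, 12.0), (6.0, 9.0),     # mild-weather days
    (40.0, 80.0), (45.0, 90.0), (42.0, 85.0)  # severe-weather days
]

# Assumed starting centroids: one mild day and one severe day.
labels, centers = kmeans(days, [days[0], days[3]])
print(labels)
print(centers)
```

With real data the features would normally be put on a common scale first (for example by z-score normalization), and the number of clusters would be varied until the groups are well separated.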
The clustering results were promising and showed that days in a cluster have similar weather impacts on flight performance.

Another data mining approach that can be applied is classification. Classification can help us discover the patterns and rules that have a significant impact on flight performance. Discovered rules may be used to predict whether a day will be a good or a bad performance day based on its weather. For example:

Rule for Good: if %BlockedFlights and BypassDistance then Good (n, prob)

There are different ways to apply a data mining approach to the analysis of weather impact on airline performance. The results obtained from clustering and classification appear to be very meaningful for airlines and passengers who want to plan ahead.

Application of Data Mining Techniques to Ensure Safety and Security of Airline Passengers

The reaction to the terrorist attacks of 26/11 and 9/11 resulted in increased security at airports. It ended up allowing only ticketed passengers past the security gates and screening carry-on luggage more carefully for possible weapons. The question is whether these steps could have prevented the attacks: the people involved had legitimate tickets and were carrying box cutters and razor blades (as any other normal person might).

What was unusual was the combination of their characteristics: none were U.S. citizens, all had lived in the U.S. for some period of time, all had connections to a particular foreign country, and all had purchased one-way tickets at the gate with cash.

With the amount of data available about a passenger during ticketing, the data can be reviewed to characterize relevant passenger information. Given a passenger's name, address, and a contact phone number, various databases (public or private) can identify the social security number (SSN), from which much information will be readily available (credit history, police record, education, employment, age, gender, etc.).
Since a large number of characteristics is available on individual passengers, it will be important to identify signals within the natural variability or noise. A wrong prediction may lead either to falsely detaining an innocent passenger or to failing to detain a plane that carries a terrorist.

Airlines already collect much data on various flights. When the data come in the form of multiple characteristics on a single item, exploratory tools for multivariate data can be applied, such as classification, regression trees, and multivariate adaptive regression splines/trees. The security of air travel can be improved substantially through modern, intelligent use of pattern recognition techniques applied to large linked databases.

Similarly, data mining techniques can be used for passenger safety. An air safety office plays a key role in ensuring that an aviation organization operates in a safe manner. Currently, aviation safety offices collect and analyze incident reports by a combination of manual and automated methods. Data analysis is done by safety officers who are very familiar with the domain. With data mining, one can find interesting and useful information hidden in the data that might not be found by simply tracking and querying the data, or even by using more sophisticated query and reporting tools.

A study by Zohreh Nazeri, Eric Bloedorn, and Paul Ostwald [10] found that finding associations and distribution patterns in the data brings important insight. The other finding is that linking the incident reports to other sources of safety-related data, such as aircraft maintenance and weather data, could help find better causal relationships.

Summary

Business intelligence through efficient and appropriate data mining applications can be very useful in the airline industry.
Appropriate action plans derived from data mining analysis can result in improved customer service, help generate considerable financial lift, and set the future strategy.
