s7 airlines - vertica.com



About the company

S7 Airlines is among the top three airlines in Eastern Europe, according to the prestigious international Skytrax rating. Based on the 2018 results, S7 Airlines has become the most punctual Russian airline, ranking sixth in the OAG’s Punctuality League 2019, the European rating of airline punctuality.

A data-driven airline

“The company’s data culture has been evolving for quite some time. It unites people, processes, and technologies,” says Roman Ryzhikh, S7 Airlines Data Architect. “Today, all the major departments, including sales, marketing and strategy, are active and highly demanding data users.”

There are more than 100 internal and external data sources at S7 Airlines. These include relational databases, NoSQL data, flat files, HTTP services, data in S3 object storage, and more. The aggregate of all that data amounts to tens of terabytes.

Some time ago, the company created a classic data warehouse based on a popular relational database. Informatica tools were used for data management and integration, Tableau BI applications were used for data visualization, and Kafka was used as an internal bus for data exchange.

User requirements increased: a more powerful database was needed

As the data grew, the problem of scaling the legacy data warehouse was compounded. The previous database ran on a single node rather than a cluster because of the high licensing cost. To scale the data warehouse, S7 Airlines had to purchase new server equipment each year and transfer the database over to it. Expanding the functionality of the data warehouse was also expensive because many of the required options cost extra.

The data in the warehouse was only updated once per day. Meanwhile, there was a data and analytics culture boom across all business units: Many users wanted quality data that could be accessed quickly.

At the end of 2018, the company introduced a powerful new source of data from one of its contractors, with a volume of 5 TB and daily growth of 5-10 GB. The business needed fast, ad hoc analytics across the entire array of its data, and the former database could not work effectively with such a large source. A different database was required.

Case Study
Analytics and Big Data

S7 Airlines

S7 Airlines leaps to the front of the Russian air travel industry due to data-driven business decisions with Vertica Analytics Platform

At a Glance

■ Industry: Transportation

■ Location: Russia

■ Challenge: S7 Airlines wanted to scale the enterprise’s analytical data warehouse and speed up user access.

■ Products and Services: Vertica

■ Success Highlights
+ Developed an internal culture of data, analytics and data-driven business decision making
+ Provided quick access to the data warehouse, enabling users to execute ad hoc queries in real time
+ Users gain opportunities for deep data analysis
+ Improved commercial proposal recommendation quality
+ Increased customer loyalty and decreased complaints
+ Business analysts with SQL and data scientists with Python and Jupyter use the same platform for data analysis

The new platform: columnar, clustering, standard SQL, and low code

To reduce the total cost of ownership, the company’s specialists developed a set of selection requirements for a columnar analytical data warehouse that would scale across multiple nodes and enable data access through standard ANSI SQL and low-code tools.

Given these requirements, the results from experiments with ClickHouse did not suit the company. Other open-source database platforms required serious effort from the developers, so they were also excluded.

The free version of Vertica, Vertica Community Edition (CE), proved very valuable compared to open-source databases, not only in terms of higher performance but also because it contained low-code tools, which greatly reduced the cost to develop the data warehouse.

In the end, S7 Airlines chose the Vertica Analytics Platform. “This columnar database supports ANSI SQL and you can quickly create playbooks on the enterprise cloud platform,” continues Roman Ryzhikh. “Another very useful feature of Vertica is that you can quickly make copies of existing database instances, which is very useful when creating a development environment, and it is free for development environments. A lot of useful features are already included in the main license package.”

The Vertica platform can be implemented in-house

The company deployed the database and further developed the Vertica data warehouse and analytics in-house. Company specialists quickly mastered the new platform and consulted with colleagues at Vertica only occasionally. During the migration, it was not just the database that was replaced; the data model was also changed significantly.

Implementation of the new platform was preceded by a full year of research — from participating in meetups to visiting Vertica reference clients.

“Significant assistance in solving certain technical issues was available thanks to Vertica’s powerful Russian community,” continues Roman. “The ability to seek advice from experienced users is a big advantage.”

The main data sources are integrated with the warehouse via a Kafka data bus. Integration of S3 storage and legacy sources with the database is performed using connectors developed by the airline’s specialists.

Informatica is currently being replaced with Airflow. The load schedule has remained batch, but the frequency has increased. Whereas the update used to be performed daily, it is now done once every three hours, and in some scenarios even more often (once every half hour), so data from different sources can be loaded asynchronously.
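The per-source batch cadence described above can be illustrated with a minimal Python sketch. It is not S7 Airlines' Airflow configuration: the source names and the function names are hypothetical, and only the three-hour and half-hour intervals come from the text. The sketch shows how sources with different intervals become due independently of one another.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Hypothetical source names; only the 3-hour and 30-minute cadences
# are taken from the case study.
LOAD_INTERVALS: Dict[str, timedelta] = {
    "bookings": timedelta(hours=3),
    "sales": timedelta(hours=3),
    "contractor_feed": timedelta(minutes=30),
}


def next_load(last_load: datetime, interval: timedelta) -> datetime:
    """Return the time at which the next batch load becomes due."""
    return last_load + interval


def due_sources(last_loads: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of sources whose next batch load is due at `now`.

    Each source is checked against its own interval, so a frequent
    source never waits for a slower one.
    """
    return sorted(
        name
        for name, last in last_loads.items()
        if next_load(last, LOAD_INTERVALS[name]) <= now
    )
```

In a real deployment an orchestrator such as Airflow would own this scheduling; the sketch only makes the idea of independent per-source intervals concrete.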

“We spent about a year creating a new culture of using a columnar database, actively telling users about the benefits of the new platform, holding meetings, and training business employees on how to work with data more effectively. Somewhere along the line we created a small center of expertise on the columnar database,” adds Roman Ryzhikh. “Our efforts resulted in more than ten projects being developed on the basis of Vertica Community Edition and other columnar databases. Over time, many of these evolved into production systems based on the commercial version of Vertica. After the new culture of working with data was established, we began the transfer of the data warehouse from the old database to Vertica.”

The Vertica data warehouse is now deployed in a hybrid cloud environment in a three-node configuration. The plan is to expand the configuration to 15 nodes over the next year. “Business requirements and expectations regarding data warehouse performance are steadily increasing, so we plan to move each node to a separate physical server soon,” says Dmitry Negreev, S7 Airlines Enterprise Data Warehouse Architect.

Enterprise analytics kernel

The Vertica-based data warehouse now provides all airline key performance indicators: sales, booking, revenues, expenses, etc.

The users of the top-level analytics tools are employees in almost all business units across the company. The number of specialists actively using data from the Vertica data warehouse has already exceeded 100. “Anyone who can write SQL queries can use data from the data warehouse in their work,” says Roman Ryzhikh. Advanced users write Python code and use Jupyter Notebooks.

Experts from the Data Science Department have recently joined the circle of active data consumers.

Data from Vertica is used to make a wide variety of business decisions, including strategic ones. When deciding on the type of aircraft for a particular flight, data from the data warehouse is studied. Last year, which was a bad year for aviation, the company’s management began analyzing the data from Vertica to optimize flight schedules. In 2020, S7 became the leader of the Russian air transportation market for some time, overtaking Aeroflot. According to Roman, this victory was ensured by highly professional S7 managers, for whom the Data Management Department was able to provide fast, timely, high-quality information and analytics for business decisions.

Vertica: stability, performance, and convenient licensing

“Vertica is a very reliable system, so we are not concerned about our data,” continues Roman. “We appreciate the support of standard SQL and the very beneficial licensing policy for customers. Using the Vertica Community Edition you can calculate the net present value (NPV) without having to pre-negotiate with the vendor. And with a commercial license, we don’t have to worry about forgetting to include the right features in the contract because the normal license has everything, including data science.”

Business users are finally able to run ad hoc queries against the data warehouse quickly.

“We now have a transparent and flexible database platform to provide users with data analysis in the shortest possible time,” adds Dmitry Negreev. “Long-term agreements on additional licenses are a thing of the past. Quickly meeting business needs as they emerge is a major breakthrough for us.”

New user experience

“The company’s Vertica data analytics solutions are a delight for business representatives,” says Roman Ryzhikh. “Some types of data were previously updated once a day, but the technical limitations for data loading with a higher frequency have now been removed, and so it is now possible to perform data loading every minute.” With the transition to Vertica, commercial proposal recommendation quality has improved, the number of complaints has decreased, and passenger loyalty has increased.

“The transition to the Vertica platform has helped minimize the human impact on data processing techniques, including ETL,” Roman continues. “It is possible to trace data pipelines and quickly identify problems that reduce data quality. Something that might have taken us a day can now usually be done in an hour.”

The next step is telemetry analytics

The airline has big plans to expand the use of Vertica. For example, work is under way to create domain data warehouses that operate using the data mesh approach. When telemetry data from aircraft is available in the data warehouse, analysis will help to optimize the consumption of fuel and other consumables.

10-27-2021 | V | DS | 10/27 | © 2021 Micro Focus or one of its affiliates. Micro Focus and the Micro Focus logo, among others, are trademarks or registered trademarks of Micro Focus or its subsidiaries or affiliated companies in the United Kingdom, United States and other countries. All other marks are the property of their respective owners.


Contact us at: www.vertica.com