REMIGIUSZ LEWANDOWSKI, EUGENIA FRONCZAK, KRZYSZTOF WAWRZYNIAK, MARCIN ŁAGODZIŃSKI, WOJCIECH CZECHUMSKI
THE APPLICATION OF BUSINESS INTELLIGENCE SOLUTIONS
IN A HEALTH CARE ORGANIZATION
Summary
Implementation of IT solutions in health care, as well as the competent
application of information collected in medical data bases are the key to improving
the standards of treatment and the processes of knowledge management in
organizations. This paper provides an analysis of a concept of the application of
Business Intelligence tools in health care to determine the correlation between
medical events and their results. It describes fundamental elements for the
development of a Data Exchange Structure (DXS) and a Data Warehouse (DW), as
well as for the application of Online Analytical Processing (OLAP).
Keywords: business intelligence, data warehouse, OLAP, data mining, medical data bases
1. Introduction
The global process of informatization has largely contributed to the development of the world
economy and information society; therefore, the possession of comprehensive and current data in
an organization is becoming more and more central to its growth, facilitates the decision-making
processes, and enables the successful undertaking of future projects. The development of
information technology has made it possible to store and process increasing amounts of data.
However, the mere collection of large data bases does not contribute to the development of
a business organization. The quality of data is much more important, as is an ability to take
advantage of its correlations, which makes the data reveal useful knowledge and offer measurable
value for the organization.
Nowadays, in order to survive in the market, one should be skilled at using available
information to draw conclusions, which in turn yield the knowledge on which apt
decisions can be based [1]. The purpose of knowledge management is to change the attitude from
‘you don’t know what you don’t know’ to ‘you know what you know’ and to apply the knowledge
to improve the efficiency of an organization [2].
Modern database systems are very efficient and capacious; therefore, the actual problem is not
how to store data but how to use it effectively [3]. A massive increase in the number of databases
and data repositories in the field of health service offers ample opportunities for their use to
explore the data and discover knowledge [4]. Nevertheless, processing information by means of
basic methods is not always enough to provide a satisfactory answer to a formulated hypothesis.
This is why it is becoming more and more important to apply data mining
technology.
Data exploration is a process which consists in finding new, previously unknown, potentially
useful, understandable, and correct patterns in extensive data sets [5]. The use of data mining
systems is increasingly vital to the functioning of an organization, because it enables discovering
‘new’ knowledge, which is then used to gain greater advantages. Besides, the process of data
exploration used in business enterprises makes it possible to take suitable decisions in a dynamic
environment [6].
In the data exploration process, new information is mined through mathematical and statistical
analysis of data, by means of database technology, pattern recognition and machine learning
techniques, as well as artificial intelligence. The knowledge mined from large data sets is used to
support decision-making processes, to find solutions to problems, and to make forecasts and plans,
which is particularly important in health service organizations.
Data exploration comprises the following stages:
• Preparation of data – the data is prepared in a way to ensure that it is correct, compliant
with, and significant for the considered problem,
• Data mining – i.e. learning about the features and characteristics of the analyzed data,
• Analysis and evaluation of data – selection of an appropriate method to solve the problem
and obtain useful information (at this stage, thanks to the application of various techniques, rules,
correlations, and algorithms a specific model or models of data are built),
• Model application – the model which gives the best results is applied.
The process of data discovery is difficult and multifaceted; it therefore often requires
repeating individual stages or narrowing down the area of exploration because of the complexity of
the data and its correlations. Furthermore, achieving the best results in data mining requires
specialist knowledge, an ability to understand the specific problems and to identify the
right analytical methods to handle them, and the commitment of all users of the process. Some
measurable benefits of data mining include the discovery of principles on which the business
operates and support in managing public relations [7].
Information technology is finding more and more applications in healthcare facilities, whether
it involves various aspects of therapy, or the management, e.g. of resources, deliveries, logistics,
medical supplies, according to standards specified by the National Health Fund.
Opinions that data mining supplements the current functionalities of database systems are
becoming increasingly common [8]. Collected data is stored in databases, which should be used
directly by analytical systems to manage the knowledge of an organization. It is essential to
combine the process of exploration with a database administration system [9]. However, the next
step in the informatization of health services is the growing number of implementations of data
warehouses with Business Intelligence (BI) applications. This enables the integration of diverse
information coming from various sources, such as medical equipment, diagnostic procedures,
patient demographics, as well as details connected with the costs of treatment. It should be noted
that it is particularly important that the resources available to health service units are reasonably
managed [10].
The implementation of a Data Warehouse (DW) at the Oncology Centre in Bydgoszcz is
another stage of its informatization, as well as a key element in the concept of knowledge
management in that organization [11]. This paper attempts to present the
prospects for establishing correlations between medical events and their results using a Business
Intelligence toolkit. It outlines the process of sourcing knowledge from large sets of
medical data of patients from the Oncology Centre.
2. Results
a. Information systems in medicine
The number of IT products for the health service sector offered on the market is continuously
increasing, and now nearly every healthcare facility uses a whole range of specialist field-specific
systems. Nevertheless, subsequent implementations of new systems often fail to yield the expected
results for the health service unit because of fundamental problems with integrating the information
handled by the systems. The effects of implementing digitization tools vary so widely that the
probability of failure is higher than that of success [12]. The most critical factors that impact
the effectiveness of a BI system are: output information accuracy, conformity to the requirements,
and support of organizational efficiency [13]. One of the ways to optimize internal processes in
hospitals is to introduce a standardized format of data recording and automation of data acquisition
and exchange methods that would not require human intervention [14].
Investments in IT systems usually have a clear priority – to introduce additional
functionalities. However, the functionality of a platform on which data could be exchanged among
different systems (or their modules) is not readily available in the market of IT solutions. This is
where a need arose for the introduction of our own method to solve the problems connected with
such integration of information processed by IT systems operated now and in the future. The
solution was to launch a Data Exchange Structure (DXS), which enables the multi-party exchange
of data and organized collection of data transferred through the DXS. The acquisition of collected
data for the Data Warehouse, as another party in the data exchange system, was just the logical
next step.
b. Principles of interaction between data exchange systems and the DXS
The following principles were established for the systems which were to exchange data with
the Data Exchange Structure:
• Each system operates using its own native data structures.
• Due to the conditions in which the systems operated in the organization, in the event of
any change in the data exchanged between the systems, the change must be propagated
immediately.
• Each system handling data acquired from an external system makes its own data reception
mechanism accessible to receive notifications of changes in the data. The mechanism should
provide the functionality of recording data acquired from external systems in its native structures.
• In order to notify an external system of a change in the data of interest to the external
system, each system, having modified its internal data set, initiates the reception mechanism of the
external system.
Studies & Proceedings of Polish Association for Knowledge Management No. 58, 2012
For example, if an X system is interested in data from an external Y system, a change made in
the data will require the Y system to notify the X system of the change by triggering the reception
mechanism of the X system.
The role of the Y system is to prepare its native data correctly for the reception mechanism of
the X system and to initiate the mechanism. If the data is accepted by the X system, the role of the
Y system ends when it registers the delivery of the information in the log. The information may
only be rejected by the X system if an irregularity in the data exchange is detected. In a situation
like this, the X system is required to communicate the reason for rejection. Both X and Y systems
must register all errors.
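The notification protocol described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used at the Oncology Centre: the names `System`, `Receipt`, `receive`, and `notify`, the record layout, and the single validation rule (a record must carry an `id`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Receipt:
    accepted: bool
    reason: str = ""  # filled in only when the data is rejected


@dataclass
class System:
    """Hypothetical data-exchange party with a native store and two logs."""
    name: str
    store: dict = field(default_factory=dict)         # native data structures
    delivery_log: list = field(default_factory=list)  # confirmed deliveries
    error_log: list = field(default_factory=list)     # every error is registered

    def receive(self, sender: str, record: dict) -> Receipt:
        """Reception mechanism: validate, then record in native structures."""
        if "id" not in record:  # assumed example of an exchange irregularity
            reason = "missing record id"
            self.error_log.append((sender, reason))
            return Receipt(False, reason)
        self.store[record["id"]] = dict(record)
        return Receipt(True)

    def notify(self, receiver: "System", record: dict) -> Receipt:
        """Sender side: trigger the receiver's mechanism and log the outcome."""
        receipt = receiver.receive(self.name, record)
        if receipt.accepted:
            self.delivery_log.append((receiver.name, record["id"]))
        else:
            self.error_log.append((receiver.name, receipt.reason))
        return receipt


# System Y changes its data and notifies system X.
x, y = System("X"), System("Y")
ok = y.notify(x, {"id": 1, "procedure": "biopsy"})
bad = y.notify(x, {"procedure": "biopsy"})  # rejected, with a stated reason
```

In this sketch the rejection reason travels back in the return value, so both parties can register the same error without a second call.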
The principles above refer to the interaction of any data-exchange parties (shown as X and Y
systems in Fig. 1) and of the Data Exchange Structure.
Figure 1. Standard data exchange between individual systems and the DXS
Source: own study.
The DXS makes it possible to exchange data down to the detail of data elements supported by
individual parties, which participate in the exchange. In this way, external conditions and
changeable standards do not require any modifications of the internal structures of the data
handled by the parties. As a result, for example, medical procedures can be defined to a greater
degree of detail than in standard classification methods.
c. Databases
The environment in which the analyzed data is stored and processed is based on the MS SQL
2008 Database Engine. A few dozen databases from the production server were analyzed, and the
number of patients registered in the system exceeded four hundred thousand. More than four
million internal procedures were registered for the patients.
d. Business Intelligence (BI) tools
In this project, the data was analyzed in more than sixty data bases on an OLTP (OnLine
Transaction Processing) server operating in the MS SQL 2005 environment. In order to improve
the usefulness of the databases, the following BI services were applied: MS SQL 2008 SSIS (SQL
Server Integration Services), SSAS (SQL Server Analysis Services), and – for presentations –
Microsoft PowerPivot. The choice of BI tools was determined by the desire to ensure coherence of
the environment for OLTP and OLAP servers, as well as by the price factor. OLAP is a new
model of data processing designed to support data analysis processes. In this model, it is
possible to analyze data in multiple user-defined ‘dimensions’ (time, place, product classification,
etc.). The analysis consists in calculating aggregates for defined dimensions. It should be
underlined that the whole analytical process is controlled by the user [15].
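The idea of calculating aggregates for user-chosen dimensions can be illustrated with a few lines of plain Python. This is a sketch of the concept only, not of SSAS: the fact rows, the dimension names (`year`, `stage`, `category`), and the helper `aggregate` are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: one row per group of medical events (made-up values).
facts = [
    {"year": 2006, "stage": "I", "category": "surgery", "days": 2},
    {"year": 2006, "stage": "I", "category": "chemotherapy", "days": 5},
    {"year": 2006, "stage": "II", "category": "surgery", "days": 3},
    {"year": 2007, "stage": "I", "category": "surgery", "days": 1},
]


def aggregate(rows, dimensions, measure):
    """Sum `measure` for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The user decides which dimensions to analyze; the data stays the same.
by_year = aggregate(facts, ["year"], "days")
by_year_stage = aggregate(facts, ["year", "stage"], "days")
```

An OLAP server precomputes and stores such aggregates in cubes, whereas this sketch recomputes them on every call.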
Despite the substantial degree of integration of the production server, a problem of data
heterogeneity occurred in the project, which principally resulted from diverse needs and
applications of the specialist systems. Another difficulty was encountered because of continuous
changes in medical and auxiliary systems. The application of a special analytical tool, dedicated to
the problem, enabled the smooth acquisition of data and the feeding of OLAP cubes from various
heterogeneous data sources.
2.1. Categories of fundamental problems and applied solutions
a. Patients’ survival as a measure of successful treatment and a consequence of medical
events
• The survival time is calculated as the time from the day on which the cancer was first
diagnosed to the date of the last recorded medical event (including the patient’s death, if
applicable). There is also an option to use an automatic time marker, as in the paragraph below.
• In the case that a patient does not report for check-ups and has not died, and the time from
the patient’s last visit to the Oncology Centre to the date of the current analysis is longer than the
assumed Δt, an automatic time marker is used (current date – Δt), equivalent to a medical event.
• The assumed Δt should ensure that the death of a patient is properly reflected in the Data
Warehouse (DW).
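The survival-time rule above can be written down as a short Python function. It is a sketch under assumptions: the function name `survival_days`, its parameters, and the choice to treat a recorded death as the final event are introduced here for illustration, and `max_gap` stands for the assumed interval after which the automatic time marker is used.

```python
from datetime import date, timedelta
from typing import Optional


def survival_days(diagnosis: date, last_event: date, death: Optional[date],
                  today: date, max_gap: timedelta) -> int:
    """Days from the first diagnosis to the end point defined in the text."""
    if death is not None:
        end = death                    # death counts as the last medical event
    elif today - last_event > max_gap:
        end = today - max_gap          # automatic time marker
    else:
        end = last_event               # patient is still reporting for visits
    return (end - diagnosis).days


one_year = timedelta(days=365)
lost = survival_days(date(2006, 1, 1), date(2006, 3, 1), None,
                     date(2010, 1, 1), one_year)
died = survival_days(date(2006, 1, 1), date(2006, 3, 1), date(2007, 1, 1),
                     date(2010, 1, 1), one_year)
recent = survival_days(date(2006, 1, 1), date(2006, 3, 1), None,
                       date(2006, 6, 1), one_year)
```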
b. Comparability of analyzed relations
• The principal requirement for the analysis, provided the data is correct, is to maintain its
comparability. This should be ensured through a selection of resources in terms of the kind and
stage of disease at the time when treatment is attempted.
For example, for the purpose of presentation, a group of female patients with diagnosed C50.*
(all forms of breast cancer) was selected at the OC. They were divided by years (2006–2009) and
stage of cancer development (0 – IV).
• The use of a relative time scale
In order to maintain the comparability of events concerning different patients who start
treatment at different times, a relative time scale was introduced.
o n – number of days (n = 50 in the presentation), which enables an analysis of the actions
preceding the TNM classification (e.g. surgery and pathomorphological examinations).
o The first day on this scale is a relative start date, defined as the day on which the first
medical event took place, recorded from day n, being earlier than the date of the identification of
the TNM code, which determines the stage of development of the cancer in the OC medical
system.
o If the assumed start date is the date of a patient’s first visit to the OC, these could be the
dates of examinations or procedures, which are unrelated to the cancer, much earlier than the
period of our interest. A similar situation happens if the start date is the day on which C50.* was
diagnosed (e.g. occurring earlier than the date of the patient’s first visit to the OC).
o All dates are referenced as days calculated from the start date (including the date of death,
if applicable).
o Wherever the patient’s age is indicated in the analysis, it is related to the date.
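One possible reading of the relative time scale is sketched below in Python. The interpretation is an assumption: the start date is taken to be the earliest medical event that falls no more than n days before the date of the TNM code, the start date is numbered as day 0, and the TNM date is used as a fallback when no event falls in the window. The function names are hypothetical.

```python
from datetime import date, timedelta


def relative_start(event_dates, tnm_date, n=50):
    """Earliest event no more than n days before the TNM date (assumed reading)."""
    window_start = tnm_date - timedelta(days=n)
    in_window = [d for d in event_dates if d >= window_start]
    # Fallback (assumption): with no event in the window, start at the TNM date.
    return min(in_window) if in_window else tnm_date


def to_relative_days(event_dates, start):
    """Express dates as day offsets from the start date; earlier events are dropped."""
    return [(d - start).days for d in sorted(event_dates) if d >= start]


tnm = date(2007, 3, 1)
events = [date(2005, 6, 1),   # unrelated visit, long before the cancer treatment
          date(2007, 1, 20), date(2007, 2, 10), date(2007, 4, 1)]
start = relative_start(events, tnm)
offsets = to_relative_days(events, start)
```

The visit in 2005 falls outside the 50-day window and is therefore ignored, which matches the motivation given for not using the first visit to the OC as the start date.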
c. Method of presentation of analyzed relations
• Event categories
For analytical purposes, the following categories of medical events (actions) were identified:
– Check-up and/or diagnosis,
– Surgical procedures,
– Chemotherapy,
– Radiotherapy,
– Other medical actions, not listed above, mainly including rehabilitation.
The events were defined on a time scale with an accuracy of one day of the patient’s stay at
the OC.
The date of determination of the stage of cancer was also marked on the time scale along with
the date of the patient’s death, if applicable.
• Frequency of occurrence
In order to analyze the accomplishment of specific categories of medical events by the
hospital, the presentation includes the number of days on which the events took place during the
patient’s stay at the OC.
The number of days can be represented as the length of the green bar on the charts, or the days
can be directly marked on the time scale.
• Time frame of the events
For the purpose of presenting the time frame in which medical events of a specific category
took place, a red horizontal bar was used on the time scale, the length of which represents the
interval in days between the first and the last event in the given category over the considered
period.
• Time scale
The time scale is determined according to needs; therefore it is possible to use a time scale
with the days in the treatment process, on which specific medical procedures were carried out.
Also, an analysis of the sequence of their occurrence is possible (i.e. the procedures that precede
and follow a given procedure).
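The two measures described above, the number of days with events in a category and the span between the first and the last event, can be computed as in the following Python sketch. The input layout (pairs of category and relative day) and the function name `summarize` are assumptions made for the example.

```python
from collections import defaultdict


def summarize(events):
    """events: iterable of (category, relative_day) pairs for one patient."""
    days = defaultdict(set)
    for category, day in events:
        days[category].add(day)  # one-day accuracy: a repeated day counts once
    return {
        category: {
            "days_with_events": len(ds),     # frequency of occurrence
            "span_days": max(ds) - min(ds),  # time frame of the events
        }
        for category, ds in days.items()
    }


# Made-up events for a single patient.
patient_events = [
    ("surgery", 10), ("surgery", 10), ("surgery", 310),
    ("chemotherapy", 40), ("chemotherapy", 61), ("chemotherapy", 82),
]
summary = summarize(patient_events)
```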
2.2. Selected presentations
a. Patient’s survival rates – a possibility to use statistical measures
a) – (upper chart) the use of the median and quartiles for the number of survival days in
patients with C50.* and stage IIa
b) – (lower chart) the use of the median to compare patients with C50.* at different stages of
development (except for stage 0 which was hardly represented)
Figure 2. Selected presentations: Patient’s survival rates – a possibility to use statistical
measures a) – (upper chart) the use of the median and quartiles for the number of survival days in
patients with C50.* and Stage IIa. b) – (lower chart) the use of the median to compare patients
with C50.* at different stages of development.
Source: own study.
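The statistical measures named in the caption can be reproduced with Python's standard `statistics` module, as sketched below. The survival times are made-up numbers, and the paper does not state which quartile definition was used, so the default "exclusive" method of `statistics.quantiles` is an assumption.

```python
from statistics import median, quantiles

# Hypothetical survival times in days, grouped by stage (made-up numbers).
survival_by_stage = {
    "IIa": [100, 200, 300, 400, 500, 600, 700],
    "IIIa": [150, 250, 350],
}


def describe(days):
    """Median and quartiles of the number of survival days."""
    q1, _, q3 = quantiles(days, n=4)  # default "exclusive" method (assumption)
    return {"q1": q1, "median": median(days), "q3": q3}


stage_summary = {stage: describe(days) for stage, days in survival_by_stage.items()}
```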
b. Presentation of the course of treatment of female patients diagnosed with a C50.9 tumor
on a synchronized relative timeline.
Figure 3. Presentation of the course of treatment of female patients diagnosed
with a C50.9 tumor on a synchronized relative timeline
Source: own study.
A selection of treatment courses of three patients diagnosed with C50.9, presented on
a synchronized relative timeline, shows different patterns in the course of treatment, rehabilitation,
and the time during which the patients remained under medical care (on the diagnostics bar).
In the first case, surgical events were spread on the time scale over a total of approximately 300
days. This kind of information can be identified by selecting the option of the span of specific
medical events (surgical procedures in the example above).
c. Example presentation of the quantity of medical events (various categories of actions)
with a consideration for the stage of development of C50.* (Fig. 4.)
The presentation enables the assessment of the level of involvement of the hospital in different
categories of medical actions, taking account of the stage of cancer.
The number of medical events (different categories of medical actions), represented by the
number of days on which these occurred, clearly shows the efficiency of treatment of breast cancer
at different stages of spread (the height of the blue bars versus the height of the yellow bars). It
also gives an idea of the involvement of the resources available to the OC in the process of
treatment of breast cancer at different stages.
Figure 4. Example presentation of the quantity of medical events (various categories of
actions) with a consideration for the stage of development of C50.*
Source: own study.
d. Presentation of the involvement of OC resources in 2006–2011 in the process of
treatment of patients diagnosed with C50.*, divided into surviving (blue bar) and deceased patients
(red bar) and different stages of cancer (Fig. 5.)
Figure 5. Presentation of the involvement of OC resources in 2006–2011 in the process of
treatment of patients diagnosed with C50.*, divided into surviving (blue bar) and deceased
patients (yellow bar) and different stages of cancer
Source: own study.
2.3. Types of analyses practicable using the currently functioning solution
a. Medical analysis
• The information resources of the Data Warehouse make it possible to narrow down
analyzed medical events to the applied medical procedures. This means that it is possible to select
– for defined conditions (dimensions) – all cases where:
– a specific procedure was applied,
– the procedure was applied within a time interval (Tx) following a significant event on the
time scale (including after another specific procedure),
– a criterion of no later than or no earlier than is used.
• Each case selected for analysis can be evaluated from the point of view of the outcome
(patient survival, reoperations and recurrences).
• It is also possible to select combined treatment, which however requires a precise
identification of the course of events pertinent to the case.
• For selected events (e.g. the patient’s visit to the clinic for a routine checkup), it is
possible to determine whether patients report for checkups with the frequency recommended from
the medical perspective and what the consequences are.
• With this kind of information in hand, the hospital could monitor both the patients and its own
medical staff, as medically unjustified visits prevent patients who are genuinely threatened by the
disease from accessing the services.
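The selection of cases by a time interval Tx after a significant event can be sketched in Python as follows. Everything here is an illustrative assumption: the event layout (pairs of relative day and event name), the function name `matches`, and the choice to measure Tx from the first occurrence of the anchoring event to the first later occurrence of the procedure.

```python
def matches(events, procedure, anchor, tx, criterion="no_later_than"):
    """True if `procedure` follows `anchor` and satisfies the Tx criterion."""
    anchor_days = [day for day, name in events if name == anchor]
    if not anchor_days:
        return False
    start = min(anchor_days)
    gaps = [day - start for day, name in events
            if name == procedure and day > start]
    if not gaps:
        return False
    first_gap = min(gaps)  # first application of the procedure after the anchor
    if criterion == "no_later_than":
        return first_gap <= tx
    if criterion == "no_earlier_than":
        return first_gap >= tx
    raise ValueError(f"unknown criterion: {criterion}")


# Made-up cases: patient id -> events on the relative time scale.
cases = {
    "P1": [(0, "surgery"), (25, "chemotherapy")],
    "P2": [(0, "surgery"), (90, "chemotherapy")],
    "P3": [(0, "surgery")],
}
early = [p for p, ev in cases.items()
         if matches(ev, "chemotherapy", "surgery", 30)]
late = [p for p, ev in cases.items()
        if matches(ev, "chemotherapy", "surgery", 30, "no_earlier_than")]
```

Each selected case could then be joined with its outcome (survival, reoperation, recurrence) for evaluation.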
b. Economic aspects of treatments
• One of the basic problems faced by hospitals is the difficulty of
accounting for and settling medical services with the National Health Fund.
Allocating events related to individual patients for these purposes with the use of DW
resources and the functionalities of a BI solution is possible and already applied at the Oncology
Centre.
• Cost analysis at hospitals, like the settlement of services with the
NHF, is another fundamental administrative problem. The availability of
economic information in the DW makes it possible to analyze costs (including e.g. the problem of
underestimating costs of procedures, averaged costs of treatment of a patient in a defined time
frame, etc.).
• Besides cost control, it is possible to carry out different ongoing analyses, e.g. of the
loading (utilization) of specific resources, such as diagnostic equipment or beds.
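As an illustration of one of the cost measures mentioned above, the averaged cost of treating a patient in a defined time frame could be computed as in the Python sketch below. The row layout (patient, relative day, cost), the made-up amounts, and the decision to average only over patients who incurred costs in the frame are assumptions.

```python
from collections import defaultdict


def average_cost_per_patient(cost_rows, first_day, last_day):
    """Mean total cost per patient over the days first_day..last_day inclusive."""
    totals = defaultdict(float)
    for patient, day, cost in cost_rows:
        if first_day <= day <= last_day:
            totals[patient] += cost
    # Patients without any cost in the frame are left out (assumption).
    return sum(totals.values()) / len(totals) if totals else 0.0


# Made-up cost records: (patient id, relative day, cost).
cost_rows = [
    ("A", 1, 100.0), ("A", 5, 50.0),
    ("B", 3, 250.0), ("B", 40, 999.0),
]
avg_first_month = average_cost_per_patient(cost_rows, 0, 30)
```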
3. Conclusions
The presentation of the implemented concept of the application of Business Intelligence
solutions in a health service organization, intended to determine the correlations between medical
events and their results, allows us to formulate a basic conclusion that such solutions, structurally
adapted to provide business analysis, may well be applied in the medical sector, and even more so
where business and health services overlap. It also concerns the data structures in transactional
data sets, which occur on the borderline of medical and economic and organizational problems.
The condition precedent for the task identified in the title is a competent preparation of the Data
Warehouse environment, which would guarantee that the information resources are reliable and
proper for the intended analyses. The systems from which the information would be acquired by
the DW should meet these requirements as well.
The use of DW resources and taking advantage of the possibilities offered by the BI tool
implemented at the Oncology Centre in Bydgoszcz have only been hinted at. The tool was
developed to link individual classifications, to visualize correlations, and to find common indirect
connections. As a next step, the authors plan to develop a tool, a so-called “knowledge cube”, which
will register, integrate, and present knowledge to the user. At the same time, it will minimize
integration services and make it possible to find the shortest links between individual objects, or to
establish that no such link exists. An organized register of knowledge is indispensable for
successful knowledge management.
A data warehouse and a BI solution, adapted to the needs and capabilities of any
other kind of hospital, can also be put to practical use there, offering advantages and prospects of development.
Bibliography
[1] Bies, G. Business Intelligence w ochronie zdrowia. Wydawnictwo Uniwersytetu
Śląskiego, City, 2008.
[2] Frączkiewicz-Wronka, A., Austin, A. Od zarządzania informacją do tworzenia wiedzy –
zastosowanie ICT w organizacjach sektora zdrowotnego. Wydawnictwo Uniwersytetu
Śląskiego, City, 2008.
[3] Gramacki A., Gramacki J., Nowa metoda grupowania danych koszyka sklepowego.
Przegląd Telekomunikacyjny, rocznik LXXXI, nr 6/2008.
[4] MZ. Strategia e-Zdrowie Polska 2009–2015. Ministerstwo Zdrowia, 2009.
[5] Mullins, I. M., Siadaty, M. S., Lyman, J., Scully, K., Garrett, C. T., Greg Miller, W.,
Muller, R., Robson, B., Apte, C., Weiss, S., Rigoutsos, I., Platt, D., Cohen, S., Knaus, W.
A. Data mining and clinical data repositories: Insights from a 667,000 patient data set.
Computers in Biology and Medicine, 36, 12, 2006, pp. 1351–1377.
[6] Gawrylczyk A., Zastosowanie i znaczenie technologii “data mining” w bankowości. III
Sympozjum Naukowe SKN „Economicus”, 11–12.02.2008 Przesieka k/Jeleniej Góry.
[7] Kotarski, D., Patients cost accounting as a source of managerial information in spa
facilities, Studies and Proceedings of Polish Association for Knowledge Management vol.
39, pp. 123–130.
[8] Imielinski T., Mannila H., A Database Perspective on Knowledge Discovery, Comm. of
the ACM, Vol. 39, No. 11, November 1996.
[9] Stanisławski W., Szydłowska E., Analiza narzędzi Data Mining ORACLE 10g do
klasyfikacji komórek nowotworowych w cytometrycznym systemie skaningowym. XII
Konferencja użytkowników i deweloperów ORACLE, 17–20.10.2006 Zakopane –
Kościelisko.
[10] Fronczak, E. and Michalcewicz, M. Zastosowanie narzędzi eksploracji danych Data
Mining do tworzenia modeli zarządzania wiedzą. Studia i Materiały Polskiego
Stowarzyszenia Zarządzania Wiedzą, 27, 2010, pp. 126–139.
[11] Lewandowski R., Łagodziński M., Fronczak E., Data Warehouse implementation to
support knowledge management; a case study of F. Łukaszczyk Oncology Centre in
Bydgoszcz, Poland, Studies and Proceedings of Polish Association for Knowledge
Management vol. 37, 2011, pp.186–195.
[12] Back T. Adaptive business intelligence based on evolution strategies software. An
International Journal of Information Sciences, Vol. 148, (1–4), 2002, pp. 113–121.
[13] Lin, Y.H., Tsai, K.M., Shiang, W.Y., Kuo, T.C., Tsai, C.H., Research on using ANP to
establish a performance assessment model for business intelligence systems. An
International Journal of Information Sciences, Vol. 36, (2), Part 2, 2009, pp. 4135–4146.
[14] Gawrońska-Błaszczyk A., Global GS1 Standard for the sake of the improvement of
hospital efficiency. World best practices. Studies and Proceedings of Polish Association
for Knowledge Management vol. 39, pp. 35–46.
[15] Morzy, T. Eksploracja danych: problemy i rozwiązania, V Konferencja PLOUG 1999, pp.
1–10.
THE APPLICATION OF BUSINESS INTELLIGENCE TOOLS
IN A HEALTH SERVICE ORGANIZATION
Summary
The informatization of the health service, together with the skilful use of information
collected in medical databases, plays a key role in raising the standards of
treatment and in the process of knowledge management in an organization.
The paper analyzes a concept for applying Business Intelligence tools
in the health service in order to establish the relations between medical events
and their outcomes. It describes the key elements of building a Data Exchange
Structure (DXS) and a Data Warehouse (DW), and of applying OLAP
(Online Analytical Processing) technology.
Keywords: Business Intelligence, Data Warehouse, OLAP, Data Mining, Medical Databases
Remigiusz Lewandowski
Management Information Systems
University of Technology and Life Sciences in Bydgoszcz
ul. Fordońska 430, 85-790 Bydgoszcz, Poland
e-mail: [email protected]