

Journal of Computing and Information Technology - CIT 21, 2013, 3, 171–183
doi:10.2498/cit.1002194


Research and Pedagogy in Business Analytics: Opportunities and Illustrative Examples

Ramesh Sharda, Daniel Adomako Asamoah and Natraj Ponna
Institute for Research in Information Systems, Spears School of Business, Oklahoma State University, USA

In recent times, business analytics and big data have gained momentum both in industry practice and academic research. The objective of this paper is to provide both a research and a teaching introduction to business analytics, covering current and prospective perspectives on the domain. It begins by providing a quick overview of the three types of analytics. To assist future analytics professionals, we identify nine different groups of participants in the analytics industry and discuss them as clusters. We then include a brief description of some current research projects under way in our team. We also note some research opportunities in Big Data analytics. The paper concludes with a discussion of teaching opportunities in analytics.

Keywords: business analytics, Big Data, social media research, athletic injuries, healthcare

1. Introduction

Although many authors and consultants have defined it slightly differently, analytics can be viewed as the process of developing recommendations for actions or actionable decisions based upon insights generated from historical data. The Institute for Operations Research and the Management Sciences (INFORMS) has created a major initiative to organize and promote analytics. According to INFORMS, analytics represents the combination of computer technology, management science techniques, and statistics to solve real problems. Of course, many other organizations have proposed their own interpretations and motivation for analytics. For example, SAS Institute Inc. proposed eight levels of analytics that begin with standardized reports from a computer system. These reports provide a sense of what is happening with an organization. Additional technologies have enabled us to create more customized reports that can be generated on an ad hoc basis. The next extension of reporting takes us to online analytical processing (OLAP) type queries that allow users to dig deeper and determine specific sources of concern or opportunities. Technologies available today can also automatically issue alerts for decision makers when performance issues warrant such alerts. At a consumer level, we see such alerts for weather or other issues. But similar alerts can also be generated in specific settings, for example, when sales rise above or fall below a certain level within a certain time period, or when the inventory for a specific product is running low. All of these applications are made possible through analysis and queries on data being collected by an organization. The next level of analysis entails statistical analysis to better understand patterns, which can be used to develop forecasts or models for predicting how customers might respond to a specific marketing campaign or ongoing service/product offering. When an organization has a good view of what is happening and what is likely to happen, it can also employ other techniques to make the best decisions under the circumstances. These eight levels of analytics are described in more detail in a white paper by SAS (http://www.sas.com/news/sascom/analytics_levels.pdf) [13].

Instead of eight levels, INFORMS defines three categories of analytics, based on the idea of looking at all the data to understand what is happening, what will happen, and how to make the best of it. These three levels are identified (http://www.informs.org/Community/Analytics) as Descriptive, Predictive, and Prescriptive [5]. Figure 1 presents a graphical view of these three levels of analytics. The interconnected circles show that there is some overlap across these three types of analytics. A more detailed description of each of these categories follows.

Figure 1. Three types of analytics.

Descriptive or reporting analytics refers to knowing what is happening in the organization and understanding some underlying trends and causes of such occurrences. This involves consolidation of data sources and availability of all relevant data in a form that enables appropriate reporting and analysis. From this data infrastructure, we can develop appropriate reports, queries, alerts, and trends using various reporting tools and techniques. In our view, this has been the key focus of business intelligence (BI). A significant technology that has become a key player in this area is visualization. Using the latest visualization tools in the market, we can now develop powerful insights into the operations of an organization.

Predictive analytics aims to determine what is likely to happen in the future. This analysis is based on statistical techniques as well as other more recently developed techniques that fall under the general category of data mining. The goal of these techniques is to be able to predict if the customer is likely to switch to a competitor (“churn”), what a customer is likely to buy next and in what quantity, what promotion a customer would respond to, whether this customer is a credit risk, etc. A number of techniques are used in developing predictive analytical applications including various classification algorithms. For example, we can use classification techniques such as decision tree models and neural networks to predict how well a motion picture would do at the box office. We can also use clustering algorithms for segmenting customers into different clusters to be able to target specific promotions to them. Finally, we can use association mining techniques to estimate relationships between different purchasing behaviors. That is, if a customer buys one product, we can predict other items the customer is likely to purchase. Such analysis can assist a retailer in recommending or promoting related products. For example, any product search on Amazon.com results in the retailer also suggesting other similar products that a customer may be interested in.
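To make the association mining idea concrete, the following minimal Python sketch computes three commonly used rule measures (support, confidence, and lift) for a single rule of the form "a customer who buys A also buys B". The baskets and item names are invented for illustration and do not come from any data set discussed in this paper; the function name `rule_metrics` is likewise hypothetical.

```python
# Hypothetical toy baskets; the item names are invented for illustration only.
baskets = [
    {"camera", "memory card", "tripod"},
    {"camera", "memory card"},
    {"camera", "bag"},
    {"memory card", "bag"},
    {"camera", "memory card", "bag"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    n_a = sum(1 for b in baskets if antecedent in b)
    n_c = sum(1 for b in baskets if consequent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)

    # Support: share of all baskets that contain both items.
    support = n_both / n
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = n_both / n_a if n_a else 0.0
    # Lift: confidence relative to the consequent's overall frequency.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


support, confidence, lift = rule_metrics(baskets, "camera", "memory card")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 would suggest that the two items co-occur more often than expected if purchases were independent; in this toy data the lift is slightly below 1, so the rule would not be a strong basis for a recommendation.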

The third category of analytics is termed Prescriptive analytics. The goal of prescriptive analytics is to examine current trends and likely forecasts and use that information to make decisions. This group of techniques has historically been studied under the umbrella of operations research or management science and is generally aimed at optimizing the performance of a system. The goal here is to provide a decision or a recommendation for a specific action. These recommendations can be in the form of a specific yes/no decision for a problem, a specific amount (say, the price for a specific item or the airfare to charge), or even a complete set of production plans. The decisions may be presented to a decision maker in a report, or may be used directly in an automated decision rules system (e.g., in airline pricing systems). Thus, these types of analytics can also be termed Decision or Normative Analytics.
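As a deliberately small illustration of a prescriptive recommendation of the "specific amount" kind, the sketch below picks the revenue-maximizing price from a short list of candidates. The linear demand curve, its parameters, and the function name `best_price` are assumptions made purely for this example; real pricing systems rely on estimated demand models and far richer optimization formulations.

```python
def best_price(candidate_prices, base_demand, slope):
    """Return the candidate price with the highest revenue.

    Assumes a hypothetical linear demand curve:
    demand = base_demand - slope * price (never below zero).
    """
    def revenue(price):
        demand = max(0.0, base_demand - slope * price)
        return price * demand

    return max(candidate_prices, key=revenue)


# Hypothetical inputs: five candidate prices and an invented demand curve.
prices = [50, 75, 100, 125, 150]
choice = best_price(prices, base_demand=400, slope=2)
print("Recommended price:", choice)
```

The predictive step (estimating how demand responds to price) feeds the prescriptive step (choosing the action), which mirrors the overlap between the categories shown in Figure 1.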

Over the years, the use of analytics and its related fields has grown tremendously, and there is now a wide variety of industry participants that use analytics. They range from providers of data infrastructure, data warehousing solutions, middleware, data aggregation, and analytics software to analytics user organizations and academics. Research across all three categories of analytics also continues to flourish. Technology enhancements have enabled application of analytic techniques to gain insights from large amounts of both structured and unstructured data. This forms the basis of Big Data analytics.

This paper is organized as follows. In Section 2, we identify various participants in the analytics industry and group them into clusters. Section 3 discusses research in predictive analytics and introduces an example of a predictive analytics project the authors have worked on recently. Section 4 presents a discussion on research options in Big Data analytics. This section also includes a brief description of one of our research projects that employs Big Data analytics. Lastly, we discuss the role of academic programs and modules in advancing the cause of both research and practitioner development of business analytics. We note that this paper is an extension of a paper presented at the 35th International Conference on Information Technology Interfaces, Dubrovnik, Croatia, June 2013 [18].

2. Analytics Industry Participant Clusters

This section identifies various analytics industry players by grouping them into sectors. We note that the list of company names included is not exhaustive. These merely reflect our own awareness and mapping of companies’ offerings in the analytics industry space. Additionally, the mention of a company’s name or its capability in one specific group does not mean that the specific activity is the only offering of that organization. We use these names simply to illustrate our descriptions of sectors.

Many other organizations exist in this industry. Our goal is not to create a directory of players or their capabilities in each space, but to illustrate to students considering a career in analytics that many different options exist for playing in the analytics industry. One can start in one sector, and move to another role altogether. We will also see that many companies play in multiple sectors within the analytics industry and thus offer opportunities for career transitions within the field both horizontally and vertically.

Figure 2 presents the analytics industry clusters, categorized into nine key sectors or clusters that together form an analytics ecosystem. The first five clusters can be broadly termed technology providers or accelerators. Their primary revenue comes from developing technology, solutions, and training that enable user organizations to employ these technologies in the most effective and efficient manner. The accelerators include academics and industry organizations whose goal is to assist both technology providers and users.

2.1. Data Infrastructure Providers

This cluster includes all major organizations that provide the hardware and software infrastructure that forms the basic foundation for data management solutions. Some of the most obvious players in this cluster are: companies providing hardware infrastructure for database computing like IBM, Dell, HP and Oracle; storage solution providers like EMC and NetApp; companies providing indigenous hardware and software platforms such as IBM and Teradata; data solution providers offering hardware and platform independent database management systems like the SQL Server family of Microsoft; specialized integrated software providers such as SAP; companies like Amazon and Salesforce.com providing full data storage solutions through cloud computing; the recent crop of companies in the big data space like Cloudera, Hortonworks, Alteryx, Pivotal and Hadapt; and companies providing infrastructure services and training to create Big Data platforms based on technologies such as Hadoop clusters, MapReduce, and NoSQL.

Figure 2. Analytics industry clusters.

2.2. Data Warehouse Industry

Companies with a data warehousing focus provide technology and services aimed at integrating data from multiple sources, thus enabling organizations to derive and deliver value from their data assets. They form the backbone for all the players in the analytics industry. Many companies in this space include their own hardware to provide efficient data storage, retrieval, and processing. Recent developments in this space include performing analytics on the data directly in memory. Companies such as IBM, Oracle, and Teradata are major players in this arena.

2.3. BI Platforms/Middleware Industry

The general goal of the middleware industry is to provide easy-to-use tools enabling reporting or descriptive analytics on enterprise-wide data. This forms a core part of BI or analytics employed at organizations. Examples of companies in this space include MicroStrategy, Plum, Oracle–Hyperion, SAP–BusinessObjects, IBM–Cognos and many others.

2.4. Data Aggregators/Distributors

These companies typically focus on a specific industry sector for specialized data collection, aggregation and distribution through their niche platforms and services. They build their relationships by providing the data to other industry players and enabling them to employ analytics. Some example companies include: Nielsen, which provides data sources to its clients on customer retail purchase behavior; Experian, which provides data on each household in the US; Omniture, which has developed technology to collect web clicks and share such data with its clients; and Google, which compiles data for individual websites and makes a summary available through Google Analytics services.

2.5. Analytics Focused Software Developers

Companies in this category have developed analytics software for general use with data that has been collected in a data warehouse or is available through one of the platforms identified earlier (including Big Data). This category can also include inventors and researchers in universities and other organizations that have developed algorithms for specific types of analytics applications. We can identify major industry players in this space along the three types of analytics outlined earlier.

2.5.1. Reporting/Analytics

Reporting or descriptive analytics is enabled by the tools available from the middleware industry players or by unique capabilities offered by focused providers. For example, Microsoft’s SQL Server BI toolkit includes reporting as well as predictive analytics capabilities. On the other hand, specialized software is available from companies such as Tableau for visualization. SAS also offers a Visual Analytics tool with similar capabilities. Likewise, there are many open source visualization tools.

2.5.2. Predictive Analytics

Perhaps the biggest recent growth in analytics has been in this category, resulting in a large number of companies that focus on providing both technical and management solutions for predictive analytics. Many statistical software companies such as SAS, SPSS, etc. embraced predictive analytics early on and developed the software capabilities as well as industry practices to employ data mining techniques as well as classical statistical techniques for analytics. IBM SPSS Modeler from IBM and Enterprise Miner from SAS are examples of tools used for predictive analytics. Other players in this space include KXEN, StatSoft, Salford Systems, and scores of other companies that may sell their software broadly or use it for their own consulting practices. Two open source platforms (R and RapidMiner) have also emerged as popular industrial strength software tools for predictive analytics, and there are companies that support training and implementation of these open source tools.

2.5.3. Prescriptive Analytics

Software providers in this category offer modeling tools and algorithms for optimization of operations, usually labeled management science/operations research (MS/OR) software. This field has had its own set of major software providers. IBM, for example, has classic linear and mixed integer programming software, IBM–ILOG, which provides prescriptive analysis services. Analytics providers such as SAS have their own OR/MS tools, such as SAS/OR. Other major players in this domain include companies such as AIMMS, AMPL, Frontline, GAMS, Gurobi, Lindo Systems, Maximal, and many others.

Of course, there are many techniques that fall under the category of prescriptive analytics, each having its own set of providers. For example, simulation software is provided by major companies like Rockwell (ARENA) and Simio. Similarly, Frontline offers tools for optimization with Excel spreadsheets. Decision analysis in multi-objective settings can be performed using tools such as Expert Choice. There are also tools from companies such as Exsys, XpertRule, and others for generating rules directly from data or expert inputs. Some new companies are evolving to combine multiple analytics models in the Big Data space. For example, Teradata Aster includes its own predictive and prescriptive analytics capabilities in processing big data streams.

2.6. Application Developers: Industry Specific or General

The organizations in this group use their analytical expertise, focusing on using solutions available from the data infrastructure, data warehouse, middleware, data aggregator, and analytics software providers to develop custom solutions for a specific industry. Thus, this industry group makes it possible for the analytics technology to be truly useful. Most major analytics technology providers like IBM, SAS, Teradata, etc., clearly recognize the opportunity to connect to a specific industry or client and offer analytic consulting services. Companies that have traditionally provided application/data solutions to specific sectors are now developing industry specific analytics offerings. For example, Cerner provides electronic medical records (EMR) solutions to medical providers and its offerings now include many analytics reports and visualizations. Similarly, IBM offers a fraud detection engine for the health insurance industry, and is working with an insurance company to employ its well-known Watson analytics platform in assisting medical providers and insurance companies with diagnosis and disease management. Another example of a vertical application provider is Sabre Technologies, which provides analytical solutions to the travel industry including fare pricing for revenue optimization, dispatch planning, etc.

This cluster also includes companies that have developed their own domain specific analytics solutions and market them broadly to a client base. For example, Acxiom has developed clusters for virtually all households in the US based upon all the data it collects about households from many different sources. Credit score and classification reporting companies (Equifax, Experian, TransUnion, etc.) also belong to this group. DemandTec (now owned by IBM) provides pricing optimization solutions in the retail industry. This field represents an entrepreneurial opportunity to develop industry specific applications. Examples of such companies and their activities are: Sense Networks, which employs location data for developing user/group profiles; X+1 and Rapleaf, which profile users on the basis of email usage; BlueCava, which aims to identify users through all device usage; and Simulmedia, which targets advertisements on TV on the basis of analysis of a user’s TV watching habits.


2.7. Analytics User Organizations

Clearly, this is the economic engine of the whole analytics industry. If there were no users, there would be no analytics industry. Organizations in every other industry, and of every size, shape, and location, are using analytics or exploring its use in their operations, trying to gain or retain a competitive advantage. These include organizations in the private sector, government, education, military, etc. Examples of uses of analytics in different industries are too numerous to list here. Also, specific companies are not identified in this section. Rather, the goal is to see what types of roles analytics professionals can play within a user organization.

Of course, the top leadership of an organization (Chief Information Officer, etc.) is critically important in applying analytics to its operations. Although not enough senior managers seem to subscribe to this view, the awareness of applying analytics within an organization is growing everywhere. Major organizations in every industry that we are aware of are hiring analytical professionals under a variety of titles; these analytics professionals are eventually moving into management positions and driving business decision making using insights derived by employing analytics.

2.8. Analytics Industry Analysts and Influencers

This cluster of the analytics industry includes three types of organizations or professionals. The first group is the set of professional organizations that provide advice to the analytics industry providers and users. Their services include marketing analyses, coverage of new developments, evaluation of specific technologies, and development of training and white papers. Examples of such players include organizations such as the Gartner Group, The Data Warehousing Institute, Forrester, McKinsey, and many of the general and technical publications and websites that cover the analytics industry.

The second group includes professional societies or organizations that also provide some of the same services, but are membership-based. For example, INFORMS, a professional organization, has now focused on promoting analytics. The Special Interest Group on Decision Support Systems (SIGDSS), a subgroup of the Association for Information Systems, also focuses on analytics. Most of the major vendors (e.g., Teradata, SAS, etc.) also have their own membership-based user groups promoting the use of analytics.

A third group of analytics industry analysts is what we call analytics ambassadors, influencers, or evangelists. They have conveyed their enthusiasm for analytics through their seminars, books, and other publications. Illustrative examples include Steve Baker, Tom Davenport, Charles Duhigg, Wayne Eckerson, Bill Franks, Malcolm Gladwell, Claudia Imhoff, Bill Inmon, and many others.

2.9. Academic Providers and Certification Agencies

Academic providers and certification agencies play a major role in fueling the analytics industry with the necessary human resources by preparing students with current analytics expertise. This cluster, made up of different programs in business schools (information systems, management, marketing, etc.), represents the academic programs that prepare professionals for the analytics industry. Other academic programs such as computer science, statistics, mathematics and industrial engineering offer courses that help train students for the analytics industry.

The certification programs established by technology providers such as IBM, MicroStrategy and SAS aim primarily to give students the set of tools needed to solve real world problems in the industry. Accompanying exercises and laboratory assignments are mostly adapted from actual implementations in businesses. Similarly, INFORMS has recently introduced a Certified Analytics Professional (CAP) certificate program that is aimed at testing an individual’s general analytics competency.

Collaborative initiatives also exist between formal academic institutions and technology providers that help students develop a strong theoretical background in analytic concepts, sharpen their problem formulation techniques and solve real world problems. Among other targets, these collaborative initiatives are aimed at helping students develop better tools, procedures and algorithms through research activities. The outputs of such research activities are then translated into both rigorous and practically sound solutions to real world problems encountered by businesses.

3. Predictive Analytics Research Examples

Research in applying predictive analytics has been growing exponentially over the last few years. The first wave of the research focused on developing algorithms in major categories of predictive analytics: classification, clustering, association mining, etc. These included decision trees, apriori algorithms, neural networks, and hundreds of other algorithms and variants. These algorithms have been applied in thousands of exploratory projects. A catalog of such research projects would fill a book; hence we only illustrate these opportunities by providing a brief description of one recent project applying predictive analytics in healthcare and sports.

3.1. Analyzing Athletic Injuries

Any athletic activity is prone to injuries, and if the injuries are not handled properly, the performance of athletic teams could suffer. Using analytics to understand injuries can yield valuable insights that would enable the coaches and the medical team to manage the team composition, understand the player profiles and ultimately make better decisions on the availability of players.

In an exploratory study, our research team analyzed injuries related to American football using reporting and predictive analytics. The project followed the CRISP-DM methodology: understanding the business problem of making recommendations on managing injuries, understanding the various data elements collected about injuries, cleaning the data, performing relevant imputations for the missing data, developing visualizations to draw various inferences, building predictive models to analyze the injury healing time period, and drawing sequence rules to predict the relationship among the injuries and the various body parts that were afflicted with injuries.

The injury data set consisted of over 560 football injury records that include injury-specific variables: body part, body site, laterality, action taken, severity, injury type, injury start date and healing date. It also included player identity and other sport specific variables such as position played, activity, onset, and game location. Based on the injury start date and the healing date, healing time was calculated for each record and classified into different sets of time periods: 0-1 month, 1-2 months, 2-4 months, 4-6 months and 6-24 months. Only records with a valid recovery time could be used, which limited the final data available for modeling to 384 records.

Figure 3. Injury recovery period at various positions.
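The derivation of the healing-time categories described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function name `healing_category`, the average month length of 30.44 days, and the choice to treat each upper bound as inclusive are all assumptions, since the paper does not specify how months were counted or how boundary cases were assigned.

```python
from datetime import date

# Upper bounds (in months) of the healing-time bins named in the text.
BINS = [(1, "0-1"), (2, "1-2"), (4, "2-4"), (6, "4-6"), (24, "6-24")]

# Assumption: average month length; the paper does not say how months were counted.
DAYS_PER_MONTH = 30.44


def healing_category(start, healed):
    """Return the healing-time bin label, or None if no valid recovery time exists."""
    # Records with a missing or inconsistent date have no valid recovery time.
    if start is None or healed is None or healed < start:
        return None
    months = (healed - start).days / DAYS_PER_MONTH
    # Assumption: each upper bound is inclusive.
    for upper, label in BINS:
        if months <= upper:
            return label
    return None  # longer than 24 months: outside the period of interest


# Hypothetical records, for illustration only.
print(healing_category(date(2012, 9, 1), date(2012, 9, 15)))  # two weeks
print(healing_category(date(2012, 9, 1), date(2012, 12, 1)))  # about three months
print(healing_category(date(2012, 9, 1), None))               # no healing date
```

Records for which the function returns `None` would be excluded before modeling, which corresponds to the reduction from the full set of injury records to those with a valid recovery time.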

We began with descriptive analytics. Various visualizations were built to draw inferences, depicting the healing time period associated with players’ positions, severity of injuries and the healing time period, treatment offered and the associated healing time period, major injuries afflicting body parts, type of field and game, etc. Some of these are shown in the figures below.

Figure 3 depicts the period of recovery from injuries sustained by players at various positions. The injury recovery variable ranges are 0 to 1 week, 1 to 2 weeks, 2 to 3 weeks and 3 to 4 weeks. The figure shows that fifty percent of injuries are resolved in less than 2 weeks. Also, players in defensive positions are more likely to sustain injuries than players in other positions.

The visualization in Figure 4 indicates the various motions that lead to players’ injuries. The players are further segmented into different playing positions. It is seen from Figure 4 that tackling is the main cause of injuries in defensive players, such as the defensive secondary and linebacker positions, while running is the main cause of injuries in offensive players, such as the running back and wide receiver positions.

Our analysis included many more visualizations that enabled a better and richer understanding of the problem domain and the data. We then began to apply predictive modeling methods to understand the injury recovery rates. IBM SPSS Modeler was used for predicting each of the healing time period categories. The data was divided into training and validation data sets and several predictive models using techniques like CHAID, CRT and Neural Networks were built. Of all the techniques, a neural network using a multi-layer perceptron with a single hidden layer of 15 neurons yielded the best results.

Some of the predictor variables in the neural network model were: current status of injury, severity, body part, body site, type of injury, activity, event location, action taken and position played. The prediction results shown in Table 1 depict the accuracy of the model in predicting how long an athlete remains injured once an injury is sustained. The relevant values are the diagonal figures, which indicate the percent accuracy with which the recovery period for different periods of injuries could be predicted.

Figure 4. Motions at time of injury to various player positions.

For instance, injuries that take 2 to 4 months to heal can be correctly predicted to take that long, with an accuracy of 75%, based on the variables listed above. The overall percentage accuracy of the correct predictions of healing time period categories was 89.8%. The classification for the different healing periods was generated after initial analysis of the data set and in consultation with industry experts. The relevant injury period of interest with regard to college sports was chosen as a maximum of 2 years, since most players only start active game time participation in the last two years of being on the football team. In this preliminary study, a classification period in months was chosen in order to give a broader view of predicting injury recovery periods. In order to derive an in-depth understanding of the injury recovery pattern, further studies would drill down and consider injury recovery periods in weeks and days.

Observed      Predicted (months)
(months)      0-1       1-2       2-4       4-6       6-24

0-1           97.3%     2.2%      0.5%      0.0%      0.0%
1-2           44.8%     55.2%     0.0%      0.0%      0.0%
2-4           18.8%     6.2%      75.0%     0.0%      3.9%
4-6           16.7%     0.0%      0.0%      83.3%     0.0%
6-24          0.0%      0.0%      25.0%     0.0%      75.0%

Table 1. Classifications for healing category.
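The relationship between the diagonal of Table 1 and an overall accuracy figure can be sketched as below. The matrix values are those printed in Table 1 (the 2-4 row as printed sums to 103.9%, so the sketch does not assume rows total exactly 100%). The per-category record counts are not reported in the text; `HYPOTHETICAL_COUNTS` is an invented split of the 384 records, used only to show the computation, and the function names are hypothetical.

```python
LABELS = ["0-1", "1-2", "2-4", "4-6", "6-24"]

# Row-normalized confusion matrix as printed in Table 1
# (rows: observed category, columns: predicted category, values in percent).
TABLE_1 = [
    [97.3, 2.2, 0.5, 0.0, 0.0],
    [44.8, 55.2, 0.0, 0.0, 0.0],
    [18.8, 6.2, 75.0, 0.0, 3.9],
    [16.7, 0.0, 0.0, 83.3, 0.0],
    [0.0, 0.0, 25.0, 0.0, 75.0],
]


def per_class_accuracy(matrix):
    """Diagonal entries: percent of each observed category predicted correctly."""
    return [matrix[i][i] for i in range(len(matrix))]


def overall_accuracy(matrix, class_counts):
    """Average of the diagonal, weighted by the number of observed records per category."""
    total = sum(class_counts)
    return sum(matrix[i][i] * n for i, n in enumerate(class_counts)) / total


# Invented counts that sum to 384; the true split is not given in the text.
HYPOTHETICAL_COUNTS = [300, 40, 20, 12, 12]

diagonal = per_class_accuracy(TABLE_1)
unweighted = overall_accuracy(TABLE_1, [1, 1, 1, 1, 1])
weighted = overall_accuracy(TABLE_1, HYPOTHETICAL_COUNTS)
```

Because overall accuracy is a weighted average of the diagonal, an overall figure well above the unweighted mean of the diagonal is only possible when most records fall in the best-predicted category, here 0-1 month.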

Furthermore, we performed a sequence analysis with injuries and body parts as antecedents and consequents respectively. Players are most likely to suffer the same injury again (repeated injuries) to the same body part. It was also interesting to note that lower extremity injuries were followed by upper extremity injuries. Examples include injuries afflicting the hand when a player had already had an injury to the ankle, and injuries afflicting the shoulder when a player had had an injury to the femur. For instance, with a support of 25% and confidence of 22.5%, a strain to the femur would result in a consequent shoulder injury.
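The support and confidence of a sequence rule can be computed as in the following sketch. The injury histories are invented for illustration and the function names are hypothetical. Tools differ in whether "support" refers to the antecedent alone or to the full rule, and the text does not state which convention its figures use, so the sketch reports both.

```python
def occurs_in_order(history, antecedent, consequent):
    """True if `antecedent` appears in the history and `consequent` appears later."""
    if antecedent not in history:
        return False
    first = history.index(antecedent)
    return consequent in history[first + 1:]


def rule_metrics(histories, antecedent, consequent):
    """Support and confidence of the sequence rule antecedent -> consequent.

    Each history is one player's injuries in chronological order.
    """
    n = len(histories)
    with_antecedent = sum(1 for h in histories if antecedent in h)
    with_rule = sum(1 for h in histories if occurs_in_order(h, antecedent, consequent))
    return {
        "antecedent_support": with_antecedent / n,
        "rule_support": with_rule / n,
        "confidence": with_rule / with_antecedent if with_antecedent else 0.0,
    }


# Invented histories, one list per player.
histories = [
    ["femur strain", "shoulder injury"],
    ["femur strain", "femur strain"],
    ["ankle sprain", "hand injury"],
    ["shoulder injury", "femur strain"],
]
metrics = rule_metrics(histories, "femur strain", "shoulder injury")
```

In this toy data three of four players have a femur strain, but only one later has a shoulder injury, so the rule holds for a third of the players to whom it applies.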

Based on the analysis, several business recommendations were made, including the following: employing the services of a healthcare specialist at the onset of the injury instead of letting the training room staff screen the injured players; training players at defensive positions to improve their reflexes in order to avoid injuries; and taking precautionary measures against the future injuries that a player might suffer, based on the results of the sequence analysis.

This research project illustrates an interesting application of predictive modeling methods in athletic health issues. The next section describes an emerging area of research in analytics.

4. Big Data Research Opportunities

The bottleneck to knowledge generation and scientific progress is now an issue of effective data analysis rather than data availability. Contemporary lifestyles and technological advancements in most parts of the world generate huge amounts of data every hour, in the form of emails, weblogs, tweets, sensor-generated data, phone logs, GPS data, traffic data, video, and text. The data sources are a mixture of structured and unstructured formats. Hitherto, large volumes of structured data have mainly been collected, stored and mined using traditional relational data warehousing technologies. However, the fast growth of data has resulted in a new breed of analytics called Big Data analytics. Big Data analytics supports large-scale analysis of large volumes of historical and near real-time data of different varieties that need to be analyzed at a fast pace.

At the core of Big Data research are the business concepts, technologies and statistical techniques necessary to design highly scalable systems that can collect, process, store and analyze large volumes of both structured and unstructured data. One of the most popular platforms used to support processing of Big Data is an open source platform called Apache Hadoop. Apache Hadoop has two main components: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS, a distributed and scalable file system modeled on the Google File System, supports the storage of data and computations on multiple commodity computers. MapReduce is a high-level programming model that allows parallel and distributed algorithms to be written for processing massive sets of data on a cluster of computers. Besides these two components, the open source community has developed other sub-projects that help manage massive amounts of data efficiently. Some of these sub-projects are Hive, Pig, Mahout and HBase. For instance, Hive is a data warehouse platform with an SQL-like query language that supports data operations such as data summarization and complex queries of data sets. The sub-projects may have different or overlapping roles, yet they are all designed to facilitate efficient storage, manipulation and analysis of massive amounts of data. The Apache Hadoop platform is not a replacement for traditional relational database systems, but rather a complement that can better handle both structured and unstructured data sources.
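The MapReduce programming model can be illustrated with a small word count. The sketch below runs in a single Python process and does not use the Hadoop API; the function names are hypothetical. On a cluster, the framework would run many mappers and reducers in parallel and perform the grouping step between them.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values emitted for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = run_job(["big data", "Big analytics"])
```

Because each mapper sees only its own record and each reducer sees only the values for its own key, both phases can be distributed across machines without shared state.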

Just like other streams in the data analytics field, research in Big Data analytics is applicable to almost all fields of practice and study. In areas like healthcare, logistics, marketing, retail and defense, it has become imperative to explore and analyze large amounts of data in ways that support effective decision making and strategic management. Such research and practice is growing rapidly. The highly scalable platform used for Big Data analytics is necessitated by three fundamental characteristics of data sources: volume, velocity and variety. Big Data research looks at how to analyze data with such characteristics in different domains, in a way that generates deeper knowledge and adds value to the decision making process in businesses. In the following, we describe one project on which we are currently working.

4.1. Analyzing the Role of Social Media in Healthcare Delivery

One of the nascent research streams in Big Data analytics is the role of electronic social media platforms in healthcare delivery. Early studies have established that factors such as evolving patient demographics, financial constraints and longer incidence of chronic diseases have enhanced the impact of social relationships in healthcare delivery [15]. Social relationships, which are associative bonds established between individuals with common aspirations and for the purpose of giving and receiving support, foster disease self-management programs that are vital to the management of chronic diseases.

Even though healthcare delivery and support have traditionally been provided through channels such as physicians, nurses and other clinical experts, the advent of the Web 2.0 platform has ushered in an era where patients are beginning to incorporate new healthcare delivery strategies in the self-management of chronic diseases. In fact, some studies have suggested that traditional sources of health management information may no longer be enough to provide quality healthcare to patients [4]. Pharmaceutical companies and other healthcare-related businesses already market drugs and solicit consumer feedback from social networks about new drugs and disease management options.

Among the myriad health-related topics discussed on social media platforms is the management of chronic diseases such as diabetes, chronic obstructive pulmonary disease (COPD), major depressive disorder (MDD), epilepsy and cardiac failure [2]. These social media platforms offer social support and information to patients and caregivers. Social support has been identified as a major factor for improving health outcomes in patients who suffer from various types of illness [3][6]. Social support, sometimes also referred to as peer support, has been shown not only to help disease prevention, but also to foster and promote general well-being and health [15][14][7]. The concept of social relationships and how they can be established on electronic social media platforms in the context of the provision of healthcare support is still at an early stage in academic research. More research needs to be conducted to establish the role and effectiveness of social media platforms in healthcare delivery.

Previous studies have mostly investigated efforts made by healthcare institutions in using social media as a tool for healthcare management and information delivery [9]. Armstrong and Powell (2009) assert that patients value the additional support they receive from online communities and perceive online sources of information provided by lay people as a complement to information provided by professional healthcare providers [1]. In addition, studies have shown that peer-to-peer support creates an enabling environment for disease self-management. However, very few studies have investigated how disease management and healthcare support are initiated by patients in a social media domain. Patients like to be in contact with others with similar problems in order both to draw emotional support and to obtain information on how to manage diseases [1].

It has been established that social media platforms are a viable avenue for disease self-management. The question, however, still remains whether self-management programs through social media platforms are an effective means of managing chronic diseases. Disease self-management, especially for chronic diseases, can be an important aspect of the healthcare delivery process. Self-management programs are designed as a complement to other treatment methods to put patients in the driving seat and empower them to take control of their health care status. The aims of a self-management program include:

1. teaching the appropriate use of medications,

2. evaluating methods and use of new treatment methods,

3. engaging in physical exercises that help decrease instances of fatigue, isolation and pain,

4. learning better ways of communicating with family, peers and clinical healthcare givers.

There is a need to investigate the effect of peer support and information exchange on social media in self-managing chronic disease conditions in an environment that is not directly mediated by a healthcare institution or any clinical stakeholder. Sarasohn-Kahn (2008) reported that, with respect to quality, expert information provided by a large collection of users on a social media platform is known to be on par with what would be provided by any single expert on a subject matter [12]. The same is true for clinical information provided by other users on a social networking platform. Hence, besides the healthcare advice from expert clinicians such as doctors, the advice and support provided on social media platforms by non-expert patients is useful in helping patients attain their health goals.

As the focus of traditional health informatics shifts from health professionals to consumers, Big Data research assumes a vital role because it gives one the opportunity and flexibility to analyze extremely large amounts of data rather than just a small sample, and to make inferences and decisions on the use and delivery of healthcare on social media platforms. Big Data research also enables in-depth analysis of, and deeper conclusions from, data generated in other domains of study.

5. Teaching Perspectives

In any knowledge-intensive industry such as analytics, the fundamental strength comes from having students who are interested in the technology and choose that industry as their profession. Universities play a key role in making this possible. Academic programs that prepare professionals for the industry are increasing in number and type. These programs span various components of business schools such as information systems, marketing, management sciences, etc. Such programs are also emerging far beyond business schools to include computer science, statistics, mathematics, and industrial engineering departments across the world. These programs also include training of graphics developers who design new ways of visualizing information. Universities are offering undergraduate and graduate programs in analytics in all of these disciplines, though they may be labeled differently. A major growth frontier has been certificate programs in analytics to enable current professionals to re-train and re-tool themselves for analytics careers. Certificate programs enable practicing analysts to gain basic proficiency in specific software by taking a few critical courses. Power (2012) published a partial list of the graduate programs in analytics, but there are likely many more such programs, with new ones being added daily [11]. Watson (2013) describes how he teaches his analytics courses and also argues for the growing popularity of such programs [17]. Graduates from these programs fill such positions as data analysts, data scientists and data mining architects.

Another group of players assists in developing competency in the field of analytics. These are certification programs awarding certificates of expertise in specific software. Most (if not all) major technology providers (IBM, Microsoft, MicroStrategy, Oracle, SAS, Teradata, etc.) have their own certification programs. These certificates ensure that potential new hires have a certain level of tool skills. The Certified Analytics Professional (CAP) certificate program introduced by INFORMS has the main aim of equipping individuals with a general competency in data analytics. Any of these certifications gives a college student additional marketable skills.

Many of these industry leaders also have academic alliance programs to enable teaching their software and its uses. IBM, Microsoft, Oracle, SAP, and Teradata have major initiatives that allow faculty members to use and share software as well as other teaching materials for instructional purposes. Watson (2013) provides a summary of several of these resources [17]. For example, Teradata University Network (www.teradatauniversitynetwork.com), also known as TUN, is sponsored and supported by Teradata, a major provider of analytics hardware/software and applications consulting services. TUN is free to use and provides faculty members and students access to many case studies, white papers, teaching materials, and software for learning concepts, applications, and tools for analytics. Faculty members contribute such resources to TUN. Teaching notes are also available for many of the teaching materials.

A recent addition is a set of videos that depict analytics applications in realistic settings. These videos have been developed by Dr David Schrader of Teradata. They are labeled Business Scenario Investigations (BSI), a takeoff on the popular TV show CSI. Not only are these entertaining, but they also provide the class with some questions for discussion. For example, at http://www.teradatauniversitynetwork.com/teach-and-learn/library-item/?LibraryItemId=889, one can assume the role of an airline customer service center professional. A video on YouTube gives information about customers’ profiles and relationship with the airline. An incoming flight is running late, and several passengers are likely to miss their connecting flights. There are seats on one outgoing flight that can accommodate two of the four passengers. Students watch the video, pause it as appropriate, and answer the questions on which passengers should be given priority. They then resume the video to get more information. Students’ decisions might change as they learn more about those customers’ profiles. TUN also provides information about how the analysis was prepared in a link at the bottom of the video page. This engaging multimedia excursion provides an example of how additional information available through an enterprise data warehouse can assist in decision making. There are almost a dozen such videos and hundreds of other teaching resources available through TUN.

6. Conclusion

Overall, there is much to be excited about in the analytics industry at this point. Its applications stretch to almost all fields of practice and research, including healthcare, retail and manufacturing. Support for teaching analytics is available from many major vendors. The industry clusters of the analytics ecosystem show that there are many opportunities for students, academics and industry professionals. With increases in data availability and the realization that more business insights could be generated from an otherwise junk volume of data, research in Big Data has increased. This trend in Big Data analytics is expected to continue to grow and to complement traditional data analytics methods and existing data management strategies. The increase in the amount of attention analytics is receiving from the press also indicates that this is an expanding field which can provide a competitive edge to many types of businesses. As illustrated in this paper, descriptive, predictive and prescriptive analytics applications and research opportunities are wide and growing.

References

[1] N. ARMSTRONG, J. POWELL, Patient perspectives on health advice posted on Internet discussion boards: a qualitative study. Health Expectations, 12(3) (2009), 313–320.

[2] G. BOND, R. BURR, F. WOLF, M. PRICE, S. MCCURRY, L. TERI, The Effects of a Web-Based Intervention on the Physical Outcomes Associated with Diabetes among Adults Aged 60 and Older: A Randomized Trial. Diabetes Technology and Therapeutics, 9(1) (2007), 52–59.

[3] C.-L. DENNIS, Peer support for postpartum depression: volunteers’ perceptions, recruitment strategies and training from a randomized controlled trial. Health Promotion International, 28(2) (2013), 187–196.

[4] M. EVANS, L. DONELLE, L. HUME-LOVELAND, Social support and online postpartum depression discussion groups: A content analysis. Patient Education and Counseling, 87(3) (2012), 405–410.

[5] INFORMS, Analytics Section Overview. http://www.informs.org/Community/Analytics, Accessed February 2013.

[6] B. LAKEY, S. COHEN, Social support theory and measurement. In S. COHEN, L. G. UNDERWOOD, B. GOTTLIEB (EDS.), Social Support Measurement and Intervention: A Guide for Health and Social Scientists. Oxford University Press, Toronto, (2000).

[7] M. E. LARA, J. LEADER, D. N. KLEIN, The association between social support and course of depression: Is it confounded with personality? Journal of Abnormal Psychology, 106 (1997), 478–482.

[8] E. K. LEE, F. PIETZ, B. BENECKE, J. MASON, G. BUREL, Advancing Public Health and Medical Preparedness with Operations Research. Interfaces, 43(1) (2013), 79–98.

[9] M. P. MARTINASEK, A. D. PANZERA, T. SCHNEIDER, J. H. LINDENBERGER, C. A. BRYANT, R. J. MCDERMOTT, M. COULURIS, Benefits and Barriers of Pediatric Healthcare Providers toward Using Social Media in Asthma Care. American Journal of Health Education, 42(4) (2011).

[10] P. PEKGUN, R. P. MENICH, S. ACHARYA, P. G. FINCH, F. DESCHAMPS, K. MALLERY, J. V. SISTINE, K. CHRISTIANSON, F. J. CARLSON REZIDOR, Hotel Group Maximizes Revenue Through Improved Demand Management and Price Optimization. Interfaces, 43(1) (2013), 21–36.

[11] D. P. POWER, What universities offer master’s degrees in analytics and data science. (2012). http://dssresources.com/faq/index.php?action=artikel&id=250, Accessed Feb 2013.

[12] J. SARASOHN-KAHN, The wisdom of patients: Health care meets online social media. Oakland, CA: California HealthCare Foundation. (2008). http://www.chcf.org/publications/2008/04/the-wisdom-of-patients-health-care-meets-online-social-media, Accessed Oct 2012.

[13] SAS.COM, Eight Levels of Analytics. http://www.sas.com/news/sascom/analytics levels.pdf, Accessed Feb 2013.

[14] I. SKARSATER, A. LANGIUS, H. AGREN, L. HAGGSTROM, K. DENCKER, Sense of coherence and social support in relation to recovery in first-episode patients with major depression: A one-year prospective study. International Journal of Mental Health Nursing, 14 (2005), 258–264.

[15] M. STEWART, V. TILDEN, The contributions of healthcare science to social support. International Journal of Nursing Studies, 32 (2005), 535–544.

[16] T. VARELAS, S. ARCHONTAKI, J. DIMOTIKALIS, O. TURAN, I. LAZAKIS, O. VARELAS, Optimizing Ship Routing to Maximize Fleet Revenue at Danaos. Interfaces, 43(1) (2013), 37–47.

[17] H. WATSON, The Business Case for Analytics. BizEd, May/June (2013), 49–54.

[18] R. SHARDA, D. ASAMOAH, N. PONNA, Business Analytics: Research and Teaching Perspectives, Proceedings of the 35th International Conference on Information Technology Interfaces, Dubrovnik, Croatia, June 2013.

Received: August, 2013
Revised: September, 2013
Accepted: September, 2013

Contact addresses:

Ramesh Sharda
Institute for Research in Information Systems
Spears School of Business
Oklahoma State University
Stillwater, OK 74078
USA
e-mail: [email protected]

Daniel Adomako Asamoah
Institute for Research in Information Systems
Spears School of Business
Oklahoma State University
Stillwater, OK 74078
USA
e-mail: [email protected]

Natraj Ponna
Institute for Research in Information Systems
Spears School of Business
Oklahoma State University
Stillwater, OK 74078
USA
e-mail: [email protected]

RAMESH SHARDA is Director of the PhD in Business for Executives Program and the Institute for Research in Information Systems (IRIS), ConocoPhillips Chair and Regents Professor of management science and information systems in the Spears School of Business at Oklahoma State University. He has co-authored two textbooks (Decision Support and Business Intelligence Systems, 9th edition, Prentice Hall and Business Intelligence: A Managerial Approach, 2nd Edition, Prentice Hall). His research has been published in major journals in management science and information systems including Management Science, Operations Research, Information Systems Research, Decision Support Systems, Interfaces, INFORMS Journal on Computing, and many others. He is a member of the editorial boards of journals such as Decision Support Systems, ACM Transactions of MIS, and Information Systems Frontiers. He is currently serving as the Executive Director of Teradata University Network. He recently received the 2013 INFORMS HG Computing Society Lifetime Service Award.

DANIEL ADOMAKO ASAMOAH is pursuing a PhD in management science and information systems in the Spears School of Business at Oklahoma State University. He obtained a Master of Science degree in telecommunications management from the same university. Prior to his graduate studies, he completed a Bachelor of Science degree in electrical and electronic engineering at the Kwame Nkrumah University of Science and Technology, Ghana. His research has been published in journals such as Decision Support Systems and Simulation: Transactions of the Society for Modeling and Simulation International. His current research interests include the use of Big Data analytics as a platform to study the role of social media in healthcare provision. His research also involves business intelligence, data analytics, decision support systems in healthcare and the use of analytical modeling in solving operations management and information systems problems.

NATRAJ PONNA is a Master’s student in management information systems at Oklahoma State University. He is a BASE SAS 9 certified programmer and a certified SAS predictive modeler with experience in using SAS Enterprise Miner 7. He also has the SAS and OSU Data Mining Certificate and has undertaken several projects in business intelligence and data mining.