Journal of Infection and Public Health


ARTICLE IN PRESS
G Model JIPH-867; No. of Pages 8
Journal of Infection and Public Health xxx (2018) xxx–xxx

Contents lists available at ScienceDirect


journal homepage: http://www.elsevier.com/locate/jiph

Review

Healthcare predictive analytics: An overview with a focus on Saudi Arabia

Hana Alharthi

Department of Health Information Management and Technology, College of Public Health, Imam Abdulrahman Bin Faisal University (IAU), formerly known as University of Dammam (UoD), P.O. Box 2435, Dammam, 31441, Saudi Arabia

Article info

Article history:
Received 9 October 2017
Received in revised form 1 February 2018
Accepted 21 February 2018

Keywords: Predictive analytics, Healthcare analytics, Data mining, Saudi Arabia

Abstract

Despite a newfound wealth of data and information, the healthcare sector is lacking in actionable knowledge. This is largely because healthcare data, though plentiful, tends to be inherently complex and fragmented. Health data analytics, with an emphasis on predictive analytics, is emerging as a transformative tool that can enable more proactive and preventative treatment options. This review considers the ways in which predictive analytics has been applied in the for-profit business sector to generate well-timed and accurate predictions of key outcomes, with a focus on key features that may be applicable to healthcare-specific applications. Published medical research presenting assessments of predictive analytics technology in medical applications is reviewed, with particular emphasis on how hospitals have integrated predictive analytics into their day-to-day healthcare services to improve quality of care. This review also highlights the numerous challenges of implementing predictive analytics in healthcare settings and concludes with a discussion of current efforts to implement healthcare data analytics in the developing country, Saudi Arabia.

© 2018 The Author. Published by Elsevier Limited on behalf of King Saud Bin Abdulaziz University for Health Sciences. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Contents

Introduction
Big data in the healthcare sector and the need for predictive analytics
  Data mining vs. traditional statistics
  Open source and commercial data mining tools
The healthcare sector vs. other sectors
Challenges for implementing predictive analytics in the healthcare sector
  Challenges in information technology, architecture, and platforms
  Healthcare data challenges
  Building predictive analytics models into real clinical practice
Recent studies of data mining and predictive analytics in the healthcare sector
  Predictive analytics research in cancer detection
  Predictive analytics research on heart disease
Predictive analytics at work: key partnerships between IT and healthcare institutions
  Partnerships with the International Business Machines Corporation (IBM)
  Partnerships with Microsoft
Challenges to implementing predictive analytics in the Saudi Arabia healthcare sector
Conclusion
Funding
Competing interests
Ethical approval
References

E-mail address: halharthi@iau.edu.sa

https://doi.org/10.1016/j.jiph.2018.02.005
1876-0341/© 2018 The Author. Published by Elsevier Limited on behalf of King Saud Bin Abdulaziz University for Health Sciences. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Please cite this article in press as: Alharthi H. Healthcare predictive analytics: An overview with a focus on Saudi Arabia. J Infect Public Health (2018), https://doi.org/10.1016/j.jiph.2018.02.005


Introduction

The move from paper to electronic-based patient health records has made the healthcare industry rich in data. Potential sources of clinical information include physicians’ notes, computerized physician order entries, and imaging devices, just to name a few. Myriad other administrative and government compliance and regulatory requirements also lead to vast amounts of stored patient data. These datasets are particularly complex and fragmented compared to other industries [1]. Perhaps in part because of these reasons, the healthcare sector has been especially slow to develop technologies to leverage this wealth of data and information, resulting in a lack of actionable knowledge that can be used to make meaningful progress toward better patient outcomes and improved quality of care overall. Contrast this with other industries, such as insurance, retail, banking, engineering, and aviation, in which a widespread evolution of information systems, with an emphasis on scalable, flexible, and intelligent systems, has resulted in enhanced efficiency and bigger returns on investments [2].

This review summarizes the current state of knowledge concerning “big data” and predictive analytics approaches within the healthcare sector. Specific case studies of applications currently in place in the for-profit business sector are presented to aid in considering what might be possible in healthcare-specific applications. Finally, we highlight the unique challenges inherent in developing and applying effective predictive healthcare analytics applications in Saudi Arabia, where progress toward a unified electronic medical records system has been slow, and a small skilled labor force impedes large-scale implementation efforts.

Big data in the healthcare sector and the need for predictive analytics

“Big data”, or extremely large datasets that require specially designed software to analyze, is becoming increasingly common as digital storage is both cheap and plentiful. In healthcare settings, digitized clinical data is generated at nearly every point along the continuum of care. These datasets can add up quickly. In 2011, the US stored roughly 150 exabytes (10^18) of health data, and this number is expected to increase to more than a yottabyte (10^24) over the next few years [3].

In addition to the sheer size of the healthcare datasets being generated, the data are exceptionally complex. Some sources produce structured data, for example from laboratory or radiology results, past medical history, medications lists, or allergy information, while other sources produce data that are inherently unstructured, such as notes on treatment progress or other healthcare provider reports. For the most part, existing hardware or software tools, including relational databases or data warehouses, are incapable of organizing or analyzing extremely large datasets like those being produced today. For example, traditional relational databases, a technology that was developed 50 years ago, require that IT specialists design unique schema dictating how data will be processed [4]. This approach is virtually impossible if data are arriving from many different database and/or warehouse sources. Moreover, relational databases were not designed to process semi-structured or unstructured data.

New technologies are emerging to address some of these challenges. For example, Hadoop, an open source data storage framework, makes it possible to store data “as is” without requiring specially designed schema, while keeping the data immediately accessible for processing [5]. This is valuable for decision makers who need to act quickly based on incoming data. It is especially important for physicians in emergency departments or intensive care units who need to make life or death decisions based on such data.

Data analytics is another example of an emerging technology designed to handle big data with significant implications for healthcare settings. Such tools examine patterns within immense and complicated datasets to gain knowledge and insights. Data analytics is becoming an essential tool for health organizations to provide better healthcare services and reduce costs. As in other industry sectors, healthcare analytics tools can be descriptive, predictive, or both. Descriptive analytics are generally used to generate regular reports (daily, weekly, yearly, etc.) on specific parameters of the dataset and to display those reports through interactive dashboards or scorecards. These data can be used to compare performance based on specified key performance indicators (KPIs) for inpatient services, such as readmission rate, mortality rate, or average wait time at the pharmacy. Predictive analytics, on the other hand, can take a dataset and make predictions based on past events or modeling. The clinical applications of predictive analytics are many and significant, as even very individualized predictions of patient outcomes are possible. For example, one may predict whether a patient is at a high risk of heart attack, which patient is likely to be readmitted after a surgery, or whether a patient will stay longer than the average after a surgery [6]. For this reason, predictive analytics in healthcare settings has received a great amount of interest over the past few years. The knowledge gained through applying predictive analytics in health and medicine will change the way medicine is practiced while enhancing our ability to prevent and treat significant diseases and illnesses.
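To make the descriptive side of this distinction concrete, the following is a minimal Python sketch of a KPI report of the kind a dashboard might display. The record layout, field names, and values are hypothetical and chosen only for illustration; they are not drawn from any system described in this review.

```python
from statistics import mean

# Hypothetical inpatient records; every field name and value is illustrative.
admissions = [
    {"patient": "A", "readmitted_30d": True,  "died": False, "pharmacy_wait_min": 12},
    {"patient": "B", "readmitted_30d": False, "died": False, "pharmacy_wait_min": 25},
    {"patient": "C", "readmitted_30d": False, "died": True,  "pharmacy_wait_min": 8},
    {"patient": "D", "readmitted_30d": True,  "died": False, "pharmacy_wait_min": 15},
]

def kpi_report(records):
    """Summarize past events only; nothing here predicts future outcomes."""
    n = len(records)
    return {
        "readmission_rate": sum(r["readmitted_30d"] for r in records) / n,
        "mortality_rate": sum(r["died"] for r in records) / n,
        "avg_pharmacy_wait_min": mean(r["pharmacy_wait_min"] for r in records),
    }

report = kpi_report(admissions)
```

A predictive tool would instead use records like these as training data to estimate, for a new patient, the probability of an event such as readmission.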

Data mining vs. traditional statistics

Traditional statistical analysis has been utilized in the healthcare sector for decades to predict certain outcomes. Notable models used in intensive care units (ICUs) to predict disease severity for adult patients include the Acute Physiology and Chronic Health Evaluation (APACHE I and APACHE II), the Simplified Acute Physiology Score (SAPS), and the Mortality Probability Model (MPM). Measures such as standardized mortality ratios (SMRs) can be calculated using these models, all of which utilize logistic regression (LR) [7]. Logistic regression provides an odds ratio and a confidence interval (CI) for each predictor in the model, both of which are commonly used and easy to interpret. However, predictions based on LR have their constraints. First, nonlinear relationships cannot be directly included in logistic models; this restricts many variables that exhibit such relationships and requires converting these variables to a linear scale, which may complicate interpretations. Second, in the face of big data, LR predictions may not be as accurate as data mining approaches, particularly when calculating predictions for an individual patient’s prognosis. Recent advancements in the processing power, speed, and capacity of data mining tools have made it possible to search and analyze even immense datasets to discover meaningful patterns [8]. Predictive analytics has evolved out of data mining approaches to include applications designed to address the specific needs of an organization [9]. The business world has validated the use of predictive analytics to enhance efficiency and improve the bottom line; the resulting business intelligence has both cut costs and improved organizations’ competency and productivity [10].
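As a brief illustration of why LR output is considered easy to interpret, the sketch below converts a single fitted coefficient into an odds ratio with an approximate 95% Wald confidence interval, and evaluates the logistic function for a prediction. The coefficient and standard error are hypothetical numbers, not values from APACHE, SAPS, or MPM.

```python
import math

def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Return (odds ratio, lower, upper) for one logistic regression coefficient.

    The odds ratio is exp(beta); the Wald interval is exp(beta -/+ z * SE),
    with z = 1.96 giving an approximate 95% interval.
    """
    return (math.exp(beta),
            math.exp(beta - z * std_err),
            math.exp(beta + z * std_err))

def predicted_probability(intercept, betas, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(betas, values))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficient for a binary predictor (values are illustrative only).
or_, lo, hi = odds_ratio_with_ci(beta=0.693, std_err=0.2)
```

Because the linear term enters the model additively, a predictor with a nonlinear effect has to be transformed before it is included, which is the first constraint noted above.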

Open source and commercial data mining tools

There are now several options for open source (free license) software to perform complex predictive analytics. Several have been implemented and validated through academic applications and have achieved a very good reputation for supporting predictive analytics models. Rapid Miner [11,12] is a user-friendly tool with a visually appealing user interface. It has pre-processing and visualization capabilities, including strong parameter optimization, which is particularly useful when working with complicated models, such as support vector machines (SVMs). It is also relatively easy to extend with other platform technologies. WEKA (Waikato Environment for Knowledge Analysis) [11,12] is another strong tool used extensively in the academic world. It includes many machine learning algorithms and procedures used to evaluate models with matrices. However, its user interface offers only weak parameter optimization, it is more difficult to extend compared to Rapid Miner, and a continued lack of good documentation plus a regular need to upgrade the software detract from its usefulness. KNIME (Konstanz Information Miner) [11,12] is a particularly user-friendly software that is easy for beginners to learn. Like Rapid Miner, KNIME is visually appealing and intuitive and is easily extendable. Users can connect with other platforms, including Hadoop, from within KNIME. However, like WEKA, KNIME offers poor parameter optimization.

In addition to these open source options there are several commercial software tools available at various price points. Gartner Group, a leading organization reporting on the latest information and trends in information technology, reported that the top analytics software currently available is provided by IBM, SAS (SAS Institute Inc.), and Rapid Miner (commercial version). In addition, several other companies are already in direct competition with these top three, and still others are poised to enter the market soon [13]. Fig. 1 lists the different open source and commercial data mining tools.

Fig. 1. List of open source and commercial data mining tools.

The healthcare sector vs. other sectors

Healthcare organizations have begun to implement predictive analytics to manage and process big data in hopes of discovering hidden relationships, trends, and predictions that support the delivery of improved healthcare services. However, these efforts are relatively immature compared to those that have already succeeded in the business sector. Large for-profit companies that provide business services to millions of people have utilized predictive analytics successfully for years in order to increase their profits and reduce their costs [14]. The Target Corporation, one of the largest discount retailers in the US, used predictive analytics to better reach a specific segment of their customers (mothers of infants). Company revenue increased from $44 billion to $67 billion between 2002 and 2010 as a direct result of those efforts [15]. In another example, FedEx, a US-based multi-national courier company, applied predictive analytics to successfully anticipate which customers were likely to stay loyal to their services and which were likely to leave for a competitor [16]. The world’s biggest online job search website, Monster.com, has leveraged data and business intelligence to develop new initiatives to apply text and data mining algorithms that detect fraudulent and spam-related job postings [17]. Amazon.com has been innovative in using predictive data analytics to personalize their customer shopping experience by showing alerts such as “You might also want. . .” while visiting their website [14]. Indeed, the list of industries in which firms have leveraged such approaches continues to grow, including insurance companies that intensively use predictive modeling to anticipate fraudulent claims, and financial institutions that use models to predict a “credit score” for how likely it is that a customer will pay what is owed.

Examples of predictive analytics applications in the public sector are a bit scarcer. In education, Flinders University, an Australian university, deployed a model to predict which first-year students were likely to drop out. In response, the University implemented an intervention program that included coaching and support for the most “at-risk” students. Internal assessments suggest the program has benefitted both the students and the University [18]. In another example, JFK airport in New York City, NY implemented a predictive model to anticipate operational disturbances due to bad weather, air traffic congestion, and/or delayed flights. In this case, early detection of potential problems makes it possible to coordinate response efforts, improve safety, and minimize the impact on passenger experience. The airport reports that between 2010 and 2012 plane taxi time was significantly reduced, resulting in $11 M in fuel cost savings and a 48,000-ton reduction in emissions [19].

The above examples highlight initiatives that have significantly impacted different sectors’ revenues and overall competitiveness. However, when considering the role of predictive analytics in the healthcare sector, it is important to keep in mind key differences in what is at stake. Specifically, when commercial companies make suboptimal predictions and decisions, losses are typically in terms of profits; in the healthcare sector, the loss is likely to impact patient outcomes and may even translate to the loss of human life. This difference emphasizes the need for enhanced efforts to develop, implement, and adopt predictive analytics that can improve the quality of healthcare services and, ultimately, save lives.

Challenges for implementing predictive analytics in the healthcare sector

Challenges in information technology, architecture, and platforms

Healthcare organizations face unique challenges when it comes to developing and implementing big data-ready technologies. Most importantly, they need to have functional electronic health records (EHRs) operating and integrated within the facility working environment. However, many hospitals struggle with building EHR systems; it is a major information technology (IT) project that requires the alignment of people, technology, and processes, and, thus, substantial investments of time and money. In addition, the failure rate of IT projects in healthcare is high when compared to other industries [20], most likely because of the extensive communication and coordination required among various groups involved in data collection, each with unique workflow constraints. EHRs themselves are inherently challenging to implement in light of the extensive patient privacy and confidentiality concerns attached to the data they contain [52].

To address at least some of these concerns, the United States introduced the Health Insurance Portability and Accountability Act (HIPAA) in 1996, which specifies policies and procedures to protect patient privacy. This law may serve as a useful model for other governments as they work to protect the confidentiality and privacy of their citizens amid the development of data-driven advancements in healthcare. Another challenge involves hospitals developing their own data warehousing systems that can integrate all of the facility’s databases, including EMRs, into one centralized database with a common data format. Storing and organizing the massive datasets is in and of itself a daunting project, but one that is essential to the development of both descriptive and predictive analytics applications.

The health sector is now poised to leverage big data technologies to better understand and manage patient care. Big data technologies employ parallel data analytics to discover hidden knowledge in large datasets. These technologies have developed alongside the rapid increase in the amount of health data that are available [21]. Perhaps because of the complexity and sensitivity of health data, there are currently very few data architectures or platforms capable of extracting and/or processing big data in the health care sector. Open source platforms, such as Hadoop (developed by Yahoo) and MapReduce (developed by Google), are attractive because they are available via the cloud at a very reasonable price point. In addition, both can work with structured and unstructured data, key to health sector applications. However, these tools require complex programming expertise, and neither package comes with the kind of technical support one would receive from higher-priced commercial products. This is an example of the types of decisions healthcare institutions must currently face when trying to build a foundation for big data analytics [22].

One recent report detailed a framework for a health technology system that is built using only cloud-based technologies. The proposed system collects, stores, and processes data in the cloud, and uses data mining tools to discover trends. Clinical data (from EMR, labs, radiology, etc.) tend to be heterogeneous in nature. This system provides a way to convert them into a standard format that can be stored and recalled quickly. It also allows for data processing offline or in real-time, for example when monitoring patient vital signs in the intensive care unit. Approaches like this could alter the way healthcare IT systems handle health data in the future by moving the field away from resource-intensive projects, such as data warehousing [23].

Another major issue with clinical data management is the inability of most relational databases to handle unstructured data. Therefore, a new database architecture system, NoSQL (not only SQL), has been proposed to store and analyze complex data. NoSQL databases are easily expandable to include more data and built to handle heterogeneous datasets. The main idea of the system is that each patient has a master patient-driven medical document (PaMeDoc) that consists of several separate documents. From there, the database architecture branches like a tree, such that one PaMeDoc can have several branches, including new PaMeDocs. This helps keep data organized even as new data are continually added. The system can also be easily queried with parallel searches and filtering conditions, using temporal information and MapReduce technology. This is key to extracting specific information that can be used to build predictive models [24].
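The tree-shaped document idea can be sketched in a few lines of Python. This is only an illustration of a nested patient document queried by type and date; the field names (`kind`, `date`, `children`) and the sample records are assumptions made for this example and do not reproduce the schema or query engine described in [24].

```python
from datetime import date

# Hypothetical master document; field names are illustrative, not the schema in [24].
master = {
    "patient_id": "P001",
    "kind": "master",
    "date": date(2017, 1, 5),
    "children": [
        {"kind": "lab", "date": date(2017, 2, 1), "children": []},
        {"kind": "radiology", "date": date(2017, 3, 10), "children": [
            {"kind": "lab", "date": date(2017, 3, 12), "children": []},
        ]},
    ],
}

def walk(doc):
    """Yield a document and all of its descendants, depth first."""
    yield doc
    for child in doc["children"]:
        yield from walk(child)

def query(doc, kind, start, end):
    """Filter the tree by document kind and by a date window (temporal filter)."""
    return [d for d in walk(doc) if d["kind"] == kind and start <= d["date"] <= end]

march_labs = query(master, "lab", date(2017, 3, 1), date(2017, 3, 31))
```

New documents are added by appending to a `children` list, so the structure grows without a schema change; a production system would distribute the traversal across many patients in parallel.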

Finally, the rapid increase in genomic data brought with it challenges in terms of storage and analysis. Fortunately, we can find useful examples of successful data management architectures in the field of genomics. Cloud-based technologies, such as Hadoop and MapReduce, have been key to handling genomics data, and are now used by many different institutions worldwide. For analysis of genomic data, researchers have turned to so-called metalearning systems. These systems evaluate the past results of various algorithms to generate a metamodel that predicts the best performing algorithm to obtain the desired result with a new dataset. In other words, a metamodel can determine which type of algorithm will be best suited to the task at hand based on the dataset characteristics, etc. This approach saves time and resources that would otherwise be used to manually compare algorithm performance for each new dataset. The metalearning system was tested on microarray gene expression data and included five meta-algorithms: radial basis function network (RBFN), linear regression (LR), least median square regression (LMSR), neural network (NN), and support vector machine (SVM). While all algorithms performed well, the model predicted that SVM was most well-suited to the task [25].
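The metalearning idea can be illustrated with a deliberately simplified Python sketch: a nearest-neighbour lookup that recommends whichever algorithm performed best on the most similar past dataset. The meta-features (log sample count and log feature count), the history table, and the nearest-neighbour rule are all assumptions made for this example; the system reported in [25] may differ in each respect.

```python
import math

# Hypothetical history: (dataset meta-features, best algorithm observed).
# Meta-features here are (log10 of sample count, log10 of feature count).
history = [
    ((2.0, 4.0), "SVM"),   # few samples, many features (microarray-like)
    ((5.0, 1.0), "LR"),    # many samples, few features
    ((4.0, 2.0), "NN"),
]

def meta_features(n_samples, n_features):
    """Describe a dataset by two coarse characteristics."""
    return (math.log10(n_samples), math.log10(n_features))

def recommend(n_samples, n_features, past=history):
    """Return the algorithm that did best on the most similar past dataset."""
    target = meta_features(n_samples, n_features)
    _, best = min(past, key=lambda item: math.dist(item[0], target))
    return best

choice = recommend(n_samples=150, n_features=12000)
```

The saving comes from replacing a full comparison of every algorithm on each new dataset with a single lookup against previously recorded results.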

Healthcare data challenges

Developers of health information systems must take into consideration the four properties of big data, the so-called “4 Vs”, each of which presents its own information management challenge: Volume (the datasets are too large to be handled by traditional systems and tools), Variety (the datasets usually include both structured and unstructured formats), Velocity (the datasets often include time-sensitive data that need to be analyzed quickly), and Veracity (the datasets include many sources of error, or noise, that must be accounted for) [26].

Addressing these challenges requires significant investments of money and resources, including people, technology, and dedicated processes. One study reports on an attempt to overcome such data challenges by creating the ICARE system, which was developed based on the idea of collaborative filtering. Using only ICD-9 coding applied to patients’ health information, this system can produce a ranked list of the diseases a given patient might be at risk of developing by examining data on other similar patients. It was validated using a 13 million patient Medicare database containing 32 million visits collected over a period of 4 years. Using this dataset, the system correctly predicted 50% of the diseases a patient might have in the future. The authors of the study also note that accuracy is only likely to increase with the addition of more data to ICARE, such as demographics, genetic data, or family history [27].
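The general idea of collaborative filtering over diagnosis codes can be sketched as follows. This is a minimal illustration, not the ICARE algorithm: the Jaccard similarity measure, the scoring rule, and the sample patients are assumptions introduced here for clarity.

```python
from collections import defaultdict

# Hypothetical patients mapped to sets of ICD-9 codes (illustrative values).
patients = {
    "p1": {"250.00", "401.9", "272.4"},
    "p2": {"250.00", "401.9", "585.9"},
    "p3": {"401.9", "272.4", "414.01"},
    "p4": {"493.90"},
}

def jaccard(a, b):
    """Similarity of two code sets: size of overlap divided by size of union."""
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_risks(target, others):
    """Score each code the target lacks by the similarity of patients who have it."""
    scores = defaultdict(float)
    for codes in others.values():
        sim = jaccard(target, codes)
        for code in codes - target:
            scores[code] += sim
    # Highest score first; codes seen only in dissimilar patients are dropped.
    return sorted((c for c in scores if scores[c] > 0), key=lambda c: (-scores[c], c))

new_patient = {"250.00", "401.9"}
ranking = rank_risks(new_patient, patients)
```

Codes held by patients who closely resemble the new patient rise to the top of the list, which mirrors the "similar patients" reasoning described above.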

Finally, as institutions work toward leveraging healthcare data, the issue of data quality is becoming increasingly apparent. It is not uncommon for patient data to be scattered among multiple clinical and enterprise systems, reside in different formats (each of which may utilize distinct definitions), and accumulate incredibly rapidly, even doubling every 18 months [28]. This sets up a difficult task in monitoring different processes to produce data that are accurate, timely, accessible, reliable, consistent, relevant, and detailed [29]. Top management personnel in healthcare organizations must work to develop a data governance framework; no matter how expensive and advanced a hospital’s IT system is, the results can only be as good as the system that produces them.

Building predictive analytics models into real clinical practice

Currently, the key building blocks for predictive analytics come from deep data, that is, patient-level data that can be used to generalize to the larger population [30]. However, most patient-deep data currently come from EMRs, which only include variables particular to the population under study but can be heterogeneous in terms of content and format. Thus, EMR data is often not well-suited to generalization [31].

Complex and heterogeneous datasets also make optimizing the models themselves more challenging within the context of clinical applications. One possible approach for building predictive models using clinical data is to build a number of different models, each specific to a particular type of data, and then unify the results to form a single predictive model [31]. There are several ways to optimize a model’s performance for a particular application. For example, detecting and limiting overfitting (i.e., a model performs well on a training set, but does not perform well on a new, unseen dataset) is fairly straightforward using the 10-fold cross-validation strategy [32].

This method splits the data into 10 folds, trains 10 models (each on nine folds and tested on the remaining fold), and uses the average performance to generate the final model [33]. Other considerations in optimizing model performance are not as straightforward. Each algorithm in a given model has different characteristics that make it more or less suited to a given application. Thus, the developers must build a strategy that includes considerations of the type and quality of the dataset, as well as algorithms that prioritize/ignore variables according to each specific application. Clinician input is critical to this process [33].
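The 10-fold cross-validation procedure mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the nearest-mean classifier, the function names, and the synthetic single-feature dataset are all hypothetical stand-ins for a real clinical model and dataset.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_nearest_mean(xs, ys):
    """Toy model: store the mean feature value of each class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict(means, x):
    """Assign the class whose stored mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def cross_validate(xs, ys, k=10):
    """Train k models, each evaluated on the fold it never saw."""
    accuracies = []
    for held_out in k_fold_indices(len(xs), k):
        test_set = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in test_set]
        train_y = [y for i, y in enumerate(ys) if i not in test_set]
        model = train_nearest_mean(train_x, train_y)
        correct = sum(predict(model, xs[i]) == ys[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical synthetic data: one lab value that tends to be higher in class 1.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
ys = [0] * 100 + [1] * 100

fold_accuracies = cross_validate(xs, ys, k=10)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"{len(fold_accuracies)} folds, mean held-out accuracy {mean_accuracy:.2f}")
```

Because every sample is held out exactly once, the averaged accuracy estimates how the model is likely to perform on unseen data. A large gap between training accuracy and this held-out average is the usual warning sign of overfitting.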

In addition, assessments of model performance must include cost-sensitive analysis of model accuracy. In other words, determining how well a model performs must include considerations of both accuracy, i.e., how often the model is correct, and the penalty for being wrong. Performance of the final model should be assessed in terms of sensitivity (the proportion of actual positives correctly identified), specificity (the proportion of actual negatives correctly identified, i.e., a low false positive rate), and/or receiver operating characteristic (ROC) curves (plots of the true positive rate against the false positive rate across decision thresholds, with optimally performing models yielding an area under the curve close to unity) [34].
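To make these measures concrete, the following sketch computes sensitivity, specificity, the area under the ROC curve, and a simple misclassification cost for a small set of predictions. The labels, scores, 0.5 decision threshold, and cost weights are hypothetical values chosen for illustration; in practice the penalties would be set with clinician input.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical true labels and model scores for eight patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate = 1 - false positive rate
auc = roc_auc(y_true, scores)

# Assumed penalties: a missed case costs five times as much as a false alarm.
COST_FN, COST_FP = 5.0, 1.0
total_cost = COST_FN * fn + COST_FP * fp

print(sensitivity, specificity, auc, total_cost)  # → 0.75 0.75 0.8125 6.0
```

Two models with the same accuracy can differ sharply in total cost once false negatives and false positives are weighted differently, which is why the cost-sensitive view matters in clinical settings.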

Another important challenge for implementing predictive models in clinical applications is how to train clinicians to understand, interpret, and use the results of the model. Again, the need for domain expert input during the model development process is critical. Complex algorithms, such as neural networks or support vector machines, arguably produce higher performance models, but their results are harder for the average clinician to interpret. On the other hand, less complex algorithms, such as decision trees and logistic regression, are more easily interpreted and, thus, more likely to be integrated into routine clinical practice [31]. In one example, a California hospital developed, and successfully implemented, a predictive model of readmission risk factors that was based on a simple Tree-Lasso logistic regression algorithm [35]. Ideally, clinicians would work collaboratively with data scientists, but this is not always possible; however, simpler approaches, including those that incorporate visual analytics tools, make it easier for clinicians to contribute directly to building predictive models, or even to minimize the need for data scientists altogether [36]. Examples of such tools are RapidMiner, KNIME, and Weka, which can be used alongside tools like SNOMED to build clinically relevant models [37].
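One reason logistic regression is regarded as easy to interpret is that each coefficient maps directly to an odds ratio. The sketch below illustrates this with made-up coefficients; the predictor names and values are hypothetical and are not taken from the readmission model in [35], nor does the sketch implement the Tree-Lasso variant.

```python
import math

# Hypothetical fitted coefficients (change in log-odds per unit of each predictor).
intercept = -2.0
coefficients = {
    "prior_admissions": 0.40,       # per previous admission
    "length_of_stay_days": 0.10,    # per day in hospital
    "has_chronic_condition": 0.70,  # 1 if present, else 0
}


def readmission_probability(patient):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * value))))."""
    log_odds = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# exp(coefficient) is the multiplicative change in odds per unit increase.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

patient = {"prior_admissions": 2, "length_of_stay_days": 5, "has_chronic_condition": 1}
risk = readmission_probability(patient)

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
print(f"predicted risk: {risk:.2f}")
```

A clinician can read such a table directly: under these assumed coefficients, the presence of a chronic condition roughly doubles the odds of readmission. A neural network offers no comparably simple summary.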

Recent studies of data mining and predictive analytics in the healthcare sector

Although only a few healthcare organizations in the United States have adopted predictive analytics in their real-time data monitoring systems, research interest in this area has grown rapidly over the past fifteen years [38]. Some of this work has been developed around predictive models for disease diagnoses or patient health outcomes. The cases reviewed in the following sections provide examples of how predictive modeling has been successfully applied to health data.

Predictive analytics research in cancer detection

Detecting cancer at an early stage increases patients’ survival rates and may increase the chances of long-term remission. Data mining tools have been developed and tested in various studies as a way to predict which patients will develop cancer based on genetic and non-genetic factors. Using these models to predict cancers before patients are subjected to any screening or blood tests can reduce the costs and emotional stress associated with unnecessary screening. In addition, patients determined to be at high risk can be offered treatments that might make them less likely to develop the disease or reduce its severity. In one recent study, a hybrid system was developed that used the Artificial Immune Recognition System, as well as data mining algorithms, to build a predictive model to distinguish between breast cancer and non-breast cancer diagnoses. The model utilized the Wisconsin Breast Cancer dataset, containing data from 699 tissue samples. The model was designed to take a given classifier and return a yes/no cancer prediction. In this case, the model was 100% accurate [39]. In another study, conducted in Taiwan, where oral cancer results in 8000 deaths per year, a team of researchers utilized data from the country’s oncology research database to predict the 5-year survival rate among oral cancer patients. The model predicted survivability with 95.7% accuracy, and the data mining algorithms outperformed predictions based on standard logistic regression [40].

Taking a slightly different approach, another study using predictive analytics found that for patients with acute leukemia, the mortality risk of hematopoietic stem cell transplantation (HSCT) can be optimally estimated using a mere 3–5 variables [41]. The study used six different data mining algorithms, and all consistently returned the same top three variables: disease stage, donor type, and conditioning regimen. Indeed, while the HSCT registry contains 23 variables, this study suggests that fewer variables collected from a large dataset are sufficient for maximum predictive power. This study’s algorithms do not definitively identify which patients will survive the transplantation procedure; however, the top three predicted factors can now be unpacked into underlying variables that were never studied before, including genetic, biological, and clonal factors [41].

Predictive analytics research on heart disease

Cardiovascular disease is another health threat affecting a large population of patients. As with cancer detection, there is great potential to improve early detection accuracy, risk assessment, and ongoing symptom tracking through the careful application of predictive analytics. For example, one study developed a predictive model using six different algorithms to mine patient data from the machine learning repository of the University of California, Irvine [42]. Using thirteen risk factors, the model classified patients as having or not having cardiovascular disease. The model positively identified patients diagnosed with heart disease with an accuracy of 93.02%. Another study reported the successful use of predictive analytics to develop a new 5-year life expectancy index for patients >50 years old who suffer from multiple diseases, including cardiovascular disease [43]. In this case, the index was developed by applying data mining algorithms to patients’ electronic medical records (EMRs). Indeed, the researchers found that the index performed better than standard predictive approaches, such as the Walter life expectancy method or the Charlson Comorbidity Index. Finally, a research team from Rice University used predictive analytics to create a new scorecard to identify early warning signs for a heart attack or indicators of cardiovascular disease progression. The predictive model was built using risk factor information and patients’ biomarkers, such as myoglobin and creatine kinase. This innovative work highlights the use of a new technology, namely the Internet of Things, in which patient information is captured and processed in scalable lab-on-a-chip devices that can be used in point-of-care testing to quickly alert healthcare teams of potential heart attacks or patient complications [44].

These are just a few notable examples from the literature exemplifying the power of predictive analytics in healthcare-related applications. A growing body of evidence supports the idea that these approaches have the potential to advance the standard of care beyond diagnostic and treatment recommendations that traditionally rely heavily on physician experience and/or subjective criteria.

Predictive analytics at work: key partnerships between IT and healthcare institutions

This section discusses various success stories of major technological companies, such as IBM and Microsoft, that have teamed up with healthcare organizations to implement predictive analytics models. As one published study notes, most predictive analytics healthcare applications have not been reported in the literature [45]. Thus, here we present details of these applications gleaned from company websites and other similar sources to highlight how these analytics models positively impact healthcare through integration with clinicians’ workflow.

Partnerships with the International Business Machines Corporation (IBM)

According to the company’s website, IBM has partnered with several hospitals to implement predictive analytics to address specific and pressing needs related to electronic patient data. Emory University Hospital has leveraged its partnership with IBM, and IBM business partner Excel Medical Electronics, to develop and pilot a predictive model to apply to the nearly 100,000 near real-time data points per second per patient they collect from their intensive care unit (ICU) [46]. They have used the model to predict which patients might develop complications such as atrial fibrillation (A fib) or sepsis, making it possible to implement a contingency treatment plan. In another effort, the Hospital for Sick Children (SickKids) in Canada has implemented IBM predictive analytics that use live data from bedside monitors and EMR to predict which infants in the neonatal ICU are likely to develop nosocomial infections 24 h in advance [47]. Using this tool, the medical team can intervene and take the necessary medical actions to treat the babies and improve outcomes. In another example from pediatric inpatient care settings, Lurie Children’s Hospital at Northwestern University has utilized an IBM Watson supercomputer program to predict, in a matter of minutes, which cancer treatment would be best for a patient by linking the patient’s genetic profile to favorable treatment options [48].

Partnerships with Microsoft

Microsoft Corporation has also embarked on projects with healthcare institutions with the goal of optimizing the use of electronic clinical data to improve patient health. In one implementation, the Carolinas HealthCare System in the southeast US used predictive analytics to manage their readmissions process by monitoring patients’ clinical changes in real-time [49]. Continually updating data collection systems (once per hour for every hour a patient is in the hospital) can detect changes in the model’s predicted likelihood of readmission. Thus, patients at high risk for readmission, according to the model, can then receive targeted education from clinicians to reduce their risk. This model has been tested on 100,000 patients thus far with 80% accuracy in predicting readmission [50].

In another example, the Estadual Getúlio Vargas hospital, serving one of the poorest areas of Rio de Janeiro in Brazil, developed and implemented a predictive analytics tool to help address the challenge of managing limited ICU facilities and space. While this hospital specializes in trauma cases, it has only 22 ICU beds and performs nonstop treatment of patients. The hospital used predictive analytics to manage patients’ length of stay (LOS) by anticipating which patients might be at risk for medical complications requiring a longer treatment course. In response, they could initiate appropriate early interventions, thereby minimizing the impact on total LOS. This approach successfully reduced LOS at this hospital [51].

Two other examples illustrate some of the creative ways other institutions have used Microsoft’s predictive analytics approaches to address public health challenges. The city of Vienna, Austria, used predictive analytics to track, trace, and analyze incident reports in real-time, helping city health officials predict the risk of a disease spreading [52]. Finally, Microsoft enabled Aerocrine, a manufacturer of devices that monitor inflammation biomarkers in the airways of asthma patients, to monitor patient devices in order to preemptively identify units or sensors needing to be replaced. This approach would effectively eliminate the gap in time between a patient discarding the old device and receiving a new one and ensure optimal continuous treatment [53].

Challenges to implementing predictive analytics in the Saudi Arabia healthcare sector

As the above examples illustrate, there are significant advantages to implementing powerful predictive analytics in healthcare settings. However, for Saudi Arabia to move into the world of big data analytics, its hospitals need to have complete, functional EMR systems in place. Unfortunately, most Saudi hospitals are paper-based or use very basic software tools. Only a small portion of hospitals have information systems that are at more advanced levels of implementation [54]. A national healthcare strategy aimed at supporting a nationwide transition to EMR systems was initiated in 2008, but hospitals are facing significant implementation challenges [55]. Between 2007 and 2011, 52 healthcare IT projects failed, wasting up to $10 million [56]. Two studies have highlighted hospitals that are facing challenges in financial, organizational, and regulatory areas and lack the specialized manpower to execute EMR projects [57,58]. Both studies recommended that Saudi Arabia establish a unified national plan to be led by a new national body.

Saudis can benefit from the experience of European countries that are working to establish their own national EHR systems. Each country in the European Union has published a detailed document outlining specific goals and measures to track progress in EHR implementation and other related technologies [59]. As part of good governance, many of these countries have established advisory bodies that include all stakeholders to monitor the different projects in progress to ensure continuity and success. There are many potentially useful models among these national plans that could effectively serve the needs of Saudi Arabia as well.

Outside of Europe, there are several other case studies of national EHR systems. The Kingdom of Jordan has a positive reputation for implementing EHRs nationwide through the Hakeem Program initiated in 2009. Through this program, a nonprofit company was created to support hospitals during their implementation process. Although Jordanian hospitals still face implementation challenges, they have made significant progress and have successfully “wired up” several hospitals such that patient information can be exchanged easily [60].

In the US, there are multiple initiatives and programs to help hospitals implement EHRs. As a nation, the US plans to increase the adoption of EHRs to between 70 and 90% of all healthcare organizations by 2019 [61]. To bring this plan into action, the US government has introduced legislation (the Health Information Technology for Economic and Clinical Health Act) that authorizes spending up to $40 billion in incentives and $2 billion in staff training and system infrastructure enhancements [61]. The government is also paying attention to data standardization; it introduced its Big Data Research and Development Initiative in 2012, whereby $200 million was allocated to ensure healthcare data standardization [61].

Big data analytics stem from large, more centralized data resources, such as EMR or CPOE pharmacies, and from external sources, such as government agencies and insurance companies. However, a recent study (2016) in Saudi Arabia reported that the number of e-health initiatives, i.e., those most likely to support the development of such resources, is particularly low. This makes the assessment of current health analytics, including predictive analytics, more challenging in Saudi Arabia [62].

Beyond implementation, hospitals in Saudi Arabia face the additional challenge of motivating physicians to embrace EHR technologies once they are put in place. One strategy is to demonstrate the predictive power of big data analytics through research using data mining. A search of electronic databases (PubMed, Web of Science, Scopus, Google Scholar) revealed that healthcare predictive analytics research in Saudi Arabia is rare. Two articles have been published using data mining tools since 2010: one in which a group of researchers predicted that a certain hypertension treatment is more effective than others among Saudi patients [63], and the other in which researchers predicted the most effective type of intervention for diabetic patients [64]. To increase the number of studies in this area, collaborations between hospitals and academics should be encouraged. The Saudi population is suffering from a high mortality rate (up to 71% in cases of non-communicable diseases, such as diabetes, heart disease, and cancer), according to a World Health Organization report [65]. The country could benefit greatly from using predictive analytics to identify patients at high risk and provide earlier treatment to avoid medical complications, reduce treatment costs, and lower the mortality rate.

To strengthen data mining and healthcare research, the national body that promotes science and technology in Saudi Arabia, the King Abdulaziz City for Science and Technology (KACST), could play an enabling role. Through generous grants, it could foster collaboration between academic researchers and physicians to carry out projects in healthcare predictive analytics. A quick review of local Saudi university curricula showed that multiple universities have strong programs in computer science and computer information systems. For instance, faculty members teach artificial intelligence, data mining, and advanced statistics for undergraduate programs at King Fahd University of Petroleum and Minerals, the University of King Saud, the University of King Abdulaziz, and the University of Dammam. There is high potential for these students to become specialized in predictive analytics if they are encouraged to pursue their graduate studies in machine learning and data mining. Such a workforce would bring analytics skills and competency to KACST-funded teams that collaborate with healthcare providers to perform research in the area of predictive analytics.

In fact, these talented young people can form the next workforce to build and integrate predictive models in healthcare facilities for the delivery of quality healthcare services for patients worldwide. The future demand for workers with these skills is significant; in the US, the McKinsey Global Institute reported [66] an impending deficit of 2 million workers qualified to handle big data by 2018, including those with skills in data analysis, data management, and systems management. The report’s recommendation is for universities to create new curricula that include big data and analytics and make industry training a requirement. Similar studies should be carried out in Saudi Arabia to estimate the number of people needed to serve the big data era there and optimize the application of the country’s resources available to build a nationwide EHR and move the healthcare sector toward big data analytics. However, this will require government-led initiatives, including short- and long-term planning for legislative actions, initiatives, programs, and policies to drive the adoption.

Conclusion

As patients’ records are digitized through EHRs, immense datasets are accumulating with the potential to drive progress toward more personalized and effective care through data-driven predictive models. Predictive analytics has already proven useful in the clinical setting to researchers studying heart disease risk and cancer prognoses. As healthcare institutions enter into partnerships with information technology companies to develop more advanced models, challenges still exist related to the inherent complexity of patient data and the need to integrate the results of predictive models within existing physician workflows. For a country like Saudi Arabia, where many of the country’s hospitals continue to rely on paper-based records, these challenges are magnified. There is a real and pressing need for government programs to initiate and support widespread efforts to digitize health records. In addition, the country would benefit from building its data analytics workforce by preparing talented university graduates through generous grants that enable faculty and their students to collaborate with local hospitals and move the entire field forward.

Funding

No funding sources.

Competing interests

None declared.

Ethical approval

Not required.

References

[1] Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst 2014;2(1):3, http://dx.doi.org/10.1186/2047-2501-2-3.

[2] Kasem M, Hassanein E. Cloud business intelligence survey. Int J Comput Appl 2014;9(1):23–8, http://dx.doi.org/10.5120/15540-4266.

[3] Fang R, Pouyanfar S, Yang Y, Chen C. Computational health informatics in the big data age: a survey. ACM Comput Surv 2016;49(1):1–36, http://dx.doi.org/10.1145/2932707.

[4] Trujillo G, Kim C, Jones S, Garcia R, Murray J. Virtualizing hadoop: how to install, deploy, and optimize hadoop in a virtualized architecture. USA: VMware Press; 2015.

[5] McAfee A, Brynjolfsson E. Big data: the management revolution. Harv Bus Rev 2012;90(10):60–8.

[6] Admes J, Garets D. The healthcare analytics evolution: moving from descriptive to predictive to prescriptive. In: Gensinger R, editor. Analytics in healthcare: an introduction. Chicago: Health Information and Management System Society (HIMSS); 2014. p. 13–20.

[7] Breslow M, Badawi O. Severity scoring in the critically ill: part 1 – interpretation and accuracy of outcome prediction scoring systems. CHEST 2012;141(1):245–52, http://dx.doi.org/10.1378/chest.11-0330.

[8] Witten I, Frank E, Hall M. Data mining: practical machine learning tools and techniques. USA: Elsevier; 2011.

[9] Finlay S. Predictive analytics, data mining and big data: myths, misconceptions and methods. UK: Palgrave Macmillan; 2014.

[66] Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C, et al. Big data:

[10] Zhao C-M, Luan J. Data mining: going beyond traditional statistics. New Dir Inst Res 2006;131:7–16, http://dx.doi.org/10.1002/ir.184.

[11] Jovic A, Brkic K, Bogunovic N. An overview of free software tools for general data mining. Information and communication technology, electronics and microelectronics (MIPRO), 37th international convention on IEEE 2014:1112–7.

[12] Kalpana K, Bansal L. Comparative study of data mining tools. Int J Adv Res Comput Sci Softw Eng 2014;4(6).

[13] Linden A, Krensky P, Hare J, Idoine C, Sicular S, Vashisth S. Magic quadrant for data science platforms. Gartner 2017. https://www.gartner.com/doc/3606026/magic-quadrant-data-science-platforms [accessed 05.01.18].

[14] Siegel E. Predictive analytics: the power to predict who will click, buy, lie, or die. New Jersey: Wiley & Sons; 2013.

[15] Duhigg C. How companies learn your secrets. The New York Times Magazine; 2012. http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html [accessed 25.11.16].

[16] Whiting R. Businesses mine data to predict what happens next. InformationWeek 2006. http://www.informationweek.com/businesses-mine-data-to-predict-what-happens-next/d/d-id/1043681 [accessed 20.12.16].

[17] Schick A, Frolick M, Ariyachandra T. Competing with BI and analytics at monster worldwide. IEEE Xplore 2011, http://dx.doi.org/10.1109/HICSS.2011.119.

[18] Seidel E, Kutieleh S. Using predictive analytics to target and improve first year student attrition. Aust J Educ 2017;6(2), http://dx.doi.org/10.1177/0004944117712310.

[19] Surface Management. Surface management fact sheet. Passure 2016. http://www.passur.com/wp-content/uploads/2016/07/PASSUR SurfaceManagement FactSheet 050516 v2.pdf [accessed 05.01.18].

[20] Bonnie K, Kimberly D, Salamone H. Health IT success and failure: recommendations from literature and an AMIA Workshop. J Am Med Inform Assoc 2009;16(3):291–9, http://dx.doi.org/10.1197/jamia.M2997.

[21] Poucke S, Zhang Z, Roest M, Vukicevic M, Beran M, Lauwereins B, et al. Normalization methods in time series of platelet function assays: a squire compliant study. Medicine 2016;95(28) (e4188).

[22] Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst 2014. Licensee BioMed Central Ltd 2[3].

[23] Zhang Y, Qiu M, Tsai C-W, Hassan M, Alamri A. Health-CPS: healthcare cyber-physical system assisted by cloud and big data. IEEE Syst J 2017;11(1).

[24] Lin CH, Huang LC, Chou SC, Liu CH, Cheng HF, Chiang IJ. Temporal event tracing on big healthcare data analytics. In: Big data applications and use cases. Springer International Publishing; 2016. p. 95–108.

[25] Vukicevic M, Radovanovic S, Milovanovic M, Minovic M. Cloud based metalearning system for predictive modeling of biomedical data. Scient World J 2014.

[26] Rajaraman V. Big data analytics. Resonance 2016;21(8):695–716, http://dx.doi.org/10.1007/s12045-016-0376-7.

[27] Chawla NV, Davis DA. Bringing big data to personalized healthcare: a patient-centered framework. J Intern Med 2013;3:S660–5.

[28] Brooks M. A case for business intelligence across the continuum of care. In: McKinney C, Whitecar M, editors. Implementing business intelligence in your healthcare organization. Chicago: Healthcare Information and Management System Society (HIMSS); 2012. p. 13–23.

[29] Ransbotham S, Kiron D, Prentice K. Beyond the hype: the hard work behind analytics success. MIT Sloan Manag Rev 2016;57(3):3–19.

[30] Hripcsak G, Duke J, Shah N, Reich C, Huser V, Schuemie MS, et al. Observational health data sciences and informatics (OHDSI): opportunities for observational researchers. Stud Health Technol Inf 2015;216:574–8.

[31] Poucke SV, Thomeer M, Heath J, Vukicevic M. Are randomized controlled trials the (g)old standard? From clinical intelligence to prescriptive analytics. J Med Intern Res 2016;18(7):e185, http://dx.doi.org/10.2196/jmir.5549.

[32] Buduma N, Lacascio N. Fundamentals of deep learning. O’Reilly Media Inc; 2017.

[33] Wang G, Lam K-M, Deng Z, Choi K-S. Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques. Comput Biol Med 2015;63.

[34] Witten I, Frank E, Hall M. Data mining: practical machine learning tools and techniques. 3rd ed. Elsevier; 2011. p. 163–77.

[35] Jovanovic M, Radovanovic S, Vukicevic M, Poucke S-V, Delibasic B. Building interpretable predictive models for pediatric hospital readmission using Tree-Lasso logistic regression. Artif Intell Med 2016;72:12–21.

[36] Oldham P, Hall S, Burton G. Synthetic biology: mapping the scientific landscape. PLOS ONE 2012;7(4):e34368, http://dx.doi.org/10.1371/journal.pone.0034368.

[37] Poucke SV, Zhang Z, Schmitz M, Vukicevic M, Laenen MV, Celi LA, et al. Scalable predictive analysis in critically ill patients using a visual open data analysis platform. PLOS ONE 2016;11(1), http://dx.doi.org/10.1371/journal.pone.0145791.

[38] Zolbanin HM, Delen D, Zadeh H. Predicting overall survivability in comorbidity of cancers: a data mining approach. Decis Support Syst 2015;74:150–61.

[39] Saybani MR, Wah TY, Aghabozorgi SR, Shamshirband S, Kiah LM, Balas VE. Diagnosing breast cancer with an improved artificial immune recognition system. Soft Comput OCT 2016;20(10):4069–84, http://dx.doi.org/10.1007/s00500-015-1742-1.

[40] Tseng T, Chiang F, Liu Y, Roan J, Lin N. The application of data mining techniques to oral cancer prognosis. J Med Syst 2015;39(5):1–7, http://dx.doi.org/10.1007/s10916-015-0241-3.

[41] Shouval R, Labopin M, Unger R, Giebel S, Ciceri F, Schmid C, et al. Prediction of hematopoietic stem cell transplantation related mortality-lessons learned from the In-Silico approach: a European society for blood and marrow transplantation acute leukemia working party data mining study. PLOS ONE 2016, http://dx.doi.org/10.1371/journal.pone.0150637.

[42] Abdar M, Kalhori N, Sutikno T, Subroto I, Arji G. Data mining on the heart disease with the use of different algorithms. Int J Electr Comput Eng 2015;5(6):1569–76.

[43] Mathias JS, Agrawal A, Feinglass J, Cooper AJ, Baker DW, Choudhary A. Development of a 5-year life expectancy index in older adults using predictive mining of electronic health record data. J Am Med Inf Assoc 2013;20:118–24, http://dx.doi.org/10.1136/amiajnl-2012-001360.

[44] McRae P, Simmons G, Wong J, McDevitt J. Programmable bio-nanochip platform: a point-of-care biosensor system with the capacity to learn. Acc Chem Res 2016;49(7):1359–68, http://dx.doi.org/10.1021/acs.accounts.6b00112.

[45] Yoo I, Alafaireet P, Marinov M, Pena-Hernandez K, Gopidi R, Chang J-F, et al. Data mining in healthcare and biomedicine: a survey of the literature. J Med Syst 2012;36(4):2431–48, http://dx.doi.org/10.1007/s10916-011-9710-5.

[46] Buchman T. Emory University Hospital uses IBM streaming analytics to gain lifesaving insights on patients. IBM; 2014. http://www-03.ibm.com/software/businesscasestudies/us/en/corp?synkey=Y890037M62954T62 [accessed 09.02.16].

[47] Leveraging key data to provide proactive patient care. IBM. http://www.ibmbigdatahub.com/sites/default/files/document/ODC03157USEN.PDF [accessed 19.01.16].

[48] Ann, Lurie RH. Children’s Hospital of Chicago uses IBM Watson super computer program to quickly personalize cancer treatment; 2015. https://www.luriechildrens.org/en-us/news-events/Pages/ibm watson super computer program personalizes cancer treatment 218.aspx [accessed 10.02.16].

[49] Wright J. Health analytics in action – predictive analytics. Microsoft 2014. https://www.youtube.com/watch?v=Yl8Tp3dj8js [accessed 05.02.16].

[50] Hunter J. Value report. Carolinas healthcare systems 2016. https://www.carolinashealthcare.org/documents/chs/2016-chs-value-report.pdf [accessed 11.12.16].

[51] Lawry T. Brazilian hospital reduces length of stay and mortality rates with analytics insights; 2015. http://enterprise.microsoft.com/en-us/industries/health/brazilian-hospital-reduces-length-of-stay-and-mortality-rates-with-analytics-insights/ [accessed 16.12.16].

[52] Bonfiglioli E. The future of health is more predictive and preventive when powered by advance analytics and the trusted cloud. Microsoft; 2015. https://www.microsoft.com/en-us/health/blogs/the-future-of-health-is-more-predictive-and-preventive-empowered-by-advanced-analytics-and-trusted-cloud/default.aspx#fbid=G-KHCgkQTMX [accessed 03.02.16].

[53] Aerocrine. Microsoft; 2015. http://enterprise.microsoft.com/en-us/industries/health/aerocrine/ [accessed 08.02.16].

[54] National e-Health Strategy. Ministry of Health, Saudi Arabia; 2011. http://www.moh.gov.sa/en/Ministry/nehs/Pages/The-New-Hospital-Systems.aspx [accessed 08.04.16].

[55] Almalki M, Fitzgerald G, Clark M. Health care system in Saudi Arabia: an overview. East Mediterr Health J 2011;18(10):1078–9.

[56] Abouzahra M. Causes of failure in healthcare IT projects. IACSIT Press; 2011. http://www.ipedr.com/vol19/9-ICAMS2011-A00018.pdf [accessed 28.03.16].

[57] Alsulame K, Khalifa M, Househ M. Health in Saudi Arabia: current trends, challenges and recommendations. In: Hasman J, Househ A, Mantas S, editors. Studies in health technology and informatics. Netherlands: IOS Press; 2015. p. 213–33.

[58] Khalifa M. Organizational, financial and regulatory challenges of implementing hospital information systems in Saudi Arabia. J Health Inf Dev Count 2016;10(1):30–45.

[59] Stroetmann K, Artmann J, Protti D, Dumortier J, Giest S, Walossek U, et al. European countries on their journey towards national e health infrastructure. Luxembourg: Office for official publications of the European communities; 2011. http://es.esacproject.net/sites/intranet.esacproject.net/files/ehstrategies final report.pdf [accessed 01.01.16].

[60] Sulaiman H, Magairea A. Factors affecting the adoption of integrated cloud based e-health record in healthcare organizations: a case study of Jordan. IEEE Xplore 2014, http://dx.doi.org/10.1109/ICIMU.2014.7066612.

[61] Groves P, Kayyali B, Knott D, Kuiken S. The big data revolution in healthcare. McKinsey Quarterly; 2013. http://www.pharmatalents.es/assets/files/Big Data Revolution.pdf [accessed 12.03.16].

[62] Alsulame K, Khalifa M, Househ M. E-Health status in Saudi Arabia: a review of current literature. Health Policy Technol 2016;5(2).

[63] Almazyad A, Gulam A, Siddiqui K, Almazyad A. Effective hypertensive treatment using data mining in Saudi Arabia. J Clin Monit Comput 2010;24(6):391–401, http://dx.doi.org/10.1007/s10877-010-9260-2.

[64] Aljumah A, Siddiqui K, Gulam A. Application of classification based data mining technique in diabetes care. J Appl Sci 2013;13(3):416–22, http://dx.doi.org/10.3923/jas.2013.416.422.

[65] World Health Organization. Country cooperation strategy at a glance. WHO; 2013. http://apps.who.int/iris/bitstream/10665/136842/1/ccsbrief sau en.pdf [accessed 22.03.16].


the next frontier for innovation, competition, and productivity. McKinsey Global Institute; 2011. http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation [accessed 20.12.16].

