www.aphl.org
The Impact of Data Visualization for Laboratories
Stephen Soroka, MPH
LIMS Scientific Advisor
Centers for Disease Control and Prevention
Garrett Peterson, MBA
Chief Strategic Officer
Yahara Software
Harvey Kaufman, M.D.
Senior Medical Director, Medical Informatics
Quest Diagnostics
Presenter
Presentation Notes
Our three speakers today will talk about the importance and need for data visualization in the laboratory, take a look at some of the different kinds of visualization techniques and tools available, and explore real life examples of how data can be displayed to inform and impact public health. Garrett Peterson, Chief Strategic Officer for Yahara Software, will begin with an overview of visualization theories and practices and delve a bit into the history. Stephen Soroka, LIMS Scientific Advisor for CDC will look at how labs can actually put their data to best use. Finally, Dr. Harvey Kaufman, Senior Medical Director at Quest Diagnostics will showcase some results that demonstrate how laboratories can actually share and display information to impact public health at large.
A BACKGROUND IN DATA VISUALIZATION
Garrett Peterson, MBA
Chief Strategic Officer
Yahara Software
A PICTURE IS WORTH A THOUSAND WORDS….
Presenter
Presentation Notes
Pictures and images are a part of our everyday life. We see them in every shape and size and in every aspect of information sharing. Simply put…they make us think. We use visuals to evaluate infrastructure, to understand training concepts, and to evaluate key performance indicators for an organization. It is information.
PICTURES/VISUALIZATIONS AND DATA
Information is:
• Everywhere
• Readily accessible
• Large and complex
• Difficult to make sense of it all
• Need ways to convey meaning
Presenter
Presentation Notes
And information, or data, is everywhere. In many cases it is readily accessible, but it is also becoming increasingly large and more complex…thus making it difficult to make sense of it all. We need ways to communicate more effectively and efficiently to convey meaning and purpose. And visualizations help us to understand the data that surrounds us.
VISUALIZATION OF DATA IS NOT A NEW CONCEPT
Presenter
Presentation Notes
Visualizing data is not a new concept. A classic example is Charles Joseph Minard’s graphic of Napoleon’s march. One does not need to understand French to gain from this graphic an understanding of how the size of Napoleon’s army changed during its attempt to invade Russia. The width of the path depicts the size of his army in relation to geographic position, time, and temperature. While this does not give the entire picture of the 1812 Russian campaign, one may think…“don’t invade Russia in the winter.”
WHAT MAKES A GOOD VISUALIZATION?
• Needs a purpose
• Someone willing to listen
• Reveals data at different levels
Starts with the creator understanding their data!
Presenter
Presentation Notes
So the question becomes…what makes a good visualization? First and foremost, the visual needs a purpose…a goal or story to tell. But it also needs an audience…someone willing to hear what the visual has to say. It also must present the data in the most effective way, allowing it to reveal data at different levels. Human perception of visuals is based on one’s knowledge along with how information is presented. This all starts with its creator having a good understanding of their data.
WHAT MAKES A GOOD VISUALIZATION?
The right visual:
• Tells a story
• Communicates clearly and efficiently
• Triggers human perception
Presenter
Presentation Notes
This is John Snow’s map of the 1854 London cholera outbreak along Broad Street in Soho, London. A Voronoi diagram shows that the deaths were primarily occurring in households that were closer to the Broad Street pump than to the other pumps in London. This work is often considered the birthplace of epidemiology. It was also a major catalyst for scientists to question the miasma theory, which attributed disease to bad air, and eventually led them to the germ theory constructs understood today. This all starts with its creator having a good understanding of their data.
WHAT MAKES A GOOD VISUALIZATION?
Concise descriptor
Easily Read
Symbols
Details available but not cluttering the overall scene
“The greatest value of a picture is when it forces us to notice what we never expected to see”
-John W. Tukey
Presenter
Presentation Notes
Tufte visualization
SO, WHAT MAKES A BAD VISUALIZATION?
• Distractions, dstractions, DiSTRaCTIOnS
Presenter
Presentation Notes
The last example here highlights the concept of distractions or “chart junk” that impair a person’s ability to perceive the information. In this case, the image of bananas, the color palette, and the amount of data for a 3D chart make it difficult to ascertain what the creator is trying to tell the audience.
SO, WHAT MAKES A BAD VISUALIZATION?
• The numbers don’t add up
Presenter
Presentation Notes
Garrett provided examples of some good visualizations, but how can one spot a bad visualization? Some cases are more obvious than others. In this example, the numbers surrounding what appears to be a pie chart simply don’t add up: they total 188%. An incorrect visualization like this gives a false sense of reality and thus loses its audience.
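A quick sanity check can catch this kind of error before a chart is published. The sketch below is only an illustration: the function name, the tolerance, and the slice values are hypothetical, and the "bad" values were chosen solely to reproduce a 188% total, not taken from the slide.

```python
def shares_sum_to_whole(percentages, tolerance=0.5):
    """Return True when pie-chart shares add up to 100% within a rounding tolerance."""
    return abs(sum(percentages) - 100.0) <= tolerance


# Hypothetical slice values, chosen only so that the total is 188%.
bad_slices = [70.0, 63.0, 55.0]
# Hypothetical slice values that form a valid whole.
good_slices = [43.0, 32.0, 25.0]

print(sum(bad_slices), shares_sum_to_whole(bad_slices))
print(sum(good_slices), shares_sum_to_whole(good_slices))
```

A small tolerance is allowed because published percentages are usually rounded.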
SO, WHAT MAKES A BAD VISUALIZATION?
• The creator does not know their data and/or their graphics
Presenter
Presentation Notes
In this example of a spurious correlation, the data show a very high correlation between cheese consumption and the number of people who died by becoming tangled in their bedsheets. These are clearly two different datasets that have absolutely no relationship to each other but somehow correlate very well. This comes from a series of spurious correlations presented by Tyler Vigen.
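To see how easily this happens, consider a minimal sketch that computes the Pearson correlation coefficient for two series. Both series below are synthetic, made-up numbers (they are not the cheese or bedsheet figures); the only thing they share is that each drifts upward over ten periods.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Synthetic series A: an arbitrary quantity that rises over ten periods.
series_a = [10.0, 10.4, 11.1, 11.3, 12.0, 12.6, 12.9, 13.5, 14.2, 14.4]
# Synthetic series B: an unrelated quantity that also happens to rise.
series_b = [300, 318, 331, 352, 360, 384, 395, 410, 433, 441]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")
```

Any two quantities that trend in the same direction over time will score a high r, which is why the creator has to understand where the data come from before charting them together.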
SELECTING A VISUALIZATION APPROACH
Presenter
Presentation Notes
When selecting a visualization approach there are several aspects of the problem to consider. I generally break the problem down first by the nature of the data and then the filtering and aggregation tactics needed. Based upon these parameters we then find ourselves with a pretty clear choice of what the appropriate delivery mechanisms might be.
www.extremepresentations.com
A. Abela
Presenter
Presentation Notes
Once we have a good idea of what we are trying to do with that information, we can select the correct chart type for the job based upon that goal and the structure of the data we are working with. I find this decision tree written by Dr. Andrew Abela to be very helpful in deciding what type of chart will work best based upon my data and what I am trying to communicate.
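The idea of a chart-selection decision tree can be sketched in a few lines. This is a heavily simplified illustration and not a reproduction of the chart on the slide: the four purposes (comparison, distribution, relationship, composition) follow the usual chart-chooser framing, while the function name, parameters, and specific branch conditions are assumptions made for this example.

```python
def suggest_chart(purpose, n_variables=1, over_time=False):
    """Suggest a chart family from the communication purpose and data shape."""
    if purpose == "comparison":
        # Comparing values across time reads best as a line; otherwise use bars.
        return "line chart" if over_time else "bar chart"
    if purpose == "distribution":
        # One variable: histogram. Two or more: scatter plot.
        return "histogram" if n_variables == 1 else "scatter plot"
    if purpose == "relationship":
        # A third variable can be encoded as bubble size.
        return "scatter plot" if n_variables <= 2 else "bubble chart"
    if purpose == "composition":
        # Parts of a whole, either changing over time or as a static snapshot.
        return "stacked area chart" if over_time else "pie chart"
    raise ValueError(f"unknown purpose: {purpose!r}")


print(suggest_chart("comparison", over_time=True))
print(suggest_chart("distribution", n_variables=1))
print(suggest_chart("relationship", n_variables=3))
```

The value of writing the choice down this way is that the question "what am I trying to show?" has to be answered before any chart is drawn.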
USE DYNAMIC VISUALIZATIONS TO TELL A STORY
Presenter
Presentation Notes
Sometimes the information we are trying to convey really depends on the end user being capable of interacting with the visualization to clarify the point. Here is a public health (albeit not laboratory) example of just that. The user is given the ability to create a pro-forma budget for a typical individual living under very tight economic realities. In trying to make the budget work, the user quickly sees the very difficult scenario these non-custodial parents living in poverty face on a daily basis.
SANKEY DIAGRAMS ALLOW USERS TO EXPLORE
Presenter
Presentation Notes
This is an example of a Sankey diagram (all test data). It is another example of a problem that can be explored much more easily using an interactive visualization. In this case, users can visually filter information using a Sankey diagram that they create on the fly: they first select the filter criteria hierarchy and then the sorting mechanisms within it by dragging and dropping elements onto the screen.
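Behind a Sankey diagram is a simple table of flows: for each pair of adjacent levels in the chosen hierarchy, how many records move from one value to the next. The sketch below shows one way to derive that table. The field names and records are hypothetical test data, and the function is an illustration rather than the tool shown on the slide.

```python
from collections import Counter


def sankey_links(records, hierarchy):
    """Count flows between consecutive levels of the chosen hierarchy."""
    links = Counter()
    for record in records:
        # Pair each level with the next one: (level 1, level 2), (level 2, level 3), ...
        for source_field, target_field in zip(hierarchy, hierarchy[1:]):
            links[(record[source_field], record[target_field])] += 1
    return links


# Hypothetical test records.
records = [
    {"submitter": "Clinic A", "test": "Influenza PCR", "result": "Positive"},
    {"submitter": "Clinic A", "test": "Influenza PCR", "result": "Negative"},
    {"submitter": "Clinic B", "test": "Influenza PCR", "result": "Negative"},
    {"submitter": "Clinic B", "test": "Culture", "result": "Negative"},
]

links = sankey_links(records, ["submitter", "test", "result"])
for (source, target), count in links.items():
    print(f"{source} -> {target}: {count}")
```

Reordering or shortening the `hierarchy` list is the code equivalent of dragging and dropping a different set of filter criteria onto the screen.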
DASHBOARDS CAN GIVE OPERATIONAL INSIGHTS
Presenter
Presentation Notes
Dashboards are a natural extension of dynamic visualizations. They are primarily created to monitor dynamically generated operational data, such as laboratory operations information or surveillance programs.
THE VALUE OF DATA VISUALIZATIONS FOR PUBLIC HEALTH LABORATORIES
Stephen Soroka, MPH
LIMS Scientific Advisor
Centers for Disease Control and Prevention
Presenter
Presentation Notes
Visualizations serve a necessary role in helping people make sense of their data and make decisions. This value is critically important for public health laboratories.
LAB SCIENCE AND OPERATIONS ARE DATA-DRIVEN
Presenter
Presentation Notes
Laboratory science and the operations of the lab are driven by data, and really at a scale previously unimagined. The ubiquity of access and the volume of data are fundamentally transforming the scientific process within the lab and are helping to modernize the lab.
THE KEY PRODUCT OF A LAB IS DATA!
Presenter
Presentation Notes
The key product of the lab is its data, which holds the answers to many new scientific insights in public health. More and more, a lab’s viability relies on the data it generates being shared within and outside of the lab to support research, clinical studies, surveillance, outbreak response, patient care, and informing the public, among numerous other public health activities.
LABS PRODUCE A LOT OF DATA
Specimens: 10,000
Data Elements per Specimen: 100
Specimen Records: 1,000,000
Result Records per Specimen: 10
Total Records: 1.1M
• Example: Laboratory Information Management Systems (LIMS)
• Designed to manage the specimen lifecycle and overall lab management
• Capable of storing lots of lab data
Presenter
Presentation Notes
And the modern laboratory produces lots of data, probably more data than the lab even realizes! A good example would be a Laboratory Information Management System (LIMS). It is designed to manage the lifecycle of a specimen (from accessioning to testing to reporting), as well as some lab management functions such as managing equipment or reagents. This type of system records and stores critical and supportive data. For example, if a LIMS stores data from 10,000 specimens, each specimen may have 100 different data elements associated with it, leading to 1M specimen records! Adding on result information further increases this volume. Today, many public health labs are testing tens to hundreds of thousands of specimens each year!
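The arithmetic behind the slide's totals can be written out as a short sketch. One assumption is made here: the slide does not state how the 1.1M total is derived, so the code reads "10 result records per specimen" as 10,000 × 10 = 100,000 additional records, which is the reading that reproduces the slide's total. The variable names are illustrative.

```python
specimens = 10_000
data_elements_per_specimen = 100
result_records_per_specimen = 10

# Specimen-level records: one per data element per specimen.
specimen_records = specimens * data_elements_per_specimen

# Result-level records (assumed reading: ten result records for each specimen).
result_records = specimens * result_records_per_specimen

total_records = specimen_records + result_records

print(f"Specimen records: {specimen_records:,}")
print(f"Result records:   {result_records:,}")
print(f"Total records:    {total_records:,}")
```

Scaling `specimens` up to the hundreds of thousands tested by many labs each year shows how quickly the volume grows.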
USE CASES OF DATA VISUALS IN THE LAB
• Data cleaning
• Data analysis, mining, and monitoring
• QMS trending (e.g. TAT, submitter management, testing volume)
• Surveillance studies
• Assay and validation comparisons
• Dashboards
Presenter
Presentation Notes
Data is used throughout the operational, analysis and information delivery aspects of every laboratory. The use of visualizations in supporting lab operations can help you in virtually every situation, from understanding how clean your data are, to identifying patterns and trends in testing volume, to supporting the validation of an assay.
WHAT CAN BE DONE TO HELP LABS BETTER UNDERSTAND THEIR DATA?
PURPOSE: Enable easy-to-use visualization capabilities to laboratorians without the need for programming experience
HOW: Integrate visualization tools within your laboratory
BENEFITS: Provides an easy-to-use service for laboratorians to view/analyze data in near real-time
Presenter
Presentation Notes
So what can be done to help labs make better sense of the data that they generate? One way is by giving labs the capability to create and use visualizations without needing to program. This can be accomplished through point-and-click data visualization tools, which provide a service to the lab that enables real-time analysis and understanding of their data.
TOOLS TO CREATE VISUALS
• PC technology aided progression
• Spreadsheets
• Programming software
  • e.g. SAS, R, Matlab, Minitab, etc.
• Visualization Software
  • e.g. Tableau, SAS Visual Analytics
Presenter
Presentation Notes
There are many tools available to the average laboratory to create visualizations. Common tools in use today are the charting capabilities built into spreadsheets such as Microsoft Excel. But there are also many visualization packages and software tools available from commercial vendors and as open source projects that allow for more creativity with more functionality. Many visualization products provide great flexibility in their use. From point-and-click through programming, these products allow labs to perform visual analytics, provide dashboards for oversight of operations, and produce visuals for presentations and publications.
DASHBOARDS CAN GIVE OPERATIONAL INSIGHTS
Presenter
Presentation Notes
Visualization tools give laboratorians the ability not only to view their data in real time but also to run quick analyses of their data. Dashboards are one way to give operational oversight into the data that matter most for a lab. These can require a fair amount of foresight and preparation to create properly and be useful. But once set up, the visuals can be automatically updated as new data are generated.
DASHBOARDS CAN GIVE OPERATIONAL INSIGHTS
Presenter
Presentation Notes
An additional feature of visualization tools is the ability to link dynamic visuals. We can see in this example that a laboratory is interested in the quality of its specimens. By displaying some information about specimen quality on a dashboard, the laboratory is able to select ‘sample contaminated’, and the lower table is adjusted to show only information regarding that quality issue. In turn, the laboratory now has real insight into its data and can take the necessary actions to improve the quality of its specimens.
USE BI TOOLS TO EXPLORE AND GAIN INSIGHTS
Presenter
Presentation Notes
BI tools allow us to explore lab data, to do deep dives into our data, and to look at data in new and meaningful ways. In this example, a BI tool was used to create a histogram showing the relationship between test order submissions and which lab partners were submitting these orders. The process took just two clicks to create this visual. The graph itself is not very useful, but with a BI tool you can very quickly recognize this and change the visualization.
USE BI TOOLS TO EXPLORE AND GAIN INSIGHTS
Presenter
Presentation Notes
Through a single click, the graph was changed to a heat map that does indeed produce a better representation of the data. However, there are still too many values to display clearly. BI tools accommodate this through the use of filters or, in this case, a ranking of the data.
USE BI TOOLS TO EXPLORE AND GAIN INSIGHTS
Presenter
Presentation Notes
By ranking the data and only showing the top ten ordered tests and the top ten submitters, the laboratory can now very easily get a better sense of the top ordered tests and who has been sending them. All of this can be simply done through a few clicks using a BI tool.
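For readers who want to see what the ranking step does under the hood, here is a minimal sketch. The order records, the lab and test names, and the function name are all hypothetical; a BI tool performs the equivalent operation through its filter controls.

```python
from collections import Counter


def top_n_matrix(orders, n=10):
    """Keep only orders from the n busiest submitters for the n most-ordered tests.

    `orders` is an iterable of (submitter, test) pairs. The result maps each
    remaining (submitter, test) pair to its order count, i.e. the heat map cells.
    """
    orders = list(orders)
    submitter_counts = Counter(submitter for submitter, _ in orders)
    test_counts = Counter(test for _, test in orders)
    top_submitters = {name for name, _ in submitter_counts.most_common(n)}
    top_tests = {name for name, _ in test_counts.most_common(n)}
    return Counter(
        (submitter, test)
        for submitter, test in orders
        if submitter in top_submitters and test in top_tests
    )


# Hypothetical test orders.
orders = [
    ("Lab A", "HIV"), ("Lab A", "HIV"), ("Lab A", "Flu"),
    ("Lab B", "HIV"), ("Lab B", "Flu"), ("Lab C", "TB"),
]

# With n=2, the least active submitter and least-ordered test drop out.
cells = top_n_matrix(orders, n=2)
for (submitter, test), count in sorted(cells.items()):
    print(submitter, test, count)
```

The default of `n=10` mirrors the top-ten ranking described above; the example uses `n=2` only because the sample data are tiny.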
INVESTMENT IN TOOLSET IS WORTH IT
Presenter
Presentation Notes
Another example of how BI tools can support laboratories is through exploratory data analysis. In this example, we are looking at the distribution of patient age. Here we can see that there are some potential outliers…or some really, really, REALLY old patients!
INVESTMENT IN TOOLSET IS WORTH IT
Presenter
Presentation Notes
In understanding the data, the laboratory knows there are other variables that can be used to filter the data. In this case, there is an age unit associated with the patient age, and we can see that within the data, age units are captured in various increments…not just in years.
INVESTMENT IN TOOLSET IS WORTH IT
Presenter
Presentation Notes
By removing the other age units, the laboratory can see a better distribution of their data. This exploration could have identified outliers in the data, and additional explorations of the data could allow the laboratory to create new variables to allow this type of data to be normalized. These are just a few examples of ways that laboratorians can empower themselves to visualize their data, gain real-time insights into their data, and improve the efficiency of their operations and the data that they collect and produce. Now I will turn it over to Harvey, who will talk to you about data visualizations created from laboratory data and the impact that they have on public health. Harvey…take it away.
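The "new variable" mentioned above could be as simple as an age expressed in years regardless of the unit recorded. The sketch below assumes hypothetical unit labels ("years", "months", "weeks", "days") and made-up records; a real LIMS may code its age units differently.

```python
# Conversion factors from each assumed age unit to years.
UNIT_TO_YEARS = {
    "years": 1.0,
    "months": 1.0 / 12.0,
    "weeks": 7.0 / 365.25,
    "days": 1.0 / 365.25,
}


def age_in_years(age, unit):
    """Convert an (age, unit) pair to years; raises KeyError for unknown units."""
    return age * UNIT_TO_YEARS[unit.strip().lower()]


# Hypothetical records: (patient age, age unit) as they might be captured.
records = [(45, "Years"), (18, "Months"), (400, "Days"), (6, "Weeks")]

for age, unit in records:
    print(f"{age} {unit} -> {age_in_years(age, unit):.2f} years")
```

Without the unit, the third record would plot as a 400-year-old patient, which is exactly the kind of apparent outlier seen in the first histogram.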
EXAMPLES OF DATA VISUALIZATION
Harvey W. Kaufman, M.D.
Senior Medical Director, Medical Informatics
Quest Diagnostics
Presenter
Presentation Notes
Focus of using lab results to impact public health
Nearly four in nine (43%) patients have a vitamin D level that is categorized as deficient or suboptimal
CATEGORIZATION BY VITAMIN D
Presenter
Presentation Notes
This is a simple example where showing three numbers with a graph and colors is more impactful than the three numbers alone. Two ways are shown to display the distribution. I also like the take-home message being spelled out.
TRENDS IN LDL CHOLESTEROL: 2001-2017
Kaufman HW, Blatt AJ, Huang X, Odeh MA, Superko HR (2013) Blood Cholesterol Trends 2001–2011 in the United States: Analysis of 105 Million Patient Records. PLoS ONE 8(5): e63416. doi:10.1371/journal.pone.0063416
National Lipid Association Annual Conference, Abstract, 2019
This is an example of trending data. The time trend is clear. Differences by sex and age range are shown.
HOW DOES LDL CHOLESTEROL COMPARE TO LDL PARTICLE NUMBER?
Three groups sorted by LDL cholesterol ranges (<100 mg/dL, 100-129 mg/dL, and 130-159 mg/dL)
Display percent of each group by LDL particle number
Presenter
Presentation Notes
Example of three overlapping distributions and the cut-points used to categorize risk.
Presenter
Presentation Notes
A five-digit ZIP Code map of Buffalo showing a strong association with pre-1950 housing. No key is needed.
Kaufman HW, Chen Z. Trends in Laboratory Rotavirus Detection: 2003 to 2014. Pediatrics 2016;138.
Over the 11-year period, 276,949 specimens.
This different pattern suggests that natural immunity may have protection in the unlikely vaccinated children as they age, but that protection wanes among children who are likely to have been vaccinated.
Presenter
Presentation Notes
Complex message. The upper right graph shows test volume as a blue line and positives as red bars. One can see seasonality and an every-other-year cycle. In the bottom graph, the positivity by age range is shown in each set. The first and third sets show the impact of herd immunity; these children were all unlikely to have been vaccinated. The final set of bars represents children who were likely immunized. As with the children who were unlikely to have been immunized, the positivity rate has decreased significantly from the pre-vaccine period. However, the trend by age has shifted, suggesting waning immunity.
HCV ANTIBODY POSITIVITY RATES
Vary by Birth Cohort and Median Income Range
• Persons living in the lower median income areas had higher HCV antibody positivity rates, compared to persons living in the higher median income areas.
• Within each median income level, the positivity patterns stayed the same, i.e., baby boomers had the highest HCV antibody positivity rates.
Presenter
Presentation Notes
Again, visualization is perfect for displaying relationships and multiple points. There are two key points. First, HCV antibody positivity is highest among baby boomers, which supports CDC guidelines for this cohort to be tested. Second, there is a strong inverse association with median income. This suggests that targeting these infected individuals for treatment will pose special challenges. The call-out box is effective at summarizing the take-away.
DRUG MIS-USE: CONTINUALLY EVOLVING
[Figure: Non-Prescribed Fentanyl and Gabapentin Positivity, 2016-2018. Bar chart with two panels, United States and Massachusetts, each showing 2016, 2017, and 2018. Series: non-prescribed fentanyl and non-prescribed gabapentin. Y-axis: positivity, 0% to 20%. Bar labels as extracted (assignment to bars not recoverable): 2.5%, 2.2%, 2.4%, 2.7%, 6.5%, 7.5%, 9.4%, 9.6%, 13.4%, 10.0%, 16.4%, 18.4%.]
Presenter
Presentation Notes
One of the challenges of our time is drug mis-use. This graph shows positivity of two commonly mis-used drugs, fentanyl, a powerful opioid, and gabapentin. The benchmark is all patients tested in the United States and the focus is on the rising trends in Massachusetts.
Thank you
HOW TO LEARN MORE
• Many tools supporting visual analytics
• Talk with Data Managers / Statisticians
• Join Data Visualization Scientific Groups
• Search the web
Presenter
Presentation Notes
Information about visualization is everywhere. The best way to learn is by doing and talking with others.
“YOU CAN SEE A LOT BY LOOKING”
-Yogi Berra
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.