26.4: Display Color Error in the Medical Digital Image Workflow




John Penczek*, Paul A. Boynton** *Univ. Colorado, Boulder, and National Institute of Standards and Technology†, Boulder, CO

**National Institute of Standards and Technology†, Gaithersburg, MD

Abstract

The color error produced by mobile and desktop displays is evaluated and compared relative to other error sources in the medical photography workflow. Several display calibration methods were also investigated for their effectiveness in reducing the initial color errors.

Author Keywords

medical displays; color error; color accuracy; color calibration; digital image workflow.

1. Objective and background

Digital imagery is commonly used in modern medicine to record complex scenes of a patient’s anatomy and, in some cases, to make diagnostic decisions based on those images. Grayscale medical displays have matured over the years to meet the needs of the medical community through technology advances and the wide acceptance of the Digital Imaging and Communications in Medicine (DICOM) standard. [1] However, the incorporation of color images in the medical digital image workflow has been hampered by the lack of compatibility with the DICOM standard and the lack of a standardized color space. Even so, color photography is still used in fields like telepathology and teledermatology to extract critical color information. Since the quality of the color information can have an impact on the eventual accuracy or speed of the diagnosis, it is important that the colors are rendered correctly. [2] Achieving and maintaining accurate color reproduction is already challenging for desktop monitors, and will be even more so as more users start viewing medical color images on mobile displays in uncontrolled lighting environments.

As the industry works toward bringing color into the medical digital image workflow, it is valuable to gauge the current color accuracy of displays in order to estimate a baseline from which future progress can be judged. This study examines the color error produced by cellphone, tablet, and desktop displays that users may use to view medical imagery. These results are put into perspective by comparing the display contribution relative to the other elements of a medical photography workflow. In addition, we investigate how much the display color errors can be reduced by evaluating several color calibration methods.

†Quantum Measurement Division, Physical Measurement Laboratory, U.S. Department of Commerce. This is a contribution of the National Institute of Standards and Technology, and is not subject to copyright.

2. Measurement method

The display color error is determined relative to a reference color image. The reference image is a virtual color chart that spans a range of hues, and includes flesh tones. The image originates from a real color chart that was created from a composite of color patches (Fig. 1). The saturated colors are from the NIST Color Quality Scale (CQS). [3] They are Munsell samples with a matte reflective surface (NIST CQS #1-#17), and are commercially available. Many of the colors lie within the standard sRGB color gamut rendered by most displays. A few colors lie slightly outside this gamut, and served to stress the color management system. The impact of gamut mapping was evaluated separately, and will be discussed later. A range of grayshades (NIST CQS #19-#24) was added to the saturated hues to characterize the white balance. Since flesh tones are an important category of color for medical images, a range of flesh tones was also used in the reference image. These flesh tones (Fig. 1 right side) were originally adapted from the X-rite Digital ColorChecker SG. [4]

The real color chart was placed in a light booth and illuminated with daylight fluorescent, cool white fluorescent, and incandescent light sources. A precision spectroradiometer was used to measure the chromaticity of each color. Digital cameras were also used to capture the image of the color chart under the different lighting conditions. The color error produced by the camera capture process was evaluated, and reported in a prior publication. [5] A relative comparison of the digital camera color error to the display error will be discussed later.

Spectroradiometer measurements from the real color chart were used to create a reference image in Adobe Photoshop CS5. For this paper, spectroradiometer measurements of the color targets under incandescent illumination were used as reference colors. The color for each color patch was input using the spectroradiometer data. Since most color displays are designed to operate in the sRGB color space, this color space was chosen for this study. The Adobe color management system performed a color gamut mapping for any colors that were outside the sRGB gamut. The final reference image was saved in Joint Photographic Experts Group (JPEG) format in order to be compatible with all the devices tested.

Color measurements were conducted on four different displays: a cellphone display, a tablet display, a typical desktop display, and a medical monitor. The JPEG reference image was rendered on the desktop and medical monitor using Photoshop as the viewing program. This was not possible for the cellphone and tablet displays. In those cases, the native device viewer was used. Each display was placed in a darkroom with a spectroradiometer 1 m from the screen, measuring normal to the display surface. Each color patch was moved to the center of the screen in turn in order to minimize any effects from display non-uniformity. The chromaticity of each rendered color patch was measured and compared to the original spectroradiometer value used to generate the reference image. The average measurement reproducibility over all the color patches was 0.3 in terms of ΔE*ab.
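The test for whether a measured color falls inside the sRGB gamut can be sketched as follows. This is only an illustration, not the Adobe color management procedure used in the study. It assumes tristimulus values normalized so that the white has Y = 1, and it uses the standard linear-sRGB matrix for a D65 white. The function names, the tolerance, and the two example colors are hypothetical.

```python
# Linear-sRGB conversion matrix (IEC 61966-2-1, D65 white); rows give R, G, B.
XYZ_TO_LINEAR_SRGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)


def xyz_to_linear_srgb(xyz):
    """Convert CIE XYZ (white normalized to Y = 1) to linear sRGB components."""
    return tuple(sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_LINEAR_SRGB)


def in_srgb_gamut(xyz, tol=1e-3):
    """Return True when every linear sRGB component lies in [0, 1].

    The tolerance (an assumed value) absorbs rounding in the 4-digit matrix.
    """
    return all(-tol <= c <= 1.0 + tol for c in xyz_to_linear_srgb(xyz))


# Hypothetical tristimulus values for illustration only (not measured data).
mid_gray = (0.1901, 0.2000, 0.2178)
saturated_green = (0.20, 0.60, 0.05)

print(in_srgb_gamut(mid_gray))         # neutral gray: inside the gamut
print(in_srgb_gamut(saturated_green))  # negative red component: outside
```

A color that fails this test has to be moved into the gamut before it can be encoded, and that shift is one source of the gamut mapping error discussed in the results.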

3. Results

The color error of the display is estimated as the 1976 CIELAB color difference ΔE*ab between the measured patch color and the color values used to create the reference image. This error includes a small contribution due to gamut mapping, which will be demonstrated in a later discussion. The color difference is defined as:

\[
\Delta E^*_{ab} = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2} \qquad (1)
\]

26.4 / J. Penczek

348 • SID 2014 DIGEST ISSN 0097-966X/14/4501-0348-$1.00 © 2014 SID


Figure 1. Color images used to create a virtual reference image for measuring display color error. The saturated colors in the left image are the NIST CQS colors, with grayshades added underneath. A peak white (R=G=B=255) was also added at the bottom. The image on the right shows the flesh tone colors that were originally taken from the middle of the X-rite Digital ColorChecker SG color chart. The numbers in each color patch were added in order to identify each color.

where each color is defined by its lightness (L*) and chromatic (a* and b*) coordinates. A component of the color difference is the lightness difference ΔL* between two colors. It is generally accepted that a color difference of ΔE*ab = 1 is a just-noticeable difference between adjacent colors.

The initial color error produced by the four displays, as measured from the image of the NIST CQS color patches, is shown in Fig. 2. The medical monitor was delivered from the factory with a DICOM grayscale calibration. This resulted in large color errors for most of the NIST CQS chart colors. The display was then calibrated for the sRGB color space using the medical monitor’s calibration kit, and re-measured for its color accuracy. The calibration reduced the mean color error over the NIST CQS colors from 14 to 2.6 (Table 1). The initial measurements on the cellphone (smartphone) and tablet displays also produced significant color errors. The results were summarized by calculating the mean color difference and lightness difference over all the NIST CQS colors (Table 1). The displays exhibited similar trends for flesh tones, but with generally smaller color errors (Table 2).

Table 1. Summary color error data for NIST CQS colors.

Display Type                  Mean CIELAB ΔE*   Mean CIELAB ΔL*
Smartphone                    7.0               -2.0
Tablet                        6.6               -0.2
Desktop                       4.7               0.6
Medical monitor, DICOM cal    14                -11
Medical monitor, sRGB cal     2.6               0.6

Figure 2. Color error for NIST CQS color patches from the four displays tested. The medical monitor results are given with its original factory DICOM calibration, and after an sRGB calibration.
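The summary statistics in the tables can be computed along the following lines. This is a minimal sketch rather than the authors' analysis code. It assumes the measured and reference colors are available as XYZ tristimulus values with a known reference white, and it assumes ΔL* is taken as measured minus reference, a sign convention the paper does not state. All function names and the example values are hypothetical.

```python
import math


def xyz_to_lab(xyz, white):
    """Convert CIE XYZ to CIELAB relative to a reference white (same units)."""
    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = (f(c / w) for c, w in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e_ab(lab_measured, lab_reference):
    """CIE 1976 color difference between two (L*, a*, b*) triples, as in Eq. (1)."""
    return math.sqrt(sum((m - r) ** 2 for m, r in zip(lab_measured, lab_reference)))


def summarize(measured, reference):
    """Mean ΔE*ab and mean signed ΔL* (measured minus reference) over all patches."""
    n = len(measured)
    mean_de = sum(delta_e_ab(m, r) for m, r in zip(measured, reference)) / n
    mean_dl = sum(m[0] - r[0] for m, r in zip(measured, reference)) / n
    return mean_de, mean_dl


# Hypothetical (L*, a*, b*) values for two patches, for illustration only.
reference = [(50.0, 10.0, 10.0), (70.0, -20.0, 30.0)]
measured = [(52.0, 11.0, 12.0), (66.0, -16.0, 37.0)]

mean_de, mean_dl = summarize(measured, reference)
print(mean_de, mean_dl)  # prints 6.0 -1.0
```

A negative mean ΔL* under this convention would indicate that the display renders the patches darker than the reference on average.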


Table 2. Summary color error data for Digital ColorChecker SG flesh tone colors.

Display Type                  Mean CIELAB ΔE*   Mean CIELAB ΔL*
Smartphone                    3.4               -2.8
Tablet                        1.4               -0.4
Desktop                       2.4               -0.4
Medical monitor, DICOM cal    15                -14
Medical monitor, sRGB cal     1.5               0.8

The relative contribution of the display color error to other photography workflow factors is demonstrated in Fig. 3. In this example, the color error from the desktop display is compared to the error caused by a point-and-shoot camera and to the gamut mapping of the color management system for the NIST CQS colors. As might be expected, the gamut mapping color error tends to be quite small, except for the few out-of-gamut colors. The digital camera, however, dominates the color error created in the workflow in this example. These camera errors can be somewhat reduced by using higher quality cameras and daylight illumination. The factors that influence the digital camera color error are discussed in a prior publication. [5]

Although the display is a minor contributor to the total color error, it still needs to be minimized. This is commonly done through various color calibration methods. Table 3 gives the summary results for five color calibration methods applied to the medical monitor. Two of the methods allow the user to select the instrument that will perform the color measurements, and are referred to as “detector selectable”. This enables the user to select precision spot spectroradiometers in an effort to improve the quality of the calibration. However, these methods did not consistently outperform the rest. The other methods required dedicated contact detectors. The calibration method supplied by the display manufacturer performed comparatively well, given that a relatively inexpensive contact detector was used. That method, however, was able to access the display’s internal 12-bit look-up table, which seemed to have a large impact compared to the standard 8-bit video input used by the other methods.
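One plausible reason the bit depth of the look-up table matters is quantization: a tone correction applied at 8 bits merges some input codes, while the same correction stored at 12 bits keeps them distinct. The paper does not analyze this mechanism, so the sketch below is only an illustration. The native and target gamma values are hypothetical, and the function name is invented for the example.

```python
def distinct_levels(output_bits, native_gamma=2.4, target_gamma=2.2):
    """Count distinct output codes after a tone-correction look-up table.

    All 256 8-bit input codes pass through a correction curve that makes a
    panel with `native_gamma` behave like `target_gamma` (both values are
    assumptions), and each table entry is rounded to `output_bits` of precision.
    """
    max_out = 2 ** output_bits - 1
    exponent = target_gamma / native_gamma  # pre-correction applied before the panel
    lut = [round(max_out * (code / 255.0) ** exponent) for code in range(256)]
    return len(set(lut))


print(distinct_levels(8))   # fewer than 256: some input codes collapse together
print(distinct_levels(12))  # 256: every input code keeps its own output level
```

Under these assumptions, a calibration restricted to the 8-bit video path loses tonal steps that a calibration written into a 12-bit internal table would preserve.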

Figure 3. Relative contribution of color error produced by a digital camera, the gamut mapping color management system, and the desktop display for the NIST CQS colors.

Table 3. Color error produced by the medical monitor after color calibration by several different methods.

                                             NIST CQS, Post-Cal        Flesh Tones, Post-Cal
Display Calibration Type                     Mean ΔE*    Mean ΔL*      Mean ΔE*    Mean ΔL*
Monitor vendor, defined detector             2.6         0.6           1.5         0.8
Vendor 2, detector selectable                4.2         0.2           1.8         -0.6
Vendor 3, defined detector                   6.0         -0.3          3.7         -0.9
Vendor 4, defined detector                   3.5         0.5           2.5         0.4
Silverstein method [6], detector selectable  2.6         1.3           1.7         1.3


4. Impact

This study quantifies the variation in color accuracy that a user may expect for different display formats, with the mobile displays generally yielding the worst performance. This informs the medical community of the consequences of using these devices for color-critical applications. We have also demonstrated that the display accuracy can be improved by color calibration methods, and have identified a few factors that can make them more effective. Most significantly, we have shown that the color error produced in a medical photography workflow is usually dominated by the camera capture process.

5. References

[1] ACR/NEMA, Digital Imaging and Communications in Medicine (DICOM), Part 3.14, Grayscale Standard Display Function, 2011. Technical Report.

[2] E.A. Krupinski, L.D. Silverstein, S.F. Hashmi, A.R. Graham, R.S. Weinstein, H. Roehrig, “Observer performance using virtual pathology slides: Impact of LCD color reproduction accuracy,” J. Digital Imaging (2012).

[3] W. Davis, Y. Ohno, “Color quality scale,” Optical Eng. 49, Art. 033602 (2012). http://spiedigitallibrary.org/oe/

[4] Certain commercial equipment, instruments, materials, systems, and trade names are identified in this paper in order to specify or identify technologies adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the systems or products identified are necessarily the best available for the purpose.

[5] J. Penczek, P.A. Boynton, J.D. Splett, “Color error in the digital camera image capture process,” J. Digital Imaging, 27, 182-191 (2013).

[6] L.D. Silverstein, S.F. Hashmi, K. Lang, E.A. Krupinski, W. Dallas, H. Roehrig, “Achieving high color reproduction accuracy in LCDs for color-critical applications,” SID Symposium Digest, Society for Information Display 42, 1026-1029 (2011).
