lecture 4 - stanford university · cell502 poor graphs figure 1. classification of tfbs regions...

Lecture 4: Visualization



Outline

• Basic plotting commands
• Types of plots
• Customizing plots graphically
• Specifying color
• Customizing plots programmatically
• Exporting figures


Why use Matlab for visualization?

• Matlab is flexible enough to let you quickly visualize data, and powerful enough to give you complete control over the final product

• Features:
  • Interactive plotting
  • Simple 3D plotting
  • Programmatic annotation


Basic Plots

• 2D Visualization
  • plot (line plots)
  • histogram
  • scatter (scatter plots)
  • image/imagesc (images)

• 3D Visualization
  • surf/mesh (surfaces)
  • plot3 (lines)
  • scatter3


Working with Figures

• Create a new figure: figure();
• Specify a figure number: figure(1)
• Hold onto a figure “handle”:
  • figHandle1 = figure(1);
• Re-select a figure:
  • figure(figHandle1)
• Some useful functions:
  • clf: clear figure
  • close all: closes all figures
  • gcf: get handle to current figure
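The figure commands listed on this slide can be tried together in a short script. This is a minimal sketch: the second handle (figHandle2), the isCurrent flag, and the plotted data are illustrative assumptions, not part of the slides.

```matlab
% Create (or select) figure 1 and keep its handle.
figHandle1 = figure(1);
plot(1:10, (1:10).^2);                 % draws into figure 1

% Opening a second figure makes it the current figure.
figHandle2 = figure(2);

% Re-select the first figure and check that it is current again.
figure(figHandle1);
isCurrent = isequal(gcf, figHandle1);  % gcf returns the current figure

clf;          % clear the contents of the current figure (figure 1)
close all;    % close every open figure
```

Keeping the handle in a variable means later code can return to the same figure without relying on which window happens to be selected.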


Demo: Figures


plot
– Syntax: plot(x,y) plots points in the vector y against points in the vector x


histogram/bar
– Syntax: histogram(y) plots a histogram of the values in y
– bar(x,y) plots bars at the points given by (x,y)


scatter
– Syntax: scatter(x,y,s,c) lets you specify the size (s) and color (c) of each point given by (x,y)


image/imagesc
– Syntax: image(C) plots the values stored in the matrix C as an image
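The 2D plot types from the last few slides can be compared side by side with made-up data. This is a sketch only: the data, marker sizes, and the choice of imagesc over image are assumptions for illustration.

```matlab
x = linspace(0, 2*pi, 50);      % 50 sample points
y = sin(x);

figure;
plot(x, y);                     % line plot of y against x

figure;
histogram(randn(1, 1000));      % histogram of 1000 normally distributed values

figure;
bar(1:5, [3 1 4 1 5]);          % one bar per (x, y) pair

figure;
s = 20 + 80*rand(1, 50);        % marker size for each point
c = y;                          % color each point by its y value
scatter(x, y, s, c, 'filled');

figure;
C = magic(8);                   % 8x8 matrix holding the values 1..64
imagesc(C);                     % like image(C), but scales values to the colormap
```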


surf & mesh
– Syntax: surf(x,y,z) and mesh(x,y,z) are used to visualize a surface in three dimensions


plot3
– Syntax: plot3(x,y,z) plots points in 3D
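For the 3D commands, a small sketch along these lines may help. The surface function and the helix are arbitrary choices made for illustration.

```matlab
% Build a grid of (x, y) locations and evaluate a surface on it.
[X, Y] = meshgrid(-2:0.1:2, -2:0.1:2);
Z = X .* exp(-X.^2 - Y.^2);

figure;
surf(X, Y, Z);                      % filled, colored surface

figure;
mesh(X, Y, Z);                      % wireframe version of the same surface

% plot3 draws a line through points in 3D, here a helix.
t = linspace(0, 10*pi, 500);
figure;
plot3(cos(t), sin(t), t);
```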


Demo: Plot Types

http://www.mathworks.com/help/matlab/2-and-3d-plots.html


subplots
– the ‘subplot’ command lets you plot multiple plots on one figure
– syntax: subplot(nRows, nCols, index)

(Figure 1: three panels side by side in one row: subplot(1,3,1), subplot(1,3,2), subplot(1,3,3))


subplots
– the ‘subplot’ command lets you plot multiple plots on one figure
– syntax: subplot(nRows, nCols, index)

(Figure 2: a 3-by-2 grid of panels; the left column holds subplot(3,2,1), subplot(3,2,3), subplot(3,2,5) and the right column holds subplot(3,2,2), subplot(3,2,4), subplot(3,2,6))
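The panel numbering in the 3-by-2 layout can be checked with a short loop. This is a sketch with random data; the titles are added only so each panel shows its own index.

```matlab
nRows = 3;
nCols = 2;

figure;
for index = 1:nRows*nCols
    subplot(nRows, nCols, index);   % select panel number 'index'
    plot(rand(1, 10));              % placeholder data
    title(sprintf('subplot(%d,%d,%d)', nRows, nCols, index));
end

% Count the axes that were created in the current figure.
nAxes = numel(findobj(gcf, 'Type', 'axes'));
```

The index runs along each row first, which is why panels 1, 3, and 5 end up in the left column.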


Demo: Subplots


Other functions
– gca: get handle to current axes
– panel


Panel()
– user-submitted function from Matlab File Exchange (FEX)
– http://www.mathworks.com/matlabcentral/fileexchange/20003-panel
– Provides MUCH more control over subplot positioning, layout, margins, etc.


Customizing Graphs Graphically

Plot Tools


Demo: Customizing Graphically


The plot() function (again)

• plot() plots along dimension 1 of an array.
• If the array is a matrix, plot creates a separate line for each column
• If your data isn’t constructed this way, just transpose with the apostrophe character:
  • plot(data')
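The column-wise behavior of plot() is easy to confirm by counting the line handles it returns. This is a minimal sketch; the three signals and the variable names are made up for illustration.

```matlab
t = (0:0.1:10)';                    % 101x1 column of sample times
data = [sin(t), cos(t), 0.1*t];     % 101x3: one signal per column

figure;
hCols = plot(data);                 % one line per column: 3 lines

dataByRow = data';                  % 3x101: one signal per row

figure;
hWrong = plot(dataByRow);           % still one line per column: 101 short lines

figure;
hFixed = plot(dataByRow');          % transpose first: 3 lines again
```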


Demo: Plot()


Plot Line Style

• For line plots, specify the line type using a format string:
  • plot(x,y,'b') % plots blue line (default)
  • plot(x,y,'b.') % plots blue dots
  • plot(x,y,'b:') % plots blue dotted line
  • plot(x,y,'k--') % plots black dashed line
  • plot(x,y,'ro') % plots red circles

• Chain together characters for full specification of color, marker, and line
  • plot(x,y,'ro-') % plots red circles with solid line
  • plot(x,y,'ro:') % plots red circles with dotted line
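The format strings above can be drawn on one set of axes to compare them. This is a sketch: the vertical offsets and the hLine variable are assumptions added so the curves do not overlap and the last one can be inspected.

```matlab
x = linspace(0, 2*pi, 20);

figure;
hold on;                              % keep adding to the same axes
plot(x, sin(x),     'b');             % blue solid line
plot(x, sin(x) + 1, 'b.');            % blue dots
plot(x, sin(x) + 2, 'b:');            % blue dotted line
plot(x, sin(x) + 3, 'k--');           % black dashed line
plot(x, sin(x) + 4, 'ro');            % red circles, no connecting line
hLine = plot(x, sin(x) + 5, 'ro-');   % red circles joined by a solid line
hold off;

% The format string is shorthand for three line properties.
lineStyle = get(hLine, 'LineStyle');  % '-'
marker    = get(hLine, 'Marker');     % 'o'
color     = get(hLine, 'Color');      % [1 0 0]
```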


Plot Line Style


Demo: LineSpec


Customizing Programmatically

• Everything you can do graphically you can also do programmatically.

• DON’T do something by hand if you have to do it more than once!

• Examples
  • axis labels: xlabel('text'), ylabel('text')
  • plot/axis title: title('text')
  • Add text: text(x,y,'text to add')


Customizing Programmatically

• Graphics parameters are usually specified as ‘parameter’, value pairs:

  • plot(x,y,'linewidth',1.4)
  • plot(x,y,'bo-','linewidth',2,'markersize',15)
  • plot(x,y,'o-','MarkerFaceColor',[1 0 0],'MarkerEdgeColor',[0 0 1])
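Putting the parameter/value pairs together with the labeling functions from the previous slide might look like the sketch below. The data, label text, and property values are illustrative assumptions.

```matlab
x = linspace(0, 2*pi, 25);
y = sin(x);

figure;
h = plot(x, y, 'o-', ...
    'LineWidth', 2, ...
    'MarkerSize', 8, ...
    'MarkerFaceColor', [1 0 0], ...   % red marker fill
    'MarkerEdgeColor', [0 0 1]);      % blue marker outline

xlabel('x (radians)');
ylabel('sin(x)');
title('Sine wave');
text(pi, 0, '  zero crossing');       % annotation placed at the point (pi, 0)

% Read the properties back from the line handle.
lw   = get(h, 'LineWidth');
face = get(h, 'MarkerFaceColor');
```

Because everything is in a script, the same styling can be reapplied to new data without clicking through the plot tools again.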


Other useful functions

• grid on: adds grid lines
• axis off: turns off the axes
• colorbar: adds a colorbar to an image plot
• colormap hot: switches the colormap
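A quick way to see these four commands in action is sketched below. It assumes the built-in peaks function as sample data, which the slides do not mention.

```matlab
figure;
imagesc(peaks(50));        % peaks returns a sample 50x50 surface
colormap hot;              % switch this figure to the 'hot' colormap
cmap = colormap;           % read back the colormap now in use
colorbar;                  % show how values map to colors
axis off;                  % hide the axes, ticks, and labels

figure;
plot(1:10, (1:10).^2);
grid on;                   % add grid lines to the line plot
```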


Demo: Customizing Programmatically


Plot colors

Matlab has 8 built-in colors:

Black (k), Red (r), Blue (b), Green (g), Cyan (c), Magenta (m), Yellow (y), White (w)

We can specify other colors using RGB (red, green, blue) notation:
red = [1 0 0]
blue = [0 0 1]
green = [0 1 0]
gray = [0.2 0.2 0.2]
black = [0 0 0]

All RGB colors are 1x3 arrays whose elements all lie between 0 and 1.
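An RGB triplet is passed to plot through the 'Color' property. The sketch below uses two custom colors (orange and a mid gray) chosen purely for illustration.

```matlab
x = linspace(0, 1, 50);

orange = [1 0.5 0];        % custom color: full red, half green, no blue
gray   = [0.5 0.5 0.5];    % equal parts of each channel give a gray

figure;
hold on;
plot(x, x,    'Color', orange, 'LineWidth', 2);
plot(x, x.^2, 'Color', gray,   'LineWidth', 2);
plot(x, x.^3, 'm');        % built-in color letter: magenta
hold off;
```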


Demo: Color


Colormaps

Colormaps are used to specify how data gets mapped onto different colors.

Matlab has a few built-in colormaps, but you can also specify your own!
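A custom colormap is just an n-by-3 matrix of RGB rows. As a minimal sketch, assuming a simple white-to-blue ramp (the name whiteToBlue and the 64-color length are arbitrary choices):

```matlab
n = 64;                                  % number of colors in the map
ramp = linspace(0, 1, n)';               % n-by-1 column running from 0 to 1

% Each row is one RGB color: the first row is white, the last row is blue.
whiteToBlue = [1 - ramp, 1 - ramp, ones(n, 1)];

figure;
imagesc(peaks(50));                      % sample data from the built-in peaks
colormap(whiteToBlue);                   % apply the custom map to this figure
colorbar;
```

A single-hue ramp like this maps larger values to a steadily more saturated color, so the ordering of the data stays readable.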


Why are colormaps important?


Much better!


Avoid the jet colormap (the default in Matlab releases before R2014b)


Manipulating Figures

Figures in Matlab are referenced using “handles”, which are pointers to different parts of the figure.

Example:
myhandle = plot(x,y);

This returns a handle to the plotted line. Then you can run the following:

get(myhandle); % to see a list of properties
set(myhandle,'Name',Value); % to set the value of a property
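A concrete round trip through get and set might look like this sketch. The property names 'LineWidth' and 'Color' stand in for the generic 'Name' on the slide, and the values are arbitrary.

```matlab
x = linspace(0, 2*pi, 100);

figure;
myhandle = plot(x, sin(x));           % handle to the line object

get(myhandle);                        % prints every property of the line

set(myhandle, 'LineWidth', 3);        % change one property...
set(myhandle, 'Color', [0 0.5 0]);    % ...and another (dark green)

w = get(myhandle, 'LineWidth');       % read a single property back
```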


Manipulating Figures

Different parts of the figure are organized hierarchically:

>> gcf
>> gca
>> get(gca,'Children')
>> get(gcf,'Children')

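The figure, axes, and line hierarchy from the Manipulating Figures slide can be walked in both directions. This is a sketch with illustrative variable names; it checks the 'Parent' property, which is the reverse of the 'Children' lookup shown on the slide.

```matlab
figure;
hLine = plot(1:10, rand(1, 10));

fig = gcf;                               % handle to the current figure
ax  = gca;                               % handle to the current axes

axChildren  = get(ax,  'Children');      % objects drawn inside the axes
figChildren = get(fig, 'Children');      % objects placed inside the figure

% Walking upward: the line belongs to the axes, the axes to the figure.
lineParentIsAxes   = isequal(get(hLine, 'Parent'), ax);
axesParentIsFigure = isequal(get(ax, 'Parent'), fig);
```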

Demo: Annotating plots


Exporting Figures - Formats

Matlab saves figures using its own .fig format.

To share figures or view them outside Matlab, export to other formats, including:

JPG, PNG, EPS, PDF, TIFF


Bitmap vs. Vector graphics

Two main classes of image formats: bitmap vs. vector graphics

Bitmap (jpg, png):
• Fixed image sizes
• Best for actual images (pictures of stuff)

Vector (eps, pdf):
• Variable image sizes
• Best for line / bar graphs, scatter plots, etc.


Exporting Figures

print(figHandle,filename,formattype)

e.g.: print(figure(1),'MyPlot','-dpng')

formats: '-dpng', '-depsc2', '-dpdf', etc.

add flag for resolution: '-r300', etc.
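A short export script might look like the sketch below. Writing into the temporary directory (tempdir) is an assumption made so the example does not clutter the working folder; any writable path would do.

```matlab
fig = figure;
plot(1:10, (1:10).^2);

% Base file name without an extension; print adds one to match the format.
outBase = fullfile(tempdir, 'MyPlot');

print(fig, outBase, '-dpng', '-r300');   % bitmap at 300 dpi -> MyPlot.png
print(fig, outBase, '-dpdf');            % vector graphics   -> MyPlot.pdf
print(fig, outBase, '-depsc2');          % color EPS level 2 -> MyPlot.eps
```

The resolution flag matters mainly for the bitmap output; the vector formats scale without it.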


Demo: Exporting Figures


Other Resources

2D and 3D visualization examples:
http://www.mathworks.com/help/matlab/2-and-3d-plots.html

Custom colormaps:
http://colorbrewer.org

Panel:
http://www.mathworks.com/matlabcentral/fileexchange/20003-panel

Colors in figures (blog post):
http://figuredesign.blogspot.com/2012/04/meeting-recap-colors-in-figures.html


Poor Graphs

Figure 1. Classification of TFBS Regions

TFBS regions for Sp1, cMyc, and p53 were classified based upon proximity to annotations (RefSeq, Sanger hand-curated annotations, GenBank full-length mRNAs, and Ensembl predicted genes). The proximity was calculated from the center of each TFBS region. TFBS regions were classified as follows: within 5 kb of the 5′ most exon of a gene, within 5 kb of the 3′ terminal exon, or within a gene, novel or outside of any annotation, and pseudogene/ambiguous (TFBS overlapping or flanking pseudogene annotations, limited to chromosome 22, or TFBS regions falling into more than one of the above categories).

…imental data, preliminary evidence for the presence of novel transcripts was derived from chromosome 21 and 22 RNA maps (Kapranov et al., 2002) and from the publicly available EST data. Novel transcripts were verified using RT-PCR analyses in 9/11 regions and were found to have little coding capacity (less than 50 amino acids). Northern hybridization analysis of these isolated transcripts with strand-specific oligonucleotides or riboprobes indicate that they are polyadenylated, in some cases spliced, and are present as single and multi-exon isoforms ranging in size from 800 bp to 9 Kb (Supplemental Figure S3 on Cell website). Together with the strand-specific RT-PCR data, this suggests that several of them might also be antisense to known genes, such as, for example, EP300 (Figures 2C and 2D), UBASH3A (Supplemental Figures S2A and S2B online), SEC14L2 (Supplemental Figures S2C and S2D), and others.

The Ewing sarcoma gene (EWSR1) (Plougastel et al., 1993), the tumor suppressor gene, EP300 (Gayther et al., 2000), and mitogen-activated protein kinase MAPK1 (Gonzalez et al., 1992) on chromosome 22 illustrate potential utilization of common TFs to regulate both well-characterized and novel transcripts (Figure 2). Sequence analysis of the novel transcripts that overlap EWSR1 and EP300 indicate that they are spliced RNAs. Interestingly, a conserved region in the 3′ UTR of the EWSR1 gene is consistent with the evidence of antisense regulation of this gene (Lipman, 1997). The EP300 gene is a striking example (Figures 2C and 2D), having a TFBS region 17 kb away from the 3′ end and a novel transcript that splices from this site into the 3′ end of the gene. Additionally, overlapping novel transcripts from the genes encoding nuclear protein UBASH3A (Supplemental Figures S2A and S2B), phosphatidylinositol transfer-like protein SEC14L2 (Supplemental Figures S2C and S2D), TBC/rabGAP domain protein EPI64 (Supplemental Figures S2E and S2F), guanine-nucleotide exchange factor TIAM1 (Supplemental Figures S2G and S2H), KIAA0376 protein (Supplemental Figures S2I and S2J), and GTSE1 (Supplemental Figures S2K and S2L) were verified by RT-PCR and/or Northern blot analyses (Supplemental Figure S3). In many of these cases, the TFBS that are located on the 3′ end of the well-characterized gene appear to be located 5′ of the overlapping novel transcript, which suggests that these transcripts may be regulated by these factors and in precisely the same way as protein coding genes.

Additional supporting evidence that these TFs may be regulating antisense transcripts was found by relating them to full-length mRNAs and ESTs with confidently assignable strandedness (determined from splicing and polyadenylation sites and signals). 1782 clusters of transcripts were formed of well-oriented sequences from public databases aligning to chromosomes 21 or 22. Among these clusters, there was a significant association (chi-square p value < 10^-15) between the property of proximity to a noncanonical TF and the property of having evidence for transcription on the opposite strand. In this context, a noncanonical TF is one not located at the 5′ end of a known gene and evidence for transcription on the opposite strand is based on public sequence data. Twenty-one percent (363) of these transcript clusters are made up of sense antisense pairs, 44% (161) have an associated noncanonical TF. Of the 161 sense antisense pairs that have a noncanonical TF, 52% contain at least one site conserved between the human and mouse genomes based on BlastZ human-mouse alignments (Schwartz et al., 2003).

Differential Expression Patterns of Novel Transcripts

To address the issue of whether the observed overlapping noncoding transcripts are biologically important, we examined whether some of them exhibited a reproducible and coordinated program of differential expression correlated with the companion coding transcripts. The expression profiles of the poly(A)+ cytosolic RNA fraction were monitored during the response of a pluripotent human germ cell tumor-derived cell line, NCCIT, which undergoes retinoic acid (RA)-induced differentiation into keratin- and neurofilament-positive somatic cells (Damjanov et al., 1993). Empirically derived transcriptional maps of NCCIT using the chromosome 21 and 22 genome tiling arrays during various stages of

Cawley et al., Cell, Volume 116, Issue 4, 20 February 2004.


Poor Graphs

Cotter et al., Journal of Clinical Epidemiology 57 (2004)

D.J. Cotter et al. / Journal of Clinical Epidemiology 57 (2004) 1086–1095 1093

D.J. Cotter et al. / Journal of Clinical Epidemiology 57 (2004) 1086–1095 1091

(Figure: stacked bar chart. x-axis: Hematocrit Group (%), with bars for <30, 30-<33, 33-<36, 36-<39, >=39, and All. y-axis: Proportion, 0% to 100%. Legend: <=8,738 units/wk; >8,738-13,944 units/wk; >13,944-21,692 units/wk; >21,692 units/wk.)

Fig. 1. Distribution of epoetin dose by quartiles Q1–Q4, using mean dose per week (units/wk) disaggregated by hematocrit group. Within each epoetin dose quartile, the distribution of dosing resembles a bell-shaped curve around the recommended target hematocrit range (33% to <36%). Quartiles are represented by shaded segments on histogram bars, darkest for the first quartile (bottom), lightest for the fourth quartile (bar), with the following values: Q1, ≤8,738; Q2, >8,738 to 13,944; Q3, >13,944 to 21,692; Q4, >21,692.

Table 3. Unadjusted 1-year mortality rates per 1,000 patients, by hematocrit level and epoetin dose quartile

                   Hematocrit group
Dose quartile (a)  <30%   30% to <33%   33% to <36%   36% to <39%   ≥39%   All
Q1                 271    245           185           184           177    203
Q2                 344    278           212           195           186    232
Q3                 425    316           247           199           180    265
Q4                 501    354           280           227           196    310
All                412    297           225           200           186    251

(a) For dose quartiles, see Table 2.

Surrogates fail for a number of reasons and can be explained by one or another of many failed-surrogate mechanisms [13,16]. In the case of epoetin, mistaken conclusions can potentially occur using two different mechanisms. One, if the surrogate end point is associated with the actual clinical end point due to a shared common cause, a treatment that addresses the surrogate end point without affecting the common causal agent may not have an effect on the actual end point. In a second possibility, treatments can affect outcomes through unanticipated causal pathways that are unrelated to the surrogate end point. The difference in mortality rates among patients with similar hematocrit levels could be related to either of these possibilities. We will discuss the clinical interpretation of each of these possibilities.

As shown in Fig. 3B, the observed relationship between hematocrit and mortality could be due to other, potentially unmeasured, aspects of a patient’s health status that independently affect hematocrit, epoetin responsiveness and survival. Factors affecting epoetin responsiveness are not well understood [28], but several possibilities have been mentioned in the literature. Ma et al. [7] cited inflammation as one possible common cause of poor epoetin response and mortality, and Tonelli et al. [29] found an association between sensitivity to epoetin and markers of inflammation. Current guidelines call for investigation of inflammation when a patient exhibits a poor response to epoetin [14]. Because inflammation and protein-energy malnutrition have a high prevalence and are found to be closely related to each other in dialysis patients, they are referred to together as malnutrition–inflammation complex syndrome (MICS) [30]. Taken together, MICS is hypothesized to blunt the responsiveness of anemia of ESRD to epoetin. Although the possible interactions between inflammatory and nutritional markers and their influence on anemia and epoetin hyporesponsiveness require further study, poor responders who continue to have low hematocrit levels despite receiving high doses of epoetin may not benefit significantly from more epoetin. With patients who are poor responders to epoetin therapy, K/DOQI recommends that they be considered for other approaches that might be complementary to increase response, such as iron supplementation, improved dialysis adequacy, or improved nutrition.

Recently, Ifudu et al. [31] studied 309 hemodialysis patients to determine the relative effects of adequacy of dialysis and intravenous iron on hematocrit. Pointing out that patients with low hematocrit levels may have received inadequate dialysis and may have been inappropriately administered excess intravenous iron as a corrective measure, Ifudu et al. [31] concluded that adequacy of dialysis predicts the response to epoetin therapy. Tonelli et al. [29], however, in a study of 135 chronic hemodialysis patients dialyzed to a Kt/Vurea of 1.6 (where K is clearance, t is time, and V is volume) found no such relationship. The authors theorized a possible threshold effect. Both Tonelli et al. [29] and Eschbach et al. [28] reported that lower serum albumin levels