

RESOLUTION POTENTIAL OF KIR TYPING BASED ON HIGH-THROUGHPUT NGS GENOTYPING DATA

Ines Wagner1, Gerhard Schöfl1, Bianca Schöne1, Alexander H. Schmidt2, Vinzenz Lange1

1 DKMS Life Science Lab, Dresden, Germany; 2 DKMS gemeinnützige GmbH, Tübingen, Germany

DKMS Life Science Lab GmbH

Fiedlerstr. 34

01307 Dresden, Germany

www.dkms-lab.com

DKMS Gemeinnützige GmbH

Kressbach 1

72070 Tübingen, Germany

www.dkms.com

Results and Conclusion

• Exons 4, 5, 7 (457):
  • low to medium allelic resolution, LA ranging from 1 to 16
  • LA > 1 is in most cases due to sequence coverage limitations rather than missing phase information
• Exons 4, 5, 7, 3, 8, 9 (457389):
  • significantly higher overall allelic resolution, LA ranging from 1 to 4
  • remaining allele-level ambiguities are balanced between phasing and sequence coverage issues

Exons 3, 8 and 9 were added to our NGS KIR typing workflow for all newly registered donors in 07/2016.

Methods

Taking previously published KIR haplotype frequency data into account [1], we randomly generated 1000 artificial KIR genotypes, each consisting of two KIR haplotypes. For each KIR genotype, the exon-specific sequence compositions were identified from the IPD-KIR database for both exon combinations. Based on these sequences, we recalculated the genotype (Fig. 1). We then compared the level of ambiguity (LA) obtained when using exons 4, 5 and 7 alone or in combination with exons 3, 8 and 9. LA is the number of alleles that could not be resolved at 3-digit level: LA = 1 denotes an unambiguous typing result and LA > 1 an ambiguous one.

[1] Vierra-Green C et al. (2012), Allele-Level Haplotype Frequencies and Pairwise Linkage Disequilibrium for 14 KIR Loci in 506 European-American Individuals. PLoS ONE 7(11): e47491
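The genotype simulation described in the Methods can be sketched as follows. This is a minimal Python sketch, not the authors' implementation: the haplotype labels and frequencies in `HAPLOTYPE_FREQS` are hypothetical placeholders (real values would come from [1]), and `sample_genotypes` is an illustrative name. It assumes that the two haplotypes of a genotype are drawn independently, in proportion to their frequencies.

```python
import random

# Hypothetical haplotype frequencies (placeholder labels and values,
# NOT taken from reference [1]).
HAPLOTYPE_FREQS = {
    "hap1": 0.5,
    "hap2": 0.3,
    "hap3": 0.2,
}


def sample_genotypes(freqs, n, seed=0):
    """Draw n artificial genotypes, each a pair of haplotypes.

    Both haplotypes are sampled independently, weighted by frequency
    (an assumption of this sketch).
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    haplotypes = list(freqs)
    weights = [freqs[h] for h in haplotypes]
    return [tuple(rng.choices(haplotypes, weights=weights, k=2))
            for _ in range(n)]


genotypes = sample_genotypes(HAPLOTYPE_FREQS, 1000)
print(len(genotypes), genotypes[:3])
```

Each sampled genotype would then be translated into its exon sequences for the two exon sets before the genotype is recalculated.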

Introduction

Recently, we implemented an amplicon-based next-generation sequencing (NGS) workflow for routine KIR typing, which currently yields presence/absence calls for all KIR genes. While presence/absence data is a good starting point, donor selection might further benefit from allele-level resolution. However, in our current setting, high-resolution KIR typing results can hardly be achieved because sequencing is limited to exons 4, 5 and 7. The major challenges of an amplicon-based NGS approach are identical exon sequence configurations (ESCs) among different alleles and missing phase information, both of which cause typing ambiguities (Fig. 1). Here we report that extending the sequenced regions by exons 3, 8 and 9 significantly reduces allele-level ambiguities.

Fig. 1: Calculation of the genotype based on amplicon sequences. Colors refer to unique exon sequences. Given a sample's sequences, the genotype is calculated based on the ESCs of known alleles. Note that in this case alleles C and E share the same ESC (identical ESC) and, furthermore, two genotypes (red: C/E+G, black: B+F) cannot be distinguished from each other due to missing phase information.
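The situation in Fig. 1 can be reproduced with a small Python sketch. Everything here is an illustrative assumption: the ESC table `ALLELE_ESCS` uses made-up sequence labels (`a1`, `b2`, ...) chosen so that alleles C and E share one ESC, the function names are hypothetical, and the gene is assumed to be present in exactly two copies. Because amplicons are unphased, the sample is modeled as one set of observed sequences per exon, and every allele pair that reproduces those sets is a candidate genotype.

```python
from itertools import combinations_with_replacement

# Toy ESC table: one sequence label per exon for each allele.
# Labels are invented for illustration; C and E are identical on purpose.
ALLELE_ESCS = {
    "A": ("a1", "b1", "c2"),
    "B": ("a1", "b2", "c1"),
    "C": ("a1", "b1", "c1"),
    "D": ("a3", "b1", "c1"),
    "E": ("a1", "b1", "c1"),
    "F": ("a2", "b1", "c2"),
    "G": ("a2", "b2", "c2"),
}


def observed_sequences(genotype, escs):
    """Per-exon sets of sequences seen in an unphased sample."""
    return tuple(frozenset(seqs)
                 for seqs in zip(*(escs[a] for a in genotype)))


def candidate_genotypes(observed, escs):
    """All unordered allele pairs consistent with the observed sequences."""
    return [pair
            for pair in combinations_with_replacement(sorted(escs), 2)
            if observed_sequences(pair, escs) == observed]


# Simulate a sample that truly carries C and G, then recalculate.
observed = observed_sequences(("C", "G"), ALLELE_ESCS)
candidates = candidate_genotypes(observed, ALLELE_ESCS)
print(candidates)  # [('B', 'F'), ('C', 'G'), ('E', 'G')]
```

C+G and E+G remain because of the identical ESC, and B+F remains because the per-exon sequences cannot be assigned to a chromosome.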

Fig. 2: Resolution potential of amplicon-based KIR typing.

A and B: Diversity of allelic configurations based on sequenced region. Inner ring: KIR genes; middle ring: 3-digit allele (protein) level; outer ring: allele level. Color coding refers to allele frequency (f), which is the product of amplicon frequencies and is normalized to the number of amplicons (black: f = 0; green: f = 0.01; red: f = 1). Links (black lines) connect identical allelic configurations over the sequenced regions.

C and D: "Level" refers to the level of ambiguity (LA) and is the number of alleles / allele groups that could not be resolved at 3-digit level. Color coding describes the status of the typing level (green: unambiguous typing, LA = 1; not green: ambiguous typing, LA > 1) as well as the reason for remaining ambiguous typing results (blue: identical allelic configuration; orange: missing phase information; red: both identical allelic configuration and missing phase information).
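The classification used in panels C and D can be illustrated with a short Python sketch. It rests on assumptions that the poster does not spell out: LA is read here as the number of distinct candidate genotypes after truncating allele names to 3 digits, an ambiguity counts as "identical ESC" when candidates with the same pair of ESCs differ at 3-digit level, and as "missing phase" when candidates with different ESC pairs differ at 3-digit level. The gene name `KIRX`, the ESC labels and the function names are hypothetical.

```python
from collections import defaultdict


def three_digit(allele):
    """Truncate e.g. 'KIRX*0010101' to its 3-digit (protein) level."""
    gene, _, digits = allele.partition("*")
    return f"{gene}*{digits[:3]}"


def classify(candidates, escs):
    """Return (LA, reason) for a non-empty list of candidate allele pairs."""
    groups = defaultdict(set)  # ESC pair -> 3-digit genotypes it allows
    for pair in candidates:
        esc_pair = tuple(sorted(escs[a] for a in pair))
        groups[esc_pair].add(tuple(sorted(three_digit(a) for a in pair)))
    la = len(set().union(*groups.values()))
    if la == 1:
        return la, "unambiguous"
    identical = any(len(r) > 1 for r in groups.values())
    phase = len({frozenset(r) for r in groups.values()}) > 1
    if identical and phase:
        return la, "both"
    return la, "identical ESC" if identical else "missing phase"


# Toy ESC table (invented labels); *001 and *002 share one ESC.
ESCS = {
    "KIRX*001": ("a1", "b1", "c1"),
    "KIRX*002": ("a1", "b1", "c1"),
    "KIRX*003": ("a2", "b2", "c2"),
    "KIRX*004": ("a1", "b2", "c1"),
    "KIRX*005": ("a2", "b1", "c2"),
}
result = classify([("KIRX*004", "KIRX*005"),
                   ("KIRX*001", "KIRX*003"),
                   ("KIRX*002", "KIRX*003")], ESCS)
print(result)  # (3, 'both')
```

Other readings of LA (for example, counting ambiguous alleles per position instead of genotypes) are possible and would change the numbers but not the structure of the check.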

[Fig. 2 image: panels A, B, C and D; panel titles "Exons 457" and "Exons 457389"]