

GENOMICS 26, 115-122 (1995)

Construction of a YAC Contig Spanning the Xq13.3 Subband

LAURENT VILLARD,* JOZEF GECZ,*,1 LAURENCE COLLEAUX,* ANNE-MARIE LOSSI,* JAMEL CHELLY,† YUMIKO ISHIKAWA-BRUSH,† ANTHONY P. MONACO,† AND MICHEL FONTES*,2

*INSERM U406, Faculté de Médecine de La Timone, 27, Boulevard Jean Moulin, 13385 Marseille CEDEX 5, France; and †Human Genetics Laboratory, Imperial Cancer Research Fund, Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford OX3 9DU, United Kingdom

Received September 22, 1994; revised December 12, 1994

The loci involved in several X-linked mental retardation syndromes have been linked to the pericentromeric region of the X chromosome long arm (Xq12-q21). To isolate candidate genes for these diseases, we set up the construction of YAC contigs spanning this region. Two of these syndromes (the Juberg-Marsidi syndrome and the α-thalassemia mental retardation syndrome) have been recently linked, with high lod scores, to polymorphic probes previously assigned to Xq13.3. We therefore constructed a first YAC contig, encompassing this band, from DXS441 to PGK1. The physical map, deduced from the isolated clones, extends over 2.1 Mb of genomic DNA. Restriction analysis of the YAC contig allowed us to map precisely the loci previously assigned to that chromosomal region and to define their relative order. The validity of this physical map has been checked by comparing SfiI digests of the YACs to genomic fragments obtained with the same enzyme. A cDNA selection approach, already performed with a previous partial contig, has been extended to cover the whole region. © 1995 Academic Press, Inc.

INTRODUCTION

It is well known that in humans more males than females suffer from mental retardation (MR). Genetic linkage analysis in families segregating MR has demonstrated that several loci located on the X chromosome are genetically linked to MR, a condition named X-linked mental retardation (XLMR). The most recent data indicate that up to 127 different XLMR entities (excluding fragile X syndrome), whether syndromal or nonsyndromal, have been assigned to different regions of the human X chromosome (reviewed in Schwartz, 1993; Neri et al., 1994). Among these, 50% have been associated with loci in the pericentromeric region of the human X chromosome (Schwartz, 1993; Neri et al., 1994). To obtain molecular resources that would be useful in the isolation of the genes involved in these diseases, we have chosen the following strategy: (1) to construct a YAC contig covering the region; (2) to use these resources to isolate candidate genes.

1 Present address: Department of Genetics, Slovak Academy of Sciences, Vlarska 5, 83334 Bratislava, Slovakia.

2 To whom correspondence should be addressed. Telephone: 33-91 78 44 77. Fax: 33-91 80 43 19.

Recently, the genes involved in the α-thalassemia mental retardation syndrome (ATR-X) and in the Juberg-Marsidi syndrome (JMS) have been closely linked to markers extending from DXS159 (Xq12) to DXYS1 (Xq21) (Gibbons et al., 1992; Saugier-Veber et al., 1993, respectively). As several patients showing deletion of the Xq21 band (Cremers et al., 1989), without the clinical presentation of ATR-X or JMS, have been described, we can suppose that the loci for these syndromes are likely to be located in Xq13. Moreover, the maximal lod scores have been obtained with polymorphic probes located in the Xq13.3 subband (DXS441, lod score 3.24 at θ = 0.00 for JMS; and PGK1, lod score 3.46 at θ = 0.00 for ATR-X). Thus, we decided to set up the construction of a contig covering the region from Xq12 to Xq21 by constructing a first YAC contig spanning the Xq13.3 subband.
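The lod scores quoted above can be read as odds. By the standard definition, a lod score is the base-10 logarithm of the ratio between the likelihood of the data at a given recombination fraction θ and the likelihood under free recombination (θ = 0.5), so raising 10 to the lod score gives the odds in favor of linkage. The short sketch below applies this conversion to the two reported values; the function name `lod_to_odds` is ours and is used for illustration only.

```python
def lod_to_odds(lod: float) -> float:
    """Convert a lod score (log10 likelihood ratio) into odds in favor of linkage."""
    return 10.0 ** lod


# Lod scores reported in the text, both at theta = 0.00
reported = {"DXS441 (JMS)": 3.24, "PGK1 (ATR-X)": 3.46}

for locus, lod in reported.items():
    odds = lod_to_odds(lod)
    print(f"{locus}: lod {lod:.2f} -> odds of roughly {odds:,.0f} to 1")
```

For reference, a lod score of 3 corresponds to odds of 1000 to 1, so both reported values lie above that level.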

To achieve this goal we screened two YAC libraries: the ICRF library (screened by hybridization with existing probes) and the CEPH library (screened with previously described PCR primers or with primers designed from sequences determined in the laboratory; see below). Using this strategy we had reported the construction of a YAC contig linking DXS56 to PGK1 (Gecz et al., 1993). In this paper we describe a 2.1-Mb contig (including previously described and new YACs) encompassing all of the Xq13.3 subband from DXS441 to PGK1. This approach allowed us to map the probes assigned to that region of the human X chromosome. However, a YAC contig (and thus a physical map deduced from these clones) may not always be exactly representative of the structure of the corresponding genomic region. The comparison of the map obtained from YACs and the genomic map is difficult, mainly because the rare-cutter sites are not methylated in yeast, but are methylated in mammalian genomes. However, a few rare-cutter sites are not methylated in the human genome. SfiI is one of them and yields relatively large fragments. Thus, we compared SfiI digests from YACs and from a genomic source to validate our contig.

0888-7543/95 $6.00. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

In a further step, we decided to isolate new genes from this region (only two genes were known to be present in this region: PGK1 and the gene involved in Menkes disease, ATP7A, both located in the same 300-kb segment; Michelson et al., 1983; Vulpe et al., 1993; Chelly et al., 1993; Mercer et al., 1993). For this purpose a cDNA selection approach was chosen. This strategy, already applied in the distal part of the contig (Gecz et al., 1993), led us to propose four new genes. Three other sequences, located in this region, have been isolated from cDNA libraries; however, we are not sure that these sequences represent true genes.

MATERIALS AND METHODS

YAC clone isolation. ICRF library (Larin et al., 1991) screening was performed as described (Monaco et al., 1992). Pools of the CEPH library (Albertsen et al., 1990) were screened for the presence of the different loci with the primers already published for probe DXS441 (Ram et al., 1993) and with the following primers that we designed after sequencing the pQST24R1a probe (DXS325): 5' CGGTCACATATCTCTGTCTTC and 5' GAACCACAGCATTACTCTGCT (325-1 and 325-2, respectively). Official YAC numbers for the ICRF library are as follows: ICRFy900F04146 (4767); ICRFy900D1156 (4812); ICRFy900C0254 (4817); ICRFy900C1059 (4806); ICRFy900G0601 (4622); ICRFy900G0701 (4623); ICRFy900F0762 (4551); ICRFy900A085 (4548).
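As a quick sanity check on the two DXS325 primers listed above, the following sketch computes their length, GC content, reverse complement, and a rule-of-thumb melting temperature. It assumes that the hyphen printed inside the first primer is a line-break artifact and not part of the sequence. The Wallace rule used here (2°C per A or T, 4°C per G or C) is a rough estimate intended for short oligonucleotides; it is our choice for illustration and is not stated in the paper, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in degrees C: 2 per A/T plus 4 per G/C (rough estimate only)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as printed in the text (hyphenation removed, an assumption)
PRIMERS = {
    "325-1": "CGGTCACATATCTCTGTCTTC",
    "325-2": "GAACCACAGCATTACTCTGCT",
}

for name, seq in PRIMERS.items():
    print(
        f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
        f"Wallace Tm {wallace_tm(seq)} C, revcomp {reverse_complement(seq)}"
    )
```

Matching length and base composition across a primer pair is a common design goal, since it lets both primers anneal at the same temperature.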

Restriction analysis of YAC clones and genomic DNA. Yeast cells were grown and YAC DNA was prepared as described (Gecz et al., 1993). For PFGE analysis, the YAC DNA in agarose blocks was digested with three restriction enzymes (SfiI, MluI, and SacII) in buffers supplied by the manufacturer (New England Biolabs). DNA from the 4X-containing cell line was digested with 40 units of SfiI for 24 h in the buffer supplied by the manufacturer (New England Biolabs). The digested YACs or genomic DNA was separated in 1% agarose gels in a CHEF apparatus (Bio-Rad). After electrophoresis for 18-46 h at 200 V, with pulse times varying from 40 to 120 s, gels were blotted onto Biodyne membranes (Pall Industries) and sequentially hybridized with end probes (pYAC4 vector restriction fragments; Burke et al., 1987), with internal specific probes, and finally with total human DNA.

cDNA selection. cDNA selection was performed as described (Gecz et al., 1993). A human fetal brain cDNA library cloned in Lambda ZAP II (gift of M. Djabali) was used.

Probe preparation and hybridization conditions. All probes were labeled using the random priming method as described (Feinberg and Vogelstein, 1984). All Southern blot hybridizations were performed in SSC buffer (5x SSC, 5x Denhardt's, 0.1% SDS) for 16-20 h at 65°C in a rotary hybridization oven (Hybaid). After hybridization, filters were washed for 20 min in 5x SSC, 0.5% SDS; 2x SSC, 0.1% SDS; and 1x SSC, 0.1% SDS, and then from 0 to 40 min in 0.1x SSC, 0.1% SDS, depending on the stringency of the wash. All membranes were then exposed to Hyperfilm MP (Amersham) from 1 h to 5 days, depending on the blot and the probe used.

DNA sequencing. Plasmid DNA was prepared from 10-ml over- night cultures using the Qiagen kit (Diagen) and sequenced using the dideoxynucleotide chain termination method with fluorescent M13 universal and reverse primers on a Pharmacia (ALF) automated DNA sequencer. Sequencing reactions were carried out as described (Zimmermann et al., 1990).

RESULTS

YAC Clone Isolation and Characterization

Several DXS loci and one gene have been localized within Xq13.3 by somatic cell hybrid mapping (Lafrenière et al., 1991). To build a YAC contig covering the whole subband, the ICRF (Larin et al., 1991) and CEPH (Albertsen et al., 1990) YAC libraries were screened with several of the following probes (Table 1): L2.98 (DXS56), pX63c (DXS171), pQST24R1a (DXS325), pRX87H1a (DXS347), pRX176M1a (DXS356), pRX214H1c (DXS441), and pSPT-PGK (PGK1). For each locus, several YACs were isolated (Table 1) by either hybridization (ICRF library) or PCR screening (CEPH library). When no PCR primers were available for screening, we sequenced the concerned probe and designed primers (see Materials and Methods).

YACs were first sized by hybridization with total human DNA on undigested PFGE-separated clones. Several yeast cells contained more than one YAC and were excluded from further analysis (data not shown).

A first contig, encompassing the most distal loci DXS56 and PGK1, has already been described (Gecz et al., 1993). New YACs, described in this work, were used to build a second contig spanning the loci DXS171, DXS325, DXS347, DXS356, and DXS441. In an effort to link the DXS56 YACs with the DXS171, DXS325, DXS347, DXS356, and DXS441 YACs, we screened the CEPH MegaYAC library, using primers for DXS441 (Ram et al., 1993) or designed in the laboratory for DXS325 (see Materials and Methods). This screening allowed us to isolate additional YACs, two positive for DXS441 and one positive for DXS325 (Table 1). One DXS441-positive YAC (959H11) was found to contain DXS441, DXS171, and DXS56, but turned out to be negative for the in-between probes (DXS325, DXS347, and DXS356). This YAC was found to have its central part deleted (see below). As for the DXS325-positive YAC, it was found to be positive for DXS356, but not for DXS56. As these findings did not allow us to close the gap between DXS356 and DXS56, we performed IRS-PCR fingerprinting of all of the isolated YACs to detect potential overlaps between these clones. From fingerprinting data (Fig. 1A and data not shown), we deduced that several DXS56-positive YACs had fingerprinting similarities with a CEPH MegaYAC (880H6), containing the loci DXS356 and DXS325. To confirm this potential overlap, we used an Alu-PCR product, derived from YAC 880H6 (AM1300, Fig. 1A), as a probe and found that it hybridizes to YACs 880H6, 4548, 4622, 4623, 959H11, and 4551 (Fig. 1B and data not shown). The Alu-PCR product corresponding to AM1300, synthesized using the DNA of different YACs, was gel-purified and subjected to restriction analysis (Fig. 1C). All three AM1300 fragments showed the same restriction pattern using three restriction enzymes. Thus, we concluded that these YACs contained the same sequence. These findings confirmed the overlap between these clones and closed the gap between the two previous contigs.
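The reasoning used above for YAC 959H11 (positive for two proximal loci and one distal locus, negative for the loci in between) can be expressed as a simple check on STS content: given a locus order, any locus that lies between a clone's outermost positive loci but scores negative points to an internal deletion or a chimeric clone. The sketch below is an illustration of that logic and not the authors' procedure; it borrows the locus order reported in this paper, takes the marker content of 959H11 and 880H6 from the text, and uses a hypothetical function name.

```python
# Locus order reported in the paper, centromere to telomere
LOCUS_ORDER = ["DXS441", "DXS171", "DXS347", "DXS325", "DXS356", "DXS56", "PGK1"]


def internal_gaps(positive_loci, order=LOCUS_ORDER):
    """Return loci lying between a clone's outermost positive loci that score negative."""
    positive = set(positive_loci)
    indices = sorted(order.index(locus) for locus in positive)
    if not indices:
        return []
    span = order[indices[0] : indices[-1] + 1]
    return [locus for locus in span if locus not in positive]


# Marker content as stated in the text
yacs = {
    "959H11": ["DXS441", "DXS171", "DXS56"],
    "880H6": ["DXS325", "DXS356"],
}

for name, loci in yacs.items():
    gaps = internal_gaps(loci)
    status = f"missing {', '.join(gaps)}" if gaps else "contiguous"
    print(f"{name}: {status}")
```

A clone flagged this way is not necessarily unusable: as described in the text, 959H11 was still included in the contig once its deletion had been characterized.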

TABLE 1

YACs Used to Build the Contig Map

Loci (probes), in column order: DXS441 (pRX214H1c), DXS171 (pX63c), DXS347 (pRX87H1a), DXS325 (pQST24R1a), DXS356 (pRX176M1a), DXS56 (L2.98), PGK1 (pSPT-PGK).

YAC ID    Positive loci   Size (kb)   Library   Status
959H11    + + +           800         CEPH      Deleted
851H7     + +             1000        CEPH      Coligated
4767      + + +           440         ICRF      N
4812      + + +           650         ICRF      N
4817      + +             230         ICRF      N
4806      + +             640         ICRF      N
880H6     + +             800         CEPH      N
4622      +               510         ICRF      N
4623      +               560         ICRF      N
4551      +               450         ICRF      N
4548      +               800         ICRF      N
326B10    +               320         CEPH      N

Note. N, not coligated, not deleted. See Materials and Methods for official ICRF YAC numbers.

FIG. 1. (A) IRS-PCR fingerprinting of Xq13.3 YACs conducted using primers A1 (primers described in Gecz et al., 1993). Under these conditions (56°C annealing temperature) YAC 4812 does not yield any amplification product. (B) Mapping of probe AM1300 derived from YAC 880H6 on a YAC panel (EcoRI digestions; lane 959H11 is underloaded). (C) Restriction analysis of the AM1300 Alu-PCR fragments synthesized from the DNA of three different YACs from the region. Note that the restriction pattern, using three different restriction enzymes, is identical for the three YACs.

Restriction Analysis of the YAC Contig

Mapping of the different YAC clones was conducted using rare-cutter restriction endonucleases to determine distances between the different loci (Fig. 2). During these experiments, several YACs were shown to be deleted and/or rearranged. However, a coherent restriction enzyme analysis profile within a given part of the clones allowed us to include them in the YAC contig (e.g., YAC 959H11 showing a 750-kb deletion in the middle of an original YAC clone or YAC 851H7 being rearranged proximally to DXS441; see below for this latter YAC). After collecting all restriction enzyme analysis data, we were able to define distances within the YAC contig. The most centromeric locus within Xq13.3 is DXS441, the most distal being PGK1. From centromere to telomere, the locus order is DXS441-DXS171-DXS347-DXS325-DXS356-DXS56-PGK1 within a 2.1-Mb region. Using single and double enzyme digestions of the YAC clones, we were able to assign a minimal interval for each locus of less than 100 kb (Fig. 2).

FIG. 2. 2.1-Mb YAC contig. Location of Xq13.3 reference loci is indicated by hatched boxes. Restriction sites are indicated (M, MluI; S, SacII; F, SfiI) by small vertical bars on each YAC and on the corresponding consensus map. Dotted lines represent regions where YACs do not correspond to the genomic DNA (deletion for YAC 959H11 and coligation for YAC 851H7). The location of Xq13.3 polymorphic loci is also indicated. Right and left ends are indicated for each YAC (R and L, respectively).
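The way single and double digestions narrow the interval for a locus can be illustrated with a toy calculation. A probe that hybridizes to one fragment in a digest with enzyme A and to one fragment in a digest with enzyme B must lie in the region where those two fragments overlap. The sketch below uses an entirely hypothetical 500-kb clone with invented site positions; none of the coordinates come from Fig. 2, and the function names are ours.

```python
def fragment_intervals(length_kb, sites_kb):
    """Split a linear clone of given length at the given cut positions (all in kb).

    Sites are assumed to lie strictly inside the clone.
    """
    bounds = [0] + sorted(set(sites_kb)) + [length_kb]
    return list(zip(bounds, bounds[1:]))


def overlap(a, b):
    """Intersection of two (start, end) intervals, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


# Hypothetical clone and site positions, for illustration only
CLONE_LENGTH_KB = 500
SFI_SITES_KB = [120, 380]
MLU_SITES_KB = [200]

sfi_fragments = fragment_intervals(CLONE_LENGTH_KB, SFI_SITES_KB)
mlu_fragments = fragment_intervals(CLONE_LENGTH_KB, MLU_SITES_KB)
double_digest = fragment_intervals(CLONE_LENGTH_KB, SFI_SITES_KB + MLU_SITES_KB)

# Suppose a probe lights up the 260-kb SfiI fragment and the 200-kb MluI fragment
probe_interval = overlap(sfi_fragments[1], mlu_fragments[0])

print("SfiI fragments:", sfi_fragments)
print("MluI fragments:", mlu_fragments)
print("Double digest:", double_digest)
print("Probe interval:", probe_interval)
```

In this invented case the probe is confined to an 80-kb interval, even though each single digest alone places it on a fragment of 200 kb or more.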

Comparison of the YAC Contig with Genomic DNA

To confirm that the YACs used to build the contig were fully representative of the genomic region under analysis, we performed a comparison of the SfiI restriction fragments within the YAC clones and within genomic DNA of a 4X-containing cell line. SfiI is a convenient restriction enzyme to use for such a purpose as it is not sensitive to methylation events. For locus DXS441, the SfiI fragment that contains the locus is different on genomic DNA and on the 851H7 YAC, thus indicating that this YAC is rearranged. As identical positions for several restriction sites are found distally to DXS441, we assume that this YAC is rearranged proximally to this locus. Identical SfiI restriction fragments were identified in genomic DNA and in YAC 4812 for locus DXS325, and in genomic DNA and in YACs 4548 and 4622 for locus DXS56. These results led us to the conclusion that the YAC contig that we constructed did not contain major rearrangements and/or deletions with respect to the genomic map. These data confirmed the genomic pulsed-field map recently proposed by Lafrenière et al. (1993). Moreover, we have indicated in Fig. 2 the location of the polymorphic loci located within Xq13.3 to integrate the genetic map with the presented physical map. Locus 4548-7 is described in Graeber et al. (1992).

Isolation and Characterization of New cDNA Clones

To isolate new genes from the region, we used a cDNA selection approach (Parimoo et al., 1991; Lovett et al., 1991). In a previous work (Gecz et al., 1993) we demonstrated the validity of this approach, which allowed us to isolate three new genes in the DXS56-PGK1 interval. We thus conducted similar experiments, using the newly isolated YACs described in this paper (CEPH 880H6 and 959H11, ICRF 4767; see Fig. 2).

From the clones of the different "mini-libraries" of selected cDNAs, 10 to 40% turned out to be ribosomal DNA clones and 1 to 10% human repetitive element-containing sequences. These clones were discarded. The remaining clones were mapped to YAC panels (Fig. 3). These experiments led us to propose a potential localization for five clones in the genomic region covered by the YACs. These clones have been hybridized to the other clones of the mini-libraries. One sequence (959H11S1-D4) was found to be present in six other clones. Sequencing of this clone showed that it corresponds to the region of the XNP gene containing a trinucleotide repeat (Gecz et al., 1994). One sequence (880H6S2-A8) was present in three other clones. Other sequences (880H6S2-G5, 880H6S2-E5, and 4767S1-B3) were present once in a given mini-library.

FIG. 3. Mapping of clone 4551-E9 in the Xq13.3 region. Different YAC clones were digested with SfiI or MluI (or both) and separated by pulsed-field gel electrophoresis. The resulting gel was blotted and hybridized with the E9 cDNA. YACs 4623 and 4548 are partially digested with SfiI. S. cerevisiae chromosomes were used as molecular weight markers.

The previous results and the new data presented in this study can be outlined as follows:

Seven new sequences, presenting homologies to sequences located in Xq13.3 (Table 2), have been isolated. The different cDNAs were first sequenced and compared to each other:

- Clones 4551-E9, 4551-I5, and 4551-H5 are identical and do not correspond to any other known gene. This sequence is present only once in the human genome (data not shown) and is located in Xq13.3 (Fig. 3). Hybridization to poly(A)+ Northern blots does not reveal any signal even after long exposure, nor do RT-PCR experiments. Moreover, the clone is colinear with the genomic DNA. Thus, we consider this sequence a contaminant of the cDNA library (probably a genomic fragment).

- Clone 326B10-J15 encodes a 4.5-kb transcript, specifically expressed in skeletal muscle (Gecz et al., 1993).

- Clone 4551-I14 encodes a 1.5-kb ubiquitous transcript (Gecz et al., 1993). These two clones do not show any homology with a previously isolated sequence.

- Clone 4767S1-B3 is identical to a part of the ribosomal protein S26 coding sequence (RPS26) when compared against GenBank. This means that an RPS26 sequence is present in this region of the human X chromosome. Whether this sequence represents a gene or a pseudogene is not easy to determine. However, we may note that, using primers designed from the published sequence, we obtain, on genomic DNA, an amplification product of the same size as the cDNA band and no amplification product on the YAC (data not shown). Our interpretation is that the corresponding sequence present in the YAC contains at least one intron. The RPS26 cDNA probe was used to map the sequence in the Xq13.3 region (Fig. 4). However, using the RPS26 cDNA probe, we detected 26 different bands on genomic DNA, probably corresponding to processed pseudogene sequences (data not shown).

- Clone 959H11S1-D4 was found to be identical to six other clones and to be part of the XNP transcript (Gecz et al., 1994). This gene is very interesting as it is strongly expressed in the central nervous system and in striated muscle (heart and skeletal). Moreover, in situ hybridization demonstrated that the murine homologue is expressed only in differentiating neurons and stops being accumulated in fully differentiated brain territories. The structure of the potential protein indicates that it could be a transcriptional factor (Gecz et al., 1994).

- Clone 880H6S2-A8 (and related clones 880H6S2-H2, 880H6S2-B2, and 880H6S2-G6) and clones 880H6S2-G5 and 880H6S2-E5 all contained a part of a repeated element and were therefore subcloned to give unique fragments, which have been localized on the YACs. Using this probe we demonstrated that these clones are all located in the same genomic region, proximal to DXS56. The sequences of these subclones do not show any open reading frame and, when compared to databases, do not show homology with a previously described sequence. Unfortunately the subclone containing a unique X-linked sequence was very small (about 100 bp). This probe did not show any hybridization signal on Northern blots and was too small to design good primers to be used in RT-PCR experiments. However, these sequences did not look like a contaminant genomic fragment (like E9), as they hybridized to several genomic fragments (data not shown). Thus, we could not determine whether these sequences represent part of a noncoding region of a new gene from the region or an incorrect transcript (an incorrectly spliced transcript of the region, as described by Sedlacek et al., 1993).

FIG. 4. Mapping of the RPS26 cDNA on a YAC panel (EcoRI digestions).

TABLE 2

Description of the Xq13.3 cDNA Clones

Clone        Size (bp)   Clone status         Identical clones   Expression
4551-E11     100         XNP                  4                  CNS + muscle
959H11-D4    410         XNP                  7                  CNS + muscle
4551-E9      518         Genomic fragment     3                  None detected
4551-I14     550         Potential new gene   1                  Ubiquitous
326B10-J15   195         Potential new gene   1                  Muscle
326B10-G9    210         PGK1                 3                  Ubiquitous
326B10-D14   330         ATP7A                1                  Ubiquitous (- liver)
880H6-E5     450         Potential new gene   1                  ND
880H6-A8     400         Potential new gene   3                  ND
880H6-G5     370         Potential new gene   1                  ND
4767-B3      200         RPS26                1                  Ubiquitous

Note. ND, not done.

These data are summarized in Fig. 5, together with the physical data presented above.

DISCUSSION

Using probes (or PCR primers) corresponding to seven STSs, we have built a YAC contig covering the human chromosomal subband Xq13.3. From DXS441 to DXS356, the contig has been constructed using only the STS content of the YACs. As for the distal part of the contig, from DXS356 to PGK1, YAC fingerprint analysis has been necessary to detect potential overlaps (as was already described in a previous paper, Gecz et al., 1993). These overlaps have been confirmed using probes, derived from the YACs, connecting several YACs together. This approach allowed us to map the loci previously assigned to this chromosomal region. Locus order was found to be (cen)-DXS441-DXS171-DXS347-DXS325-DXS356-DXS56-PGK1-(tel) within 2.1 Mb of genomic DNA. These data confirmed the previously published genomic pulsed-field map with regard to probe order (Lafrenière and Willard, 1993); however, we may note that all Xq13.3 loci are present within a shorter genomic interval. This may indicate that the distance between the most proximal locus in Xq13.3 (DXS441) and the most distal locus in Xq13.2 (DXS128) will be more than 1 Mb, making physical linking of the two loci difficult. This has been confirmed by MegaYAC library screening with these two probes. No YACs containing both loci, or exhibiting a potential overlap, have been found (data not shown). These findings are consistent with the genomic map published by Lafrenière and Willard (1993), indicating that the two loci, DXS128 and DXS441, are separated by a distance larger than 1 Mb. We also have indicated the position of the probes recognizing polymorphic loci to integrate genetic data into the physical map. The construction of this contig represents the first step in the isolation of new genes from the region. It represents a very powerful resource, considering the new data on the physical map, as well as the new cloned material produced in this work.

The second part of the discussion regards the presence of new genes in the region. From the present work it appears that the Xq13.3 region could be divided into two subregions, separated by the probe AM1300. In the distal part of the contig, we easily isolated cDNA sequences corresponding to true genes, about one every 100 kb. We selected previously known sequences several times (such as the ATP7A gene and PGK1) and isolated new ones, allowing the identification of more transcripts than previously detected CpG islands (Tümer et al., 1992; Gecz et al., 1993). We may note that this region is very rich in Alu sequences and contains several translocation breakpoints (two associated with Menkes disease and one, characterized in our laboratory, between the Menkes gene and PGK1). It should be noted that this region is unstable in the YAC clone (data not shown).

In contrast, we have encountered difficulties in iso- lating transcribed sequences from the proximal part of the region. We frequently detected L1 and middle repetitive sequences. We may note that no breakpoints associated with genomic rearrangements have been lo- calized to this region so far. The fact that we were not able to detect transcribed sequences in the central part of the region could be due to the fact that we used only a fetal brain cDNA library. A comparison of the Lafreni~re et al. genomic map and our map, con- structed with YACs, clearly indicates that the majority of sites detected in YACs are methylated in the genome. This is particularly true for MluI sites, which are all methylated, except a site close to DXS56. We may note that five probes lie within the same 2.3-Mb fragment in the genome, but are contained in five different frag- ments in the YACs. From the genomic map, only three CpG islands could be evidenced in this part of the Xq13.3 subband. One is about 400 kb proximal to DXS441 and is probably not present in the region cov- ered by the YAC contig. A second is located between DXS171 and DXS347 and contains a SacII and an EagI sites. We have not indicated EagI sites in our map as we have no good probe in the middle of the region to construct a reliable map with this enzyme. However, we found an EagI site close to the SacII site, adjacent to DXS171 in YACs 959Hl l , 851H7, 4767, and 4812.

FIG. 5. Transcriptional map of the Xq13.3 subband. Black bars indicate the maximal interval for location of the cDNAs. [Figure not reproduced. Loci labeled on the map, from centromere to telomere: DXS441, DXS171, DXS347, DXS325, DXS356, AM1300, DXS56, ATP7A, and PGK1. Scale bar, 100 kb.]

Interestingly, this CpG island lies close to the position where we have localized the RPS26 sequence. Whether this CpG island is the one associated with this sequence remains to be determined. The third CpG island is located close to and telomeric of DXS56. We have already described this island (Tümer et al., 1992; Gecz et al., 1993). Thus, the CpG island content of the Xq13.3 subband does not seem to be homogeneous: four islands are located in the 800-kb PGK1-DXS56 interval, whereas only two seem to be detected (Lafrenière and Willard, 1993; this study) in a minimal 1.5-Mb interval proximal to DXS56.
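As a rough illustration of this contrast, the island counts and interval sizes quoted above can be converted into densities. The following is a minimal sketch in Python. The function name `islands_per_mb` and the variable names are illustrative assumptions and are not part of the original analysis. Because 1.5 Mb is a minimal size for the proximal interval, the proximal density computed here is an upper bound.

```python
def islands_per_mb(n_islands: int, interval_kb: float) -> float:
    """Return the CpG island density in islands per megabase."""
    if interval_kb <= 0:
        raise ValueError("interval_kb must be positive")
    # Convert kb to Mb by scaling the count by 1000 before dividing.
    return n_islands * 1000.0 / interval_kb


# Figures quoted in the text above.
distal = islands_per_mb(4, 800)     # PGK1-DXS56 interval, 800 kb
proximal = islands_per_mb(2, 1500)  # interval proximal to DXS56, >= 1.5 Mb

# Ratio of distal to proximal density (a lower bound on the true ratio,
# since the proximal interval may be larger than 1.5 Mb).
fold = distal / proximal

print(f"distal:   {distal:.2f} islands/Mb")
print(f"proximal: {proximal:.2f} islands/Mb (upper bound)")
print(f"ratio:    {fold:.2f}-fold")
```

Under these assumptions the distal interval carries about 5 islands per Mb against at most about 1.3 per Mb proximally, a difference of nearly fourfold.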

These data led us to ask whether there is a potential "microheterogeneity" of the genomic structure. It has long been believed that the chromosome can be divided into several regions and subregions (bands and subbands) that differ in their content of genes and repetitive elements. This has led to the concept of R and G bands. The present results indicate that this concept could be extended to smaller regions. An interesting model of the structure of the human genome would be that it is formed of more or less homogeneous alternating blocks, about 1-2 Mb in length, rich and poor, respectively, in Alu sequences, in genes, and in breakpoints associated with genomic rearrangements. This organization could be related to the "isochores" described by Bernardi (1989). This model is purely speculative, and further work is needed. To validate it, this kind of comparison between structure, expression, and plasticity should be extended over large genomic regions.

Among the new genes localized in this region, XNP, a gene whose transcript is strongly accumulated in brain and striated muscle, is a good candidate for involvement in an XLMR syndrome. However, neither JMS nor ATR-X patients exhibit neuromuscular problems. On the other hand, another syndrome, the Allan-Herndon-Dudley syndrome (AHD), maps to an interval compatible with the localization of XNP (Neri et al., 1994) and shows a combination of severe mental retardation and neuromuscular disorders.

Among the other new genes from the region, J15, coding for a muscle-specific transcript, is probably not the best candidate for involvement in an MR syndrome, although we cannot totally exclude it.

In contrast, two ubiquitously expressed genes, I14 and RPS26, could well be involved in these diseases. For I14, we have no indication of the nature of the potential encoded protein. For RPS26, however, a deficit in the expression of a ribosomal protein could affect the development of tissues in which cells require a high level of energy and grow very rapidly. This is the case for the developing brain.

The physical map as well as the related transcriptional map will be extended proximally and distally to isolate more candidates for the above-mentioned diseases.

ACKNOWLEDGMENTS

We thank M. Ross, G. Zehetner, and H. Lehrach for access to the ICRF YAC library and the Reference Library Database; D. Le Paslier for access to the CEPH YACs; the UK Human Genome Mapping Project Resource Centre for the normalized fetal brain cDNA library; P. Fort for the full-length ribosomal S26 protein cDNA; M. Djabali for the human fetal brain cDNA library; and E. Passage for helpful discussion about the technical support of the method. Funding support was provided by ICRF, the EC Human Genome Analysis Programme, the International Human Frontiers Science Program, the Association Française contre les Myopathies (AFM), and the Groupe d'Etude et de Recherche sur le Génome (GREG). The GenBank accession numbers are 326B10-J15, L33811; 4551-I14, L33813; 4551-E9, L33807; 880H6-G5, L33808; 880H6-A8, L33809; 880H6-E5, L33810.

REFERENCES

Albertsen et al. (1990). Construction and characterization of a yeast artificial chromosome library containing seven haploid human genome equivalents. Proc. Natl. Acad. Sci. USA 87: 4256-4260.

Bernardi, G. (1989). The isochore organisation of the human genome. Annu. Rev. Genet. 23: 637-661.

Burke, D. T., Carle, G. F., and Olson, M. V. (1987). Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236: 806-812.

Chelly, J., Tümer, Z., Tonnessen, T., Petterson, A., Ishikawa-Brush, Y., Tommerup, N., Horn, N., and Monaco, A. P. (1993). Isolation of a candidate gene for Menkes disease that encodes a potential heavy metal binding protein. Nature Genet. 3: 14-19.

Cremers, F. P. M., Van de Pol, D. J. R., Diergaarde, P. J., Wieringa, B., Nussbaum, R. L., Schwartz, M., and Ropers, H. H. (1989). Physical fine mapping of the choroideremia locus using Xq21 deletions associated with complex syndromes. Genomics 4: 41-46.

Feinberg, A. P., and Vogelstein, B. (1984). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137: 266-267.

Gecz, J., Villard, L., Lossi, A. M., Millasseau, P., Djabali, M., and Fontes, M. (1993). Physical and transcriptional mapping of DXS56-PGK1 1 Mb region: Identification of three new transcripts. Hum. Mol. Genet. 2: 1389-1396.

Gecz, J., Pollard, H., Consalez, G., Villard, L., Stayton, C., Millasseau, P., Khrestchatisky, M., and Fontes, M. (1994). Cloning and expression of the murine homologue of a putative human X-linked nuclear protein gene closely linked to PGK1 in Xq13.3. Hum. Mol. Genet. 3: 39-44.

Gibbons, R. J., Suthers, G. K., Wilkie, A. O. M., Buckle, V. J., and Higgs, D. R. (1992). X-linked α-thalassemia/mental retardation (ATR-X) syndrome: Localization to Xq12-q21.31 by X inactivation and linkage analysis. Am. J. Hum. Genet. 51: 1136-1149.

Graeber, M. B., Monaco, A. P., Chelly, J., and Müller, U. (1992). Isolation of DNTR polymorphisms from yeast artificial chromosomes encompassing X chromosomal loci PGK1 and DXS56. Hum. Genet. 90: 270-274.

Lafrenière, R. G., Brown, C. J., Powers, V. E., Carrel, L., Davies, K. E., Barker, D. F., and Willard, H. F. (1991). Physical mapping of 60 DNA markers in the p21.1-q21.3 region of the human X chromosome. Genomics 11: 352-363.

Lafrenière, R. G., and Willard, H. F. (1993). Pulsed-field map of Xq13 in the region of the human X inactivation center. Genomics 17: 502-506.

Larin, Z., Monaco, A. P., and Lehrach, H. (1991). Yeast artificial chromosome libraries containing large inserts from mouse and human DNA. Proc. Natl. Acad. Sci. USA 88: 4123-4127.

Lovett, M., Kere, J., and Hinton, M. L. (1991). Direct selection: A method for the isolation of cDNAs encoded by large genomic regions. Proc. Natl. Acad. Sci. USA 88: 9628-9632.

Mercer, J. F. B., Livingston, J., Hall, B., Paynter, J. A., Begy, C., Chandrasekharappa, S., Lockhart, P., Grimes, A., Bhave, M., Siemieniak, D., and Glover, T. W. (1993). Isolation of a partial candidate gene for Menkes disease by positional cloning. Nature Genet. 3: 20-25.

Michelson, A. M., Markham, A. F., and Orkin, S. H. (1983). Isolation and DNA sequence of a full-length cDNA clone for human X chromosome-encoded phosphoglycerate kinase. Proc. Natl. Acad. Sci. USA 80: 472-476.

Monaco, A. P., Walker, A. P., Millwood, I., Larin, Z., and Lehrach, H. (1992). A yeast artificial chromosome contig containing the complete Duchenne muscular dystrophy gene. Genomics 12: 465-473.

Neri, G., Chiurazzi, P., Arena, J. F., and Lubs, H. A. (1994). XLMR genes: Update 1994. Am. J. Med. Genet. 51: 542-549.

Parimoo, S., Pantanjali, S. R., Shukla, H., Chaplin, D. D., and Weissman, S. M. (1991). cDNA selection: Efficient approach for the selection of cDNAs encoded by large chromosomal DNA fragments. Proc. Natl. Acad. Sci. USA 88: 9623-9627.

Ram, K. T., Barker, D. F., and Puck, J. M. (1993). Dinucleotide repeat polymorphism at the DXS441 locus. Nucleic Acids Res. 20: 1428.

Saugier-Veber, P., Abadie, V., Moncla, A., Mathieu, M., Piussan, C., Turleau, C., Mattei, J. F., Munnich, A., and Lyonnet, S. (1993). The Juberg-Marsidi syndrome maps to the proximal long arm of the X chromosome (Xq12-q21). Am. J. Hum. Genet. 52: 1040-1045.

Schwartz, C. E. (1993). X-linked mental retardation: In pursuit of a gene map. Am. J. Hum. Genet. 52: 1025-1031.

Sedlacek, Z., Korn, B., Konecki, D. S., Siebenhaar, R., Coy, J. F., Kioschis, P., and Poustka, A. (1993). Construction of a transcription map of a 300 kb region around the human G6PD locus by direct cDNA selection. Hum. Mol. Genet. 2: 1865-1869.

Tümer, Z., Chelly, J., Tommerup, N., Ishikawa-Brush, Y., Tonnesen, T., Monaco, A. P., and Horn, N. (1992). Characterization of a 1.0 Mb YAC contig spanning two chromosome breakpoints related to Menkes disease. Hum. Mol. Genet. 1: 483-489.

Vulpe, C., Levinson, B., Whitney, S., Packman, S., and Gitschier, J. (1993). Isolation of a candidate gene for Menkes disease and evidence that it encodes a copper-transporting ATPase. Nature Genet. 3: 7-13.

Zimmermann, J., Voss, H., Schwager, C., Stegeman, J., Erfle, H., Stucky, K., Kristensen, T., and Ansorge, W. (1990). A simplified protocol for fast plasmid DNA sequencing. Nucleic Acids Res. 18: 1067.