comparative analysis of the pig bac sequence involved in the regulation of myostatin gene

Science in China Ser. C Life Sciences 2005 Vol.48 No.2 168-180

Copyright by Science in China Press 2005

Comparative analysis of the pig BAC sequence involved in the regulation of myostatin gene*

YU Zhengquan1, LI Yan2, MENG Qingyong1, YUAN Jing1, ZHAO Zhihui1, LI Wei2, HU Xiaoxiang1, YAN Bingxue1, FAN Baoliang1, YU Shuyang1 & LI Ning1

1. State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing 100094, China; 2. Beijing Genomics Institute / Genomics and Bioinformatics Center, Institute of Genetics and Development Biology, Chinese Academy of Sciences, Beijing 100101, China Correspondence should be addressed to Li Ning (email: [email protected])

Received October 21, 2003; revised June 1, 2004

Abstract Myostatin (GDF8, MSTN) is a member of the transforming growth factor beta superfamily that is essential for proper regulation of skeletal muscle mass. To study its expression and regulatory mechanism in depth, we present a comparative analysis, among pig, human and mouse, of an approximately 170-kb pig BAC sequence containing the myostatin gene. The genomic region is characterized by a high content of interspersed repeats and a low G+C content. For the myostatin gene, sequence similarity is higher between human and pig than between either of these species and mouse. One striking feature is that two TATA-boxes are identified in the promoter, immediately downstream of the CCAAT-box. Further analysis indicates that TATA-box 1 is responsible for transcription in pig and human, whereas TATA-box 2 acts in mouse. Another interesting feature is that two polyadenylation signal sequences (AATAAA) exist in the 3′UTR of the pig myostatin gene. Moreover, a large number of potential transcription factor-binding sites are identified in evolutionary conserved regions (ECRs), which may be associated with the regulation of myostatin. Many of the putative transcription factors play an important role in muscle development, and the complex interaction between myostatin and these factors may be required for proper muscle development.

Keywords: myostatin, comparative analysis, transcriptional factors, muscle development.

DOI: 10.1360/03yc0217

Myostatin (GDF-8, MSTN) is a member of the transforming growth factor-β (TGF-β) superfamily and was first described by McPherron et al. in 1997[1]. Myostatin appears to act as a negative regulator of muscle development and controls not only fibre size but also fibre number[2,3]. Mutations in the third exon of the myostatin gene have been shown to cause double muscling in cattle[4]. By knocking out the myostatin gene in mice, McPherron et al. showed that the knockout mice developed two to three times more muscle than mice carrying the intact gene. Lee commented that the myostatin gene knockout mice “look like Schwarzenegger mice.” Myostatin-null mice show a dramatic and widespread increase in skeletal muscle mass due to an increase in the number of muscle fibres (hyperplasia) and in the thickness of fibres (hypertrophy)[1]. Some data indicate that myostatin is relevant in clinical settings such as cachexia, where increased muscle mass is desired[5]. Myostatin also plays an important role in fully developed skeletal muscle.

* The sequence data described in this paper have been submitted to GenBank under accession No. AY208121.

The expression of the pig myostatin gene varies between stages of embryonic development and after birth, and piglets with low birth weight have a markedly higher level of expression than those with high birth weight[6]. In pigs, the myostatin gene has been mapped to 15q2.3 by fluorescence in situ hybridization[7]. The partial genomic structure of the pig myostatin gene has been determined; the gene is composed of three exons and two introns, and a partial coding sequence has been presented[8]. Three mutations in non-conserved regions of the pig myostatin gene may be associated with lean meat mass[9].

Comparative sequence analysis is an important tool for studying complex genomic organization and putative regulatory regions. Here we present a comparative sequence analysis, among pig, human and mouse, of an approximately 170-kb pig BAC containing the myostatin gene. The data show that the sequence is characterized by a high repeat content and a low G+C content. A large number of evolutionary conserved regions (ECRs) are identified, which most likely reflect the complex regulation of the gene. The complete pig genome will be sequenced in the near future. Large-scale sequence comparison between pig and other species is of great use in interpreting genome evolution and finding potential functional regions.

1 Materials and methods

1.1 pUC18 shotgun library construction

One BAC clone containing the myostatin gene was identified from a pig genome library by hybridization. The BAC DNA was isolated by alkaline lysis (Qiagen Plasmid Purification kit) and sonicated using a Heat Systems Ultrasonics sonicator (model JY92-II) with a chilled cup horn probe. The sheared DNA was precipitated with 1/10 volume of 3 mol/L Na-acetate and 2 volumes of prechilled ethanol, washed twice with 70% ethanol and resuspended in 80 μL water. The 80 μL of DNA was end-filled with T4 DNA polymerase. The DNA and size markers were run on a 0.8% SeaPlaque agarose gel in 1×TAE for 4 h at 80 volts. The 1.5 kb to 3.0 kb fragments in the BAC lane were excised, gel-purified according to the Qiagen protocol (Qiagen) and diluted in 15 μL water. The inserts were ligated into the prepared pUC18 vector. The pUC18 shotgun library was constructed by transformation of the ligation product.

1.2 pUC18 sequence template purification

Individual pUC18 clones were picked with a sterile toothpick, transferred into the wells of a 96-deep-well plate containing YT culture medium and incubated for 14 h at 37℃ with vigorous shaking (220 rpm). The bacterial cells were harvested by centrifugation at 2000 g for 15 min and resuspended in 200 μL solution I (50 mmol Tris, pH 7.8; 10 mmol EDTA; 10 mg/mL RNase A) per well. Then 200 μL solution II (0.2 mol NaOH, 1% SDS) was added and mixed gently and thoroughly, followed by 200 μL solution III (3.0 mol potassium acetate, pH 5.5), which was mixed immediately and thoroughly, and the plate was incubated on ice for 15 min. The lysates were filtered through a 96-well multiscreen plate (Whatman) into a new 96-deep-well plate by centrifugation at 2000 g for 5 min. The DNA was precipitated by adding 150 μL (0.7 volume) isopropanol and centrifuged at 12000 g for 20 min, and the DNA pellets were washed twice with 70% ethanol. The air-dried DNA was redissolved in 40 μL water.

1.3 Sequencing of pUC18 template

The sequencing reaction was run in a GeneAmp PCR System 9700 with universal M13 reverse and forward primers, using the BigDye Terminator sequencing kit (Perkin Elmer Applied Biosystems). The reaction conditions were 95℃ for 2 min, 95℃ for 15 s, 50℃ for 20 s, 60℃ for 4 min and 4℃ for 1 h, with 30 cycles. Samples were run on a 4% longer acrylamide gel using an ABI PRISM 377 DNA sequencer (Perkin Elmer Applied Biosystems). Electrophoresis run time was 7.5 h, and read length was truncated at 650 bases.

1.4 DNA sequence analysis

Sequences were assembled with the Phred/Phrap/Consed software package (http://www.genome.washington.edu/phrap.docs/phrap.html). Sequence alignment between the pig BAC and its human counterpart was analyzed by Genomic BLAST at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/). The repeats were analyzed by RepeatMasker (version May 2002, sensitive settings, RepBase release 5.3). The tandem repeats were identified by Tandem Repeats Finder. CpG islands were identified by Cpgplot at EMBL-EBI. A CpG island is defined as a DNA stretch at least 200 bp long with G+C content >50%. Gene organization analysis of the myostatin gene was done with the DNAMAN software. The transcription factor binding sites were identified by TFSEARCH (http://www.cbrc.jp/research/db/TFSEARCH.html).
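The CpG island criterion stated above (a stretch of at least 200 bp with G+C content >50%) can be illustrated with a minimal Python sketch. This is not the Cpgplot implementation, which applies further criteria such as the observed/expected CpG ratio; the function names and the toy sequence are hypothetical and serve only to show the sliding-window idea.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_stretches(seq, window=200, min_gc=0.5):
    """Return merged half-open (start, end) intervals covered by windows
    of length `window` whose G+C fraction is strictly above `min_gc`.

    Only the G+C criterion quoted in the text is applied here.
    """
    seq = seq.upper()
    is_gc = [1 if base in "GC" else 0 for base in seq]
    n = len(seq)
    if n < window:
        return []
    stretches = []
    count = sum(is_gc[:window])  # G+C count of the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one base: add the new base, drop the old one.
            count += is_gc[start + window - 1] - is_gc[start - 1]
        if count / window > min_gc:
            end = start + window
            if stretches and start <= stretches[-1][1]:
                stretches[-1] = (stretches[-1][0], end)  # extend previous stretch
            else:
                stretches.append((start, end))
    return stretches


# Hypothetical toy sequence: an AT-only flank, a GC-only core, an AT-only flank.
toy = "AT" * 150 + "GC" * 150 + "AT" * 150
print(gc_fraction(toy))
print(gc_rich_stretches(toy))
```

Because whole windows are merged, the reported intervals extend somewhat into the flanks of a G+C-rich core; a production tool would trim and score the candidate islands further.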

2 Results

2.1 Myostatin region genomic sequence

One BAC clone containing the myostatin gene was isolated from a pig genome library by probe hybridization. We used the shotgun sequencing approach to obtain the BAC sequence[10,11]. The BAC was subcloned into pUC18 shotgun libraries. The first phase of sequencing generated five contigs (10 to 60 kb). Gaps were filled by PCR and primer walking. We obtained a finished sequence of about 170 kb from the region of pig chromosome 15q2.3.

A genomic BLAST of the BAC (170 kb) against the human genome (fig. 1) gave 22 significant hits (about 20 kb in total) matching a 214-kb stretch of the human 2q32.2 region. The hits are interspersed through 145 kb in the pig BAC and 172 kb in the human counterpart. The order and orientation of every hit in the BAC are consistent with those in human. The human sequence contains the gene GDF8 (myostatin), which was confirmed by gene prediction and by annotation in the public databases RefSeq, UniGene and GenBank. Likewise, a GDF8 gene was found in the corresponding consensus sequence of the pig BAC. The remaining hits mostly match sequences conserved between human and mouse (fig. 1, Track 12).

The sequence of the pig myostatin region and the corresponding human sequence were screened for repeats by RepeatMasker (table 1, fig. 1). In this region, the total percentage of repeats is higher in human (57.44%) than in pig (48.62%). The repeat content of both species is higher than the whole-genome average, since approximately 46% of the human genome can be recognized as interspersed repeats. All mammals have essentially the same four classes of repetitive elements: (i) the LINE-dependent, short RNA-derived short interspersed nuclear elements (SINEs); (ii) the autonomous long interspersed nuclear element (LINE)-like elements; (iii) retrovirus-like elements with long terminal repeats (LTRs); and (iv) DNA transposons. The frequency of SINEs is slightly higher in pig (12.78%) than in the human region (10.50%), but the number and frequency of mammalian interspersed repeat (MIR) relics, a class of SINE found in all mammals, are very similar between pig and human (1.45% and 1.46% respectively). Alu insertion elements, the most abundant class of SINEs in human, are dimeric sequences; the percentage of ALUs in human is 9.04%. In the pig region, the percentage of ruminant-specific elements is 0.07% and that of GLUs is 11.26%. LINEs are markedly more frequent than SINEs in this region for both species. The percentage of LINE elements is similar in the pig and human regions (31.57% and 32.20% respectively). LINEs include the LINE1, LINE2 and L3/CR1 families, and the percentages of the three families in pig are highly similar to those in human. Perhaps most interestingly, LTR elements are about 4 times as frequent in human (8.67%) as in pig (2.21%), and ERV-class I and ERV-class II elements are absent from the pig region. The percentage of DNA transposons in human (6.08%) is about three times that in pig (2.03%).
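The fold differences quoted in this comparison follow directly from the class percentages in table 1. The short Python sketch below recomputes them; the dictionary layout and function name are illustrative assumptions, and only the four major repeat classes are included.

```python
# Percentage of the sequenced region occupied by each repeat class,
# copied from table 1 as (pig, human).
repeat_percent = {
    "SINEs": (12.78, 10.50),
    "LINEs": (31.57, 32.20),
    "LTR elements": (2.21, 8.67),
    "DNA transposons": (2.03, 6.08),
}


def human_to_pig_fold(percentages):
    """Return the human/pig ratio of the percentage for each repeat class."""
    return {name: human / pig for name, (pig, human) in percentages.items()}


fold = human_to_pig_fold(repeat_percent)
for name, value in fold.items():
    print(f"{name}: human/pig = {value:.2f}")
```

A ratio below 1 (as for SINEs) means the class is more frequent in the pig region, and a ratio above 1 means it is more frequent in the human region.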

Tandem repeats in the myostatin region of pig and human were analyzed by Tandem Repeats Finder. We found 34 tandem repeats interspersed in the pig BAC, among which simple repeats (A or T) were at 5 sites, microsatellites at 16 sites and minisatellites at 12 sites. In the corresponding human sequence, we identified 43 tandem repeats, among which simple repeats were at 6 sites, microsatellites at 17 sites and minisatellites at 20 sites. The types and distribution of the microsatellites and minisatellites differ markedly between the porcine and human regions, and there is no significant relationship between the tandem repeats of the two species. Two microsatellites lie about 10 kb upstream and 10 kb downstream of the porcine myostatin gene respectively, which is useful for locating this gene.

Fig. 1. Illustration of the comparison between the pig BAC sequence and its human counterpart. Track 1 shows the ruler based on the nucleotide position in the human chromosome. Track 2 (hits in human) and Track 3 (hits in BAC) reflect matching on the basis of the BLAST results between the BAC and the human genome. Track 4 (fgenesh) displays the gene prediction in the BAC sequence. Track 5 (RepeatMasker) displays the repeats in the BAC sequence. Tracks >=6 display the annotation of the human counterpart from public databases. Track 6 displays the sites of the human genetics map. Tracks 7 to 11 display the annotation of GDF8. Track 12 displays the sequence constancy between human and mouse. Track 13 displays the repeats in the human sequence.

Table 1 Analysis of interspersed repeats in the myostatin region of pig and human

Columns for each species: number of elements, length occupied/bp, percentage of sequence.

                            Pig                                 Human
Repeats                     Number   Length/bp   Percentage     Number   Length/bp   Percentage
SINEs                       108      21401       12.78%         95       22849       10.50%
  ALUs                      -        -           -              71       19675       9.04%
  Ruminant-spec             1        109         0.07%          -        -           -
  MIRs                      21       2429        1.45%          24       3174        1.46%
  GLUs                      86       18863       11.26%         -        -           -
LINEs                       80       52850       31.57%         79       70087       32.20%
  LINE1                     59       46743       27.92%         55       63045       28.96%
  LINE2                     20       6052        3.62%          23       6985        3.21%
  L3/CR1                    1        55          0.03%          1        57          0.03%
LTR elements                15       3705        2.21%          38       18866       8.67%
  MaLRs                     13       2588        1.55%          17       8744        4.02%
  ERVL                      2        1117        0.67%          7        3213        1.48%
  ERV_classI                0        0           0.00%          13       5841        2.68%
  ERV_classII               0        0           0.00%          1        1068        0.49%
DNA transposons             17       3396        2.03%          31       13223       6.08%
  MER1_type                 15       2668        1.59%          15       4063        1.87%
  MER2_type                 1        631         0.38%          13       8856        4.07%
Total interspersed repeats  -        81352       48.60%         -        125025      57.44%

The pig and human sequenced regions have similar G+C contents of 36.92% and 37.52% respectively, both lower than the whole-genome averages (41% for human and 42% for mouse). Three CpG islands were identified by Cpgplot in the pig BAC, whereas four were identified in the corresponding human region. Although the number of CpG islands differs, their distribution is conserved between pig and human (fig. 2).

2.2 Organization of the pig myostatin gene

The exon/intron organization of the pig myostatin gene was deduced by aligning the pig BAC sequence with the mRNA sequences from human (NM_005259) and mouse (NM_010834) and with the partial pig myostatin coding sequences (AJ237920, AJ133580, AJ237662), because no complete pig mRNA sequence is available in GenBank. The transcription start site of the first exon was deduced from the TATA box of the promoter and the human myostatin exon 1 sequence (fig. 3), and the 3′ end of the third exon was deduced from the polyadenylation signal sequence and the alignment of the 3′UTR among pig, human and mouse (fig. 4). The corresponding mouse and human myostatin gene structures were identified by aligning their genome sequences with their mRNA sequences (NM_005259, NM_010834). The exon and intron sizes and the sequence identities among the pig, human and mouse genes are presented in table 2. The identities of exons and introns between pig and human are clearly higher than those between pig and mouse or between human and mouse. The size of the first exon differs among pig, human and mouse. Exon 1 of pig is 5 bp shorter than that of human because it lacks a 5-bp GAAAA repeat in the 5′


Fig. 2. CpG islands determined by Cpgplot at EMBL-EBI. (a) Three putative CpG islands distributed in the pig BAC; (b) four putative CpG islands distributed in the corresponding human region.

pig ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAAGAGCCAATCATAGATCCT

human ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAAGAGCCAATCATAGATCCT

mouse ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAGGAGCCAATCATAGATCCT

***************************************** ******************

pig GACGACACTTGTCTCATCAAG--TGGAATATAAAAAGCCACTTGGAATACAGTATAAAAG

human GACGACACTTGTCTCATCTAAGTTGGAATATAAAAAGCCACTTGGAATACAGTATAAAAG

mouse GACGACACTTGTCTCCTCTAAGTTGGAATATAAAAAGCCACTTGGAATACAGTATACAGG

*************** ** * ********************************* * *

pig ATTCACTGGTGTGGCAAGTTGTCTCTCAGACAGTGCAGGCATTAAAATTTTGCTTGGCGT

human ATTCACTGGTGTGGCAAGTTGTCTCTCAGACTGTACATGCATTAAAATTTTGCTTGGCAT

mouse ACTCCCTGGCGTGGCAGGTTGTCTCTCGGACGGTACATGCACTAATATTTCACTTGGCAT

* ** **** ****** ********** *** ** ** *** *** **** ****** *

pig TACTCAAAAGCAAAAG-----TAAAAGGAAGAAATAAGAACAAGGAGAAAGATTGTATTG

human TACTCAAAAGCAAAAGAAAAGTAAAAGGAAGAAACAAGAACAAGAAAAAAGATTATATTG

mouse TACTCAAAAGCAAAAA-----GAAGAAATAAGAACAAGGG-AAAAAAAAAGATTGTGCTG

*************** ** * * ** *** ** * ******* * **

pig ATTTT-AAAATCATGCAAAAACTGCAAATCTATGTTTATATTTACCTGTTTATGCTGATT

human ATTTT-AAAATCATGCAAAAACTGCAACTCTGTGTTTATATTTACCTGTTTATGCTGATT

mouse ATTTTTAAAATGATGCAAAAACTGCAAATGTATGTTTATATTTACCTGTTCATGCTGATT

***** ***** *************** * * ****************** *********

Fig. 3. Sequence alignment of the promoter region and partial exon 1 in pig, human and mouse. The partial exon 1 sequences are underlined. Identities among the three species are indicated by asterisks. The CCAAT boxes, the TATA boxes and the translation start sites are indicated by gray boxes.


↓(5968) (6026)↓

pig TTTTGTAAATAAATGTCTCCTTTTTTATTTACTTTGGTATATTTTTATG-TAAGGATATT

human TTTTGTAAATAAGTGTCTCCTTTTTTATTTACTTTGGTATATTTTTACACTAAGGACATT

mouse TTTTGTAAATAAGTGTCTCCTTTTATATTTACTTTGGTATATTTTTACACTAATGAAATT

************ *********** ********************** *** ** ***

↓(6322)

pig TCTGAAATCAGAATAATAAACTGATGATATCTTAAGAAT---TGTTAATTTAATTTTATA

human TCTGAAAT--GAAGAATAAACTGATGCTATCTCAACAATAACTGTTACTTTTATTTTATA

mouse TCTAAAGA------AATACAAATATGGTATCTCAATAACAGCTACT-TTTTTATTTTATA

*** ** **** * *** ***** ** ** * * *** ********

(6438)↓

pig ATTCGATAATGAATATATTTCTCCATATATTTACTTCTATTTTGTAAATTAGGATTTTGT

human ATTTGATAATGAATATATTTCTGCATTTATTTACTTCTGTTTTGTAAATTGGGATTTTGT

mouse ATTTGACAATGAATACATTTCT---TTTATTTACTTCAGTTTTATAAATTGGAACTTTGT

*** ** ******** ****** * ********** **** ****** * * *****

↓(6498) (6542)↓

pig AACAATATAAATTATATTAAAGTGTTTTCAC-CTTTTTTGAAAGACACAACAGTTTTATG

human AACAGTATAAGTTATATTAAAGTGTTTTCACATTTTTTTGAAAGACACAACAGTTTT-TA

mouse AAC--TATAA-----ATTAAAGTGTTTTCAC--ATTTTTGAAAGGCAT--CAGTTTTATG

*** ***** **************** ********** ** ******* *

pig TTATAATGATTAATTCTGAATTTTTGG-TTTTCATTTTATTATAACAGTTTAATGATTTA

human TTCTAATGATTAATTCTGGATTTCTGA-TTTTCACTTTATTATAAAAGTCTAATTGTTTA

mouse TCATAATGATTAATTGTGGGTTTTTAAATTTTTATTTTATTATAAG-----------TTT

* ************ ** *** * **** * ********** **

Fig. 4. Sequence alignment of the 3′UTR and partial 3′ flanking region in pig, human and mouse. The flanking regions are underlined. Positions are numbered from the transcription start site of the pig myostatin gene. Identities among the three species are indicated by asterisks. The TTTT repeats are indicated in gray. The polyadenylation signal sequences are indicated by boxes.

Table 2 The exon/intron sizes and identities of the myostatin gene in pig, human and mouse

               Length/bp                 Pairwise align scores (%)
MSTN gene      pig      human    mouse   P/H     P/M     H/M
Exon 1         501      506      479     93      85      86
Exon 2         374      374      374     97      95      94
Exon 3         1871     1939     1829    83      69      67
Intron 1       1809     1788     1741    77      56      57
Intron 2       1978     2423     1994    55      45      47

Note: Pairwise alignment scores were determined by DNAMAN.

UTR of exon 1 (fig. 3). The mouse exon 1 is the shortest because it lacks 5 bp at the same site and 22 bp at the beginning of the 5′UTR. The first exon contains the start codon region (AAAATCATGC), which is consistent with the Kozak sequence[12,13]. Perhaps most interestingly, the start codon region (AAAATGATGC) of the mouse exon 1 has two contiguous ATGs (fig. 3) because of a point mutation (a G-C transversion). The second exon is evolutionarily well conserved, with identities of 94% to 97% among the three species. The identities of the third exon are low, ranging from 67% to 83%.
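The pairwise scores in table 2 were produced by DNAMAN, whose exact scoring scheme is not described here. As a rough illustration of what a percent identity between two aligned sequences means, the following Python sketch counts matching columns in the first 60-column block of the fig. 3 alignment. The function name and the treatment of gaps (a column with a gap on one side counts as a mismatch) are assumptions of this sketch, not DNAMAN's method.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity between two equal-length aligned sequences.

    Columns in which both sequences carry a gap are ignored; a gap
    opposite a base counts as a mismatch (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# First alignment block of fig. 3 (promoter region), copied verbatim.
pig = "ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAAGAGCCAATCATAGATCCT"
human = "ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAAGAGCCAATCATAGATCCT"
mouse = "ACAGGGTTTTAACCTCTGACAGCGAGATTCATTGTGGAGCAGGAGCCAATCATAGATCCT"

print(f"pig/human: {percent_identity(pig, human):.2f}%")
print(f"pig/mouse: {percent_identity(pig, mouse):.2f}%")
```

Scores over a 60-bp block are naturally much higher than the whole-exon values in table 2, which average over far longer and more variable stretches.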

The level of identity of the intron sequences shows considerable variation (table 2), but the introns are substantially more conserved than the flanking sequence. In general, intron 1 is more conserved than intron 2. The splice junctions between introns and exons are highly conserved and match almost perfectly the overall boundary consensus sequences exon/GTPuAG…intron and intron…PyPyPyNCAG/N…exon (table 3)[14].
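The core of the boundary consensus is the GT...AG rule: an intron begins with GT and ends with AG. The Python sketch below checks this for the two pig introns using the hexamers listed in table 3. It assumes that the uppercase side of each junction in table 3 is the intron side; that reading, the dictionary layout and the function name are assumptions of this sketch.

```python
# Intron boundary hexamers of the pig myostatin gene, copied from table 3.
# Assumption: uppercase letters in table 3 mark the intron side of a junction.
pig_introns = {
    "intron 1": {"start": "GTAAGT", "end": "TTTTAG"},
    "intron 2": {"start": "GTAAGA", "end": "TCACAG"},
}


def follows_gt_ag_rule(intron_start, intron_end):
    """True if the intron begins with GT and ends with AG."""
    return intron_start.upper().startswith("GT") and intron_end.upper().endswith("AG")


results = {
    name: follows_gt_ag_rule(bounds["start"], bounds["end"])
    for name, bounds in pig_introns.items()
}
for name, ok in results.items():
    print(f"{name}: GT...AG rule {'satisfied' if ok else 'violated'}")
```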

The partial 5′ flanking region and the partial exon 1 were analyzed by sequence alignment among pig (AY208121), human (NT_022197) and mouse (NW_000166) (fig. 3). The promoter region shows high sequence identity among the three species. Alignment analysis of the promoter reveals that the region contains a CCAAT-box between nucleotides −71 and −67 from the putative pig transcription start site. Most interestingly, two TATA-boxes exist immediately downstream of the CCAAT-box, between nucleotides −30 and −27 (TATA-box 1) and between nucleotides −6 and −3 (TATA-box 2) respectively (fig. 3). In human and pig, the sequences of box 1 and box 2 are identical (TATAAAA), but in mouse box 2 (TATACAG) differs from box 1 (TATAAAA). Moreover, further analysis indicates that TATA-box 1 is responsible for transcription in pig and human, whereas TATA-box 2 acts in mouse, as deduced from their mRNA sequences and the conservation of the distance between the transcription start site and the TATA-box. This result may account for the 22 bp missing from the 5′ end of the mouse 5′UTR.
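The two TATA-boxes can be located with a simple exact-match motif search. The Python sketch below scans the second alignment block of fig. 3 for the box sequences given in the text. The function name is hypothetical, alignment gaps are removed before searching, and the reported positions are 0-based offsets within the ungapped fragment, not coordinates relative to the transcription start site.

```python
def find_motif(sequence, motif):
    """Return 0-based start positions of every occurrence of `motif`
    (overlapping occurrences included) in `sequence`, after removing
    alignment gap characters."""
    seq = sequence.upper().replace("-", "")
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Second alignment block of fig. 3, copied verbatim (gaps included).
pig_promoter = "GACGACACTTGTCTCATCAAG--TGGAATATAAAAAGCCACTTGGAATACAGTATAAAAG"
mouse_promoter = "GACGACACTTGTCTCCTCTAAGTTGGAATATAAAAAGCCACTTGGAATACAGTATACAGG"

pig_boxes = find_motif(pig_promoter, "TATAAAA")
print("pig TATAAAA:", pig_boxes)
print("mouse TATAAAA:", find_motif(mouse_promoter, "TATAAAA"))
print("mouse TATACAG:", find_motif(mouse_promoter, "TATACAG"))
```

In both species the two boxes lie 24 bp apart within the fragment, which matches the spacing between the −30 and −6 positions given in the text.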

The partial 3′UTR and the 3′ flanking region were also analyzed by sequence alignment among pig (AY208121), human (NT_022197) and mouse (NW_000166) (fig. 4). As with the 5′UTR, the human myostatin 3′UTR is the longest and that of mouse is the shortest. Interestingly, two clear polyadenylation signal sequences (AATAAA) exist in the pig 3′UTR, 355 bp apart. However, only one polyadenylation signal was found in human, because the first is altered by a G-A transition, and none exists in mouse because of mutation. The terminal region of the 3′UTR is highly conserved and is characterized by several consecutive TTTT sequences. The conservation of the 3′UTR extends into the 3′ flanking region, over 200 bp past the transcription stop site, where the TTTT repeats also exist.

2.3 Analysis of the putative transcription factor-binding sites

Besides the myostatin gene, many evolutionary conserved regions (ECRs) were identified by comparative analysis between the pig and human sequences, and most of them match regions conserved between mouse and human (fig. 1). Because the sequence of the mouse myostatin region still has many gaps, ECRs among pig, mouse and human could not be identified completely. A large number of putative transcription factor binding sites were found in these ECRs by TFSEARCH (table 4). Because the myostatin gene is essential for proper regulation of skeletal muscle mass, we focused mainly on the regulatory elements that may be involved in the regulation of muscle development. First, the development of skeletal muscle is a highly regulated process governed by the four myogenic regulatory factors (MRFs) MyoD, myf-5, myogenin and MRF4[15]. MyoD can remodel chromatin at binding sites in muscle gene enhancers and activate transcription at previously silent loci, but TGF-beta, basic-FGF and sodium butyrate block MyoD-mediated chromatin reorganization and the initiation of transcription. Further results show that myostatin inhibits MyoD activity and expression via Smad 3, resulting in the failure of myoblasts to differentiate into myotubes[16]. At the same time, some results indicate that the myostatin gene is a downstream target gene of MyoD[17]. Second, Sox-5 is one of a family of genes that show homology to the HMG box region of the testis-determining gene SRY[18]. In mouse, the long form of Sox5 (L-Sox5) is co-expressed and interacts with Sox6, and these two proteins appear to play a key role in chondrogenesis and myogenesis. Like SOX6, L-SOX5 shows strong


Table 3 Genomic organization of the myostatin gene in pig, human and mouse

        Pig                                   Human                                 Mouse
Exon    5′splice donor   3′splice acceptor    5′splice donor   3′splice acceptor    5′splice donor   3′splice acceptor
1       TATAAA/agattc    cagagt/GTAAGT        TATAAA/agattc    cagagt/GTAAGT        CAGGTT/gtctct    cagagt/GTAAGT
2       TTTTAG/ctgatc    gggctg/GTAAGA        TTATAG/ctgatt    gggctg/GTAAGT        TTGTAG/ctgact    gggctg/GTAAGT
3       TCACAG/aatccc    aaagac/ACAACA        AAACAG/gaatcc    aaagac/ACAACA        ACACAG/aatccc    aaaggc/ATCAGT

Table 4 The putative transcription factor-binding sites identified in evolutionary conserved regions (ECRs) by TFSEARCH

Position: the putative transcription factors

Upstream of myostatin

22017-23286: AML-1a AP-1 Brn-2 CdxA CDP-CR C/EBP c-Ets1 CHOP-c GATA HNF HSF Ik-2 Nkx-2.5 Oct-1 RORalp S8 SOX-5 SRY TATA Tst-1 XFD-3 YY1

51719-52246: AML-1a C/EBP CDP-CR CdxA c-Ets1 CREB DeltaE Evi-1 GATA HFH-1 HNF HSF MZF1 Nkx-2.5 Oct-1 S8 SOX-5 SRY Tst-1 v-ErbA YY1

53857-54749: AML-1a AP-1 C/EBP c-Ets1 CHOP-c Elk-1 GATA HFH HNF-3b MEF NF-E2 Nkx-2.5 RSRFC4 SRY STATx VBP

71241-71369: CdxA GATA HSF1 MZF1 IK-2 Oct-1 SRY

79679-81009: AML-1a Brn-2 C/EBP CdxA c-Ets1 CRE-BP E4BP4 GATA HFH-2 HLF HNF Ik-2 Lyf-1 MZF1 Nkx-2.5 Oct-1 OCT-x Pbx-1 RORalp S8 SRY STATx TATA USF v-Myb

81104-82697: AP-1 C/EBPb CREB deltaE E4BP4 Evi-1 GATA HFH-2 HNF-3b MyoD NF-E2 Nkx-2.5 N-Myc Oct-1 Pbx-1 SOX-5 SREBP SRY TATA Tst-1 USF v-Myb XFD-1

Myostatin

81865-88406: Exon 1,2,3 and intron 1,2

Downstream of myostatin

88406-89007: AP-1 AML-1a C/EBPb CdxA DeltaE Evi-1 GATA HFH-2 HNF-3b IK-2 Nkx-2.5 Oct-1 Pbx-1 RORalp S8 SRY Tat-1 UBP XFD-1

92083-92552: AP-1 C/EBPb CdxA c-Ets1 c-Myc CRE-BP DeltaE GATA HFH-2 HNF-3b IK-2 Nkx-2.5 Oct-1 OCT-X Pbx-1 RORalp SRY USF XFD-1 YY1

93611-94732: AML-1a AP-1 C/EBP CDP-CR CdxA CP2 deltaE GATA HFH-2 HNF-3b HSF1 HSF2 IK-1 IK-2 Lyl-1 MZF1 Nkx-2.5 Oct-1 Pbx-1 S8 SOX-5 SRY STATx TATA v-Myb XFD-1

102083-102407: AML-1a AP-1 CdxA c-Ets1 c-Myc CP2 DeltaE GATA Nkx-2.5 Oct-1 STATA Tst-1 USF

104212-104939: AP-1 C/EBP CDP-CR CdxA c-Ets1 E2F GATA HFH-1 HNF-3b HSF2 MZF1 Nkx-2.5 Oct-1 Pbx-1 SOX-5 SRY XFD-1

105196-105652: C/EBP CdxA GATA Ik-2 Lyf-1 Nkx-2.5 Oct-1 Pbx-1 RORalp SOX-5 SRY TATA XFD-3

111124-111582: AML-1a AP-1 Arnt CdxA deltaE E47 GATA Ik-2 Lyf-1 MyoD Nkx-2.5 N-Myc Pbx-1 S8 UBP USF

116599-118003: AML-1a C/EBP CDP-CR CdxA c-Ets1 E4BP4 GATA HFH-2 HNF-3b Ik-2 IRF-2 Lyf-1 MZF1 Nkx-2.5 Oct-1 S8 Pbx-1 RORalp SOX-5 SRY STATx Tst-1 UBP v-Myb

142924-143401: AML-1a C/EBPa CDP-CR CdxA deltaE Evi-1 GATA HSF2 Nkx-2.5 Oct-1 S8 SRY TATA

154997-156173: AML-1a AP-1 Brn-2 C/EBP CdxA c-Ets1 deltaE E47 Evi-1 GATA HFH-1 HNF-1 HNF-3b HSF Ik-2 Nkx-2.5 Oct-1 P300 Pbx-1 RORalp S8 SKY UBP USF XFD-1 YY1

164765-166041: ADR1 AML-1a AP-1 BR-c2 Brn-2 C/EBP cap CdxA C-Et- F2-II CHOP-c CP2 CRE-BP CroC deltaE Dfd dl Evi GATA GCN4 Hb HSF Ik-2 Lyf-1 MATalp Max MyoD NF-E2 NIF2 Nkx-2.5 Oct-1 Pbx-1 S8 Skn-1 SKY STAT Tal-1a TATA USF VBP v-Myb XFD-1 YY1

166104-167090: AML-1a AP-1 Brn-2 C/EBPb CdxA CHOP-c CRE-BP deltaE GATA HSF Ik-2 Lyf-1 NF-E2 Nkx-2.5 Oct-1 S8 SKY SOX-5 STAT v-Myb XFD-1

Note: Evolutionary conserved regions (ECRs) were identified by alignment between pig and human, with identity higher than 75%. The positions of ECRs are based on the pig BAC sequence. The putative transcription factor-binding sites associated with muscle development are indicated by underline.

expression in chondrocytes and striated muscles, indicating a likely role in human cartilage and muscle development[19]. Third, CCAAT/enhancer-binding proteins (C/EBPs) are basic region/leucine zipper transcription factors that function as regulators of cell growth and differentiation in numerous cell types. Some evidence suggests that the C/EBP site at −54 bp of the EhPgp1 gene stabilizes the transcription pre-initiation complex[20]. Previous results indicated that C/EBPs have important functions in TGF-beta signal transduction during monocyte differentiation[21]. Some evidence showed that TGF-beta induced the activation and binding of a C/EBPbeta-containing transcriptional complex to this sequence[22]. Fourth, the role of activating protein-1 (AP-1) in muscle cells is currently equivocal. While some studies propose that AP-1 is inhibitory for myogenesis, others implicate a positive role in this process. Further studies indicate that AP-1 function during myogenesis depends on its subunit composition[23]. In vascular smooth muscle cells (VSMC), the AP-1 complex stimulated by Ang II may inhibit cell growth through active TGF-beta production[24]. Fifth, Oct-1 is a transcription factor involved in the cell cycle regulation of histone H2B gene transcription and in the transcription of other cellular housekeeping genes[25]. The cardiac troponin I gene is one of the few sarcomeric protein genes exclusively expressed in cardiac muscle. MEF2 and Oct-1 transcription factors bind to the same A/T-rich element. A mutation that blocks this binding markedly reduces gene activation in vivo and in vitro[26]. The contribution of the IIB MyHC gene to specification of the myogenic phenotype is at least partially regulated by MEF-2 and Oct-1[27]. Sixth, upstream stimulatory factor-1 (USF1) plays an important role in muscle development. It is by modification of USF1 that the contractile stimulus mediates changes in myocyte gene transcription[28]. USF factors contribute to the regulation of APEG-1 (aortic preferentially expressed gene-1) expression and may influence the differentiation of VSMC[29]. The binding of USF to a conserved site in the XMyoDa promoter decreased basal activity of the promoter and inhibited MyoD-dependent autoactivation[30]. Seventh, Nkx-2.5 has a role in regulation and/or maintenance of specialized fate selection by embryonic myocardial cells[31]. Smooth and cardiac muscle share transcriptional machinery, and the GATA and NK families confer muscle specificity on the serum response factor/CArG interaction[32]. Eighth, muscle-restricted transcription of sarcomeric actin genes is negatively controlled by the zinc finger protein YY1, which is down-regulated at the protein level during myogenic differentiation[33,34]. Finally, previous results demonstrate that the transcription factors CREB, GATA-2 and SOX-5 play a significant role in the expression of the skeletal muscle dihydropyridine-sensitive receptor (DHPR) or L-type Ca2+ channel alpha(1S)[35]. Other transcription factors in the region may be less closely related to muscle development, but they may be important for proper transcription. These putative transcription factor binding sites may therefore be involved in the muscle-specific expression of the myostatin gene.

3 Discussion

As significant amounts of the pig genome are sequenced, the opportunity to use cross-species sequence comparison as an analytical tool becomes increasingly attractive. Detailed analysis and comparison of conserved regions may help our understanding of the genomic organization of complex genes and of putative regulatory elements. Here we present a comparative sequence analysis, among pig, human and mouse, of about 170 kb containing the myostatin gene. Many putative transcription factor binding sites were identified in the ECRs, which may be related to muscle development.

An interesting feature of this region is the high frequency of interspersed repeats. Interspersed repeats make up 30% of the human 1-Mb region on Chr 11p15[36], the average frequency for the human genome is about 45%[37], and the average level of repeats for the mouse genome is 37.5%[38], so the myostatin region is characterized by a higher content of interspersed repeats (48.62% in pig and 57.44% in human) than the average. Although the total percentage of repeats in the region is higher in human than in pig, there is no evidence as to whether repeats are more abundant in the whole human genome than in the whole pig genome. It has previously been suggested that simple repeats may play an important role in the regulation of imprinted genes[39], but no clear mechanism has been characterized so far. Whether the high content of interspersed repeats has any relationship with the regulation of myostatin requires further study.

The myostatin region is characterized by a low G+C content (37.52% in human and 36.92% in pig), compared with average levels of 41% and 42% for the human and mouse genomes respectively[38]. Some previous evidence suggests that a higher G+C content is a general trend of the pig genome compared with the human genome[36,40]. However, our results are not consistent with this tendency, because the G+C content in human (37.52%) is slightly higher than in pig (36.92%). An independent comparative study of a 130-kb genomic region on pig Chr 15 and human Chr 2[40] also shows a similar G+C content in both species (45.6% in pig and 45.4% in human). A variety of evidence suggests that vertebrate genomes are mosaics of isochores differing in G+C content, repeat distribution and gene distribution. G+C-rich isochores are rich in SINE elements and genes, whereas G+C-poor isochores are gene poor and relatively depleted in SINE elements. G+C-rich isochores localize prominently to R-bands, whereas G+C-poor isochores are enriched in G-bands. In situ hybridization studies demonstrate that SINEs appear to be largely localized in R-bands and LINEs in G-bands[41]. Thus the low G+C content in the region is consistent with its low content of SINEs.

CpG islands are present in close association with all housekeeping genes as well as some tissue-specific genes in the mammalian genome. Methylation of CpG islands strongly influences both the structural organization and the function of chromatin[42]. Of human genes, 50% to 60% exhibit a CpG island over the transcription start site (TSS), which means that a CpG island does not always coincide with a TSS. In the pig BAC, the CpG islands are not associated with the transcription start site of myostatin. Most interestingly, their distribution is conserved between pig and human, which may be required for the regulation of myostatin.
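The two quantities discussed above, G+C content and CpG enrichment, can be illustrated with a minimal Python sketch. The function names `gc_content` and `cpg_obs_exp` are hypothetical, and the observed/expected CpG ratio used here is one widely used measure of CpG enrichment; the text does not state which criteria were actually applied to the BAC sequence, so this is an illustration only.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only A, C, G and T are counted, so N and other symbols are ignored.
    """
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def cpg_obs_exp(seq: str) -> float:
    """Return the observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)
```

Applied in a sliding window along a genomic sequence, these two functions give the kind of profile from which G+C-poor regions and candidate CpG islands are usually read off.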

The pig and human lineages are estimated to have diverged about 70 million years ago, whereas human and mouse diverged approximately 100 million years ago[43]. The sequence identity of the myostatin gene is remarkably similar among the three species (table 1). The general trend is a higher similarity between pig and human than between either of these species and mouse, as expected. The same conclusion was reached by comparisons of the INS-IGF2 and H19 genes among the three species[40]. The different divergence times of these species may be reflected in the different identities. Therefore, the results illustrate that comparative sequence analysis among three or more species will be widely used in the interpretation of genome evolution.

For the myostatin gene, one striking feature is the presence of two TATA-boxes immediately downstream of the CCAAT-box. The distance between the two TATA-boxes is 20 bp, and the distances from the CCAAT-box to TATA-box1 and TATA-box2 are 36 and 60 bp respectively. Our results show that TATA-box1 is responsible for transcription in pig and human, whereas TATA-box2 acts on transcription in mouse. As a result, the 5′ UTR in mouse is 22 bp shorter than that in pig, and the different sizes of the 5′ UTR may be associated with mRNA stability. In mouse, the box2 sequence is TATACAG, which differs from box1 and from the human and mouse box2, whose sequence is TATAAAA. In general, the TATAAAA sequence is more suitable for transcription than the TATACAG sequence.
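Spacing between promoter elements of this kind can be measured with a short Python sketch. The helper names `find_all` and `gaps_after_first` are hypothetical, and the toy sequence in the usage note is invented for illustration; it is not the pig, human or mouse promoter, and the convention used here (bases between the end of one motif and the start of the next) is an assumption, since the text does not say how the distances were counted.

```python
from typing import List


def find_all(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def gaps_after_first(seq: str, upstream: str, downstream: str) -> List[int]:
    """Return the number of bases between the end of the first upstream motif
    and the start of each downstream motif that follows it."""
    upstream_hits = find_all(seq, upstream)
    if not upstream_hits:
        return []
    end = upstream_hits[0] + len(upstream)
    return [pos - end for pos in find_all(seq, downstream) if pos >= end]
```

For example, `gaps_after_first("CCAATGGTATAAAACCCTATAAAA", "CCAAT", "TATAAAA")` reports one gap per TATA-like match downstream of the CCAAT motif in the toy sequence.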

Additionally, mouse differs from human and pig at the initiation codon site, where two contiguous ATGs result from a point mutation. The region around the first ATG is consistent with the Kozak sequence, which is important for the proper initiation of translation. If we assume that the start codon is the second ATG, the atgATGc sequence also closely resembles the Kozak consensus sequence[12,13]. The same phenomenon occurs in the Slc23a1 gene, where it is suggested to be helpful for translation[44]. This structure may not be related to MSTN function, but may be associated with the initiation of translation.
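A rough way to compare the context of candidate start codons is sketched below in Python. It assumes a common simplification of the Kozak consensus in which only two positions are scored: a purine (A or G) at position -3 and a G at position +4, with the A of ATG taken as +1. The function name `kozak_strength` and the three-level labels are assumptions for illustration, not the procedure used by the authors.

```python
def kozak_strength(seq: str, atg_pos: int) -> str:
    """Classify the Kozak context of the ATG starting at 0-based index atg_pos.

    Scores two positions only: purine at -3 and G at +4.
    Positions that fall outside the sequence count as non-matching.
    """
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    minus3 = seq[atg_pos - 3] if atg_pos >= 3 else ""
    plus4 = seq[atg_pos + 3] if atg_pos + 3 < len(seq) else ""
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[hits]
```

Calling the function once for each of two contiguous ATGs gives a simple side-by-side comparison of their contexts; a fuller treatment would score every position of the consensus.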

The length of the myostatin-encoding transcript differs among pig, human and mouse, mainly because of differences in the 3′UTR. The human transcript is the longest, and the mouse transcript is the shortest. Some evidence has shown that transcript length may regulate transcript stability, so the human transcript may be the most stable. It is striking that there are two polyadenylation signal sequences (AATAAA) in the pig 3′UTR, but not in human or mouse; the mouse 3′UTR does not contain the AATAAA sequence at all. For the SCD (stearoyl-CoA desaturase) gene, alternative usage of polyadenylation sites generates two transcripts of 3.9 and 5.2 kb[45]. Whether the two polyadenylation signals lead to pig transcripts of different lengths requires further confirmation. In mouse, the lack of a polyadenylation signal may affect formation of the mRNA polyA tail, which is important for mRNA stability. Another feature is that the transcription stop region is characterized by many consecutive TTTT repeats, which may be responsible for proper transcription termination and for the polyadenylation reaction.
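Locating AATAAA signals in a 3′UTR is a simple pattern scan, sketched here in Python with the standard `re` module. The function names and the toy sequence used in the usage note are hypothetical; the spacing between two signals only hints at how much two alternative transcripts might differ in length, and is not a prediction of actual cleavage sites.

```python
import re
from typing import List


def polya_signal_positions(utr: str, signal: str = "AATAAA") -> List[int]:
    """Return 0-based start positions of every occurrence of the signal.

    A zero-width lookahead is used so that overlapping matches are reported too.
    """
    pattern = re.compile("(?=" + re.escape(signal.upper()) + ")")
    return [match.start() for match in pattern.finditer(utr.upper())]


def signal_spacings(positions: List[int]) -> List[int]:
    """Return the distances between consecutive signal start positions."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]
```

For instance, `polya_signal_positions("CCAATAAAGGGGAATAAATT")` finds two signals in the toy sequence, and `signal_spacings` gives the distance between them; an empty list corresponds to a 3′UTR without the canonical signal.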

An evolutionary conserved region that exceeds a defined threshold of sequence homology is likely to represent a functional element. In the myostatin region, many ECRs were found by comparative analysis between pig and human (fig. 1). Myostatin, a member of the TGF-beta family, negatively regulates skeletal muscle development. Suppression of myostatin activity leads to increased muscle growth and carcass lean yield[46]. Our results show that many transcription factor binding sites associated with muscle development were identified in the ECRs, including MyoD, Sox-5, C/EBPbeta, AP-1, Oct-1, Nkx-2.5, USF and YY1. These transcription factors interact with the TGF-beta family in the regulation of muscle development. On the one hand, the putative transcription factors regulate myostatin expression. Sequence analysis of 1.6 kb of the bovine myostatin gene upstream region revealed that it contains 10 E-box motifs (E1 to E10), arranged in three clusters, and a single MEF2 site. Furthermore, cotransfection experiments indicate that, among the myogenic regulatory factors, MyoD preferentially up-regulates myostatin promoter activity[17]. In VSMC (vascular smooth muscle cells) that produce active TGF-beta (CNC), the AP-1 complex stimulated by Ang II may inhibit cell growth through active TGF-beta production[24]. On the other hand, myostatin also controls the expression of these factors. Some evidence indicates that MyoD plays an important role in muscle development, whereas its function can be suppressed through inhibition of its expression or activity by TGF-beta[47]. Further results showed that myostatin inhibits myoblast differentiation by down-regulating MyoD expression[16]. Some results also showed that TGF-beta induced the activation and binding of a C/EBPbeta-containing transcriptional complex to this sequence[22]. Taken together, these findings suggest that myostatin expression may be regulated in a complex manner by these factors, and that the complex interaction between myostatin and these transcription factors may be required for muscle development.
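The idea of an ECR as a region exceeding an identity threshold can be sketched in Python as a sliding-window scan over a pairwise alignment. The defaults below (a 100-bp window and 70% identity) are assumed for illustration because they are commonly used cutoffs; the thresholds actually applied in this analysis are not restated here. The inputs are assumed to be two already aligned sequences of equal length with "-" for gaps, and the function names are hypothetical.

```python
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Return the percentage of aligned columns with identical bases.

    Columns containing a gap ('-') are counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


def conserved_windows(a: str, b: str, window: int = 100,
                      threshold: float = 70.0) -> List[int]:
    """Return start columns of every window whose identity meets the threshold."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [
        start
        for start in range(len(a) - window + 1)
        if percent_identity(a[start:start + window], b[start:start + window]) >= threshold
    ]
```

Merging overlapping windows returned by `conserved_windows` would give contiguous candidate regions, which could then be searched for transcription factor binding sites.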

Acknowledgements The work was supported by State Major Basic Research Development Program of China (G20000161). The authors thank Changxin Wu for sequence analysis.

References

1. McPherron, A. C., Lawler, A. M., Lee, S. J., Regulation of skeletal muscle mass in mice by a new TGF-beta superfamily member, Nature, 1997, 387: 83—90.

2. Thomas, M., Langley, B., Berry, C. et al., Myostatin, a negative regulator of muscle growth, functions by inhibiting myoblast proliferation, J. Biol. Chem., 2000, 275(51): 40235—40243.

3. Yamanouchi, K., Soeta, C., Naito, N., Tojo, H., Expression of myostatin gene in regenerating skeletal muscle of the rat and its localization, Biochemical and Biophysical Research Communications, 2000, 270: 510—516.

4. Kambadur, R., Sharma, M., Smith, T. P. L., Bass, J. J., Mutations in myostatin (GDF-8) in double muscled Belgian Blue and Piedmontese cattle, Genome Res., 1997, 7: 910—916.

5. Zimmers, T. A., Davies, M. V., Koniaris, L. G. et al., Induction of cachexia in mice by systemically administered myostatin, Science, 2002, 296: 1486—1488.

6. Ji, S., Losinki, R. L., Cornelius, S. G. et al., Myostatin expression in porcine tissues: Tissue specificity and developmental and postnatal regulation, Am. J. Physiol., 1998, 275: R1265—1273.

7. Sonstegard, T. S., Rohrer, G. A., Smith, T. P. L., Myostatin maps to porcine chromosome 15 by linkage and physical analyses, Anim. Genet., 1998, 29(1): 19—22.

8. Strail, A., Kopecny, M., Genomic organization, sequence and polymorphism of the porcine myostatin (GDF8;MSTN) gene, Animal Genetics, 1999, 30(6): 468—470.

9. Jiang, Y. L., Li, N., Plastow, G. et al., Identification of three SNPs in the porcine myostatin gene (MSTN), Animal Biotechnology, 2002, 13: 173—178.

10. Sulston, J., Du, Z., Thomas, K. et al., The C. elegans genome sequencing project: a beginning, Nature, 1992, 365: 37—41.

11. Ansari-Lari, M. A., Oeltjen, J. C., Schwartz, S. et al., Comparative sequence analysis of a gene-rich cluster at human chromosome 6, Genome Res., 1998, 8: 29—40.

12. Kozak, M., An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs, Nuc. Acids Res., 1987, 15: 8125—8148.

13. Kozak, M., An analysis of vertebrate mRNA sequence: Intimations of translational control, J. Cell Biol., 1991, 115: 887—903.

14. Mount, S. M., A catalogue of splice junction sequences, Nucleic Acids Res., 1982, 10: 459—472.

15. Pin, C. L., Konieczny, S. F., A fast fiber enhancer exists in the muscle regulatory factor 4 gene promoter, Biochem. Biophys. Res. Commun., 2002, 299: 7—13.

16. Langley, B., Thomas, M., Bishop, A. et al., Myostatin inhibits myoblast differentiation by down-regulating myoD expression, J. Biol. Chem., 2002, 277(51): 49831—49840.

17. Spiller, M. P., Kambadur, R., Jeanplong, F. et al., The myostatin gene is a downstream target gene of basic helix-loop-helix transcription factor MyoD, Mol. Cell Biol., 2002, 22(20): 7066—7082.

18. Denny, P., Swift, S., Connor, F., Ashworth, A., An SRY-related gene expressed during spermatogenesis in the mouse encodes a sequence-specific DNA-binding protein, EMBO J., 1992, 11: 3705—3712.

19. Ikeda, T., Zhang, J., Chano, T., Mabuchi, A., Fukuda, A., Identification and characterization of the human long form of Sox5 (L-SOX5) gene, Gene, 2002, 298: 59—68.

20. Marchat, L. A., Gomez, C., Perez, D. G. et al., Two CCAAT/enhancer binding protein sites are cis-activator elements of the Entamoeba histolytica EhPgp1 (mdr-like) gene expression, Cell Microbiol., 2002, 11: 725—737.

21. Pan, Z., Hetherington, C. J., Zhang, D. E., CCAAT/enhancer-binding protein activates the CD14 promoter and mediates transforming growth factor beta signaling in monocyte development, J. Biol. Chem., 1999, 274: 23242—23248.

22. Garcia-Trevijano, E. R., Iraburu, M. J., Fontana, L. et al., Two domains of MyoD mediate transcriptional activation of genes in repressive chromatin: a mechanism for lineage determination in myogenesis, Genes Dev., 1997, 11: 436—450.

23. Andreucci, J. J., Grant, D., Cox, D. M. et al., Composition and function of AP-1 transcription complexes during muscle cell differentiation, J. Biol. Chem., 2002, 277(19): 16426—16432.

24. Morishita, R., Gibbons, G. H., Horiuchi, M., Kaneda, Y., Ogihara, T., Dzau, V. J., Role of AP-1 complex in angiotensin II-mediated transforming growth factor-beta expression and growth of smooth muscle cells: Using decoy approach against AP-1 binding site, Biochem. Biophys. Res. Commun., 1998, 243(2): 361—367.

25. Segil, N., Roberts, S. B., Heintz, N., Mitotic phosphorylation of the Oct-1 homeodomain and regulation of Oct-1 DNA binding activity, Science, 1991, 254(5039): 1814—1816.

26. Di Lisi, R., Millino, C., Calabria, E. et al., Combinatorial cis-acting elements control tissue-specific activation of the cardiac troponin I gene in vitro and in vivo, J. Biol. Chem., 1998, 273: 25371—25380.

27. Lakich, M. M., Diagana, T. T., North, D. L., Whalen, R. G., MEF-2 and Oct-1 bind to two homologous promoter sequence elements and participate in the expression of a skeletal muscle-specific gene, J. Biol. Chem., 1998, 273: 15217—15226.

28. Xiao, Q., Kenessey, A., Ojamaa, K., Role of USF1 phosphorylation on cardiac alpha-myosin heavy chain promoter activity, Am. J. Physiol. Heart Circ. Physiol., 2002, 283: H213—219.

29. Chen, Y. H., Layne, M. D., Watanabe, M., Yet, S. F., Perrella, M. A., Upstream stimulatory factors regulate aortic preferentially expressed gene-1 expression in vascular smooth muscle cells, J. Biol. Chem., 2001, 276: 47658—47663.

30. Lun, Y., Sawadogo, M., Perry, M., Autoactivation of Xenopus MyoD transcription and its inhibition by USF, Cell Growth Differ., 1997, 8: 275—282.

31. Thomas, P. S., Kasahara, H., Edmonson, A. M. et al., Elevated expression of Nkx-2.5 in developing myocardial conduction cells, Anat. Rec., 2001, 263(3): 307—313.

32. Nishida, W., Nakamura, M., Mori, S. et al., A triad of serum response factor and the GATA and NK families governs the transcription of smooth and cardiac muscle genes, J. Biol. Chem., 2002, 277(9): 7308—7317.

33. Kalenik, J. L., Chen, D., Bradley, M. E., Chen, S. J., Lee, T. C., Yeast two-hybrid cloning of a novel zinc finger protein that interacts with the multifunctional transcription factor YY1, Nucleic Acids Res., 1997, 25: 843—849.

34. Walowitz, J. L., Bradley, M. E., Chen, S., Lee, T., Proteolytic regulation of the zinc finger transcription factor YY1, a repressor of muscle-restricted gene expression, J. Biol. Chem., 1998, 273: 6656—6661.

35. Zheng, Z., Wang, Z. M., Delbono, O., Charge movement and transcription regulation of L-type calcium channel alpha(1S) in skeletal muscle cells, J. Physiol., 2002, 540: 397—409.

36. Onyango, P., Miller, W., Lehoezky, J. et al., Sequence and comparative analysis of the mouse 1-megabase region orthologous to the human 11p5 imprinted domain, Genome Res., 2000, 10: 1697—1710.

37. Lander, E. S., Linton, L. M., Birren, B. et al., Initial sequencing and analysis of the human genome, Nature, 2001, 409: 860—892.

38. Waterston, R. H., Lindblad-Toh, K., Birney, E. et al., Initial sequencing and comparative analysis of the mouse genome, Nature, 2002, 420: 520—562.

39. Shibata, H., Yoda, Y., Kato, R. et al., A methylation imprint mark in the mouse imprinted gene Grf1/Cdc25Mm locus shares a common feature with U2afbp-rs gene: An association with a short tandem repeat and hypermethylated region, Genomics, 1998, 4930—4937.

40. Amarger, V., Nguyen, M., Laere, A. S. V. et al., Comparative sequence analysis of the INS-IGF2-H19 gene cluster in pigs, Mamm. Genome, 2002, 13: 388—398.

41. Boyle, A. L., Ballard, S. G., Ward, D. C., Differential distribution of long and short interspersed element sequence in mouse genome: Chromosome karyotyping by fluorescence in situ hybridization, Proc. Natl. Acad. Sci., 1990, 87: 7757—7761.

42. Kundu, T. K., Rao, M. R., CpG islands in chromatin organization and gene expression, J. Biochem. (Tokyo), 1999, 125(2): 217—222.

43. Andersson, L., Archibald, A., Ashburner, M. et al., Comparative genome organization of vertebrates: the first international workshop on comparative genome organization, Mamm. Genome, 1996, 7: 717—734.

44. Gispert, S., Dutra, A., Lieberman, A., Friedlich, D., Nussbaum, R. L., Cloning and genomic organization of the mouse gene Slc23a1 encoding a vitamin C transporter, DNA Research, 2000, 7: 339—345.

45. Zhang, L., Ge, L., Parimoo, S., Stenn, K., Prouty, S. M., Human stearoyl-CoA desaturase: Alternative transcripts generated from a single gene by usage of tandem polyadenylation sites, Biochem. J., 1999, 340: 255—264.

46. Yang, J., Ratovitski, T., Brady, J. P. et al., Expression of myostatin pro domain results in muscular transgenic mice, Mol. Reprod. Dev., 2001, 60(3): 351—361.

47. Dias, P., Dilling, M., Houghton, P., The molecular basis of skeletal muscle differentiation, Semin. Diagn. Pathol., 1994, 11(1): 314.
