| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Genetic Epidemiology and Molecular Epidemiology Laboratories (C.A.A., G.Z., S.A.T., N.G.M., P.M.V., G.W.M.), Queensland Institute of Medical Research, Brisbane, Queensland, Australia 4029; Twin Research and Genetic Epidemiology Unit (M.F., T.D.S.), St. Thomas Hospital, Kings College, London, United Kingdom SE1 7EH; Department of Biological Psychology (S.M.v.d.B., D.I.B.), Vrije Universiteit Amsterdam, 1081 HV Amsterdam, The Netherlands; Department of Veterinary Medicine (S.M.v.d.B.), Utrecht University, 3508 GA Utrecht, The Netherlands; Institute of Evolutionary Biology (C.A.A.), School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland EH9 3JT, United Kingdom; and Bioinformatics and Statistical Genetics (C.A.A.), Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom OX3 7BN
Address all correspondence and requests for reprints to: Grant W. Montgomery, Molecular Epidemiology Laboratory, Queensland Institute of Medical Research, Post Office, Royal Brisbane Hospital, Queensland 4029, Australia. E-mail: grant.montgomery{at}qimr.edu.au.
| Abstract |
|---|
Objective: The objective of the study was to identify genetic loci influencing variation in AAM in large population-based samples from three countries.
Design/Participants: Recalled AAM data were collected from 13,697 individuals and 4,899 pseudoindependent sister-pairs from three different populations (Australia, The Netherlands, and the United Kingdom) by mailed questionnaire or interview. Genome-wide variance components linkage analysis was implemented on each sample individually and in combination.
Results: The mean, SD, and heritability of AAM across the three samples were 13.1 yr, 1.5 yr, and 0.69, respectively. No loci were detected that reached genome-wide significance in the combined analysis, but a suggestive locus was detected on chromosome 12 (logarithm of the odds = 2.0). Three loci of suggestive significance were seen in the U.K. sample on chromosomes 1, 4, and 18 (logarithm of the odds = 2.4, 2.2 and 3.2, respectively).
Conclusions: There was no evidence for common highly penetrant variants influencing AAM. Linkage and association suggest that one trait locus for AAM is located on chromosome 12, but further studies are required to replicate these results.
| Introduction |
|---|
A secular decrease in AAM has occurred in developed countries over the past century. This decrease is believed to be associated with improved nutritional status, greater amounts of adipose tissue, and the improved general health of adolescent females. A large multinational study carried out by the World Health Organization showed the median AAM to be 14 yr (6). In populations of European descent, the mean AAM is typically about 13 yr (except for Mediterranean populations, who have an early AAM) (7).
Menarche is reached after a series of complex developmental and neuroendocrine events leading to full activation of the hypothalamic-pituitary-gonadal axis including maturation of the KISS1/KISS1R system (8). AAM is a complex trait determined by an array of genetic and environmental factors (9, 10, 11). Genetic factors clearly play a role in AAM, with monozygotic twin correlations in the range of 0.65–0.93 and dizygotic twin correlations in the range of 0.18–0.62 (11). AAM is a highly heritable trait, with twin studies reporting heritabilities of 0.45–0.95 (10).
To date, two genome-wide linkage scans for genes underlying variation in AAM have been performed (12, 13). In a study of 98 sister pairs, three suggestive linkage peaks were found on chromosomes 16q21, 16q12, and 8p12 for weight-adjusted AAM (13). A study of 2461 sister pairs found significant evidence of linkage to 22q13 and suggestive evidence of linkage to 11q23 and 22q11 (12). No replicated linkages have been reported for AAM.
Several candidate genes have been associated with AAM, including the estrogen receptor-α (ESR1) gene (14), the estrogen-biosynthetic gene aromatase CYP19A1 (15), and the SHBG gene (16). The estrogen receptor-β (ESR2) gene has also been associated with AAM and has been shown to interact with the ESR1 locus (17). These associations are yet to be replicated in independent samples.
We performed a genome-wide linkage scan in large population-based samples from three populations of European descent. Our combined sample is the largest reported for a study of AAM.
| Subjects and Methods |
|---|
Australian samples
The Australian samples were drawn from three different studies.
Adolescent twin families Adolescent twins and their families were recruited for an ongoing study of melanoma risk factors. Twins were interviewed at ages 12 and 14 yr. Nontwin siblings were asked to attend at least one interview. As part of the clinical protocol, female adolescents were asked during interview to provide their age at menarche (18). The present analysis used AAM estimates collected between May 1992 and February 2006. A second sample of adolescent twin families was recruited to an ongoing study of cognitive ability. Twins and their siblings were interviewed at age 16 yr. As part of the clinical protocol, female participants were asked by a research nurse to give their AAM (19). This current study uses AAM data collected between July 1996 and February 2006. Many of the twins taking part in the cognitive ability study had previously taken part in the melanoma risk factor study. However, 226 of the 1351 adolescents had a censored AAM (i.e. menarche had not occurred by the last interview) and were removed from further analysis. This conservative approach will cause a small reduction in power to detect linkage but not increase the false-positive rate. The mean age at last interview for these censored individuals was 12.4 yr, well below the mean age at menarche in the Australian sample (13.0 yr). Therefore, the censored data include individuals from across the range in AAM and not just the upper end of this distribution. Furthermore, because these individuals form only a small part of the total study population (1.68%) and statistical power is proportional to the total number of sibling pairs, their removal is unlikely to have a large influence on the linkage results. Data were also discarded when the reported AAM was higher than the age at interview or when participants indicated they had not yet experienced menarche but nevertheless reported an AAM. In total, 17.6% of the adolescent data were removed (16.7% due to censoring).
When AAM was ascertained from the same individual more than once, the correlation between the estimates was 0.9. Three genome-wide scans using microsatellite markers were carried out on extracted DNA from blood samples collected from these families. Details of the genotyping and genetic data cleaning procedures have been described previously (20). In addition, 166 adolescent females were genotyped with 100k single-nucleotide polymorphism (SNP) chips (Affymetrix, Santa Clara, CA). Genetic data from all four genome screens are used in the present study. The adolescent samples provided 826 individuals with both AAM and genotypic data to the present analysis.
Adult twin families From 1980 to 1982, AAM information was collected from both members of 1888 female twin pairs as part of a health survey mailed to twins on the Australian Twin Registry. The cohort comprised twins born between 1913 and 1964. Females were asked the question, "How old were you when you had your first menstrual period?" (years and months) via questionnaire. From November 1989 onward, the first-degree relatives of this twin cohort were also surveyed and asked the same AAM question. AAM data from this cohort of female relatives are included in the present analysis. A second cohort of adult twins, born between 1964 and 1971, was recruited in 1989 and hence were aged 18–24 yr when they were initially surveyed by mailed questionnaire. Immediate family members were also asked to participate. As part of the questionnaire, participants were asked the same question regarding menarche as the 1913–1964 cohort. This second cohort were followed up, via telephone interview, between 1996 and 2000 as part of a study into alcoholism and psychiatric morbidity (21) and were again asked to give their AAM. Four microsatellite genome-wide scans have been completed on DNA extracted from blood or buccal samples collected from the two adult cohorts. The genotyping and genetic data cleaning procedures for all four scans have been described previously (22). The adult cohorts provided 3544 individuals with both phenotypic and genotypic data to the present study.
Endometriosis families AAM information was obtained from women diagnosed with endometriosis ascertained from the Australian component of the International Endogene Study (23). From 1995 to 2002, the Australian group recruited 931 families, each with at least two affected members (mostly affected sister pairs) with surgically diagnosed endometriosis for a genome-wide linkage scan (24). Included in the sample were case-parent trios and some cases without parents (23). All affected female family members were asked the question, "How old were you when your periods began" (whole years) via mailed questionnaire. Details of the genotyping and genetic data cleaning procedures for the endometriosis cohort have been detailed elsewhere (24). A single 400-marker genome-wide scan was carried out on the Australian families. The final batch of 79 families was genotyped using only 113 markers across chromosomes 9, 10, 11, 19, 20, 21, 22, and X.
For all study cohorts, where two or more AAM estimates were available for an individual, the estimate provided at the first data collection after menarche was used. It was assumed that the recall closest in time to menarche would be the most accurate. In total, the Australian sample comprised 6892 individuals with AAM and genotype data.
Dutch samples
In total, the Dutch sample comprises 1549 individuals with AAM and genotype data.
Phenotype data were collected from twins and their family members who are registered with The Netherlands Twin Registry (25). In 1991, The Netherlands Twin Registry started a longitudinal survey study of health, personality, and lifestyle. Surveys were sent out in 1991, 1993, 1995, 1997, 2000, and 2002 to adolescent and adult twins and their family members. Twin pairs could participate in all waves; siblings were included from 1995; parents of twins participated in 1991, 1993, 1995, and 2002; and spouses from 2000. Families with adolescent and young adult twins were recruited through city councils in 1990–1993. After 1993 an effort was also made to recruit adult and older twins using several approaches. Further details on response rates and demographic characteristics of the sample can be found elsewhere (26).
In 1991, 1993, and 1995, female participants were asked to indicate the AAM in years and months, whereas in 2000 and 2002, participants were asked to indicate age in years. In 1993, 1995, and 2002, participants were first asked whether they had had their first menstruation, and if so, they were asked to indicate the age at which it had occurred. All data concerning AAM were rescaled to number of months, in which 6 months were added to the data from 2000 and 2002. This way we avoided bias due to the fact that we only had data on the age in years (27). Data were discarded when the reported AAM was higher than the age at time of the questionnaire and when an AAM report differed by more than 12 months from others reported for the same individual. Data points were also excluded when participants indicated they had not yet had their first menses but nevertheless reported an age. Total discarded data were less than 1%.
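The rescaling and consistency rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the record layout (survey wave, years, optional months) are hypothetical, and the consistency check is one possible reading of the 12-month rule, not the authors' actual procedure.

```python
# Survey waves that recorded AAM in whole years only (taken from the text).
YEARS_ONLY_WAVES = {2000, 2002}


def aam_in_months(wave, years, months=0):
    """Convert one AAM report to months (hypothetical helper).

    Whole-year reports get a 6-month midpoint offset, because a reported
    age of 13 yr covers the interval from 13 yr 0 mo to 13 yr 11 mo.
    """
    if wave in YEARS_ONLY_WAVES:
        return years * 12 + 6
    return years * 12 + months


def reports_consistent(reports_in_months, max_spread=12):
    """Return True if all reports for one individual lie within max_spread months.

    Assumption: "differed by more than 12 months" is read here as the
    range (max minus min) of that individual's reports.
    """
    return max(reports_in_months) - min(reports_in_months) <= max_spread


# Example: the same woman reports 13 yr 2 mo in 1991 and "13" in 2000.
first = aam_in_months(1991, 13, 2)   # 158 months
second = aam_in_months(2000, 13)     # 162 months
consistent = reports_consistent([first, second])
```

Without the 6-month offset, whole-year reports would sit systematically below year-and-month reports of the same true age, which is the bias the text refers to.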
DNA was extracted from either whole-blood or buccal swabs using standard protocols (28). Samples were genotyped by the Mammalian Genotyping Service in Marshfield and the Molecular Epidemiology Section, Leiden University Medical Centre. Genotype data from these screens were combined. Pedigree relations were checked with graphic representation of relationships. Errors of Mendelian inheritance were detected with Pedstats. Markers and samples were removed if their total error rate was more than 1%; in all other cases the specific erroneous genotypes were set as unknown. Unlikely recombinants were detected using Merlin (29) and erroneous genotypes were removed with Pedwipe. The mean number of markers was 343, and 717 sibling pairs in which each individual had at least 200 autosomal markers genotyped were selected. In the Dutch families, 352 parents were genotyped.
United Kingdom samples
Twins identified from the St. Thomas United Kingdom adult twin registry (TwinsUK) were invited to participate in a study of common diseases and traits (30). Study subjects were assessed for an extensive range of clinical phenotypes. General medical and lifestyle data were also obtained by questionnaire. Genome scans were performed using DNA from leukocytes. Gemini Genomics conducted the genotyping, with 365 core microsatellites and 372 markers providing gap fills, as described previously (31), and 1494 SNP markers from the HuSNP GeneChip linkage mapping set (Affymetrix), providing approximate intermarker spacing of less than 10 cM. Twin zygosity and family relationships were rigorously investigated and discrepant pairs discarded from further analyses. The estimated genotyping error rate was less than 1%.
In total, the U.K. sample comprises 5181 females in dizygotic twin families with records on AAM and genotype data.
Genetic marker maps
To analyze the Australian, Dutch, and U.K. data jointly, a combined marker map was produced from the marker maps supplied by the three research groups. The Dutch and U.K. markers were placed onto the Australian map using a LOESS regression method implemented through R (32). Where the same marker appeared in multiple scans, the multiple versions were included in the analysis by offsetting the markers by 0.001 cM on the marker map. This method negates the need for the rebinning of alleles across multiple genomic screens. Samples from Australia and The Netherlands show a very low level of population differentiation (FST = 0.003), indicating that our combined analysis is appropriate (33). Furthermore, when we contrasted the allele frequencies of 317,000 SNPs between a sample of 462 female Australians and the HapMap sample of 60 individuals of northwestern European descent, we found no evidence for stratification (results not shown).
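The duplicate-marker offsetting step described above might look like the following Python sketch. It covers only the 0.001 cM offset, not the LOESS placement. The tuple layout, the scan labels, and the marker names are hypothetical and chosen purely for illustration.

```python
def offset_duplicate_markers(markers, step_cm=0.001):
    """Give each copy of a repeated marker its own map position.

    markers: list of (marker_name, scan_id, position_cM) tuples, assumed
    to be already placed on a common map. The first copy of a marker keeps
    its position; each later copy is shifted by a further step_cm and is
    relabeled with its scan so it is treated as a separate marker.
    """
    copies_seen = {}
    result = []
    for name, scan_id, position in markers:
        k = copies_seen.get(name, 0)
        result.append((f"{name}_{scan_id}", round(position + k * step_cm, 6)))
        copies_seen[name] = k + 1
    return result


# Hypothetical input: MARKER_X was typed in two different scans.
combined = offset_duplicate_markers([
    ("MARKER_X", "scanA", 50.0),
    ("MARKER_X", "scanB", 50.0),
    ("MARKER_Y", "scanA", 62.5),
])
```

Because each copy is analyzed as its own tightly linked marker, allele labels never have to be reconciled between scans, which is why rebinning is unnecessary.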
Data cleaning
To reduce the influence of univariate outliers, all estimates that differed from the mean by more than 3 SD were regressed back to this threshold. Phenotypic bivariate outliers were identified and removed to ensure that linkage signals were not overly influenced by extremely discordant sibling pairs (see Appendix 1). A bivariate outlier is defined as a pseudoindependent sibling pair (pisp) with an extreme difference in AAM. Any pisp with a squared difference in AAM of more than 2878 months² was identified as a bivariate outlier. This is equivalent to an absolute difference of 4.47 yr between the AAM of pisps. For each pisp identified as a bivariate outlier, the AAM that differed most from the sample mean was removed from further analysis. In total, 63 individuals were removed from the study because they created a bivariate outlier. The AAM data were then standardized to a normal distribution with a mean of 0 and a SD of 1. After the removal of bivariate outliers, the sample consisted of 13,618 individuals and 4,899 pisps (2893 from Australia, 1289 from the United Kingdom, and 717 from The Netherlands) with AAM and genome-wide genetic marker data.
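As a rough illustration of the cleaning steps described above, the following Python sketch works on AAM values in months. The function names are hypothetical, the standardization is assumed to be a simple z-score, and the actual procedure (Appendix 1) may differ in detail.

```python
import math
import statistics

BIVARIATE_THRESHOLD = 2878  # squared AAM difference in months^2 (from the text)


def clamp_univariate(values, n_sd=3.0):
    """Pull values lying more than n_sd SDs from the mean back to that threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [min(max(v, low), high) for v in values]


def is_bivariate_outlier(aam1, aam2, threshold=BIVARIATE_THRESHOLD):
    """Flag a sibling pair whose squared AAM difference exceeds the threshold."""
    return (aam1 - aam2) ** 2 > threshold


def member_to_drop(aam1, aam2, sample_mean):
    """Return 0 or 1: the sibling whose AAM lies further from the sample mean."""
    return 0 if abs(aam1 - sample_mean) >= abs(aam2 - sample_mean) else 1


def standardize(values):
    """Z-score the values to mean 0 and SD 1 (assumed form of standardization)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# The threshold of 2878 months^2 corresponds to about 4.47 yr.
threshold_years = math.sqrt(BIVARIATE_THRESHOLD) / 12
```

For example, a pair with AAMs of 150 and 204 months (a 54-month difference, 2916 months²) would be flagged, whereas a pair differing by 53 months (2809 months²) would not.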
Statistical analysis
Genome-wide variance-component linkage analysis was carried out using MERLIN (29). MERLIN estimates the probability of the number of alleles shared (0, 1, or 2) identical by descent (IBD) among relatives in a pedigree. The resulting IBD coefficient that is used in the analysis is the estimate of the proportion of alleles shared IBD. Identity by descent coefficients were calculated at 5-cM intervals. Variance-component models are based on the correlation between the genetic similarity of relatives at a given locus (IBD coefficients) and the relatives' similarity with respect to the phenotype (AAM). Phenotypic data are assumed to follow a multivariate-normal distribution. Linkage analysis was carried out on each population individually and in combination by combining within-study standardized Z scores from each of the three samples. In accordance with previously proposed significance levels (33) for genome-wide linkage analysis, a significance threshold of logarithm of the odds (LOD) = 3.3 was adopted. Suggestive evidence of linkage was reported when a LOD score of 1.9 or greater was observed.
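To give a feel for the LOD thresholds quoted above, the sketch below converts a LOD score to a likelihood-ratio chi-square statistic and an approximate pointwise P value. It assumes a one-parameter variance-component test whose null distribution is a 50:50 mixture of a point mass at zero and a 1-df chi-square; that assumption is not stated in the text, and the function names are illustrative.

```python
import math


def lod_to_chi2(lod):
    """Likelihood-ratio chi-square equivalent of a LOD score: 2 * ln(10) * LOD."""
    return 2.0 * math.log(10.0) * lod


def lod_to_pointwise_p(lod):
    """Approximate pointwise P value for a LOD score.

    Assumes a 50:50 mixture of 0 and chi-square(1) under the null, so the
    chi-square(1) tail probability is halved. For 1 df, the tail is
    P(X > x) = erfc(sqrt(x / 2)).
    """
    chi2 = lod_to_chi2(lod)
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


for threshold in (1.9, 3.3):
    print(f"LOD {threshold}: chi2 = {lod_to_chi2(threshold):.2f}, "
          f"pointwise P = {lod_to_pointwise_p(threshold):.2e}")
```

Under these assumptions, LOD = 3.3 corresponds to a pointwise P value of roughly 5 × 10⁻⁵, which is why a single-locus result needs to be this extreme before it counts as significant across a whole genome scan.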
| Results |
|---|
| Discussion |
|---|
We did not have data on body fat or body mass index taken around the time of menarche and were therefore unable to account for these in our analyses. It is possible that the regions showing suggestive linkage to AAM actually underpin variation in these or any other unobserved covariates. If these regions are confirmed, further work is needed to ascertain whether the loci suggested here act directly on AAM or through correlated phenotypes. The same is true for any genetic linkage or association study because association does not imply causation.
Despite our large number of sibling pairs, the combined analysis (with or without bivariate outliers) revealed no significant linkage peaks, suggesting that the genetic architecture of AAM is complex. If a single quantitative trait locus (QTL) explained the variation in AAM, then we would expect to detect a gene of this effect size in a study of this magnitude. The fact that we have not detected a significant LOD score suggests that multiple loci of small or modest effect are involved in the heritability of AAM. With a sample of 4899 pisps, we have 78% power to detect a QTL (at α = 0.0001, genome-wide significance) that explains 15% of the trait variation (assuming a heritability of 70% and a recombination fraction of 0.0) (35, 36). In addition, our scan did not replicate peaks reported in previous scans (12, 13). This suggests that AAM variation is underpinned by multiple QTL of small effect and is similar to the genetic architecture of other complex traits. Large numbers of sibling pairs are needed to identify QTL with small effects using linkage analysis (37). If a QTL existed that explained 10% of the trait variation, for a trait with a heritability of 70% such as AAM, then approximately 11,390 pisps would be required to detect the QTL with 80% power. Rare alleles with large effects on AAM could be segregating within the population, but these would explain little of the population variation in the trait.
We carried out our analysis after excluding bivariate outliers. The combined analysis with the bivariate outliers included showed some differences (data not shown). Ideally a LOD score peak is positively influenced by many sibling pairs and families, but this is seldom the case because the extremely concordant and discordant sibling pairs provide the majority of the power to detect linkage. For a marker in which an extremely discordant sibling pair share no alleles IBD, the sibling pair will have the effect of increasing the LOD score, and where they share two alleles IBD, they will decrease the LOD score at that locus. For this reason, it is difficult to know how to correctly handle bivariate outliers in linkage analysis. It seems reasonable to suggest that one may have more confidence in a linkage peak that is still present in the absence of bivariate outliers than those that are seen only in their presence. Furthermore, the aim of this study was to identify loci underlying the normal variation in AAM, and therefore, removing bivariate outliers would seem the logical approach. From data alone one cannot distinguish between the hypotheses that bivariate outliers are caused by (rare) alleles of large effect or by nongenetic effects, for example measurement errors. We recently discussed the issue of bivariate outliers in detail in a separate study on height (38).
We found no evidence for genetic linkage around several genes thought to play critical roles in AAM. Association between AAM and ESR1 has been reported in several studies (14, 17, 39). We found no evidence for linkage to the region of the ESR1 gene in any of the three populations studied. Recently the KISS1/KISS1R pathway has been shown to be critical for puberty. Both the ligand and receptor are expressed in the mediobasal hypothalamus of primates and play important roles in the regulation of GnRH neurons and the reproductive axis (40). We found no evidence for linkage to either KISS1 or KISS1R, suggesting that normal variation in the timing of menarche is not explained by highly penetrant variants of large effect in these genes. We had sufficient power to detect such major loci, but the genes could still harbor common loci of small effect or very rare alleles of large effect because our linkage study is underpowered to detect such variants.
The region of suggestive linkage on chromosome 12 contains the IGF-I (IGF1) gene. IGF1 plays an important role in control of the female reproductive system including pubertal development and menarche (41). A recent family-based association study including 1048 females in 354 Caucasian nuclear families identified significant association between AAM and a common SNP (rs6214) in exon 4 of the IGF1 gene (42). The SNP rs6214 is common with a minor allele frequency of 0.41, and there was also significant association with a haplotype including rs6214 and rs5742694. The mean AAM for subjects carrying the GA haplotype was 13.13 yr and the mean AAM for noncarriers was 12.86 yr (41). Association with common variants in IGF1 and our evidence for suggestive linkage to chromosome 12 suggest the IGF1 locus may contribute to variation in AAM and warrants further investigation. However, the association does not reach the level of genome-wide significance and has not been replicated in other samples. Given that our finding of linkage to this region is only suggestive, additional studies must be carried out to establish whether this locus contributes to variation in AAM.
In summary, we have carried out the largest genome-wide scan for loci influencing variation in AAM. We report one region of suggestive linkage on chromosome 12q supported by evidence from two populations. The IGF1 gene lies directly under this suggestive linkage peak and significant association with AAM was reported for SNP rs6214 in exon 4 of IGF1. Further studies are needed to confirm the role of IGF1 and/or other variants in this region of chromosome 12 on AAM. A hypothesis-free whole genome association study may reveal other common polymorphisms of small effect on AAM.
| Acknowledgments |
|---|
| Footnotes |
|---|
Disclosure Statement: The authors have nothing to disclose.
First Published Online July 22, 2008
1 C.A.A. and G.Z. contributed equally to this work.
Abbreviations: AAM, Age at menarche; IBD, identical by descent; LOD, logarithm of the odds; pisp, pseudoindependent sibling pair; QTL, quantitative trait locus; SNP, single-nucleotide polymorphism.
Received November 20, 2007.
Accepted July 11, 2008.
| References |
|---|
gene with the age of menarche. Hum Reprod 17:1101–1105
and estrogen receptor β genotypes influence the age of menarche. Hum Reprod 21:554–557
gene is linked and/or associated with age of menarche in different ethnic groups. J Med Genet 42:796–800