Most people have an intuitive notion of heritability as the genetic component of why close relatives tend to resemble each other more than strangers do. More technically, heritability is the fraction of the variance of a trait within a population that is due to genetic factors. This is the pedagogical post on heritability that I promised in a previous post on estimating heritability from genome-wide association studies (GWAS).
One of the most important facts about uncertainty, something that everyone should know but many don’t, is how errors combine when you add two imprecise quantities together. The average of the sum is the sum of the averages of the individual quantities, but the total error (i.e. standard deviation) is not the sum of the standard deviations. It is the square root of the sum of the squares of the standard deviations, i.e. the square root of the sum of the variances. In other words, when you add two uncorrelated noisy variables, the variance of the sum is the sum of the variances. Hence, when you add uncorrelated quantities with comparable errors, the total error grows as the square root of the number of quantities you add and not linearly, as had been assumed for centuries. There is a great article in American Scientist from 2007 called The Most Dangerous Equation giving a history of some calamities that resulted from not knowing how variances sum. The variance of a trait can thus be expressed as the sum of the genetic variance and environmental variance, where environment just means everything that is not correlated with genetics. The heritability is the ratio of the genetic variance to the trait variance.
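As a rough numerical check of how variances sum, here is a minimal Python sketch. The function name `simulate_trait`, the standard deviations (2 for the genetic part, 1 for the environmental part), and the use of normal distributions are all assumptions chosen for illustration, not values from any real trait.

```python
import random
import statistics


def simulate_trait(n=100_000, sd_genetic=2.0, sd_environment=1.0, seed=0):
    """Draw independent genetic and environmental deviations and add them.

    The normal distributions and the default standard deviations are
    illustrative assumptions only.
    """
    rng = random.Random(seed)
    genetic = [rng.gauss(0.0, sd_genetic) for _ in range(n)]
    environment = [rng.gauss(0.0, sd_environment) for _ in range(n)]
    trait = [g + e for g, e in zip(genetic, environment)]
    return genetic, environment, trait


genetic, environment, trait = simulate_trait()

var_g = statistics.pvariance(genetic)      # close to 2**2 = 4
var_e = statistics.pvariance(environment)  # close to 1**2 = 1
var_trait = statistics.pvariance(trait)    # close to 4 + 1 = 5

# Standard deviations do not add: the trait SD is near sqrt(5), not 2 + 1 = 3.
sd_trait = statistics.pstdev(trait)

# Heritability is the ratio of genetic variance to trait variance (near 0.8).
heritability = var_g / var_trait

print(f"var_g + var_e = {var_g + var_e:.3f}, var_trait = {var_trait:.3f}")
print(f"sd_trait = {sd_trait:.3f}, heritability = {heritability:.3f}")
```

Because the two components are drawn independently, the small gap between `var_g + var_e` and `var_trait` is just twice the sample covariance, which shrinks as `n` grows.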
Consider a trait that varies, like height. If you plot a histogram of the heights of males or females, you will get approximately a normal distribution. Heritability is about what determines the variance of the distribution and not the mean. That is not to say that the mean does not depend on genetics. Obviously, humans are taller than rhesus monkeys and that has everything to do with the genes. However, the mean is mostly determined by the genetic (and environmental) components that everyone shares. The variance is about what is different between people and that is what we can measure. For example, say two people are 178 cm and 176 cm tall and are genetically identical except for a handful of genetic factors. If they were subjected to identical environmental conditions then we could attribute the difference in height to those genetic differences. Obviously, there will be many other genetic factors that specify why the height is on average 177 cm and not say 100 cm, namely all the genetic factors that are identical. However, with this information there is no way to figure out which of those identical genes are responsible for height as opposed to, say, kidney function. The difference between individuals is also what natural selection can work on. That is why population genetics is so focused on variances.
In my previous post on population genetics, I introduced the concept of additive genetic effects. These are genes or more technically alleles whose contributions to the trait are independent of other genes or the environment. What this means is that if you want to know the difference from the mean, you simply add up the contributions of all the additive alleles that influence that trait. The genetic variance can thus be divided into additive genetic variance and non-additive genetic variance. The non-additive parts include everything that has a nonlinear effect such as dominance, where the presence of just one allele contributes as much as two of the same allele, or epistasis where alleles act differently depending on what other alleles are present, or gene-environment effects where the contribution of an allele changes depending on the environment. The fraction of the variance explained by the additive genetic effects is called the narrow-sense heritability as opposed to the broad-sense heritability, which includes all the genetic effects.
The classical way to measure narrow-sense heritability is to take a group of close relatives, say mothers and daughters, and plot the height of daughters versus the height of mothers. The best fit line picks up the additive genetic effects. If we standardize the heights of each generation, i.e. rescale the heights so that the mean is zero and the standard deviation is one, then the slope of the line is given by the correlation between the heights of the daughters and mothers. Note that the magnitude of a correlation is at most one, and in practice it is less than one. Hence, on average daughters will be closer to the mean than their mothers. This is called regression to the mean. Mothers and daughters share exactly half of their genetic material. The heritability is thus twice the slope (i.e. slope divided by the coefficient of relatedness). If you plot the height deviations of the daughters against the average of the height deviations of the two parents, then the slope is the heritability. What this means is that you can estimate the average height of your children, or any other heritable trait, by taking the average deviation from the mean of you and your spouse and multiplying by the narrow-sense heritability. The narrow-sense heritability of height is about 0.8, so if you and your spouse are both two standard deviations above the mean, then the average of your children will be 1.6 standard deviations above the mean. If the heritability of the trait is zero, then the average of your children will be the population mean. A recently developed method, as I described in a previous post, can estimate the heritability contained in a set of genetic markers for a population of strangers.
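The two regression slopes can be checked with a toy simulation. The sketch below is not a model of real height data. It assumes a purely additive trait standardized to unit variance, random mating, no environment shared within families, and a child whose additive value is the parental average plus segregation noise with half the additive variance. The names `simulate_families` and `slope` are hypothetical helpers introduced for this illustration.

```python
import random
import statistics


def simulate_families(n=50_000, h2=0.8, seed=1):
    """Simulate standardized traits for mother, father, and one child.

    Assumptions (illustrative only): purely additive genetics, random
    mating, independent environments, total trait variance equal to 1.
    """
    rng = random.Random(seed)
    sd_a = h2 ** 0.5             # SD of the additive genetic value
    sd_e = (1.0 - h2) ** 0.5     # SD of the environmental deviation
    sd_seg = (h2 / 2.0) ** 0.5   # SD of segregation noise in the child
    mothers, midparents, children = [], [], []
    for _ in range(n):
        a_mother = rng.gauss(0.0, sd_a)
        a_father = rng.gauss(0.0, sd_a)
        mother = a_mother + rng.gauss(0.0, sd_e)
        father = a_father + rng.gauss(0.0, sd_e)
        # The child gets the average parental additive value plus noise.
        a_child = 0.5 * (a_mother + a_father) + rng.gauss(0.0, sd_seg)
        child = a_child + rng.gauss(0.0, sd_e)
        mothers.append(mother)
        midparents.append(0.5 * (mother + father))
        children.append(child)
    return mothers, midparents, children


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / len(x)
    return cov / statistics.pvariance(x)


mothers, midparents, children = simulate_families()
slope_mother = slope(mothers, children)        # expected near h2 / 2
slope_midparent = slope(midparents, children)  # expected near h2

print(f"child on mother:    {slope_mother:.3f}, twice that: {2 * slope_mother:.3f}")
print(f"child on midparent: {slope_midparent:.3f}")
```

Under these assumptions, doubling the single-parent slope and reading off the midparent slope both land near the heritability of 0.8 that was fed in.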
These days, most biologists seem to downplay the importance of additive genetic effects. To me this is a perfect example of discounting the obvious, as I blogged about before. Most people seem to believe that the interaction of genes, or epistasis, must be more important. What I like to say is that epistasis is likely to be important for biology but additive genetic effects are most important for natural selection. The reason is that we inherit genes and not genotypes. Mozart may have been the genius he was because of the specific combination of genes that he possessed but that perfect combination would not be passed on to his children. Thus any allele that confers an advantage will likely only persist in the population if it confers an advantage additively. However, in cases where the population is small and there is some inbreeding, combinations of genes that confer a large advantage together but little individually could become fixed in the population. Hence, the way I see evolution proceeding is that it takes small additive steps and then every once in a while it takes a big nonadditive step.
The genetic variation between people can be divided into common and rare variants. The human genome has about ten million common single nucleotide polymorphisms (SNPs) but each individual will also carry many rare mutations. However, it is possible that the variation in the common variants alone could lead to mind-boggling differences in phenotype. Consider an example due to Steve Hsu. Suppose a trait depends on 400 alleles and there is a 50% chance of getting each of these alleles. Then on average you will have 400*0.5 = 200 alleles and the variance around this mean will be 400*0.5*0.5 = 100. Hence, the standard deviation will be 10 alleles. That means about 95% of the population will have between 180 and 220 alleles. This also means on average each allele contributes 0.1 standard deviations. A superoutlier who is four standard deviations above the mean has 240 alleles. That still leaves a lot of room for improvement. If you happened to have all of the alleles, which has a probability of one half to the 400th power, you would be 20 standard deviations above the mean! Now, it could be that nonlinear effects kick in to saturate the effect if you have lots of alleles. I wouldn’t expect that any person could be 20 standard deviations above the mean but some traits could have great room for expansion, and selective breeding on additive effects in animals has shown dramatic increases in phenotype.
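The arithmetic in this example is just the mean and variance of a binomial distribution, which can be spelled out in a few lines of Python. The variable names are mine, and the assumption, as in the example, is that each of the 400 alleles is present independently with probability 0.5 and contributes equally.

```python
import math

n_alleles = 400   # number of alleles affecting the trait
p = 0.5           # chance of carrying each allele

mean = n_alleles * p                 # 200 alleles on average
variance = n_alleles * p * (1 - p)   # 100
sd = math.sqrt(variance)             # 10 alleles

# About 95% of a normal-like distribution lies within two SDs of the mean.
low, high = mean - 2 * sd, mean + 2 * sd   # 180 to 220 alleles

effect_per_allele = 1 / sd           # each allele is worth 0.1 SD
alleles_at_4sd = mean + 4 * sd       # a four-SD outlier has 240 alleles

# Someone carrying all 400 alleles sits 20 SDs above the mean ...
z_all = (n_alleles - mean) / sd
# ... but the chance of that is (1/2)**400, on the order of 1e-121.
prob_all = p ** n_alleles

print(mean, variance, sd, (low, high), effect_per_allele, alleles_at_4sd, z_all)
print(f"{prob_all:.2e}")
```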