<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>How to Calculate EPDs · beefblup</title><script data-outdated-warner src="../assets/warner.js"></script><link rel="canonical" href="https://millironx.com/beefblup/how-to-calculate-epds/"/><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.039/juliamono-regular.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.11/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">beefblup</a></span></div><form class="docs-search" action="../search/"><input class="docs-search-query" id="documenter-search-query" name="q" type="text" placeholder="Search docs"/></form><ul 
class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li class="is-active"><a class="tocitem" href>How to Calculate EPDs</a><ul class="internal"><li><a class="tocitem" href="#The-mathematical-model"><span>The mathematical model</span></a></li><li><a class="tocitem" href="#The-statistical-model:-the-setup"><span>The statistical model: the setup</span></a></li><li><a class="tocitem" href="#The-statistical-model:-environment-as-fixed-effects"><span>The statistical model: environment as fixed effects</span></a></li><li><a class="tocitem" href="#The-statistical-model:-genotype-as-random-effect"><span>The statistical model: genotype as random effect</span></a></li><li><a class="tocitem" href="#Solving-the-equations"><span>Solving the equations</span></a></li><li><a class="tocitem" href="#Footnotes"><span>Footnotes</span></a></li></ul></li><li><a class="tocitem" href="../beefblup-cli/">CLI Reference (WIP)</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>How to Calculate EPDs</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>How to Calculate EPDs</a></li></ul></nav><div class="docs-right"><a class="docs-edit-link" href="https://github.com/MillironX/beefblup/blob/master/docs/src/how-to-calculate-epds.md#" title="Edit on GitHub"><span class="docs-icon fab"></span><span class="docs-label is-hidden-touch">Edit on GitHub</span></a><a class="docs-settings-button fas fa-cog" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-sidebar-button fa fa-bars is-hidden-desktop" id="documenter-sidebar-button" 
href="#"></a></div></header><article class="content" id="documenter-page"><h1 id="How-to-Calculate-EPDs"><a class="docs-heading-anchor" href="#How-to-Calculate-EPDs">How to Calculate EPDs</a><a id="How-to-Calculate-EPDs-1"></a><a class="docs-heading-anchor-permalink" href="#How-to-Calculate-EPDs" title="Permalink"></a></h1><p>Not to exclude our Australian comrades or our dairy friends, this guide could alternately be called</p><ul><li>How to Calculate Estimated Breeding Values (EBVs)</li><li>How to Calculate Predicted Transmitting Abilities (PTAs)</li><li>How to Calculate Expected Progeny Differences (EPDs)</li></ul><p>Since I&#39;m mostly talking to American beef producers, though, we&#39;ll stick with EPDs for most of this discussion.</p><p>Estimated Breeding Values (EBVs) (which are more often halved and published as Expected Progeny Differences (EPDs) or Predicted Transmitting Abilities (PTAs) in the United States) are generally found using Charles Henderson&#39;s linear mixed-model equations. Great, you say, what is that? I&#39;m glad you asked...</p><h2 id="The-mathematical-model"><a class="docs-heading-anchor" href="#The-mathematical-model">The mathematical model</a><a id="The-mathematical-model-1"></a><a class="docs-heading-anchor-permalink" href="#The-mathematical-model" title="Permalink"></a></h2><p>Every genetics textbook starts with the following equation</p><p class="math-container">\[P = G + E\]</p><p>Where:</p><ul><li><span>$P$</span> = phenotype</li><li><span>$G$</span> = genotype (think: breeding value)</li><li><span>$E$</span> = environmental factors</li></ul><p>Now, we can&#39;t identify <em>every</em> environmental factor that affects phenotype, but we can identify some of them, so let&#39;s substitute <span>$E$</span> with some absolutes. 
A good place to start is the &quot;contemporary group&quot; listings for the trait of interest in the <a href="https://beefimprovement.org/wp-content/uploads/2018/03/BIFGuidelinesFinal_updated0318.pdf">BIF Guidelines</a>, though for the purposes of this example, I&#39;m only going to consider sex and birth year.</p><p class="math-container">\[P = G + E_{year} + E_{sex}\]</p><p>Where:</p><ul><li><span>$E_n$</span> is the effect of <span>$n$</span> on the phenotype</li></ul><p>Now let&#39;s say I want to find the yearling weight breeding value (<span>$G$</span>) of my favorite herd bull. I compile his stats, and then plug them into the equation and solve for <span>$G$</span>, right? Let&#39;s try that.</p><h3 id="Calf-Records"><a class="docs-heading-anchor" href="#Calf-Records">Calf Records</a><a id="Calf-Records-1"></a><a class="docs-heading-anchor-permalink" href="#Calf-Records" title="Permalink"></a></h3><table><tr><th style="text-align: left">ID</th><th style="text-align: left">Birth Year</th><th style="text-align: left">Sex</th><th style="text-align: left">YW (kg)</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">1990</td><td style="text-align: left">Male</td><td style="text-align: left">354</td></tr></table><p class="math-container">\[354 \ \textup{kg} = G_1 + E_{1990} + E_{male}\]</p><p>Hmm. I just realized I don&#39;t know any of those <span>$E$</span> values. 
Come to think of it, I remember from math class that I will need as many equations as I have unknowns, so I will add equations for other animals that I have records for.</p><h3 id="Calf-Records-2"><a class="docs-heading-anchor" href="#Calf-Records-2">Calf Records</a><a class="docs-heading-anchor-permalink" href="#Calf-Records-2" title="Permalink"></a></h3><table><tr><th style="text-align: left">ID</th><th style="text-align: left">Birth Year</th><th style="text-align: left">Sex</th><th style="text-align: left">YW (kg)</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">1990</td><td style="text-align: left">Male</td><td style="text-align: left">354</td></tr><tr><td style="text-align: left">2</td><td style="text-align: left">1990</td><td style="text-align: left">Female</td><td style="text-align: left">251</td></tr><tr><td style="text-align: left">3</td><td style="text-align: left">1991</td><td style="text-align: left">Male</td><td style="text-align: left">327</td></tr><tr><td style="text-align: left">4</td><td style="text-align: left">1991</td><td style="text-align: left">Female</td><td style="text-align: left">328</td></tr><tr><td style="text-align: left">5</td><td style="text-align: left">1991</td><td style="text-align: left">Male</td><td style="text-align: left">301</td></tr><tr><td style="text-align: left">6</td><td style="text-align: left">1991</td><td style="text-align: left">Female</td><td style="text-align: left">270</td></tr><tr><td style="text-align: left">7</td><td style="text-align: left">1992</td><td style="text-align: left">Male</td><td style="text-align: left">330</td></tr></table><p class="math-container">\[\begin{aligned}
251 \ \textup{kg} &amp;= G_2 + E_{1990} + E_{female} \\
327 \ \textup{kg} &amp;= G_3 + E_{1991} + E_{male} \\
328 \ \textup{kg} &amp;= G_4 + E_{1991} + E_{female} \\
301 \ \textup{kg} &amp;= G_5 + E_{1991} + E_{male} \\
270 \ \textup{kg} &amp;= G_6 + E_{1991} + E_{female} \\
330 \ \textup{kg} &amp;= G_7 + E_{1992} + E_{male}
\end{aligned}\]</p><p>Drat! Every animal I added brings more variables into the system than it eliminates! In fact, since each cow brings in <em>at least</em> one term (<span>$G_n$</span>), I will never be able to write enough equations to solve for <span>$G$</span> numerically. I will have to use a different approach.</p><h2 id="The-statistical-model:-the-setup"><a class="docs-heading-anchor" href="#The-statistical-model:-the-setup">The statistical model: the setup</a><a id="The-statistical-model:-the-setup-1"></a><a class="docs-heading-anchor-permalink" href="#The-statistical-model:-the-setup" title="Permalink"></a></h2><p>Since I can never solve for <span>$G$</span> directly, I will have to find some way to estimate it. I can switch to a statistical model and solve for <span>$G$</span> that way. The caveat with a statistical model is that there will be some level of error, but so long as we know and can control the level of error, that will be better than not knowing <span>$G$</span> at all.</p><p>Since we&#39;re switching into a statistical space, we should also switch the variables we&#39;re using. I&#39;ll rewrite the first equation as</p><p class="math-container">\[y = b + u + e\]</p><p>Where:</p><ul><li><span>$y$</span> = Phenotype</li><li><span>$b$</span> = Environment</li><li><span>$u$</span> = Genotype</li><li><span>$e$</span> = Error</li></ul><p>It&#39;s not as easy as simply substituting <span>$b$</span> for every <span>$E$</span> that we had above, however. The reason for that is that we must make the assumption that environment is a <strong>fixed effect</strong> and that genotype is a <strong>random effect</strong>. 
I&#39;ll go over why that is later, but for now, understand that we need to transform the environment terms and genotype terms separately.</p><p>We&#39;ll start with the environment terms.</p><h2 id="The-statistical-model:-environment-as-fixed-effects"><a class="docs-heading-anchor" href="#The-statistical-model:-environment-as-fixed-effects">The statistical model: environment as fixed effects</a><a id="The-statistical-model:-environment-as-fixed-effects-1"></a><a class="docs-heading-anchor-permalink" href="#The-statistical-model:-environment-as-fixed-effects" title="Permalink"></a></h2><p>To properly transform the equations, I will have to introduce <span>$b_{mean}$</span> terms in each animal&#39;s equation. This is part of the fixed effect statistical assumption, and it will let us obtain a solution.</p><p>Here are the transformed equations:</p><p class="math-container">\[\begin{aligned}
354 \ \textup{kg} &amp;= u_1 + b_{mean} + b_{1990} + b_{male} + e_1 \\
251 \ \textup{kg} &amp;= u_2 + b_{mean} + b_{1990} + b_{female} + e_2 \\
327 \ \textup{kg} &amp;= u_3 + b_{mean} + b_{1991} + b_{male} + e_3 \\
328 \ \textup{kg} &amp;= u_4 + b_{mean} + b_{1991} + b_{female} +e_4 \\
301 \ \textup{kg} &amp;= u_5 + b_{mean} + b_{1991} + b_{male} + e_5 \\
270 \ \textup{kg} &amp;= u_6 + b_{mean} + b_{1991} + b_{female} + e_6 \\
330 \ \textup{kg} &amp;= u_7 + b_{mean} + b_{1992} + b_{male} + e_7
\end{aligned}\]</p><p>Statistical methods work best in matrix form, so I&#39;m going to convert the set of equations above to a single matrix equation that means the exact same thing.</p><p class="math-container">\[\begin{bmatrix}
354 \ \textup{kg} \\
251 \ \textup{kg} \\
327 \ \textup{kg} \\
328 \ \textup{kg} \\
301 \ \textup{kg} \\
270 \ \textup{kg} \\
330 \ \textup{kg}
\end{bmatrix}
=
\begin{bmatrix}
u_1 \\
u_2 \\
u_3 \\
u_4 \\
u_5 \\
u_6 \\
u_7
\end{bmatrix}
+
b_{mean}
+
\begin{bmatrix}
b_{1990} \\
b_{1990} \\
b_{1991} \\
b_{1991} \\
b_{1991} \\
b_{1991} \\
b_{1992}
\end{bmatrix}
+
\begin{bmatrix}
b_{male} \\
b_{female} \\
b_{male} \\
b_{female} \\
b_{male} \\
b_{female} \\
b_{male}
\end{bmatrix}
+
\begin{bmatrix}
e_1 \\
e_2 \\
e_3 \\
e_4 \\
e_5 \\
e_6 \\
e_7
\end{bmatrix}\]</p><p>That&#39;s a nice equation, but now my hand is getting tired writing all those <span>$b$</span> terms over and over again, so I&#39;m going to use <a href="https://www.khanacademy.org/math/precalculus/x9e81a4f98389efdf:matrices/x9e81a4f98389efdf:multiplying-matrices-by-matrices/v/matrix-multiplication-intro">matrix multiplication</a> to condense this down.</p><p class="math-container">\[\begin{bmatrix}
354 \ \textup{kg} \\
251 \ \textup{kg} \\
327 \ \textup{kg} \\
328 \ \textup{kg} \\
301 \ \textup{kg} \\
270 \ \textup{kg} \\
330 \ \textup{kg}
\end{bmatrix}
=
\begin{bmatrix}
u_1 \\
u_2 \\
u_3 \\
u_4 \\
u_5 \\
u_6 \\
u_7
\end{bmatrix}
+
\begin{bmatrix}
1 &amp; 1 &amp; 0 &amp; 0 &amp; 1 &amp; 0 \\
1 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 1 \\
1 &amp; 0 &amp; 1 &amp; 0 &amp; 1 &amp; 0 \\
1 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 1 \\
1 &amp; 0 &amp; 1 &amp; 0 &amp; 1 &amp; 0 \\
1 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 1 \\
1 &amp; 0 &amp; 0 &amp; 1 &amp; 1 &amp; 0
\end{bmatrix}
\begin{bmatrix}
b_{mean} \\
b_{1990} \\
b_{1991} \\
b_{1992} \\
b_{male} \\
b_{female}
\end{bmatrix}
+
\begin{bmatrix}
e_1 \\
e_2 \\
e_3 \\
e_4 \\
e_5 \\
e_6 \\
e_7
\end{bmatrix}\]</p><p>That matrix in the middle with all the zeros and ones is called the <strong>incidence matrix</strong>, and essentially reads like a table with each row corresponding to an animal, and each column corresponding to a fixed effect. For brevity, we&#39;ll just call it <span>$X$</span>, though. One indicates that the animal and effect go together, and zero means they don&#39;t. For our record, we could write a table to go with <span>$X$</span>, and it would look like this:</p><table><tr><th style="text-align: left">Animal</th><th style="text-align: left">mean</th><th style="text-align: left">1990</th><th style="text-align: left">1991</th><th style="text-align: left">1992</th><th style="text-align: left">male</th><th style="text-align: left">female</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">2</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">3</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">4</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">5</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td 
style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">6</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">7</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr></table><p>Now that we have <span>$X$</span>, we have the ability to start making changes to allow us to solve for <span>$u$</span>. Immediately, we see that <span>$X$</span> is <strong>rank-deficient</strong> (its columns are not all linearly independent), meaning the system can&#39;t be solved directly. We kind of already knew that, but now we can quantify it. We calculate the <a href="https://math.stackexchange.com/a/2080577">rank of <span>$X$</span></a>, and find that there is only enough information contained in it to solve for 4 variables, which means we need to eliminate two columns.</p><p>There are several ways to effectively eliminate fixed effects in this type of system, but one of the simplest and most common methods is to declare a <strong>base population</strong>, and lump the fixed effects of animals within the base population into the mean fixed effect. Note that it is possible to declare a base population that has no animals in it, but that gives weird results. 
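</p><p>One way to see where the rank of 4 comes from, and why dropping one year column and one sex column is enough, is sketched here. I&#39;m writing <span>$x_{name}$</span> for the column of the full <span>$X$</span> that belongs to the effect <span>$name$</span>, a notation used only in this aside. Every animal has exactly one birth year and exactly one sex, so both groups of columns add up to the mean column:</p><p class="math-container">\[\begin{aligned}
x_{mean} &amp;= x_{1990} + x_{1991} + x_{1992} \\
x_{mean} &amp;= x_{male} + x_{female} \\
\textup{rank}(X) &amp;= 6 \ \textup{columns} - 2 \ \textup{dependencies} = 4
\end{aligned}\]</p><p>Removing one column from each group breaks both dependencies, which is what declaring a base population does. 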
For this example, we&#39;ll follow the convention built into <code>beefblup</code> and pick the last occurring form of each variable.</p><h3 id="Base-population"><a class="docs-heading-anchor" href="#Base-population">Base population</a><a id="Base-population-1"></a><a class="docs-heading-anchor-permalink" href="#Base-population" title="Permalink"></a></h3><ul><li><strong>Year</strong>: 1992</li><li><strong>Sex</strong>: Female</li></ul><p>Now in order to use the base population, we simply drop the columns representing conformity with the traits in the base population from <span>$X$</span>. Our new equation looks like</p><p class="math-container">\[\begin{bmatrix}
354 \ \textup{kg} \\
251 \ \textup{kg} \\
327 \ \textup{kg} \\
328 \ \textup{kg} \\
301 \ \textup{kg} \\
270 \ \textup{kg} \\
330 \ \textup{kg}
\end{bmatrix}
=
\begin{bmatrix}
u_1 \\
u_2 \\
u_3 \\
u_4 \\
u_5 \\
u_6 \\
u_7
\end{bmatrix}
+
\begin{bmatrix}
1 &amp; 1 &amp; 0 &amp; 1 \\
1 &amp; 1 &amp; 0 &amp; 0 \\
1 &amp; 0 &amp; 1 &amp; 1 \\
1 &amp; 0 &amp; 1 &amp; 0 \\
1 &amp; 0 &amp; 1 &amp; 1 \\
1 &amp; 0 &amp; 1 &amp; 0 \\
1 &amp; 0 &amp; 0 &amp; 1
\end{bmatrix}
\begin{bmatrix}
b_{mean} \\
b_{1990} \\
b_{1991} \\
b_{male}
\end{bmatrix}
+
\begin{bmatrix}
e_1 \\
e_2 \\
e_3 \\
e_4 \\
e_5 \\
e_6 \\
e_7
\end{bmatrix}\]</p><p>And the table for humans to understand:</p><table><tr><th style="text-align: left">Animal</th><th style="text-align: left">mean</th><th style="text-align: left">1990</th><th style="text-align: left">1991</th><th style="text-align: left">male</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">2</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">3</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">4</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">5</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left">6</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">yes</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left">7</td><td style="text-align: left">yes</td><td style="text-align: left">no</td><td style="text-align: left">no</td><td style="text-align: left">yes</td></tr></table><p>Even though each animal is said to participate in the mean, the result for the mean will now actually be the average of the base population. 
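</p><p>To see why, take the expected value of a record for a few kinds of animals. This is only a sketch: it assumes the genotype and error terms average out to zero, which is the usual mixed-model assumption but has not been stated in this guide so far, and it writes <span>$y_{year, sex}$</span> for the record of an animal of that year and sex, a notation used only in this aside.</p><p class="math-container">\[\begin{aligned}
\textup{E}(y_{1992, female}) &amp;= b_{mean} \\
\textup{E}(y_{1992, male}) &amp;= b_{mean} + b_{male} \\
\textup{E}(y_{1990, female}) &amp;= b_{mean} + b_{1990} \\
\textup{E}(y_{1991, male}) &amp;= b_{mean} + b_{1991} + b_{male}
\end{aligned}\]</p><p>Only a 1992 female is left with <span>$b_{mean}$</span> alone, and every other effect reads as a difference from that base animal. 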
Math is weird sometimes.</p><p>Double-checking, the rank of <span>$X$</span> is still 4, so we can solve for the average of the base population, and the effect of being born in 1990, the effect of being born in 1991, and the effect of being male.</p><p>Whew! That was some transformation. We still haven&#39;t constrained this model enough to solve it, though. Now on to the genotype.</p><h2 id="The-statistical-model:-genotype-as-random-effect"><a class="docs-heading-anchor" href="#The-statistical-model:-genotype-as-random-effect">The statistical model: genotype as random effect</a><a id="The-statistical-model:-genotype-as-random-effect-1"></a><a class="docs-heading-anchor-permalink" href="#The-statistical-model:-genotype-as-random-effect" title="Permalink"></a></h2><p>Remember I said above that genotype was a <strong>random effect</strong>? Statisticians say &quot;<em>a random effect is an effect that influences the variance and not the mean of the observation in question.</em>&quot; I&#39;m not sure exactly what that means or how that is applicable to genotype, but it does let us add an additional constraint to our model.</p><p>The basic gist of genetics is that organisms that are related to one another are similar to one another. 
Based on a pedigree, we can even say how related to one another animals are, and quantify that as the amount that the genotype terms should be allowed to vary between related animals.</p><p>We&#39;ll need a pedigree for our animals:</p><h3 id="Calf-Records-3"><a class="docs-heading-anchor" href="#Calf-Records-3">Calf Records</a><a class="docs-heading-anchor-permalink" href="#Calf-Records-3" title="Permalink"></a></h3><table><tr><th style="text-align: left">ID</th><th style="text-align: left">Sire</th><th style="text-align: left">Dam</th><th style="text-align: left">Birth Year</th><th style="text-align: left">Sex</th><th style="text-align: left">YW (kg)</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">NA</td><td style="text-align: left">NA</td><td style="text-align: left">1990</td><td style="text-align: left">Male</td><td style="text-align: left">354</td></tr><tr><td style="text-align: left">2</td><td style="text-align: left">NA</td><td style="text-align: left">NA</td><td style="text-align: left">1990</td><td style="text-align: left">Female</td><td style="text-align: left">251</td></tr><tr><td style="text-align: left">3</td><td style="text-align: left">1</td><td style="text-align: left">NA</td><td style="text-align: left">1991</td><td style="text-align: left">Male</td><td style="text-align: left">327</td></tr><tr><td style="text-align: left">4</td><td style="text-align: left">1</td><td style="text-align: left">NA</td><td style="text-align: left">1991</td><td style="text-align: left">Female</td><td style="text-align: left">328</td></tr><tr><td style="text-align: left">5</td><td style="text-align: left">1</td><td style="text-align: left">2</td><td style="text-align: left">1991</td><td style="text-align: left">Male</td><td style="text-align: left">301</td></tr><tr><td style="text-align: left">6</td><td style="text-align: left">NA</td><td style="text-align: left">2</td><td style="text-align: left">1991</td><td style="text-align: 
left">Female</td><td style="text-align: left">270</td></tr><tr><td style="text-align: left">7</td><td style="text-align: left">NA</td><td style="text-align: left">NA</td><td style="text-align: left">1992</td><td style="text-align: left">Male</td><td style="text-align: left">330</td></tr></table><p>Now, because cows sexually reproduce, each animal shares half of its genotype with either parent (exception: inbreeding, see below). It should go without saying that each animal&#39;s genotype is identical to that of itself. From this we can then find the numerical multiplier for any relative (grandparent = 1/4, full sibling = 1/2, half sibling = 1/4, etc.). Let&#39;s write those values down in a table.</p><table><tr><th style="text-align: left">ID</th><th style="text-align: left">1</th><th style="text-align: left">2</th><th style="text-align: left">3</th><th style="text-align: left">4</th><th style="text-align: left">5</th><th style="text-align: left">6</th><th style="text-align: left">7</th></tr><tr><td style="text-align: left">1</td><td style="text-align: left">1</td><td style="text-align: left">0</td><td style="text-align: left">1/2</td><td style="text-align: left">1/2</td><td style="text-align: left">1/2</td><td style="text-align: left">0</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">2</td><td style="text-align: left">0</td><td style="text-align: left">1</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">1/2</td><td style="text-align: left">1/2</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">3</td><td style="text-align: left">1/2</td><td style="text-align: left">0</td><td style="text-align: left">1</td><td style="text-align: left">1/4</td><td style="text-align: left">1/4</td><td style="text-align: left">0</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">4</td><td style="text-align: left">1/2</td><td style="text-align: 
left">0</td><td style="text-align: left">1/4</td><td style="text-align: left">1</td><td style="text-align: left">1/4</td><td style="text-align: left">0</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">5</td><td style="text-align: left">1/2</td><td style="text-align: left">1/2</td><td style="text-align: left">1/4</td><td style="text-align: left">1/4</td><td style="text-align: left">1</td><td style="text-align: left">1/4</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">6</td><td style="text-align: left">0</td><td style="text-align: left">1/2</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">1/4</td><td style="text-align: left">1</td><td style="text-align: left">0</td></tr><tr><td style="text-align: left">7</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">0</td><td style="text-align: left">1</td></tr></table><p>Hmm. All those numbers look suspiciously like a matrix. Why don&#39;t I put them into a matrix called <span>$A$</span>?</p><p class="math-container">\[\begin{bmatrix}
1 &amp; 0 &amp; \frac{1}{2} &amp; \frac{1}{2} &amp; \frac{1}{2} &amp; 0 &amp; 0 \\
0 &amp; 1 &amp; 0 &amp; 0 &amp; \frac{1}{2} &amp; \frac{1}{2} &amp; 0 \\
\frac{1}{2} &amp; 0 &amp; 1 &amp; \frac{1}{4} &amp; \frac{1}{4} &amp; 0 &amp; 0 \\
\frac{1}{2} &amp; 0 &amp; \frac{1}{4} &amp; 1 &amp; \frac{1}{4} &amp; 0 &amp; 0 \\
\frac{1}{2} &amp; \frac{1}{2} &amp; \frac{1}{4} &amp; \frac{1}{4} &amp; 1 &amp; \frac{1}{4} &amp; 0 \\
0 &amp; \frac{1}{2} &amp; 0 &amp; 0 &amp; \frac{1}{4} &amp; 1 &amp; 0 \\
0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1
\end{bmatrix}\]</p><p>Now I&#39;m going to take the matrix with all of the <span>$u$</span> values, and call it <span>$μ$</span>. To quantify the idea of genetic relationship, I will then say that</p><p class="math-container">\[\textup{var}(μ) = A σ_μ^2\]</p><p>Where:</p><ul><li><span>$A$</span> = the relationship matrix defined above</li><li><span>$σ_μ^2$</span> = the variance of all the genotypes</li></ul><p>To fully constrain the system, I have to make two more assumptions: 1) that the error term in each animal&#39;s equation is independent from all other error terms, and 2) that the error term for each animal is independent from the value of the genotype. I will call the matrix holding the <span>$e$</span> values <span>$ϵ$</span> and then say</p><p class="math-container">\[\textup{var}(ϵ) = I σ_ϵ^2\]</p><p class="math-container">\[\textup{cov}(μ, ϵ) = \textup{cov}(ϵ, μ) = 0\]</p><p>Substituting in the matrix names, our equation now looks like</p><p class="math-container">\[\begin{bmatrix}
354 \ \textup{kg} \\
251 \ \textup{kg} \\
327 \ \textup{kg} \\
328 \ \textup{kg} \\
301 \ \textup{kg} \\
270 \ \textup{kg} \\
330 \ \textup{kg}
\end{bmatrix}
= μ + X
\begin{bmatrix}
b_{mean} \\
b_{1990} \\
b_{1991} \\
b_{male}
\end{bmatrix}
+ ϵ\]</p><p>We are going to make three changes to this equation before we are ready to solve it, but they are cosmetic details for this example.</p><ol><li>Call the matrix on the left side of the equation <span>$Y$</span> (sometimes it&#39;s called the <strong>matrix of observations</strong>)</li><li>Multiply <span>$μ$</span> by an identity matrix called <span>$Z$</span>. Multiplying by the identity matrix is the matrix form of multiplying by one, so nothing changes, but if we later want to find one animal&#39;s genetic effect on another animal&#39;s performance (e.g. a <strong>maternal effects model</strong>), we can alter <span>$Z$</span> to allow that</li><li>Call the matrix with all the <span>$b$</span> values <span>$β$</span>.</li></ol><p>With all these changes, we now have</p><p class="math-container">\[Y = Z μ + X β + ϵ\]</p><p>This is the canonical form of the mixed-model equation, and the form that Charles Henderson used to first predict breeding values of livestock.</p><h2 id="Solving-the-equations"><a class="docs-heading-anchor" href="#Solving-the-equations">Solving the equations</a><a id="Solving-the-equations-1"></a><a class="docs-heading-anchor-permalink" href="#Solving-the-equations" title="Permalink"></a></h2><p>Henderson proved that the mixed-model equation can be solved by the following:</p><p class="math-container">\[\begin{bmatrix}
\hat{β} \\
\hat{μ}
\end{bmatrix}
=
\begin{bmatrix}
X&#39;X &amp; X&#39;Z \\
Z&#39;X &amp; Z&#39;Z+A^{-1}λ
\end{bmatrix}^{-1}
\begin{bmatrix}
X&#39;Y \\
Z&#39;Y
\end{bmatrix}\]</p><p>Where</p><ul><li>The variables with hats are the statistical estimates of their mixed-model counterparts<ul><li>The predicted value of <span>$μ$</span> is called the <em>Best Linear Unbiased Predictor</em> or <em>BLUP</em></li><li>The estimated value of <span>$β$</span> is called the <em>Best Linear Unbiased Estimate</em> or <em>BLUE</em></li></ul></li><li>&#39; is the transpose operator</li><li><span>$λ$</span> is a single real number that is a function of the heritability for the trait being predicted. It can be left out in many cases (<span>$λ = 1$</span>), which is equivalent to assuming <span>$h^2 = 0.5$</span>.<ul><li><span>$λ = \frac{1-h^2}{h^2}$</span>, where <span>$h^2$</span> is the heritability</li></ul></li></ul><p>What happened to</p><h2 id="Footnotes"><a class="docs-heading-anchor" href="#Footnotes">Footnotes</a><a id="Footnotes-1"></a><a class="docs-heading-anchor-permalink" href="#Footnotes" title="Permalink"></a></h2><h3 id="Exception"><a class="docs-heading-anchor" href="#Exception">Exception</a><a id="Exception-1"></a><a class="docs-heading-anchor-permalink" href="#Exception" title="Permalink"></a></h3><p>An animal <strong>can</strong> share its genome with itself by a factor of more than one: that&#39;s called inbreeding! We can account for this, and <code>beefblup</code> does as it calculates <span>$A$</span>. This is an area that actually merits a good deal of study: see chapter 2 of <em>Linear Models for the Prediction of Animal Breeding Values</em> by Raphael A. 
Mrode (ISBN 978 1 78064 391 5).</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« Home</a><a class="docs-footer-nextpage" href="../beefblup-cli/">CLI Reference (WIP) »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.5 on <span class="colophon-date" title="Wednesday 1 September 2021 01:08">Wednesday 1 September 2021</span>. Using Julia version 1.5.4.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>