Explaining A- and B-Basis Values (Stefan Kloppenborg, kloppenborg.ca, 2022-11-21)

<p>I&#8217;ve been thinking about how to explain an A- or B-Basis value to people without much statistical knowledge. These are the names used in aircraft certification for the lower tolerance bounds on material strength. The definition used for transport category aircraft is given in <a href="https://drs.faa.gov/browse/excelExternalWindow/B30DF8E84C71FFBD86256D7A00712EAD.0001">14 <span class="caps">CFR</span> 25.613</a> (the same definition is used for other categories of aircraft too). This definition is precise, but not easy to&nbsp;understand.</p> <blockquote> <p>&#8230; (b) Material design values must be chosen to minimize the probability of structural failures due to material variability. 
&#8230; compliance must be shown by selecting material design values which assure material strength with the following&nbsp;probability:</p> <p>(1) Where applied loads are eventually distributed through a single member within an assembly, the failure of which would result in loss of structural integrity of the component, 99 percent probability with 95 percent&nbsp;confidence.</p> <p>(2) For redundant structure, in which the failure of individual elements would result in applied loads being safely distributed to other load carrying members, 90 percent probability with 95 percent confidence.&nbsp;&#8230;</p> </blockquote> <p>Another way of stating this definition is as the lower 95% confidence bound on the 1st or 10th percentile of the population, respectively. But describing it that way doesn&#8217;t help to explain the concept to a person who&#8217;s not well versed in&nbsp;statistics.</p> <h1>The&nbsp;Explanation</h1> <p>There&#8217;s some random variation in all material properties. Some pieces of any material will be a little different from other pieces of the same material. To account for this variation in the material properties when we design aircraft structure, we design it so that there&#8217;s at least a 90% chance that redundant structure is stronger than it needs to be (or a 99% chance for non-redundant&nbsp;structure).</p> <p>When we test a material property, we get a <em>sample</em>. This <em>sample</em> is not a perfect representation of the material property. A good analogy is that a sample is like a low-resolution photo: it gives us an idea of what we&#8217;re seeing, but we don&#8217;t get all the detail. We can get a better idea of what we&#8217;re seeing by taking a higher resolution photo: this is akin to testing more and getting a larger sample&nbsp;size.</p> <p>We choose a statistical distribution that fits the data, then find the tenth (or first) percentile of that distribution. 
But since we only have a sample of the material property (a &#8220;low-resolution photo&#8221;, in the analogy), we&#8217;re not sure whether the distribution that we chose is correct. To account for that uncertainty, we try out many possible distributions for the material property and determine how likely each is to be true based on the sample (the data). Distributions that look a lot like the data are highly likely; distributions that look different from the data are less likely, but depending on how &#8220;low-resolution&#8221; our data is, they <em>could</em> be correct. For each of these possible distributions, we find the 10th percentile (for B-Basis; it would be the 1st percentile for A-Basis). Next, we weight each of those individual 10th percentiles based on the likelihood that the corresponding distribution is true, and we find a lower bound such that 95% of those weighted 10th percentiles lie above&nbsp;it.</p> <p>Or in graphical&nbsp;form:</p> <p><img alt="The explanatory graph" src="explaining-basis-values_files/figure-markdown/explanation-graph-1.png"></p> <p>I hope that this explanation makes a complicated topic a little clearer. If you think you have a better explanation, please connect with me on <a href="https://linkedin.com/in/stefankloppenborg/">LinkedIn</a> and message me&nbsp;there.</p> <h1>Developing the&nbsp;Graph</h1> <p>Let&#8217;s look at how I developed this graph. The graph was developed using the R language. 
As with most R code, we start by loading the required&nbsp;packages.</p> <div class="highlight"><pre><code>library(tidyverse)
library(cmstatr)
library(stats4)
</code></pre></div> <p>In this example, we&#8217;ll use some of the sample data that comes with the <a href="https://www.cmstatr.net"><code>cmstatr</code></a> package. We&#8217;ll be using the room-temperature warp-tension example&nbsp;data.</p> <div class="highlight"><pre><code>dat &lt;- carbon.fabric.2 %&gt;%
  filter(test == &quot;WT&quot; &amp; condition == &quot;RTD&quot;)
</code></pre></div> <p>Let&#8217;s start by plotting this data. We&#8217;re plotting 1-D data, so we only need one axis. But to make sure that none of the data points overlap, we&#8217;ll add some &#8220;jitter&#8221; (a random vertical position). 
We&#8217;ll also hide the vertical axis since this axis is not&nbsp;meaningful.</p> <div class="highlight"><pre><code>dat %&gt;%
  ggplot(aes(x = strength)) +
  geom_jitter(aes(y = 0.01), height = 0.005) +
  scale_y_continuous(name = NULL, breaks = NULL)
</code></pre></div> <p><img alt="unnamed-chunk-3-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/unnamed-chunk-3-1.png"></p> <p>We can fit a normal distribution to this data. The sample mean and standard deviation are point-estimates of the mean and standard deviation of the distribution. We&#8217;ll use those point-estimates and draw the <span class="caps">PDF</span> superimposed over the data, assuming that the distribution is normal. 
We can also add the 10th percentile of this distribution to the&nbsp;plot.</p> <div class="highlight"><pre><code>dat %&gt;%
  ggplot(aes(x = strength)) +
  geom_jitter(aes(y = 0.01), height = 0.005, color = &quot;magenta&quot;) +
  stat_function(fun = function(x) dnorm(x, mean(dat$strength), sd(dat$strength))) +
  geom_vline(xintercept = qnorm(0.1, mean(dat$strength), sd(dat$strength)),
             color = &quot;blue&quot;) +
  scale_y_continuous(name = NULL, breaks = NULL)
</code></pre></div> <p><img alt="unnamed-chunk-4-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/unnamed-chunk-4-1.png"></p> <p>But the distribution that we&#8217;ve drawn is just a point-estimate from the data. There is uncertainty in our estimate. Based on the data, we&#8217;ve concluded that this estimate is the most likely, but we shouldn&#8217;t be surprised if the true population distribution is a bit different. This point-estimate is actually the Maximum Likelihood Estimate (<span class="caps">MLE</span>), based on this particular data. We can calculate the likelihood of various potential estimates of the distribution (or rather, of the parameters of the distribution) using the following&nbsp;equation:</p> <p>$$L\left(\mu, \sigma\right) = \prod_{i=1}^{n} f\left(X_i;\,\mu, \sigma\right)&nbsp;$$</p> <p>Here $X_i$ are the data, and the two parameters of the normal distribution are $\mu$ and $\sigma$. The function $f()$ is the probability density&nbsp;function.</p> <p>It turns out that computers have trouble multiplying a bunch of small numbers together and coming up with an accurate result. 
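</p> <p>To see the problem concretely, here&#8217;s a quick illustration in Python (any language with double-precision floats behaves the same way; the numbers are made up for illustration, not the strength data): multiplying a few hundred small density values underflows to zero, while summing their logarithms stays well-behaved.</p>

```python
import math

# Multiplying many small probabilities underflows double precision:
# the true product, 1e-1000, is far below the smallest representable
# double (about 5e-324), so the result is exactly 0.0.
p = 1.0
for _ in range(500):
    p *= 0.01
print(p)  # 0.0

# Summing the logarithms of the same numbers is numerically stable.
log_p = sum(math.log(0.01) for _ in range(500))
print(log_p)  # about -2302.6
```

<p>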
We can avoid this problem by using a log&nbsp;transform:</p> <p>$$\mathcal{L}\left(\mu, \sigma\right) = \sum_{i=1}^{n} \log f\left(X_i;\,\mu, \sigma\right)&nbsp;$$</p> <p>Implementing this in&nbsp;R:</p> <div class="highlight"><pre><code>log_likelihood &lt;- function(x, mu, sigma) {
  sum(dnorm(x, mu, sigma, log = TRUE))
}
</code></pre></div> <p>To make sure that this function works, we can find the log likelihood of our <span class="caps">MLE</span> of the parameters. 
The actual numerical value of the likelihood doesn&#8217;t mean very much to us, but we&#8217;ll be interested in the <em>distribution</em> of the likelihoods as we change the&nbsp;parameters.</p> <div class="highlight"><pre><code>log_likelihood(dat$strength, mean(dat$strength), sd(dat$strength))
</code></pre></div> <div class="highlight"><pre><code>## [1] -92.55627
</code></pre></div> <p>It&#8217;s going to make our life a lot easier if we can work with a single parameter instead of two ($\mu$ and $\sigma$). We&#8217;ll treat $\sigma$ as a nuisance parameter and find the value of $\sigma$ that produces the greatest likelihood for any given value of $\mu$. To avoid working with very tiny numbers, we&#8217;ll calculate the relative likelihood (the likelihood divided by the maximum likelihood). 
We can do this in R as&nbsp;follows:</p> <div class="highlight"><pre><code>rel_likelihood_mu &lt;- function(x, mu) {
  ll_hat &lt;- log_likelihood(x, mean(x), sd(x))
  opt &lt;- optimize(
    function(sigma) exp(log_likelihood(x, mu, sigma) - ll_hat),
    lower = 0,
    upper = 20 * sd(x),  # pick an upper bound that&#39;s big
    maximum = TRUE
  )
  # We&#39;ll return a list of the sigma and the relative likelihood:
  list(
    sigma = opt$maximum,
    rel_likelihood = opt$objective
  )
}
</code></pre></div> <p>We can also do the same thing to calculate the relative likelihood of a particular 10th percentile ($x_p$). We use the transformation $\mu = x_p - \sigma\,\Phi^{-1}(0.1)$, where $\Phi^{-1}$ is the standard normal quantile&nbsp;function.</p> <div class="highlight"><pre><code>rel_likelihood_xp &lt;- function(x, xp) {
  ll_hat &lt;- log_likelihood(x, mean(x), sd(x))
  opt &lt;- optimize(
    function(sigma) exp(log_likelihood(x, xp - sigma * qnorm(0.1), sigma) - ll_hat),
    lower = 0,
    upper = 20 * sd(x),  # pick an upper bound that&#39;s big
    maximum = TRUE
  )
  # We&#39;ll return a list of the sigma and the relative likelihood:
  list(
    sigma = opt$maximum,
    rel_likelihood = opt$objective
  )
}
</code></pre></div> <p>Now, we can draw our same plot again, but this time, we&#8217;ll draw a bunch of potential distributions (and 10th percentiles), shading each according to its&nbsp;likelihood:</p> <div class="highlight"><pre><code>p &lt;- dat %&gt;%
  ggplot(aes(x = strength))
walk(
  seq(from = 0.95 * mean(dat$strength), to = 1.05 * mean(dat$strength),
      length.out = 55),
  function(mu) {
    rl &lt;- rel_likelihood_mu(dat$strength, mu)
    p &lt;&lt;- p + stat_function(aes(alpha = rl$rel_likelihood),
                            fun = function(x) dnorm(x, mu, rl$sigma))
  }
)
xp_dist &lt;- imap_dfr(
  seq(from = 0.9 * mean(dat$strength), to = 0.98 * mean(dat$strength),
      length.out = 55),
  function(xp, ii) {
    rl &lt;- rel_likelihood_xp(dat$strength, xp)
    data.frame(xp = xp, rl = rl$rel_likelihood)
  })
p +
  geom_vline(aes(xintercept = xp, alpha = rl), data = xp_dist, color = &quot;blue&quot;) +
  geom_jitter(aes(y = 0.01), height = 0.005, color = &quot;magenta&quot;) +
  scale_y_continuous(name = NULL, breaks = NULL)
</code></pre></div> <p><img alt="unnamed-chunk-9-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/unnamed-chunk-9-1.png"></p> <p>Next, we can add a plot of the distribution of the 10th percentiles. We&#8217;ll also plot the B-Basis, as calculated by the package <a href="https://www.cmstatr.net"><code>cmstatr</code></a>.</p> <div class="highlight"><pre><code>p &lt;- dat %&gt;%
  mutate(f = 1) %&gt;%
  ggplot(aes(x = strength))
walk(
  seq(from = 0.95 * mean(dat$strength), to = 1.05 * mean(dat$strength),
      length.out = 55),
  function(mu) {
    rl &lt;- rel_likelihood_mu(dat$strength, mu)
    p &lt;&lt;- p + stat_function(aes(alpha = rl$rel_likelihood),
                            fun = function(x) dnorm(x, mu, rl$sigma))
  }
)
xp_dist &lt;- imap_dfr(
  seq(from = 0.9 * mean(dat$strength), to = 0.98 * mean(dat$strength),
      length.out = 55),
  function(xp, ii) {
    rl &lt;- rel_likelihood_xp(dat$strength, xp)
    data.frame(xp = xp, rl = rl$rel_likelihood)
  })
p +
  geom_vline(aes(xintercept = xp, alpha = rl), data = xp_dist, color = &quot;blue&quot;) +
  geom_vline(
    aes(xintercept = basis_normal(dat, strength)$basis),
    color = &quot;red&quot;,
    data = xp_dist %&gt;% mutate(f = 2),
    inherit.aes = FALSE
  ) +
  geom_line(aes(x = xp, y = rl), data = xp_dist %&gt;% mutate(f = 2),
            inherit.aes = FALSE, color = &quot;black&quot;) +
  geom_jitter(aes(y = 0.01), height = 0.005, color = &quot;magenta&quot;) +
  facet_grid(f ~ ., scales = &quot;free_y&quot;) +
  scale_y_continuous(name = NULL, breaks = NULL) +
  theme(strip.text = element_blank())
</code></pre></div> <p><img alt="unnamed-chunk-10-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/unnamed-chunk-10-1.png"></p> <p>The astute reader might recognize the lower curve as a non-central t-distribution. Since it&#8217;s a relative likelihood and not a probability, the vertical scale (which is hidden) won&#8217;t match the non-central t-distribution, but it&#8217;s the same shape. 
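</p> <p>This connection also gives a direct way to compute the basis value for normally distributed data: the B-Basis is $\bar{x} - k s$, where the one-sided tolerance factor $k$ comes from a quantile of the non-central t-distribution. This is essentially the calculation that <code>basis_normal</code> performs when a normal distribution is assumed. As a rough cross-check in Python (using <code>scipy</code>; the function name <code>normal_basis</code> and the sample values below are just for illustration, not the <code>cmstatr</code> example data):</p>

```python
import math
from scipy.stats import nct, norm

def normal_basis(x, p=0.90, conf=0.95):
    """Lower tolerance bound covering the proportion p of the population
    with the given confidence, assuming normally distributed data.
    p=0.90 gives a B-Basis; p=0.99 would give an A-Basis."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    # One-sided tolerance factor from the non-central t-distribution,
    # mirroring ncp = qnorm(p) * sqrt(n) in the R code above.
    k = nct.ppf(conf, df=n - 1, nc=norm.ppf(p) * math.sqrt(n)) / math.sqrt(n)
    return mean - k * sd

# Illustrative strength data:
strengths = [137.4, 139.6, 134.1, 141.8, 138.2,
             136.7, 140.3, 135.5, 138.9, 137.1]
print(normal_basis(strengths))
```

<p>For a sample of ten, the factor $k$ works out to about 2.36, which matches the familiar tabulated B-Basis factors for the normal distribution.</p> <p>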
Just for fun, we can plot the lower curve shown above and a non-central&nbsp;t-distribution:</p> <div class="highlight"><pre><span></span><code><span class="nf">bind_rows</span><span class="p">(</span> <span class="n">xp_dist</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">f</span> <span class="o">=</span> <span class="s">&quot;Relative Likelihood&quot;</span><span class="p">),</span> <span class="n">xp_dist</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">n</span> <span class="o">=</span> <span class="nf">length</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">rl</span> <span class="o">=</span> <span class="nf">dt</span><span class="p">(</span><span class="nf">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">)</span> <span class="o">-</span> <span class="n">xp</span><span class="p">)</span> <span class="o">/</span> <span class="nf">sd</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">df</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="m">1</span><span class="p">,</span> <span class="n">ncp</span> <span class="o">=</span> <span class="nf">qnorm</span><span class="p">(</span><span class="m">0.9</span><span class="p">)</span> <span class="o">*</span> <span class="nf">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">)),</span> <span class="n">f</span> <span class="o">=</span> <span class="s">&quot;t-Distribution&quot;</span><span class="p">)</span> 
<span class="o">%&gt;%</span> <span class="nf">select</span><span class="p">(</span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">xp</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">rl</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_line</span><span class="p">()</span> <span class="o">+</span> <span class="nf">facet_grid</span><span class="p">(</span><span class="n">f</span> <span class="o">~</span> <span class="n">.</span><span class="p">,</span> <span class="n">scales</span> <span class="o">=</span> <span class="s">&quot;free_y&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="nf">ylab</span><span class="p">(</span><span class="s">&quot;&quot;</span><span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-11-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/unnamed-chunk-11-1.png"></p> <p>Turning our attention back to creating the graph for this blog post, we&#8217;ll improve the aesthetics of the graph and also add the&nbsp;annotations:</p> <div class="highlight"><pre><span></span><code><span class="n">p</span> <span class="o">&lt;-</span> <span class="n">dat</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">f</span> <span class="o">=</span> <span class="m">1</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">strength</span><span class="p">))</span> 
<span class="nf">walk</span><span class="p">(</span> <span class="nf">seq</span><span class="p">(</span><span class="n">from</span> <span class="o">=</span> <span class="m">0.95</span> <span class="o">*</span> <span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">to</span> <span class="o">=</span> <span class="m">1.05</span> <span class="o">*</span> <span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">length.out</span> <span class="o">=</span> <span class="m">55</span><span class="p">),</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">)</span> <span class="p">{</span> <span class="n">rl</span> <span class="o">&lt;-</span> <span class="nf">rel_likelihood_mu</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">,</span> <span class="n">mu</span><span class="p">)</span> <span class="n">p</span> <span class="o">&lt;&lt;-</span> <span class="n">p</span> <span class="o">+</span> <span class="nf">stat_function</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">alpha</span> <span class="o">=</span> <span class="n">rl</span><span class="o">$</span><span class="n">rel_likelihood</span><span class="p">),</span> <span class="n">fun</span> <span class="o">=</span> <span class="nf">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="nf">dnorm</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">mu</span><span class="p">,</span> <span class="n">rl</span><span class="o">$</span><span class="n">sigma</span><span class="p">))</span> <span class="p">}</span> <span class="p">)</span> 
<span class="n">xp_dist</span> <span class="o">&lt;-</span> <span class="nf">imap_dfr</span><span class="p">(</span> <span class="nf">seq</span><span class="p">(</span><span class="n">from</span> <span class="o">=</span> <span class="m">0.9</span> <span class="o">*</span> <span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">to</span> <span class="o">=</span> <span class="m">0.98</span> <span class="o">*</span> <span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">),</span> <span class="n">length.out</span> <span class="o">=</span> <span class="m">55</span><span class="p">),</span> <span class="nf">function</span><span class="p">(</span><span class="n">xp</span><span class="p">,</span> <span class="n">ii</span><span class="p">)</span> <span class="p">{</span> <span class="n">rl</span> <span class="o">&lt;-</span> <span class="nf">rel_likelihood_xp</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength</span><span class="p">,</span> <span class="n">xp</span><span class="p">)</span> <span class="nf">data.frame</span><span class="p">(</span><span class="n">xp</span> <span class="o">=</span> <span class="n">xp</span><span class="p">,</span> <span class="n">rl</span> <span class="o">=</span> <span class="n">rl</span><span class="o">$</span><span class="n">rel_likelihood</span><span class="p">)</span> <span class="p">})</span> <span class="n">p</span> <span class="o">+</span> <span class="nf">geom_vline</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">xintercept</span> <span class="o">=</span> <span class="n">xp</span><span class="p">,</span> <span class="n">alpha</span> <span class="o">=</span> <span class="n">rl</span><span class="p">),</span> <span 
class="n">data</span> <span class="o">=</span> <span class="n">xp_dist</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">&quot;blue&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="nf">geom_vline</span><span class="p">(</span> <span class="nf">aes</span><span class="p">(</span><span class="n">xintercept</span> <span class="o">=</span> <span class="nf">basis_normal</span><span class="p">(</span><span class="n">dat</span><span class="p">,</span> <span class="n">strength</span><span class="p">)</span><span class="o">$</span><span class="n">basis</span><span class="p">),</span> <span class="n">color</span> <span class="o">=</span> <span class="s">&quot;red&quot;</span><span class="p">,</span> <span class="n">data</span> <span class="o">=</span> <span class="n">xp_dist</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">f</span> <span class="o">=</span> <span class="m">2</span><span class="p">),</span> <span class="n">inherit.aes</span> <span class="o">=</span> <span class="kc">FALSE</span> <span class="p">)</span> <span class="o">+</span> <span class="nf">geom_line</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">xp</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">rl</span><span class="p">),</span> <span class="n">data</span> <span class="o">=</span> <span class="n">xp_dist</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">f</span> <span class="o">=</span> <span class="m">2</span><span class="p">),</span> <span class="n">inherit.aes</span> <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">&quot;black&quot;</span><span class="p">)</span> <span 
class="o">+</span> <span class="nf">geom_jitter</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">y</span> <span class="o">=</span> <span class="m">0.01</span><span class="p">),</span> <span class="n">height</span> <span class="o">=</span> <span class="m">0.005</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">&quot;magenta&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="nf">facet_grid</span><span class="p">(</span><span class="n">f</span> <span class="o">~</span> <span class="n">.</span><span class="p">,</span> <span class="n">scales</span> <span class="o">=</span> <span class="s">&quot;free_y&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="nf">theme_bw</span><span class="p">()</span> <span class="o">+</span> <span class="nf">guides</span><span class="p">(</span><span class="n">alpha</span> <span class="o">=</span> <span class="nf">guide_none</span><span class="p">())</span> <span class="o">+</span> <span class="nf">scale_y_continuous</span><span class="p">(</span><span class="n">name</span> <span class="o">=</span> <span class="kc">NULL</span><span class="p">,</span> <span class="n">breaks</span> <span class="o">=</span> <span class="kc">NULL</span><span class="p">)</span> <span class="o">+</span> <span class="nf">theme</span><span class="p">(</span><span class="n">strip.text</span> <span class="o">=</span> <span class="nf">element_blank</span><span class="p">())</span> <span class="o">+</span> <span class="nf">geom_text</span><span class="p">(</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">y</span><span class="p">,</span> <span class="n">f</span> <span class="o">=</span> <span class="n">f</span><span class="p">,</span> <span 
class="n">label</span> <span class="o">=</span> <span class="n">label</span><span class="p">),</span> <span class="n">data</span> <span class="o">=</span> <span class="nf">tribble</span><span class="p">(</span> <span class="o">~</span><span class="n">x</span><span class="p">,</span> <span class="o">~</span><span class="n">y</span><span class="p">,</span> <span class="o">~</span><span class="n">f</span><span class="p">,</span> <span class="o">~</span><span class="n">label</span><span class="p">,</span> <span class="m">140</span><span class="p">,</span> <span class="m">0.025</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="s">&quot;(1) The data tells us which\ndistributions are most likely.&quot;</span><span class="p">,</span> <span class="m">140</span><span class="p">,</span> <span class="m">0.00</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="s">&quot;...but we don&#39;t know the true distribution.&quot;</span><span class="p">,</span> <span class="m">140</span><span class="p">,</span> <span class="m">0.7</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="s">&quot;(2) The data also tells us which\n10th percentiles are likely.&quot;</span><span class="p">,</span> <span class="m">123</span><span class="p">,</span> <span class="m">0.6</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="s">&quot;(3) Considering the\nlikelihood of all the\npossible 10th\npercentiles, there is\n95% confidence that\nthe true value is above\nthe B-Basis.&quot;</span> <span class="p">),</span> <span class="n">color</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="s">&quot;black&quot;</span><span class="p">,</span> <span class="s">&quot;black&quot;</span><span class="p">,</span> <span class="s">&quot;blue&quot;</span><span class="p">,</span> <span
class="s">&quot;red&quot;</span><span class="p">)</span> <span class="p">)</span> <span class="o">+</span> <span class="nf">xlim</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="m">120</span><span class="p">,</span> <span class="m">150</span><span class="p">))</span> </code></pre></div> <p><img alt="explanation-graph-1" src="https://www.kloppenborg.ca/2022/11/explaining-basis-values/explaining-basis-values_files/figure-markdown/explanation-graph-1.png"></p> <p>And that&#8217;s the graph at the beginning of this&nbsp;post.</p>Blogging with Quarto2022-07-18T00:00:00-04:002022-07-18T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2022-07-18:/2022/07/blogging-with-quarto/<p>I&#8217;ve recently started using <a href="https://quarto.org/">Quarto</a>, which is a new open source project backed by <a href="https://www.rstudio.com/">RStudio</a>. Quarto is a system for producing reports, presentations, books and blog posts. It takes text formatted with markdown and code written in Python or R and produces PDFs, <span class="caps">HTML</span> or several other formats that …</p><p>I&#8217;ve recently started using <a href="https://quarto.org/">Quarto</a>, which is a new open source project backed by <a href="https://www.rstudio.com/">RStudio</a>. Quarto is a system for producing reports, presentations, books and blog posts. It takes text formatted with markdown and code written in Python or R and produces PDFs, <span class="caps">HTML</span> or several other formats that contain the formatted text, the code (optionally) and the outputs from that code. In a lot of ways, Quarto is like R Markdown or Jupyter Notebooks. 
Quarto uses Pandoc to do the actual document format conversion, and unlike Jupyter Notebooks, Quarto works quite well with version control software like&nbsp;git.</p> <p>I&#8217;ve written about using R Markdown or Jupyter for <a href="https://www.kloppenborg.ca/2019/06/reproducibility/">reproducibility</a> in engineering reports, and I&#8217;ve written about creating <a href="https://www.kloppenborg.ca/2019/10/pandoc-report-templates/">custom document templates</a> for reports written using Pandoc. Much of what I&#8217;ve written in those posts should be applicable to&nbsp;Quarto.</p> <p>To date, I&#8217;ve written two posts on this blog using Quarto: <a href="https://www.kloppenborg.ca/2022/06/bow-stiffness/">Violin Bow Stiffness</a> and <a href="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/">Shear of Adhesive Bonded Joints</a>. This post describes some of my experiences using Quarto. Overall, I&#8217;ve been quite happy with&nbsp;it.</p> <h1>Blogging with Quarto and&nbsp;Pelican</h1> <p>Quarto has built-in support for blogging with Hugo. However, this blog is using <a href="https://blog.getpelican.com/">Pelican</a>, not Hugo. A few tweaks are&nbsp;needed.</p> <p>Since the Quarto posts are written in Markdown, they have <span class="caps">YAML</span> headers.
Here is the header for one of my blog&nbsp;posts:</p> <div class="highlight"><pre><span></span><code><span class="nn">---</span><span class="w"></span> <span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Shear of Adhesive Bonded Joints</span><span class="w"></span> <span class="nt">date</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">2022-06-25</span><span class="w"></span> <span class="nt">format</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">commonmark_x</span><span class="w"></span> <span class="nt">keep-yaml</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span> <span class="nt">Tags</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Engineering, Python, Adhesive Bonding</span><span class="w"></span> <span class="nt">Category</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Posts</span><span class="w"></span> <span class="nt">filters</span><span class="p">:</span><span class="w"></span> <span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">attach-filter.lua</span><span class="w"></span> <span class="nn">---</span><span class="w"></span> </code></pre></div> <p>Let&#8217;s go through the lines in this header one at a&nbsp;time:</p> <ul> <li><code>title</code>: Self-explanatory &#8212; the title of the&nbsp;post</li> <li><code>date</code>: Also self-explanatory &#8212; the date of the&nbsp;post</li> <li><code>format</code>: There are several output formats for Quarto. I&#8217;ve found that <code>commonmark_x</code> works the best for Pelican. 
This output format produces a <code>.md</code> file in a format that (mostly) works with&nbsp;Pelican.</li> <li><code>keep-yaml</code>: Setting this option to <code>true</code> tells Quarto to copy the present <span class="caps">YAML</span> header to the output <code>.md</code> file.</li> <li><code>Tags</code>: This is an option used by Pelican. Since we&#8217;ve set <code>keep-yaml: true</code>, this gets copied to the <code>.md</code> file that Pelican will&nbsp;process.</li> <li><code>Category</code>: Another option used by&nbsp;Pelican.</li> <li><code>filters</code>: We&#8217;ll talk about this&nbsp;next.</li> </ul> <h1>Lua&nbsp;Filters</h1> <p>Pandoc uses something called a filter to alter the output. These filters are written in a language called <a href="https://pandoc.org/lua-filters.html">Lua</a>. In order for Pelican to include an image, the filename of the image needs to start with <code>{attach}</code>. This tells Pelican to include the image file in the website&nbsp;output.</p> <p>The following filter edits each image element when it&#8217;s being processed. The name of the filter (<code>Image</code>) means that it applies to images.
This filter concatenates the string <code>{attach}</code> with the <code>src</code> attribute of the image and stores the result in the <code>src</code> attribute of the resulting&nbsp;element.</p> <div class="highlight"><pre><span></span><code><span class="kr">function</span> <span class="nf">Image</span> <span class="p">(</span><span class="n">elem</span><span class="p">)</span> <span class="n">elem</span><span class="p">.</span><span class="n">src</span> <span class="o">=</span> <span class="s2">&quot;{attach}&quot;</span> <span class="o">..</span> <span class="n">elem</span><span class="p">.</span><span class="n">src</span> <span class="kr">return</span> <span class="n">elem</span> <span class="kr">end</span> </code></pre></div> <p>Similarly, for links, Pelican requires that links to internal files on the blog start with <code>{filename}</code>. External links are used as-is. To do this, I use the following filter, which applies to <code>Link</code> elements. It checks if the target of the link starts with <code>http</code>. If so, it uses the link as-is.
Otherwise, I assume that the link is an internal link, so the filter pre-pends the target with <code>{filename}</code>.</p> <div class="highlight"><pre><span></span><code><span class="kr">function</span> <span class="nf">Link</span> <span class="p">(</span><span class="n">elem</span><span class="p">)</span> <span class="kr">if</span><span class="p">(</span> <span class="nb">string.find</span><span class="p">(</span><span class="n">elem</span><span class="p">.</span><span class="n">target</span><span class="p">,</span> <span class="s2">&quot;http&quot;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="p">)</span> <span class="kr">then</span> <span class="kr">return</span> <span class="n">elem</span> <span class="kr">else</span> <span class="n">elem</span><span class="p">.</span><span class="n">target</span> <span class="o">=</span> <span class="s2">&quot;{filename}&quot;</span> <span class="o">..</span> <span class="n">elem</span><span class="p">.</span><span class="n">target</span> <span class="kr">return</span> <span class="n">elem</span> <span class="kr">end</span> <span class="kr">end</span> </code></pre></div> <p>I&#8217;ve created a file called <code>attach-filter.lua</code> containing both of the filters above. The <code>filters</code> line in the <span class="caps">YAML</span> header tells Quarto to use these filters when processing the&nbsp;file.</p>Shear of Adhesive Bonded Joints2022-06-25T00:00:00-04:002022-06-25T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2022-06-25:/2022/06/bonded-joint-shear/<p>There are a lot of misconceptions about bonded joints. One of the misconceptions that I&#8217;ve seen most often is that people think that the average shear stress in a lap joint is predictive of the strength. This same misconception is usually phrased as&nbsp;either:</p> <ul> <li>Doubling the overlap length of …</li></ul><p>There are a lot of misconceptions about bonded joints.
One of the misconceptions that I&#8217;ve seen most often is that people think that the average shear stress in a lap joint is predictive of the strength. This same misconception is usually phrased as&nbsp;either:</p> <ul> <li>Doubling the overlap length of a lap joint doubles the strength (<strong>wrong!</strong>)</li> <li>Calculate $P/A$ for the joint and make sure that the value is less than the lap shear strength on the adhesive data-sheet (<strong>wrong!</strong>)</li> </ul> <p>In this post, I&#8217;m going to explain why these statements are incorrect. I&#8217;m going to try to give you an understanding of how load transfer works in an adhesive joint, and I&#8217;m going to share some Python code that produces a first approximation of the stress&nbsp;distribution.</p> <p>For simplicity, we&#8217;re going to ignore the effects of peel. Peel is the tendency for the ends of a lap joint to separate. This can cause the joint to fail in some cases, but considering the effects of peel complicates the analysis of the joint &#8212; since the purpose of this post is to give a basic understanding of the mechanics of the joint, I&#8217;m going to ignore this complicating&nbsp;factor.</p> <h2>The&nbsp;Joint</h2> <p>In this post, I&#8217;m going to focus on a simple lap joint. In this type of joint, two adherends overlap each other by a certain amount and there is adhesive connecting the two adherends over the area in which they overlap. These two adherends are then pulled apart. In this post, we&#8217;re going to assume that the two adherends are homogeneous isotropic materials (for example, sheet metal) and are of uniform thickness. This joint is shown in the figure below.
In the top part of the figure, we see the unstressed joint, and in the bottom, we see the joint under&nbsp;load.</p> <p><img alt="Schematic of lap joint" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/joint.svg"></p> <p>Obviously, the deformation of the joint is exaggerated, but it allows us to see what&#8217;s&nbsp;happening.</p> <p>First, let&#8217;s look at the lower adherend. We see that the left edge of the adherend is built-in (i.e. it can&#8217;t move). When load is applied, the left portion of the lower adherend stretches a lot because it is carrying the entirety of the reaction&nbsp;load.</p> <p>As we move our gaze further to the right, but still focusing on the lower adherend, we see that the further right we go, the less the adherend is stretching. This is because the adhesive is transferring load along the length of the joint. When we look at the right portion of the lower adherend, it&#8217;s hardly stretched at all. Sure, it <em>moved</em> because the rest of the adherend has stretched, but the right part of the lower adherend has hardly stretched at all since it&#8217;s carrying no&nbsp;load.</p> <p>If the two adherends are the same thickness, then symmetry will tell us that the upper adherend behaves in the same way &#8212; but now it is the right end of the upper adherend that stretches a lot and the left end that doesn&#8217;t&nbsp;stretch.</p> <p>Now, let&#8217;s turn our attention to the adhesive. Shear strain can be thought of as an angle. At the very left edge of the adhesive, the shear strain is quite large; in the middle of the adhesive, the shear strain is moderate; and at the right edge of the adhesive, the shear strain is quite large again. The relationship between shear stress and shear strain of an adhesive is not linear, but nonetheless, a large strain produces a large stress and a small strain produces a small stress.
So, the shear stress distribution in this joint is &#8220;U&#8221; shaped &#8212; there&#8217;s a lot of stress at the ends and a smaller stress in the&nbsp;middle.</p> <p>This &#8220;U&#8221; shaped shear stress distribution should be the first clue about why using the average shear stress in the joint to predict failure might not be the best&nbsp;idea.</p> <h2>A Linear Model of the&nbsp;Joint</h2> <p>The actual shear stress-shear strain relationship for most adhesives is non-linear, but we&#8217;ll start our analysis of a lap joint by making the assumption that the adhesive is&nbsp;linear-elastic.</p> <p>Let&#8217;s start with defining the variables that we&#8217;ll need. The variables are shown in the following&nbsp;figure.</p> <p><img alt="Lap shear joint variables" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/joint-variables.svg"></p> <p>A force balance on the two adherends gives&nbsp;us:</p> <p>$$\frac{dN_1}{dx} - \tau = 0&nbsp;$$</p> <p>$$\frac{dN_2}{dx} + \tau = 0&nbsp;$$</p> <p>We can find the deformation of the two adherends as&nbsp;follows:</p> <p>$$\frac{du_1}{dx} = \frac{N_1}{E_1^\prime t_1}&nbsp;$$</p> <p>$$\frac{du_2}{dx} = \frac{N_2}{E_2^\prime t_2}&nbsp;$$</p> <p>Where $E_1^\prime$ and $E_2^\prime$ are the adherend plane-strain elastic&nbsp;moduli.</p> <p>The shear strain of the adhesive layer is given&nbsp;by:</p> <p>$$\gamma = \frac{1}{t_A} \left(u_1 - u_2\right)&nbsp;$$</p> <p>We can differentiate this with respect to $x$ and then substitute in the previous equations to&nbsp;get:</p> <p>$$\frac{d\gamma}{dx} = \frac{1}{t_A}\left( \frac{du_1}{dx} - \frac{du_2}{dx} \right) \\ {} = \frac{1}{t_A}\left( \frac{N_1}{E_1^\prime t_1} - \frac{N_2}{E_2^\prime t_2} \right)&nbsp;$$</p> <p>We can then differentiate this again with respect to $x$ and substitute in the first equations to&nbsp;get:</p> <p>$$\frac{d^2\gamma}{dx^2} = \frac{1}{t_A}\left( \frac{dN_1}{dx}\frac{1}{E_1^\prime t_1} - \frac{dN_2}{dx}\frac{1}{E_2^\prime t_2} \right) \\ {} = 
\tau\frac{1}{t_A}\left( \frac{1}{E_1^\prime t_1} + \frac{1}{E_2^\prime t_2} \right)&nbsp;$$</p> <p>Remember that for now, we&#8217;re assuming that the adhesive is linear-elastic.&nbsp;Thus:</p> <p>$$\tau = G_A \gamma&nbsp;$$</p> <p>We can solve the second-order differential equation above, but we need two boundary conditions. The boundary conditions that we choose are the loads at the ends of the adherends. At the left end ($x=0$), the unit load on the lower adherend ($N_2$) must be equal to the applied load ($P$) divided by the width ($w$) and the load on the upper adherend ($N_1$) must be zero. The opposite is true at the other end ($x=L$).&nbsp;Thus:</p> <p>$$\left.N_1\right|_{x=0} = 0 \\ \left.N_2\right|_{x=0} = P/w&nbsp;$$</p> <p>$$\left.N_1\right|_{x=L} = P/w \\ \left.N_2\right|_{x=L} = 0&nbsp;$$</p> <p>We can plug these into the equation for $\frac{d\gamma}{dx}$ at the two ends of the joint and get the following boundary conditions that we will enforce for the&nbsp;solution.</p> <p>$$\left.\frac{d\gamma}{dx}\right|_{x=0} = \frac{1}{t_A}\left( \frac{-P / w}{E_2^\prime t_2} \right)&nbsp;$$</p> <p>$$\left.\frac{d\gamma}{dx}\right|_{x=L} = \frac{1}{t_A}\left( \frac{P / w}{E_1^\prime t_1} \right)&nbsp;$$</p> <p>There is a closed-form solution to this boundary-value problem, which we could find, but I think it&#8217;s more instructive to just find a numerical solution &#8212; plus it&#8217;s easier to extend the numerical solution to the case where the adhesive is non-linear. In order to find the numerical solution, we&#8217;re going to use the Python package <code>scipy</code>, which includes the function <code>solve_bvp()</code> for solving boundary-value problems.
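</p>

<p>For reference, in the case of identical, linear-elastic adherends, that closed-form solution is the classic Volkersen shear-lag result, and it makes the point of this post directly: the shear stress peaks at the ends of the overlap, well above the average value $P/(wL)$. The following is a sketch in my own notation, reusing the same illustrative parameter values that appear in the code below:</p>

```python
import numpy as np

# Same illustrative values used in this post
E1 = E2 = 10.5e6 / (1 - 0.33**2)  # plane-strain adherend moduli (psi)
t1 = t2 = 0.063                   # adherend thicknesses (in)
Ga, ta = 65500, 0.005             # adhesive shear modulus (psi) and thickness (in)
L, w, P = 0.5, 1.0, 2700          # overlap length (in), width (in), load (lbf)

# Shear-lag parameter: for the linear adhesive, d^2(tau)/dx^2 = lam^2 * tau
lam = np.sqrt(Ga / ta * (1 / (E1 * t1) + 1 / (E2 * t2)))

# Closed-form shear stress for identical adherends (symmetric about x = L/2)
x = np.linspace(0, L, 200)
tau = P * lam / (2 * w) * np.cosh(lam * (x - L / 2)) / np.sinh(lam * L / 2)

tau_avg = P / (w * L)  # the naive P/A value
tau_max = tau.max()    # occurs at the ends of the overlap
print(tau_avg, tau_max, tau_max / tau_avg)
```

<p>With these numbers the peak shear stress is roughly 1.6 times the average, so sizing the joint by comparing $P/A$ to an allowable shear stress would miss the much higher stress at the ends of the overlap.</p>

<p>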
We&#8217;ll start by importing the packages that we&#8217;ll&nbsp;use.</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span> <span class="kn">import</span> <span class="nn">scipy.integrate</span> <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span> </code></pre></div> <p>Next, we&#8217;ll set the parameters for our solution. These include the elastic moduli, thicknesses, overlap length and&nbsp;load.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">E1</span> <span class="o">=</span> <span class="mf">10.5e6</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="mf">0.33</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="n">E2</span> <span class="o">=</span> <span class="mf">10.5e6</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="mf">0.33</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="n">t1</span> <span class="o">=</span> <span class="mf">0.063</span> <span class="n">t2</span> <span class="o">=</span> <span class="mf">0.063</span> <span class="n">Ga</span> <span class="o">=</span> <span class="mi">65500</span> <span class="n">ta</span> <span class="o">=</span> <span class="mf">0.005</span> <span class="n">L</span> <span class="o">=</span> <span class="mf">0.5</span> <span class="n">w</span> <span class="o">=</span> <span class="mf">1.</span> <span class="n">P</span> <span class="o">=</span> <span class="mi">2700</span> </code></pre></div> <p>The function <code>solve_bvp</code> requires two arguments: (i) a function that returns the derivatives of the variables, and (ii) a function that returns the residuals for the boundary 
conditions. We&#8217;re going to reduce the second-order differential equation to a system of two first-order differential equations by defining $y$ as follows. Based on this definition, we can implement the two functions required by <code>solve_bvp</code>.</p> <p>$$y = \left[ \begin{matrix} \frac{d\gamma}{dx} <span class="amp">&amp;</span> \gamma \end{matrix} \right]^T&nbsp;$$</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">func1</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span> <span class="n">D</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">matrix</span><span class="p">([</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="n">Ga</span> <span class="o">/</span> <span class="n">ta</span> <span class="o">*</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E1</span> <span class="o">*</span> <span class="n">t1</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E2</span> <span class="o">*</span> <span class="n">t2</span><span class="p">))],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span> <span class="p">])</span> <span class="k">return</span> <span class="n">D</span> <span class="o">@</span> <span class="n">y</span> <span class="k">def</span> <span class="nf">bc1</span><span class="p">(</span><span class="n">ya</span><span class="p">,</span> <span class="n">yb</span><span class="p">):</span> <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span> <span class="n">ya</span><span class="p">[</span><span class="mi">0</span><span
class="p">]</span> <span class="o">-</span> <span class="mf">1.</span> <span class="o">/</span> <span class="n">ta</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">P</span> <span class="o">/</span> <span class="n">w</span> <span class="o">/</span> <span class="p">(</span><span class="n">E2</span> <span class="o">*</span> <span class="n">t2</span><span class="p">)),</span> <span class="n">yb</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mf">1.</span> <span class="o">/</span> <span class="n">ta</span> <span class="o">*</span> <span class="p">(</span><span class="n">P</span> <span class="o">/</span> <span class="n">w</span> <span class="o">/</span> <span class="p">(</span><span class="n">E1</span> <span class="o">*</span> <span class="n">t1</span><span class="p">))</span> <span class="p">])</span> <span class="n">res1</span> <span class="o">=</span> <span class="n">scipy</span><span class="o">.</span><span class="n">integrate</span><span class="o">.</span><span class="n">solve_bvp</span><span class="p">(</span> <span class="n">func1</span><span class="p">,</span> <span class="n">bc1</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">L</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span> <span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span> <span class="mi">50</span><span class="p">))</span> <span class="p">)</span> </code></pre></div> <p>The variable <code>res1</code> now contains the solution to our differential equation. 
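Before plotting, we can sanity-check the solution: integrating the adhesive shear stress over the bond area must recover the applied load. Here is a minimal, self-contained sketch of that check (it repeats the setup above so it runs on its own; the 2% tolerance is an arbitrary choice, not something from this analysis):

```python
import numpy as np
import scipy.integrate

# Same parameters as above
E1 = E2 = 10.5e6 / (1 - 0.33**2)
t1 = t2 = 0.063
Ga, ta, L, w, P = 65500, 0.005, 0.5, 1.0, 2700

def func1(x, y):
    # y = [dgamma/dx, gamma]; gamma'' = (Ga/ta)(1/(E1 t1) + 1/(E2 t2)) gamma
    k = Ga / ta * (1. / (E1 * t1) + 1. / (E2 * t2))
    return np.vstack((k * y[1], y[0]))

def bc1(ya, yb):
    # Enforce dgamma/dx at both ends of the overlap
    return np.array([
        ya[0] - 1. / ta * (-P / w / (E2 * t2)),
        yb[0] - 1. / ta * (P / w / (E1 * t1)),
    ])

res1 = scipy.integrate.solve_bvp(
    func1, bc1, x=np.linspace(0, L, 50), y=np.zeros((2, 50)))

# Equilibrium check: the shear stress integrated over the bond
# area must transfer the full applied load.
tau = Ga * res1.y[1]
P_transferred = scipy.integrate.trapezoid(tau, res1.x) * w
print(P_transferred)  # should come out close to P = 2700
```

If the integrated shear doesn't come back close to $P$, something is wrong with the boundary conditions or the mesh.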
We can plot the shear strain ($\gamma$) over the length of the joint as&nbsp;follows:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">res1</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">res1</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:])</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Linear Elastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Strain,$</span><span class="se">\\</span><span class="s2">gamma$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-5-output-1.png"></p> <p>Because we&#8217;re assuming that the adhesive is linear-elastic, we can find the shear stress by simply multiplying the shear modulus $G_A$ by the shear strain. 
The shear stress in the adhesive over the length of the joint is&nbsp;thus:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">res1</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">Ga</span> <span class="o">*</span> <span class="n">res1</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:])</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Linear Elastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-6-output-1.png"></p> <h2>Stress-Strain&nbsp;Curve</h2> <p>The shear stress-strain curve for most adhesives is linear at low strain, but highly nonlinear above a certain value of strain. It&#8217;s common to idealize the stress-strain curve for an adhesive as elastic-perfectly plastic. The important parameters for the adhesive stress-strain curve are the initial shear modulus ($G_A$) and the strain at yield ($\gamma_y$), from which you can calculate the shear stress at yield. The other important parameter is the ultimate strain, which we&#8217;ll talk about later. 
The idealized stress-strain curve therefore looks like&nbsp;this:</p> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/stress-strain.svg"></p> <p>The ordinary approach to solving the stress distribution within a bonded joint involves finding the points along the length of the joint at which the adhesive transitions from elastic to plastic and then solving the elastic and plastic portions of the joint separately. If you try to naively solve the equations above with an elastic-perfectly plastic adhesive model, you&#8217;ll get errors since the Jacobian becomes singular. For the purpose of keeping this blog post simple, we&#8217;ll cheat a little bit and give the stress-strain curve a very small slope above the yield stress. This will eliminate the numerical issues, and as long as this slope is small enough, it won&#8217;t affect the results very&nbsp;much.</p> <p>With this in mind, and considering that the strain could be positive or negative, we implement a function to find the stress based on the strain as&nbsp;follows:</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">calc_tau</span><span class="p">(</span><span class="n">gamma</span><span class="p">):</span> <span class="n">gamma_y</span> <span class="o">=</span> <span class="mf">0.09</span> <span class="c1"># the yield strain</span> <span class="n">G_final</span> <span class="o">=</span> <span class="mi">1</span> <span class="c1"># a very small slope for the upper part of the curve</span> <span class="n">sign</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sign</span><span class="p">(</span><span class="n">gamma</span><span class="p">)</span> <span class="k">if</span> <span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">gamma</span><span class="p">)</span> <span class="o">&lt;=</span> <span
class="n">gamma_y</span><span class="p">:</span> <span class="n">tau_unsigned</span> <span class="o">=</span> <span class="n">Ga</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">gamma</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">tau_unsigned</span> <span class="o">=</span> <span class="n">Ga</span> <span class="o">*</span> <span class="n">gamma_y</span> <span class="o">+</span> \ <span class="n">G_final</span> <span class="o">*</span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">gamma</span><span class="p">)</span> <span class="o">-</span> <span class="n">gamma_y</span><span class="p">)</span> <span class="k">return</span> <span class="n">sign</span> <span class="o">*</span> <span class="n">tau_unsigned</span> </code></pre></div> <p>We&#8217;ll vectorize this function so that we can calculate an array of stress values based on an array of strain&nbsp;values:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">calc_tau_vec</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">vectorize</span><span class="p">(</span><span class="n">calc_tau</span><span class="p">)</span> </code></pre></div> <h2>A Nonlinear Model of the&nbsp;Joint</h2> <p>Now that we have a function to describe the way in which the adhesive creates shear stress depending on its shear strain, we can implement the solution to the differential equation again. 
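As a quick sanity check before coupling this material law to the solver, we can evaluate the vectorized function on a few strains. This snippet duplicates the definitions above so that it runs on its own; the particular strain values are arbitrary:

```python
import numpy as np

Ga = 65500  # adhesive shear modulus, as above

def calc_tau(gamma):
    gamma_y = 0.09  # the yield strain
    G_final = 1     # a very small slope above yield
    sign = np.sign(gamma)
    if np.abs(gamma) <= gamma_y:
        tau_unsigned = Ga * np.abs(gamma)
    else:
        tau_unsigned = Ga * gamma_y + G_final * (np.abs(gamma) - gamma_y)
    return sign * tau_unsigned

calc_tau_vec = np.vectorize(calc_tau)

# Linear below the yield strain, essentially capped at
# Ga * gamma_y above it, and antisymmetric in the strain.
print(calc_tau_vec([0.045, 0.09, 0.18, -0.18]))
```

The half-yield strain gives half the yield stress, and the two post-yield strains give stresses just barely above the yield stress, with the sign following the sign of the strain.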
Since the boundary conditions don&#8217;t depend on the behavior of the adhesive, we can re-use the same function for calculating the residuals of the boundary&nbsp;condition.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">func2</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span> <span class="n">b</span> <span class="o">=</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E1</span> <span class="o">*</span> <span class="n">t1</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E2</span> <span class="o">*</span> <span class="n">t2</span><span class="p">))</span> <span class="o">/</span> <span class="n">ta</span> <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">row_stack</span><span class="p">((</span> <span class="n">calc_tau_vec</span><span class="p">(</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="p">:])</span> <span class="o">*</span> <span class="n">b</span><span class="p">,</span> <span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="p">:]</span> <span class="p">))</span> <span class="n">res2</span> <span class="o">=</span> <span class="n">scipy</span><span class="o">.</span><span class="n">integrate</span><span class="o">.</span><span class="n">solve_bvp</span><span class="p">(</span> <span class="n">func2</span><span class="p">,</span> <span class="n">bc1</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span 
class="n">L</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span> <span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span> <span class="mi">50</span><span class="p">))</span> <span class="p">)</span> </code></pre></div> <p>Here is the strain solution that we&nbsp;get:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">res2</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:])</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Elastic-Plastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Strain,$</span><span class="se">\\</span><span class="s2">gamma$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-10-output-1.png"></p> <p>And the corresponding adhesive shear stress solution is as&nbsp;follows:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span 
class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">calc_tau_vec</span><span class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:]))</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Elastic-Plastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-11-output-1.png"></p> <p>We&#8217;ll overlay the linear and the elastic-plastic models on top of each other to clearly show the differences between the two models. First, we notice that the elastic-plastic model has flat spots in the stress distribution where the adhesive has yielded. These occur near the ends of the joint. 
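The extent of those flat spots can also be computed rather than read off the plot. The sketch below re-solves the elastic-plastic problem with the same parameters; note that the material law is rewritten here with <code>np.where</code> (an equivalent, directly vectorized formulation, not the code above), and the &quot;fraction yielded&quot; metric is my own addition:

```python
import numpy as np
import scipy.integrate

# Same parameters as above
E1 = E2 = 10.5e6 / (1 - 0.33**2)
t1 = t2 = 0.063
Ga, ta, L, w, P = 65500, 0.005, 0.5, 1.0, 2700
gamma_y = 0.09  # yield strain

def calc_tau_vec(gamma):
    # Elastic up to gamma_y, then a slope of 1 (nearly perfectly plastic)
    g = np.abs(gamma)
    return np.sign(gamma) * np.where(
        g <= gamma_y, Ga * g, Ga * gamma_y + 1.0 * (g - gamma_y))

def func2(x, y):
    b = (1. / (E1 * t1) + 1. / (E2 * t2)) / ta
    return np.vstack((calc_tau_vec(y[1]) * b, y[0]))

def bc1(ya, yb):
    return np.array([
        ya[0] - 1. / ta * (-P / w / (E2 * t2)),
        yb[0] - 1. / ta * (P / w / (E1 * t1)),
    ])

res2 = scipy.integrate.solve_bvp(
    func2, bc1, x=np.linspace(0, L, 50), y=np.zeros((2, 50)))

# The yielded ("flat spot") region is where |gamma| exceeds gamma_y:
# it should appear at both ends of the overlap, with an elastic trough between.
plastic = np.abs(res2.y[1]) > gamma_y
print(f"fraction of mesh points yielded: {np.mean(plastic):.2f}")
```

Counting mesh points where the strain exceeds the yield strain gives a rough measure of the plastic-zone size; the exact transition points could be found by interpolating the strain solution.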
Next, we notice that the middle portions of the two stress distributions look similar, but shifted: for the elastic-plastic model, the stress in the &#8220;trough&#8221; is higher because the ends of this joint take a smaller proportion of the entire&nbsp;load.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">res1</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">Ga</span> <span class="o">*</span> <span class="n">res1</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:],</span> <span class="n">label</span><span class="o">=</span><span class="s2">&quot;Elastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">calc_tau_vec</span><span class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:]),</span> <span class="n">label</span><span class="o">=</span><span class="s2">&quot;Elastic-Plastic Adhesive&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Comparison of Shear Stress for Both Models&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span
class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-12-output-1.png"></p> <h2>Exploration</h2> <p>We&#8217;ll create a function that takes several of the joint parameters as arguments and returns the stress and strain distributions. We&#8217;ll use this function to explore the effect of some of the joint parameters. We&#8217;re only going to implement this for the elastic-plastic&nbsp;model.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">model</span><span class="p">(</span><span class="n">t1</span><span class="p">,</span> <span class="n">t2</span><span class="p">,</span> <span class="n">ta</span><span class="p">,</span> <span class="n">L</span><span class="p">,</span> <span class="n">P</span><span class="p">):</span> <span class="k">def</span> <span class="nf">ode</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span> <span class="n">b</span> <span class="o">=</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E1</span> <span class="o">*</span> <span class="n">t1</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1.</span> <span class="o">/</span> <span class="p">(</span><span class="n">E2</span> <span class="o">*</span> <span class="n">t2</span><span class="p">))</span> <span class="o">/</span> <span class="n">ta</span> <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">row_stack</span><span class="p">((</span> <span class="n">calc_tau_vec</span><span class="p">(</span><span 
class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="p">:])</span> <span class="o">*</span> <span class="n">b</span><span class="p">,</span> <span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="p">:]</span> <span class="p">))</span> <span class="k">def</span> <span class="nf">bc</span><span class="p">(</span><span class="n">ya</span><span class="p">,</span> <span class="n">yb</span><span class="p">):</span> <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span> <span class="n">ya</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mf">1.</span> <span class="o">/</span> <span class="n">ta</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">P</span> <span class="o">/</span> <span class="n">w</span> <span class="o">/</span> <span class="p">(</span><span class="n">E2</span> <span class="o">*</span> <span class="n">t2</span><span class="p">)),</span> <span class="n">yb</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mf">1.</span> <span class="o">/</span> <span class="n">ta</span> <span class="o">*</span> <span class="p">(</span><span class="n">P</span> <span class="o">/</span> <span class="n">w</span> <span class="o">/</span> <span class="p">(</span><span class="n">E1</span> <span class="o">*</span> <span class="n">t1</span><span class="p">))</span> <span class="p">])</span> <span class="n">res</span> <span class="o">=</span> <span class="n">scipy</span><span class="o">.</span><span class="n">integrate</span><span class="o">.</span><span class="n">solve_bvp</span><span class="p">(</span> <span class="n">ode</span><span class="p">,</span> <span class="n">bc</span><span class="p">,</span> <span 
class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">L</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span> <span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span> <span class="mi">50</span><span class="p">))</span> <span class="p">)</span> <span class="n">x</span> <span class="o">=</span> <span class="n">res</span><span class="o">.</span><span class="n">x</span> <span class="n">gamma</span> <span class="o">=</span> <span class="n">res</span><span class="o">.</span><span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">,:]</span> <span class="n">tau</span> <span class="o">=</span> <span class="n">calc_tau_vec</span><span class="p">(</span><span class="n">gamma</span><span class="p">)</span> <span class="k">return</span> <span class="n">x</span><span class="p">,</span> <span class="n">gamma</span><span class="p">,</span> <span class="n">tau</span> </code></pre></div> <p>First, we&#8217;ll keep all of the parameters constant except that we&#8217;ll vary the load. This will show us how the stress distribution changes as we increase the load. The results aren&#8217;t surprising. At low loads, the joint is fully elastic. As the load is increased, the adhesive at the ends of the overlap starts to yield. As the load is increased further, the yielded area grows and the &#8220;trough&#8221; gets shallower. Finally, the joint becomes fully plastic. 
At this point, the joint would surely fail, but since our model doesn&#8217;t check for failure, we don&#8217;t see&nbsp;this.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">for</span> <span class="n">Pi</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">1750</span><span class="p">,</span> <span class="mi">2950</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">4</span><span class="p">):</span> <span class="n">x_i</span><span class="p">,</span> <span class="n">gamma_i</span><span class="p">,</span> <span class="n">tau_i</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span> <span class="n">t1</span><span class="o">=</span><span class="n">t1</span><span class="p">,</span> <span class="n">t2</span><span class="o">=</span><span class="n">t2</span><span class="p">,</span> <span class="n">ta</span><span class="o">=</span><span class="n">ta</span><span class="p">,</span> <span class="n">L</span><span class="o">=</span><span class="n">L</span><span class="p">,</span> <span class="n">P</span><span class="o">=</span><span class="n">Pi</span> <span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_i</span><span class="p">,</span> <span class="n">tau_i</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s2">&quot;P=</span><span class="si">{</span><span class="n">Pi</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Shear Stress With Various Loads&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span 
class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-14-output-1.png"></p> <p>Next, we&#8217;ll see what happens when we change the thickness of the upper adherend. In this example, the lower adherend has a thickness of $t_2=0.063$ and we vary the thickness of the upper adherend ($t_1$) from half this thickness to four times this thickness. As we can see, this changes the length of the two plastic zones: in the extreme case of $t_1=0.250$, there is no plastic zone on the right because the adherend carrying the load at the right end of the joint is so&nbsp;stiff.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">for</span> <span class="n">t1_i</span> <span class="ow">in</span> <span class="p">[</span><span class="mf">0.032</span><span class="p">,</span> <span class="mf">0.063</span><span class="p">,</span> <span class="mf">.125</span><span class="p">,</span> <span class="mf">.250</span><span class="p">]:</span> <span class="n">x_i</span><span class="p">,</span> <span class="n">gamma_i</span><span class="p">,</span> <span class="n">tau_i</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span> <span class="n">t1</span><span class="o">=</span><span class="n">t1_i</span><span class="p">,</span> <span class="n">t2</span><span class="o">=</span><span class="n">t2</span><span
class="p">,</span> <span class="n">ta</span><span class="o">=</span><span class="n">ta</span><span class="p">,</span> <span class="n">L</span><span class="o">=</span><span class="n">L</span><span class="p">,</span> <span class="n">P</span><span class="o">=</span><span class="n">P</span> <span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_i</span><span class="p">,</span> <span class="n">tau_i</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s2">&quot;t1=</span><span class="si">{</span><span class="n">t1_i</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Various Upper Adherend Thicknesses&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-15-output-1.png"></p> <p>Finally, we&#8217;ll see the effect of changing the overlap length. 
This time, we&#8217;re going to vary the overlap length $L$ and keep the <em>average shear stress</em> ($P/A$)&nbsp;constant.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">for</span> <span class="n">Li</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">3</span><span class="p">):</span> <span class="n">x_i</span><span class="p">,</span> <span class="n">gamma_i</span><span class="p">,</span> <span class="n">tau_i</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span> <span class="n">t1</span><span class="o">=</span><span class="n">t1</span><span class="p">,</span> <span class="n">t2</span><span class="o">=</span><span class="n">t2</span><span class="p">,</span> <span class="n">ta</span><span class="o">=</span><span class="n">ta</span><span class="p">,</span> <span class="n">L</span><span class="o">=</span><span class="n">Li</span><span class="p">,</span> <span class="n">P</span><span class="o">=</span><span class="mi">5600</span><span class="o">*</span><span class="n">Li</span> <span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_i</span><span class="p">,</span> <span class="n">tau_i</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s2">&quot;L=</span><span class="si">{</span><span class="n">Li</span><span class="si">}</span><span class="s2">, P=</span><span class="si">{</span><span class="mi">5600</span><span class="o">*</span><span class="n">Li</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span
class="n">title</span><span class="p">(</span><span class="s2">&quot;Various Lap Lengths, Constant$P/A$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Stress,$</span><span class="se">\\</span><span class="s2">tau$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-16-output-1.png"></p> <p>Here, we see that for all three overlap lengths considered, the adhesive at the ends of the lap is plastic and that there&#8217;s an elastic &#8220;trough&#8221; in the middle of each joint. 
At this point, we might be tempted to declare that all of the joints are able to carry at least the same average shear stress ($P/A$), but before we do so, let&#8217;s look at the shear strain in the adhesive layer for each of these&nbsp;cases.</p> <div class="cell-code highlight"><pre><span></span><code><span class="k">for</span> <span class="n">Li</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">,</span> <span class="n">num</span><span class="o">=</span><span class="mi">3</span><span class="p">):</span> <span class="n">x_i</span><span class="p">,</span> <span class="n">gamma_i</span><span class="p">,</span> <span class="n">tau_i</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span> <span class="n">t1</span><span class="o">=</span><span class="n">t1</span><span class="p">,</span> <span class="n">t2</span><span class="o">=</span><span class="n">t2</span><span class="p">,</span> <span class="n">ta</span><span class="o">=</span><span class="n">ta</span><span class="p">,</span> <span class="n">L</span><span class="o">=</span><span class="n">Li</span><span class="p">,</span> <span class="n">P</span><span class="o">=</span><span class="mi">5600</span><span class="o">*</span><span class="n">Li</span> <span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_i</span><span class="p">,</span> <span class="n">gamma_i</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s2">&quot;L=</span><span class="si">{</span><span class="n">Li</span><span class="si">}</span><span class="s2">, P=</span><span class="si">{</span><span class="mi">5600</span><span class="o">*</span><span class="n">Li</span><span 
class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Various Lap Lengths, Constant$P/A$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Shear Strain,$</span><span class="se">\\</span><span class="s2">gamma$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;$x$&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="" src="https://www.kloppenborg.ca/2022/06/bonded-joint-shear/bonded-joint-shear_files/figure-commonmark_x/cell-17-output-1.png"></p> <p>Here we see that the shear strain in the adhesive at the ends of the longest joint is almost 0.9. Think about what that means: the &#8220;top&#8221; of the adhesive layer has moved sideways relative to the &#8220;bottom&#8221; of the layer by an amount almost equal to the thickness of the layer. In other words, that&#8217;s a <strong>huge</strong> amount of&nbsp;strain.</p> <p>The ultimate shear strain is going to depend on the type of adhesive we&#8217;re using, as well as the environmental conditions (temperature, moisture content, etc.). For a lot of adhesives, the ultimate strain is going to be somewhere in the range of $0.2$ to $0.6$. So, in the three examples shown here, the first overlap length ($L=0.5$) can probably carry this value of $P/A$, the second overlap length ($L=1.0$) might be able to carry it, but the third overlap length ($L=1.5$) almost certainly will fail.
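To make that sizing logic concrete, here is an illustrative check that is not part of the original analysis: the peak strains below are rough values read off the plot, and the classification thresholds are simply the endpoints of the $0.2$&ndash;$0.6$ ultimate-strain range quoted above.

```python
# Illustrative only: approximate peak adhesive shear strains read off the
# plot above, keyed by overlap length. These are assumed values, not
# outputs of the joint model.
peak_strain = {0.5: 0.15, 1.0: 0.45, 1.5: 0.90}

def classify(gamma, lo=0.2, hi=0.6):
    """Compare a peak shear strain against an assumed 0.2-0.6 ultimate range."""
    if gamma < lo:
        return "probably OK"
    if gamma > hi:
        return "almost certainly fails"
    return "depends on the adhesive"

verdict = {L: classify(g) for L, g in peak_strain.items()}
print(verdict)
```

The point of the sketch is that the verdict changes with overlap length even though $P/A$ is identical for all three joints.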
This is the reason that you can&#8217;t use the average shear stress ($P/A$) to size lap&nbsp;joints.</p> <p>If you want to play around with this model, I&#8217;ve created a <a href="https://www.kloppenborg.ca/adhesive-lap-no-peel/">widget</a> that implements this&nbsp;model.</p>Violin Bow Stiffness2022-06-15T00:00:00-04:002022-06-15T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2022-06-15:/2022/06/bow-stiffness/<p>I&#8217;ve made a few violin bows and a couple cello bows. I&#8217;m very much a novice bow maker, but I&#8217;m learning. As I&#8217;m an engineer, I&#8217;m naturally trying to apply engineering principles to bow making, which isn&#8217;t necessarily easy since violin bows are actually …</p><p>I&#8217;ve made a few violin bows and a couple cello bows. I&#8217;m very much a novice bow maker, but I&#8217;m learning. As I&#8217;m an engineer, I&#8217;m naturally trying to apply engineering principles to bow making, which isn&#8217;t necessarily easy since violin bows are actually very complex, despite looking quite&nbsp;simple.</p> <p>The stiffness of a bow affects what the player is able to do with it. If a bow is too stiff, it becomes nearly unplayable; if it&#8217;s too soft, the player can&#8217;t apply much force to the string before the stick bottoms out and contacts the string (normally the hair of the bow contacts the string). The stiffness affects how much camber the bow maker must add to the stick. The wrong combination of stiffness and camber can lead to a torsional-bending buckling mode, which will make the bow unplayable. The mass and mass distribution of the bow have a large effect on playability. The aesthetics of the bow are also important. As I said, a bow is quite&nbsp;complex.</p> <p>The &#8220;standard&#8221; wood for making violin bows has been pernambuco for the past 250 years. However, the tree that produces this wood is endangered and hence this wood is difficult to obtain.
I&#8217;ve been making bows out of other types of wood &#8212; mostly ipe and snakewood. In order for a bow made from ipe to have the same stiffness as a bow made from pernambuco, the dimensions need to be altered. Hence, a good understanding of the relationship between the taper of the stick and the resulting stiffness is&nbsp;important.</p> <h1>Taper</h1> <p>Henry Saint-George provides a procedure for calculating the taper of a bow based on measurements of Tourte bows (<a href='#Saint-George_1896' id='ref-Saint-George_1896-1'> SaintGeorge (1896) </a>). In this procedure, the bow is divided into 12 (unequal) segments. Referring to the figure below (reproduced from Saint-George&#8217;s book), line <span class="caps">AC</span> is constructed perpendicular to the bow with a length of 110 mm. A second line <span class="caps">BD</span> is constructed perpendicular to the stick at the other end. Saint-George indicates that the line <span class="caps">BD</span> is 22 mm when the total length (<span class="caps">AB</span>) is 700 mm. A compass is used to draw the arc Ce. A line perpendicular to the stick is then constructed starting from point e and terminating at the line <span class="caps">CD</span>. The compass is re-set to draw the arc fg and the process is repeated. The points A, e, g, i, k, etc. are the points at which the diameter of the bow is set. At points A and e, the diameters are set equal to one another. At points y and B, they are equal to another fixed value. The diameters at the remaining points are each decremented by a fixed value. But, since those points are not uniformly spaced, the taper is not linear, but instead accelerates along the length of the&nbsp;stick.</p> <p><img alt="TaperProcedure" src="https://www.kloppenborg.ca/2022/06/bow-stiffness/images/TaperProcedure.png"></p> <p>This procedure seems quite complicated. However, the keen reader might recognize that the points along the stick form a geometric series.
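To see the geometric-series property concretely, here is a quick standard-library-only sketch (illustrative; the calculation below uses <code>scipy</code> instead). It solves for the common ratio by bisection and confirms that twelve geometrically shrinking segments, starting from the 110 mm segment, add up to the 700 mm bow length.

```python
# Segments C, C*r, C*r**2, ..., C*r**11 must sum to the total bow length.
# Solve for the common ratio r by bisection on the monotone series sum.
C, total_length = 110.0, 700.0

def series_sum(r, n=12):
    return C * (r**n - 1) / (r - 1)

lo, hi = 0.5, 0.99  # series_sum(0.5) < 700 < series_sum(0.99)
for _ in range(60):
    mid = (lo + hi) / 2
    if series_sum(mid) < total_length:
        lo = mid
    else:
        hi = mid
r = (lo + hi) / 2

segments = [C * r**k for k in range(12)]
print(round(r, 6), round(sum(segments), 3))
```

Each segment is a fixed fraction $r$ of the previous one, which is exactly what the repeated compass construction produces.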
The keen reader may also recognize that the values 22 mm and 700 mm cannot both be taken as fixed: if you change the length of the bow (which affects the slope of the line <span class="caps">CD</span>), you also need to change the length of line <span class="caps">BD</span>, otherwise the procedure described above will not produce the correct overall&nbsp;length.</p> <p>The sum of the lengths of these 12 segments is given&nbsp;by:</p> <p>$$L = \sum_{k=0}^{11} C r^k = C\left(\frac{r^{12}-1}{r-1}\right) $$</p> <p>Here, the value $C$ is selected as 110 mm and the value of $r$ needs to be found based on the chosen value of $L$. This can be done numerically in Python. The following code does that, then computes the points and the diameters of the&nbsp;bow:</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">scipy.optimize</span> <span class="n">length</span> <span class="o">=</span> <span class="mf">700.</span> <span class="n">length_constant</span> <span class="o">=</span> <span class="mf">110.</span> <span class="n">d_butt</span> <span class="o">=</span> <span class="mf">8.6</span> <span class="n">d_head</span> <span class="o">=</span> <span class="mf">5.6</span> <span class="n">r</span> <span class="o">=</span> <span class="n">scipy</span><span class="o">.</span><span class="n">optimize</span><span class="o">.</span><span class="n">root</span><span class="p">(</span> <span class="k">lambda</span> <span class="n">r</span><span class="p">:</span> <span class="n">length_constant</span> <span class="o">*</span> <span class="p">(</span><span class="n">r</span><span class="o">**</span><span class="mi">12</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">r</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">-</span> <span class="n">length</span><span class="p">,</span> <span
class="mi">22</span> <span class="p">)</span><span class="o">.</span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Found r = </span><span class="si">{</span><span class="n">r</span><span class="si">}</span><span class="se">\n</span><span class="s2">&quot;</span><span class="p">)</span> <span class="n">x_points</span> <span class="o">=</span> <span class="p">[</span><span class="mf">0.</span><span class="p">]</span> <span class="o">*</span> <span class="mi">13</span> <span class="n">d_points</span> <span class="o">=</span> <span class="p">[</span><span class="mf">0.</span><span class="p">]</span> <span class="o">*</span> <span class="mi">13</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">13</span><span class="p">):</span> <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span> <span class="n">d_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">d_butt</span> <span class="k">else</span><span class="p">:</span> <span class="n">x_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">length_constant</span> <span class="o">*</span> <span class="p">(</span><span class="n">r</span><span class="o">**</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">r</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="n">d_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span 
class="n">d_butt</span> <span class="o">+</span> <span class="p">(</span><span class="n">d_head</span> <span class="o">-</span> <span class="n">d_butt</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="mf">1.</span><span class="p">)</span> <span class="o">/</span> <span class="mf">10.</span> <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="mi">12</span><span class="p">:</span> <span class="n">d_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">d_head</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;x = </span><span class="si">{</span><span class="n">x_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">:</span><span class="s2">.1f</span><span class="si">}</span><span class="s2">, d = </span><span class="si">{</span><span class="n">d_points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>Found r = 0.8741349707251251 x = 0.0, d = 8.60 x = 110.0, d = 8.60 x = 206.2, d = 8.30 x = 290.2, d = 8.00 x = 363.7, d = 7.70 x = 427.9, d = 7.40 x = 484.0, d = 7.10 x = 533.1, d = 6.80 x = 576.0, d = 6.50 x = 613.5, d = 6.20 x = 646.3, d = 5.90 x = 675.0, d = 5.60 x = 700.0, d = 5.60 </code></pre></div> <p>We can plot the diameter of the&nbsp;stick:</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span 
class="n">x_points</span><span class="p">,</span> <span class="n">d_points</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Bow Diameter&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="cell-3-output-1" src="https://www.kloppenborg.ca/2022/06/bow-stiffness/bow-stiffness_files/figure-commonmark_x/cell-3-output-1.png"></p> <h1>Stiffness</h1> <h2>Section&nbsp;Properties</h2> <p>Bows are either (approximately) round or octagonal in cross-section. The area moment of inertia of each of these are as follows&nbsp;(<a href='#MachinerysHandbook' id='ref-MachinerysHandbook-1'> Oberg et al. (2000) </a>):</p> <table> <thead> <tr> <th>Shape</th> <th>Area Moment of Inertia</th> </tr> </thead> <tbody> <tr> <td>Circle</td> <td>$\frac{\pi d^4}{64} = 0.0490874 d^4$</td> </tr> <tr> <td>Octagon</td> <td>$\frac{2 d^2 \tan\frac{\pi}{8}}{12}\left[\frac{d^2 \left(1 + 2 \cos^2\frac{\pi}{8}\right)}{4\cos^2\frac{\pi}{8}}\right] = 0.0547379 d^4$</td> </tr> </tbody> </table> <p>Of course, when determining the stiffness of the bow, the modulus of elasticity also needs to be known. From my research, the modulus of elasticity of pernambuco is about 30 GPa. From my measurements, the modulus of elasticity of ipe is about 20&nbsp;GPa.</p> <h2>Finite Element&nbsp;Method</h2> <p>In order to determine the stiffness of the stick, we&#8217;ll use the finite element method with tapered beam elements. This analysis will be done in two dimensions. We&#8217;ll define a node at each of the <code>x</code> points found in the previous calculation of bow taper with a tapered beam element connecting adjacent nodes. 
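Before building the elements, it is worth double-checking the section-property constants in the table above. The following quick check is illustrative and not part of the original analysis; it also computes the resulting $\alpha$ in $EI = \alpha d^4$ for the two moduli quoted, assuming $E$ in MPa and $d$ in mm.

```python
import math

# Verify the area-moment-of-inertia coefficients quoted in the table:
# I = coefficient * d**4, with d the diameter (circle) or width across
# flats (octagon).
circle_coeff = math.pi / 64

tan8 = math.tan(math.pi / 8)
cos8_sq = math.cos(math.pi / 8) ** 2
octagon_coeff = (2 * tan8 / 12) * ((1 + 2 * cos8_sq) / (4 * cos8_sq))

print(f"circle:  {circle_coeff:.7f}")   # table quotes 0.0490874
print(f"octagon: {octagon_coeff:.7f}")  # table quotes 0.0547379

# alpha = coefficient * E, so EI = alpha * d**4 (assumed units: MPa, mm).
alpha_pernambuco = octagon_coeff * 30e3  # E ~ 30 GPa
alpha_ipe = octagon_coeff * 20e3         # E ~ 20 GPa
print(round(alpha_pernambuco, 1), round(alpha_ipe, 1))
```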
The diameter (or width across flats in the case of an octagonal cross-section) is known at each of the nodes. Our model will assume that the variation in the diameter is linear between&nbsp;nodes.</p> <p>The following derivation is based on Chapter 3 from <a href='#CookFEA' id='ref-CookFEA-1'> Cook et al. (2001) </a>, but differs since the elements are tapered beams instead of constant section&nbsp;beams.</p> <p>Each node will have two degrees of freedom: a transverse displacement and a rotation. The degrees of freedom associated with a single element (which connects two nodes) are&nbsp;thus:</p> <p>$$[d] = \left[ \begin{matrix} \nu_1 &amp; \theta_1 &amp; \nu_2 &amp; \theta_2 \end{matrix} \right] $$</p> <p>Some of the algebra that we&#8217;ll use in the following derivation gets a bit tedious, so we&#8217;ll use the symbolic mathematics package <code>sympy</code> to help&nbsp;us:</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sympy</span> <span class="c1"># Due to the way that my blogging platform works, we need to</span> <span class="c1"># define a new function for printing symbolic math:</span> <span class="k">def</span> <span class="nf">sym_print</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;$$</span><span class="si">{}</span><span class="s1">$$&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sympy</span><span class="o">.</span><span class="n">printing</span><span class="o">.</span><span class="n">latex</span><span class="p">(</span><span class="n">x</span><span class="p">)))</span> </code></pre></div> <p>The matrix of shape-function second derivatives for our element, $[B]$, is a function of the element length $L$ and the position along the element $x$, and is given&nbsp;by:</p> <div class="cell-code
highlight"><pre><span></span><code><span class="n">L</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;L&quot;</span><span class="p">)</span> <span class="n">x</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">)</span> <span class="n">B</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">Matrix</span><span class="p">([[</span> <span class="o">-</span><span class="mi">6</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">12</span> <span class="o">*</span> <span class="n">x</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span> <span class="o">/</span> <span class="n">L</span> <span class="o">+</span> <span class="mi">6</span> <span class="o">*</span> <span class="n">x</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="p">,</span> <span class="mi">6</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">12</span> <span class="o">*</span> <span class="n">x</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="p">,</span> <span class="o">-</span><span class="mi">2</span> <span class="o">/</span> <span class="n">L</span> <span class="o">+</span> <span class="mi">6</span> <span class="o">*</span> <span class="n">x</span> <span class="o">/</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span> <span 
class="p">]])</span> <span class="n">sym_print</span><span class="p">(</span><span class="n">B</span><span class="p">)</span> </code></pre></div> <p>$$\left[\begin{matrix}- \frac{6}{L^{2}} + \frac{12 x}{L^{3}} &amp; - \frac{4}{L} + \frac{6 x}{L^{2}} &amp; \frac{6}{L^{2}} - \frac{12 x}{L^{3}} &amp; - \frac{2}{L} + \frac{6 x}{L^{2}}\end{matrix}\right]$$</p> <p>For the purpose of stiffness calculations, we&#8217;re idealizing the taper of the bow so that within each element the taper is linear. This means that the diameter of the stick at the point $x$ is given by the following. Note that in this section, $x$ and $L$ refer to the distance along the element and the length of the element, respectively, rather than the dimensions of the&nbsp;bow.</p> <p>$$d = d_1 + \frac{x}{L}\left(d_2 - d_1\right) $$</p> <p>where $d_1$ and $d_2$ are the diameters at nodes 1 and 2, respectively. So that we don&#8217;t have to carry around so many variables, we&#8217;ll define the variable $\beta$ such&nbsp;that:</p> <p>$$d = d_1 + \beta x $$</p> <p>As we found earlier, for both circular sections and octagonal sections, the moment of inertia ($I$) is a function of $d^4$.
We&#8217;ll define a new variable $\alpha$ such&nbsp;that:</p> <p>$$EI = \alpha d^4 $$</p> <p>Combining the previous two equations and entering this into <code>sympy</code>, we&nbsp;get:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">alpha</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;</span><span class="se">\\</span><span class="s2">alpha&quot;</span><span class="p">)</span> <span class="n">d1</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;d_1&quot;</span><span class="p">)</span> <span class="n">beta</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">var</span><span class="p">(</span><span class="s2">&quot;</span><span class="se">\\</span><span class="s2">beta&quot;</span><span class="p">)</span> <span class="n">EI</span> <span class="o">=</span> <span class="n">alpha</span> <span class="o">*</span> <span class="p">(</span><span class="n">d1</span> <span class="o">+</span> <span class="n">beta</span> <span class="o">*</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">4</span> <span class="n">sym_print</span><span class="p">(</span><span class="n">EI</span><span class="p">)</span> </code></pre></div> <p>$$\alpha \left(\beta x + d_{1}\right)^{4}$$</p> <p>The stiffness matrix for the element is given&nbsp;by:</p> <p>$$[k] = \int_0^L \left[B\right]^T EI \left[B\right] dx $$</p> <p>Solving and simplifying this using <code>sympy</code>, we get the following.
The stiffness matrix is a 4x4 matrix that is quite complex, so we&#8217;ll show one column at a time in this&nbsp;post:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">k</span> <span class="o">=</span> <span class="n">sympy</span><span class="o">.</span><span class="n">simplify</span><span class="p">(</span> <span class="n">sympy</span><span class="o">.</span><span class="n">integrate</span><span class="p">(</span><span class="n">B</span><span class="o">.</span><span class="n">T</span> <span class="o">*</span> <span class="n">EI</span> <span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">L</span><span class="p">))</span> <span class="p">)</span> </code></pre></div> <div class="cell-code highlight"><pre><span></span><code><span class="c1"># The first column</span> <span class="n">sym_print</span><span class="p">(</span><span class="n">k</span><span class="p">[:,</span><span class="mi">0</span><span class="p">])</span> </code></pre></div> <p>$$\left[\begin{matrix}\frac{12 \alpha \left(11 L^{4} \beta^{4} + 49 L^{3} \beta^{3} d_{1} + 84 L^{2} \beta^{2} d_{1}^{2} + 70 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L^{3}} \\ \frac{2 \alpha \left(19 L^{4} \beta^{4} + 84 L^{3} \beta^{3} d_{1} + 147 L^{2} \beta^{2} d_{1}^{2} + 140 L \beta d_{1}^{3} + 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{12 \alpha \left(- 11 L^{4} \beta^{4} - 49 L^{3} \beta^{3} d_{1} - 84 L^{2} \beta^{2} d_{1}^{2} - 70 L \beta d_{1}^{3} - 35 d_{1}^{4}\right)}{35 L^{3}} \\ \frac{2 \alpha \left(47 L^{4} \beta^{4} + 210 L^{3} \beta^{3} d_{1} + 357 L^{2} \beta^{2} d_{1}^{2} + 280 L \beta d_{1}^{3} + 105 d_{1}^{4}\right)}{35 L^{2}}\end{matrix}\right]$$</p> <div class="cell-code highlight"><pre><span></span><code><span class="c1"># The second column</span> <span class="n">sym_print</span><span class="p">(</span><span
class="n">k</span><span class="p">[:,</span><span class="mi">1</span><span class="p">])</span> </code></pre></div> <p>$$\left[\begin{matrix}\frac{2 \alpha \left(19 L^{4} \beta^{4} + 84 L^{3} \beta^{3} d_{1} + 147 L^{2} \beta^{2} d_{1}^{2} + 140 L \beta d_{1}^{3} + 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{4 \alpha \left(3 L^{4} \beta^{4} + 14 L^{3} \beta^{3} d_{1} + 28 L^{2} \beta^{2} d_{1}^{2} + 35 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L} \\ \frac{2 \alpha \left(- 19 L^{4} \beta^{4} - 84 L^{3} \beta^{3} d_{1} - 147 L^{2} \beta^{2} d_{1}^{2} - 140 L \beta d_{1}^{3} - 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{2 \alpha \left(13 L^{4} \beta^{4} + 56 L^{3} \beta^{3} d_{1} + 91 L^{2} \beta^{2} d_{1}^{2} + 70 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L}\end{matrix}\right]$$</p> <div class="cell-code highlight"><pre><span></span><code><span class="c1"># The third column</span> <span class="n">sym_print</span><span class="p">(</span><span class="n">k</span><span class="p">[:,</span><span class="mi">2</span><span class="p">])</span> </code></pre></div> <p>$$\left[\begin{matrix}\frac{12 \alpha \left(- 11 L^{4} \beta^{4} - 49 L^{3} \beta^{3} d_{1} - 84 L^{2} \beta^{2} d_{1}^{2} - 70 L \beta d_{1}^{3} - 35 d_{1}^{4}\right)}{35 L^{3}} \\ \frac{2 \alpha \left(- 19 L^{4} \beta^{4} - 84 L^{3} \beta^{3} d_{1} - 147 L^{2} \beta^{2} d_{1}^{2} - 140 L \beta d_{1}^{3} - 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{12 \alpha \left(11 L^{4} \beta^{4} + 49 L^{3} \beta^{3} d_{1} + 84 L^{2} \beta^{2} d_{1}^{2} + 70 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L^{3}} \\ \frac{2 \alpha \left(- 47 L^{4} \beta^{4} - 210 L^{3} \beta^{3} d_{1} - 357 L^{2} \beta^{2} d_{1}^{2} - 280 L \beta d_{1}^{3} - 105 d_{1}^{4}\right)}{35 L^{2}}\end{matrix}\right]$$</p> <div class="cell-code highlight"><pre><span></span><code><span class="c1"># The fourth column</span> <span class="n">sym_print</span><span class="p">(</span><span
class="mi">3</span><span class="p">])</span> </code></pre></div> <p>$$\left[\begin{matrix}\frac{2 \alpha \left(47 L^{4} \beta^{4} + 210 L^{3} \beta^{3} d_{1} + 357 L^{2} \beta^{2} d_{1}^{2} + 280 L \beta d_{1}^{3} + 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{2 \alpha \left(13 L^{4} \beta^{4} + 56 L^{3} \beta^{3} d_{1} + 91 L^{2} \beta^{2} d_{1}^{2} + 70 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L} \\ \frac{2 \alpha \left(- 47 L^{4} \beta^{4} - 210 L^{3} \beta^{3} d_{1} - 357 L^{2} \beta^{2} d_{1}^{2} - 280 L \beta d_{1}^{3} - 105 d_{1}^{4}\right)}{35 L^{2}} \\ \frac{4 \alpha \left(17 L^{4} \beta^{4} + 77 L^{3} \beta^{3} d_{1} + 133 L^{2} \beta^{2} d_{1}^{2} + 105 L \beta d_{1}^{3} + 35 d_{1}^{4}\right)}{35 L}\end{matrix}\right]$$</p> <p>We can now write a function that outputs the stiffness matrix for a tapered beam&nbsp;element:</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span> </code></pre></div> <div class="cell-code highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">elm_k</span><span class="p">(</span><span class="n">L</span><span class="p">,</span> <span class="n">d1</span><span class="p">,</span> <span class="n">d2</span><span class="p">,</span> <span class="n">alpha</span><span class="p">):</span> <span class="n">b</span> <span class="o">=</span> <span class="p">(</span><span class="n">d2</span> <span class="o">-</span> <span class="n">d1</span><span class="p">)</span> <span class="o">/</span> <span class="n">L</span> <span class="k">return</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">alpha</span> <span class="o">/</span> <span class="p">(</span><span class="mi">35</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="p">)</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span
class="n">array</span><span class="p">(</span> <span class="p">[</span> <span class="p">[</span> <span class="mi">6</span><span class="o">*</span><span class="p">(</span><span class="mi">11</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">49</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="mi">19</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span 
class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">147</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">140</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="mi">6</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">11</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">49</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">-</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span 
class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="mi">47</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">210</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">357</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">280</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">)</span> <span class="p">],</span> <span class="p">[</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="mi">19</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span 
class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">147</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">140</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="mi">2</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="p">(</span><span class="mi">3</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">14</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">28</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span 
class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">19</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">-</span> <span class="mi">147</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">140</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span><span 
class="o">*</span><span class="p">(</span><span class="mi">13</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">56</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">91</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">)</span> <span class="p">],</span> <span class="p">[</span> <span class="mi">6</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">11</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">49</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> 
<span class="o">-</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">19</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">-</span> <span class="mi">147</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">140</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span 
class="o">-</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="mi">6</span><span class="o">*</span><span class="p">(</span><span class="mi">11</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">49</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">84</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">47</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">210</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span 
class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">-</span> <span class="mi">357</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">280</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">)</span> <span class="p">],</span> <span class="p">[</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="mi">47</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">210</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">357</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">280</span><span 
class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="p">(</span><span class="mi">13</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">56</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">91</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">70</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="n">L</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="mi">47</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span 
class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">-</span> <span class="mi">210</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">-</span> <span class="mi">357</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">280</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="mi">105</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">),</span> <span class="mi">2</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="p">(</span><span class="mi">17</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">4</span> <span class="o">+</span> <span class="mi">77</span><span class="o">*</span><span class="n">L</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">3</span><span class="o">*</span><span class="n">d1</span> <span class="o">+</span> <span class="mi">133</span><span class="o">*</span><span class="n">L</span><span 
class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">b</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="mi">105</span><span class="o">*</span><span class="n">L</span><span class="o">*</span><span class="n">b</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">3</span> <span class="o">+</span> <span class="mi">35</span><span class="o">*</span><span class="n">d1</span><span class="o">**</span><span class="mi">4</span><span class="p">)</span> <span class="p">]</span> <span class="p">]</span> <span class="p">)</span> </code></pre></div> <h2>Stroup&nbsp;Test</h2> <p>The Stroup Test is a way of measuring the stiffness of a bow stick. In this test, the bow is mounted in a jig that supports the stick on two rollers 575 mm apart. A transverse force of 2 lb (about 8.9 N) is applied mid-way between the two rollers, and the deflection at the load application point is measured. From what I can tell, a small number of people advocated this test some time ago, but it has since become quite uncommon; most makers assess the stiffness of a stick by feel. However, the Stroup Test is easy to simulate with the finite element method, which lets us compare the relative stiffness of sticks made from different materials and with different&nbsp;dimensions.</p> <h2>Implementing the Stroup&nbsp;Test</h2> <p>We already have a list of nodal locations. We&#8217;ll choose one of these nodes as the location of one of the supports (we&#8217;ll use the second-last node for this). We also need nodes at the correct locations for the other support and for the load application point. Since these locations will generally fall inside existing elements, we&#8217;ll need to create these nodes and sub-divide the existing elements. 
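</p>

<p>Before building the full model, it helps to have a rough idea of the answer. For a simply supported beam with a central load, the mid-span deflection has the closed-form solution P L<sup>3</sup> / (48 E I). The sketch below assumes a uniform round stick; the 8&nbsp;mm diameter is a made-up placeholder, and a real stick is tapered, which is why we need the finite element model at all:</p>

```python
import math

def stroup_deflection_uniform(P, L, E, d):
    """Mid-span deflection of a simply supported, constant-diameter round
    beam under a central transverse load: delta = P * L**3 / (48 * E * I)."""
    I = math.pi * d ** 4 / 64  # second moment of area of a circular section
    return P * L ** 3 / (48 * E * I)

# Units are N, mm and MPa (N/mm^2); 8.8964 N is 2 lbf.
# The 8 mm diameter is a placeholder, not a measured value.
delta = stroup_deflection_uniform(P=8.8964, L=575.0, E=30e3, d=8.0)
print(f"{delta:.1f} mm")  # prints "5.8 mm"
```

<p>The tapered finite element model should produce a deflection of the same order of magnitude, which gives us a sanity check on the&nbsp;result.</p>

<p>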
We can do this in Python as&nbsp;follows:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">x_nodes</span> <span class="o">=</span> <span class="p">[]</span> <span class="n">d_nodes</span> <span class="o">=</span> <span class="p">[]</span> <span class="n">x_s2</span> <span class="o">=</span> <span class="n">x_points</span><span class="p">[</span><span class="mi">11</span><span class="p">]</span> <span class="n">x_s1</span> <span class="o">=</span> <span class="n">x_s2</span> <span class="o">-</span> <span class="mi">575</span> <span class="n">x_l</span> <span class="o">=</span> <span class="n">x_s2</span> <span class="o">-</span> <span class="mi">575</span> <span class="o">/</span> <span class="mi">2</span> <span class="n">nid_s1</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="c1"># storage for node ID of support #1</span> <span class="n">nid_s2</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="c1"># storage for node ID of support #2</span> <span class="n">nid_l</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="c1"># storage for node ID of load application</span> <span class="n">tol</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">xa</span><span class="p">,</span> <span class="n">xb</span><span class="p">:</span> <span class="nb">abs</span><span class="p">(</span><span class="n">xa</span> <span class="o">-</span> <span class="n">xb</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mf">1e-3</span> <span class="n">inside</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">,</span> <span class="n">xa</span><span class="p">,</span> <span class="n">xb</span><span class="p">:</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">xa</span><span class="p">)</span> <span 
class="o">*</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">xb</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">d1</span><span class="p">,</span> <span class="n">d2</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span> <span class="n">x_points</span><span class="p">,</span> <span class="n">x_points</span><span class="p">[</span><span class="mi">1</span><span class="p">:],</span> <span class="n">d_points</span><span class="p">,</span> <span class="n">d_points</span><span class="p">[</span><span class="mi">1</span><span class="p">:]):</span> <span class="n">x_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x1</span><span class="p">)</span> <span class="n">d_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">d1</span><span class="p">)</span> <span class="k">if</span> <span class="n">tol</span><span class="p">(</span><span class="n">x_s1</span><span class="p">,</span> <span class="n">x1</span><span class="p">):</span> <span class="n">nid_s1</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">elif</span> <span class="n">inside</span><span class="p">(</span><span class="n">x_s1</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">):</span> <span class="n">x_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x_s1</span><span class="p">)</span> <span class="n">d_nodes</span><span class="o">.</span><span class="n">append</span><span 
class="p">(</span><span class="n">d1</span> <span class="o">+</span> <span class="p">(</span><span class="n">x_s1</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">x2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">d2</span> <span class="o">-</span> <span class="n">d1</span><span class="p">))</span> <span class="n">nid_s1</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">tol</span><span class="p">(</span><span class="n">x_s2</span><span class="p">,</span> <span class="n">x1</span><span class="p">):</span> <span class="n">nid_s2</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">elif</span> <span class="n">inside</span><span class="p">(</span><span class="n">x_s2</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">):</span> <span class="n">x_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x_s2</span><span class="p">)</span> <span class="n">d_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">d1</span> <span class="o">+</span> <span class="p">(</span><span class="n">x_s2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">x2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span 
class="n">d2</span> <span class="o">-</span> <span class="n">d1</span><span class="p">))</span> <span class="n">nid_s2</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">tol</span><span class="p">(</span><span class="n">x_l</span><span class="p">,</span> <span class="n">x1</span><span class="p">):</span> <span class="n">nid_l</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">elif</span> <span class="n">inside</span><span class="p">(</span><span class="n">x_l</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">):</span> <span class="n">x_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">x_l</span><span class="p">)</span> <span class="n">d_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">d1</span> <span class="o">+</span> <span class="p">(</span><span class="n">x_l</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">x2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">d2</span> <span class="o">-</span> <span class="n">d1</span><span class="p">))</span> <span class="n">nid_l</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="n">x_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span 
class="n">x_points</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span> <span class="n">d_nodes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">d_points</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span> </code></pre></div> <p>We can now build a stiffness matrix for the model. There are now 15 nodes and each node has 2 <span class="caps">DOF</span>, so the matrix will be 30 x 30. The matrix will be banded, so a sparse matrix would pay off for a larger model; at this size, a dense array is simpler. We&#8217;ll assume that all elements are round and the material has a modulus of 30&nbsp;GPa.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">k_model</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">2</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">),</span> <span class="mi">2</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">)))</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">d1</span><span class="p">,</span> <span class="n">d2</span><span class="p">)</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span> <span class="nb">zip</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">,</span> <span class="n">x_nodes</span><span class="p">[</span><span class="mi">1</span><span class="p">:],</span> <span class="n">d_nodes</span><span class="p">,</span> <span class="n">d_nodes</span><span class="p">[</span><span class="mi">1</span><span class="p">:])):</span> <span class="c1"># Each element 
connects the two adjacent nodes</span> <span class="n">k_elm</span> <span class="o">=</span> <span class="n">elm_k</span><span class="p">(</span> <span class="n">L</span> <span class="o">=</span> <span class="n">x2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">,</span> <span class="n">d1</span> <span class="o">=</span> <span class="n">d1</span><span class="p">,</span> <span class="n">d2</span> <span class="o">=</span> <span class="n">d2</span><span class="p">,</span> <span class="n">alpha</span> <span class="o">=</span> <span class="mf">0.0490874</span> <span class="o">*</span> <span class="mf">30e3</span> <span class="p">)</span> <span class="k">for</span> <span class="n">ii</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">):</span> <span class="k">for</span> <span class="n">jj</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">):</span> <span class="n">k_model</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">+</span> <span class="n">ii</span><span class="p">,</span> <span class="n">i</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">+</span> <span class="n">jj</span><span class="p">]</span> <span class="o">+=</span> <span class="n">k_elm</span><span class="p">[</span><span class="n">ii</span><span class="p">,</span><span class="n">jj</span><span class="p">]</span> </code></pre></div> <p>We can visualize the stiffness matrix. 
As expected, all of the non-zero entries lie near the&nbsp;diagonal.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">matshow</span><span class="p">(</span><span class="n">k_model</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Visualization of Stiffness Matrix&quot;</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>Text(0.5, 1.0, &#39;Visualization of Stiffness Matrix&#39;) </code></pre></div> <p><img alt="cell-16-output-2" src="https://www.kloppenborg.ca/2022/06/bow-stiffness/bow-stiffness_files/figure-commonmark_x/cell-16-output-2.png"></p> <p>Next, we will create the load vector. This vector will have all entries set to zero except for the entry corresponding to the first <span class="caps">DOF</span> of the loading&nbsp;node.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">p_model</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">))</span> <span class="n">p_model</span><span class="p">[</span><span class="n">nid_l</span> <span class="o">*</span> <span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="o">-</span><span class="mf">8.8964</span> <span class="c1"># 2 lbf in N</span> </code></pre></div> <p>Next, we&#8217;ll take away the constrained DOFs from the stiffness matrix and the load vector. 
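</p>

<p>The mechanics of this reduction are easier to see on a toy problem. The sketch below (with invented numbers, not taken from the bow model) builds a 3-<span class="caps">DOF</span> chain of two springs, fixes the first <span class="caps">DOF</span>, deletes its row and column, solves the reduced system, and scatters the result back into a full-size displacement vector:</p>

```python
import numpy as np

# Two springs in series: DOF 0 -- k1 -- DOF 1 -- k2 -- DOF 2
k1, k2, F = 2.0, 4.0, 8.0
K = np.array([[ k1, -k1,      0.0],
              [-k1,  k1 + k2, -k2],
              [0.0,  -k2,      k2]])
P = np.array([0.0, 0.0, F])  # load applied at the last DOF

free = [1, 2]  # DOF 0 is fixed, so only DOFs 1 and 2 remain
u_free = np.linalg.solve(K[np.ix_(free, free)], P[free])

u = np.zeros(3)  # scatter back; the fixed DOF stays zero
u[free] = u_free
print(u)  # prints [0. 4. 6.]: u1 = F/k1, u2 = F/k1 + F/k2
```

<p>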
In our case, those DOFs are the transverse displacements of the constrained&nbsp;nodes.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">mask</span> <span class="o">=</span> <span class="p">[</span><span class="n">i</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">p_model</span><span class="p">)</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">nid_s1</span> <span class="o">*</span> <span class="mi">2</span> <span class="ow">and</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">nid_s2</span> <span class="o">*</span> <span class="mi">2</span><span class="p">]</span> <span class="n">p_const</span> <span class="o">=</span> <span class="n">p_model</span><span class="p">[</span><span class="n">mask</span><span class="p">]</span> <span class="n">k_const</span> <span class="o">=</span> <span class="n">k_model</span><span class="p">[</span><span class="n">mask</span><span class="p">,</span> <span class="p">:]</span> <span class="n">k_const</span> <span class="o">=</span> <span class="n">k_const</span><span class="p">[:,</span> <span class="n">mask</span><span class="p">]</span> </code></pre></div> <p>Now, we can solve for the&nbsp;deflections:</p> <div class="cell-code highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">scipy.linalg</span> <span class="n">d_const</span> <span class="o">=</span> <span class="n">scipy</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">solve</span><span class="p">(</span><span class="n">k_const</span><span class="p">,</span> <span class="n">p_const</span><span class="p">)</span> </code></pre></div> <p>Now, we can add the constrained DOFs back into the displacement solution. 
These will be zero because these DOFs were&nbsp;constrained.</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">d_model</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">))</span> <span class="n">d_model</span><span class="p">[</span><span class="n">mask</span><span class="p">]</span> <span class="o">=</span> <span class="n">d_const</span> </code></pre></div> <p>Now, we can plot the&nbsp;results:</p> <div class="cell-code highlight"><pre><span></span><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_nodes</span><span class="p">,</span> <span class="n">d_model</span><span class="p">[</span><span class="mi">0</span><span class="p">::</span><span class="mi">2</span><span class="p">])</span> <span class="n">plt</span><span class="o">.</span><span class="n">grid</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s2">&quot;Deflection&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;x&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;Vertical Deflection&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <p><img alt="cell-21-output-1" src="https://www.kloppenborg.ca/2022/06/bow-stiffness/bow-stiffness_files/figure-commonmark_x/cell-21-output-1.png"></p> <p>Stroup values are normally given in thousandths of an inch, which we 
can calculate as&nbsp;follows:</p> <div class="cell-code highlight"><pre><span></span><code><span class="o">-</span><span class="n">d_model</span><span class="p">[</span><span class="n">nid_l</span> <span class="o">*</span> <span class="mi">2</span><span class="p">]</span> <span class="o">/</span> <span class="mf">25.4</span> <span class="o">*</span> <span class="mi">1000</span> </code></pre></div> <div class="highlight"><pre><span></span><code><span class="mf">301.20020904559294</span><span class="w"></span> </code></pre></div> <h1>Conclusion</h1> <p>This blog post describes a way of numerically finding the relationship between the stiffness of a violin bow and its taper. We used the finite element method to do so. I&#8217;m planning on developing an online calculator for performing this computation. I plan to use an early version of <a href="https://pyscript.net/"><code>py-script</code></a> to do so, but since I&#8217;ve never used <code>py-script</code>, it&#8217;s possible that it will take a while to figure it&nbsp;out.</p>Speeding up Quadrature2021-09-18T00:00:00-04:002021-09-18T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2021-09-18:/2021/09/speeding-up-quadrature/<p>Up until recently, I hadn&#8217;t really thought about the way that numerical integration was performed. Sure, I knew about some techniques like using the trapezoid rule to perform numerical integration, and without thinking about it too much, I had just assumed that the integration routines like R&#8217;s <code>integrate …</code></p><p>Up until recently, I hadn&#8217;t really thought about the way that numerical integration was performed. Sure, I knew about some techniques like using the trapezoid rule to perform numerical integration, and without thinking about it too much, I had just assumed that the integration routines like R&#8217;s <code>integrate</code> function used this technique too. 
But, I was wrong &#8212; most libraries that implement numerical integration use adaptive&nbsp;quadrature.</p> <p>Adaptive quadrature is actually a rather interesting technique. I won&#8217;t go into too much detail here, but the function being integrated (the integrand) is evaluated at a number of points within the integration range, and the function values are multiplied by a set of weights. In mathematical&nbsp;terms:</p> <p>$$\int_a^b f\left(x\right) dx \approx \sum_i^n w_i f\left(x_i\right)&nbsp;$$</p> <p>where the weights, $w_i$, and the evaluation points, $x_i$, are tabulated values. These values can be taken from references such as&nbsp;<a href='#HandbookMathFunctions' id='ref-HandbookMathFunctions-1'> Abramowitz (1972) </a>.</p> <p>The <a href="http://www.gnu.org/software/gsl/"><span class="caps">GNU</span> Scientific Library</a> uses two different sets of $w_i$ and $x_i$: the first set are 15-point Kronrod weights, and the second set are 7-point Gaussian weights. The estimate of the integral is computed using these two sets of weights, and the absolute value of the difference between the two results is an upper bound on the&nbsp;error.</p> <p>If the error is too great, the range is sub-divided and the integral of each sub-divided range is summed to produce the complete integral &#8212; as are the error estimates. 
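</p>
<p>This estimate-and-compare scheme is what, for example, SciPy&#8217;s <code>quad</code> (a wrapper around the adaptive Gauss&#8211;Kronrod routines in <span class="caps">QUADPACK</span>) exposes through the error estimate it returns alongside the integral. A minimal illustration (the integrand here is just an arbitrary smooth example):</p>

```python
import math
from scipy.integrate import quad

# quad returns both the integral estimate and an estimate of the
# absolute error, derived from comparing the paired quadrature rules.
value, err_bound = quad(lambda x: math.exp(-x * x), 0.0, 1.0)

# The exact value of this integral is sqrt(pi)/2 * erf(1).
exact = math.sqrt(math.pi) / 2.0 * math.erf(1.0)
```

<p>For a smooth integrand like this one, the reported error estimate is&nbsp;tiny.</p>
<p>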
This sub-division procedure is the &#8220;adaptive&#8221; part of adaptive&nbsp;quadrature.</p> <p>I&#8217;ve been working on a computational problem that involves the computation of an expression of the following&nbsp;form:</p> <p>$$\frac{ \int_{-\infty}^\lambda g(t)A(t)dt + \int_{\lambda}^\infty h(t)A(t)dt }{ \int_{-\infty}^{\infty}A(t)dt }&nbsp;$$</p> <p>In my particular problem, $A(t)$ is expensive to compute, while $g(t)$ and $h(t)$ are relatively computationally&nbsp;cheap.</p> <p>In my use case, I need to compute this integral many times with slightly different $g(t)$ and $h(t)$ functions, but with the $A(t)$ function identical each&nbsp;time.</p> <p>For now, let&#8217;s ignore the integration bounds for these four integrals. We&#8217;ll revisit the bounds shortly. The quadrature estimate of the first integral (containing $g(t)$) will&nbsp;be:</p> <p>$$\int g(t) A(t) dt \approx \sum_i^n w_i f(x_i) = \sum_i^n w_i g(x_i) A(x_i)&nbsp;$$</p> <p>Thus, we can pre-compute the values of $A(x_i)$ once and avoid computing them again. A similar procedure can be used for the other three integrals in the original&nbsp;expression.</p> <p>I&#8217;ve implemented this approach of pre-computing the values of $A(x_i)$ in C++. I&#8217;ve run this several times with different repetitions and compared the speed to a &#8220;naive&#8221; approach where the complete integration is performed each time. 
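</p>
<p>The pre-computation idea can be sketched in Python. This is a simplified stand-in for my C++ implementation: a single fixed rule on a finite range with no adaptive subdivision, a Gauss&#8211;Legendre rule instead of the Gauss&#8211;Kronrod pairs, and made-up functions for $A(t)$ and&nbsp;$g(t)$:</p>

```python
import math
import numpy as np

def expensive_A(x):
    # Stand-in for the expensive factor A(t) in the integrand.
    return np.exp(-x * x)

# Fixed quadrature rule: Gauss-Legendre nodes and weights mapped to [a, b].
a, b = -3.0, 3.0
t, w = np.polynomial.legendre.leggauss(50)
x = 0.5 * (b - a) * t + 0.5 * (b + a)
w = 0.5 * (b - a) * w

# Pre-compute the expensive A(x_i) values once...
A_vals = expensive_A(x)

def integral_gA(g):
    # ...so that each new g(t) only costs a cheap weighted sum.
    return float(np.sum(w * g(x) * A_vals))

# With g(t) = 1 this approximates the Gaussian integral sqrt(pi) * erf(3).
approx = integral_gA(np.ones_like)
```

<p>The weighted sum in <code>integral_gA</code> is cheap, so re-evaluating the expression for many different $g(t)$ functions no longer pays the cost of $A(t)$ each&nbsp;time.</p>
<p>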
The results are as&nbsp;follows:</p> <table> <thead> <tr> <th>Repetitions</th> <th>Naive Approach</th> <th>Pre-Computing $A(x_i)$</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>0.485 ms</td> <td>1.625 ms</td> </tr> <tr> <td>10</td> <td>3.65 ms</td> <td>1.42 ms</td> </tr> <tr> <td>100</td> <td>57.4 ms</td> <td>1.34 ms</td> </tr> <tr> <td>1000</td> <td>317 ms</td> <td>1.24 ms</td> </tr> <tr> <td>10000</td> <td>2645 ms</td> <td>1.11 ms</td> </tr> </tbody> </table> <p>Using the naive approach, the time scales roughly linearly with the number of repetitions, while the approach where we pre-compute the value of $A(x_i)$ is roughly constant regardless of the number of repetitions. The specific values shown here are based on a single run of the code, so the results will be affected by whatever else my <span class="caps">PC</span> was doing at the time, but we can still see general&nbsp;trends.</p> <p>Returning to the discussion of the integration bounds: first, the bounds of the two integrals in the numerator and the bounds of the integral in the denominator are all different. To account for this, we compute the integral using the widest bounds, subdividing the range as required to achieve a suitable error estimate. Then, for the smaller range, we choose the subdivisions that are within the new range, adding a smaller subdivision at one end if&nbsp;needed.</p> <p>Second, you&#8217;ll notice that some of the integration bounds are infinite. This is handled by a clever trick that I would not have thought of myself &#8212; a change of variables. In my code, I&#8217;ve used a $\tan$ transformation; in the <span class="caps">GNU</span> Scientific Library, they use a different transform that contains a singularity. This singularity is okay if you&#8217;re not altering the integration bounds after starting the computation (which <span class="caps">GSL</span> does not do), but can lead to trouble otherwise. 
After this change of variables, the integration&nbsp;becomes:</p> <p>$$\int_{-\infty}^\infty f(x) dx = \int_{-\pi/2}^{\pi/2} \frac{f\left(\tan(t)\right)}{\cos^2(t)} dt&nbsp;$$</p> <p>With this transformation, the integration bounds become&nbsp;finite.</p> <p>Using these few tricks, the quadrature for this particular problem can be sped up significantly. These tricks won&#8217;t work for all problems,&nbsp;though.</p>Long-Running Vignettes for R Packages2021-06-21T00:00:00-04:002021-06-21T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2021-06-21:/2021/06/long-running-vignettes/<p>I&#8217;m going to release a new version of <a href="https://www.cmstatr.net"><code>cmstatr</code></a> soon. This new version includes, amongst other things, a new vignette. In R packages, a vignette is a type of long-form documentation. This particular vignette includes a simulation study that helps to demonstrate the validity of a particular statistical method …</p><p>I&#8217;m going to release a new version of <a href="https://www.cmstatr.net"><code>cmstatr</code></a> soon. This new version includes, amongst other things, a new vignette. In R packages, a vignette is a type of long-form documentation. This particular vignette includes a simulation study that helps to demonstrate the validity of a particular statistical method. This simulation study takes a long time to run, though. It takes long enough that I don&#8217;t want to sit and wait for it to run every time I check that package, and I don&#8217;t want to waste resources on the <code>CRAN</code> servers and force their servers to re-run my vignette every time they check that&nbsp;package.</p> <p>Jeroen Ooms wrote a <a href="https://ropensci.org/blog/2019/12/08/precompute-vignettes/">blog post at rOpenSci</a> about this topic. I decided to follow the advice in that blog post and pre-compute the new vignette on my computer, and avoid having to re-run it every time the package is checked. 
The blog post doesn&#8217;t include all of the necessary information for vignettes that include graphs, though. The present blog post is intended to fill in that&nbsp;gap.</p> <p>The basic idea is that you take your long-running vignette and rename it with the extension <code>.Rmd.orig</code> so that R (and <span class="caps">CRAN</span>) doesn&#8217;t try to build it, because it doesn&#8217;t recognize it as an RMarkdown file. Then you write a script that invokes <code>knitr</code> to run the executable code in the vignette and write a <code>.Rmd</code> file where the code is no longer executable. With this approach, when R tries to re-build the vignette, none of the code is executable, and it runs almost&nbsp;instantly.</p> <p>In the case of the new vignette being added to <code>cmstatr</code>, the filename of the vignette is <code>hk_ext.Rmd</code>.</p> <p>The first step is easy. Just rename the vignette from <code>hk_ext.Rmd</code> to <code>hk_ext.Rmd.orig</code>.</p> <p>If we were to run the function <code>knitr::knit(&quot;hk_ext.Rmd.orig&quot;, output = &quot;hk_ext.Rmd&quot;)</code>, it would create the <code>.Rmd</code> file with the executable code turned into non-executable code, and with the results of the code included. The figures would be located in the folder <code>figures/</code> and referenced by the resulting markdown file. However, the path to <code>figures/</code> will be relative to the current working directory. This is a problem, since the current working directory will (likely) be the root directory of the package, and the vignettes are stored in the <code>vignettes/</code> sub-folder.</p> <p>We can fix this problem by using the following script to re-build the vignette. 
I&#8217;ve saved this script with the very verbose filename <code>rebuild-long-running-vignette.R</code>.</p> <div class="highlight"><pre><span></span><code><span class="n">old_wd</span> <span class="o">&lt;-</span> <span class="nf">getwd</span><span class="p">()</span> <span class="nf">setwd</span><span class="p">(</span><span class="s">&quot;vignettes/&quot;</span><span class="p">)</span> <span class="n">knitr</span><span class="o">::</span><span class="nf">knit</span><span class="p">(</span><span class="s">&quot;hk_ext.Rmd.orig&quot;</span><span class="p">,</span> <span class="n">output</span> <span class="o">=</span> <span class="s">&quot;hk_ext.Rmd&quot;</span><span class="p">)</span> <span class="n">knitr</span><span class="o">::</span><span class="nf">purl</span><span class="p">(</span><span class="s">&quot;hk_ext.Rmd.orig&quot;</span><span class="p">,</span> <span class="n">output</span> <span class="o">=</span> <span class="s">&quot;hk_ext.R&quot;</span><span class="p">)</span> <span class="nf">setwd</span><span class="p">(</span><span class="n">old_wd</span><span class="p">)</span> </code></pre></div> <p>This sets the working directory to the <code>vignettes/</code> sub-folder, rebuilds the vignette then sets the working directory back to what it originally&nbsp;was.</p> <p>We also need to make a change to the setup chunk of our vignette (<code>hk_ext.Rmd.orig</code>). 
This will tell <code>knitr</code> to put the resulting figures in the same folder as the vignette, rather than a&nbsp;sub-folder.</p> <div class="highlight"><pre><span></span><code><span class="n">knitr</span><span class="o">::</span><span class="n">opts_chunk</span><span class="o">$</span><span class="nf">set</span><span class="p">(</span> <span class="n">collapse</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">comment</span> <span class="o">=</span> <span class="s">&quot;#&gt;&quot;</span><span class="p">,</span> <span class="n">fig.path</span> <span class="o">=</span> <span class="s">&quot;&quot;</span> <span class="c1"># Added this line to the standard setup chunk</span> <span class="p">)</span> </code></pre></div> <p>Now to rebuild the vignette, you just run the script <code>rebuild-long-running-vignette.R</code>. This script should be added to <code>.Rbuildignore</code> so that it doesn&#8217;t get included in the built package. Similarly, the <code>.Rmd.orig</code> file needs to be added to the <code>.Rbuildignore</code> file.</p> <p>The other issue is remembering to update the vignette, now that it&#8217;s not automatic. I personally use <code>devtools</code> to release packages to <span class="caps">CRAN</span>. When you run <code>devtools::release()</code> it asks you a bunch of standard questions. It&#8217;s possible to add extra questions according to the <a href="https://devtools.r-lib.org/reference/release.html">documentation</a>. 
So, I&#8217;ve added the following un-exported function to the&nbsp;package:</p> <div class="highlight"><pre><span></span><code><span class="n">release_questions</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">()</span> <span class="p">{</span> <span class="nf">c</span><span class="p">(</span> <span class="s">&quot;Did you re-build the hk_ext.Rmd using rebuild-long-running-vignette.R?&quot;</span> <span class="p">)</span> <span class="p">}</span> </code></pre></div>Calculating Extended Hanson—Koopmans Tolerance Limits2021-06-12T00:00:00-04:002021-06-12T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2021-06-12:/2021/06/calculating-ext-hk-tolerance-limits/<p>Calculating tolerance limits &#8212; such as A-Basis and B-Basis &#8212; is an important part of developing and certifying composite structure for aircraft. When the data doesn&#8217;t fit a convenient parametric distribution like a Normal or Weibull distribution, one often resorts to non-parametric methods. Several non-parametric methods exist for determining tolerance&nbsp;limits …</p><p>Calculating tolerance limits &#8212; such as A-Basis and B-Basis &#8212; is an important part of developing and certifying composite structure for aircraft. When the data doesn&#8217;t fit a convenient parametric distribution like a Normal or Weibull distribution, one often resorts to non-parametric methods. Several non-parametric methods exist for determining tolerance&nbsp;limits.</p> <p>Vangel&#8217;s 1994 paper <a href='#Vangel1994' id='ref-Vangel1994-1'> Vangel (1994) </a> discusses a non-parametric method for determining tolerance limits. 
This article provides a brief summary of that work and discusses the implementation of that method in the R language, as well as some of the choices that can be made in that&nbsp;implementation.</p> <p>This method of calculating non-parametric tolerance limits is an extension of the Hanson&#8212;Koopmans method <a href='#Hanson1964' id='ref-Hanson1964-1'> Hanson and Koopmans (1964) </a>. The lower tolerance limit can be calculated using the following&nbsp;formula:</p> <p>$$T_L = x_{(j)}\left[\frac{x_{(i)}}{x_{(j)}}\right]^z&nbsp;$$</p> <p>where $x_{(i)}$ and $x_{(j)}$ indicate the $i$th and $j$th order statistics of the sample (that is, the $i$th smallest and the $j$th smallest&nbsp;value).</p> <p>The values of $j$ and $z$ need to be determined&nbsp;somehow.</p> <p>There is a function $H(z)$ defined as&nbsp;follows:</p> <p>$$H(z) = Pr\left[T(z) \ge \log(1 - \beta)\right]&nbsp;$$</p> <p>where $\beta$ is the content of the desired tolerance limit. The details are outside the scope of this article, but we can write a function that solves the following equation for&nbsp;$z$:</p> <p>$$H(z) = \gamma&nbsp;$$</p> <p>where $\gamma$ is the confidence of the desired tolerance&nbsp;limit.</p> <p>It turns out that we obtain different values of $z$ depending on which values of $i$ and $j$ we&nbsp;choose.</p> <p>Vangel&#8217;s approach is to set $i=1$ in all cases, then to find the value of $j$ that would produce a tolerance limit that is nearest to the population quantile, assuming that the data is distributed according to a standard normal&nbsp;distribution.</p> <p>We&#8217;ll investigate this approach through simulation. 
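</p>
<p>Before simulating, it may help to see how little machinery the tolerance limit itself requires once $n$, $i$, $j$ and $z$ are fixed. A short Python sketch, using the published values for $n=17$ ($i=1$, $j=8$, $z=1.434$) and a made-up set of strength&nbsp;observations:</p>

```python
def hk_tolerance_limit(x, i, j, z):
    # Extended Hanson-Koopmans lower tolerance limit:
    # T_L = x_(j) * (x_(i) / x_(j)) ** z, with 1-based order statistics.
    xs = sorted(x)
    return xs[j - 1] * (xs[i - 1] / xs[j - 1]) ** z

# Hypothetical strength data (n = 17), arbitrary units:
sample = [92.1, 99.3, 103.8, 96.4, 101.2, 98.7, 105.0, 94.9, 100.4,
          97.6, 102.9, 95.8, 104.2, 99.9, 98.1, 101.8, 93.7]

# B-Basis values for n = 17 from the published table: j = 8, z = 1.434.
t_lower = hk_tolerance_limit(sample, i=1, j=8, z=1.434)
```

<p>Because $z &gt; 1$ and $x_{(1)} &lt; x_{(j)}$, the resulting limit falls below the smallest observation in the&nbsp;sample.</p>
<p>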
First, we&#8217;ll load a few&nbsp;packages.</p> <div class="highlight"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span> <span class="nf">library</span><span class="p">(</span><span class="n">cmstatr</span><span class="p">)</span> </code></pre></div> <p>Next, we&#8217;ll set the value of $i=1$ and a value of the content and confidence of our tolerance limit. We&#8217;ll choose B-Basis tolerance limits as an&nbsp;example.</p> <div class="highlight"><pre><span></span><code><span class="n">i</span> <span class="o">&lt;-</span> <span class="m">1</span> <span class="n">p</span> <span class="o">&lt;-</span> <span class="m">0.90</span> <span class="n">conf</span> <span class="o">&lt;-</span> <span class="m">0.95</span> </code></pre></div> <p>The expected value of the $i$th order statistic for a normally distributed sample can be calculated using the following function (see <a href='#Harter1961' id='ref-Harter1961-1'> Harter (1961) </a>). 
We&#8217;ll need this function&nbsp;soon.</p> <div class="highlight"><pre><span></span><code><span class="n">expected_order_statistic</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span> <span class="n">int</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">x</span> <span class="o">*</span> <span class="nf">pnorm</span><span class="p">(</span><span class="o">-</span><span class="n">x</span><span class="p">)</span> <span class="o">^</span> <span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="nf">pnorm</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">^</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="n">i</span><span class="p">)</span> <span class="o">*</span> <span class="nf">dnorm</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">}</span> <span class="n">integral</span> <span class="o">&lt;-</span> <span class="nf">integrate</span><span class="p">(</span><span class="n">int</span><span class="p">,</span> <span class="o">-</span><span class="kc">Inf</span><span class="p">,</span> <span class="kc">Inf</span><span class="p">)</span> <span class="nf">stopifnot</span><span class="p">(</span><span class="n">integral</span><span class="o">$</span><span class="n">message</span> <span class="o">==</span> <span class="s">&quot;OK&quot;</span><span class="p">)</span> <span class="nf">factorial</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span 
class="nf">factorial</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="n">i</span><span class="p">)</span> <span class="o">*</span> <span class="nf">factorial</span><span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="m">1</span><span class="p">))</span> <span class="o">*</span> <span class="n">integral</span><span class="o">$</span><span class="n">value</span> <span class="p">}</span> </code></pre></div> <p>When using Vangel&#8217;s approach, we need to minimize the value of the following&nbsp;function.</p> <div class="highlight"><pre><span></span><code><span class="n">fcn</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span> <span class="n">e1</span> <span class="o">&lt;-</span> <span class="nf">expected_order_statistic</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="n">e2</span> <span class="o">&lt;-</span> <span class="nf">expected_order_statistic</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="n">z</span> <span class="o">&lt;-</span> <span class="nf">hk_ext_z</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">conf</span><span class="p">)</span> <span class="nf">abs</span><span class="p">(</span><span class="n">z</span> <span class="o">*</span> <span class="n">e1</span> <span class="o">+</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> <span class="n">z</span><span class="p">)</span> <span class="o">*</span> <span 
class="n">e2</span> <span class="o">-</span> <span class="nf">qnorm</span><span class="p">(</span><span class="n">p</span><span class="p">))</span> <span class="p">}</span> </code></pre></div> <p>We can plot the above function versus $j$ for the value of&nbsp;$n=17$:</p> <div class="highlight"><pre><span></span><code><span class="nf">data.frame</span><span class="p">(</span> <span class="n">j</span> <span class="o">=</span> <span class="nf">seq</span><span class="p">(</span><span class="m">7</span><span class="p">,</span> <span class="m">11</span><span class="p">,</span> <span class="n">by</span> <span class="o">=</span> <span class="m">0.1</span><span class="p">)</span> <span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">fcn</span> <span class="o">=</span> <span class="nf">Vectorize</span><span class="p">(</span><span class="n">fcn</span><span class="p">)(</span><span class="n">j</span><span class="p">,</span> <span class="m">17</span><span class="p">))</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">j</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">fcn</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_line</span><span class="p">()</span> <span class="o">+</span> <span class="nf">geom_point</span><span class="p">(</span> <span class="n">data</span> <span class="o">=</span> <span class="nf">data.frame</span><span class="p">(</span><span class="n">j</span> <span class="o">=</span> <span class="m">7</span><span class="o">:</span><span class="m">11</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">fcn</span> <span class="o">=</span> <span class="nf">Vectorize</span><span 
class="p">(</span><span class="n">fcn</span><span class="p">)(</span><span class="n">j</span><span class="p">,</span> <span class="m">17</span><span class="p">)),</span> <span class="n">mapping</span> <span class="o">=</span> <span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">j</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">fcn</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-5-1" src="https://www.kloppenborg.ca/2021/06/calculating-ext-hk-tolerance-limits/Calculating-Ext-HK-Tolerance-Limits_files/figure-markdown/unnamed-chunk-5-1.png"></p> <p>In this particular case, we can see that $j=9$ produces the minimum value of this function (for integer values of $j$). But the value of this function at $j=8$ is not much&nbsp;larger.</p> <p>Of note, there is a table of optimum values of $j$ for various values of $n$ published in <span class="caps">CMH</span>-17-1G <sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup>. For most values of $n$, the optimum value from the function above matches the published value. However, for samples of size 17, 20, 23, 24 and 28, the function above disagrees with the published values by one unit. We will focus the simulation effort on samples of these sizes. 
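</p>
<p>The comparison above leans on expected values of standard normal order statistics. As an independent cross-check, the same kind of integral can be evaluated in Python with SciPy. The sketch below computes the expectation of the $i$th-<em>smallest</em> of $n$ values; the identity $E[X_{(i)}] = -E[X_{(n-i+1)}]$ converts between the two possible indexing&nbsp;conventions:</p>

```python
import math
from scipy.integrate import quad
from scipy.stats import norm

def expected_order_statistic(i, n):
    # E of the i-th smallest of n iid standard normal values:
    #   n!/((i-1)!(n-i)!) * integral of x * Phi(x)^(i-1) * (1-Phi(x))^(n-i) * phi(x) dx
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    integrand = lambda x: (x * norm.cdf(x) ** (i - 1)
                           * norm.sf(x) ** (n - i) * norm.pdf(x))
    value, _ = quad(integrand, -math.inf, math.inf)
    return coef * value

# Known closed form: the expected minimum of two values is -1/sqrt(pi).
e_min2 = expected_order_statistic(1, 2)
```

<p>The same routine reproduces tabulated expectations such as those in Harter&nbsp;(1961).</p>
<p>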
For the sample sizes of interest, the following values of $j$ and $z$ are published in <span class="caps">CMH</span>-17-1G.</p> <div class="highlight"><pre><span></span><code><span class="n">published_r_n</span> <span class="o">&lt;-</span> <span class="nf">tribble</span><span class="p">(</span> <span class="o">~</span><span class="n">n</span><span class="p">,</span> <span class="o">~</span><span class="n">j_pub</span><span class="p">,</span> <span class="o">~</span><span class="n">z_pub</span><span class="p">,</span> <span class="m">17</span><span class="p">,</span> <span class="m">8</span><span class="p">,</span> <span class="m">1.434</span><span class="p">,</span> <span class="m">20</span><span class="p">,</span> <span class="m">10</span><span class="p">,</span> <span class="m">1.253</span><span class="p">,</span> <span class="m">23</span><span class="p">,</span> <span class="m">11</span><span class="p">,</span> <span class="m">1.143</span><span class="p">,</span> <span class="m">24</span><span class="p">,</span> <span class="m">11</span><span class="p">,</span> <span class="m">1.114</span><span class="p">,</span> <span class="m">28</span><span class="p">,</span> <span class="m">12</span><span class="p">,</span> <span class="m">1.010</span> <span class="p">)</span> </code></pre></div> <p>We can create an R function that returns the &#8220;optimum&#8221; value of $j$: it considers every integer value of $j$ and returns the one that gives the lowest value of the function above. 
Such an R function is as&nbsp;follows:</p> <div class="highlight"><pre><span></span><code><span class="n">optim_j</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="p">{</span> <span class="n">j</span> <span class="o">&lt;-</span> <span class="m">2</span><span class="o">:</span><span class="n">n</span> <span class="n">f</span> <span class="o">&lt;-</span> <span class="nf">sapply</span><span class="p">(</span><span class="m">2</span><span class="o">:</span><span class="n">n</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">j</span><span class="p">)</span> <span class="nf">Vectorize</span><span class="p">(</span><span class="n">fcn</span><span class="p">)(</span><span class="n">j</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span> <span class="n">j</span><span class="p">[</span><span class="n">f</span> <span class="o">==</span> <span class="nf">min</span><span class="p">(</span><span class="n">f</span><span class="p">)]</span> <span class="p">}</span> </code></pre></div> <p>For values of $n$ of interest, we&#8217;ll generate a large number of samples (10,000) drawn from a normal distribution. We can calculate the true population quantile, since we know the population parameters. We can use the two variations of the nonparametric tolerance limit approach to calculate tolerance limits. The proportion of those tolerance limits that are below the population quantile should equal the selected confidence level. 
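</p>
<p>The core of such a check fits in a few lines of Python. As a sketch (a normal population, the published $n=17$ values $i=1$, $j=8$, $z=1.434$, and a modest number of iterations for speed), the observed proportion should land near the nominal 95%&nbsp;confidence:</p>

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(20220612)
n, i, j, z = 17, 1, 8, 1.434       # published values for n = 17
p, mu, sd = 0.90, 100.0, 6.0

# True 10th percentile of the normal population being sampled.
x_p10 = norm.ppf(1 - p, loc=mu, scale=sd)

n_sim = 2000
below = 0
for _ in range(n_sim):
    x = np.sort(rng.normal(mu, sd, n))
    t_lower = x[j - 1] * (x[i - 1] / x[j - 1]) ** z
    below += t_lower <= x_p10

coverage = below / n_sim   # proportion of limits below the true percentile
```

<p>With enough iterations, <code>coverage</code> estimates the attained confidence level of the&nbsp;method.</p>
<p>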
We&#8217;ll restrict the simulation to values of $n$ where we find different values of $j$ compared with those published in <span class="caps">CMH</span>-17-1G.</p> <div class="highlight"><pre><span></span><code><span class="n">mu_normal</span> <span class="o">&lt;-</span> <span class="m">100</span> <span class="n">sd_normal</span> <span class="o">&lt;-</span> <span class="m">6</span> <span class="nf">set.seed</span><span class="p">(</span><span class="m">1234567</span><span class="p">)</span> <span class="c1"># make this reproducible</span> <span class="n">sim_normal</span> <span class="o">&lt;-</span> <span class="nf">pmap_dfr</span><span class="p">(</span><span class="n">published_r_n</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">z_pub</span><span class="p">)</span> <span class="p">{</span> <span class="n">j_opt</span> <span class="o">&lt;-</span> <span class="nf">optim_j</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="n">z_opt</span> <span class="o">&lt;-</span> <span class="nf">hk_ext_z</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">conf</span><span class="p">)</span> <span class="nf">map_dfr</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">10000</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">i_sim</span><span class="p">)</span> <span class="p">{</span> <span class="nf">tibble</span><span class="p">(</span> <span class="n">n</span> <span class="o">=</span> <span class="n">n</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span
class="nf">list</span><span class="p">(</span><span class="nf">sort</span><span class="p">(</span><span class="nf">rnorm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">mu_normal</span><span class="p">,</span> <span class="n">sd_normal</span><span class="p">))),</span> <span class="n">j_pub</span> <span class="o">=</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">j_opt</span> <span class="o">=</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">z_pub</span> <span class="o">=</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">z_opt</span> <span class="o">=</span> <span class="n">z_opt</span><span class="p">,</span> <span class="p">)</span> <span class="p">}</span> <span class="p">)</span> <span class="p">})</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span> <span class="n">T_pub</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">])</span> <span class="o">^</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">T_opt</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span class="p">])</span> <span class="o">^</span> <span 
class="n">z_opt</span> <span class="p">)</span> <span class="n">sim_normal</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 50,000 × 8 ## # Rowwise: ## n x j_pub j_opt z_pub z_opt T_pub T_opt ## &lt;dbl&gt; &lt;list&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 85.4 85.4 ## 2 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 89.5 89.3 ## 3 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 83.2 83.5 ## 4 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 83.6 83.8 ## 5 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 83.4 83.8 ## 6 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 84.1 84.4 ## 7 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 82.6 82.9 ## 8 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 87.5 87.6 ## 9 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 83.9 83.8 ## 10 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 86.9 87.2 ## # … with 49,990 more rows </code></pre></div> <p>We can plot the distribution of the tolerance limits that result from our R code and from the values of $j$ and $z$ published in <span class="caps">CMH</span>-17-1G. 
We see that the distributions are very&nbsp;similar.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_normal</span> <span class="o">%&gt;%</span> <span class="nf">pivot_longer</span><span class="p">(</span><span class="n">cols</span> <span class="o">=</span> <span class="n">T_pub</span><span class="o">:</span><span class="n">T_opt</span><span class="p">,</span> <span class="n">names_to</span> <span class="o">=</span> <span class="s">&quot;Approach&quot;</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">value</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="n">Approach</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_density</span><span class="p">()</span> <span class="o">+</span> <span class="nf">facet_wrap</span><span class="p">(</span><span class="n">n</span> <span class="o">~</span> <span class="n">.</span><span class="p">)</span> <span class="o">+</span> <span class="nf">ggtitle</span><span class="p">(</span><span class="s">&quot;Distribution of Tolerance Limits for Various Values of n&quot;</span><span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-9-1" src="https://www.kloppenborg.ca/2021/06/calculating-ext-hk-tolerance-limits/Calculating-Ext-HK-Tolerance-Limits_files/figure-markdown/unnamed-chunk-9-1.png"></p> <p>In this article, we&#8217;re calculating the B-Basis (lower 90/95 tolerance limit). 
So, the population quantile that we&#8217;re approximating&nbsp;is:</p> <div class="highlight"><pre><span></span><code><span class="n">x_p_normal</span> <span class="o">&lt;-</span> <span class="nf">qnorm</span><span class="p">(</span><span class="m">1</span> <span class="o">-</span> <span class="n">p</span><span class="p">,</span> <span class="n">mu_normal</span><span class="p">,</span> <span class="n">sd_normal</span><span class="p">)</span> <span class="n">x_p_normal</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 92.31069 </code></pre></div> <p>We can now determine what proportion of the calculated tolerance limits were below the population&nbsp;quantile.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_normal</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">below_pub</span> <span class="o">=</span> <span class="n">T_pub</span> <span class="o">&lt;</span> <span class="n">x_p_normal</span><span class="p">,</span> <span class="n">below_opt</span> <span class="o">=</span> <span class="n">T_opt</span> <span class="o">&lt;</span> <span class="n">x_p_normal</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">group_by</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span> <span class="n">prop_below_pub</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_pub</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">(),</span> <span class="n">prop_below_opt</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_opt</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">()</span> <span class="p">)</span> </code></pre></div> <div 
class="highlight"><pre><span></span><code>## # A tibble: 5 × 3 ## n prop_below_pub prop_below_opt ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 0.964 0.967 ## 2 20 0.967 0.965 ## 3 23 0.960 0.960 ## 4 24 0.959 0.957 ## 5 28 0.954 0.954 </code></pre></div> <p>In all cases, the tolerance limits are conservative when the data are normally distributed. Remember that we expect that 95% of the tolerance limits should be below the population quantile: here we see a slightly higher proportion than&nbsp;95%.</p> <p>We can repeat this with a distribution that is far from normal. Let&#8217;s try it with the $\chi^2$&nbsp;distribution.</p> <div class="highlight"><pre><span></span><code><span class="n">df_chisq</span> <span class="o">&lt;-</span> <span class="m">6</span> <span class="nf">set.seed</span><span class="p">(</span><span class="m">2345678</span><span class="p">)</span> <span class="c1"># make this reproducible</span> <span class="n">sim_chisq</span> <span class="o">&lt;-</span> <span class="nf">pmap_dfr</span><span class="p">(</span><span class="n">published_r_n</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">z_pub</span><span class="p">)</span> <span class="p">{</span> <span class="n">j_opt</span> <span class="o">&lt;-</span> <span class="nf">optim_j</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="n">z_opt</span> <span class="o">&lt;-</span> <span class="nf">hk_ext_z</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">conf</span><span class="p">)</span> <span class="nf">map_dfr</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span
class="m">10000</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">i_sim</span><span class="p">)</span> <span class="p">{</span> <span class="nf">tibble</span><span class="p">(</span> <span class="n">n</span> <span class="o">=</span> <span class="n">n</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">sort</span><span class="p">(</span><span class="nf">rchisq</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">df_chisq</span><span class="p">))),</span> <span class="n">j_pub</span> <span class="o">=</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">j_opt</span> <span class="o">=</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">z_pub</span> <span class="o">=</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">z_opt</span> <span class="o">=</span> <span class="n">z_opt</span><span class="p">,</span> <span class="p">)</span> <span class="p">}</span> <span class="p">)</span> <span class="p">})</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span> <span class="n">T_pub</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">])</span> <span class="o">^</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">T_opt</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span 
class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span class="p">])</span> <span class="o">^</span> <span class="n">z_opt</span> <span class="p">)</span> <span class="n">sim_chisq</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 50,000 × 8 ## # Rowwise: ## n x j_pub j_opt z_pub z_opt T_pub T_opt ## &lt;dbl&gt; &lt;list&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.00 1.03 ## 2 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.35 1.34 ## 3 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.39 1.40 ## 4 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.39 1.28 ## 5 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.39 1.43 ## 6 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 0.283 0.297 ## 7 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 0.514 0.497 ## 8 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 0.264 0.268 ## 9 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 1.68 1.61 ## 10 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 0.661 0.692 ## # … with 49,990 more rows </code></pre></div> <p>The population quantile&nbsp;is:</p> <div class="highlight"><pre><span></span><code><span class="n">x_p_chisq</span> <span class="o">&lt;-</span> <span class="nf">qchisq</span><span class="p">(</span><span class="m">1</span> <span class="o">-</span> <span class="n">p</span><span class="p">,</span> <span class="n">df_chisq</span><span class="p">)</span> <span class="n">x_p_chisq</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 2.204131 </code></pre></div> <p>We can plot the distribution of the tolerance limits calculated using our values of $j$ and $z$ alongside those calculated using the published values. 
Again, the distributions are very&nbsp;similar.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_chisq</span> <span class="o">%&gt;%</span> <span class="nf">pivot_longer</span><span class="p">(</span><span class="n">cols</span> <span class="o">=</span> <span class="n">T_pub</span><span class="o">:</span><span class="n">T_opt</span><span class="p">,</span> <span class="n">names_to</span> <span class="o">=</span> <span class="s">&quot;Approach&quot;</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">value</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="n">Approach</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_density</span><span class="p">()</span> <span class="o">+</span> <span class="nf">facet_wrap</span><span class="p">(</span><span class="n">n</span> <span class="o">~</span> <span class="n">.</span><span class="p">)</span> <span class="o">+</span> <span class="nf">ggtitle</span><span class="p">(</span><span class="s">&quot;Distribution of Tolerance Limits for Various Values of n&quot;</span><span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-14-1" src="https://www.kloppenborg.ca/2021/06/calculating-ext-hk-tolerance-limits/Calculating-Ext-HK-Tolerance-Limits_files/figure-markdown/unnamed-chunk-14-1.png"></p> <p>We can now determine what proportion of the calculated tolerance limits were below the population&nbsp;quantile.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_chisq</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">below_pub</span> <span class="o">=</span> <span class="n">T_pub</span> <span class="o">&lt;</span> <span class="n">x_p_chisq</span><span class="p">,</span> <span 
class="n">below_opt</span> <span class="o">=</span> <span class="n">T_opt</span> <span class="o">&lt;</span> <span class="n">x_p_chisq</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">group_by</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span> <span class="n">prop_below_pub</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_pub</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">(),</span> <span class="n">prop_below_opt</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_opt</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">()</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 5 × 3 ## n prop_below_pub prop_below_opt ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 0.963 0.965 ## 2 20 0.959 0.959 ## 3 23 0.959 0.958 ## 4 24 0.955 0.955 ## 5 28 0.953 0.953 </code></pre></div> <p>Again with this distribution, we see that the tolerance limits are&nbsp;conservative.</p> <p>Finally, let&#8217;s try again using a&nbsp;t-Distribution.</p> <div class="highlight"><pre><span></span><code><span class="n">df_t</span> <span class="o">&lt;-</span> <span class="m">3</span> <span class="n">offset_t</span> <span class="o">&lt;-</span> <span class="m">150</span> <span class="nf">set.seed</span><span class="p">(</span><span class="m">4567</span><span class="p">)</span> <span class="c1"># make this reproducible</span> <span class="n">sim_t</span> <span class="o">&lt;-</span> <span class="nf">pmap_dfr</span><span class="p">(</span><span class="n">published_r_n</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">n</span><span 
class="p">,</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">z_pub</span><span class="p">)</span> <span class="p">{</span> <span class="n">j_opt</span> <span class="o">&lt;-</span> <span class="nf">optim_j</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="n">z_opt</span> <span class="o">&lt;-</span> <span class="nf">hk_ext_z</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">conf</span><span class="p">)</span> <span class="nf">map_dfr</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">10000</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">i_sim</span><span class="p">)</span> <span class="p">{</span> <span class="nf">tibble</span><span class="p">(</span> <span class="n">n</span> <span class="o">=</span> <span class="n">n</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">sort</span><span class="p">(</span><span class="nf">rt</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">df_t</span><span class="p">)</span> <span class="o">+</span> <span class="n">offset_t</span><span class="p">)),</span> <span class="n">j_pub</span> <span class="o">=</span> <span class="n">j_pub</span><span class="p">,</span> <span class="n">j_opt</span> <span class="o">=</span> <span class="n">j_opt</span><span class="p">,</span> <span class="n">z_pub</span> <span class="o">=</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">z_opt</span> <span class="o">=</span> <span class="n">z_opt</span><span class="p">,</span> <span class="p">)</span> <span class="p">}</span> <span 
class="p">)</span> <span class="p">})</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span> <span class="n">T_pub</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_pub</span><span class="p">])</span> <span class="o">^</span> <span class="n">z_pub</span><span class="p">,</span> <span class="n">T_opt</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">x</span><span class="p">[</span><span class="n">j_opt</span><span class="p">])</span> <span class="o">^</span> <span class="n">z_opt</span> <span class="p">)</span> <span class="n">sim_t</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 50,000 × 8 ## # Rowwise: ## n x j_pub j_opt z_pub z_opt T_pub T_opt ## &lt;dbl&gt; &lt;list&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 140. 140. ## 2 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 147. 147. ## 3 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 144. 144. ## 4 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 147. 147. ## 5 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 147. 147. ## 6 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 145. 145. ## 7 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 146. 146. ## 8 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 147. 147. ## 9 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 146. 146. ## 10 17 &lt;dbl [17]&gt; 8 9 1.43 1.40 143. 143. 
## # … with 49,990 more rows </code></pre></div> <p>The population quantile&nbsp;is:</p> <div class="highlight"><pre><span></span><code><span class="n">x_p_t</span> <span class="o">&lt;-</span> <span class="nf">qt</span><span class="p">(</span><span class="m">1</span> <span class="o">-</span> <span class="n">p</span><span class="p">,</span> <span class="n">df_t</span><span class="p">)</span> <span class="o">+</span> <span class="n">offset_t</span> <span class="n">x_p_t</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 148.3623 </code></pre></div> <p>The distributions of the tolerance limits using the two approaches are as follows. Again, the distributions are very&nbsp;similar.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_t</span> <span class="o">%&gt;%</span> <span class="nf">pivot_longer</span><span class="p">(</span><span class="n">cols</span> <span class="o">=</span> <span class="n">T_pub</span><span class="o">:</span><span class="n">T_opt</span><span class="p">,</span> <span class="n">names_to</span> <span class="o">=</span> <span class="s">&quot;Approach&quot;</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">value</span><span class="p">,</span> <span class="n">color</span> <span class="o">=</span> <span class="n">Approach</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_density</span><span class="p">()</span> <span class="o">+</span> <span class="nf">facet_wrap</span><span class="p">(</span><span class="n">n</span> <span class="o">~</span> <span class="n">.</span><span class="p">)</span> <span class="o">+</span> <span class="nf">ggtitle</span><span class="p">(</span><span class="s">&quot;Distribution of Tolerance Limits for Various Values of n&quot;</span><span class="p">)</span> 
</code></pre></div> <p><img alt="unnamed-chunk-18-1" src="https://www.kloppenborg.ca/2021/06/calculating-ext-hk-tolerance-limits/Calculating-Ext-HK-Tolerance-Limits_files/figure-markdown/unnamed-chunk-18-1.png"></p> <p>We can now determine what proportion of the calculated tolerance limits were below the population&nbsp;quantile.</p> <div class="highlight"><pre><span></span><code><span class="n">sim_t</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">below_pub</span> <span class="o">=</span> <span class="n">T_pub</span> <span class="o">&lt;</span> <span class="n">x_p_t</span><span class="p">,</span> <span class="n">below_opt</span> <span class="o">=</span> <span class="n">T_opt</span> <span class="o">&lt;</span> <span class="n">x_p_t</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">group_by</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span> <span class="n">prop_below_pub</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_pub</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">(),</span> <span class="n">prop_below_opt</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">(</span><span class="n">below_opt</span><span class="p">)</span> <span class="o">/</span> <span class="nf">n</span><span class="p">()</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 5 × 3 ## n prop_below_pub prop_below_opt ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 17 0.958 0.959 ## 2 20 0.953 0.952 ## 3 23 0.953 0.953 ## 4 24 0.954 0.954 ## 5 28 0.953 0.953 </code></pre></div> <p>For this distribution, the tolerance limits are still&nbsp;conservative.</p> <p>From this simulation work, it appears that both approaches 
to selecting the value of $j$ perform equally well. The tolerance limits produced using each approach for a particular sample will be different, but both approaches seem to be equally&nbsp;valid.</p> <p>The R package <code>cmstatr</code> contains the function <code>hk_ext_z_j_opt</code> which returns $j$ and $z$ for calculating tolerance limits with the optimization method described here (after version 0.8.0<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>). While the tolerance limits found for some particular samples may differ slightly from those produced by the tables published in <span class="caps">CMH</span>-17-1G, both results appear to be equally&nbsp;valid.</p> <div class="footnote"> <hr> <ol> <li id="fn:1"> <p>It should be noted that <span class="caps">CMH</span>-17-1G uses $r$ and $k$ instead of $j$ and $z$ as used in this article and in Vangel&#8217;s paper.&#160;<a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">&#8617;</a></p> </li> <li id="fn:2"> <p><code>cmstatr</code> version 0.8.0 and earlier optimized a slightly different function. That version of the code produces slightly different values of $j$ for certain values of $n$.&#160;<a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">&#8617;</a></p> </li> </ol> </div>Basis Values From Censored Data2021-02-09T00:00:00-05:002021-02-09T00:00:00-05:00Stefan Kloppenborgtag:www.kloppenborg.ca,2021-02-09:/2021/02/basis-values-censored-data/<p>Earlier, I wrote a post about using a <a href="https://www.kloppenborg.ca/2021/02/likelihood-basis-values/">likelihood-based approach to calculating Basis values</a>. In that post, I hinted that likelihood-based approaches can be useful when dealing with censored&nbsp;data.</p> <p>First of all, what does censoring mean? It means that the value reported is either artificially high or artificially low. There are a few reasons that this could happen. It happens often with lifetime data: with fatigue tests, you set a number of cycles at which the specimen &#8220;runs out&#8221; and you stop the test; with studies of mortality, some of the subjects will still be alive when you do the analysis. In these cases, the true value is greater than the observed result, but you don&#8217;t know by how much. These are examples of <em>right-censored</em>&nbsp;data.</p> <p>Data can also be <em>left-censored</em>, meaning that the true value is less than the observed value. This can happen if some of the values are too small to be measured. Perhaps the instrument that you&#8217;re using can&#8217;t detect values below a certain&nbsp;amount.</p> <p>There is also <em>interval-censored</em> data. This often occurs in survey data. For example, you might have data for individuals aged 40-44, but you don&#8217;t know where they fall within that&nbsp;range.</p> <p>In this post, we&#8217;re going to deal with <em>right-censored</em>&nbsp;data.</p> <p>At my day job, I often deal with data from testing of metallic inserts installed in honeycomb sandwich panels. These metallic inserts have a hole in their centers that will accept a screw. Their purpose is to allow a screw to be fastened to the panel, and the strength of this connection is one of the important&nbsp;considerations.</p> <p>We determine the strength of the insert through testing. 
The usual test coupon that we use has two of these inserts installed, and we pull them away from each other to measure the shear strength. This is a convenient way of applying the load, but I&#8217;ve long thought that it must give low results. The loading of the coupon looks like&nbsp;this:</p> <p><img alt="Coupon" src="https://www.kloppenborg.ca/2021/02/basis-values-censored-data/basis-values-censored-data_files/figure-markdown/coupon.png"></p> <p>The reason that I&#8217;ve thought that this test method will give artificially low results is the fact that there are two inserts. The test ends when either one of these two inserts fails: the other insert must be stronger than the one that failed&nbsp;first.</p> <p>To illustrate this, let&#8217;s do a slightly silly thought experiment. Let&#8217;s imagine that we&#8217;re making a set of these coupons. We decide that we&#8217;re going to install one insert in each coupon first, then come back tomorrow and install the other insert in each coupon. Tomorrow comes around, and we decide to let the brand new intern install the second insert. The intern hasn&#8217;t yet been fully trained, and they accidentally install the wrong type of insert in the second hole, which unfortunately looks identical to the correct type. The correct type of insert has a strength that is always $1000 lbf$, but we don&#8217;t know that yet. The wrong type of insert always has a strength of exactly $500 lbf$. When we do our tests, all of the coupons fail on the side that the intern installed (the wrong insert) and the strength of each coupon is $500 lbf$. We conclude that the mean strength of these inserts is $500 lbf$ with a very low&nbsp;variance.</p> <p>But, we&#8217;d be&nbsp;wrong.</p> <p>In this thought experiment, the actual mean strength of the inserts (considering both the correct and incorrect types of inserts) is $750 lbf$ and there&#8217;s actually a pretty high variance. 
We were simply unable to observe the strength of the stronger inserts because of <em>censoring</em>.</p> <p>In a more realistic case, we&#8217;re actually going to be dealing with parts that have strengths drawn from the same continuous distribution. As we move on, we&#8217;re going to assume that the strength of each individual insert is a random variable drawn from the same continuous distribution (that is, they are <span class="caps">IID</span>).</p> <p>Let&#8217;s create some simulated data. We&#8217;ll start by loading a few R packages that we&#8217;ll&nbsp;need.</p> <div class="highlight"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span> <span class="nf">library</span><span class="p">(</span><span class="n">cmstatr</span><span class="p">)</span> <span class="nf">library</span><span class="p">(</span><span class="n">stats4</span><span class="p">)</span> </code></pre></div> <p>Next, we&#8217;ll create a sample of $40$ simulated insert strengths. These will be drawn from a normal distribution with a mean of $1000$ and a standard deviation of&nbsp;$100$.</p> <div class="highlight"><pre><span></span><code><span class="n">pop_mean</span> <span class="o">&lt;-</span> <span class="m">1000</span> <span class="n">pop_sd</span> <span class="o">&lt;-</span> <span class="m">100</span> <span class="nf">set.seed</span><span class="p">(</span><span class="m">123</span><span class="p">)</span> <span class="c1"># make this example reproducible</span> <span class="n">strength</span> <span class="o">&lt;-</span> <span class="nf">rnorm</span><span class="p">(</span><span class="m">40</span><span class="p">,</span> <span class="n">pop_mean</span><span class="p">,</span> <span class="n">pop_sd</span><span class="p">)</span> </code></pre></div> <p>Now let&#8217;s calculate the mean of this sample. 
We expect it to be fairly close to 1000, and indeed it&nbsp;is.</p> <div class="highlight"><pre><span></span><code><span class="nf">mean</span><span class="p">(</span><span class="n">strength</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 1004.518 </code></pre></div> <p>And we can also calculate the standard&nbsp;deviation:</p> <div class="highlight"><pre><span></span><code><span class="nf">sd</span><span class="p">(</span><span class="n">strength</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 89.77847 </code></pre></div> <p>For the strength of most aircraft structures, we are concerned with a lower tolerance bound on the strength. For multiple load-path structure, we need to calculate the B-Basis strength, which is the lower 95% confidence bound on the 10-th percentile of the&nbsp;strength.</p> <p>Since we know the actual strength of all 40 inserts, we can calculate the B-Basis based on these actual insert strengths.
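As an aside, for a normal distribution this B-Basis has a closed form: a one-sided tolerance factor computed from the non-central <em>t</em>-distribution. The sketch below is my own restatement of that standard formula (I believe this is also what `basis_normal()` computes internally, but treat that as an assumption):

```r
# A sketch of the standard one-sided normal tolerance bound
# (assumption: the usual non-central t tolerance factor; see
# Krishnamoorthy & Mathew for the derivation)
normal_b_basis <- function(x, p = 0.9, conf = 0.95) {
  n <- length(x)
  # k is the factor such that mean(x) - k * sd(x) is a lower conf-level
  # confidence bound on the p-quantile of a normal population
  k <- qt(conf, df = n - 1, ncp = qnorm(p) * sqrt(n)) / sqrt(n)
  mean(x) - k * sd(x)
}

set.seed(123)                     # regenerate the same simulated strengths
strength <- rnorm(40, 1000, 100)
normal_b_basis(strength)          # about 852
```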
Ideally, the B-Basis value that we calculate later will be close to this&nbsp;value.</p> <div class="highlight"><pre><span></span><code><span class="nf">basis_normal</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">strength</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code><span class="c1">## outliers_within_batch not run because parameter batch not specified</span><span class="w"></span> <span class="c1">## between_batch_variability not run because parameter batch not specified</span><span class="w"></span> <span class="c1">## </span><span class="w"></span> <span class="c1">## Call:</span><span class="w"></span> <span class="c1">## basis_normal(x = strength)</span><span class="w"></span> <span class="c1">## </span><span class="w"></span> <span class="c1">## Distribution: Normal ( n = 40 )</span><span class="w"></span> <span class="c1">## B-Basis: ( p = 0.9 , conf = 0.95 )</span><span class="w"></span> <span class="c1">## 852.1482</span><span class="w"></span> </code></pre></div> <p>Now, we&#8217;ll take these $40$ insert strengths and put them into $20$ coupons: each with two inserts. 
The observed coupon strength will be set to the <em>lower</em> of the two inserts installed in that coupon, because the coupon will fail as soon as either one of the installed inserts&nbsp;fails.</p> <div class="highlight"><pre><span></span><code><span class="n">dat</span> <span class="o">&lt;-</span> <span class="nf">data.frame</span><span class="p">(</span> <span class="n">ID</span> <span class="o">=</span> <span class="m">1</span><span class="o">:</span><span class="m">20</span><span class="p">,</span> <span class="n">strength1</span> <span class="o">=</span> <span class="n">strength</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">20</span><span class="p">],</span> <span class="n">strength2</span> <span class="o">=</span> <span class="n">strength</span><span class="p">[</span><span class="m">21</span><span class="o">:</span><span class="m">40</span><span class="p">]</span> <span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">strength_observed</span> <span class="o">=</span> <span class="nf">min</span><span class="p">(</span><span class="n">strength1</span><span class="p">,</span> <span class="n">strength2</span><span class="p">))</span> <span class="o">%&gt;%</span> <span class="nf">ungroup</span><span class="p">()</span> <span class="n">dat</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 20 × 4 ## ID strength1 strength2 strength_observed ## &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 944. 893. 893. ## 2 2 977. 978. 977. ## 3 3 1156. 897. 897. ## 4 4 1007. 927. 927. ## 5 5 1013. 937. 937. ## 6 6 1172. 831. 831. ## 7 7 1046. 1084. 1046. ## 8 8 873. 1015. 873. ## 9 9 931. 886. 886. ## 10 10 955. 1125. 955. ## 11 11 1122. 1043. 1043. ## 12 12 1036. 970. 970. ## 13 13 1040. 1090. 1040. ## 14 14 1011. 1088. 1011. 
## 15 15 944. 1082. 944. ## 16 16 1179. 1069. 1069. ## 17 17 1050. 1055. 1050. ## 18 18 803. 994. 803. ## 19 19 1070. 969. 969. ## 20 20 953. 962. 953. </code></pre></div> <p>Let&#8217;s look at the summary statistics for this&nbsp;data:</p> <div class="highlight"><pre><span></span><code><span class="n">dat</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span> <span class="n">mean</span> <span class="o">=</span> <span class="nf">mean</span><span class="p">(</span><span class="n">strength_observed</span><span class="p">),</span> <span class="n">sd</span> <span class="o">=</span> <span class="nf">sd</span><span class="p">(</span><span class="n">strength_observed</span><span class="p">),</span> <span class="n">cv</span> <span class="o">=</span> <span class="nf">cv</span><span class="p">(</span><span class="n">strength_observed</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## # A tibble: 1 × 3 ## mean sd cv ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 954. 75.1 0.0788 </code></pre></div> <p>Hmmm. We see the mean is much lower than the mean of the individual insert strength. Remember that the mean insert strength was $1005$, but the mean strength of the coupons is&nbsp;$954$.</p> <p>Next, we&#8217;ll naively calculate a B-Basis value from the measured strength. 
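Before doing that, a quick aside of mine: the drop in the mean is exactly what the statistics of a minimum predicts. For two <span class="caps">IID</span> draws from $N(\mu, \sigma)$, the expected minimum is $\mu - \sigma/\sqrt{\pi}$, which works out to about $944$ here; a large simulation (assuming the same normal model) confirms it.

```r
# Theoretical mean of the min of two iid N(1000, 100) draws:
# mu - sigma / sqrt(pi), about 943.6 -- consistent with the sample
# mean of 954 seen above (n = 20, so the sample mean is noisy)
1000 - 100 / sqrt(pi)

# Simulation check with many coupons
set.seed(456)
sim_min <- pmin(rnorm(1e5, 1000, 100), rnorm(1e5, 1000, 100))
mean(sim_min)  # close to 943.6
```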
We&#8217;ll assume a normal&nbsp;distribution.</p> <div class="highlight"><pre><span></span><code><span class="n">dat</span> <span class="o">%&gt;%</span> <span class="nf">basis_normal</span><span class="p">(</span><span class="n">strength_observed</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## ## Call: ## basis_normal(data = ., x = strength_observed) ## ## Distribution: Normal ( n = 20 ) ## B-Basis: ( p = 0.9 , conf = 0.95 ) ## 809.1911 </code></pre></div> <p>We&#8217;ll just keep this number in mind for now and we&#8217;ll move on to the idea of using a likelihood-based approach to calculate a B-Basis value, considering the fact that this data is&nbsp;censored.</p> <p>The way that this data is censored might not be immediately obvious. But, each time we test one of these coupons, which contain two inserts, we actually get two pieces of data. We get the strength of one of the inserts. This is an <em>exact</em> value. But we also get a second piece of data. We know that the strength of the other insert is at least as high as the one that failed first. This is a <em>right censored</em>&nbsp;value.</p> <p>In the <a href="https://www.kloppenborg.ca/2021/02/likelihood-basis-values/">previous post</a>, I gave an expression for the likelihood function. However, that function only considers exact observations. The expression for the likelihood, considering censored data, is as follows (see&nbsp;<a href='#Meeker_Hahn_Escobar2017' id='ref-Meeker_Hahn_Escobar2017-1'> Meeker et al.
(2017) </a>).</p> <p>$$\mathcal{L}\left(\theta\right) = \prod_{i=1}^{n} \begin{cases} f\left(X_i;\,\theta\right) <span class="amp">&amp;</span> \mbox{if } X_i \mbox{ is exact} \\ F\left(X_i;\,\theta\right) <span class="amp">&amp;</span> \mbox{if } X_i \mbox{ is left censored} \\ 1 - F\left(X_i;\,\theta\right) <span class="amp">&amp;</span> \mbox{if } X_i \mbox{ is right censored} \end{cases}&nbsp;$$</p> <p>Where $f()$ is the probability density function and $F()$ is the cumulative distribution&nbsp;function.</p> <p>We can implement a log-likelihood function based on this in R as&nbsp;follows:</p> <div class="highlight"><pre><span></span><code><span class="n">log_likelihood_normal</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">censored</span><span class="p">)</span> <span class="p">{</span> <span class="nf">suppressWarnings</span><span class="p">(</span> <span class="nf">sum</span><span class="p">(</span><span class="nf">map2_dbl</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">censored</span><span class="p">,</span> <span class="nf">function</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">ci</span><span class="p">)</span> <span class="p">{</span> <span class="nf">if </span><span class="p">(</span><span class="n">ci</span> <span class="o">==</span> <span class="s">&quot;exact&quot;</span><span class="p">)</span> <span class="p">{</span> <span class="nf">dnorm</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sd</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">log</span> <span
class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="p">}</span> <span class="n">else</span> <span class="nf">if </span><span class="p">(</span><span class="n">ci</span> <span class="o">==</span> <span class="s">&quot;left&quot;</span><span class="p">)</span> <span class="p">{</span> <span class="nf">pnorm</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sd</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">log.p</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="p">}</span> <span class="n">else</span> <span class="nf">if </span><span class="p">(</span><span class="n">ci</span> <span class="o">==</span> <span class="s">&quot;right&quot;</span><span class="p">)</span> <span class="p">{</span> <span class="nf">pnorm</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sd</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">log.p</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">lower.tail</span> <span class="o">=</span> <span class="kc">FALSE</span><span class="p">)</span> <span class="p">}</span> <span class="n">else</span> <span class="p">{</span> <span class="nf">stop</span><span class="p">(</span><span class="s">&quot;Invalid value of censored&quot;</span><span class="p">)</span> <span class="p">}</span> <span class="p">}))</span> <span class="p">)</span> <span class="p">}</span> </code></pre></div> <p>We can use this log-likelihood function to find the maximum-likelihood estimates (<span class="caps">MLE</span>) of the population parameters using the 
<code>stats4::mle()</code> function. First, we&#8217;ll find the <span class="caps">MLE</span> based only on the observed strength of each coupon, taken as a single exact&nbsp;value.</p> <div class="highlight"><pre><span></span><code><span class="nf">mle</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="o">-</span><span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">,</span> <span class="s">&quot;exact&quot;</span><span class="p">)</span> <span class="p">},</span> <span class="n">start</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">1000</span><span class="p">,</span> <span class="m">100</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## ## Call: ## mle(minuslogl = function(mu, sig) { ## -log_likelihood_normal(mu, sig, dat$strength_observed, &quot;exact&quot;) ## }, start = c(1000, 100)) ## ## Coefficients: ## mu sig ## 953.70230 73.27344 </code></pre></div> <p>(Note that the value of start is just a starting point for the numerical&nbsp;optimization.)</p> <p>Here, we get the same value of the mean that we previously&nbsp;calculated.</p> <p>Now, we&#8217;ll repeat the <span class="caps">MLE</span> procedure, but now give it two pieces of data for each coupon: one exact value, and one right-censored&nbsp;value.</p> <div class="highlight"><pre><span></span><code><span class="nf">mle</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span
class="p">)</span> <span class="p">{</span> <span class="o">-</span><span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="nf">c</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">,</span> <span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">),</span> <span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="s">&quot;exact&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">),</span> <span class="nf">rep</span><span class="p">(</span><span class="s">&quot;right&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">)))</span> <span class="p">},</span> <span class="n">start</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">1000</span><span class="p">,</span> <span class="m">100</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## ## Call: ## mle(minuslogl = function(mu, sig) { ## -log_likelihood_normal(mu, sig, c(dat$strength_observed, ## dat$strength_observed), c(rep(&quot;exact&quot;, 20), rep(&quot;right&quot;, ## 20))) ## }, start = c(1000, 100)) ## ## Coefficients: ## mu sig ## 1003.90717 88.51774 </code></pre></div> <p>The mean estimated this way is remarkably close to the true&nbsp;value.</p> <p>As we did in the previous blog post, we&#8217;ll next create a function that returns the profile likelihood based on a value of $t_p$ (the value that the proportion $p$ of the population is&nbsp;below).</p> <div class="highlight"><pre><span></span><code><span class="n">profile_likelihood_normal</span> <span class="o">&lt;-</span> <span class="nf">function</span><span 
class="p">(</span><span class="n">tp</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">censored</span><span class="p">)</span> <span class="p">{</span> <span class="n">m</span> <span class="o">&lt;-</span> <span class="nf">mle</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="o">-</span><span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">censored</span><span class="p">)</span> <span class="p">},</span> <span class="n">start</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">1000</span><span class="p">,</span> <span class="m">100</span><span class="p">)</span> <span class="c1"># A starting guess</span> <span class="p">)</span> <span class="n">mu_hat</span> <span class="o">&lt;-</span> <span class="n">m</span><span class="o">&#64;</span><span class="n">coef</span><span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="n">sig_hat</span> <span class="o">&lt;-</span> <span class="n">m</span><span class="o">&#64;</span><span class="n">coef</span><span class="p">[</span><span class="m">2</span><span class="p">]</span> <span class="n">ll_hat</span> <span class="o">&lt;-</span> <span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu_hat</span><span class="p">,</span> <span class="n">sig_hat</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">censored</span><span class="p">)</span> <span class="nf">optimise</span><span class="p">(</span> <span class="nf">function</span><span 
class="p">(</span><span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="nf">exp</span><span class="p">(</span> <span class="nf">log_likelihood_normal</span><span class="p">(</span> <span class="n">mu</span> <span class="o">=</span> <span class="n">tp</span> <span class="o">-</span> <span class="n">sig</span> <span class="o">*</span> <span class="nf">qnorm</span><span class="p">(</span><span class="n">p</span><span class="p">),</span> <span class="n">sig</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="p">,</span> <span class="n">censored</span> <span class="o">=</span> <span class="n">censored</span> <span class="p">)</span> <span class="o">-</span> <span class="n">ll_hat</span> <span class="p">)</span> <span class="p">},</span> <span class="n">interval</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="n">sig_hat</span> <span class="o">*</span> <span class="m">5</span><span class="p">),</span> <span class="n">maximum</span> <span class="o">=</span> <span class="kc">TRUE</span> <span class="p">)</span><span class="o">$</span><span class="n">objective</span> <span class="p">}</span> </code></pre></div> <p>The shape of this curve is as&nbsp;follows:</p> <div class="highlight"><pre><span></span><code><span class="nf">data.frame</span><span class="p">(</span> <span class="n">tp</span> <span class="o">=</span> <span class="nf">seq</span><span class="p">(</span><span class="m">700</span><span class="p">,</span> <span class="m">1000</span><span class="p">,</span> <span class="n">length.out</span> <span class="o">=</span> <span class="m">200</span><span class="p">)</span> <span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span 
class="nf">mutate</span><span class="p">(</span><span class="n">R</span> <span class="o">=</span> <span class="nf">profile_likelihood_normal</span><span class="p">(</span> <span class="n">tp</span><span class="p">,</span> <span class="m">0.1</span><span class="p">,</span> <span class="nf">c</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">,</span> <span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">),</span> <span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="s">&quot;exact&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">),</span> <span class="nf">rep</span><span class="p">(</span><span class="s">&quot;right&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">))</span> <span class="p">))</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">tp</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">R</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_line</span><span class="p">()</span> <span class="o">+</span> <span class="nf">ggtitle</span><span class="p">(</span><span class="s">&quot;Profile Likelihood for the 10th Percentile&quot;</span><span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-13-1" src="https://www.kloppenborg.ca/2021/02/basis-values-censored-data/basis-values-censored-data_files/figure-markdown/unnamed-chunk-13-1.png"></p> <p>Next, we&#8217;ll find the value of $u$ that satisfies this&nbsp;equation:</p> <p>$$0.05 = \frac{ \int_{-\infty}^{u}R(t_p) d t_p }{ \int_{-\infty}^{\infty}R(t_p) d t_p }&nbsp;$$</p> <div class="highlight"><pre><span></span><code><span
class="n">fn</span> <span class="o">&lt;-</span> <span class="nf">Vectorize</span><span class="p">(</span><span class="nf">function</span><span class="p">(</span><span class="n">tp</span><span class="p">)</span> <span class="p">{</span> <span class="nf">profile_likelihood_normal</span><span class="p">(</span> <span class="n">tp</span><span class="p">,</span> <span class="m">0.1</span><span class="p">,</span> <span class="nf">c</span><span class="p">(</span><span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">,</span> <span class="n">dat</span><span class="o">$</span><span class="n">strength_observed</span><span class="p">),</span> <span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="s">&quot;exact&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">),</span> <span class="nf">rep</span><span class="p">(</span><span class="s">&quot;right&quot;</span><span class="p">,</span> <span class="m">20</span><span class="p">)))</span> <span class="p">})</span> <span class="n">denominator</span> <span class="o">&lt;-</span> <span class="nf">integrate</span><span class="p">(</span> <span class="n">f</span> <span class="o">=</span> <span class="n">fn</span><span class="p">,</span> <span class="n">lower</span> <span class="o">=</span> <span class="m">0</span><span class="p">,</span> <span class="n">upper</span> <span class="o">=</span> <span class="m">1000</span> <span class="p">)</span> <span class="nf">uniroot</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">upper</span><span class="p">)</span> <span class="p">{</span> <span class="n">trial_area</span> <span class="o">&lt;-</span> <span class="nf">integrate</span><span class="p">(</span> <span class="n">fn</span><span class="p">,</span> <span class="n">lower</span> <span class="o">=</span> <span class="m">0</span><span class="p">,</span> 
<span class="n">upper</span> <span class="o">=</span> <span class="n">upper</span> <span class="p">)</span> <span class="nf">return</span><span class="p">(</span><span class="n">trial_area</span><span class="o">$</span><span class="n">value</span> <span class="o">/</span> <span class="n">denominator</span><span class="o">$</span><span class="n">value</span> <span class="o">-</span> <span class="m">0.05</span><span class="p">)</span> <span class="p">},</span> <span class="n">interval</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">700</span><span class="p">,</span> <span class="m">1000</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>##$root ## [1] 845.7739 ## ## $f.root ## [1] -1.8327e-08 ## ##$iter ## [1] 10 ## ## $init.it ## [1] NA ## ##$estim.prec ## [1] 6.103516e-05 </code></pre></div> <p>This value of $846$ is much higher than the value of $809$ that we found earlier based on the coupon strength. But, this value of $846$ is a little lower than the B-Basis of $852$ that was based on the actual strength of all of the inserts&nbsp;installed.</p> <p>One way to view the differences between these three numbers is as follows. The B-Basis strength is related to the 10-th percentile of the strength. But it is actually a confidence bound on the 10-th percentile. If we have only a little bit of information about the strength, there is a lot of uncertainty about the actual 10-th percentile, so the lower confidence bound is quite low. If we have a lot of information about the strength, the uncertainty is small, so the lower confidence bound is close to the actual 10-th&nbsp;percentile.</p> <p>When we calculated a B-Basis from the observed coupon strength, we had 20 pieces of information. When we calculated a B-Basis from the actual insert strength, we had 40 pieces of information. 
When we calculated the B-Basis value considering the censored data, we had 40 pieces of information, but half that information wasn&#8217;t as informative as the other half: the exact values provide more information than the censored&nbsp;values.</p>Basis Values Using a Likelihood Approach2021-02-09T00:00:00-05:002021-02-09T00:00:00-05:00Stefan Kloppenborgtag:www.kloppenborg.ca,2021-02-09:/2021/02/likelihood-basis-values/<p>All materials have some variability in their strength: some pieces of a given material are stronger than others. The design standards for civil aircraft mandate that one must account for this material variability. This is done by setting appropriate material allowables such that either $90\%$ or $99\%$ of the material …</p><p>All materials have some variability in their strength: some pieces of a given material are stronger than others. The design standards for civil aircraft mandate that one must account for this material variability. This is done by setting appropriate material allowables such that either $90\%$ or $99\%$ of the material will have a strength greater than the allowable with $95\%$ confidence. These values are referred to as B-Basis and A-Basis values, respectively. In the language of statistics, they are lower tolerance bounds on the material&nbsp;strength.</p> <p>When you&#8217;re designing an aircraft part, one of the first steps is to determine the allowables to which you&#8217;ll compare the stress when determining the margin of safety. For many metals, A- or B-Basis values are published, and the designer will use those published values as the allowable. 
However, when it comes to composite materials, it is often up to the designer to determine the A- or B-Basis value&nbsp;themselves.</p> <p>The most common way of calculating Basis values is to use the statistical methods published in Volume 1 of <a href="https://www.cmh17.org/"><span class="caps">CMH</span>-17</a> and implemented in the R package <a href="https://www.cmstatr.net/"><code>cmstatr</code></a> (among other implementations). These methods are based on <em>frequentist inference</em>.</p> <p>For example, if the data is assumed to be normally distributed, with this frequentist approach, you would calculate the B-Basis value using the non-central <em>t-</em>distribution (see, for example,&nbsp;<a href='#Krishnamoorthy_Mathew_2008' id='ref-Krishnamoorthy_Mathew_2008-1'> Krishnamoorthy and Mathew (2008) </a>).</p> <p>However, the frequentist approach is not the only way to calculate Basis values: a <em>likelihood</em>-based approach can be used as well. The book <em>Statistical Intervals</em> by <a href='#Meeker_Hahn_Escobar2017' id='ref-Meeker_Hahn_Escobar2017-1'> Meeker et al. (2017) </a> discusses this approach, among other&nbsp;topics.</p> <p>The basic idea of likelihood-based inference is that you can observe some data (by doing mechanical tests, or whatever), but you don&#8217;t yet know the population parameters, such as the mean and the variance. But, you can say that some possible values of the population parameters are more likely than others. For example, if you perform 18 tension tests of a material and the results are all around 100, the likelihood that the population mean is 100 is pretty high, but the likelihood that the population mean is 50 is really low.
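To make that intuition concrete, here's a quick sketch (the data is hypothetical, and the standard deviation is fixed at an arbitrary value purely for illustration):

```r
# Hypothetical: 18 tension-test results, all near 100
x <- c(98.2, 101.1, 99.5, 100.4, 102.3, 97.8, 100.9, 99.1, 101.6,
       98.7, 100.2, 103.0, 96.9, 99.8, 101.3, 100.6, 98.4, 102.0)

# Log-likelihood of the data if the population mean were 100 vs. 50
# (normal model; sd fixed at 2 just for this illustration)
sum(dnorm(x, mean = 100, sd = 2, log = TRUE))  # plausibly high
sum(dnorm(x, mean = 50, sd = 2, log = TRUE))   # vanishingly small
```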
You can define a mathematical function to quantify this likelihood: this is called the <em>likelihood function</em>.</p> <p>If you just need a point estimate of the population parameters, you can find the highest value of this likelihood function: this is called the maximum likelihood estimate. If you need to find an interval or a bound (for example, the B-Basis, which is a lower tolerance bound), you can plot this likelihood function versus the population parameters and use this distribution of likelihood to determine a range of population parameters that are &#8220;sufficiently likely&#8221; to be within the&nbsp;interval.</p> <p>The likelihood-based approach to calculating Basis values is more computationally expensive, but it allows you to deal with data that is left- or right-censored, and you can use the same computational algorithm for a wide variety of location-scale distributions. I&#8217;m planning on writing about calculating Basis values for censored data&nbsp;soon.</p> <h1>Example&nbsp;Data</h1> <p>For the purpose of this blog post, we&#8217;ll look at some data that is included in the <code>cmstatr</code> package. We&#8217;ll use this data to calculate a B-Basis value using the more traditional frequentist approach, then using a likelihood-based&nbsp;approach.</p> <p>We&#8217;ll start by loading several R packages that we&#8217;ll&nbsp;need:</p> <div class="highlight"><pre><span></span><code><span class="nf">library</span><span class="p">(</span><span class="n">cmstatr</span><span class="p">)</span> <span class="nf">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span> <span class="nf">library</span><span class="p">(</span><span class="n">stats4</span><span class="p">)</span> </code></pre></div> <p>Next, we&#8217;ll get the data that we&#8217;re going to use. We&#8217;ll use the &#8220;warp tension&#8221; data from the <code>carbon.fabric.2</code> data set that comes with <code>cmstatr</code>.
We&#8217;ll consider only the <code>RTD</code> environmental&nbsp;condition.</p> <div class="highlight"><pre><span></span><code><span class="n">carbon.fabric.2</span> <span class="o">%&gt;%</span> <span class="nf">filter</span><span class="p">(</span><span class="n">test</span> <span class="o">==</span> <span class="s">&quot;WT&quot;</span> <span class="o">&amp;</span> <span class="n">condition</span> <span class="o">==</span> <span class="s">&quot;RTD&quot;</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## test condition batch panel thickness nplies strength modulus failure_mode ## 1 WT RTD A 1 0.113 14 129.224 8.733 LAB ## 2 WT RTD A 1 0.112 14 144.702 8.934 LAT,LWB ## 3 WT RTD A 1 0.113 14 137.194 8.896 LAB ## 4 WT RTD A 1 0.113 14 139.728 8.835 LAT,LWB ## 5 WT RTD A 2 0.113 14 127.286 9.220 LAB ## 6 WT RTD A 2 0.111 14 129.261 9.463 LAT ## 7 WT RTD A 2 0.112 14 130.031 9.348 LAB ## 8 WT RTD B 1 0.111 14 140.038 9.244 LAT,LGM ## 9 WT RTD B 1 0.111 14 132.880 9.267 LWT ## 10 WT RTD B 1 0.113 14 132.104 9.198 LAT ## 11 WT RTD B 2 0.114 14 137.618 9.179 LAT,LAB ## 12 WT RTD B 2 0.113 14 139.217 9.123 LAB ## 13 WT RTD B 2 0.113 14 134.912 9.116 LAT ## 14 WT RTD B 2 0.111 14 141.558 9.434 LAB / LAT ## 15 WT RTD C 1 0.108 14 150.242 9.451 LAB ## 16 WT RTD C 1 0.109 14 147.053 9.391 LGM ## 17 WT RTD C 1 0.111 14 145.001 9.318 LAT,LWB ## 18 WT RTD C 1 0.113 14 135.686 8.991 LAT / LAB ## 19 WT RTD C 1 0.112 14 136.075 9.221 LAB ## 20 WT RTD C 2 0.114 14 143.738 8.803 LAT,LGM ## 21 WT RTD C 2 0.113 14 143.715 8.893 LAT,LAB ## 22 WT RTD C 2 0.113 14 147.981 8.974 LGM,LWB ## 23 WT RTD C 2 0.112 14 148.418 9.118 LAT,LWB ## 24 WT RTD C 2 0.113 14 135.435 9.217 LAT/LAB ## 25 WT RTD C 2 0.113 14 146.285 8.920 LWT/LWB ## 26 WT RTD C 2 0.111 14 139.078 9.015 LAT ## 27 WT RTD C 2 0.112 14 146.825 9.036 LAT/LWT ## 28 WT RTD C 2 0.110 14 148.235 9.336 LWB/LAB </code></pre></div> <p>We really care only about the strength vector from 
this data, so we&#8217;ll save that vector by itself in a variable for easy access&nbsp;later.</p> <div class="highlight"><pre><span></span><code><span class="n">dat</span> <span class="o">&lt;-</span> <span class="p">(</span><span class="n">carbon.fabric.2</span> <span class="o">%&gt;%</span> <span class="nf">filter</span><span class="p">(</span><span class="n">test</span> <span class="o">==</span> <span class="s">&quot;WT&quot;</span> <span class="o">&amp;</span> <span class="n">condition</span> <span class="o">==</span> <span class="s">&quot;RTD&quot;</span><span class="p">))[[</span><span class="s">&quot;strength&quot;</span><span class="p">]]</span> <span class="n">dat</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 129.224 144.702 137.194 139.728 127.286 129.261 130.031 140.038 132.880 ## [10] 132.104 137.618 139.217 134.912 141.558 150.242 147.053 145.001 135.686 ## [19] 136.075 143.738 143.715 147.981 148.418 135.435 146.285 139.078 146.825 ## [28] 148.235 </code></pre></div> <h1>Frequentist&nbsp;B-Basis</h1> <p>We can use the <code>cmstatr</code> package to calculate the B-Basis value from this example data.
We&#8217;re going to assume that the data follows a normal distribution throughout this blog&nbsp;post.</p> <div class="highlight"><pre><span></span><code><span class="nf">basis_normal</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">dat</span><span class="p">,</span> <span class="n">p</span> <span class="o">=</span> <span class="m">0.9</span><span class="p">,</span> <span class="n">conf</span> <span class="o">=</span> <span class="m">0.95</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## ## Call: ## basis_normal(x = dat, p = 0.9, conf = 0.95) ## ## Distribution: Normal ( n = 28 ) ## B-Basis: ( p = 0.9 , conf = 0.95 ) ## 127.5415 </code></pre></div> <p>So using this approach, we get a B-Basis value of&nbsp;$127.54$.</p> <h1>Likelihood-Based&nbsp;B-Basis</h1> <p>The first step in implementing a likelihood-based approach is to define a likelihood function. This function is the product of the probability density function (<span class="caps">PDF</span>) at each observation ($X_i$), given a set of population parameters ($\theta$) (see&nbsp;<a href='#Wasserman_2004' id='ref-Wasserman_2004-1'> Wasserman (2004) </a>).</p> <p>$$\mathcal{L}\left(\theta\right) = \prod_{i=1}^{n} f\left(X_i;\,\theta\right)&nbsp;$$</p> <p>We&#8217;ll actually implement a log-likelihood function in R because taking a log-transform avoids some numerical issues. 
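For readers following along outside R, the same log-likelihood computation can be sketched in plain Python (standard library only; this is an added illustration of the formula above, not part of the original analysis):

```python
import math

def log_likelihood_normal(mu, sig, x):
    """Log-likelihood of (mu, sig) for normal data: the sum of the
    log-PDF values at each observation, i.e. the log of the product
    in the likelihood formula above."""
    return sum(
        -0.5 * math.log(2 * math.pi) - math.log(sig)
        - (xi - mu) ** 2 / (2 * sig ** 2)
        for xi in x
    )
```

Numerically, this matches R&#8217;s `sum(dnorm(x, mu, sig, log = TRUE))`.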
This log-likelihood function will take three arguments: the two parameters of the distribution (<code>mu</code> and <code>sigma</code>) and a vector of the&nbsp;data.</p> <div class="highlight"><pre><span></span><code><span class="n">log_likelihood_normal</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="nf">suppressWarnings</span><span class="p">(</span> <span class="nf">sum</span><span class="p">(</span> <span class="nf">dnorm</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sd</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">log</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="p">)</span> <span class="p">)</span> <span class="p">}</span> </code></pre></div> <p>We can use this log-likelihood function to find the maximum-likelihood estimates (<span class="caps">MLE</span>) of the population parameters using the <code>stats4::mle()</code> function. 
This function takes the negative log-likelihood function and a starting guess for the&nbsp;parameters.</p> <div class="highlight"><pre><span></span><code><span class="nf">mle</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="o">-</span><span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">dat</span><span class="p">)</span> <span class="p">},</span> <span class="n">start</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">130</span><span class="p">,</span> <span class="m">6.5</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## ## Call: ## mle(minuslogl = function(mu, sig) { ## -log_likelihood_normal(mu, sig, dat) ## }, start = c(130, 6.5)) ## ## Coefficients: ## mu sig ## 139.626036 6.594905 </code></pre></div> <p>We will be denoting these maximum likelihood estimates as $\hat\mu$ and $\hat\sigma$. 
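For a normal distribution these MLEs also have closed forms: $\hat\mu$ is the sample mean, and $\hat\sigma$ is the standard deviation computed with an $n$ (rather than $n - 1$) denominator. A quick cross-check in plain Python (the values are transcribed from the data printed earlier; this block is an added illustration, not from the original post):

```python
import math

# strength values transcribed from the dataset printed above
dat = [129.224, 144.702, 137.194, 139.728, 127.286, 129.261, 130.031,
       140.038, 132.880, 132.104, 137.618, 139.217, 134.912, 141.558,
       150.242, 147.053, 145.001, 135.686, 136.075, 143.738, 143.715,
       147.981, 148.418, 135.435, 146.285, 139.078, 146.825, 148.235]

n = len(dat)
mu_hat = sum(dat) / n                                           # MLE of the mean
sig_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in dat) / n)    # n, not n - 1

print(mu_hat, sig_hat)  # about 139.626 and 6.595
```

This reproduces the optimizer&#8217;s estimates to within its convergence tolerance.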
They match the sample mean and sample standard deviation within a reasonable tolerance, but are not exactly&nbsp;equal.</p> <div class="highlight"><pre><span></span><code><span class="nf">mean</span><span class="p">(</span><span class="n">dat</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 139.6257 </code></pre></div> <div class="highlight"><pre><span></span><code><span class="nf">sd</span><span class="p">(</span><span class="n">dat</span><span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## [1] 6.716047 </code></pre></div> <p>The relative likelihood is the ratio of the value of the likelihood function evaluated at a given set of parameters to the value of the likelihood function evaluated at the <span class="caps">MLE</span> of the parameters. The relative likelihood would then be a function with two arguments: one for each of the parameters $\mu$ and $\sigma$. To reduce the number of arguments, <a href='#Meeker_Hahn_Escobar2017' id='ref-Meeker_Hahn_Escobar2017-2'> Meeker et al. (2017) </a> use a <em>profile likelihood</em> function instead. This is the same as the likelihood ratio, but it is maximized with respect to $\sigma$, as defined&nbsp;below:</p> <p>$$R\left(\mu\right) = \max_\sigma \left[\frac{\mathcal{L}\left(\mu, \sigma\right)}{\mathcal{L}\left(\hat\mu, \hat\sigma\right)}\right]&nbsp;$$</p> <p>When we&#8217;re trying to calculate a Basis value, we don&#8217;t really care about the mean as a population parameter. Instead, we care about a particular proportion of the population. Since a normal distribution (or any other location-scale distribution) is uniquely defined by two parameters, <a href='#Meeker_Hahn_Escobar2017' id='ref-Meeker_Hahn_Escobar2017-3'> Meeker et al. (2017) </a> note that you can use two alternate parameters instead. In our case, we&#8217;ll keep $\sigma$ as one of the parameters, but we&#8217;ll use $t_p$ as the other instead.
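As a quick numeric illustration (an addition, not from the original post): with $p = 0.1$, the MLE parameters above imply a 10th-percentile estimate that can be computed in a couple of lines of standard-library Python:

```python
from statistics import NormalDist

mu_hat, sig_hat = 139.626036, 6.594905  # MLE values from the mle() output above
p = 0.1

# t_p = mu + sigma * PhiInverse(p); equivalently, mu = t_p - sigma * PhiInverse(p)
t_p = mu_hat + sig_hat * NormalDist().inv_cdf(p)
print(t_p)  # about 131.17
```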
Here, $t_p$ is the value that the proportion $p$ of the population falls below. For example, $t_{0.1}$ would represent the 10-th percentile of the&nbsp;population.</p> <p>We can convert between $\mu$ and $t_p$ as&nbsp;follows:</p> <p>$$\mu = t_p - \sigma \Phi^{-1}\left(p\right)&nbsp;$$</p> <p>Given this re-parameterization, we can implement the profile likelihood function as&nbsp;follows:</p> <div class="highlight"><pre><span></span><code><span class="n">profile_likelihood_normal</span> <span class="o">&lt;-</span> <span class="nf">function</span><span class="p">(</span><span class="n">tp</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">m</span> <span class="o">&lt;-</span> <span class="nf">mle</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="o">-</span><span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="p">},</span> <span class="n">start</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">130</span><span class="p">,</span> <span class="m">6.5</span><span class="p">)</span> <span class="p">)</span> <span class="n">mu_hat</span> <span class="o">&lt;-</span> <span class="n">m</span><span class="o">&#64;</span><span class="n">coef</span><span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="n">sig_hat</span> <span class="o">&lt;-</span> <span class="n">m</span><span class="o">&#64;</span><span class="n">coef</span><span class="p">[</span><span class="m">2</span><span class="p">]</span> <span class="n">ll_hat</span> <span 
class="o">&lt;-</span> <span class="nf">log_likelihood_normal</span><span class="p">(</span><span class="n">mu_hat</span><span class="p">,</span> <span class="n">sig_hat</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="nf">optimise</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">sig</span><span class="p">)</span> <span class="p">{</span> <span class="nf">exp</span><span class="p">(</span> <span class="nf">log_likelihood_normal</span><span class="p">(</span> <span class="n">mu</span> <span class="o">=</span> <span class="n">tp</span> <span class="o">-</span> <span class="n">sig</span> <span class="o">*</span> <span class="nf">qnorm</span><span class="p">(</span><span class="n">p</span><span class="p">),</span> <span class="n">sig</span> <span class="o">=</span> <span class="n">sig</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="n">x</span> <span class="p">)</span> <span class="o">-</span> <span class="n">ll_hat</span> <span class="p">)</span> <span class="p">},</span> <span class="n">interval</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="n">sig_hat</span> <span class="o">*</span> <span class="m">5</span><span class="p">),</span> <span class="n">maximum</span> <span class="o">=</span> <span class="kc">TRUE</span> <span class="p">)</span><span class="o">$</span><span class="n">objective</span> <span class="p">}</span> </code></pre></div> <p>We can visualize the profile likelihood&nbsp;function:</p> <div class="highlight"><pre><span></span><code><span class="nf">data.frame</span><span class="p">(</span> <span class="n">tp</span> <span class="o">=</span> <span class="nf">seq</span><span class="p">(</span><span class="m">120</span><span class="p">,</span> <span class="m">140</span><span class="p">,</span> <span 
class="n">length.out</span> <span class="o">=</span> <span class="m">200</span><span class="p">)</span> <span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">rowwise</span><span class="p">()</span> <span class="o">%&gt;%</span> <span class="nf">mutate</span><span class="p">(</span><span class="n">R</span> <span class="o">=</span> <span class="nf">profile_likelihood_normal</span><span class="p">(</span><span class="n">tp</span><span class="p">,</span> <span class="m">0.1</span><span class="p">,</span> <span class="n">dat</span><span class="p">))</span> <span class="o">%&gt;%</span> <span class="nf">ggplot</span><span class="p">(</span><span class="nf">aes</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">tp</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">R</span><span class="p">))</span> <span class="o">+</span> <span class="nf">geom_line</span><span class="p">()</span> <span class="o">+</span> <span class="nf">ggtitle</span><span class="p">(</span><span class="s">&quot;Profile Likelihood for the 10th Percentile&quot;</span><span class="p">)</span> </code></pre></div> <p><img alt="unnamed-chunk-9-1" src="https://www.kloppenborg.ca/2021/02/likelihood-basis-values/likelihood-basis-values_files/figure-markdown/unnamed-chunk-9-1.png"></p> <p>The way to interpret this plot is that it&#8217;s quite unlikely that the true value of$t_p$is 120, and it&#8217;s unlikely that it&#8217;s 140, but it&#8217;s pretty likely that it&#8217;s around&nbsp;131.</p> <p>However, when we&#8217;re calculating Basis values, we aren&#8217;t trying to find the most likely value of$t_p$: we&#8217;re trying to find a lower bound of the value of&nbsp;$t_p$.</p> <p>The asymptotic distribution of$R$is the$\chi^2$distribution. If you&#8217;re working with large samples, you can use this fact to determine the lower bound of$t_p$. 
However, for the sample sizes that are typically used for composite material testing, the actual distribution of $R$ is far enough from a $\chi^2$ distribution that you can&#8217;t actually do&nbsp;this.</p> <p>Instead, we can use numerical integration to find the lower tolerance bound. We can find a value of $t_p$, which we&#8217;ll call $u$, where $5\%$ of the area under the $R$ curve is to its left. This will give the $95\%$ lower confidence bound on the population parameter. This can be written as follows. We&#8217;ll use numerical root finding to solve this expression for&nbsp;$u$.</p> <p>$$0.05 = \frac{ \int_{-\infty}^{u}R(t_p) d t_p }{ \int_{-\infty}^{\infty}R(t_p) d t_p }&nbsp;$$</p> <p>Since the value of $R$ vanishes as we move far from about 130, we won&#8217;t actually integrate from $-\infty$ to $\infty$, but rather integrate between two values that are relatively far from the peak of the $R$&nbsp;curve.</p> <p>We can implement this in the R language as follows. First, we&#8217;ll find the value of the&nbsp;denominator.</p> <div class="highlight"><pre><span></span><code><span class="n">fn</span> <span class="o">&lt;-</span> <span class="nf">Vectorize</span><span class="p">(</span><span class="nf">function</span><span class="p">(</span><span class="n">tp</span><span class="p">)</span> <span class="p">{</span> <span class="nf">profile_likelihood_normal</span><span class="p">(</span><span class="n">tp</span><span class="p">,</span> <span class="m">0.1</span><span class="p">,</span> <span class="n">dat</span><span class="p">)</span> <span class="p">})</span> <span class="n">denominator</span> <span class="o">&lt;-</span> <span class="nf">integrate</span><span class="p">(</span> <span class="n">f</span> <span class="o">=</span> <span class="n">fn</span><span class="p">,</span> <span class="n">lower</span> <span class="o">=</span> <span class="m">100</span><span class="p">,</span> <span class="n">upper</span> <span class="o">=</span> <span class="m">150</span> <span
class="p">)</span> <span class="n">denominator</span> </code></pre></div> <div class="highlight"><pre><span></span><code>## 4.339919 with absolute error &lt; 8.9e-07 </code></pre></div> <div class="highlight"><pre><span></span><code><span class="nf">uniroot</span><span class="p">(</span> <span class="nf">function</span><span class="p">(</span><span class="n">upper</span><span class="p">)</span> <span class="p">{</span> <span class="n">trial_area</span> <span class="o">&lt;-</span> <span class="nf">integrate</span><span class="p">(</span> <span class="n">fn</span><span class="p">,</span> <span class="n">lower</span> <span class="o">=</span> <span class="m">0</span><span class="p">,</span> <span class="n">upper</span> <span class="o">=</span> <span class="n">upper</span> <span class="p">)</span> <span class="nf">return</span><span class="p">(</span><span class="n">trial_area</span><span class="o">$</span><span class="n">value</span> <span class="o">/</span> <span class="n">denominator</span><span class="o">$</span><span class="n">value</span> <span class="o">-</span> <span class="m">0.05</span><span class="p">)</span> <span class="p">},</span> <span class="n">interval</span> <span class="o">=</span> <span class="nf">c</span><span class="p">(</span><span class="m">100</span><span class="p">,</span> <span class="m">150</span><span class="p">)</span> <span class="p">)</span> </code></pre></div> <div class="highlight"><pre><span></span><code>##$root ## [1] 127.4914 ## ## $f.root ## [1] -3.810654e-08 ## ##$iter ## [1] 14 ## ## $init.it ## [1] NA ## ##$estim.prec ## [1] 6.103516e-05 </code></pre></div> <p>The B-Basis value that we get using this approach is $127.49$. 
This value of $127.49$ is quite close to $127.54$, which was the value that we got using the frequentist&nbsp;approach.</p> <p>In a simple case like this data set, it wouldn&#8217;t be worth the extra effort of using a likelihood-based approach to calculating the Basis value, but we have demonstrated that this approach does&nbsp;work.</p> <p>In a later blog post, we&#8217;ll explore a case where it is worth the extra effort. (<em>Edit: that post is <a href="https://www.kloppenborg.ca/2021/02/basis-values-censored-data/">here</a></em>)</p>cmstatr: Composite Material Data Statistics in R2020-07-22T00:00:00-04:002020-07-22T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2020-07-22:/2020/07/cmstatr/<p>From what I&#8217;ve seen, a lot of the statistical analysis of composite material data is done in <span class="caps">MS</span> Excel. There are a number of very good tools for doing this analysis in <span class="caps">MS</span> Excel: <a href="https://www.niar.wichita.edu/agate/Documents/default.htm"><span class="caps">ASAP</span></a>, <a href="http://www.niar.wichita.edu/coe/NCAMP_Documents/Programs/HYTEQ%20Feb%207%20%202011.xls"><span class="caps">HYTEQ</span></a>, <span class="caps">STAT</span>-17, and more recently, <a href="https://www.cmh17.org/RESOURCES/StatisticsSoftware.aspx"><span class="caps">CMH17</span>-<span class="caps">STATS</span></a>. I expect that the &#8230;</p><p>From what I&#8217;ve seen, a lot of the statistical analysis of composite material data is done in <span class="caps">MS</span> Excel.
There are a number of very good tools for doing this analysis in <span class="caps">MS</span> Excel: <a href="https://www.niar.wichita.edu/agate/Documents/default.htm"><span class="caps">ASAP</span></a>, <a href="http://www.niar.wichita.edu/coe/NCAMP_Documents/Programs/HYTEQ%20Feb%207%20%202011.xls"><span class="caps">HYTEQ</span></a>, <span class="caps">STAT</span>-17, and more recently, <a href="https://www.cmh17.org/RESOURCES/StatisticsSoftware.aspx"><span class="caps">CMH17</span>-<span class="caps">STATS</span></a>. I expect that the reason for the popularity of <span class="caps">MS</span> Excel for this application is that everyone in the industry has <span class="caps">MS</span> Excel installed on their computer and <span class="caps">MS</span> Excel is easy to&nbsp;use.</p> <p>If you&#8217;ve read my blog before, you&#8217;ll know that I think that <a href="https://www.kloppenborg.ca/2019/06/reproducibility/">reproducibility</a> is important for engineering calculations. In my view, this includes statistical analysis. If the analysis isn&#8217;t reproducible, how does a reviewer &#8212; either now or in the future &#8212; know if it&#8217;s&nbsp;right?</p> <p>The current <span class="caps">MS</span> Excel tools are typically password protected so that users can&#8217;t view the macros that perform the calculations. I suspect that this was done with the best of intentions in order to prevent users from changing the code. But it also means that users can&#8217;t verify that the code is correct, or check if there are any unstated assumptions&nbsp;made.</p> <p>To allow statistical analysis of composite material data using open-source software, I&#8217;ve written a package for the <a href="https://www.r-project.org/">R programming language</a> that implements the statistical methods described in <a href="https://www.cmh17.org"><span class="caps">CMH</span>-17-1G</a>.
This package, <a href="https://www.cmstatr.net"><code>cmstatr</code></a>, has been released on <a href="https://cran.r-project.org/package=cmstatr"><span class="caps">CRAN</span></a>. There is also a brief discussion of this package in a <a href="https://doi.org/10.21105/joss.02265">paper published in the Journal of Open Source Software</a>.</p> <p>This R package allows statistical analysis to be performed using open-source tools &#8212; which can be verified by the user &#8212; and facilitates writing statistical analysis reports at the same time that the analysis is performed by using <code>R-Notebooks</code> (see my <a href="https://www.kloppenborg.ca/2019/10/pandoc-report-templates/">earlier post</a>).</p> <p>I&#8217;ve tried to write the functions in a consistent manner so that it&#8217;s easier to learn how to use the package. I&#8217;ve also written functions to work well with the <a href="https://www.tidyverse.org/"><code>tidyverse</code></a> set of&nbsp;packages.</p> <p>There are some examples of how to use the <code>cmstatr</code> package in <a href="https://www.cmstatr.net/articles/cmstatr_Tutorial.html">this vignette</a>.</p> <p>I hope that people find this package useful. If you use this package and find a bug, have feedback, or would like a feature added, please raise an issue on <a href="https://github.com/ComtekAdvancedStructures/cmstatr/issues">GitHub</a>.</p>Tracking Issues using Jupyter Notebooks2020-04-12T00:00:00-04:002020-04-12T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2020-04-12:/2020/04/tracking-issues/<p><em>Edit (26-May-2022): This post is largely obsolete, now that GitHub is able to <a href="https://github.blog/2022-05-19-math-support-in-markdown/">render math in Markdown documents</a>, including issues.
I&#8217;m keeping this post up for historical reasons, but I&#8217;d now recommend that you use GitHub Issues directly and include mathematical notation as&nbsp;needed.</em></p> <p>I&#8217;m currently …</p><p><em>Edit (26-May-2022): This post is largely obsolete, now that GitHub is able to <a href="https://github.blog/2022-05-19-math-support-in-markdown/">render math in Markdown documents</a>, including issues. I&#8217;m keeping this post up for historical reasons, but I&#8217;d now recommend that you use GitHub Issues directly and include mathematical notation as&nbsp;needed.</em></p> <p>I&#8217;m currently collaborating on a paper. My collaborator and I are writing the paper using LaTeX and we&#8217;re using git to track and share changes to the manuscript. We currently have a shared repository on <a href="https://www.github.com">GitHub</a>.</p> <p>GitHub has a lot of great features for collaborating on software &#8212; after all, that&#8217;s why it was developed. The &#8220;Issues&#8221; feature in a repository is particularly useful. This allows you to discuss problems and track the resolution of those problems. Text formatting is supported in GitHub Issues using Markdown. In many flavors of markdown, you can also embed math using LaTeX syntax. Unfortunately, GitHub flavored markdown <a href="https://github.com/github/markup/issues/274">does not support math</a> (<em>Edit: Note that GitHub flavored markdown now <strong>does</strong> support math</em>). This is probably fine for the vast majority of software projects. However, it is a problem when we&#8217;re trying to discuss a mathematical&nbsp;model.</p> <p>Several people on the internet have suggested various solutions to this shortcoming. Some have suggested using an external engine to render your math as an image, then embed that image in markdown.
This works, but I think it&#8217;s&nbsp;cumbersome.</p> <p>Several others have suggested using a <a href="https://jupyter.org">Jupyter Notebook</a>, which GitHub does actually render. I think that this is a better solution, and this is the solution that I&#8217;m planning on using with my&nbsp;collaborator.</p> <h1>Implementation&nbsp;Summary</h1> <p>In our git repository, I&#8217;m creating a folder called <code>issues-open</code>. Inside this folder is a set of Jupyter Notebooks, one per issue. Each collaborator can review these Notebooks, which conveniently get rendered on the GitHub web interface. When a collaborator has something to add to the issue, they can fire up their Jupyter instance and make some changes &#8212; either by adding new cells to the bottom of the notebook, or making changes to the existing text &#8212; and committing and pushing the changes. We&#8217;ve adopted the practice of starting each cell with a heading with the name of the author of that cell. This way, the Notebook looks a bit like a&nbsp;conversation.</p> <h1>Launching Jupyter&nbsp;Notebooks</h1> <p>We&#8217;re using a <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html">conda environment</a> for Python so that we&#8217;re synced up on the versions of each package we&#8217;re using. So, the first step will be creating the conda environment from the environment <span class="caps">YAML</span> file. In our case, this would look like&nbsp;this:</p> <div class="highlight"><pre><span></span><code>conda env create -f environment.yml </code></pre></div> <p>This only needs to be done once on each computer. Once that&#8217;s been done, you just need to activate the environment. This is basically just telling your terminal that you want to use that version of Python. 
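For reference, an environment file for a setup like this might look something like the following (the package list is purely illustrative and will vary by project; only the <code>name</code> field needs to match the name used when activating):

```yaml
# illustrative environment.yml; the actual package list will differ per project
name: my-environment
channels:
  - conda-forge
dependencies:
  - python=3.9
  - jupyter
```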
This can be accomplished like the following (obviously, replace the name of the environment with the correct&nbsp;name):</p> <div class="highlight"><pre><span></span><code>conda activate my-environment </code></pre></div> <p>Now, you can launch the Jupyter Notebook session using the following. Your web browser should pop up and allow you to create new notebooks and edit existing notebooks in the browser once you run this&nbsp;command.</p> <div class="highlight"><pre><span></span><code>jupyter notebook </code></pre></div> <h1>Collaborating on&nbsp;Issues</h1> <p>The Jupyter Notebook interface is relatively straightforward and doesn&#8217;t need much discussion here. Most of the important features are available through the menus. There are keyboard shortcuts that come in handy, which can be found <a href="https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330">here</a>.</p> <p>Jupyter notebooks comprise a set of cells. The basic types of cells are markdown, code and raw. We&#8217;ll ignore raw cells here. Markdown cells contain text styled using <a href="https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html">markdown syntax</a>. Code cells contain executable code. In our case, this will all be Python&nbsp;code.</p> <p>If there is any code in the notebook, it&#8217;s important to realize that it runs interactively. You execute one code cell at a time. You don&#8217;t have to execute them in order either. So, if the code has side effects &#8212; like changing a global variable &#8212; the order that you run the cells in makes a difference. I think it&#8217;s good practice to restart your Python interpreter and re-run all the cells before committing a notebook in git. To do this, just click <code>Kernel</code> / <code>Restart &amp; Run All</code>.
This guarantees that the cells were run in order and have repeatable&nbsp;output.</p> <p>The other advantage to restarting the kernel and re-running all the cells before committing is to avoid extraneous changes being tracked by git. The notebook files include a counter indicating the order in which the cells were executed. The first cell to be executed will have a counter value of 1, the second will have a value of 2, etcetera. If you execute the first five cells, then execute the first one again, it will now have a counter value of 6. If you&#8217;ve been playing around with a notebook for a while, all those counters will be incremented even higher. Even if you make no real changes to the notebook, git will register these counter changes as changes that need to be committed and tracked. You really only want the real changes to be tracked, and the easiest way to do this is to ensure that the code cells are executed in order starting from an execution count of&nbsp;one.</p> <h1>Closing an&nbsp;Issue</h1> <p>When it&#8217;s time to close an issue, whoever closes the issue simply moves the Jupyter Notebook discussing the issue to a folder called <code>issues-closed</code>. This should be a <code>git mv</code> so that the history is&nbsp;maintained.</p> <p>As an example, to close the issue discussed in the Notebook <code>reorder-model-development.ipynb</code>, the command would&nbsp;be:</p> <div class="highlight"><pre><span></span><code>git mv issues-open/reorder-model-development.ipynb issues-closed/ </code></pre></div>Pandoc Report Templates2019-10-29T00:00:00-04:002019-10-29T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2019-10-29:/2019/10/pandoc-report-templates/<p>The main benefit of using Notebooks (R Notebooks or Jupyter Notebooks) is that the document is reproducible: the reader knows exactly how the results of the analysis were obtained.
I wrote about the use of Notebooks in an <a href="https://www.kloppenborg.ca/2019/06/reproducibility/">earlier&nbsp;post.</a></p> <p>Most organizations have a certain report format: a certain cover sheet layout, a certain font, a log of revisions, etcetera. For the most part, organizations have an <span class="caps">MS</span> Word template for this report format. If you want to use a Notebook for your analysis and to write your report, you have a few&nbsp;options:</p> <ul> <li>You could write front matter in <span class="caps">MS</span> Word using your company&#8217;s report template and then attach the Notebook as an&nbsp;appendix.</li> <li>You could also use <code>Pandoc</code> (more about what this is later) to convert the Notebook into a .docx file and then merge it into the report&nbsp;template.</li> <li>You could create your own <code>Pandoc</code> template to convert a Notebook directly into a <span class="caps">PDF</span> with the correct&nbsp;formatting.</li> </ul> <p>The first option of attaching a Notebook as an appendix to a report otherwise created in <span class="caps">MS</span> Word is effective but means that you need to maintain two different files: the <span class="caps">MS</span> Word report and the Notebook itself. The second option of exporting the Notebook to <span class="caps">MS</span> Word and merging it into the template is problematic when it comes to document revisions.
If part of the analysis is revised, there is a temptation to change the affected part by either only re-exporting that section from the Notebook into docx, or worse, making the change directly in <span class="caps">MS</span> Word. In both cases, there is the possibility of breaking the reproducibility. For example, let&#8217;s say that in your report you define some constants at the beginning and do some math using these&nbsp;constants:</p> <div class="highlight"><pre><span></span><code><span class="n">P</span> <span class="o">=</span> <span class="mi">1000</span> <span class="n">A1</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">A2</span> <span class="o">=</span> <span class="mi">4</span> <span class="n">sigma1</span> <span class="o">=</span> <span class="n">P</span> <span class="o">/</span> <span class="n">A1</span> <span class="nb">print</span><span class="p">(</span><span class="n">sigma1</span><span class="p">)</span> <span class="c1"># 500</span> <span class="n">sigma2</span> <span class="o">=</span> <span class="n">P</span> <span class="o">/</span> <span class="n">A2</span> <span class="nb">print</span><span class="p">(</span><span class="n">sigma2</span><span class="p">)</span> <span class="c1"># 250</span> </code></pre></div> <p>Now let&#8217;s say that you ask your new intern to revise the document so that $P = 1200$. They just edit the <span class="caps">MS</span> Word version of the report thinking that they will save some time. They don&#8217;t notice that $P$ is used twice in the calculation and only update the result from the first time it&#8217;s used.
Now the report&nbsp;reads:</p> <div class="highlight"><pre><span></span><code><span class="n">P</span> <span class="o">=</span> <span class="mi">1200</span> <span class="n">A1</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">A2</span> <span class="o">=</span> <span class="mi">4</span> <span class="n">sigma1</span> <span class="o">=</span> <span class="n">P</span> <span class="o">/</span> <span class="n">A1</span> <span class="nb">print</span><span class="p">(</span><span class="n">sigma1</span><span class="p">)</span> <span class="c1"># 600</span> <span class="n">sigma2</span> <span class="o">=</span> <span class="n">P</span> <span class="o">/</span> <span class="n">A2</span> <span class="nb">print</span><span class="p">(</span><span class="n">sigma2</span><span class="p">)</span> <span class="c1"># 250</span> </code></pre></div> <p>The report is now wrong. In a simple case like this, you&#8217;ll probably notice the error when you review your intern&#8217;s work, but if the math were significantly more complex, there is a good chance that you wouldn&#8217;t pick up on the newly introduced&nbsp;error.</p> <p>For this reason, I think that the best option is to create a <code>Pandoc</code> template for your company&#8217;s report template. This means that you&#8217;ll be creating a <span class="caps">PDF</span> directly from the Notebook. In order to revise the report, you have to re-run the Notebook &#8212; the whole&nbsp;Notebook.</p> <p>For those unfamiliar with <a href="https://pandoc.org/"><code>Pandoc</code></a>, it is a program for converting between various file formats. It&#8217;s also free and open-source software.
Commonly, it&#8217;s used for converting from Markdown into <span class="caps">HTML</span> or <span class="caps">PDF</span> (actually, <code>Pandoc</code> converts to a <a href="https://www.latex-project.org/">LaTeX</a> format and LaTeX converts to <span class="caps">PDF</span>, but this happens transparently). <code>Pandoc</code> can also convert into <span class="caps">MS</span> Word (.docx) and several other&nbsp;formats.</p> <p>When I decided to create a corporate format for use with notebooks, I looked at the types of notebooks that we use. Generally, statistics are done in an <a href="https://bookdown.org/yihui/rmarkdown/notebook.html">R-Notebook</a> and other analysis is done in a <a href="https://jupyter.org/">Jupyter notebook</a>. Unfortunately, R-Notebooks and Jupyter Notebooks use different templates. R-Notebooks use <code>pandoc</code> templates, while Jupyter uses its own template. Fortunately, there is a workaround. Jupyter is able to export to markdown, which can be read by <code>pandoc</code> and translated to <span class="caps">PDF</span> using a pandoc template. Thus, I made the decision to write a <code>pandoc</code> template.</p> <p>When <code>pandoc</code> converts a markdown file to <span class="caps">PDF</span>, it actually uses LaTeX. The <code>pandoc</code> template is actually a template for converting markdown into LaTeX. <code>Pandoc</code> then calls <code>pdflatex</code> to turn this <code>.tex</code> file into a <span class="caps">PDF</span>. </p> <p>When I first started figuring out how to write a template for converting markdown to <span class="caps">PDF</span>, I thought I was going to have to write a LaTeX class or style. I got scared. LaTeX classes are not for the faint of heart. But, I soon realized that I didn&#8217;t actually have to do that. The <code>pandoc</code> template that I needed to write was just a regular LaTeX document that has some parameters that <code>pandoc</code> can fill in. 
I&#8217;m not sure that I could figure out how to write a LaTeX class in a reasonable amount of time, but I sure can write a document using LaTeX. This is something that I learned to do when I wrote my undergraduate thesis, and while I don&#8217;t write LaTeX often anymore, it&#8217;s really not that&nbsp;hard.</p> <p>A very basic LaTeX file would look something like&nbsp;this:</p> <div class="highlight"><pre><span></span><code><span class="k">\documentclass</span><span class="nb">{</span>article<span class="nb">}</span> <span class="k">\begin</span><span class="nb">{</span>document<span class="nb">}</span> <span class="k">\title</span><span class="nb">{</span>My Report Title<span class="nb">}</span> <span class="k">\author</span><span class="nb">{</span>A. Student<span class="nb">}</span> <span class="k">\maketitle</span> <span class="k">\section</span><span class="nb">{</span>Introduction<span class="nb">}</span> Some text <span class="k">\end</span><span class="nb">{</span>document<span class="nb">}</span> </code></pre></div> <p>A <code>pandoc</code> template is just a LaTeX file, but with placeholders for the content that <code>pandoc</code> will insert. These placeholders are just variables surrounded with dollar signs. For example, <code>pandoc</code> has a variable called <code>body</code>. This variable will contain the body of the report. We would simply put <code>$body$</code> in the part of the template where we want <code>pandoc</code> to insert the body of the&nbsp;report.</p> <p><code>Pandoc</code> also supports <code>for</code> and <code>if</code> statements. A common pattern is to check for the existence of a variable, using it if it exists and a default value if it does not.
The syntax for this would look something&nbsp;like:</p> <div class="highlight"><pre><span></span><code><span class="o">$</span><span class="k">if</span><span class="p">(</span><span class="n">myvar</span><span class="p">)</span><span class="o">$</span><span class="w"></span> <span class="w"> </span><span class="o">$</span><span class="n">myvar</span><span class="o">$</span><span class="w"></span> <span class="o">$</span><span class="k">else</span><span class="o">$</span><span class="w"></span> <span class="w"> </span><span class="n">Default</span><span class="w"> </span><span class="n">text</span><span class="w"></span> <span class="o">$</span><span class="n">endif</span><span class="o">$</span><span class="w"></span> </code></pre></div> <p>I&#8217;ve written the above code on multiple lines for readability, but it could be written on a single line&nbsp;too.</p> <p>Similarly, if a variable is a list, you&#8217;d use a <code>for</code> statement to iterate over the list. We&#8217;ll cover this later when we talk about adding logs of&nbsp;revisions.</p> <h1>Defining New Template&nbsp;Variables</h1> <p><code>Pandoc</code> defines a number of variables by default. However, you&#8217;ll likely need to define some variables of your own. First of all, you&#8217;ll likely need to define a variable for the report number and the&nbsp;revision.</p> <p>To create the variable, it&#8217;s just a matter of defining it in the <a href="http://yaml.org/"><code>YAML</code></a> header of the markdown file. Variables can either have a single value or they can be lists. 
Elements of a list start with a dash at the beginning of the&nbsp;line.</p> <p>Once we add the report number (which we&#8217;ll call <code>report-no</code>) and the revision (which we&#8217;ll call <code>rev</code>) to the <code>YAML</code> header, the <span class="caps">YAML</span> header will look like the&nbsp;following:</p> <div class="highlight"><pre><span></span><code><span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;Report</span><span class="nv"> </span><span class="s">Title&quot;</span><span class="w"></span> <span class="nt">author</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;A.</span><span class="nv"> </span><span class="s">Student&quot;</span><span class="w"></span> <span class="nt">report-no</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;RPT-001&quot;</span><span class="w"></span> <span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">B</span><span class="w"></span> </code></pre></div> <p>(Bonus points if you immediately thought of William Sealy Gosset when you read&nbsp;that).</p> <p>We&#8217;ll probably want to add a log of revisions to the report. The contents of this log of revisions will have to come from somewhere, and the <code>YAML</code> header is the most logical place. The log of revisions will be a list with one element of the list corresponding to each revision in the log. Lists can have nested members. In our case, an entry within the log of revisions will have a revision letter, a date and a description.
Including the log of revisions, the <code>YAML</code> header will look like&nbsp;this:</p> <div class="highlight"><pre><span></span><code><span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;Report</span><span class="nv"> </span><span class="s">Title&quot;</span><span class="w"></span> <span class="nt">author</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;A.</span><span class="nv"> </span><span class="s">Student&quot;</span><span class="w"></span> <span class="nt">report-no</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;RPT-001&quot;</span><span class="w"></span> <span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">B</span><span class="w"></span> <span class="nt">rev-log</span><span class="p">:</span><span class="w"></span> <span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">A</span><span class="w"></span> <span class="w"> </span><span class="nt">date</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1-Jun-2019</span><span class="w"></span> <span class="w"> </span><span class="nt">desc</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Initial release</span><span class="w"></span> <span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">B</span><span class="w"></span> <span class="w"> </span><span class="nt">date</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">18-Jun-2019</span><span class="w"></span> <span class="w"> </span><span class="nt">desc</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar 
l-Scalar-Plain">Updated loads based on flight test data</span><span class="w"></span> </code></pre></div> <p>We can now use these variables in our <code>pandoc</code> template. Using the variables <code>report-no</code> and <code>rev</code> is straightforward and will be just the same as using the default variables (like <code>title</code> and <code>author</code>).</p> <p>Using the list variables will require the use of a <code>for</code> statement. In the case of a log of revisions, each revision will get a row in a LaTeX table. Using the variable <code>rev-log</code>, this table will look like&nbsp;this:</p> <div class="highlight"><pre><span></span><code><span class="k">\begin</span><span class="nb">{</span>tabular<span class="nb">}{</span>| m<span class="nb">{</span>0.25in<span class="nb">}</span> | m<span class="nb">{</span>0.95in<span class="nb">}</span> | m<span class="nb">{</span>4.0in<span class="nb">}</span> |<span class="nb">}</span> <span class="k">\hline</span> Rev Ltr <span class="nb">&amp;</span> Date <span class="nb">&amp;</span> Description <span class="k">\\</span> <span class="s">$</span><span class="nb">for</span><span class="o">(</span><span class="nb">rev</span><span class="o">-</span><span class="nb">log</span><span class="o">)</span><span class="s">$</span> <span class="k">\hline</span> <span class="s">$</span><span class="nb">rev</span><span class="o">-</span><span class="nb">log.rev</span><span class="s">$</span> <span class="nb">&amp;</span> <span class="s">$</span><span class="nb">rev</span><span class="o">-</span><span class="nb">log.date</span><span class="s">$</span> <span class="nb">&amp;</span> <span class="s">$</span><span class="nb">rev</span><span class="o">-</span><span class="nb">log.desc</span><span class="s">$</span> <span class="k">\\</span> <span class="s">$</span><span class="nb">endfor</span><span class="s">$</span> <span class="k">\hline</span> <span class="k">\end</span><span class="nb">{</span>tabular<span
class="nb">}</span> </code></pre></div> <p>In the above LaTeX code, everything between <code>$for(...)$</code> and <code>$endfor$</code> gets repeated for each item in the list <code>rev-log</code>. We can access the nested members using dot&nbsp;notation.</p> <h1>Using the Pandoc Template from an&nbsp;R-Notebook</h1> <p>RStudio handles a lot of the interface with <code>pandoc</code>. Adding the following to the <code>YAML</code> header of the R-Notebook should cause RStudio to use your new template when it compiles the R-Notebook to <span class="caps">PDF</span>. This <em>should</em> be all you need to&nbsp;do.</p> <div class="highlight"><pre><span></span><code><span class="nt">output</span><span class="p">:</span><span class="w"></span> <span class="w"> </span><span class="nt">pdf_document</span><span class="p">:</span><span class="w"></span> <span class="w"> </span><span class="nt">template</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">my_template_file.tex</span><span class="w"></span> <span class="w"> </span><span class="nt">toc_depth</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">3</span><span class="w"></span> <span class="w"> </span><span class="nt">fig_caption</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span> <span class="w"> </span><span class="nt">keep_tex</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">false</span><span class="w"></span> <span class="w"> </span><span class="nt">df_print</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">kable</span><span class="w"></span> </code></pre></div> <h1>Using the Pandoc Template from a Jupyter&nbsp;Notebook</h1> <p>Using your new <code>pandoc</code> template from a Jupyter Notebook is a bit more complicated because Jupyter doesn&#8217;t work directly 
with <code>pandoc</code>. First of all, we need to tell <code>nbconvert</code> to convert to markdown. I think that it&#8217;s best to re-run the notebook at the same time (to make sure that it is, in fact, fully reproducible). You can do this using <code>nbconvert</code> as&nbsp;follows:</p> <div class="highlight"><pre><span></span><code>jupyter nbconvert --execute --to markdown my-notebook.ipynb </code></pre></div> <p>But, Jupyter notebooks don&#8217;t have <code>YAML</code> headers like R-Notebooks do, so we need a place to put all the variables that the template needs. The easiest way to do this is to create a cell at the beginning of the notebook with the cell type set as <code>raw</code>, then enter the <code>YAML</code> header into this cell, including the starting and ending fences (<code>---</code>). This cell would then have content similar to the following. Cells of type <code>raw</code> simply get copied to the output, so this becomes the <code>YAML</code> header in the resulting markdown&nbsp;file.</p> <div class="highlight"><pre><span></span><code><span class="nn">---</span><span class="w"></span> <span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;Report</span><span class="nv"> </span><span class="s">Title&quot;</span><span class="w"></span> <span class="nt">author</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;A.</span><span class="nv"> </span><span class="s">Student&quot;</span><span class="w"></span> <span class="nt">report-no</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;RPT-001&quot;</span><span class="w"></span> <span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">B</span><span class="w"></span> <span class="nt">rev-log</span><span class="p">:</span><span class="w"></span> <span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">rev</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">A</span><span class="w"></span> <span class="w"> </span><span class="nt">date</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1-Jun-2019</span><span class="w"></span> <span class="w"> </span><span class="nt">desc</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Initial release</span><span class="w"></span> <span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">rev</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">B</span><span class="w"></span> <span class="w"> </span><span class="nt">date</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">18-Jun-2019</span><span class="w"></span> <span class="w"> </span><span class="nt">desc</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Updated loads based on flight test data</span><span class="w"></span> <span class="nn">---</span><span class="w"></span> </code></pre></div> <p>Once you&#8217;ve used <code>nbconvert</code> to create the markdown file, you can call <code>pandoc</code>. You&#8217;ll have to provide the template as a command-line argument, specify the output filename (so that <code>pandoc</code> knows you want a <span class="caps">PDF</span>), and give the code highlighting style. The call to <code>pandoc</code> will look something like&nbsp;this:</p> <div class="highlight"><pre><span></span><code><span class="sb"></span>pandoc<span class="sb"></span> my-notebook.md -N --template<span class="o">=</span>my_template_file.tex -o my-notebook.pdf --highlight-style<span class="o">=</span>tango </code></pre></div> <h1>Documentation of Your&nbsp;Template</h1> <p>A &#8220;trick&#8221; that I&#8217;ve used is to add some documentation about how to use the template inside the template itself.
It&#8217;s pretty unlikely that the user will actually open up the template, but it&#8217;s relatively likely that the user will forget one of the variables that the template expects. Since <code>pandoc</code> allows <code>if/else</code> statements, I&#8217;ve added the following to my&nbsp;template:</p> <div class="highlight"><pre><span></span><code><span class="s">$</span><span class="nb">if</span><span class="o">(</span><span class="nb">abstract</span><span class="o">)</span><span class="s">$</span> <span class="k">\abstract</span><span class="nb">{</span><span class="s">$</span><span class="nb">abstract</span><span class="s">$</span><span class="nb">}</span> <span class="s">$</span><span class="nb">else</span><span class="s">$</span> <span class="k">\abstract</span><span class="nb">{</span> The documentation for using the template goes here <span class="nb">}</span> <span class="s">$</span><span class="nb">endif</span><span class="s">$</span> </code></pre></div> <p>This means that if the user forgets to define the <code>abstract</code> variable, the cover page of the report (where the abstract normally goes in my case) will contain the documentation for the&nbsp;template.</p> <h1>Change Bars: Future&nbsp;Work</h1> <p>One of the things that I haven&#8217;t yet figured out is change bars. In my organization, we put vertical bars in the margin of reports to indicate what part of a report has been revised. There are LaTeX packages for (manually) inserting <a href="https://www.ctan.org/pkg/changebar">change bars into documents</a>. However, I haven&#8217;t yet figured out how to automatically insert these into a report generated using <code>pandoc</code>.
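</p> <p>For reference, the manual markup that the <code>changebar</code> package expects is simple; the hard part is generating it automatically during the pandoc conversion. Hand-inserted change bars look something like this (a sketch of manual use, not the automated integration):</p>

```latex
\documentclass{article}
\usepackage{changebar}
\begin{document}
Unchanged text.
% Everything between \cbstart and \cbend gets a bar in the margin
\cbstart
This paragraph was revised and is marked with a change bar.
\cbend
\end{document}
```

<p>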
I&#8217;m sure there&#8217;s a way,&nbsp;though.</p> <h1>Conclusion</h1> <p>I hope that this demystifies the process of writing a <code>pandoc</code> template to allow you to create reports directly from Jupyter Notebooks or R-Notebooks in your company&#8217;s report&nbsp;format.</p> <p><em>(Edited to fix a few&nbsp;typos)</em></p>Package Adequacy for Engineering Calculations2019-06-29T00:00:00-04:002019-06-29T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2019-06-29:/2019/06/adequacy/<p>If you do engineering calculations or analysis using a language like <a href="https://www.r-project.org/">R</a> or <a href="https://www.python.org/">Python</a>, chances are that you&#8217;re going to use some packages. Packages are collections of code that someone else has written that you can use in your code. For example, if you need to solve a system …</p><p>If you do engineering calculations or analysis using a language like <a href="https://www.r-project.org/">R</a> or <a href="https://www.python.org/">Python</a>, chances are that you&#8217;re going to use some packages. Packages are collections of code that someone else has written that you can use in your code. For example, if you need to solve a system of linear equations by inverting a matrix and you&#8217;re using Python, you might use <a href="http://www.numpy.org/">numpy</a>. Or if you&#8217;re using R and you need to fit a linear model to some data, you would probably use the <a href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html">stats</a>&nbsp;package.</p> <p>If you&#8217;re involved in &#8220;engineering,&#8221; you need a high level of confidence that the results that you&#8217;re getting are correct. Note that in this post &#8212; and my blog in general &#8212; when I say &#8220;engineering,&#8221; I don&#8217;t mean <em>software engineering:</em> I mean design and analysis of structures or systems that have an effect on safety.
I work in civil aeronautics, mainly dealing with composites, but also dealing with metallic structure regularly. Depending on the particular type of engineering that you&#8217;re engaged in and the particular problem at hand, the consequences of getting the wrong answer could be fairly severe. You had better be sure that both the interpreter and the packages are correct. Probably the best way to do this is to validate the results using another method: are there published results for a similar problem that you can use as a benchmark? Perhaps you can do some physical testing? But even if you&#8217;re doing your due diligence and validating the results somehow, you would still waste a lot of your time if there were a problem with either the interpreter or one of the&nbsp;packages.</p> <p>Compiled languages &#8212; like <code>C</code> or <code>FORTRAN</code> &#8212; are compiled into machine code that runs directly on the processor. Interpreted languages, like Python, R or JavaScript, are not compiled into machine code, but instead an interpreter (a piece of software) reads each line of code and figures out how to run it when you run the code (not ahead of time). As far as interpreters go, if you&#8217;re using <code>CPython</code> (the &#8220;standard&#8221; Python interpreter) or <code>GNU-R</code> (the &#8220;standard&#8221; R interpreter), I think there is a rather low risk that there are any errors in the interpreter. These interpreters are written by a bunch of smart people, and both are open source, so the code that makes up the interpreters themselves is read by a much larger group of smart people. Furthermore, both interpreters are widely used and have been around for a while, so it&#8217;s very likely that significant bugs that are likely to change the result of an engineering calculation would have been found by users by now and would have been&nbsp;fixed.</p> <p>Packages are more of a risk than interpreters are.
Again, if you&#8217;re using a very widely used package that has been around for a while, like <code>numpy</code> (in Python) or <code>stats</code> (in R), there&#8217;s a pretty good chance that any bugs that would affect your calculations would have been found by now &#8212; and packages like these are maintained by groups of dedicated&nbsp;people.</p> <p>If you&#8217;re using R, chances are that you&#8217;re getting your packages from <a href="https://cran.r-project.org/"><span class="caps">CRAN</span></a>. You should be reading the <span class="caps">CRAN</span> page for the package that you&#8217;re using. You can find an example of such a page <a href="https://cran.r-project.org/package=MASS">here</a>. There are a few things that you should look for to help you evaluate the reliability of the package (in addition to the reference manual and any vignettes that explain how to use the package). The first is the priority of the package. Not all packages have a priority, but if the priority is &#8220;base&#8221; or &#8220;recommended,&#8221; the package is maintained by the r-core team and is almost certainly used by a lot of people. You can be fairly comfortable with these&nbsp;packages.</p> <p>The second thing that you should look at on the <span class="caps">CRAN</span> page for a package is the <span class="caps">CRAN</span> Checks. <span class="caps">CRAN</span> will test all the packages every time a new version of R is released and it tests all the packages routinely to determine if a change in one package caused errors in other packages. You can see an example <span class="caps">CRAN</span> Check for my package <code>rde</code> <a href="https://cran.r-project.org/web/checks/check_results_rde.html">here</a>.</p> <p>This practice is called <a href="https://en.m.wikipedia.org/wiki/Continuous_integration">continuous integration</a>.
<span class="caps">CRAN</span> does all of these checks on several different operating systems &#8212; Windows, <span class="caps">OSX</span>, and several Linux distributions. If you open the <span class="caps">CRAN</span> Checks results for a package, you&#8217;ll see a table of all the various combinations of R version and operating system that have been tested along with the amount of time that it took to run the test and a status for each. If the Status is &#8220;<span class="caps">OK</span>,&#8221; then there were no errors identified. If the Status is &#8220;<span class="caps">NOTE</span>,&#8221; &#8220;<span class="caps">WARNING</span>,&#8221; or &#8220;<span class="caps">ERROR</span>,&#8221; there might be something wrong, and it may or may not be serious. If you click on the Status link, you&#8217;ll see details and can evaluate for&nbsp;yourself.</p> <p>I think that these <span class="caps">CRAN</span> checks are actually a very strong point for the R ecosystem. It ensures that package maintainers know when something outside of their package breaks their code. And, it enforces a certain level of quality: package maintainers are given a certain amount of time to fix errors, and if they don&#8217;t, the package gets removed from <span class="caps">CRAN</span>.</p> <p>The <span class="caps">CRAN</span> checks do a few things. First, they check that the package can, in fact, be loaded (maybe there&#8217;s an error that prevents you from using it at all). There are a few other things that it does, but the most important in terms of reliability of the package is that the <span class="caps">CRAN</span> checks will run any tests created by the package maintainer. These tests are called <a href="https://en.m.wikipedia.org/wiki/Unit_testing">unit tests</a>. They are tests that determine if the code in the package actually has the expected behavior. Package maintainers don&#8217;t have to write unit tests, but the good ones do.
You can look at what tests the package maintainer has written by downloading the code of the package (you can download it from <span class="caps">CRAN</span>). The tests are in a folder called <code>tests</code>. Tests basically work by providing some input to the package&#8217;s functions, and checking that the result is correct. For R packages, <code>testthat</code> is a popular testing framework. For packages that use the <code>testthat</code> framework, you&#8217;ll see a number of statements that use the <code>expect_...</code> family of functions. Some of these tests will likely ensure that the package works at all &#8212; checking things like the return type for functions, or that a function actually does raise certain errors when invalid arguments are passed to it. Some of the tests should also ensure that the package provides correct results. When I write tests for a package, I always write both types of tests. For the tests that ensure that the results are correct, I often either check cases that have closed-form solutions, or check that the code in the package produces results that are approximately equal to example results published in articles or books. You&#8217;ll need to read through the tests to decide if they provide enough assurance that the package is&nbsp;correct.</p> <p>If you decide that the tests for a package are not sufficient, you have three&nbsp;options.</p> <ul> <li>You could choose not to use that package: maybe there is another that does something&nbsp;similar.</li> <li>You can write tests yourself and contribute those tests back to the package maintainer. After all, R packages are open-source and users are encouraged to contribute back to the community. Most package maintainers would be happy to receive a patch that adds more tests: writing tests is not fun, and most people would be grateful if someone else offers to do&nbsp;it.</li> <li>You could also manually test the package.
The difficulty here is ensuring that you re-test the package every time you update the version of this package on your&nbsp;system.</li> </ul> <p>In the Python world, continuous integration isn&#8217;t as well integrated into the ecosystem. Most packages that you install probably come from PyPI. As far as I know, PyPI doesn&#8217;t do any continuous integration: it&#8217;s up to the package maintainer to run their tests regularly. Package maintainers can do one of two things: they can run the tests on their own machine before releasing a new version to PyPI, or they can use a continuous integration service like Travis-<span class="caps">CI</span> or CircleCI. Many of the continuous integration services provide the service for free for open source projects, so many Python packages do use a continuous integration service. Packages that use a continuous integration service normally advertise it in their <span class="caps">README</span> file. You&#8217;ll still need to assess whether the tests are adequate, and if the package doesn&#8217;t use continuous integration, you&#8217;ll have to either run the tests yourself, or trust that the package maintainer&nbsp;did.</p> <p>If you have already written tests for your package, setting up continuous integration using Travis-<span class="caps">CI</span> is quite straightforward. I haven&#8217;t personally used CircleCI, but I would imagine that it&#8217;s similarly easy to use. You can see the continuous integration results from my package <code>rde</code> on Travis-<span class="caps">CI</span> <a href="https://travis-ci.com/kloppen/rde">here</a>.</p> <p>Whether you&#8217;re using Python or R, there are ways of ensuring that the packages you use for engineering calculations are adequate for your needs.
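</p> <p>The closed-form-solution style of test described above can be sketched as follows. The function under test is a hypothetical stand-in, not from any particular package; the point is the pattern of checking computed output against an exactly known answer:</p>

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (stand-in for a package function)."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def test_roots_match_closed_form():
    # x^2 - 5x + 6 = (x - 2)(x - 3), so the roots are known exactly
    r1, r2 = quadratic_roots(1, -5, 6)
    assert math.isclose(r1, 3.0)
    assert math.isclose(r2, 2.0)

test_roots_match_closed_form()
```

<p>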
Some people seem to be a little bit scared of open source packages and software for engineering calculations, but in a lot of ways, open source software is actually better for this since you have the ability to verify it yourself and make a decision about whether to use&nbsp;it.</p>Automating Software Validation Reports2019-06-20T00:00:00-04:002019-06-20T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2019-06-20:/2019/06/automating-software-validation-reports/<p>I&#8217;ve been working on a Python package to analyze adhesively bonded joints recently. This package will be used to analyze adhesive joints in certain aircraft structure and will be used to substantiate the design of structural repairs, amongst other uses. Because of this, the output of this package needs …</p><p>I&#8217;ve been working on a Python package to analyze adhesively bonded joints recently. This package will be used to analyze adhesive joints in certain aircraft structure and will be used to substantiate the design of structural repairs, amongst other uses. Because of this, the output of this package needs to be validated against test data. This validation also needs to be documented in an engineering&nbsp;report.</p> <p>I&#8217;ve been thinking about how to do this. On one hand, I&#8217;ve been thinking about the types of (mechanical) tests that we&#8217;ll need to run to validate the model and the various test configurations that we&#8217;ll need to include in the validation test matrix. On the other hand, I&#8217;ve also been thinking about change management of the package and ensuring that the validation report stays up to&nbsp;date.</p> <p>I&#8217;m imagining the scenario where we run the validation testing and find that the model and the test results agree within, say, 10%. Maybe that&#8217;s good enough for the purpose (depending on the direction of the disagreement).
We can then write our validation report and type out the sentence <em>&#8220;the test data and the model were found to agree within 10%.&#8221;</em> Then, I&#8217;m imagining that we make a refinement to the model formulation and release a new version of the package that now agrees with the test data within 5%. Now, we have a validation report for the old version of the package, but no report describing the validation of the new version. We&#8217;d need to go back through the validation report, re-run the model for all the validation cases and update the&nbsp;report.</p> <p>When we update the validation report manually, there&#8217;s probably a pretty good chance that some aspect of the update gets missed. Maybe it&#8217;s as simple as one of the model outputs not getting updated in the revised validation report. It&#8217;s also potentially rather time consuming to update this report. It would be faster to make this validation report a Jupyter Notebook (which I&#8217;ve <a href="https://www.kloppenborg.ca/2019/06/reproducibility/">previously talked about</a>). I haven&#8217;t yet written about it here, but it is possible to have a Jupyter Notebook render to a <span class="caps">PDF</span> using a corporate report format, so it&#8217;s even possible to make this validation report look like it should <em>(Edit: I&#8217;ve now written about this <a href="https://www.kloppenborg.ca/2019/10/pandoc-report-templates/">here</a>)</em>. We could also set up a test in the package to re-run the Jupyter Notebook, and perhaps integrate it into a continuous integration system so that the Notebook gets re-run every time a change is made to the package. This would mean that the validation report is always up to&nbsp;date.</p> <p>When you write a Jupyter Notebook, it usually has some code that produces a result &#8212; either a numeric result, or a graph &#8212; and then you have some text that you&#8217;ve written which explains the result. 
The problem is that this text that you&#8217;ve written doesn&#8217;t respond to changes in the result. Sure, there are ways of automatically updating individual numbers inside the text that you&#8217;ve written, but sometimes the way that the result of the code changes warrants a change in the sentiment of the text. Maybe the text needs to change from <em>&#8220;the model shows poor agreement with experimental results and shouldn&#8217;t be used in this case&#8221;</em> to <em>&#8220;the model shows excellent agreement with experimental results and has been validated.&#8221;</em> There&#8217;s no practical way that this type of update to the text could be automated. But if the update to the result of the code in the Notebook has been automated, there&#8217;s a good chance that the text and the results from the code will end up disagreeing &#8212; especially if the report is more than a few&nbsp;pages.</p> <h1>The&nbsp;Solution</h1> <p>So, what can be done to rectify this? We want to have the ease of having the results of the code automatically update, but we want to make sure that those results and the text of the report match. One approach to this problem &#8212; and the approach that I intend to use for the adhesive joint analysis package &#8212; is to add <code>assert</code> statements to the Notebook. 
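For instance, to back the earlier claim that the test data and the model agree within 10%, an assertion along these lines could be added. This is an illustrative sketch only; the variable names and values are placeholders, not taken from the actual package:

```python
# Illustrative sketch: names and numbers are placeholders, not values from
# the real adhesive joint analysis package.
model_strain = 0.0052          # strain predicted by the model
experimental_strain = 0.0050   # strain measured during validation testing

# Guard the claim made in the report's prose: agreement within 10%.
relative_error = abs(model_strain - experimental_strain) / experimental_strain
assert relative_error <= 0.10, "model no longer agrees with test data within 10%"
```

If a later change to the model pushes the disagreement past 10%, re-running the Notebook fails at this line instead of silently producing a report whose text no longer matches its results.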
This way, if the assertion fails, the Notebook won&#8217;t automatically rebuild and our attention will be drawn to the&nbsp;issue.</p> <p>As an example, if the text says that the model is conservative, meaning that the strain predicted by the model is higher than the strain measured by strain gauges installed on the test articles from the validation testing, we could write the following assert statement in the Jupyter&nbsp;Notebook:</p> <div class="highlight"><pre><span></span><code><span class="k">assert</span><span class="p">(</span><span class="n">model_strain</span> <span class="o">&gt;</span> <span class="n">experimental_strain</span><span class="p">)</span> </code></pre></div> <p>Now, if we later make a change to the model that causes it to under-predict strain, we&#8217;ll be alerted to this and prompted to update the validation&nbsp;report.</p> <h1>Implementing the&nbsp;Solution</h1> <p>To run a Jupyter Notebook from code (for example in a test suite), I&#8217;ve used the following code in the past. 
This code was based on code found on <a href="http://blog.thedataincubator.com/2016/06/testing-jupyter-notebooks/">The Data Incubator&nbsp;Blog</a>.</p> <div class="highlight"><pre><span></span><code>import os
import sys

import nbformat
from nbconvert.preprocessors import CellExecutionError, ExecutePreprocessor


def _notebook_run(path):
    kernel_name = &quot;python{}&quot;.format(sys.version_info[0])
    file_dir = os.path.dirname(__file__)
    errors = []
    with open(path) as f:
        nb = nbformat.read(f, as_version=4)
    nb.metadata.get(&quot;kernelspec&quot;, {})[&quot;name&quot;] = kernel_name
    ep = ExecutePreprocessor(kernel_name=kernel_name, timeout=3600)
    try:
        ep.preprocess(nb, {&quot;metadata&quot;: {&quot;path&quot;: file_dir}})
    except CellExecutionError as e:
        if &quot;SKIP&quot; in e.traceback:
            errors.append(str(e.traceback))
        else:
            raise e
    return nb, errors


_notebook_run(&quot;file-name-of-my-notebook.ipynb&quot;)
</code></pre></div> <p>This code will run the Notebook <code>file-name-of-my-notebook.ipynb</code> and will raise an error if an error is encountered. 
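The same runner drops into a pytest-style suite with a thin wrapper. In this sketch, <code>run_notebook</code> is a self-contained stub standing in for the runner above (in a real suite you would call the actual function instead), and the test name is illustrative:

```python
# Illustrative pytest-style wrapper. run_notebook is a stub standing in for
# the notebook runner above so that this sketch is self-contained; in a real
# suite you would call the actual runner instead.
def run_notebook(path):
    # Stub: pretend the notebook executed cleanly and collected no errors.
    return {"path": path}, []


def test_validation_notebook():
    nb, errors = run_notebook("file-name-of-my-notebook.ipynb")
    # Only SKIP-marked cell failures are collected; none are expected here.
    assert errors == []


test_validation_notebook()
```

Any unexpected cell failure then surfaces as an ordinary test failure, so the continuous integration system flags it just like a broken unit test.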
If this is inside a <code>unittest2</code> or <code>NoseTest</code> test suite, this will cause a test&nbsp;failure.</p> <h1>Conclusion</h1> <p>Validating software used in a way that affects an aircraft design is very important in ensuring the safety of that design. Keeping the validation report up to date can be tedious, but can be automated using Jupyter Notebooks. The conclusions drawn in the validation report need to match the results of the software being validated. One approach to ensuring that this is always true is to add <code>assert</code> statements to the Jupyter Notebook that forms the validation&nbsp;report.</p>Reproducibility of Engineering Calculations2019-06-20T00:00:00-04:002019-06-20T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2019-06-20:/2019/06/reproducibility/<p>Reproducibility in engineering work doesn&#8217;t seem to get the attention that it deserves. I can&#8217;t count the number of times that I&#8217;ve read an old engineering report in search of a particular result, only to find that the calculation that lead to that result is only barely …</p><p>Reproducibility in engineering work doesn&#8217;t seem to get the attention that it deserves. I can&#8217;t count the number of times that I&#8217;ve read an old engineering report in search of a particular result, only to find that the calculation that lead to that result is only barely described, or there is just a screenshot of an Excel workbook with a few input numbers and a final result. When I find things like this, it makes me a little nervous: did the original author use the correct formula when computing this result? What assumptions did the author make and neglect to document? What approximations were made? Was the original review of the report diligent enough to check this particular&nbsp;result?</p> <p>Let&#8217;s take a hypothetical example. For simplicity, let&#8217;s assume that we&#8217;re analyzing some sort of bracket. 
It&#8217;s 2 inches wide, 0.125 inches thick and 5 inches long. It&#8217;s cantilevered with a load applied 2 inches from the free edge. We care about both the deflection and the maximum stress. The formulae for deflection and stress are given by Roark<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup>. We&#8217;ll adapt those equations&nbsp;slightly:</p> <p>$$\delta_a = \frac{-P}{6 E I} (2 L^3 - 3 L^2 a + a^3)&nbsp;$$</p> <p>$$\sigma = \frac{M_B \frac{t}{2}}{I} = \frac{P (L - a) \frac{t}{2}}{I}&nbsp;$$</p> <p>Given these equations and the data above, we could quite easily do the calculation in a spreadsheet program like <span class="caps">MS</span>-Excel. But, if we want to include our calculation in a report (most likely as a screenshot of the spreadsheet), our report will probably just look like&nbsp;this:</p> <p><img alt="excel1" src="https://www.kloppenborg.ca/2019/06/reproducibility/reproducibility_excel.png"></p> <p>This shows the &#8220;right&#8221; answer, but if you&#8217;re reviewing the report, how do you know that the answer is right? If you&#8217;re reviewing the report before it&#8217;s released, you can probably get a copy of the Excel file and check the formulae in the cells. You&#8217;ll spend a few minutes deciphering the formula to figure out if it&#8217;s correct. But, if you&#8217;re reading the report later, especially if you&#8217;re outside the company that wrote it, good luck. You&#8217;re going to have to get out a pen, paper and your calculator to repeat the calculation and figure out if it&#8217;s right. This problem is even worse if the author of the report hard-coded a few of the input values (i.e. length, width, elastic modulus, etc.) into the&nbsp;formulae.</p> <p>There are a few ways to address this problem of reproducibility. We&#8217;ll explore two of these ways. 
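Before looking at those tools, it helps to see that the arithmetic itself is only a few lines of plain Python. The dimensions come from the text above, but the load P and elastic modulus E below are assumed placeholder values, since the actual inputs are only visible in the spreadsheet screenshot:

```python
# Hypothetical bracket from the text. P and E are assumed values for
# illustration, not the inputs used in the report.
P = 10.0      # applied load, lbf (assumed)
E = 10.0e6    # elastic modulus, psi (assumed, roughly aluminum)
L = 5.0       # length, in
a = 2.0       # distance of the load from the free end, in
w = 2.0       # width, in
t = 0.125     # thickness, in

I = w * t**3 / 12  # second moment of area of the rectangular section, in^4

# Deflection at the free end and maximum bending stress at the wall,
# per the two adapted Roark equations above.
delta_a = -P / (6 * E * I) * (2 * L**3 - 3 * L**2 * a + a**3)
sigma = P * (L - a) * (t / 2) / I
```

Written this way, every input and every formula is visible to anyone re-reading the calculation, which is exactly the property we are after.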
The first is to use software like <a href="https://www.ptc.com/en/products/mathcad">MathCAD</a>, or its free alternative <a href="https://en.smath.info/view/SMathStudio/summary">SMath-Studio</a>. Both of these products are <a href="https://en.wikipedia.org/wiki/WYSIWYG"><span class="caps">WYSIWYG</span></a> math editors that are unit aware. With either of these, you could do your calculations in MathCAD or SMath-Studio and paste a screenshot of this into your&nbsp;report.</p> <p><img alt="smath" src="https://www.kloppenborg.ca/2019/06/reproducibility/reproducibility_smath.png"></p> <p>Now, the input data and the formula would be shown directly in the report. The added benefit is that, since these pieces of software are unit aware, you can&#8217;t make simple unit errors &#8212; if you forget an exponent, the units shown in the result won&#8217;t be what you expect, so you know that you&#8217;ve made a&nbsp;mistake.</p> <p>The other way to approach this problem is to use something called a notebook. If you&#8217;re comfortable enough to write simple code in <a href="https://www.python.org/">Python</a>, you could use a <a href="http://jupyter.org/">jupyter notebook</a>. If you&#8217;re doing some data analysis or statistics, you might prefer to write some code in <a href="https://www.r-project.org/">R</a> (though, you could use <a href="https://pandas.pydata.org/">pandas</a> if you prefer to use Python). While you can use R with jupyter notebooks (as well as several other languages), in my opinion R Studio&#8217;s <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a> are a little bit better to work with. If you were to do the same calculation with a notebook (in this case, we&#8217;ll use a jupyter notebook and Python), it would look like&nbsp;this:</p> <p><img alt="notebook" src="https://www.kloppenborg.ca/2019/06/reproducibility/reproducibility_notebook.png"></p> <p>There are a few advantages of using a notebook. 
First, you can use a programming language with a little bit more power than MathCAD or SMath-Studio &#8212; if you need to do an iterative calculation or find the root of a system of non-linear equations, you can do it with a language like Python or R &#8212; and do so in a way that&#8217;s not too difficult for the reader to understand. The other advantage of using a notebook is that notebooks are intended to mix code, results and text. You could actually write your whole report using a notebook! You could explain your approach to solving the problem, include the code used to solve the problem and then show the results all in the same document. No need to copy-and-paste anything and no need to store multiple files (like a Word document and a SMath-Studio&nbsp;file).</p> <p>Text written in a notebook (either a jupyter notebook or an R Notebook) is written using something called <a href="https://en.m.wikipedia.org/wiki/Markdown">markdown</a>. This is a &#8220;lightweight&#8221; way of formatting text. If you want a bullet list, you just type an asterisk at the beginning of each line; if you want a heading, you start the line with a hash symbol (or two for a sub-heading). And, most importantly for engineering reports, you can include formulae using <a href="https://en.m.wikibooks.org/wiki/LaTeX/Mathematics">LaTeX</a> from within markdown just by enclosing the formula with two dollar signs before and after it &#8212; no need to suffer through using the <span class="caps">MS</span>-Word Equation&nbsp;Editor.</p> <p>If you need a corporate format for your report, there are ways to create PDFs from either a jupyter notebook or an R Notebook using a custom format. I plan on writing about this in a later post. Stay tuned. <em>(Edit: I&#8217;ve written about this <a href="https://www.kloppenborg.ca/2019/10/pandoc-report-templates/">here</a>)</em></p> <p>We&#8217;ve explored a few ways of making an engineering report more reproducible. 
Neither of the solutions explored is ideal for every scenario &#8212; some scenarios are more suited to one of the solutions or the other &#8212; but both will improve many engineering&nbsp;reports.</p> <div class="footnote"> <hr> <ol> <li id="fn:1"> <p>W. Young and R. Budynas, Roark&#8217;s Formulas for Stress and Strain, Seventh Edition. New York: McGraw-Hill, 2002.&#160;<a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">&#8617;</a></p> </li> </ol> </div>rde: Now on CRAN2018-07-09T00:00:00-04:002018-07-09T00:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2018-07-09:/2018/07/rde/<p>For the last couple of years, we&#8217;ve been using the statistical programming language <a href="https://www.r-project.org">R</a> when we do statistical analysis or data visualizations at work. We typically deal with <em>small data</em> &mdash; most of the time, our data sets are high-tens or low-hundreds of rows of&nbsp;data.</p> <p>A lot of the …</p><p>For the last couple of years, we&#8217;ve been using the statistical programming language <a href="https://www.r-project.org">R</a> when we do statistical analysis or data visualizations at work. We typically deal with <em>small data</em> &mdash; most of the time, our data sets are high-tens or low-hundreds of rows of&nbsp;data.</p> <p>A lot of the time, we create <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a> with our analysis and visualizations. This works well for us: the R Notebook contains the code used to do the analysis, the results of the analysis and the visualizations, all in one place. This eliminates questions like: &#8220;did you remove outliers before making the graph?&#8221; Or, &#8220;did you check that the data are distributed normally before you did that test?&#8221; A reviewer of the R Notebook can see exactly what was&nbsp;done.</p> <p>By default, the R Notebook produces an html file that you can open in your browser. 
You can email this html file to a colleague, and they can see your results and graphs, as well as exactly how you obtained them. If you made a logical mistake, or an inappropriate assumption, your colleague has the opportunity to find&nbsp;it.</p> <p>There is also a button in the html file that the R Notebook gets exported to that says &#8220;Download Rmd.&#8221; This allows your colleague to open the notebook in <a href="https://www.rstudio.com/">R Studio</a> and run your code. <em>If you sent your&nbsp;data.</em></p> <p>The one problem with just emailing R Notebooks to a colleague is that the R Notebook does not include the data. This might be okay if the data source is a file on a network, or a database that you both have access to, but in a lot of cases &mdash; at least in my work &mdash; the data is a <span class="caps">CSV</span> or Excel file. Now, if I want to send an R Notebook to a colleague to review, I need to remember to send the data file along with&nbsp;it.</p> <p>Enter <code>rde</code>.</p> <p>I wrote the package <a href="https://cran.r-project.org/web/packages/rde/"><code>rde</code></a> (which stands for Reproducible Data Embedding) to tackle this problem. This package allows you to embed data right in your R Notebook (or any other R code). It does so by compressing the data and then <a href="https://en.wikipedia.org/wiki/Base64">base-64 encoding</a> it into an <span class="caps">ASCII</span> string. This string can be pasted into the R Notebook and converted back into the original data when someone re-runs the&nbsp;Notebook.</p> <p>I won&#8217;t go into all the details of how to use the package. If you&#8217;d like to learn more, you can read the package <a href="https://cran.r-project.org/web/packages/rde/vignettes/rde_tutorial.html">vignette</a>.</p> <p>This isn&#8217;t the first R Package that I&#8217;ve written, but it is the first one that I&#8217;ve submitted to <a href="https://cran.r-project.org/"><span class="caps">CRAN</span></a>. 
When you install an R package using <code>install.packages()</code>, you&#8217;re installing it from <span class="caps">CRAN</span>. I think that <span class="caps">CRAN</span> is one of the best parts of the R ecosystem since it does <a href="https://en.wikipedia.org/wiki/Continuous_integration">continuous integration</a> for all of the packages hosted there. This helps ensure that all the packages continue to work as R is updated and as other packages are updated. I&#8217;ll likely talk about this more in a future blog&nbsp;post.</p> <p>If you&#8217;re an R user and you think that the package <code>rde</code> would help you in your workflow, check it out. You can install it by typing <code>install.packages("rde")</code> in R. If you find a bug, please file an <a href="https://github.com/kloppen/rde/issues">issue on GitHub</a>. And, if you would like to add functionality or improve it in some way, feel free to send me a pull&nbsp;request.</p>Welcome to Kloppenborg.ca2018-06-27T22:00:00-04:002018-06-27T22:00:00-04:00Stefan Kloppenborgtag:www.kloppenborg.ca,2018-06-27:/2018/06/welcome/<p>Welcome to&nbsp;kloppenborg.ca</p> <p>I plan to use this website as a blog where I discuss topics related to engineering, technology and whatever else I&#8217;m thinking about at the&nbsp;time.</p> <p>If you find any of the posts here interesting, feel free to share them. If you don&#8217;t feel …</p><p>Welcome to&nbsp;kloppenborg.ca</p> <p>I plan to use this website as a blog where I discuss topics related to engineering, technology and whatever else I&#8217;m thinking about at the&nbsp;time.</p> <p>If you find any of the posts here interesting, feel free to share them. If you don&#8217;t, feel free to ignore&nbsp;them.</p>