# Variance

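Gy defines the fundamental variance for a sample $S$ to be the relative variance of the sample’s critical content, $a_S$, when the sample is correct and fragments are selected for $S$ independently of each other (i.e., in independent Bernoulli trials). He estimates the fundamental variance using the following equation, in which the sum runs over the $N$ fragments of the lot $L$.

$$\frac{\operatorname{var}(a_S)}{a_L^2} \approx \left(\frac{1}{m_S} - \frac{1}{m_L}\right) \sum_{i=1}^{N} \left(\frac{a_i - a_L}{a_L}\right)^2 \frac{m_i^2}{m_L} \tag{1}$$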

The fundamental variance is considered to be the smallest relative sampling variance that is practically achievable without increasing the sample size or reducing the fragment sizes (i.e., grinding or milling the material before sampling). In routine practice one can expect the sampling variance to be somewhat larger than the fundamental variance, but any additional variance tends to be harder to estimate.
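As a rough illustration, the fundamental variance can be computed directly from per-fragment data. The sketch below assumes the usual form of Gy's formula, $(1/m_S - 1/m_L) \sum_i ((a_i - a_L)/a_L)^2 \, m_i^2 / m_L$; the function name `gy_fundamental_variance` and its parameters are hypothetical names chosen for this example, and in practice the individual fragment masses and contents are rarely known this precisely.

```python
def gy_fundamental_variance(contents, masses, sample_mass):
    """Relative variance of a_S by Gy's formula (a sketch, not a reference implementation).

    contents    -- critical content a_i of each fragment (mass fraction)
    masses      -- mass m_i of each fragment
    sample_mass -- intended sample mass m_S, in the same units as masses
    """
    lot_mass = sum(masses)  # m_L
    # a_L is the mass-weighted mean of the fragment contents
    lot_content = sum(a * m for a, m in zip(contents, masses)) / lot_mass
    # Heterogeneity term: sum of ((a_i - a_L) / a_L)^2 * m_i^2 / m_L
    heterogeneity = sum(
        ((a - lot_content) / lot_content) ** 2 * m ** 2 / lot_mass
        for a, m in zip(contents, masses)
    )
    return (1.0 / sample_mass - 1.0 / lot_mass) * heterogeneity


# Illustrative (made-up) lot: 1000 fragments of 1 mg, 10 of which are pure analyte.
contents = [1.0] * 10 + [0.0] * 990
masses = [1.0] * 1000
relative_variance = gy_fundamental_variance(contents, masses, sample_mass=100.0)
print(relative_variance ** 0.5)  # relative standard deviation of a_S
```

Doubling `sample_mass` roughly halves the result when the sample is a small part of the lot, which is the familiar inverse dependence on sample size.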

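Equation 1 can be derived by a technique sometimes called uncertainty propagation or error propagation. Using error propagation on the ratio $a_S = A_S / m_S$, one estimates the relative variance of $a_S$ by

$$\frac{\operatorname{var}(a_S)}{a_L^2} \approx \frac{\operatorname{var}(A_S)}{E(A_S)^2} + \frac{\operatorname{var}(m_S)}{E(m_S)^2} - \frac{2\operatorname{cov}(A_S, m_S)}{E(A_S)\,E(m_S)} \tag{2}$$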

When one makes Gy’s assumptions, with the selection probability for each fragment equal to $p = m_S / m_L$, and substitutes exact expressions for $\operatorname{var}(A_S)$, $E(A_S)$, $\operatorname{var}(m_S)$, $E(m_S)$, and $\operatorname{cov}(A_S, m_S)$ into Equation 2, one obtains Equation 1 for the fundamental variance.
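The substitution can be sketched as follows. Here $I_i$ denotes the indicator that fragment $i$ is selected (notation introduced for this sketch only), so that $A_S = \sum_i I_i a_i m_i$ and $m_S = \sum_i I_i m_i$, with the $I_i$ independent and $\Pr[I_i = 1] = p$. The moments are then

$$
\begin{aligned}
E(A_S) &= p\,a_L m_L, & E(m_S) &= p\,m_L, \\
\operatorname{var}(A_S) &= p(1-p)\sum_{i=1}^{N} a_i^2 m_i^2, & \operatorname{var}(m_S) &= p(1-p)\sum_{i=1}^{N} m_i^2, \\
\operatorname{cov}(A_S, m_S) &= p(1-p)\sum_{i=1}^{N} a_i m_i^2.
\end{aligned}
$$

Combining them as $\operatorname{var}(A_S)/E(A_S)^2 + \operatorname{var}(m_S)/E(m_S)^2 - 2\operatorname{cov}(A_S, m_S)/(E(A_S)\,E(m_S))$ gives

$$
\frac{1-p}{p\,m_L^2}\sum_{i=1}^{N} m_i^2\left(\frac{a_i^2}{a_L^2} - \frac{2a_i}{a_L} + 1\right)
= \left(\frac{1}{p\,m_L} - \frac{1}{m_L}\right)\sum_{i=1}^{N}\left(\frac{a_i - a_L}{a_L}\right)^2\frac{m_i^2}{m_L},
$$

and $p\,m_L$ is the intended sample mass $m_S$.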

One may also derive an expression for the sampling variance under the assumption that $S$ is a random sample of exactly $k$ fragments from the lot (for some positive integer $k \le N$). The fact that $S$ is a random sample of size $k$ means that for all subsets $G$ of $L$ of size $k$,

$$\Pr[S = G] = \binom{N}{k}^{-1} \tag{3}$$

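Under this assumption Equation 2 yields a slightly different expression for the relative variance, which is shown below.

$$\frac{\operatorname{var}(a_S)}{a_L^2} \approx \frac{N}{N-1}\left(\frac{1}{k} - \frac{1}{N}\right) N \sum_{i=1}^{N} \left(\frac{a_i - a_L}{a_L}\right)^2 \frac{m_i^2}{m_L^2} \tag{4}$$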

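Equation 4, which was derived for a correct sample, $S$, seems also to work well enough for a fair sample, $S_F$, of size $k$ such that for all subsets $G$ of $L$ of size $k$,

$$\Pr[S_F = G] = \binom{N-1}{k-1}^{-1} \frac{m_G}{m_L}, \qquad m_G = \sum_{i \in G} m_i$$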

And of course, $S_F$ has the advantage of being provably unbiased, so that $E(a_{S_F}) = a_L$.

Both Equation 1 and Equation 4 are only approximately true, but they can be applied to many real-life situations in the laboratory. There is another equation that is exactly true, but which obviously does not apply to many real-life situations in the lab.

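Theorem   Suppose $m_1 = m_2 = \cdots = m_N$ and $S$ is a (nonempty) sample such that $\Pr[S = G] = \Pr[S = H]$ whenever $|G| = |H|$. Then

$$\frac{\operatorname{var}(a_S)}{a_L^2} = \frac{N}{N-1}\left(E\!\left(\frac{1}{m_S}\right) - \frac{1}{m_L}\right) \sum_{i=1}^{N} \left(\frac{a_i - a_L}{a_L}\right)^2 \frac{m_i^2}{m_L}$$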

When all the fragment masses are equal, selecting fragments for the sample in independent Bernoulli trials makes the premise of the theorem true.* So, in this case at least, Gy’s equation for the fundamental variance is a good approximation, except for the missing factor $N / (N - 1)$, which is near unity when $N$ is large, and therefore can be neglected.
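The role of the factor $N / (N - 1)$ can be checked by brute force on a tiny lot. The sketch below covers only the special case of a fixed sample size $k$ with equal fragment masses (taken as 1, so $m_S = k$ and $m_L = N$): it enumerates every equally likely subset of size $k$, computes the exact relative variance of $a_S$ in rational arithmetic, and compares it with Gy's formula. The function names and the example contents are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def exact_relative_variance(contents, k):
    """Exact var(a_S) / a_L^2 over all equally likely k-subsets, unit fragment masses."""
    n = len(contents)
    lot_content = Fraction(sum(contents), n)  # a_L for equal masses
    # With equal masses, a_S is the plain mean of the selected fragment contents.
    sample_contents = [Fraction(sum(g), k) for g in combinations(contents, k)]
    mean = sum(sample_contents) / len(sample_contents)
    variance = sum((x - mean) ** 2 for x in sample_contents) / len(sample_contents)
    return variance / lot_content ** 2


def gy_relative_variance(contents, k):
    """Gy's formula with m_i = 1, m_S = k, m_L = N."""
    n = len(contents)
    lot_content = Fraction(sum(contents), n)
    heterogeneity = sum(((a - lot_content) / lot_content) ** 2 for a in contents) / n
    return (Fraction(1, k) - Fraction(1, n)) * heterogeneity


contents = [Fraction(x) for x in (1, 2, 4, 7, 10)]  # hypothetical a_i values
k = 2
n = len(contents)
exact = exact_relative_variance(contents, k)
gy = gy_relative_variance(contents, k)
print(exact / gy)  # ratio of the exact variance to Gy's estimate
```

For this fixed-size case the ratio is exactly $N / (N - 1)$; with Bernoulli selection the sample size also varies, so a further (usually small) discrepancy can remain.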

Generally, one may expect Equation 1 to be a good approximation as long as the relative standard deviation of $m_S$ is small. If $\mathrm{RSD}(m_S)$ is large, say because the sample is too small or there are some very massive fragments in the lot, the approximation may be much worse.
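To get a feel for when $\mathrm{RSD}(m_S)$ is small, one can compute it directly under Gy's assumptions. With independent Bernoulli selection at probability $p$, $E(m_S) = p\,m_L$ and $\operatorname{var}(m_S) = p(1-p)\sum_i m_i^2$, so the ratio follows immediately. The sketch below uses a hypothetical function name and made-up masses, and it ignores the small correction for rejecting empty samples.

```python
import math


def rsd_sample_mass(masses, p):
    """RSD of m_S when each fragment is selected independently with probability p."""
    lot_mass = sum(masses)  # m_L
    # sqrt(var(m_S)) / E(m_S) = sqrt((1 - p) / p * sum(m_i^2)) / m_L
    return math.sqrt((1.0 - p) / p * sum(m * m for m in masses)) / lot_mass


# Two made-up lots with the same total mass and the same selection probability.
uniform_lot = [1.0] * 100          # 100 equal fragments
lumpy_lot = [1.0] * 50 + [50.0]    # 50 small fragments plus one massive fragment

print(rsd_sample_mass(uniform_lot, 0.1))
print(rsd_sample_mass(lumpy_lot, 0.1))
```

The lot with one massive fragment gives a much larger RSD at the same $p$, which is the situation in which the approximation is expected to degrade.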

It is proved elsewhere that when the sampling is correct, keeping $\mathrm{RSD}(m_S)$ small also ensures that the sampling bias is negligible in comparison to the standard deviation.

* Well, there is the minor issue of the empty sample. If one selects all the fragments in independent Bernoulli trials, it is possible that no fragments will be selected at all. In that case the sampling must be repeated until a nonempty set of fragments is obtained, but the premise of the theorem is still true. It is possible to avoid the problem by selecting the first fragment “correctly” and then selecting all the other fragments in independent Bernoulli trials using Gy’s procedure. The sampling is still “correct” and there is no danger of an empty sample. Or, to ensure the sample is truly unbiased, select the first fragment with each fragment’s selection probability proportional to its mass. This approach theoretically produces a nonempty and unbiased sample, but it is even more impractical than the original.
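The two-stage procedure described in this note can be sketched in a few lines. The code below is only an illustration under stated assumptions: `select_sample` and `expected_content` are hypothetical names, the lot data are made up, and the exact check enumerates every nonempty subset, so it is feasible only for a handful of fragments. It uses the fact that, for the mass-weighted variant, $\Pr[S = G] = (m_G / m_L)\, p^{|G|-1} (1-p)^{N-|G|}$, where $m_G$ is the total mass of $G$.

```python
import random
from fractions import Fraction
from itertools import combinations


def select_sample(masses, p, rng, mass_weighted_first=True):
    """Pick one fragment first, then each remaining fragment with probability p.

    Returns sorted fragment indices; the result is never empty.
    """
    n = len(masses)
    # First fragment: proportional to mass, or uniformly ("correctly") otherwise.
    weights = masses if mass_weighted_first else [1] * n
    first = rng.choices(range(n), weights=weights, k=1)[0]
    rest = [i for i in range(n) if i != first and rng.random() < p]
    return sorted([first] + rest)


def expected_content(contents, masses, p):
    """Exact E(a_S) for the mass-weighted variant, by enumerating nonempty subsets."""
    n = len(masses)
    lot_mass = sum(masses)
    total = Fraction(0)
    for size in range(1, n + 1):
        for g in combinations(range(n), size):
            m_g = sum(masses[i] for i in g)                 # sample mass m_S
            a_g = sum(contents[i] * masses[i] for i in g)   # analyte mass A_S
            prob = Fraction(m_g, lot_mass) * p ** (size - 1) * (1 - p) ** (n - size)
            total += prob * Fraction(a_g, m_g)
    return total


# Made-up lot of four fragments with unequal masses.
contents = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10), Fraction(1, 5)]
masses = [1, 2, 5, 3]
p = Fraction(1, 3)
lot_content = sum(a * m for a, m in zip(contents, masses)) / sum(masses)

print(expected_content(contents, masses, p) == lot_content)
print(select_sample(masses, 1 / 3, random.Random(0)))
```

The equality check is exact because everything is done with `Fraction`; the random draw at the end simply shows one nonempty sample.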