# Subsampling

## Laboratory Subsampling

The French statistician and geologist Pierre Gy developed a theory of material sampling that applies to the subsampling of solid materials in the laboratory for radiochemical analysis. The complete theory is extensive and covers many aspects of sampling for lots of 0, 1, 2, or 3 dimensions. A laboratory sample, however, corresponds to what Gy calls a 0-dimensional lot, which is the simplest kind of lot.

The aspects of the theory of 0-dimensional lots considered here include:

• How to reduce or eliminate sampling bias
• How to reduce the sampling variance
• How to quantify the fundamental sampling variance

### Definitions

Certain terms will be used repeatedly and need to be defined up front.

The lot is the collection of solid material whose properties are of interest. The lot consists of N fragments, or particles, where N is typically a very large number. For our purposes the lot is usually a laboratory sample.

The critical component is the component of interest (analyte) in the lot. The critical content of the lot or of any portion of the lot is the ratio of the mass of the critical component to the total mass of the lot or portion. For our purposes the critical component is usually a radionuclide, such as $^{137}$Cs or $^{238}$Pu, and the critical content is the mass fraction of that radionuclide (which is proportional to the specific activity).
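The proportionality mentioned in the parenthesis can be made concrete with a small calculation. The sketch below is illustrative only. It assumes that the constant of proportionality is the specific activity of the pure radionuclide (decay constant times atoms per gram), and it uses approximate values for Cs-137 (a half-life of about 30.1 years and a molar mass of about 136.9 g/mol) that should be checked against an authoritative nuclear data source. The function names and the example portion are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23           # atoms per mole (exact SI value)
SECONDS_PER_YEAR = 365.25 * 86400  # Julian year, in seconds

def pure_specific_activity(half_life_s, molar_mass_g_per_mol):
    """Activity per gram of the pure radionuclide, in Bq/g."""
    decay_constant = math.log(2) / half_life_s        # per second
    atoms_per_gram = AVOGADRO / molar_mass_g_per_mol  # atoms per gram
    return decay_constant * atoms_per_gram

def material_activity_per_gram(critical_content, half_life_s, molar_mass_g_per_mol):
    """Activity per gram of material (Bq/g) for a given radionuclide mass fraction."""
    return critical_content * pure_specific_activity(half_life_s, molar_mass_g_per_mol)

# Approximate, illustrative values for Cs-137 (assumptions, not from the text).
CS137_HALF_LIFE_S = 30.1 * SECONDS_PER_YEAR
CS137_MOLAR_MASS = 136.9

# Hypothetical portion: 1e-12 g of Cs-137 in 100 g of material.
critical_content = 1e-12 / 100.0
activity = material_activity_per_gram(critical_content, CS137_HALF_LIFE_S, CS137_MOLAR_MASS)
print(f"critical content = {critical_content:.1e}, activity = {activity:.3g} Bq/g")
```

Doubling the critical content doubles the computed activity per gram, which is all that the proportionality claims.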

The sample is a random nonempty subset of the fragments of the lot. The sample is taken from the lot, and the measured properties of the sample are assumed to represent those of the lot. For our purposes the sample is usually a subsample, or aliquot, taken from a laboratory sample (the lot) for radiochemical analysis. To avoid confusion, we will usually avoid the sample/subsample terminology and speak only of the lot and the sample.

Gy defines a probabilistic sample to be correct if every fragment in the lot has the same probability of being included in the sample. A probabilistic sample is biased if the expected value (or mean) of the critical content of the sample differs from the critical content of the lot. If the sample is not biased, it is unbiased. To say that a sample is unbiased does not mean that its critical content exactly equals that of the lot. Instead, it means only that the sample is selected in such a way that if the sampling could be repeated many times (with replacement), the average value of the sample’s critical content would equal the critical content of the lot.
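These definitions can be written compactly. The shorthand here is an assumption made for illustration: $S$ stands for the random sample, $a_S$ for its critical content, $a_L$ for the critical content of the lot, $P$ for the common inclusion probability, and $\mathrm{E}$ for the expected value over repeated sampling.

$$
\text{correct:}\quad \Pr(i \in S) = P \ \text{ for every fragment } i = 1, \dots, N
$$

$$
\text{unbiased:}\quad \mathrm{E}(a_S) = a_L, \qquad \text{bias} = \mathrm{E}(a_S) - a_L
$$

The first condition concerns how fragments are selected. The second concerns the resulting critical content, which is why the two properties are not the same thing.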

It is not always possible to determine from the manner of selecting a sample whether it is biased. Whether the sample is biased depends on the masses and critical contents of its fragments. For example, if every fragment has exactly the same critical content, then any sample is unbiased, but if only one fragment contains all the lot’s critical component, then unbiased sampling requires more care. In practice, one never knows the critical contents of the fragments, so one would like a sample that is unbiased regardless of what those critical contents might be.
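The two extreme cases in this example can be explored with a small simulation. This is a minimal sketch under stated assumptions, not a procedure from the theory. The lot is synthetic, each fragment is included independently with the same probability `p`, empty draws are repeated, and all names and numbers are hypothetical.

```python
import random

def critical_content(fragments):
    """Critical content of a group of fragments given as (mass, critical_mass) pairs."""
    total_mass = sum(m for m, _ in fragments)
    critical_mass = sum(A for _, A in fragments)
    return critical_mass / total_mass

def draw_sample(lot, p, rng):
    """Include each fragment independently with probability p; redraw if empty."""
    while True:
        sample = [frag for frag in lot if rng.random() < p]
        if sample:
            return sample

def sample_contents(lot, p=0.1, n_draws=2000, seed=1):
    """Critical contents of n_draws independent samples taken from the same lot."""
    rng = random.Random(seed)
    return [critical_content(draw_sample(lot, p, rng)) for _ in range(n_draws)]

# Synthetic lot of 200 fragments with varying masses (arbitrary units).
mass_rng = random.Random(0)
masses = [mass_rng.uniform(0.5, 2.0) for _ in range(200)]

# Case 1: every fragment has the same critical content (1e-6).
uniform_lot = [(m, 1e-6 * m) for m in masses]
# Case 2: a single fragment carries all of the critical component.
nugget_lot = [(m, 0.0) for m in masses]
nugget_lot[0] = (masses[0], 1e-6 * sum(masses))

for name, lot in [("uniform", uniform_lot), ("nugget", nugget_lot)]:
    contents = sample_contents(lot)
    print(name, "lot:", critical_content(lot),
          "min:", min(contents),
          "mean:", sum(contents) / len(contents),
          "max:", max(contents))
```

In the uniform case every sample reproduces the lot's critical content up to rounding error. In the single-fragment case each sample either misses the critical component entirely or overstates it, so the result depends heavily on how the sample is selected.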

We will define a sample to be fair if it is unbiased for any possible values of the critical contents of the fragments. Whether a sample is fair depends on the fragment masses but not on their critical contents.
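The dependence on fragment masses can be seen in a toy case that is small enough to compute exactly. The sketch below is an illustration under assumptions that do not come from the text. The sample consists of exactly one fragment, and two hypothetical selection rules are compared: equal probability for each fragment (which meets the definition of correct given above) and probability proportional to fragment mass. The three-fragment lot and all names are invented for the example.

```python
from fractions import Fraction

def lot_content(masses, critical_masses):
    """Critical content of the whole lot: total critical mass over total mass."""
    return Fraction(sum(critical_masses)) / Fraction(sum(masses))

def expected_content(masses, critical_masses, probs):
    """Exact expected critical content when the sample is a single fragment,
    with fragment i chosen with probability probs[i]."""
    return sum(p * Fraction(A) / Fraction(m)
               for p, m, A in zip(probs, masses, critical_masses))

masses = [1, 2, 7]  # fragment masses, arbitrary units
n = len(masses)
equal_probs = [Fraction(1, n)] * n
mass_probs = [Fraction(m, sum(masses)) for m in masses]

# Two hypothetical ways the critical component could be distributed.
uniform = [Fraction(m, 100) for m in masses]  # same critical content everywhere
nugget = [Fraction(1, 10), 0, 0]              # all of it in the lightest fragment

for label, crit in [("uniform", uniform), ("nugget", nugget)]:
    a_lot = lot_content(masses, crit)
    print(label,
          "| lot:", a_lot,
          "| equal-probability bias:", expected_content(masses, crit, equal_probs) - a_lot,
          "| mass-proportional bias:", expected_content(masses, crit, mass_probs) - a_lot)
```

For this toy scheme the mass-proportional rule has zero bias whatever the critical contents are, because the expected value reduces to the total critical mass divided by the total mass. The equal-probability rule is unbiased for the uniform lot but not for the nugget lot, since the fragment masses differ.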

Although fair sampling is the ideal in theory, correct sampling seems to be the more practical goal in normal laboratory work.

### Notation

We denote the lot by $L$. When we write equations for sampling bias and variance, we will assume the fragments of the lot are numbered from 1 to $N$. Then $m_i$ denotes the mass of fragment $i$, $A_i$ denotes the mass of critical component in fragment $i$, and $a_i$ denotes the critical content of fragment $i$. If $G$ is any nonempty subset of the fragments, then $m_G$ denotes the total mass of $G$, $A_G$ denotes the mass of critical component in $G$, and $a_G$ denotes the critical content of $G$. So, $m_L$ denotes the mass of the entire lot and $a_L$ denotes its critical content.
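Written with explicit subscripts, these quantities are related as follows. The relations are restatements of the definition of critical content as a mass ratio, not additional results.

$$
a_i = \frac{A_i}{m_i}, \qquad
m_G = \sum_{i \in G} m_i, \qquad
A_G = \sum_{i \in G} A_i, \qquad
a_G = \frac{A_G}{m_G} = \frac{\sum_{i \in G} a_i m_i}{\sum_{i \in G} m_i}
$$

Taking $G = L$ gives $a_L = A_L / m_L$. The last form shows that the critical content of a group of fragments is the mass-weighted average of the critical contents of its fragments.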