We have cast the optimization problem as one of finding peaks, or at least relatively high values, of a real-valued response surface f over an experimental space X.
Typically the space is too large to be sampled exhaustively. The standard approach to optimization is therefore to carry out a set of experiments that subsamples the space, analyze the results, choose a second set of experiments, analyze those, choose a third set, and so on. This procedure is called iterated high-throughput experimentation. The difficult question is which experiments to perform in each set.
We will call each set a population of experiments, and each successive population a generation. The first generation comprises the initial experiments, and the experimental design task requires the specification of each successive generation given the information from all the previous generations. We will denote the successive generations of experiments as G_1, G_2, ..., G_N. The problem is how to specify G_{n+1} as a function of G_1, ..., G_n.
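The iterated procedure above can be sketched as a loop. The following is a minimal illustration, not the paper's method: the response surface, the "sample near the best points so far" selection rule, and all numerical values are hypothetical stand-ins for whatever experiments and design strategy are actually used.

```python
import random

def measure(x):
    # Hypothetical response surface on a discretized 1-D space,
    # standing in for running one experiment; single peak at x = 700.
    return -(x - 700) ** 2

space_size = 1000   # size of the discretized experimental space
P = 96              # experiments per generation (one plate)
N = 5               # number of generations

random.seed(0)
generation = random.sample(range(space_size), P)  # G_1: initial experiments
history = {}                                      # results from all previous generations

for n in range(N):
    for x in generation:
        history[x] = measure(x)
    # Specify G_{n+1} from all previous results; here, an illustrative
    # rule that resamples near the best points found so far.
    best = sorted(history, key=history.get, reverse=True)[: P // 4]
    generation = [min(max(x + random.randint(-20, 20), 0), space_size - 1)
                  for x in best for _ in range(4)]

print(max(history, key=history.get))  # best point found so far
```

The essential structure is that each new generation is a function of the accumulated history, not of the last generation alone.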
For optimization to be effective, typically many generations must be performed, and so the optimization process requires a certain level of experimental bandwidth. How much bandwidth depends on (i) how large the experimental space is (including how finely it is discretized), and (ii) how rugged the response surface is. The amount of experimental bandwidth needed cannot be predicted in advance, and is typically discovered as the iterated high-throughput experiments progress.
In terms of the evolutionary procedure described above, the experimental bandwidth (measured in experiments per unit time) is characterized by how many experiments may be done per generation, P, together with the time that each generation takes. The total experimental capacity is usually set by resource constraints (consumables and time), and is PN, where N is the number of generations.
Typically, populations of experiments are carried out in multiwell plates, so that the population size is P = 96, P = 384, or a small integer multiple of these numbers, and the number of generations is N = 10-20.
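For a rough sense of scale, the total capacity PN for these typical plate sizes and generation counts works out as follows (illustrative arithmetic only, using the values quoted above):

```python
# Total experimental capacity PN for the typical values in the text:
# plate sizes P = 96 or 384, generation counts N = 10 or 20.
capacities = {(P, N): P * N for P in (96, 384) for N in (10, 20)}
for (P, N), c in sorted(capacities.items()):
    print(f"P={P:3d}, N={N:2d} -> PN = {c}")
```

So a full campaign at these scales spans on the order of one thousand to several thousand experiments.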