PACE has created a novel ICT for experimental optimization of complex chemical systems. The frontiers of experimental design (also known as design of experiments, or DOE) have been extended to the point where iterated high throughput experiments may be optimized to "program" target functionality. In so doing, these methods provide a set of techniques for the indirect programming of microscopic matter.
In designing experiments to create new chemical systems and to understand how new entities emerge, we faced two problems: the high dimensionality of the search space and the high-throughput setting of the experimentation. The large number of parameters that may affect the experimental results, and the dramatic growth of the resulting data sets, made any exhaustive, or even accurate, exploration of the space with conventional combinatorial designs prohibitive.
PACE research addressed this challenging problem of designing experiments for high-dimensional, high-throughput experimentation by developing a new methodological approach, and accompanying software, for conducting experimental optimization efficiently and effectively. This approach, named Model Based Evolutionary design (MBE-design), evolves the experimental strategy adaptively and sequentially: at each step of the evolution it uncovers the relevant information in the experimental results and embeds this information in the search strategy. The approach, built on real experiments and simulations, involves a bottom-up modelling strategy (data-driven models) followed by top-down modelling.
The approach performs very well compared with conventional design strategies and with simple evolutionary procedures such as a standard genetic algorithm or simulated annealing. The information gained from statistical modelling enhances the evolutionary process, speeding up convergence to the optimal region of the experimental space. A major consequence is a drastic reduction in the number of experiments needed to reach a given target optimization level, which is particularly relevant in biochemical experimentation, where experiments are expensive and often difficult to perform.
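The idea of letting a statistical model guide an evolutionary search can be illustrated with a minimal sketch. It is not the PACE software: the response function `simulated_experiment`, the k-nearest-neighbour predictor standing in for the statistical model, and all parameter names and values are assumptions made for illustration only. In each generation many candidate settings are generated by crossover and mutation, but only those the model ranks highest are actually "run".

```python
import random


def simulated_experiment(x):
    """Hypothetical stand-in for a laboratory experiment.

    The response is highest (zero) when every factor equals 0.7.
    """
    return -sum((xi - 0.7) ** 2 for xi in x)


def knn_predict(x, data, k=3):
    """Data-driven model (assumed here): mean response of the k nearest tested points."""
    nearest = sorted(
        data, key=lambda d: sum((a - b) ** 2 for a, b in zip(x, d[0]))
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mutate(x, rng, scale=0.1):
    """Perturb each factor with Gaussian noise, clamped to the range [0, 1]."""
    return [min(1.0, max(0.0, xi + rng.gauss(0.0, scale))) for xi in x]


def crossover(a, b, rng):
    """Uniform crossover: each factor is taken from one of the two parents."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def mbe_design(dim=8, batch=10, generations=15, candidates=200, seed=0):
    rng = random.Random(seed)
    # Initial random design: one batch of experiments.
    population = [[rng.random() for _ in range(dim)] for _ in range(batch)]
    data = [(x, simulated_experiment(x)) for x in population]
    history = [max(y for _, y in data)]
    for _ in range(generations):
        # Parents are the best settings tested so far.
        ranked = sorted(data, key=lambda d: d[1], reverse=True)
        parents = [x for x, _ in ranked[:batch]]
        # Generate many candidate settings without running them.
        pool = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(candidates)
        ]
        # The model screens the candidates; only the top ones are run.
        pool.sort(key=lambda x: knn_predict(x, data), reverse=True)
        data.extend((x, simulated_experiment(x)) for x in pool[:batch])
        history.append(max(y for _, y in data))
    best_x, best_y = max(data, key=lambda d: d[1])
    return best_x, best_y, history
```

Calling `mbe_design()` returns the best setting found, its response, and the best response after each generation. The experimental budget is `batch * (generations + 1)` runs, however many candidates the model screens; this is where a model-guided search saves experiments compared with running every offspring.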
Two main areas of impact of this achievement are:
Design of experiments in high dimensional settings (HD-designs)
Statistical modelling for prediction and data mining in high throughput experimentation (HT-analysis)
PACE addressed the problem of analysing large data sets by building a set of statistical tools and data-driven models to discover the information hidden in the data. In particular, we built Bayesian probabilistic networks, in which families of probability distributions, represented in terms of graphs, describe the relations of conditional dependence and independence among variables. Bayesian networks have been very successful in giving insight into the interaction network of the experimental variables (Slanzi et al., 2007). Cluster analysis and multiobjective optimization have also been applied, with interesting results. Finally, we developed an evolutionary approach to modelling data: models were derived with genetic algorithms, creating an algebra of models. The structure of the models, initially random, was transformed according to the evolutionary paradigm and led to a region of the model space that satisfies an optimality criterion (Minerva et al., 2008).
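As a small illustration of how a graph encodes conditional dependence and independence, the sketch below builds a three-variable chain network A -> B -> C and checks numerically that C is independent of A once B is known. The variables, the graph, and the probability tables are hypothetical and are not taken from the PACE data.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain A -> B -> C.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}    # indexed [a][b]
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # indexed [b][c]


def joint(a, b, c):
    """Joint probability given by the graph: P(a) * P(b | a) * P(c | b)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


def prob(**fixed):
    """Probability of the fixed values, summing the joint over all other variables."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        values = {"a": a, "b": b, "c": c}
        if all(values[name] == v for name, v in fixed.items()):
            total += joint(a, b, c)
    return total


def c_given(c, **evidence):
    """Conditional probability P(C = c | evidence), with evidence on a and/or b."""
    return prob(c=c, **evidence) / prob(**evidence)
```

With these tables, `c_given(1, a=0, b=1)` and `c_given(1, a=1, b=1)` agree, because the graph has no edge from A to C, whereas `c_given(1, a=0)` and `c_given(1, a=1)` differ: A influences C only through B.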