With seed data to hand, our Active Learning algorithms, in combination with other techniques such as evolutionary algorithms, determine which compounds to make and test next so as to add the most to overall project knowledge. The objective is to gain the largest increment in information from the fewest compounds synthesized and tested. As the project progresses and more data accumulates, it also becomes possible to build project-specific models from that data.
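The idea of choosing the next compounds to maximize information gained can be illustrated with a minimal sketch. It is not Exscientia's algorithm: it swaps in a simple greedy farthest-point heuristic, where distance to the nearest already-tested compound stands in for model uncertainty. The function names, the two-dimensional descriptor vectors, and the data are all hypothetical, and the sketch assumes at least one tested compound is available as seed data.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_next(tested, candidates, batch_size=1):
    """Greedily pick the candidates farthest from anything already tested.

    Assumption of this sketch: distance to the nearest tested compound
    is a crude proxy for how much a new measurement would teach us.
    """
    known = list(tested)
    pool = list(candidates)
    picked = []
    for _ in range(min(batch_size, len(pool))):
        # Score each candidate by its distance to the closest known compound.
        best = max(pool, key=lambda c: min(distance(c, k) for k in known))
        picked.append(best)
        pool.remove(best)
        # Count the pick as known so the next pick explores elsewhere.
        known.append(best)
    return picked


# Hypothetical descriptors for seed compounds and untested candidates.
seed = [(0.0, 0.0), (1.0, 0.0)]
pool = [(0.5, 0.1), (5.0, 5.0), (1.1, 0.1), (4.8, 5.1)]
batch = select_next(seed, pool, batch_size=2)
print(batch)
```

Because each pick is treated as known before the next is chosen, the second selection avoids the near-duplicate of the first and goes to a different region of the pool, which is the "fewest compounds for the most information" behaviour in miniature. A real system would replace the distance proxy with uncertainty estimates from a trained model.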
The performance of our Active Learning methods was covered in a Nature report [need link] in 2017, which showed for the first time that algorithms could learn their way into a drug discovery dataset more effectively than most humans. Since then, our Active Learning techniques have been continually refined, and they are now the de facto approach in Exscientia drug discovery projects.