A data-cleaning tool for building better prediction models -- MEDICA Trade Fair

09/01/2016

Image: Software screenshot; Copyright: Eugene Wu

Tested on a real data set, ActiveClean (red) needed to clean only 5,000 records to bring the prediction model to 90 percent accuracy. Active learning (green) had to clean 50,000 records to achieve comparable results. The most common method (purple) provided minimal model improvement; © Eugene Wu