Exploiting noisy and incomplete biological data for prediction and knowledge discovery