Automated model structure variation and policy robustness testing