Fusion of stereo and monocular depth estimates in a self-supervised learning context