Social interaction for efficient agent learning from human reward