Multi-frame deep learning models for action detection in surveillance videos