Residential demand response of thermostatically controlled loads using batch Reinforcement Learning