Learning variable selection rules for the branch-and-bound algorithm using reinforcement learning