Use Reinforcement Learning to Generate Testing Commands for Onboard Software of Small Satellites