All-at-once optimization for kernel machines with canonical polyadic decompositions