aarch64: h264pred: Optimize the inner loop of existing 8-bit functions