Replace the raw *mlx.Array token input with a ForwardBatch struct that carries InputIDs alongside sequence metadata (SeqIDs, SeqLens). InputIDs keeps its [1, N] shape, so model code is unchanged apart from the signature.
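A minimal sketch of what such a struct might look like, for illustration only. Only the names ForwardBatch, InputIDs, SeqIDs, and SeqLens come from the description above. The Array type is a hypothetical stand-in for *mlx.Array so the example compiles on its own, and the field types, the NewForwardBatch helper, the Validate method, and the assumption that SeqLens sums to N are all guesses rather than the actual change.

```go
package main

import "fmt"

// Array is a hypothetical stand-in for *mlx.Array, used only so this
// sketch is self-contained. It records a shape and flat token data.
type Array struct {
	Shape []int
	Data  []int32
}

// ForwardBatch bundles the token input with per-sequence metadata.
// The field types below are assumptions, not taken from the real code.
type ForwardBatch struct {
	InputIDs *Array // token IDs, shaped [1, N]
	SeqIDs   []int  // assumed: one identifier per sequence in the batch
	SeqLens  []int  // assumed: token count per sequence, summing to N
}

// Validate checks the invariants this sketch assumes: InputIDs is
// [1, N], there is one length per sequence ID, and lengths sum to N.
func (b *ForwardBatch) Validate() error {
	if b.InputIDs == nil || len(b.InputIDs.Shape) != 2 || b.InputIDs.Shape[0] != 1 {
		return fmt.Errorf("InputIDs must have shape [1, N]")
	}
	if len(b.SeqIDs) != len(b.SeqLens) {
		return fmt.Errorf("got %d SeqIDs but %d SeqLens", len(b.SeqIDs), len(b.SeqLens))
	}
	total := 0
	for _, n := range b.SeqLens {
		total += n
	}
	if total != b.InputIDs.Shape[1] {
		return fmt.Errorf("SeqLens sum to %d, want %d", total, b.InputIDs.Shape[1])
	}
	return nil
}

// NewForwardBatch (hypothetical helper) packs per-sequence tokens into a
// single [1, N] input and records each sequence's ID and length.
func NewForwardBatch(tokens [][]int32, seqIDs []int) (*ForwardBatch, error) {
	if len(tokens) != len(seqIDs) {
		return nil, fmt.Errorf("got %d token slices but %d seqIDs", len(tokens), len(seqIDs))
	}
	var flat []int32
	seqLens := make([]int, 0, len(tokens))
	for _, t := range tokens {
		flat = append(flat, t...)
		seqLens = append(seqLens, len(t))
	}
	b := &ForwardBatch{
		InputIDs: &Array{Shape: []int{1, len(flat)}, Data: flat},
		SeqIDs:   append([]int(nil), seqIDs...),
		SeqLens:  seqLens,
	}
	if err := b.Validate(); err != nil {
		return nil, err
	}
	return b, nil
}
```

Under these assumptions, a model that previously read the token array directly would read batch.InputIDs instead and could ignore the metadata, which is consistent with the signature being the only change in model code.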