[0.0.22.post4] Build binaries for PyTorch 2.1.0 / CUDA 12.1
Also adds back support for Flash-Attention on Windows, for the CUDA 12.1 build only. The Windows wheels won't include Flash-Attention for now, though, as we have some issues to fix in our CI first (hopefully done in about a week).
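To check whether a local environment lines up with these binaries, something like the sketch below may help. It assumes the PyTorch wheel reports its CUDA build as a local version tag (for example `2.1.0+cu121`), which is common for wheels from the PyTorch index but not guaranteed; the helper names `parse_torch_build`, `matches_release` and `installed_torch_version` are hypothetical and not part of any library.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_torch_build(version: str) -> Tuple[str, Optional[str]]:
    """Split a version string such as '2.1.0+cu121' into ('2.1.0', 'cu121')."""
    base, _, local = version.partition("+")
    return base, (local or None)


def matches_release(version: str, torch: str = "2.1.0", cuda_tag: str = "cu121") -> bool:
    """Return True only when both the base version and the CUDA tag match.

    Assumption: a missing local tag is treated as "no match", even though
    some wheels omit the tag entirely, so a False result is not conclusive.
    """
    base, local = parse_torch_build(version)
    return base == torch and local == cuda_tag


def installed_torch_version() -> Optional[str]:
    """Read the installed torch version from package metadata, without importing torch."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed")
    else:
        print(found, "matches" if matches_release(found) else "does not match")
```

Running `python -m xformers.info` after installation should additionally list which operators, Flash-Attention included, are available in the installed build.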