b3274
gemma2: add sliding window mask (#8227)

* gemma2: add sliding window mask
* fix data_swa uninitialized
* better naming
* add co-author

Co-authored-by: Arlo Phoenix <[email protected]>

* replace list with single tensor
* update
* llama : minor styling
* convert : add sanity check for query_pre_attn_scalar
* fix small typo in README

---------

Co-authored-by: Arlo Phoenix <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>