Release 1.1.0
- All the kernels are now de-unrolled
Prior to this release, all mad (multiply-add) and fetch operations were manually unrolled, which created register pressure on low-end devices. The operations are now written as for loops, and unrolling is left to the compiler.
- General improvements
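
To illustrate the de-unrolling change noted above, here is a minimal sketch in plain C rather than actual kernel code. The function names, the `TAPS` constant, and the dot-product workload are hypothetical and are not taken from the project's kernels; the sketch only contrasts a hand-unrolled sequence of multiply-adds with the equivalent loop form.

```c
/* Hypothetical trip count, fixed at compile time so the compiler is
 * free to unroll the loop fully, partially, or not at all. */
#define TAPS 4

/* Old style (illustrative): every multiply-add is written out by hand.
 * In a real kernel this pattern can keep many values live at once and
 * increase register pressure. */
static float dot_unrolled(const float *a, const float *b) {
    float acc = 0.0f;
    acc += a[0] * b[0];
    acc += a[1] * b[1];
    acc += a[2] * b[2];
    acc += a[3] * b[3];
    return acc;
}

/* New style (illustrative): the same operations, in the same order,
 * expressed as a for loop. Whether and how far to unroll is left to
 * the compiler for the target device. */
static float dot_looped(const float *a, const float *b) {
    float acc = 0.0f;
    for (int i = 0; i < TAPS; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

Both functions perform the same accumulation in the same order, so they should return the same value for the same inputs; only the decision about unrolling moves from the source code to the compiler.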