-
Notifications
You must be signed in to change notification settings - Fork 31
Add a matrix type using an Unboxed vector underneath. #56
base: master
Are you sure you want to change the base?
Conversation
This looks quite well. Thanks for the work! I wonder if it won't be a burden maintaining two versions of the same code. I would say the tests are one of the most fundamental parts to implement before merging. Benchmarks would be a nice addition too, especially if we could compare it to the boxed version. That would give this variant a good reason to exist. I'd say don't worry about all the multiplication algorithms. If the benchmarks say this version multiplies faster (and it probably will), I'm happy to accept it like that. We can always make progress to that particular purpose later if we want to. About the rules... They can probably be left there. I don't see any harm besides the warnings. They say the rules might not fire, but when they do it's a performance win. Although ideally we should try to fix #37. But I don't think that stops this PR from merging. |
@Daniel-Diaz after reading your comment, I put the rules back in place, added benchmarks for the unboxed version: overall, unboxed multiplications are 2-4x faster! In the process I also added some missing (imho) SPECIALIZE pragmas to the boxed version, for strassen multiplication, which improved the performances there by approx. 10%. Commit and detailed report will follow... |
(for an updated benchmark, see below. This one is kept here to see the Strassen and StrassenU times which lead to opening #57, and which are not in the updated benchmark)
|
…h unboxed and boxed versions to improve performances.
(Edited) Reflecting on these results, I wonder if strassen multiplication is of any use to users of the library, because it seems so slow compared to others, and blows up the memory when using a square matrix of length 512 or so (4GB during the benchmark, note that this may also be due to criterion running tests in parallel, but still...). strassen /mixed/ multiplication seems ok though. |
Updated benchmarks with new multiplication functions (now boxed and unboxed versions have the same functionalities), and where the strassen benchmark was removed (cf #57):
So the unboxed version is at least 2x faster (for smaller matrix sizes) and up-to 8x faster, for bigger matrix sizes, which is what we could reasonably expect. |
So I guess now what remains to be done is to add some tests, as I mentionned in the edited first message, we can't use I probably won't have time to work on it in the near future, so contributions on that side are welcome :) |
@Daniel-Diaz this PR would actually be very interesting performance-wise, any update? Thanks! |
Following up on #54, I implemented an unboxed matrix.
My intent was to have a version with minimal working functionality that would suit my project. Nevertheless, I open this PR in case someone has the same need or wants to continue the work.
What (probably) needs to be done before a merge:
Port multStd__, multStd2, strassenMixed and multStrassenMixed from the boxed implementations (I removed them here as I couldn't find out how to port them).(Done)mapMat
which does what fmap would do.Also, I removed the rules triggering warnings (see GHC 8 warnings about REWRITE rules not firing #37). They should be reintroduced once GHC 8 warnings about REWRITE rules not firing #37 is fixed.(see discussion below)Integer
which cannot be unboxed. We would have to useInt
tests.Benchmarking : on my project I saw better performances, but it would be nice to document it with criterion for example.(Done)