Name proposal for lazy operations: past tense #11

Open
andyferris opened this issue Nov 3, 2018 · 7 comments

Comments

@andyferris
Member

andyferris commented Nov 3, 2018

I've been happy to separate the semantics of greedy-vs-lazy operations into separate functions, like map vs mapview and group vs groupview. However, the view suffix is a little tiresome.

I'm wondering if we should follow the example set by Base.Broadcast which uses broadcast(...) = materialize(broadcasted(...)), where broadcasted is more-or-less a lazy version of broadcast and materialize is something that behaves a bit like copy when necessary.
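For readers unfamiliar with that pattern, here is a minimal sketch of the split in Base.Broadcast. It only uses `broadcasted` and `materialize` as they exist in Julia 1.x; the variable names are illustrative.

```julia
using Base.Broadcast: broadcasted, materialize

# `broadcasted` only builds a lazy description of the computation.
lazy = broadcasted(+, [1, 2, 3], 10)
is_lazy = lazy isa Base.Broadcast.Broadcasted

# `materialize` evaluates that description into a concrete array.
eager = materialize(lazy)

# The eager `broadcast` gives the same result as materializing the lazy object.
same = eager == broadcast(+, [1, 2, 3], 10)
```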

That would be something like:

  • mapview -> mapped.
  • groupview -> grouped.
  • Lazy join functions ending in joined instead of join.
  • A new filtered for lazy filter.
  • product is a noun, not a verb, and seems fine being lazy.
  • flatten vs flattened?
  • We've been discussing splitdims at Base and slice/slices seems like a possible naming. sliced could be a lazy version?
  • etc...
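As a rough sketch of what the past-tense naming could look like, the snippet below defines `mapped` and `filtered` as thin wrappers over lazy iterators that already exist in Base. The names `mapped`, `filtered`, `eager_map` and `eager_filter` are hypothetical and only illustrate the proposal; they are not an existing API.

```julia
# Hypothetical lazy operations, named in the past tense as proposed.
mapped(f, itr) = Base.Generator(f, itr)
filtered(pred, itr) = Iterators.filter(pred, itr)

# Hypothetical eager counterparts that "materialize" the lazy versions,
# mirroring broadcast(...) = materialize(broadcasted(...)).
eager_map(f, itr) = collect(mapped(f, itr))
eager_filter(pred, itr) = collect(filtered(pred, itr))

squares = mapped(x -> x^2, 1:5)      # nothing is computed yet
evens = filtered(iseven, squares)    # still lazy
result = collect(evens)              # work happens here
```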

Does anyone have any thoughts or opinions?

@nalimilan
Member

I'm not sure whether using past tense is explicit enough. Anyway, wouldn't it be better to discuss that in the base Julia repo?

@bramtayl

bramtayl commented Dec 6, 2018

I like this.

@bramtayl

Ok, further thoughts: why not just make everything lazy? Eager versions could just be specialized (optimized) methods of Base.collect.

@andyferris
Member Author

While I may somewhat agree with you, I try to follow Base semantics here for things like map. It might be too late to start fiddling with that. (As I understand it, when this was discussed for earlier versions of Julia, the feeling was that laziness might add too much run-time overhead. Even if the compiler is better at allowing zero-cost abstractions these days, complexity still goes up considerably.)

Also, some operations aren't obviously better being lazy. groupview is only partially lazy (fully lazy would be much worse!). filtered cannot possibly preserve array-ness. It's a tricky space, I feel.
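The point about array-ness can be illustrated with iterators already in Base. This is only a sketch: `Base.Generator` and `Iterators.filter` stand in for the hypothetical `mapped` and `filtered`.

```julia
v = [1, 2, 3, 4, 5, 6]

# A lazy map over an array still knows its shape up front.
lazy_map = Base.Generator(x -> 2x, v)
map_has_shape = Base.IteratorSize(lazy_map) isa Base.HasShape{1}
map_size = size(lazy_map)

# A lazy filter cannot know how many elements survive without running
# the predicate, so it is neither an array nor of known length.
lazy_filter = Iterators.filter(isodd, v)
filter_size_unknown = Base.IteratorSize(lazy_filter) isa Base.SizeUnknown
filter_is_array = lazy_filter isa AbstractArray
```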

@bramtayl

I mean, map internally creates a generator and collects it, so...
Being super-lazy opens up options for optimizations. For example, mapping a reduce function over truly lazy groups would enable the groupreduce optimization.
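A small sketch of the "map is a collected generator" observation, using only Base. Whether Base's `map` takes exactly this path for every container type is not something this snippet demonstrates; it only shows that the results agree for a plain vector.

```julia
f(x) = x^2
xs = [1, 2, 3]

# The lazy object: applies `f` element by element only when iterated.
gen = Base.Generator(f, xs)

# Collecting the generator reproduces the eager `map` result.
collected = collect(gen)
mapped_eagerly = map(f, xs)
```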

@andyferris
Member Author

> For example, mapping a reduce function over truly lazy groups would enable the groupreduce optimization.

Unlike most operations where we can just rely on iteration and separation of concerns and still get optimal performance, for this particular case I think it would be necessary to overload the method explicitly. There are worse complications regarding anonymous functions that can't be introspected and the fact that map and broadcast do not work on AbstractDict in the first place. That is, I couldn't figure out a way to make map(g -> reduce(+, g), grouped(by, itr)) or even sum.(grouped(by, itr)) work. I raged a bit on JuliaLang/julia at the time :)
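The AbstractDict limitation can be reproduced with a plain `Dict`, assuming the behaviour of Julia 1.x, where both calls throw. The dictionary `d` below is a made-up stand-in for the output of a grouping operation.

```julia
d = Dict(:a => [1, 2], :b => [3, 4])

# `map` over a dictionary throws in Julia 1.x.
map_error = try
    map(sum, d)
    nothing
catch err
    err
end

# Broadcasting over a dictionary throws as well.
broadcast_error = try
    sum.(d)
    nothing
catch err
    err
end

# Iterating the pairs explicitly does work.
sums = Dict(k => sum(v) for (k, v) in d)
```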

Of course, for something like mapreduce this is trivial, as reduce(op, mapview(f, itr); init = ...) already works out-of-the-box.
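A minimal sketch of that composition, using `Base.Generator` as a stand-in for `mapview` so that the snippet runs without any package:

```julia
itr = [1, 2, 3, 4]

# Lazy map: no intermediate array is allocated.
lazy = Base.Generator(abs2, itr)

# Reducing over the lazy map behaves like a fused mapreduce.
total = reduce(+, lazy; init = 0)
reference = mapreduce(abs2, +, itr; init = 0)
```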

@bramtayl

I think the solution in that case is pretty easy: collect(Generator(Reduce(f), Grouped(by, iter))) would just have to be optimized to groupreduce(f, by, iter). I have an implementation of Reduce in JuliennedArrays. Grouped just has to be truly lazy.
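One way this dispatch idea might be sketched is below. Everything here is hypothetical: `Reduce`, `Grouped` and `groupreduce` are defined locally for illustration and are not the JuliennedArrays or SplitApplyCombine implementations, and `Grouped` is a pure marker type that does no grouping by itself.

```julia
# Hypothetical callable wrapper that remembers the reduction operator.
struct Reduce{F}
    op::F
end
(r::Reduce)(group) = reduce(r.op, group)

# Hypothetical "truly lazy" grouping: it only stores the key function
# and the source iterable.
struct Grouped{B,I}
    by::B
    itr::I
end

# Single-pass grouped reduction keeping one accumulator per key.
function groupreduce(op, by, itr)
    acc = Dict{Any,Any}()
    for x in itr
        k = by(x)
        acc[k] = haskey(acc, k) ? op(acc[k], x) : x
    end
    return acc
end

# Because the operator is stored in a struct rather than hidden in an
# anonymous function, `collect` can recognise the pattern and rewrite it.
Base.collect(g::Base.Generator{<:Grouped,<:Reduce}) =
    groupreduce(g.f.op, g.iter.by, g.iter.itr)

result = collect(Base.Generator(Reduce(+), Grouped(iseven, 1:10)))
```

Note that this only works because `Reduce(+)` is introspectable; the equivalent `g -> reduce(+, g)` would not match the method, which is the complication raised earlier in the thread.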


3 participants