Demo: adding m[x] notation for indexing #9174

richcarl · 2024-12-10T20:23:23Z

This is just a proof of concept as a discussion point. We could easily have m[x] notation for looking up a value in a map (and possibly other data types), if we wanted to. It's no different grammatically from how it works in e.g. Python or Javascript.

f(X, Y) ->
    A=#{a => #{1=>a, 2=>b, 3=>c},
        b => #{1=>p, 2=>q, 3=>r},
        c => #{1=>x, 2=>y, 3=>z}},
    A[X][Y].

As people are using maps more and more, I see that there could be a case for adding this syntax now, as an alias for erlang:map_get/2, especially for accessing nested maps. Works in guards as well.
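
For comparison, here is a minimal sketch of what A[X][Y] would presumably correspond to in today's Erlang, assuming (as stated above) that the notation is an alias for erlang:map_get/2. The module name and the is_a/3 helper are made up for illustration and are not part of the PR.

```erlang
-module(nested_lookup).
-export([f/2, is_a/3]).

%% Same data as the example above, spelled with map_get/2.
f(X, Y) ->
    A = #{a => #{1 => a, 2 => b, 3 => c},
          b => #{1 => p, 2 => q, 3 => r},
          c => #{1 => x, 2 => y, 3 => z}},
    %% A[X][Y] would read as: look up X in A, then Y in the result.
    map_get(Y, map_get(X, A)).

%% map_get/2 is allowed in guards; a missing key makes the guard fail
%% instead of raising an exception.
is_a(M, X, Y) when map_get(Y, map_get(X, M)) =:= a -> true;
is_a(_, _, _) -> false.
```

Note the argument order: map_get/2 takes the key first and the map second, so the nesting reads inside-out compared to the left-to-right A[X][Y].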

github-actions · 2024-12-10T20:24:14Z

CT Test Results

2 files 96 suites 1h 9m 8s ⏱️
2 173 tests 2 125 ✅ 48 💤 0 ❌
2 536 runs 2 486 ✅ 50 💤 0 ❌

Results for commit f47e1ec.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

Maria-12648430 · 2024-12-12T09:34:55Z

> This is just a proof of concept as a discussion point. We could easily have m[x] notation for looking up a value in a map (and possibly other data types), if we wanted to. It's no different grammatically from how it works in e.g. Python or Javascript.
>
> f(X, Y) ->
>     A=#{a => #{1=>a, 2=>b, 3=>c},
>         b => #{1=>p, 2=>q, 3=>r},
>         c => #{1=>x, 2=>y, 3=>z}},
>     A[X][Y].
>
> As people are using maps more and more, I see that there could be a case for adding this syntax now, as an alias for erlang:map_get/2, especially for accessing nested maps. Works in guards as well.

This is something I have wished for for a long time, thanks ❤️

I have some issues with the terminology and syntax, though 😅

The term "index" seems wrong to me in the context of maps, and should be "key" or something. The grammar token should be something like map_lookup_expr, in order to remove the "index" part and make it obvious that it concerns maps.

The proposed syntax of M[X][...] using square brackets puts this uncomfortably close to lists territory. It may be just me, but when I see square brackets, I think "lists". What I mean becomes more obvious when you put in some more formatting. Using the last line from your example, imagine it written like this:

    ...
    A
        [X]
        [Y].

It's not really obvious at a glance what this means. Each line on its own is valid syntax. Which also means, if a stray comma sneaks in, like this...

    ...
    A
        [X],
        [Y].

... it won't be detected by the compiler, since it is all totally valid. But the result won't be what you want at all: [Y] instead of maps:get(Y, maps:get(X, A)), all because of a misplaced, hard-to-spot comma. Debug this 👊
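
The difference can be sketched in today's Erlang by writing the lookups with map_get/2. This is only an emulation of what the two spellings would mean under the proposed syntax; the module and function names are hypothetical.

```erlang
-module(stray_comma).
-export([intended/3, with_comma/3]).

%% What A[X][Y] is meant to compute.
intended(A, X, Y) ->
    map_get(Y, map_get(X, A)).

%% What "A[X], [Y]" would mean: the first lookup is evaluated and its
%% result discarded, then a one-element list is returned.
with_comma(A, X, Y) ->
    map_get(X, A),
    [Y].
```

With A = #{1 => #{2 => v}}, intended(A, 1, 2) returns v while with_comma(A, 1, 2) returns [2].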

My suggestion is, at least put in a #. Me, when I see #, I always think "maps". Maybe also abolish square brackets here, another option would be something like #{X}, that is, a map-like construct without the assignment. But most important, make it so that the lookup construct is not valid code on its own.

williamthome · 2024-12-12T11:02:53Z

I don't dislike the proposed notation, but I agree with Maria.
However, I'll give another suggestion.

This is a record notation:

-record(foo, {bar}).

rec() ->
    Rec = #foo{bar = bar},
    Rec#foo.bar.

I suggest using a notation like the one on record:

map() ->
    Map = #{foo => bar},
    #Map.foo.

To me, this notation feels more obvious that I'm getting a map value instead of Map[foo].

Note that the following are invalid record notations:

invalid_rec_1() ->
    Foo = foo, 
    Rec = #foo{bar = bar},
    Rec#Foo.bar.

invalid_rec_2() ->
    Foo = foo,
    #Foo{bar = bar}.

richcarl · 2024-12-12T11:15:38Z

> The term "index" seems wrong to me in the context of maps, and should be "key" or something.

The act of looking up something is commonly referred to as indexing, and the terminology should also not be too specific to the particular application of looking up from a dictionary, since it could be made to apply to things like tuples or binaries as well.

> The proposed syntax of M[X][...] using square brackets puts this uncomfortably close to lists territory. It may be just me, but when I see square brackets, I think "lists".

This is true, but the same goes for Python, Javascript, Ruby, etc. They all have the same possibility of accidentally leaving out a comma, but I don't see many people pointing this out as a big problem. The advantages of using a universally recognized notation compared to adding yet another Erlang specific quirk should outweigh the risk of occasional typos, I think.

Maria-12648430 · 2024-12-12T11:15:48Z

> map() ->
>     Map = #{foo => bar},
>     #Map.foo.

This is kinda ok (though I can't say I like it much, sorry to say) as long as the key is an atom. It gets pretty weird when the key is something else: #Map."foo", #Map.{foo}, #Map.[foo], #M.#{foo => bar}. And it gets even weirder when the map to access is not in a variable but given as a literal: ##{foo => bar}.foo.

Maria-12648430 · 2024-12-12T11:18:04Z

> This is true, but the same goes for Python, Javascript, Ruby, etc.

This doesn't mean it is good, just saying 😜

> The advantages of using a universally recognized notation compared to adding yet another Erlang specific quirk should outweigh the risk of occasional typos, I think.

Let's just say I really disagree on that point 😅

Curious to see what others think...

richcarl · 2024-12-12T11:21:31Z

> I suggest using a notation like the one on record:

Keep in mind that both map and key can be computed values. You probably don't want this:

    #(build_a_map(Of, Stuff)).(get_the_key(From, Somewhere))

richcarl · 2024-12-12T12:07:05Z

Oh, and don't forget that you probably want to be able to chain them in a good way. Consider:

    #(#(get_map_of_maps(Of, Stuff)).(get_the_key(From, Somewhere))).(get_other_key(Elsewhere))

compared to

    get_map_of_maps(Of, Stuff)[get_the_key(From, Somewhere)][get_other_key(Elsewhere)]

juhlig · 2024-12-12T12:43:09Z

> The act of looking up something is commonly referred to as indexing

Really? I would say that the common expression is "looking up (by index, by key, by ...)". English is not my native language, though, so I may be mistaken.

> the terminology should also not be too specific to the particular application of looking up from a dictionary, since it could be made to apply to things like tuples or binaries as well.

I think this should be restricted to maps. Because aside from maps, what this could be applied to are:

Lists. However, accessing list elements by their index is discouraged because of performance etc. But if you introduce a shortcut for doing it, you encourage it instead, because if the language provides a shortcut for it, it can't be bad, right?
Tuples. Yes, but how often do you want to access tuple elements by their index? They are usually small and pattern-matched against. Cases where you need access by index are rare enough that using element/2 is not too much of a pain.
Binaries. No. What should Bin[N] give you? The Nth byte? The Nth bit? What about other unit sizes? And what should the type of the returned value be? An integer? In what endianness, and signed or unsigned? Or a (sub-) binary?
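
To make the ambiguity for binaries concrete, here is a small sketch of three equally plausible readings of Bin[N], written with existing stdlib calls and bit syntax. The module and function names are invented for this illustration, and zero-based positions are an assumption (they follow what binary:at/2 and binary:part/3 use).

```erlang
-module(bin_index).
-export([nth_byte/2, nth_bit/2, nth_sub/2]).

%% Reading 1: the Nth byte (0-based) as an unsigned integer.
nth_byte(Bin, N) ->
    binary:at(Bin, N).

%% Reading 2: the Nth bit (0-based) as 0 or 1.
nth_bit(Bin, N) ->
    <<_:N/bits, Bit:1, _/bits>> = Bin,
    Bit.

%% Reading 3: the Nth byte (0-based) as a one-byte sub-binary.
nth_sub(Bin, N) ->
    binary:part(Bin, N, 1).
```

For Bin = <<16#80, 1>>, position 0 gives 128 as a byte, 1 as a bit, and <<128>> as a sub-binary, so a single Bin[N] notation would have to pick one of these.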

richcarl · 2024-12-12T13:41:00Z

> I think this should be restricted to maps.

That's perfectly fine, but the (suggested) notation is neutral and shouldn't be hard coded to maps only, because we don't know what we might want to add in the future. Maybe built-in arrays or other kinds of vectors as an example.

garazdawi · 2024-12-12T13:50:03Z

> Maybe built-in arrays or other kinds of vectors as an example.

Or some kind of Access protocol similar to what Elixir uses :) So that we could use it for custom datatypes.

richcarl · 2024-12-12T14:04:05Z

> Or some kind of Access protocol similar to what Elixir uses :) So that we could use it for custom datatypes.

Yes, it's an interesting question if that can be added to Erlang in a nice way (with little overhead and no complications at compile time and supporting dynamic code updates and with good debugging support).

juhlig · 2024-12-13T09:52:57Z

> > Maybe built-in arrays or other kinds of vectors as an example.
>
> Or some kind of Access protocol similar to what Elixir uses :) So that we could use it for custom datatypes.

(Disclaimer: This is just my personal opinion)

Please don't introduce multiple mostly-alike-but-oh-so-slightly-different implementations of the same thing. Also, please don't introduce "magic wands" that work on anything remotely alike except when it doesn't.

This is one thing I hate most about how Clojure (what I do for a living) handles things, and what I cherish most about how Erlang handles things.

Like, lists and vectors in Clojure. Basically the same thing from an outside (user) perspective; except when you e.g. conj: with lists it prepends, with vectors it appends.
In Erlang in contrast, there is one thing for one purpose, and so there is one way of doing things, with absolutely predictable (and debuggable) results.

map, filter and friends work on lists/vectors, sets, maps (and whatnot, basically anything that can be traversed), except that those are really different things, and you may have to at least design the mapping/filtering/etc function according to what you think is the input; they are handled all the same as if they were vectors/lists (and nil is taken as "empty", which doesn't make it better); and the result is always a list, no matter what the input was (-> see lists, vectors, conj above), which you tend to forget.
In Erlang, if I want to do something with maps, I have to use the maps module or maps-specific syntax, which means I can be sure that what goes in and/or comes out is a map, not a list, not a tuple; and if I want to do something with lists, I have to use the lists module or lists-specific syntax, which means I can be sure that what goes in and/or comes out is a list.

The Clojure way of "pick your best fit of a variety of similar data structures" and "one function to work on every data structure" may seem nice and shiny at first. Until you get to debugging things, and all your logic is fine, but different things happen because the data running through it is of one type and not another, or got silently transformed from one type into another, one that can be (and is) processed, just with slightly different results.

juhlig · 2024-12-13T10:08:38Z

> > The advantages of using a universally recognized notation compared to adding yet another Erlang specific quirk should outweigh the risk of occasional typos, I think.
>
> Let's just say I really disagree on that point 😅

So do I, wholeheartedly! I think that fitting a new feature/syntax into a language should, above all, be guided by the specifics (or quirks, if you want) of the language it is introduced into, not by what it looks like in other languages that already have that feature.

What you will get by adopting the latter approach is a very messy mish-mash conglomerate of syntaxes borrowed from all over the place. "$THIS has to be done the ~~rubbish~~ Ruby-ish way; $THAT has to be done the Elixir-ish way; $OTHER_THING has to be done the Haskell-ish way; and there still is the Erlang heritage way for older things".

tl;dr, my 2ct: We are doing Erlang, so let's just keep doing things the Erlang way. There are enough things for newcomers to get used to, not only syntax, which is IMO a minor part. Learning a different but not too far off access syntax is just another small thing on the learner's side to overcome. Using a construct that does not fit into the overall language is a permanent sore once you're into the language.

Maria-12648430 · 2024-12-13T10:28:26Z

tl;dr, I totally agree with what @juhlig said.

To illustrate the point of "works on everything, except when it doesn't" in the current access syntax context, imagine the following:

M = #{a => b, 1 => c}.
L = [b, a].

M[1] and L[1] will both work, but access different things (the map key 1 in the former, the element at position 1 in the latter case), and yield different results
M[a] will work, but L[a] won't
M[2] won't work, but L[2] will

By using the same access syntax on different things, you never know what it is that you will get. To be sure, you have to use type checks (is_map, is_list, ...) beforehand and act differently, but then again you can usually just pattern match out the interesting thing while you're doing that.
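
The six cases above can be checked against today's functions. This is a sketch that assumes M[K] would behave like map_get(K, M) and that L[N] would behave like lists:nth(N, L); the second assumption is purely hypothetical, since the PR only covers maps. Module and helper names are invented.

```erlang
-module(same_index).
-export([demo/0]).

%% Run F and report either its value or the error reason.
attempt(F) ->
    try F() of
        V -> {ok, V}
    catch
        error:Reason -> {error, Reason}
    end.

demo() ->
    M = #{a => b, 1 => c},
    L = [b, a],
    #{m_1 => attempt(fun() -> map_get(1, M) end),    % value under key 1
      l_1 => attempt(fun() -> lists:nth(1, L) end),  % first element
      m_a => attempt(fun() -> map_get(a, M) end),
      l_a => attempt(fun() -> lists:nth(a, L) end),  % not a position
      m_2 => attempt(fun() -> map_get(2, M) end),    % no such key
      l_2 => attempt(fun() -> lists:nth(2, L) end)}.
```

Here m_1 is {ok, c} while l_1 is {ok, b}; m_a and l_2 succeed, m_2 fails with {badkey, 2}, and l_a fails with an error whose exact reason depends on the OTP version.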

richcarl · 2024-12-13T10:32:45Z

> The Clojure way of "pick your best fit of a variety of similar data structures" and "one function to work on every data structure" may seem nice and shiny at first. Until you get to debugging things, and all your logic is fine, but different things happen because the data running through it is of one type and not another, or got silently transformed from one type into another, one that can be (and is) processed, just with slightly different results.

I agree that this kind of situation is not what you want, and thanks for sharing your Clojure experiences. Like I said, debugging must work well. On the other hand there are situations where you do want to be able to run the same nontrivial piece of code on different implementations of a data type, and you do not want to maintain several copies of that code each tailored to use a separate implementation. Today, you can solve this using preprocessor macros, or passing around module names for dynamic calls, or by parse transforms or similar code generation approaches, and all those are mostly worse for debugging. It would be nice to have a general built-in approach with good support for tracing and debugging.

garazdawi · 2024-12-13T13:00:11Z

Protocols can be a source of great confusion, but they can also reduce the amount of code written and make things more performant. For example, being able to let the json module iterate over a gb_tree or an ets table, instead of having to provide a custom encoder function. Being able to look up an element in a list by an index, not so much, which I imagine is why you get this in Elixir if you try:

** (ArgumentError) the Access module does not support accessing lists by index, got: 1

Accessing a list by index is typically discouraged in Elixir, instead we prefer to use the Enum module to manipulate lists as a whole. If you really must access a list element by index, you can Enum.at/1 or the functions in the List module
    (elixir 1.17.3) lib/access.ex:347: Access.get/3
    iex:2: (file)

I doubt that we will ever get protocols in Erlang, but at times I think we really are missing out. The combination of protocols and structs in Elixir is very neat IMO.

michalmuskala · 2024-12-16T11:45:10Z

I agree that a map-key access syntax would be a good addition and it would allow simplifying quite a bit of code. I'm not convinced about the M[key] syntax.

The original map EEP proposed M#{key} as the access syntax, but it was never implemented.
I think one of the issues with that syntax is that it's easy to confuse with the update syntax.

I agree with the premise of this discussion about using [] for access as a clear "indexing" marker. I would, however, argue for keeping # as the existing "syntax marker" for maps. We'd end up with a M#[key] syntax in this case.

Polymorphism is a tool with advantages and disadvantages - traditionally Erlang had fairly little polymorphism, and in places where it had it - notably size/1 BIF, it was later effectively deprecated and is a performance pitfall today. I'm not sure changing the situation just in this one place makes sense - if indeed Erlang should be extended with some form of polymorphism, it should be more generic than just access syntax.
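
As a small illustration of the size/1 point, the sketch below contrasts the polymorphic BIF with its type-specific counterparts. The module and function names are hypothetical.

```erlang
-module(size_demo).
-export([poly/1, mono/1]).

%% size/1 accepts both tuples and binaries, so the call site does not
%% say which one is expected.
poly(T) ->
    size(T).

%% The type-specific BIFs make the expected type explicit.
mono(T) when is_tuple(T) -> tuple_size(T);
mono(B) when is_binary(B) -> byte_size(B).
```

Both functions return 3 for {a, b, c} and 2 for <<1, 2>>; only the second spelling tells the reader (and the compiler) which type each clause handles.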

bjorng added the team:LG (Assigned to OTP language group) label Dec 11, 2024

richcarl force-pushed the index-expressions branch from 32cbe09 to f15f68b, December 28, 2024 14:39

Add m[x] notation for indexing

f47e1ec

richcarl force-pushed the index-expressions branch from f15f68b to f47e1ec, December 28, 2024 14:43
