Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Directly return matched content instead of explicitely chopping off backticks #2

Open
dennib opened this issue Nov 27, 2017 · 9 comments

Comments

@dennib
Copy link

dennib commented Nov 27, 2017

https://github.com/dmg01/snippet-bot/blob/7857918d8e390ca9a06294e5dd566626f3037161/bot.js#L112-L119

I think what you need on line 118 is something like

var matches = regex.exec(string)
return (matches[1] || [])

I'm not sure, it's a bit late here, maybe I'll test this tomorrow after some sleep :P

@MegaLoler
Copy link
Collaborator

 function getCodeBlocks(string) 
 { 
 	const regex = /\`\`\`([a-z]*[\s\S]*?)\`\`\`/g;
        var matches = regex.exec(string);
        return matches == null ? [] : matches[1];
 } 

trying this variation on your idea (accounting for no match, where matches is set to null), it works for me, but it only extracts the first match, so if there are multiple code blocks, it ignores all but the first haha

@dennib
Copy link
Author

dennib commented Nov 28, 2017

Oh I see you're right, I explicitly indicated matches[1] as value to return (that is only the first match), I misread the behavior of the function.

Anyway, given n matches you should have an array (matches) of length n + 1 where matches[0] is the entire string (I think).
So in case we have at least 1 match, we could do something with matches array cutting off the first element, so something like:

return matches == null ? [] : matches.shift();

Does this make sense?

@MegaLoler
Copy link
Collaborator

it does make sense! although matches for me seems to look differently
given the input string "123 ```456``` abc ```def```", the result of regex.exec is this:

[ '```456```',
  '456',
  index: 4,
  input: '123 ```456``` abc ```def```' ]

@dennib
Copy link
Author

dennib commented Nov 28, 2017

Right, I took a better look at the result of exec (More here)
It seems like we have to chop off not only the first array element but also the last two (index and input properties), so i think we could try something like this:

Instead of
return matches == null ? [] : matches.shift();
somenthing like
return matches == null ? [] : matches.slice(1,matches.length - 2);

Is this better?

@MegaLoler
Copy link
Collaborator

apparently the key,value pairs don't contribute to the length of the object, and matches.length is actually just 2 xD (so my result from your suggestion there is always an empty array)

aside from that though, it doesn't seem to capture the second match at all, like the above example should have had 'def' included in the matches as well, which is weird, because the docs totally say it should be an array of all the matches

is something amiss with the regex? is there perhaps a regex flag we need to make it process all of them? (although i thought that's what g was supposed to do)

@MegaLoler
Copy link
Collaborator

oh, i think i get it better, matches looks like its supposed to return an array corresponding to different groups in the regex, so since we only have one pair of () there's only one match

but it looks like regex.exec can be called iteratively to get successive matches, i'll try that out!

@MegaLoler
Copy link
Collaborator

MegaLoler commented Nov 28, 2017

the original code i had which used match captured all the occurrences successfully, but it did have the backticks in the result

i think its probably easy (for me at least) to confuse occurrences of the regex in the string and matches in ():

  • exec seems to return an array of matches in () (so we only get 1 match because we only have one pair of () in our regex)
  • and match seems to return an array of occurrences of the regex in the string

but exec can also iteratively return results for each occurrence of the regex when called iteratively on the same RegEx object, which i got working like this:

function getCodeBlocks(string)
{       
        const regex = /\`\`\`([a-z]*[\s\S]*?)\`\`\`/g;
        const result = [];
        var matches; 
        while((matches = regex.exec(string)) !== null) result.push(matches[1]);
        return result;
}

which feels a little out of the way to me personally, but maybe i'm just trying to overly minimize the code xD it works fine anyways

i wonder which one is ultimately nicer, the original code with match and slicing off backticks, or this iterative one with exec

@dennib
Copy link
Author

dennib commented Nov 28, 2017

Ohh, I get it now! Nice!
Seems strange to me too, but I'm glad it works now anyway :P

@MegaLoler
Copy link
Collaborator

but yeah, thanks for talking through it with me xD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants