Add hashing for verifying correct input of code #72
Comments
There are actually a fair number of cases where you might/will type in only parts of the code: Treap, FastSubsetTransform (that one's weird), euclid, chinese, 2sat, TreePower, HLD (on the chopping block), sideOf, Angle, KMP, SuffixTree, Hashing, AhoCorasick, IntervalContainer. And in several more I can imagine that the 100->50 line reduction is handy. So if we could come up with some slick UI for indicating sections I'd be all for it. I agree with your comment about ambiguity, though, and I think we can start simple.
Just a note: I updated the hash script in our book to include the
I think knowing you have a mistype in 50 vs 100 lines of code is actually linearly (~2x) better for finding the bug, which amounts to maybe 5 minutes of time (and feeling a lot happier).
Also, I'll note that we would've hashed sections in more files if we used them more/weren't too lazy to add the annotations. Honestly, we mostly used kactl for the stuff we added (which we broke into sections) and the geometry (which is short to begin with).
Thanks for the note, I've made that change: dcdc34a (note also the golfed vimrc:
Hi, I want to propose an idea for "partial hashes", communicated to me by https://codeforces.com/profile/camc. Let's say you want a struct:
You can split it up like:
... etc. Now each member function is in its own file, and thus has its own hash. Furthermore, you type exactly what you need: if you only need the lca function, you type only that, verify its hash, then copy it into the struct. If you need lca, dist, and inSubtree, you type all three, verify all their hashes, then copy them into the struct. Furthermore, the include statements tell you exactly where to put the member functions.
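To make the layout concrete, here is a hypothetical sketch of such a struct. The struct name `Tree`, the file names in the comments, and the deliberately naive O(depth) `lca` are all assumptions for illustration, not KACTL's code; only the function names lca, dist, and inSubtree come from the comment above. The marked comments show where each member function would be replaced by an include of its own separately hashed file.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical struct: in the split layout, each marked member function
// would live in its own file (with its own hash) and be pulled in by #include.
struct Tree {
    std::vector<int> par, depth;  // parent and depth per node; root 0 has par -1

    // Assumes nodes are numbered so that parent[v] < v for every v > 0.
    explicit Tree(const std::vector<int>& parent)
        : par(parent), depth(parent.size(), 0) {
        for (std::size_t v = 1; v < par.size(); v++) {
            assert(0 <= par[v] && par[v] < (int)v);
            depth[v] = depth[par[v]] + 1;
        }
    }

    // --- would be: #include "lca.h" (hypothetical file name) ---
    // Naive LCA: repeatedly lift the deeper node until both meet.
    int lca(int a, int b) const {
        while (a != b) {
            if (depth[a] < depth[b]) std::swap(a, b);
            a = par[a];
        }
        return a;
    }

    // --- would be: #include "dist.h" (needs lca) ---
    // Number of edges on the path between a and b.
    int dist(int a, int b) const {
        return depth[a] + depth[b] - 2 * depth[lca(a, b)];
    }

    // --- would be: #include "inSubtree.h" (needs lca) ---
    // True if v lies in the subtree rooted at r.
    bool inSubtree(int r, int v) const { return lca(r, v) == r; }
};
```

With this layout, someone who only needs `dist` would type the `lca` and `dist` files, check both hashes, and paste them at the marked positions.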
Now you don't want to force the user to type those include statements, so for me, when I generate the
Furthermore, if you use something like the expander script for Codeforces rounds, where you can copy-paste, this method should still work.
For example, you can split apart the Fenwick tree lower bound (https://github.com/kth-competitive-programming/kactl/blob/main/content/data-structures/FenwickTree.h#L24), as you rarely need that function. Another example is https://github.com/kth-competitive-programming/kactl/blob/main/content/graph/CompressTree.h#L18 where you pass in
See #63 (comment)
I don't think that hashing sections is worth it. MIT does section hashing in 8 snippets: LCT, LinearRecurrence, Simplex.h, Polynomial, CycleCounting, GraphDominator, and both suffix arrays.
I would split that into:
"Should be split into different sections": Polynomial, CycleCounting
"Trying to avoid hashing the typedef": LinearRecurrence
"Has parts that you don't always want": both suffix arrays (i.e., you don't always need LCP)
"Not sure": LCT, Simplex, and GraphDominator (I don't know enough about the algorithms to understand whether you pretty much always want all functions)
That's a maximum of 4 snippets where it might be advantageous to have section-wise hashing.
The other argument for hashing sections is that if the hash fails, then you need to look at less of your code. I haven't done many offline contests with a TCR, but from my experience, knowing that you have a mistype in 50 lines of code is only marginally better than knowing you have a mistype in 100 lines of code. Both of these are massively better than not knowing whether you have a mistype or a logic error.
If we were to hash by section, I would propose having some kind of lightweight syntax (like a `//<--`) to demarcate sections, and then putting the hashes (truncated to 5 characters) in the header. Like so:
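A rough sketch of how such section-wise hashing could work (this is not the header format proposed above, and not KACTL's actual hash script). It assumes a section ends at a line containing only the `//<--` marker, ignores all whitespace, and uses 64-bit FNV-1a purely as a stand-in hash; a real setup would more likely pipe the text through something like md5sum. The function names `shortHash` and `sectionHashes` are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Stand-in hash: 64-bit FNV-1a over all non-whitespace characters,
// returned as the first `digits` hex characters (5 by default).
std::string shortHash(const std::string& text, int digits = 5) {
    assert(0 <= digits && digits <= 16);
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
    for (unsigned char c : text) {
        if (std::isspace(c)) continue;  // formatting must not affect the hash
        h ^= c;
        h *= 0x100000001b3ULL;  // FNV-1a prime
    }
    char buf[17];
    std::snprintf(buf, sizeof buf, "%016llx", (unsigned long long)h);
    return std::string(buf, buf + digits);
}

// Splits `source` at lines that contain only the marker and returns
// one short hash per section, in order of appearance.
std::vector<std::string> sectionHashes(const std::string& source,
                                       const std::string& marker = "//<--") {
    std::vector<std::string> hashes;
    std::istringstream in(source);
    std::string line, section;
    while (std::getline(in, line)) {
        // Trim surrounding blanks so an indented marker still counts.
        std::size_t b = line.find_first_not_of(" \t\r");
        std::size_t e = line.find_last_not_of(" \t\r");
        std::string trimmed =
            (b == std::string::npos) ? "" : line.substr(b, e - b + 1);
        if (trimmed == marker) {
            hashes.push_back(shortHash(section));  // close current section
            section.clear();
        } else {
            section += line;
            section += '\n';
        }
    }
    hashes.push_back(shortHash(section));  // final section after last marker
    return hashes;
}
```

The marker lines themselves are excluded from the hashed text, so a section typed on its own (without the marker) still produces the same hash as it does inside the full file.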
Another question with hashing is how we deal with things like typedefs, especially if they're typedefs that are likely to be typed multiple times (for example, `typedef vector<ll> Poly`). I think it's not too big of a deal; I would suggest just getting used to typing them in for the purpose of hashing. My biggest problem with avoiding them automatically is ambiguity about what the hashes represent. "We hash everything that's printed" is obvious. "We hash everything after the typedefs" is less obvious.