Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wl clean validate #5
base: main
Are you sure you want to change the base?
Wl clean validate #5
Changes from 11 commits
d42e94d
eb86505
fee3c78
05d548d
96d1507
f406da0
f989c6b
3a7ddf3
8148e3f
d0cb2ff
915bef5
1c0b396
d7d5235
93791e7
ccc7d5e
956688e
77bdc92
9be3fad
3f2d183
cd71ea2
124e09d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be the filename, not the path. What if the filename changes? Then this script would break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
point to the file for names file, not just the directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, its just the comment that is wrong, not the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I see something this long, and with values like this hard-coded into a script I start to think how we can do this better... One thing is that the range of possible values is similar across most measures, whether it's the mean or max (which makes sense). And even for some se's.... Another is the repeat of all the Falses. Finally, if we wanted to update these values, editing the script feels wrong... I feel like a config file or a data file might be best?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the other expectations have similar large amounts of data or is it just this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function does too much. A function should do just one thing. I would move the open command to the script (so read in the whole file) and then just have the regular expressions as what is modularized to the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not "tests" per say... I would refer to them as "input validation", "input validation checks" or "defensive programming" in general. And here, probably "input validation checks". This comment applies to all the comments you have named this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fairly certain that
open
returnsFileNotFoundError: [Errno 2] No such file or directory: 'filename.txt'
when the file doesn't exist, so we don't need this, nor the test for this? I think we have to be careful to not over-test for this "demonstration" of what good code should look like.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to be so strict here. Any file name could be fine, it's more important that it exists and that content that matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic numbers here are a brittle and confusing to others that won't know where they come from. This is the number of columns in
raw_data
, right, and so you can get this number fromraw_data.shape[1]
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't have this in your function, will Python's
.to_csv
give you aFileNotFoundError
or some other path error anyways? If so, we should remove this as it has already been handled byos.path.exists
.