Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug/skip date with output #314

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

TheFrok
Copy link
Collaborator

@TheFrok TheFrok commented Apr 6, 2020

I reopened the PR because it was stuck in an odd state.

The changes are to the part that searches for existing output files.

--
From the description of the original PR (#313):

  • Fixed an error in the function that searches for dates without output
  • added skip as an option to the configuration
  • added the correct regex for file names to the example config and set it as a default value
  • removed unnecessary fields from the minimal config example

 - added skip as an option to the configuration
 - added the correct regex for file names to the example config and set it as a default value
 - removed unnecessary fields from the minimal config example
 - added skip as an option to the configuration
 - added the correct regex for file names to the example config and set it as a default value
 - removed unnecessary fields from the minimal config example
@TheFrok TheFrok requested review from OrBin and cjer April 6, 2020 11:20
},
"output_file_name_regexp": "^(?P<date_str>[^_]+?)_(?P<type>\\w+)",
"output_file_type": "csv.gz"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove output_file_name_regexp? If you changed it to be optional, specify it in the schema

@@ -7,7 +7,7 @@
"filtered_feeds": "filtered_feeds",
"logs": "logs"
},
"output_file_name_regexp": "^(?P<date_str>[^_]+?)_(?P<type>\\w+)",
"output_file_name_regexp": "^(?P<type>\\w+)_(?P<date_str>[^_]+?)",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure about this change? Did you consult with @cjer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you can see here, there is no option to change the format of the filenames:

date_str = date.strftime('%Y-%m-%')
trip_stats_output_path = join(output_folder, f'trip_stats_{date_str}.{output_file_type}')
route_stats_output_path = join(output_folder, f'route_stats_{date_str}.{output_file_type}')

@@ -28,6 +28,12 @@
"type": "boolean",
"default": false
},
"skip_date_with_output": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

date or dates?


# validate that the regex used the correct group names
faulty_group_names = False
if "type" in regexp.groupindex and "date_str" in regexp.groupindex:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the condition be negated?

file_type = (parse_conf_date_format(date_str), stats_type)
file_type = _parse_file_name_regex_match(match, faulty_group_names)
if file_type is None:
# return empty list if there was an error in one of the files
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd guess we'd like to return only the found files. Why would we give up on all of the files if one failed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that it is indicating of something weird going on, but maybe you are right.
I can just log that and continue


# validate that the regex used the correct group names
faulty_group_names = False
if "type" in regexp.groupindex and "date_str" in regexp.groupindex:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't you like to put it inside _parse_file_name_regex_match?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that but than the log message would log for every file, although they are all using the same regex.
I thought creating a global variable just for that is a bit disgusting.

@TheFrok
Copy link
Collaborator Author

TheFrok commented Jun 2, 2020

Does anyone knows why github thinks the travis build are still running?
@cjer

@OrBin
Copy link
Collaborator

OrBin commented Jun 3, 2020

Does anyone knows why github thinks the travis build are still running?
@cjer

Just a bug in GitHub, I guess. An administrator can merge even without the checks passing, so don't worry.

@cjer
Copy link
Member

cjer commented Oct 22, 2020

@TheFrok @OrBin Can we merge this?

@TheFrok
Copy link
Collaborator Author

TheFrok commented Oct 26, 2020

I think we can merge.
The only thing is the log discussion, which is kind of a taste thing that I'm not sure about

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants