Hotfix 533 xml download yesterday misses gen url #534

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/
 ### Changed
 - Change License identifier for pypi [#525](https://github.com/OpenEnergyPlatform/open-MaStR/pull/525)
 - Change header to identify as open-mastr during http request [#526](https://github.com/OpenEnergyPlatform/open-MaStR/pull/526)
+- Fixed missing call to gen_url in case first bulk download fails as xml file for today is not yet available [#534](https://github.com/OpenEnergyPlatform/open-MaStR/pull/534)
 ### Removed

 ## [v0.14.3] Fix Pypi Release - 2024-04-24
26 changes: 17 additions & 9 deletions open_mastr/xml_download/utils_download_bulk.py
@@ -12,7 +12,9 @@
 from open_mastr.utils.config import setup_logger

 try:
-    USER_AGENT = f"open-mastr/{version('open-mastr')} python-requests/{version('requests')}"
+    USER_AGENT = (
+        f"open-mastr/{version('open-mastr')} python-requests/{version('requests')}"
+    )
 except PackageNotFoundError:
     USER_AGENT = "open-mastr"
 log = setup_logger()
@@ -58,7 +60,8 @@ def gen_version(when: time.struct_time = time.localtime()) -> str:
     # only the last two digits of the year are used
     year = str(year)[-2:]

-    return f'{year}.{release}'
+    return f"{year}.{release}"
+

 def gen_url(when: time.struct_time = time.localtime()) -> str:
     """
@@ -77,7 +80,7 @@ def gen_url(when: time.struct_time = time.localtime()) -> str:
     version = gen_version(when)
     date = time.strftime("%Y%m%d", when)

-    return f'https://download.marktstammdatenregister.de/Gesamtdatenexport_{date}_{version}.zip'
+    return f"https://download.marktstammdatenregister.de/Gesamtdatenexport_{date}_{version}.zip"


 def download_xml_Mastr(
@@ -125,18 +128,23 @@ def download_xml_Mastr(
     time_a = time.perf_counter()
     r = requests.get(url, stream=True, headers={"User-Agent": USER_AGENT})
     if r.status_code == 404:
-        # presumably todays download is not ready yet, retry with yesterdays date
-        log.warning("Download file was not found. Assuming that the new file was not published yet and retrying with yesterday.")
-        now = time.localtime(time.mktime(now) - (24 * 60 * 60))  # subtract 1 day from the date
+        log.warning(
+            "Download file was not found. Assuming that the new file was not published yet and retrying with yesterday."
+        )
+        now = time.localtime(
+            time.mktime(now) - (24 * 60 * 60)
+        )  # subtract 1 day from the date
+        url = gen_url(now)
         r = requests.get(url, stream=True, headers={"User-Agent": USER_AGENT})
     if r.status_code == 404:
         log.error("Could not download file: download URL not found")
         return

     total_length = int(18000 * 1024 * 1024)
-    with open(save_path, "wb") as zfile, tqdm(
-        desc=save_path, total=(total_length / 1024 / 1024), unit=""
-    ) as bar:
+    with (
+        open(save_path, "wb") as zfile,
+        tqdm(desc=save_path, total=(total_length / 1024 / 1024), unit="") as bar,
+    ):
         for chunk in r.iter_content(chunk_size=1024 * 1024):
             # chunk size of 1024 * 1024 needs 9min 11 sek = 551sek
             # chunk size of 1024 needs 9min 11 sek as well
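
The retry behaviour this hunk fixes can be illustrated in isolation. The following is a minimal sketch, not the project's implementation: `build_url` is a hypothetical, simplified stand-in for `gen_url` (it omits the version suffix from `gen_version` and uses a placeholder host), and `fetch_status` is a hypothetical callable that replaces `requests.get` so the flow can be exercised without network access. The point it shows is that the URL has to be regenerated after the date is moved back one day. Without that step the second request would hit the same URL again.

```python
import time
from typing import Callable, Optional

ONE_DAY_SECONDS = 24 * 60 * 60


def build_url(when: time.struct_time) -> str:
    # Hypothetical, simplified stand-in for gen_url: placeholder host,
    # and no version suffix.
    return f"https://example.invalid/export_{time.strftime('%Y%m%d', when)}.zip"


def resolve_download_url(
    fetch_status: Callable[[str], int],
    now: Optional[time.struct_time] = None,
) -> Optional[str]:
    """Return today's URL, or yesterday's if today's answers 404.

    Returns None when both attempts answer 404.
    """
    if now is None:
        now = time.localtime()
    url = build_url(now)
    status = fetch_status(url)
    if status == 404:
        # Move the date back by one day, as the diff does.
        now = time.localtime(time.mktime(now) - ONE_DAY_SECONDS)
        # Regenerate the URL for the new date (the step this PR adds).
        url = build_url(now)
        status = fetch_status(url)
    if status == 404:
        return None
    return url
```

A caller could pass something like `lambda u: requests.head(u).status_code` as `fetch_status`. Whether the server answers `HEAD` requests in the same way as `GET` is an assumption that would need checking.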