Update the measurements creation to use parquet/dask rather than arrow/vaex files #800

Merged on Jan 29, 2025 (66 commits)
Commits
3957434
Switched to output parquet file by default
ddobie Jan 22, 2025
5dee5f9
Remove timezone stuff
ddobie Jan 22, 2025
f47ba4a
Dependencies
ddobie Jan 22, 2025
f425524
Updated commands file
ddobie Jan 22, 2025
a49ae2b
Updated pipeline.utils.py
ddobie Jan 22, 2025
5697ad1
Updated forms.py
ddobie Jan 22, 2025
c2be6eb
Updated views.py
ddobie Jan 22, 2025
fd24ade
Updated config_template.yaml.j2
ddobie Jan 22, 2025
d344040
Updated templates/run_detail.html
ddobie Jan 22, 2025
613221f
More updates
ddobie Jan 22, 2025
a26c043
Fix run_detail template
ddobie Jan 22, 2025
6254837
Correctly handle overwrite check - check exists rather than isfile
ddobie Jan 23, 2025
6016e42
Check it for meas too
ddobie Jan 23, 2025
25f7e12
Correctly handle directory deletion
ddobie Jan 23, 2025
afaac16
Committed missing file?
ddobie Jan 23, 2025
9f233d8
Resolve merge conflicts
ddobie Jan 23, 2025
16faba0
Resolve merge conflicts
ddobie Jan 23, 2025
7c88fc3
Updated genparquet.md - still need to update a lot of the associated …
ddobie Jan 23, 2025
1c3e1aa
Done docs/using and docs/adminusage
ddobie Jan 23, 2025
891f9cb
Updated outputs docs
ddobie Jan 23, 2025
9c8a4c4
updated outputs docs
ddobie Jan 23, 2025
14bc5c3
Removed final vaex references
ddobie Jan 23, 2025
3e3746a
Fixed typo
ddobie Jan 23, 2025
664ef65
Merge branch 'v2.0' into v2-measurements-creation
ddobie Jan 24, 2025
c85fe0a
Update dependencies to remove vaex
ddobie Jan 24, 2025
1b970c3
Add initial batch of updated screenshots
ddobie Jan 24, 2025
4000761
Renamed docs/imgs arrow->parquet
ddobie Jan 24, 2025
b7a0ec3
Updated run_detail page
ddobie Jan 24, 2025
6eda109
Add files via upload
ddobie Jan 24, 2025
95c0772
Replaced parquet-modal
ddobie Jan 24, 2025
069f615
Updated Apache Parquet link
ddobie Jan 24, 2025
54fada1
Add files via upload
ddobie Jan 24, 2025
91649e4
Temp fix
ddobie Jan 24, 2025
0774cee
Added notes
ddobie Jan 24, 2025
05b55b2
Merge branch 'v2-measurements-creation' of github.com:askap-vast/vast…
ddobie Jan 24, 2025
c7168d6
Scrap pairs parquet generation
ddobie Jan 24, 2025
1d3c73d
Updated docs to reflect generating a single parquet file vs measureme…
ddobie Jan 28, 2025
3d08182
Updated webpage templates
ddobie Jan 28, 2025
4a9c756
First pass update of webform options
ddobie Jan 28, 2025
212772f
Updated run_detail.html
ddobie Jan 28, 2025
f83c23f
Fixed typo in logging
ddobie Jan 28, 2025
e78aec8
Fixed measurements parquet existence check
ddobie Jan 28, 2025
1fe50f2
Fixed parquet removal and pipeline config variable name
ddobie Jan 28, 2025
106ed89
write_parquet_files -> write_measurements_parquet in docs
ddobie Jan 28, 2025
d1692fa
Fix naming
ddobie Jan 28, 2025
af3a483
Update screenshots
ddobie Jan 28, 2025
9dc40b7
Merge branch 'v2-measurements-creation' of github.com:askap-vast/vast…
ddobie Jan 28, 2025
893ac3c
Update screenshot names
ddobie Jan 28, 2025
274ad12
Reorganisation
ddobie Jan 28, 2025
281cb07
Maybe commit uncommitted changes?
ddobie Jan 28, 2025
535382f
Remove unused import
ddobie Jan 28, 2025
368c049
Missed commit?
ddobie Jan 28, 2025
3637d53
Update docs/using/genparquet.md
ddobie Jan 29, 2025
51f645b
Update docs/using/runconfig.md
ddobie Jan 29, 2025
fcd6a6b
Added delete_file_or_dir function to utils
ddobie Jan 29, 2025
094f208
Implemented delete_file_or_dir
ddobie Jan 29, 2025
94ccf5d
Added missing import
ddobie Jan 29, 2025
e7c52ae
Remove arrow backup
ddobie Jan 29, 2025
5a239df
Implement copy_file_or_dir
ddobie Jan 29, 2025
c54ca92
Fix backup_parquets
ddobie Jan 29, 2025
4066850
Update variable names in backup_parquets
ddobie Jan 29, 2025
48f20dd
Added delete_file_or_dir import to pipeline.utils.py
ddobie Jan 29, 2025
6dbc6e0
Added missing shutils import
ddobie Jan 29, 2025
7667227
Fixed final os.remove
ddobie Jan 29, 2025
6984081
Fix deprecated dask config
ddobie Jan 29, 2025
cb74bf1
Remove unused shutil import - stupid linter
ddobie Jan 29, 2025
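
Several commits above concern checking for, deleting and copying outputs that may be either a single file or a directory ("check exists rather than isfile", `delete_file_or_dir`, `copy_file_or_dir`). dask writes a parquet dataset as a directory of part files, so file-only checks such as `os.path.isfile` miss it. The following is a minimal sketch of that idea, not the code from this PR: the function names are borrowed from the commit messages, but the signatures, bodies and example file names are assumptions.

```python
import os
import shutil
import tempfile


def delete_file_or_dir(path: str) -> None:
    """Delete `path` whether it is a file, a symlink or a directory tree.

    Hypothetical implementation; the PR's real helper may differ.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)   # directory-style parquet dataset
    elif os.path.lexists(path):
        os.remove(path)       # single file (or a symlink)


def copy_file_or_dir(src: str, dst: str) -> None:
    """Copy `src` to `dst`, recursing if `src` is a directory (assumed behaviour)."""
    if os.path.isdir(src):
        shutil.copytree(src, dst)
    else:
        shutil.copy2(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    # A directory-style dataset and a single-file output (example names only).
    dataset = os.path.join(tmp, "measurements.parquet")
    os.mkdir(dataset)
    with open(os.path.join(dataset, "part.0.parquet"), "wb"):
        pass
    single = os.path.join(tmp, "sources.parquet")
    with open(single, "wb"):
        pass

    # isfile() does not see the directory dataset, exists() does.
    seen_by_isfile = os.path.isfile(dataset)
    seen_by_exists = os.path.exists(dataset)

    # Back up the dataset, then remove both originals.
    backup = dataset + ".bak"
    copy_file_or_dir(dataset, backup)
    backup_has_part = os.path.isfile(os.path.join(backup, "part.0.parquet"))

    delete_file_or_dir(dataset)
    delete_file_or_dir(single)
    remaining = sorted(os.listdir(tmp))
```

Branching on `os.path.isdir` keeps callers from needing to know whether a given output was written as one file or as a directory.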
Remove timezone stuff
ddobie committed Jan 22, 2025
commit 5dee5f947a5ed4d81344a1651d2d52a6fa5b14ee
3 changes: 0 additions & 3 deletions vast_pipeline/pipeline/utils.py
@@ -1349,9 +1349,6 @@ def _process_measurements_file(m_file: str,
 measurements['id'].isin(associations_merge.index)
 ]

-# drop timezone from datetime for vaex compatibility. V2 NOTE - remove
-measurements['time'] = measurements['time'].dt.tz_localize(None)
-
 measurements = optimise_numeric(measurements)
 measurements = measurements.merge(associations_merge, right_index=True, left_on='id', how="inner").rename(columns={'source_id': 'source'})
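
The lines removed in this commit stripped the timezone from the `time` column, which the deleted comment attributes to vaex compatibility. As a small illustration of what `dt.tz_localize(None)` does, here is a sketch on toy data (it assumes pandas is installed; the timestamps are made up and are not pipeline data):

```python
import pandas as pd

# Two timezone-aware (UTC) timestamps standing in for a 'time' column.
times = pd.Series(
    pd.to_datetime(["2025-01-22T03:00:00Z", "2025-01-22T04:30:00Z"])
)

# The removed pipeline line did the equivalent of this: drop the timezone
# while keeping the wall-clock values unchanged.
naive = times.dt.tz_localize(None)

aware_has_tz = times.dt.tz is not None
naive_has_tz = naive.dt.tz is not None
first_naive = naive.iloc[0]
```

With the call removed, the `time` column keeps whatever timezone information it already carries when it reaches the later steps.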