Roadworks cannot be imported into PostGIS #306

Open
richardforsterNVBW opened this issue Jan 28, 2025 · 3 comments · May be fixed by #307

@richardforsterNVBW
Contributor

richardforsterNVBW commented Jan 28, 2025

After today's production update, the import of roadworks_geojson.json into PostGIS failed. As a result, the database and the visualization in GeoServer are empty.
The logs show the following geometry type mismatch for a point geometry:

2025-01-28 12:05:27.552 UTC [925981] ERROR:  Geometry type (Point) does not match column type (LineString)
2025-01-28 12:05:27.552 UTC [925981] CONTEXT:  COPY roadworks, line 210, column geometry: "0101000020E6100000020CCB9F6F3F48400F7BA180EDB02340"
2025-01-28 12:05:27.552 UTC [925981] STATEMENT:  COPY "public"."roadworks" ("index", "geometry", "id", "type", "subtype", "starttime", "endtime", "description", "reference", "street", "direction") FROM STDIN WITH CSV
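
The hex string in the CONTEXT line is the geometry in PostGIS's EWKB encoding, so it can be decoded to confirm what the rejected row contains. The following is a minimal sketch using only the standard library; the helper name `describe_ewkb_hex` is made up for illustration, and it only reads the header plus the coordinates of a plain 2D point.

```python
import struct


def describe_ewkb_hex(hex_str):
    """Decode the header of an EWKB hex string (coordinates for 2D points only)."""
    raw = bytes.fromhex(hex_str)
    # Byte 0 is the byte-order flag: 1 = little endian, 0 = big endian.
    endian = "<" if raw[0] == 1 else ">"
    (type_word,) = struct.unpack_from(endian + "I", raw, 1)
    has_srid = bool(type_word & 0x20000000)  # EWKB flag: an SRID follows the type
    base_type = type_word & 0xFF  # 1 = Point, 2 = LineString, 3 = Polygon
    offset = 5
    srid = None
    if has_srid:
        (srid,) = struct.unpack_from(endian + "I", raw, offset)
        offset += 4
    names = {1: "Point", 2: "LineString", 3: "Polygon"}
    info = {"type": names.get(base_type, f"type {base_type}"), "srid": srid}
    # Only read coordinates for a point without Z/M dimensions.
    if base_type == 1 and not (type_word & 0xC0000000):
        info["coords"] = struct.unpack_from(endian + "dd", raw, offset)
    return info


# Hex value copied from the CONTEXT line of the PostgreSQL log above.
info = describe_ewkb_hex("0101000020E6100000020CCB9F6F3F48400F7BA180EDB02340")
print(info["type"], info["srid"])  # prints: Point 4326
```

This matches the error message: line 210 of the COPY input carries a Point, while the column only accepts LineString.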

This is the error displayed in Dagster:

dagster._core.errors.DagsterExecutionHandleOutputError: Error occurred while handling output "result" of step "roadworks":
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_plan.py", line 245, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 506, in core_dagster_event_sequence_for_step
    for evt in _type_check_and_store_output(step_context, user_event):
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 553, in _type_check_and_store_output
    for evt in _store_output(step_context, step_output_handle, output):
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 758, in _store_output
    for elt in iterate_with_context(
  File "/usr/local/lib/python3.11/site-packages/dagster/_utils/__init__.py", line 480, in iterate_with_context
    with context_fn():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/utils.py", line 84, in op_execution_error_boundary
    raise error_cls(

The above exception was caused by the following exception:
psycopg2.errors.InvalidParameterValue: Geometry type (Point) does not match column type (LineString)
CONTEXT:  COPY roadworks, line 210, column geometry: "0101000020E6100000020CCB9F6F3F48400F7BA180EDB02340"
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/utils.py", line 54, in op_execution_error_boundary
    yield
  File "/usr/local/lib/python3.11/site-packages/dagster/_utils/__init__.py", line 482, in iterate_with_context
    next_output = next(iterator)
                  ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 748, in _gen_fn
    gen_output = output_manager.handle_output(output_context, output.value)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dagster/app/pipeline/resources/postgis_geopandas_io_manager.py", line 239, in handle_output
    obj.to_postgis(
  File "/usr/local/lib/python3.11/site-packages/geopandas/geodataframe.py", line 2062, in to_postgis
    geopandas.io.sql._write_postgis(
  File "/usr/local/lib/python3.11/site-packages/geopandas/io/sql.py", line 461, in _write_postgis
    gdf.to_sql(
  File "/usr/local/lib/python3.11/site-packages/pandas/util/_decorators.py", line 333, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pandas/core/generic.py", line 3087, in to_sql
    return sql.to_sql(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pandas/io/sql.py", line 842, in to_sql
    return pandas_sql.to_sql(
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pandas/io/sql.py", line 2018, in to_sql
    total_inserted = sql_engine.insert_records(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pandas/io/sql.py", line 1558, in insert_records
    return table.insert(chunksize=chunksize, method=method)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pandas/io/sql.py", line 1119, in insert
    num_inserted = exec_insert(conn, keys, chunk_iter)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/geopandas/io/sql.py", line 348, in _psql_insert_copy
    cur.copy_expert(sql=sql, file=s_buf)

On test, the import worked without problems. Any idea what the reason might be?

I noticed that roadworks_geojson.json on production differs from roadworks_geojson.json on test: test contains one line geometry that is missing from the production GeoJSON. Its ID is 32066446-32066447-32066450-32066451-sperrung.
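
A comparison like this can be scripted by collecting the feature IDs of both files and diffing the sets. The sketch below is only an illustration: the function name `feature_ids` is hypothetical, the two inline collections are made-up stand-ins for the real files (the coordinates and the ID `example-parking-closure` are placeholders), and it assumes the ID sits either at the feature's top level or in `properties["id"]`.

```python
import json


def feature_ids(geojson_text):
    """Return the set of feature IDs found in a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    ids = set()
    for feature in collection.get("features", []):
        # Assumption: the ID is stored as top-level "id" or as properties["id"].
        properties = feature.get("properties") or {}
        ids.add(feature.get("id", properties.get("id")))
    return ids


# Made-up stand-ins for the test and production files.
test_geojson = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"id": "32066446-32066447-32066450-32066451-sperrung"},
         "geometry": {"type": "LineString", "coordinates": [[9.0, 48.0], [9.1, 48.1]]}},
    ],
})
prod_geojson = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"id": "example-parking-closure"},
         "geometry": {"type": "Point", "coordinates": [9.0, 48.0]}},
    ],
})

only_in_test = feature_ids(test_geojson) - feature_ids(prod_geojson)
only_in_prod = feature_ids(prod_geojson) - feature_ids(test_geojson)
print(sorted(only_in_test), sorted(only_in_prod))
```

With the real files, the two texts would be read from disk instead of being built inline.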

@hbruch
Collaborator

hbruch commented Jan 28, 2025

The reason is that the GeoJSON contains road closures for parking facilities, which are provided as point geometries and therefore cannot be imported into the LineString geometry column.
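
One way to guard against this kind of mismatch is to split the features by geometry type before loading them. The sketch below shows that idea on a plain GeoJSON dictionary. It is not necessarily what the linked pull request does; the function name `split_by_geometry_type` and the sample features are made up for illustration.

```python
def split_by_geometry_type(collection, wanted="LineString"):
    """Partition GeoJSON features into those matching `wanted` and the rest."""
    matching, other = [], []
    for feature in collection.get("features", []):
        # A feature may carry a null geometry, so fall back to an empty dict.
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == wanted:
            matching.append(feature)
        else:
            other.append(feature)
    return matching, other


# Made-up sample: one line closure and one point closure.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"id": "line-closure"},
         "geometry": {"type": "LineString", "coordinates": [[9.0, 48.0], [9.1, 48.1]]}},
        {"type": "Feature", "properties": {"id": "point-closure"},
         "geometry": {"type": "Point", "coordinates": [9.0, 48.0]}},
    ],
}

lines, rejected = split_by_geometry_type(sample)
print(len(lines), len(rejected))  # prints: 1 1
```

Whether the rejected features should be dropped, logged, or written to a separate table with a Point column is a design decision this sketch leaves open.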

mobidata-bw/ipl-dagster-pipeline#184 fixes this.

I assume test has the latest, not yet released ipl-dagster-pipeline images installed(?)

@hbruch hbruch linked a pull request Jan 28, 2025 that will close this issue
@richardforsterNVBW
Contributor Author

Which images are used on production and test?

@derhuerst
Member

derhuerst commented Jan 29, 2025

Which images are used on production and test?

On both md-ipl-test and md-ipl-p, outdated images are in use (the latest is ghcr.io/mobidata-bw/dagster-pipeline:2025-01-28t15-57).

# md-ipl-test
docker ps -f 'name=ipl-dagster-pipeline-1' --format json | jq -rc .Image
# ghcr.io/mobidata-bw/dagster-pipeline:2024-12-03t09-35
# md-ipl-p
docker ps -f 'name=ipl-dagster-pipeline-1' --format json | jq -rc .Image
# ghcr.io/mobidata-bw/dagster-pipeline:2024-12-03t09-35
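
To quantify how far the running image lags behind, the tags can be parsed as timestamps. This is a minimal sketch that assumes the tags follow the pattern year-month-day, `t`, hour-minute, as the two tags above suggest; the helper name `tag_timestamp` is made up for illustration.

```python
from datetime import datetime


def tag_timestamp(image_ref):
    """Parse the build timestamp out of an image reference's tag."""
    # Everything after the last colon is the tag, e.g. "2024-12-03t09-35".
    tag = image_ref.rsplit(":", 1)[1]
    # Assumption: tags are formatted as YYYY-MM-DD, a literal "t", then HH-MM.
    return datetime.strptime(tag, "%Y-%m-%dt%H-%M")


running = "ghcr.io/mobidata-bw/dagster-pipeline:2024-12-03t09-35"
latest = "ghcr.io/mobidata-bw/dagster-pipeline:2025-01-28t15-57"

age = tag_timestamp(latest) - tag_timestamp(running)
print(f"running image is {age.days} days behind the latest tag")
```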
