-
As I understand it, it also uses the column's summary statistics, in particular the max value, to determine the kind of numeric data type it will use when creating the table schema definition in PostgreSQL. I do something similar with Datapusher+. @kindly, can you confirm?
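Roughly the kind of thing I have in mind (an illustrative sketch of my own approach, not this library's code; the thresholds are just PostgreSQL's documented integer ranges):

```python
# Illustrative only: choose the narrowest PostgreSQL numeric type that fits,
# based on the summary statistics (min/max) gathered while profiling the CSV.
def guess_numeric_type(col_min: int, col_max: int, has_fraction: bool) -> str:
    if has_fraction:
        return "NUMERIC"  # lossless for decimals
    if -32768 <= col_min and col_max <= 32767:
        return "SMALLINT"
    if -2147483648 <= col_min and col_max <= 2147483647:
        return "INTEGER"
    if -9223372036854775808 <= col_min and col_max <= 9223372036854775807:
        return "BIGINT"
    return "NUMERIC"  # beyond BIGINT range


print(guess_numeric_type(0, 3_000_000_000, False))  # -> BIGINT
```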
-
@mhkeller yes, the approach was to go for the most lenient types. I had an issue with running over the int size on a second load into the same table. An aim of the type-guessing was speed for large datasets, so having the fewest options to check for each field was preferable, and checking the size of floats and ints for every row was not my priority. Float parsing is particularly expensive. I thought the extra disk space for these was fairly negligible compared to any text fields. However, for heavy numerical data this may not be ideal.
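A simplified sketch of that strategy (not the actual implementation): each value gets at most a couple of cheap checks, a column's guess can only ever widen, and no magnitudes are tracked at all.

```python
from datetime import datetime

# Most specific to most lenient; no size tracking, so integers always end up
# BIGINT and anything fractional ends up NUMERIC.
LADDER = ["TIMESTAMP", "BIGINT", "NUMERIC", "TEXT"]

def fits(value: str, pg_type: str) -> bool:
    try:
        if pg_type == "TIMESTAMP":
            datetime.fromisoformat(value)
        elif pg_type == "BIGINT":
            int(value)
        elif pg_type == "NUMERIC":
            float(value)  # the expensive parse, only reached when needed
        return True       # TEXT accepts anything
    except ValueError:
        return False

def guess_column_type(values: list[str]) -> str:
    guess = 0  # start at the most specific rung
    for v in values:
        while not fits(v, LADDER[guess]):
            guess += 1  # widen once, never narrow again
    return LADDER[guess]

print(guess_column_type(["1", "2", "30000000000"]))  # BIGINT
print(guess_column_type(["1", "2.5", "3"]))          # NUMERIC
```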
-
Can you give any insight into the algorithm that determines how you go from CSV types to Postgres types? I was doing some tests with a small CSV and was surprised it went for a BIGINT for a plain integer column. I was also curious about the choice of NUMERIC instead of a more specific float field. I imagine it goes with the most forgiving types for the most compatibility? Something like the example below is what I mean.
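For illustration (made-up file and table names, not my exact test data), a CSV like:

```csv
id,amount
1,19.99
2,250.00
```

produces a schema roughly like:

```sql
CREATE TABLE "example" (
    "id" BIGINT,
    "amount" NUMERIC
);
```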
edit: it looks like a string like `2024-01-01` gets converted to a TIMESTAMP type instead of a DATE, which would also be consistent with that logic.
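For instance (again an illustrative file, not my actual data), a column holding only date strings:

```csv
created
2024-01-01
2024-02-15
```

comes out as roughly:

```sql
CREATE TABLE "dates_example" (
    "created" TIMESTAMP
);
```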