Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate database type to UUID instead of STRING #6870

Open
sanderegg opened this issue Nov 29, 2024 · 3 comments
Open

Migrate database type to UUID instead of STRING #6870

sanderegg opened this issue Nov 29, 2024 · 3 comments
Assignees
Labels
a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:services-library issues on packages/service-libs t:maintenance Some planned maintenance work

Comments

@sanderegg
Copy link
Member

In order to make the usage of asyncpg more streamlined, it would be nice that we migrate the types in the databsae to UUID instead of using a STRING type.

It might also save memory in the database.

@sanderegg sanderegg added a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:services-library issues on packages/service-libs t:maintenance Some planned maintenance work labels Nov 29, 2024
@pcrespov
Copy link
Member

pcrespov commented Dec 5, 2024

Consider option with partial uuid and prefix as well such us prj_3ceedb6a-c356-48d3. This hybrid has many obvious advantages when compared with pure uuids or big-integers as identifiers

This can be achieved using

  • sa.String column and client/server side default generator (proposed by @matusdrobuliak66 )
  • sa.UUID column with a pydantic model with specialized context serializer that appends/drops the prefix when reading/writing to the pg database (proposed by @sanderegg )

@sanderegg
Copy link
Member Author

#6897 (comment)

@sanderegg
Copy link
Member Author

sanderegg commented Dec 5, 2024

chatty says:
@matusdrobuliak66 @pcrespov

Advantages of Using STRING for UUIDs:

  1. Compatibility with Legacy Systems:

If your application originally used string representations of UUIDs, sticking with strings might allow for easier integration without refactoring your existing codebase.
2. Flexibility:

Using a string (typically a 36-character length) allows some flexibility in case UUID-like values need to include additional identifiers or custom formats.
3. Simplified Data Import/Export:

If you frequently import/export data to systems that treat UUIDs as strings, it may simplify the handling process by removing the need for type conversion.

Advantages of Using UUID Type:

  1. Storage Efficiency:

PostgreSQL's UUID type is stored as 16 bytes, whereas a STRING/VARCHAR for a UUID typically takes 36 bytes (additional storage is needed for the dashes and string encoding).

  1. Performance:

Native UUID handling in PostgreSQL ensures more efficient indexing and searching operations, as comparisons and operations are optimized for the binary representation rather than a string comparison.
3. Data Integrity:

Using the UUID type ensures that only valid UUIDs can be stored in the column, maintaining data integrity by preventing accidental storage of malformed UUIDs or incorrect data types.
4. Built-in Functions:

PostgreSQL provides built-in functions for generating and working with UUIDs when using the UUID type, offering easier, more readable, and less error-prone ways to generate and manipulate UUIDs.
5. Standardization:

Using a dedicated type reinforces best practices and standardization of how UUIDs are handled, which can be beneficial for code readability and maintenance over time.

Conclusion:

While using strings for storing UUIDs might sometimes be necessary due to legacy constraints or specific application requirements, using the UUID type is generally advantageous for performance, storage efficiency, and ensuring data integrity. In new applications or when refactoring, it's usually better to lean towards the UUID type unless there's a strong reason to do otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:services-library issues on packages/service-libs t:maintenance Some planned maintenance work
Projects
None yet
Development

No branches or pull requests

6 participants