Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UUID queries are made with NCHAR #406

Open
nils-van-zuijlen opened this issue Jul 17, 2024 · 3 comments
Open

UUID queries are made with NCHAR #406

nils-van-zuijlen opened this issue Jul 17, 2024 · 3 comments

Comments

@nils-van-zuijlen
Copy link

nils-van-zuijlen commented Jul 17, 2024

There are some features which are not supported yet. Please check the Limitations first to see if your bug is listed.

Software versions

  • Django: 4.2
  • mssql-django: 1.4.2
  • python: 3.11.8
  • SQL Server: 14.0.0
  • OS: Linux 6.6.36 NixOS SMP PREEMPT_DYNAMIC Thu Jun 27 11:49:15 UTC 2024 x86_64 GNU/Linux

Table schema and Model

class Foobar(models.Model):
    reference = models.UUIDField(_("reference"), default=uuid.uuid4, primary_key=True)

Database Connection Settings

DATABASES = {
    "default": {
        "ENGINE": "mssql",
        "NAME": os.getenv("DB_NAME"),
        "USER": os.getenv("DB_USER"),
        "PASSWORD": os.getenv("DB_PASSWORD"),
        "HOST": os.getenv("DB_HOST", "127.0.0.1"),
        "PORT": int(os.getenv("DB_PORT", "1433")),
        "OPTIONS": {
            # This database doesn't have any triggers so can use return
            # rows from bulk insert feature
            "return_rows_bulk_insert": True
        },
    }
}

Problem description and steps to reproduce

The UUID column gets created using a CHAR(24) column, and gets a corresponding primary key index, also in CHAR(24).

When executing a query filtering on that column, using Foobar.objects.get(reference=uuid.UUID(...)), the uuid gets passed as an NCHAR to the database. This makes MSSQL do a full index scan, converting all index values to NCHAR to compare them.

Expected behavior and actual behavior

I would expect the UUID to be passed as CHAR values, in order to have parameters match the index and column types without conversion.

Any other details that can be helpful

The workaround we found for our most used queries is to do them manually and insert the uuid directly into the query string, taking care to ensure only valid UUIDs can be inserted at that point to avoid SQL injection. But that approach is not scalable to all queries.

@nils-van-zuijlen
Copy link
Author

This is caused by https://github.com/django/django/blob/c5d196a65264136ee6795356871a29f3d22ec52f/django/db/models/fields/__init__.py#L2674

As can be tested with the following query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4().hex,)
).fetchall()

[('nvarchar',)]

This can be fixed using native fields, as in #134, because the types will then match between the column and the query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4(),)
).fetchall()

[('uniqueidentifier',)]

@mShan0
Copy link
Contributor

mShan0 commented Jul 19, 2024

Thanks for bringing this up. We will look into adding this.

@nils-van-zuijlen
Copy link
Author

I found another way of improving on this, as documented in https://github.com/mkleehammer/pyodbc/wiki/Tips-and-Tricks-by-Database-Platform#speeding-up-filtering-of-indexed-varchar-columns, it is possible to improve on this by explicitly telling pyodbc about the type of the parameters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants