[FORMATTING] a few cases of unexpected identifierCase "lower" behavior #809

gajus · 2024-12-19T21:31:39Z

These are the unexpected diffs from PostgreSQL codebase after using the experiment { identifierCase: 'lower' }:

- user_job_posting AS MATERIALIZED (
+ user_job_posting AS materialized (

- ON CONFLICT DO NOTHING
+ ON CONFLICT DO nothing

- ON CONFLICT (foreign_cloudinary_asset_id) DO NOTHING
+ ON CONFLICT (foreign_cloudinary_asset_id) DO nothing

- WITH ORDINALITY nids
+ WITH ordinality nids

- ORDER BY job_posting_approved_at DESC NULLS LAST,
+ ORDER BY job_posting_approved_at DESC nulls last,

- DROP MATERIALIZED VIEW IF EXISTS discover_independents_user_account_v3 CASCADE;
+ DROP MATERIALIZED VIEW if EXISTS discover_independents_user_account_v3 CASCADE;

- RESET ROLE;
+ RESET role;

- CREATE DATABASE "foo" TEMPLATE contra_template
+ CREATE DATABASE "foo" template contra_template

Otherwise, it works great.

Thanks for the awesome lib!

The text was updated successfully, but these errors were encountered:

nene · 2024-12-20T08:55:40Z

Well, there's a reason why the identifierCase option is marked as experimental.

Though really, all of the other case options (keywordCase, dataTypeCase, functionCase) are problematic as well. The root of all this is that SQL Formatter doesn't really fully parse the SQL. It only uses some heuristics to do it's formatting. But without proper parsing it can't really distinguish whether something like last is a keyword or identifier, because it depends on the context.

PostgreSQL in particular is pretty relaxed with regards to keywords - only a few keywords in PostgreSQL are fully reserved and can never be used as identifiers. (Though there are SQL dialects that are even worse in this regard.)

The approach SQL Formatter currently uses is to find some sort of sweetspot between two extremes:

Detect anything that could be a keyword always as a keyword - results in lots of normal identifiers detected as keywords.
Detect only the keywords that can never be used as identifiers as keywords - results in lots of keywords being detected as identifiers.

SQL Formatter leans on the latter side, which basically means that using keywordCase: "upper" option will mostly not convert ordinary identifiers to uppercase. However on the flip-side this means that identifierCase: "lower" will result in lots of keywords being converted to lowercase.

This identifierCase option really only exists for the sake of completeness. But I wouldn't really recommend it's use. The only sensible time to use it is when you want all your SQL to be either upper or lowercase.

nene · 2024-12-20T09:02:46Z

If you're looking for a tool that does better job formatting PostgreSQL, you might want to consider prettier-plugin-sql-cst, which does use a proper parser and is capable to properly distinguish between keywords and identifiers. Though currently it only supports changing the case of keywords and it doesn't have full PostgreSQL support because... well... PostgreSQL is such a huge language.

Refs #809

nene · 2024-12-20T09:14:48Z

I updated the docs to make the flaws of this option more apparent.

I'll close this issue as "won't fix", because it sort of works as designed.

Closes #809

karlhorky · 2025-01-05T15:45:00Z

@gajus I opened PR #821 to resolve many of these cases, but one which I could not resolve is TEMPLATE (see the challenges in #821 (comment))

But everything else should be good, once #821 is published.

gajus added the bug label Dec 19, 2024

nene added a commit that referenced this issue Dec 20, 2024

Document the reasons why identifierCase is experimental

f4cda21

Refs #809

nene closed this as not planned Won't fix, can't repro, duplicate, stale Dec 20, 2024

This was referenced Jan 5, 2025

Add more PostgreSQL keywords and phrases #821

Merged

Add DROP IDENTITY, OVERRIDING SYSTEM VALUE to PostgreSQL clauses #820

Merged

nene added a commit that referenced this issue Jan 5, 2025

Add more PostgreSQL keywords and phrases (#821)

366bd7b

Closes #809

nene closed this as completed in #821 Jan 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FORMATTING] a few cases of unexpected identifierCase "lower" behavior #809

[FORMATTING] a few cases of unexpected identifierCase "lower" behavior #809

gajus commented Dec 19, 2024

nene commented Dec 20, 2024

nene commented Dec 20, 2024

nene commented Dec 20, 2024

karlhorky commented Jan 5, 2025

[FORMATTING] a few cases of unexpected identifierCase "lower" behavior #809

[FORMATTING] a few cases of unexpected identifierCase "lower" behavior #809

Comments

gajus commented Dec 19, 2024

nene commented Dec 20, 2024

nene commented Dec 20, 2024

nene commented Dec 20, 2024

karlhorky commented Jan 5, 2025