Issue 30277 create api factory methods to insert in the unique fields table #30466

@freddyDOTCMS freddyDOTCMS commented Oct 28, 2024

We use a Lucene query to check for unique field values, but this approach has issues due to a race condition. The ElasticSearch data isn’t updated immediately after the Contentlet database update, so if another Contentlet with the same unique values is saved before ElasticSearch is refreshed, duplicates can occur. This issue is particularly likely during high-volume imports, such as when importing hundreds of Contentlets at once.

I was able to reproduce this error locally by directly creating Contentlets via the API endpoint, sending 100 simultaneous requests using Postman.
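
The failure mode can be illustrated with a small, deterministic toy model. This is only a sketch: `StaleIndexRaceDemo`, its fields, and its methods are hypothetical names and are not dotCMS or Elasticsearch code. The "index" here is refreshed only when `refresh()` is called, which stands in for the delay before the search index sees a database write.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model (hypothetical): a "database" plus a search index that lags until refresh(). */
final class StaleIndexRaceDemo {
    private final List<String> database = new ArrayList<>(); // stands in for the Contentlet table
    private final Set<String> index = new HashSet<>();       // stands in for the search index

    /** Check-then-insert: uniqueness is checked against the possibly stale index. */
    boolean save(String uniqueValue) {
        if (index.contains(uniqueValue)) {
            return false; // duplicate detected, save rejected
        }
        database.add(uniqueValue); // note: the index is NOT updated here
        return true;
    }

    /** Simulates the delayed index refresh. */
    void refresh() {
        index.addAll(database);
    }

    /** Number of rows in the "database" holding the given value. */
    long countInDatabase(String value) {
        return database.stream().filter(value::equals).count();
    }

    public static void main(String[] args) {
        StaleIndexRaceDemo demo = new StaleIndexRaceDemo();
        System.out.println(demo.save("unique-title")); // true
        System.out.println(demo.save("unique-title")); // true: index is stale, duplicate slips in
        demo.refresh();
        System.out.println(demo.save("unique-title")); // false: index has caught up
        System.out.println(demo.countInDatabase("unique-title")); // 2
    }
}
```

Any save that arrives between the first write and the refresh passes the check, which matches the duplicates seen with simultaneous requests.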

The new approach for validating unique fields involves using an additional table. This table will have a primary key created from a hash, which combines the following elements: the ContentType's variable name, language, Field's variable name, Field's value, and—if the uniquePerSite field variable is set to TRUE—the site ID.
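
As a rough illustration of how such a key could be derived, here is a minimal sketch. The class name `UniqueFieldKeySketch`, the parameter names, the choice of SHA-256, and the length-prefixed concatenation are all assumptions made for the example; the description above only says that the key is a hash of these elements.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: builds a hash key from the elements listed in the description. */
final class UniqueFieldKeySketch {
    private UniqueFieldKeySketch() {}

    /**
     * @param siteIdOrNull pass the site ID only when uniquePerSite is TRUE, otherwise null
     * @return lowercase hex SHA-256 digest (64 characters); the algorithm is an assumption
     */
    static String hashKey(String contentTypeVar, String language, String fieldVar,
                          String fieldValue, String siteIdOrNull) {
        StringBuilder raw = new StringBuilder();
        appendPart(raw, contentTypeVar);
        appendPart(raw, language);
        appendPart(raw, fieldVar);
        appendPart(raw, fieldValue);
        if (siteIdOrNull != null) {
            appendPart(raw, siteIdOrNull);
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(raw.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation
            throw new IllegalStateException(e);
        }
    }

    /** Length-prefixes each part so that ("ab", "c") and ("a", "bc") produce different input. */
    private static void appendPart(StringBuilder raw, String part) {
        raw.append(part.length()).append(':').append(part);
    }
}
```

Because the same inputs always yield the same key, two Contentlets with the same unique value collide on the primary key, and the database can reject the second insert without consulting the search index.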

Proposed Changes

  • We don't want to remove the old ES validation approach; we want to keep it as a fallback in case the new approach causes unexpected problems. So I am going to create a Strategy to switch between these two approaches using a config property.

https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R237

https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R7654

https://github.com/dotCMS/core/pull/30466/files#diff-445f8d01aa4de058eaaf883e573d17ef1123b1e74a9a98120ac156b78f4c6522R17

  • For the new extra table approach, we need to save the record in the extra table together with the Contentlet's ID. The Contentlet's ID is not used in the hash calculation, but it will be used later to clean up the table when a Contentlet is deleted. That is why the Strategy includes an afterSaved method, which is called after the Contentlet is saved.

https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R5528

  • I am going to remove all the ES validation code from its current location and move it into the new ESUniqueFieldValidationStrategy class.

https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82L7635-L7725

https://github.com/dotCMS/core/pull/30466/files#diff-e3b9fd8560a668db0c37d00715b112bfa3fdfddc778b5dd1c4bac3b73f1fdd72R45

  • I need to create a new ExtraTableUniqueFieldValidationStrategy class, which is the Strategy implementation for the extra table validation.

https://github.com/dotCMS/core/pull/30466/files#diff-efdeeb8f5e148f60415043882bbe26bafc2d643378abdb58e7b8fc944087fe52R40

  • Create a Singleton util class that provides all the methods for working with the new extra table. I am not creating a Factory because in this case we don't really have an API to pair it with.

https://github.com/dotCMS/core/pull/30466/files#diff-b0aed2eddcc36be2a2e2f46624345a83c8698db238fe4dc228c0fb13a11e344fR16
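
To make the shape of the changes listed above easier to picture, here is a simplified sketch of a strategy with an after-save hook. Everything in it is an assumption for illustration: the interface and class names carry a `Sketch` suffix because they are not the classes in this PR, the method signatures are invented, a `ConcurrentHashMap` stands in for the extra table, and a plain boolean stands in for the config property.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical strategy contract; signatures are assumptions, not the PR's actual API. */
interface UniqueFieldValidationStrategySketch {
    /** Throws IllegalStateException if the unique key is already taken. */
    void validate(String uniqueKey);

    /** Called once the Contentlet has been saved and its ID is known. */
    default void afterSaved(String uniqueKey, String contentletId) {}

    /** Called when a Contentlet is deleted so its entries can be cleaned up. */
    default void onDeleted(String contentletId) {}
}

/** In-memory stand-in for the extra table: hash key -> Contentlet ID. */
final class ExtraTableStrategySketch implements UniqueFieldValidationStrategySketch {
    private static final String PENDING = ""; // placeholder until the Contentlet ID exists
    private final Map<String, String> table = new ConcurrentHashMap<>();

    @Override
    public void validate(String uniqueKey) {
        // putIfAbsent is atomic, playing the role of an INSERT guarded by a primary key
        if (table.putIfAbsent(uniqueKey, PENDING) != null) {
            throw new IllegalStateException("Duplicate unique value for key " + uniqueKey);
        }
    }

    @Override
    public void afterSaved(String uniqueKey, String contentletId) {
        // Attach the Contentlet ID; it is not part of the key, only used for cleanup
        table.replace(uniqueKey, contentletId);
    }

    @Override
    public void onDeleted(String contentletId) {
        table.values().removeIf(contentletId::equals);
    }
}

/** Picks a strategy from a flag; in the real code this would come from a config property. */
final class StrategySelectorSketch {
    private StrategySelectorSketch() {}

    static UniqueFieldValidationStrategySketch choose(
            boolean extraTableEnabled,
            UniqueFieldValidationStrategySketch esStrategy,
            UniqueFieldValidationStrategySketch extraTableStrategy) {
        return extraTableEnabled ? extraTableStrategy : esStrategy;
    }
}
```

In this sketch the save path would call `validate` before persisting, `afterSaved` once the Contentlet ID is known, and `onDeleted` on removal; keeping both implementations behind one interface is what allows switching back to the ES validation without touching the callers.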

Checklist

  • Tests
  • Translations
  • Security Implications Contemplated (add notes if applicable)


This PR fixes: #30277

wezell and others added 29 commits October 9, 2024 09:30
ref: #29555
Changing link from `/c/dotAI` to `/c/dotai`
…-unique_fields-table' of https://github.com/dotCMS/core into issue-30277-Create-API-Factory-methods-to-insert-in-the-unique_fields-table
…API-Factory-methods-to-insert-in-the-unique_fields-table
@freddyDOTCMS freddyDOTCMS added this pull request to the merge queue Oct 30, 2024
Merged via the queue into main with commit d6eb7b8 Oct 30, 2024
35 checks passed
@freddyDOTCMS freddyDOTCMS deleted the issue-30277-Create-API-Factory-methods-to-insert-in-the-unique_fields-table branch October 30, 2024 19:45
dsolistorres pushed a commit that referenced this pull request Nov 5, 2024
… table (#30466)

Co-authored-by: Will Ezell <[email protected]>
spbolton pushed a commit that referenced this pull request Nov 11, 2024
5 participants