Skip to content

Commit

Permalink
Issue 30277 create api factory methods to insert in the unique fields…
Browse files Browse the repository at this point in the history
… table (#30466)

We use a Lucene query to check for unique field values, but this
approach has issues due to a race condition. The ElasticSearch data
isn’t updated immediately after the Contentlet database update, so if
another Contentlet with the same unique values is saved before
ElasticSearch is refreshed, duplicates can occur. This issue is
particularly likely during high-volume imports, such as when importing
hundreds of Contentlets at once.

I was able to reproduce this error locally by directly creating
Contentlets via the API endpoint, sending 100 simultaneous requests
using Postman.

The new approach for validating unique fields involves using an
additional table. This table will have a primary key created from a
hash, which combines the following elements: the ContentType's variable
name, language, Field's variable name, Field's value, and—if the
uniquePerSite field variable is set to TRUE—the site ID.

### Proposed Changes
* We don't want to remove the old approach with the ES validation, we
want to keep it if we need to come back to it to avoid any unexpected
problem with the new approach, so I am going to create a Strategy to
switch between this 2 approaches using a config property.


https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R237


https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R7654


https://github.com/dotCMS/core/pull/30466/files#diff-445f8d01aa4de058eaaf883e573d17ef1123b1e74a9a98120ac156b78f4c6522R17

* For our new Extra table approach we ned to save the register in the
extra table with the Contentlet's ID, the Contentlet's ID is not used
for the hash calculation but is going to be sued later to clean up the
table when a Contentlet is deleted, that is why a afterSaved method is
included in the Strategy, this method is going to be called after saved
the Contentlet


https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82R5528

- I am going to remove all the ES validation code, and add it in the new
ESUniqueFieldValidationStrategy class


https://github.com/dotCMS/core/pull/30466/files#diff-fa1ceaa19618a6b2bbc30e24c6f930b4971f417db50babb748c2e2837ba9eb82L7635-L7725


https://github.com/dotCMS/core/pull/30466/files#diff-e3b9fd8560a668db0c37d00715b112bfa3fdfddc778b5dd1c4bac3b73f1fdd72R45

- I need to create a new ExtraTableUniqueFieldValidationStrategy for the
Extra table validation Strategy implementation


https://github.com/dotCMS/core/pull/30466/files#diff-efdeeb8f5e148f60415043882bbe26bafc2d643378abdb58e7b8fc944087fe52R40

- Create a Singleton util Class to provide all the method to work with
the new extra table, I don't create a Factory because in this case
really we don't have a API to match the Factory


https://github.com/dotCMS/core/pull/30466/files#diff-b0aed2eddcc36be2a2e2f46624345a83c8698db238fe4dc228c0fb13a11e344fR16


### Checklist
- [ ] Tests
- [ ] Translations
- [ ] Security Implications Contemplated (add notes if applicable)

### Additional Info
** any additional useful context or info **

### Screenshots
Original             |  Updated
:-------------------------:|:-------------------------:
** original screenshot **  |  ** updated screenshot **

---------

Co-authored-by: Will Ezell <[email protected]>
  • Loading branch information
2 people authored and spbolton committed Nov 11, 2024
1 parent ae25474 commit 13bac8c
Show file tree
Hide file tree
Showing 12 changed files with 2,761 additions and 284 deletions.
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
package com.dotcms.content.elasticsearch.business;

import com.dotcms.analytics.content.ContentAnalyticsAPI;
import com.dotcms.api.system.event.ContentletSystemEventUtil;
import com.dotcms.api.web.HttpServletRequestThreadLocal;
import com.dotcms.business.CloseDBIfOpened;
import com.dotcms.business.WrapInTransaction;
import com.dotcms.cdi.CDIUtils;
import com.dotcms.concurrent.DotConcurrentFactory;
import com.dotcms.concurrent.lock.IdentifierStripedLock;
import com.dotcms.content.elasticsearch.business.event.ContentletArchiveEvent;
Expand All @@ -12,11 +14,9 @@
import com.dotcms.content.elasticsearch.business.event.ContentletPublishEvent;
import com.dotcms.content.elasticsearch.business.field.FieldHandlerStrategyFactory;
import com.dotcms.content.elasticsearch.constants.ESMappingConstants;
import com.dotcms.content.elasticsearch.util.ESUtils;
import com.dotcms.content.elasticsearch.util.PaginationUtil;
import com.dotcms.contenttype.business.BaseTypeToContentTypeStrategy;
import com.dotcms.contenttype.business.BaseTypeToContentTypeStrategyResolver;
import com.dotcms.contenttype.business.ContentTypeAPI;
import com.dotcms.contenttype.business.*;
import com.dotcms.contenttype.business.uniquefields.UniqueFieldValidationStrategyResolver;
import com.dotcms.contenttype.exception.NotFoundInDbException;
import com.dotcms.contenttype.model.field.BinaryField;
import com.dotcms.contenttype.model.field.CategoryField;
Expand Down Expand Up @@ -236,6 +236,8 @@
*/
public class ESContentletAPIImpl implements ContentletAPI {

private static Lazy<Boolean> FEATURE_FLAG_DB_UNIQUE_FIELD_VALIDATION = Lazy.of(() ->
Config.getBooleanProperty("FEATURE_FLAG_DB_UNIQUE_FIELD_VALIDATION", false));
private static final String CAN_T_CHANGE_STATE_OF_CHECKED_OUT_CONTENT = "Can't change state of checked out content or where inode is not set. Use Search or Find then use method";
private static final String CANT_GET_LOCK_ON_CONTENT = "Only the CMS Admin or the user who locked the contentlet can lock/unlock it";
private static final String FAILED_TO_DELETE_UNARCHIVED_CONTENT = "Failed to delete unarchived content. Content must be archived first before it can be deleted.";
Expand Down Expand Up @@ -277,6 +279,9 @@ public class ESContentletAPIImpl implements ContentletAPI {
private final BaseTypeToContentTypeStrategyResolver baseTypeToContentTypeStrategyResolver =
BaseTypeToContentTypeStrategyResolver.getInstance();


private final Lazy<UniqueFieldValidationStrategyResolver> uniqueFieldValidationStrategyResolver;

public enum QueryType {
search, suggest, moreLike, Facets
}
Expand All @@ -286,11 +291,21 @@ public enum QueryType {
private static final Supplier<String> ND_SUPPLIER = () -> "N/D";
private final ElasticReadOnlyCommand elasticReadOnlyCommand;

public static boolean getFeatureFlagDbUniqueFieldValidation() {
return FEATURE_FLAG_DB_UNIQUE_FIELD_VALIDATION.get();
}

@VisibleForTesting
public static void setFeatureFlagDbUniqueFieldValidation(final boolean newValue) {
FEATURE_FLAG_DB_UNIQUE_FIELD_VALIDATION = Lazy.of(() -> newValue);
}

/**
* Default class constructor.
*/
@VisibleForTesting
public ESContentletAPIImpl(final ElasticReadOnlyCommand readOnlyCommand) {
this.uniqueFieldValidationStrategyResolver = Lazy.of( () -> getUniqueFieldValidationStrategyResolver());
indexAPI = new ContentletIndexAPIImpl();
contentFactory = new ESContentFactoryImpl();
permissionAPI = APILocator.getPermissionAPI();
Expand All @@ -307,11 +322,20 @@ public ESContentletAPIImpl(final ElasticReadOnlyCommand readOnlyCommand) {
fileMetadataAPI = APILocator.getFileMetadataAPI();
}

private static UniqueFieldValidationStrategyResolver getUniqueFieldValidationStrategyResolver() {
final Optional<UniqueFieldValidationStrategyResolver> uniqueFieldValidationStrategyResolver =
CDIUtils.getBean(UniqueFieldValidationStrategyResolver.class);

if (!uniqueFieldValidationStrategyResolver.isPresent()) {
throw new DotRuntimeException("Could not instance UniqueFieldValidationStrategyResolver");
}
return uniqueFieldValidationStrategyResolver.get();
}

public ESContentletAPIImpl() {
this(ElasticReadOnlyCommand.getInstance());
}


@Override
public SearchResponse esSearchRaw(String esQuery, boolean live, User user,
boolean respectFrontendRoles) throws DotSecurityException, DotDataException {
Expand Down Expand Up @@ -5516,6 +5540,11 @@ private Contentlet internalCheckin(Contentlet contentlet,
contentlet = contentFactory.save(contentlet);
}

if (hasUniqueField(contentType)) {
uniqueFieldValidationStrategyResolver.get().get().afterSaved(contentlet, isNewContent);
}


contentlet.setIndexPolicy(indexPolicy);
contentlet.setIndexPolicyDependencies(indexPolicyDependencies);

Expand Down Expand Up @@ -5645,6 +5674,10 @@ private Contentlet internalCheckin(Contentlet contentlet,
return contentlet;
}

private static boolean hasUniqueField(ContentType contentType) {
return contentType.fields().stream().anyMatch(field -> field.unique());
}

private boolean shouldRemoveOldHostCache(Contentlet contentlet, String oldHostId) {
return contentlet.getBoolProperty(Contentlet.TO_BE_PUBLISH) &&
contentlet.isVanityUrl() &&
Expand Down Expand Up @@ -7632,98 +7665,16 @@ public void validateContentlet(final Contentlet contentlet, final List<Category>

// validate unique
if (field.isUnique()) {
final boolean isDataTypeNumber =
field.getDataType().contains(DataTypes.INTEGER.toString())
|| field.getDataType().contains(DataTypes.FLOAT.toString());
try {
final StringBuilder buffy = new StringBuilder(UUIDGenerator.generateUuid());
buffy.append(" +structureInode:" + contentlet.getContentTypeId());
if (UtilMethods.isSet(contentlet.getIdentifier())) {
buffy.append(" -(identifier:" + contentlet.getIdentifier() + ")");
}

buffy.append(" +languageId:" + contentlet.getLanguageId());

if (getUniquePerSiteConfig(field)) {
if (!UtilMethods.isSet(contentlet.getHost())) {
populateHost(contentlet);
}

buffy.append(" +conHost:" + contentlet.getHost());
}

buffy.append(" +").append(contentlet.getContentType().variable())
.append(StringPool.PERIOD)
.append(field.getVelocityVarName()).append(ESUtils.SHA_256)
.append(StringPool.COLON)
.append(ESUtils.sha256(contentlet.getContentType().variable()
+ StringPool.PERIOD + field.getVelocityVarName(), fieldValue,
contentlet.getLanguageId()));

final List<ContentletSearch> contentlets = new ArrayList<>();
try {
contentlets.addAll(
searchIndex(buffy.toString() + " +working:true", -1, 0, "inode",
APILocator.getUserAPI().getSystemUser(), false));
contentlets.addAll(
searchIndex(buffy.toString() + " +live:true", -1, 0, "inode",
APILocator.getUserAPI().getSystemUser(), false));
} catch (final Exception e) {
final String errorMsg =
"Unique field [" + field.getVelocityVarName() + "] with value '" +
fieldValue + "' could not be validated: " + e.getMessage();
Logger.warn(this, errorMsg, e);
throw new DotContentletValidationException(errorMsg, e);
}
int size = contentlets.size();
if (size > 0 && !hasError) {
boolean unique = true;
for (final ContentletSearch contentletSearch : contentlets) {
final Contentlet uniqueContent = contentFactory.find(
contentletSearch.getInode());
if (null == uniqueContent) {
final String errorMsg = String.format(
"Unique field [%s] could not be validated, as " +
"unique content Inode '%s' was not found. ES Index might need to be reindexed.",
field.getVelocityVarName(), contentletSearch.getInode());
Logger.warn(this, errorMsg);
throw new DotContentletValidationException(errorMsg);
}
final Map<String, Object> uniqueContentMap = uniqueContent.getMap();
final Object obj = uniqueContentMap.get(field.getVelocityVarName());
if ((isDataTypeNumber && Objects.equals(fieldValue, obj)) ||
(!isDataTypeNumber && ((String) obj).equalsIgnoreCase(
((String) fieldValue)))) {
unique = false;
break;
}

}
if (!unique) {
if (UtilMethods.isSet(contentlet.getIdentifier())) {
Iterator<ContentletSearch> contentletsIter = contentlets.iterator();
while (contentletsIter.hasNext()) {
ContentletSearch cont = contentletsIter.next();
if (!contentlet.getIdentifier()
.equalsIgnoreCase(cont.getIdentifier())) {
cve.addUniqueField(field);
hasError = true;
Logger.warn(this,
getUniqueFieldErrorMessage(field, fieldValue,
cont));
break;
}
}
} else {
cve.addUniqueField(field);
hasError = true;
Logger.warn(this, getUniqueFieldErrorMessage(field, fieldValue,
contentlets.get(0)));
break;
}
}
}
} catch (final DotDataException | DotSecurityException e) {
try {
uniqueFieldValidationStrategyResolver.get().get().validate(contentlet,
LegacyFieldTransformer.from(field));
} catch (UniqueFieldValueDuplicatedException e) {
cve.addUniqueField(field);
hasError = true;
Logger.warn(this, getUniqueFieldErrorMessage(field, fieldValue,
UtilMethods.isSet(e.getContentlets()) ? e.getContentlets().get(0) : "Unknown"));
} catch (DotDataException | DotSecurityException e) {
Logger.warn(this, "Unable to get contentlets for Content Type: "
+ contentlet.getContentType().name(), e);
}
Expand Down Expand Up @@ -7799,11 +7750,11 @@ private boolean isIgnorableField(final com.dotcms.contenttype.model.field.Field
}

private String getUniqueFieldErrorMessage(final Field field, final Object fieldValue,
final ContentletSearch contentletSearch) {
final String contentletID) {

return String.format(
"Value of Field [%s] must be unique. Contents having the same value (%s): %s",
field.getVelocityVarName(), fieldValue, contentletSearch.getIdentifier());
field.getVelocityVarName(), fieldValue, contentletID);
}

private boolean getUniquePerSiteConfig(final Field field) {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
package com.dotcms.contenttype.business;

import java.util.List;

/**
* Throw if try to insert a duplicated register in unique_fiedls table
*/
public class UniqueFieldValueDuplicatedException extends Exception{

private List<String> contentletsIDS;

public UniqueFieldValueDuplicatedException(String message) {
super(message);
}

public UniqueFieldValueDuplicatedException(String message, List<String> contentletsIDS) {
super(message);
this.contentletsIDS = contentletsIDS;
}

public List<String> getContentlets() {
return contentletsIDS;
}
}
Loading

0 comments on commit 13bac8c

Please sign in to comment.