Fix oplog search breaking with very long entries
Postgres aborts queries that try to make a `tsvector` with too many
unique tokens. This patch reworks the FTS oplog query to truncate each
individual field to its first 100,000 characters, so that no field
contributes too many tokens and triggers the error.

A drawback is that text after this cutoff point will not be searchable.

Fixes #557
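
The cutoff and its drawback can be illustrated with a small plain-Python model. This is only a sketch and not the Django/PostgreSQL code from the patch: `searchable_tokens` and `FIELD_CHAR_LIMIT` are hypothetical names, and whitespace splitting stands in for PostgreSQL's real tokenizer. It assumes the same 100,000-character limit that the patch passes to `Left()`.

```python
# Plain-Python model of the per-field cutoff; not the actual ORM query.
FIELD_CHAR_LIMIT = 100_000  # same cutoff value the patch passes to Left()


def searchable_tokens(fields, limit=FIELD_CHAR_LIMIT):
    """Collect the unique lowercase tokens from the first `limit` characters of each field."""
    tokens = set()
    for text in fields:
        # Truncate every field on its own, loosely mirroring
        # Left(Cast(field, TextField()), limit) in the patch.
        tokens.update(text[:limit].lower().split())
    return tokens


if __name__ == "__main__":
    # A 120,006-character entry whose last word lies past the cutoff.
    long_entry = "alpha " * 20_000 + "needle"
    tokens = searchable_tokens(["short note", long_entry])
    print("alpha" in tokens)   # word inside the cutoff
    print("needle" in tokens)  # word after the cutoff
```

In this model a word that appears only after the cutoff never reaches the token set, which is the drawback the commit message describes.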
ColonelThirtyTwo committed Dec 5, 2024
1 parent c0860f6 commit 0bebda0
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions ghostwriter/oplog/consumers.py
@@ -9,7 +9,7 @@

 # Django Imports
 from django.db.models import TextField, Func, Subquery, OuterRef, Value, F
-from django.db.models.functions import Cast
+from django.db.models.functions import Cast, Left
 from django.db.models.expressions import CombinedExpression
 from django.utils.timezone import make_aware
 from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank, SearchVectorField
@@ -164,20 +164,20 @@ def get_log_entries(self, oplog_id: int, offset: int, user: User, filter: str |
     if spec.type == "json":
         continue

-    field = Cast(CombinedExpression(
+    field = CombinedExpression(
         F("extra_fields"),
         "->>",
         Value(spec.internal_name),
-    ), TextField())
+    )
     simple_vector_args.append(field)
     if spec.type == "rich_text":
         english_vector_args.append(field)

-# Combine search vector
-vector = TsVectorConcat(
-    SearchVector(*english_vector_args, config="english"),
-    SearchVector(*simple_vector_args, config="simple"),
-)
+# Create and combine search vectors.
+# Limit inputs since PostgreSQL will abort the query if attempting to make a tsvector out of a huge string
+vectors = [SearchVector(Left(Cast(va, TextField()), 100000), config="english") for va in english_vector_args] + \
+    [SearchVector(Left(Cast(va, TextField()), 100000), config="simple") for va in simple_vector_args]
+vector = TsVectorConcat(*vectors)

 # Build filter.
 # Search using both english and simple configs, to help match both types of vectors. Also use prefix