Correct ReverseLowerBound behavior in MemDB by setting unique index #1074

Merged 1 commit on Nov 21, 2024

Conversation

chacha912
Contributor

@chacha912 chacha912 commented Nov 21, 2024

What this PR does / why we need it:

The js-sdk test was failing with the error "ConnectError: [internal] internal: change not found" when using MemDB.

This occurred during DetachDocument execution where FindMinSyncedSeqInfo failed to find the change record.

There's a known bug in go-memdb where ReverseLowerBound behaves differently based on index uniqueness:

  • With non-unique indexes: acts as less than (<)
  • With unique indexes: correctly acts as less than or equal (<=)

Reference: hashicorp/go-memdb#96 (comment)

Set the doc_id_actor_id_server_seq index as unique since this combination of fields is guaranteed to be unique. This fixes the ReverseLowerBound behavior to properly include matching records in search results.
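
To make the difference between the two behaviors concrete, here is a minimal stdlib-only sketch. It does not use go-memdb; `reverseLowerBound` is a hypothetical helper that models the lookup over a sorted slice of ServerSeq values, with a flag switching between the "less than or equal" and "less than" semantics described above.

```go
package main

import (
	"fmt"
	"sort"
)

// reverseLowerBound is a hypothetical model, not the go-memdb API.
// It returns the largest value in the sorted slice seqs that is
// <= key (inclusive) or < key (exclusive). ok is false when no
// value satisfies the bound.
func reverseLowerBound(seqs []int64, key int64, inclusive bool) (val int64, ok bool) {
	// idx is the count of leading elements that satisfy the bound.
	idx := sort.Search(len(seqs), func(i int) bool {
		if inclusive {
			return seqs[i] > key
		}
		return seqs[i] >= key
	})
	if idx == 0 {
		return 0, false
	}
	return seqs[idx-1], true
}

func main() {
	// A single change record whose ServerSeq equals the search key.
	seqs := []int64{5}

	_, okInclusive := reverseLowerBound(seqs, 5, true)
	_, okExclusive := reverseLowerBound(seqs, 5, false)

	// The inclusive lookup finds the record; the exclusive one does not,
	// which is the shape of the "change not found" failure.
	fmt.Println(okInclusive, okExclusive) // prints "true false"
}
```

When the only matching record sits exactly at the search key, the exclusive variant returns nothing, which is consistent with the failure reported above.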

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Additional documentation:


Checklist:

  • Added relevant tests or not required
  • Addressed and resolved all CodeRabbit review comments
  • Didn't break anything

Summary by CodeRabbit

  • New Features

    • Enhanced data integrity by enforcing uniqueness on the combination of DocID, ActorID, and ServerSeq in the tblChanges table.
  • Bug Fixes

    • Resolved potential data duplication issues by implementing unique constraints on specific fields.


coderabbitai bot commented Nov 21, 2024

Walkthrough

The pull request modifies the index schema for the tblChanges table in the memory package. The index doc_id_actor_id_server_seq has been updated to enforce uniqueness across the combination of the fields DocID, ActorID, and ServerSeq. This change ensures that no two records can have the same values for these three fields, altering the constraints for record storage in the database.

Changes

File Path | Change Summary
server/backend/database/memory/indexes.go | Updated index doc_id_actor_id_server_seq in tblChanges to include Unique: true.

Possibly related PRs

  • Fix transaction in UpdateAndFindMinSyncedVersionVector #1050: The changes in this PR involve modifications to the UpdateAndFindMinSyncedVersionVector method, which may interact with the database schema and constraints, potentially affecting how records are managed in relation to the tblChanges table and its unique index.

Suggested reviewers

  • hackerwins



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
server/backend/database/memory/indexes.go (1)

171-172: Consider documenting this design decision and exploring alternatives.

While this fix effectively addresses the immediate issue, consider:

  1. Adding a comment explaining why this index must be unique (to handle go-memdb's ReverseLowerBound behavior)
  2. Creating an issue to track the underlying go-memdb limitation
  3. Evaluating alternative in-memory DB libraries for the long term if this limitation causes more issues

This will help future maintainers understand the design decision and provide a path forward if similar issues arise.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 082224c and 9d9b5fd.

📒 Files selected for processing (1)
  • server/backend/database/memory/indexes.go (1 hunks)
🔇 Additional comments (1)
server/backend/database/memory/indexes.go (1)

171-172: LGTM! Setting unique index to fix ReverseLowerBound behavior.

The change correctly addresses the known go-memdb bug where ReverseLowerBound behaves differently for unique vs non-unique indexes. Setting this index as unique is safe because the combination of DocID, ActorID, and ServerSeq is guaranteed to be unique.

Let's verify the uniqueness assumption by checking the codebase for any evidence of duplicate values:

✅ Verification successful

Let me verify if there are any duplicate values for the combination of DocID, ActorID, and ServerSeq in the codebase.


Let me search for the Change struct definition and any uniqueness constraints in the database schema:


Let me search specifically for the schema definition and any code that handles the uniqueness constraint:


Setting the index as unique is correct and necessary

The change to make doc_id_actor_id_server_seq a unique index is valid because:

  1. The index is used in ReverseLowerBound operations which require unique index behavior for correct operation
  2. The combination of DocID, ActorID, and ServerSeq should indeed be unique by design:
    • ServerSeq records the chronological order of changes on the server
    • For a given document (DocID) and actor (ActorID), each change must have a unique ServerSeq
    • This forms a natural composite key that should be unique

The change fixes the bug without introducing risks, as it enforces a constraint that was already implicitly required by the system's design.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for code that might create or handle duplicate records with same DocID, ActorID, and ServerSeq

# Look for change record creation patterns
ast-grep --pattern 'Change{
  $$$
  DocID: $_,
  ActorID: $_,
  ServerSeq: $_,
  $$$
}'

# Look for any comments or tests related to duplicate handling
rg -i "duplicate.*(change|doc.*actor|server.*seq)" 

Length of output: 109193


Script:

#!/bin/bash
# Search for Change struct definitions and usages to understand uniqueness constraints
ast-grep --pattern 'type Change struct {
  $$$
  DocID $_
  ActorID $_
  ServerSeq $_
  $$$
}'

# Look for any code that creates or modifies Change records
rg -A 5 'Change\{' 

# Look for any uniqueness checks or comments about uniqueness
rg -i "(unique|duplicate).*(docid|actorid|serverseq)"

Length of output: 10485


Script:

#!/bin/bash
# Look for Change struct definition
rg -A 10 "type Change struct" 

# Look for database schema definitions and indexes
rg -A 10 "tblChanges.*Schema" 
rg -A 10 "tblChanges.*Index"

# Look for any uniqueness validation or error handling related to Change records
rg -i "(validate|check|verify|ensure).*(docid|actorid|serverseq)"

Length of output: 28134


Script:

#!/bin/bash
# Look for tblChanges schema definition
rg -A 15 "tblChanges.*Schema.*=|var.*tblChanges" server/backend/database/memory/

# Look for any code that handles uniqueness constraints or duplicate prevention
rg -B 2 -A 2 "doc_id_actor_id_server_seq" server/backend/database/memory/

Length of output: 910


codecov bot commented Nov 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 46.79%. Comparing base (082224c) to head (9d9b5fd).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1074   +/-   ##
=======================================
  Coverage   46.79%   46.79%           
=======================================
  Files          84       84           
  Lines       12219    12219           
=======================================
  Hits         5718     5718           
  Misses       5928     5928           
  Partials      573      573           

Member

@hackerwins hackerwins left a comment

Thanks for your contribution.

@hackerwins hackerwins merged commit bd886c5 into main Nov 21, 2024
5 checks passed
@hackerwins hackerwins deleted the fix-memdb branch November 21, 2024 09:38