
feat(v2): import/export, hashing, clean up #1045

Open

kocubinski wants to merge 6 commits into master

Conversation

@kocubinski (Member) commented Feb 15, 2025

Part 2 of #1043
Includes: #1044

  • implement store/v2-compatible API for import/export
  • improve recursive hash efficiency on SaveVersion
  • minor cleanup of batch operations

Summary by CodeRabbit

  • New Features

    • Introduced a new benchmarking interface with configurable options.
    • Added enhanced import capabilities and debug logging for improved observability.
    • Integrated refined metrics management across operations.
  • Refactor

    • Streamlined tree and database interactions with optimized error handling.
    • Revised internal data management and iterator logic for increased efficiency.
    • Updated service initialization and configuration for better reliability.
  • Tests

    • Expanded test coverage with new concurrent and export/import integration tests.
    • Removed outdated tests to simplify maintenance.
  • Chores

    • Upgraded Go version and toolchain settings for improved compatibility.

@kocubinski kocubinski requested a review from a team as a code owner February 15, 2025 00:26
@kocubinski kocubinski mentioned this pull request Feb 15, 2025

coderabbitai bot commented Feb 15, 2025

Walkthrough

This update adds a new benchmarking command for the IAVL tree along with comprehensive enhancements throughout the codebase. New CLI commands, export/import functionality with versioning and resource management, and refined node iteration and database access (switching to SQL interactions) have been introduced. Additional improvements include enhanced logging via a debug logger, expanded metric tracking through new metrics types and Prometheus integration, updated Go version settings, and new tests (including concurrent scenarios). These changes also include refactoring of tree state management and the removal of obsolete test code.

Changes

File(s) Change Summary
v2/cmd/bench/bench.go, v2/cmd/root.go Introduced a new bench command (with subcommand std), adding flag validations, CPU profiling, snapshot loading, and Prometheus metrics support to the CLI.
v2/export.go Updated the Exporter struct and its methods to support versioning, modified return types, implemented tree cloning, and added a Close method for resource management.
v2/import.go, v2/import_test.go Added an Importer for importing nodes with batching and concurrency control; updated tests to validate the full export/import cycle with tree rebuilding.
v2/iterator.go, v2/iterator_test.go Modified node iteration functions to pass tree.sql instead of tree and streamlined test cases using default tree options.
v2/logger.go Added NewDebugLogger() to create a debug logger using slog and standard output.
v2/metrics/metrics.go Introduced StructMetrics and NilMetrics types; updated metric aggregation, query reporting, and reset methods for better metrics management.
v2/migrate/go.mod Upgraded Go version to 1.21.0 and added toolchain specification (go1.23.6) in the module configuration.
v2/multitree.go Added a new newTree method and TestBuild functionality to improve tree creation error handling and testing; refactored mount and import functions.
v2/node.go Added new accessor methods (Version, Key, Value, Height) and updated left/right node retrieval to use a SqliteDb parameter.
v2/pool_test.go Deleted obsolete tests related to NodePool functionality.
v2/sqlite.go Revised SqliteDbOptions and SqliteDb to include a metrics.Proxy for enhanced metric logging; updated node retrieval and execution methods accordingly.
v2/sqlite_batch.go Restructured the sqliteBatch struct by removing old connection fields and adding new ones (e.g., leafWrite, treeWrite, writeQueue) along with integrated metrics tracking.
v2/sqlite_test.go Updated SQLite test logic with a new commit process and added concurrent tests (Test_ConcurrentDetach, Test_ConcurrentQuery, Test_ConcurrentIndexRead) for database operations.
v2/sqlite_writer.go Enhanced newSQLWriter and saveTree methods for safer logger initialization, improved metrics tracking, refined shard ID logic, and clearer error handling.
v2/tree.go Introduced constants (metricsNamespace, leafSequenceStart), consolidated tree state with a writeQueue, added methods for leaf key generation and read-only cloning, and updated node operations.
v2/tree_test.go Streamlined testing by removing legacy memory and pruning functions, integrating metrics via MetricsProxy, and simplifying tree build tests.

Sequence Diagram(s)

Benchmark Command Flow

sequenceDiagram
    participant U as User
    participant CLI as Cobra CLI
    participant BC as benchCommand
    participant DB as IAVL Tree/DB
    participant PM as PrometheusMetricsProxy

    U->>CLI: Execute "bench std" command
    CLI->>BC: Invoke benchCommand()
    BC->>BC: Validate required flags (db path, changelog, etc.)
    BC->>BC: Set up CPU profiling if specified
    BC->>DB: Initialize tree (load snapshot or create multi-tree)
    BC->>PM: Initialize Prometheus metrics proxy
    PM->>PM: Start HTTP server at "/metrics"
    BC->>U: Return benchmark results

Export/Import Flow

sequenceDiagram
    participant T as Test_ExportImport
    participant MT as MultiTree
    participant E as Exporter
    participant I as Importer
    participant DB as Database

    T->>MT: Initialize multi-tree and mount trees
    MT->>E: Export nodes from trees
    E->>T: Return exported nodes
    T->>MT: Reinitialize multi-tree for import
    MT->>I: Create Importer for node import
    I->>DB: Batch add exported nodes
    I->>DB: Commit import process
    I->>T: Verify multi-tree hash matches expected

Suggested reviewers

  • cool-develope
  • tac0turtle

Poem

I'm a bunny with code in my paws,
Hopping through branches without any flaws.
Benchmarks, metrics, and trees so neat,
Each new change makes my little heart beat.
With carrots of code and hops of delight,
I cheer for our progress morning 'til night!
🥕🐇 Happy coding in the moonlight!

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

level=warning msg="The linter 'exportloopref' is deprecated (since v1.60.2) due to: Since Go1.22 (loopvar) this linter is no longer relevant. Replaced by copyloopvar."
level=warning msg="[runner] Can't run linter goanalysis_metalinter: buildir: failed to load package sqlite3: could not load export data: no export data for "github.com/bvinc/go-sqlite-lite/sqlite3""
level=error msg="Running error: can't run linter goanalysis_metalinter\nbuildir: failed to load package sqlite3: could not load export data: no export data for "github.com/bvinc/go-sqlite-lite/sqlite3""


@coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (29)
v2/export.go (2)

41-50: Handle potential goroutine leaks.

The anonymous goroutine writes to exporter.out and then exits. If the caller never drains exporter.out (for instance, if an unhandled error arises early), it could cause a goroutine leak. To mitigate, ensure that the caller always calls Next() until ErrorExportDone or an actual error occurs. Alternatively, consider adding context-based cancellation to shut down gracefully.
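
A minimal sketch, assuming a hypothetical constructor and a placeholder traversal, of how context-based cancellation could let the traversal goroutine exit even if out is never fully drained:

package export

import "context"

// Node stands in for the real IAVL node type in this sketch.
type Node struct{ Key []byte }

// traverse is a placeholder for the real pre-order tree walk.
func traverse() []*Node { return nil }

type Exporter struct {
    out    chan *Node
    cancel context.CancelFunc
}

func NewExporter(ctx context.Context) *Exporter {
    ctx, cancel := context.WithCancel(ctx)
    e := &Exporter{out: make(chan *Node), cancel: cancel}
    go func() {
        defer close(e.out)
        for _, n := range traverse() {
            select {
            case e.out <- n: // hand the next node to the caller
            case <-ctx.Done(): // caller stopped draining; exit instead of leaking
                return
            }
        }
    }()
    return e
}

// Close unblocks the goroutine even when the caller never reaches ErrorExportDone.
func (e *Exporter) Close() { e.cancel() }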


97-105: Gracefully handle nil node scenario.

There is a panic("unexpected nil node") if a nil node is read from the channel. This is presumably never expected. If user inputs or external events could cause a nil node, consider returning an error or logging a warning instead of panicking.
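
A sketch of the receive path only, assuming a Next() (*Node, error) signature, a fmt import, and the existing ErrorExportDone sentinel; error-channel handling is elided:

// Next is sketched with an assumed (*Node, error) signature.
func (e *Exporter) Next() (*Node, error) {
    node, ok := <-e.out
    if !ok {
        return nil, ErrorExportDone
    }
    if node == nil {
        // previously: panic("unexpected nil node")
        return nil, fmt.Errorf("export: unexpected nil node read from channel")
    }
    return node, nil
}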

v2/metrics/metrics.go (2)

36-39: Consolidate references by embedding.

StructMetrics embeds *TreeMetrics and *DbMetrics, which is convenient for namespacing. Just take care to avoid accidentally shadowing fields. Also consider whether pointer embedding is strictly necessary, or whether embedding the structs directly would simplify memory management.


148-161: Histogram bin filtering cutoff.

QueryReport filters out samples with d > 50*time.Microsecond. This may skew data if your use cases exceed 50µs. Ensure this threshold is intentional. If queries sometimes exceed that, the histogram is incomplete. Consider making the cutoff dynamic or user-configurable.
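
A sketch, with a hypothetical helper and parameter, of making the cutoff configurable while keeping 50µs as the default:

import "time"

// filterSamples is a hypothetical helper; a zero cutoff preserves the current
// 50µs behavior, while callers with slower queries can raise it.
func filterSamples(samples []time.Duration, cutoff time.Duration) []time.Duration {
    if cutoff == 0 {
        cutoff = 50 * time.Microsecond
    }
    kept := make([]time.Duration, 0, len(samples))
    for _, d := range samples {
        if d <= cutoff {
            kept = append(kept, d)
        }
    }
    return kept
}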

v2/sqlite_batch.go (2)

157-230: Optimize leaf writes and indexing.

  1. Large loops (line 167, 193, 209) insert leaves, deletes, and orphans. Each iteration re-checks commit conditions. This is correct but can be I/O heavy. Consider buffering or periodic commits for performance.
  2. CREATE UNIQUE INDEX IF NOT EXISTS leaf_idx ON leaf (version, sequence) is executed post-write. For large data sets, creating or verifying indexes can be expensive. If you expect repeated calls to saveLeaves(), consider moving index creation to schema initialization or checking if the index is already set up to reduce overhead (see the sketch below).
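
A sketch, assuming the go-sqlite-lite driver already used by the package, of creating the index once during schema initialization instead of after each saveLeaves call; the table layout shown is illustrative:

import "github.com/bvinc/go-sqlite-lite/sqlite3"

// initLeafSchema is a hypothetical helper run once when the leaf connection is opened.
func initLeafSchema(conn *sqlite3.Conn) error {
    return conn.Exec(`
CREATE TABLE IF NOT EXISTS leaf (version INT, sequence INT, bytes BLOB);
CREATE UNIQUE INDEX IF NOT EXISTS leaf_idx ON leaf (version, sequence);`)
}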

239-240: Shard-based indexing.

Using shardID in queries (tree_%d, tree_idx_%d) helps scale or partition data. This can fragment the schema if shardIDs are numerous. Monitor manageability if many shards are used. Also ensure concurrency safety, especially if multiple shards commit simultaneously to the same DB file.

Also applies to: 244-244, 246-246, 250-276, 281-282

v2/import.go (2)

1-23: Consider clarifying usage of importBatchSize and concurrency.
While the overall design for importing and batching appears sound, it might help future maintainers to have comments clarifying how batches are committed concurrently and why importBatchSize is set to 400,000.
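
A sketch of the kind of comment the constant could carry; the rationale wording is illustrative, not taken from the implementation:

// importBatchSize bounds how many imported nodes accumulate in memory before
// the batch is handed to a background goroutine for writing, so decoding can
// overlap with SQLite I/O. The value trades memory footprint against
// per-commit overhead.
const importBatchSize = 400_000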


203-247: Revisit single-node stack commit logic.
The commit logic allows exactly one node in the stack, returning an error otherwise. Confirm that large imports (with multiple top-level nodes) are always grouped into a single root node before commit. If not, users might see unexpected commit errors.

v2/sqlite_test.go (3)

4-34: Document concurrency capabilities of the SQLite driver in the tests.
Lines 4–34 set up concurrency for inserting a large volume of nodes. It's helpful to note whether the underlying SQLite driver supports concurrent writes safely or if transactions must be serialized to avoid panics.


158-211: Handle potential panics gracefully in concurrent attach/detach tests.
The tests at lines 158–211 demonstrate concurrency scenarios that occasionally panic, then skip the test. If these are intentionally included as stress tests, consider using require.NoError or gracefully recovering and logging results, so test failures are more instructive.
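
A minimal sketch, with a hypothetical helper name, of recovering from a stress-test panic and logging it so the skip stays informative:

import (
    "testing"

    "github.com/stretchr/testify/require"
)

// runConcurrentCase is a hypothetical test helper.
func runConcurrentCase(t *testing.T, fn func() error) {
    t.Helper()
    defer func() {
        if r := recover(); r != nil {
            t.Skipf("concurrent operation panicked (driver limitation?): %v", r)
        }
    }()
    require.NoError(t, fn())
}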


214-267: Reassess concurrency test design for Test_ConcurrentQuery.
Currently, the advanced concurrency tests are skipped because they "panic within a panic." If concurrency is not fully supported by this driver, clarify that in code or restructure the test to avoid repeated panic conditions, focusing on concurrency behavior that is safely testable.

v2/tree_test.go (5)

32-34: Use DefaultTreeOptions() for consistency.
Lines 32–34 explicitly define TreeOptions{...} with certain fields. Check if DefaultTreeOptions() plus overrides would yield simpler code, as used elsewhere in this file.


73-84: Confirm version alignment for imported snapshots.
In TestTree_Build_Load, ensure version 10,000 from the snapshot doesn't conflict with subsequent writes leading to version 12,000. Confirm that the final tree version and stored state match expectations.


162-163: Ensure the database path is valid when using TempDir().
For lines 162–163 in Test_EmptyTree, double-check if the ephemeral directory is respected by all code paths—some drivers require special setups for ephemeral or in-memory usage.


216-218: Increase test coverage of checkpoint intervals.
Lines 216–218 set opts.CheckpointInterval = 100. Consider verifying boundary conditions (e.g., 1, fully disabled, or large intervals) in separate test cases to ensure stable handling of partial commits.


335-345: Check concurrency safety for pruning.
Pruning at lines 335–345 modifies store data while versions are still loaded. Confirm that DeleteVersionsTo() calls are safe if other goroutines read the same version or if multiple prunes happen concurrently.

v2/tree.go (8)

26-32: Centralize write operations in a dedicated struct.

Wrapping branches, leaves, orphans, and deletes into a single writeQueue struct makes the write pipeline more coherent and easier to maintain. Be sure to handle any concurrency accesses to these slices carefully if future usage becomes multithreaded.


223-223: Expand deepHash for potential eviction signaling.

Returning a boolean for eviction helps coordinate memory usage more aggressively, especially under large IAVL trees. Monitor performance overhead from increased node traversal.


240-248: Track left/right eviction and finalize node hashing.

Introducing var evictLeft, evictRight bool clarifies your approach to partial subtree evictions. Immediately calling node._hash() after children are hashed ensures node invariants but watch for potential repeated computations if node states change in subsequent calls.


271-275: Left eviction logic.

When evictLeft is true, the code safely returns the left node to the pool if it’s not dirty, then nullifies the pointer. Verify that references to leftNode aren’t needed after setting it to nil.


277-282: Right eviction logic.

Mirroring the left eviction approach for right nodes. The same caution about subsequent references applies here.


371-372: Track pool usage metrics on node insertion (first occurrence).

Incrementing pool_get counters is a solid way to measure usage frequency. However, consider whether these increments might be more granular than needed, as they occur at both left and right insert paths.


379-380: Track pool usage metrics on node insertion (second occurrence).

This duplicate increment might inflate usage stats if left and right insertions happen in quick succession. Optionally combine these increments if they represent a single operation.


714-734: Provide ReadOnlyClone functionality.

Cloning the tree into a read-only instance with a new SQLite connection can be beneficial for concurrent readers or snapshot-based queries. Carefully ensure that locked resources (like file descriptors) don’t conflict across clones.
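
A hedged usage sketch; only ReadOnlyClone comes from this change, while the Get and Close calls are assumptions about the surrounding tree API:

// readOnlyLookup is a hypothetical helper showing the intended usage pattern:
// reads are served from a clone while the primary tree keeps writing.
func readOnlyLookup(tree *Tree, key []byte) ([]byte, error) {
    clone, err := tree.ReadOnlyClone()
    if err != nil {
        return nil, err
    }
    defer func() { _ = clone.Close() }() // assumes the clone owns its own SQLite connection
    return clone.Get(key)                // Get is an assumed accessor on the cloned tree
}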

v2/sqlite.go (2)

353-360: Obtain shard query and measure read time.

Using sql.getShardQuery ensures correct retrieval of the relevant table. The approach to measure total duration with db_get_branch increments is consistent with your metrics approach, though keep an eye on overhead for frequent shard lookups.


725-725: Return “not found” error for missing right node.

This clarifies the mismatch between expected children vs. stored data.

v2/logger.go (1)

45-47: LGTM! Consider adding a comment to document the function.

The implementation is correct and follows the standard pattern of other logger functions in the file.

Add a comment to document the function:

+// NewDebugLogger returns a new logger that outputs debug-level logs to stdout.
 func NewDebugLogger() Logger {
     return slog.New(slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelDebug}))
 }
v2/import_test.go (1)

14-76: LGTM! Consider adding cleanup for exported data.

The test comprehensively validates export/import functionality. However, consider cleaning up the exported data map to free memory after the test.

Add cleanup after the test:

 require.Equal(t, opts.UntilHash, fmt.Sprintf("%x", multiTree.Hash()))
+// Clean up exported data
+for k := range exported {
+    exported[k] = nil
+}
+exported = nil
v2/multitree.go (1)

313-456: Consider adding documentation for the TestBuild method.

While the implementation is thorough, adding documentation would help users understand the purpose and usage of this performance testing method.

Add documentation like this:

+// TestBuild performs a test build of the tree with detailed performance metrics.
+// It reports memory usage, write performance, and other metrics during the build process.
+// opts: Configuration options for the test build
+// Returns the number of leaves processed and any error encountered
 func (mt *MultiTree) TestBuild(opts *testutil.TreeBuildOptions) (int64, error) {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bce70d and 03f1ad7.

⛔ Files ignored due to path filters (1)
  • v2/cmd/gen/gen.go is excluded by !**/gen/**
📒 Files selected for processing (19)
  • v2/cmd/bench/bench.go (1 hunks)
  • v2/cmd/root.go (2 hunks)
  • v2/export.go (3 hunks)
  • v2/import.go (1 hunks)
  • v2/import_test.go (1 hunks)
  • v2/iterator.go (2 hunks)
  • v2/iterator_test.go (3 hunks)
  • v2/logger.go (2 hunks)
  • v2/metrics/metrics.go (4 hunks)
  • v2/migrate/go.mod (2 hunks)
  • v2/multitree.go (7 hunks)
  • v2/node.go (6 hunks)
  • v2/pool_test.go (0 hunks)
  • v2/sqlite.go (12 hunks)
  • v2/sqlite_batch.go (8 hunks)
  • v2/sqlite_test.go (7 hunks)
  • v2/sqlite_writer.go (2 hunks)
  • v2/tree.go (17 hunks)
  • v2/tree_test.go (11 hunks)
💤 Files with no reviewable changes (1)
  • v2/pool_test.go
🔇 Additional comments (62)
v2/export.go (4)

16-19: Ensure concurrency-safety for Exporter fields.

Currently, Exporter spawns a goroutine in lines 41-48 that writes to out and possibly to errCh. Verify that no other goroutines modify Exporter's fields (sql, out, errCh) concurrently in ways that could lead to data races. If this is guaranteed, disregard; otherwise, consider using synchronization or other concurrency-safe patterns.


21-35: Validate version consistency for cloned tree.

When version != tree.Version(), the code attempts to clone the tree and load the requested version. This is logically sound; however, ensure that the cloned tree’s LoadVersion handles missing or invalid versions appropriately. Any mismatch or error leads to early returns. Confirm that this is aligned with user requirements for partial/corrupted versions and no indefinite partial states are left behind.


59-60: Check error handling for retrieving child nodes.

Both getLeftNode and getRightNode calls can emit errors, which get piped to e.errCh. This is valid, but confirm upstream that no code path ignores these errors. Also, watch for potential partial reads if one child retrieval fails mid-traversal.

Also applies to: 66-67, 82-83, 89-90


112-114: Confirm resource cleanup after Close().

Close() calls e.sql.Close(). Verify that no further queries or writes happen afterward. Additionally, confirm that any allocated channels (out, errCh) or other open resources are drained or invalidated to prevent leftover memory usage or blocked goroutines.

v2/metrics/metrics.go (4)

23-26: Interface conformance for metrics types.

Declaring _ Proxy = &StructMetrics{} and _ Proxy = &NilMetrics{} strongly enforces interface matching. This is a good practice, ensuring compile-time safety.


48-85: Validate metric key assumptions.

IncrCounter returns if len(keys) != 2. Confirm that upstream calls always provide exactly two strings (e.g., ["db", "db_leaf_miss"]). If any unexpected usage arises, the code silently ignores it. Consider logging or erroring for easier debugging if invalid keys are passed.
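
A sketch of surfacing malformed keys instead of silently dropping them; the logger field is an assumption:

func (m *StructMetrics) IncrCounter(val float32, keys ...string) {
    if len(keys) != 2 {
        // Hypothetical logger field; a package-level debug log would also work.
        m.logger.Warn("IncrCounter called with unexpected keys", "keys", keys)
        return
    }
    // ... existing dispatch on keys[1] ...
}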


180-180: Reset query metrics.

SetQueryZero effectively resets query metrics to zero. This can disrupt continuous monitoring if triggered mid-analysis. Confirm that it's called at correct intervals. If partial snapshots are desired, store them before resetting.

Also applies to: 185-192


194-206: Aggregating across StructMetrics.

Add() merges metrics from another StructMetrics. This is straightforward, but be cautious about concurrency. If multiple threads might call Add concurrently, ensure external synchronization or a lock is in place to avoid data races on slice fields and counters.
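
A minimal sketch, with illustrative field names, of guarding Add with a mutex; the caller is assumed to own the other instance exclusively while it is being merged:

import (
    "sync"
    "time"
)

type StructMetrics struct {
    mu             sync.Mutex
    TreeUpdate     int64
    QueryDurations []time.Duration
    // ... remaining counters and samples ...
}

func (m *StructMetrics) Add(o *StructMetrics) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.TreeUpdate += o.TreeUpdate
    m.QueryDurations = append(m.QueryDurations, o.QueryDurations...)
}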

v2/sqlite_batch.go (4)

40-43: Error propagation on leafWrite statements.

All these statements (Begin(), Prepare(), Commit()) look correct, but be sure to confirm rollback or partial cleanup if an error occurs halfway through. If leafWrite.Begin() fails, none of the subsequent statements should be processed.

Also applies to: 47-47, 51-51, 55-55, 59-59, 80-80


103-104: Narrow the scope of the orphan statement.

execBranchOrphan sets at to b.version, possibly meaning the current batch version is used as an orphan reference. Confirm that this is the intended design, especially if the node key's own version differs. Mismatched versions might lead to inconsistent data in orphan table.


107-115: Tree write transaction consistency.

Similar to the leaf transaction logic, ensure that if treeWrite.Begin() fails, no partial operations continue. If treeInsert.Close() or treeOrphan.Close() fails, confirm if we need to handle the partial state or retry.

Also applies to: 121-121


188-189: Double-check node eviction callback.

if leaf.evict { b.returnNode(leaf) } and similarly for branches. This ensures memory is reclaimed or reused. Confirm that returnNode does not do more than expected. If concurrency is involved, ensure that returned nodes are not re-accessed.

Also applies to: 263-263

v2/import.go (2)

39-73: Validate tree emptiness and version assumptions.
The function checks tree.sql.loadCheckpointRange() to ensure the tree is empty, but consider verifying whether tree invariants (e.g., root pointers) are also uninitialized to guard against edge cases or partial states. Additionally, if the importer is ever initialized with version=0, confirm that indexing (e.g., i.nonces[nodeVersion]) remains safe.


143-201: Consider verifying nodeVersion ordering.
Currently, the code checks nodeVersion > i.version. Evaluate whether you should also enforce a minimum version (e.g., nodeVersion >= lastImportedVersion) to protect against out-of-sequence imports if the exporter misbehaves.

v2/sqlite_test.go (1)

269-324: Synchronize concurrent index creation with ongoing queries.
CREATE INDEX foo_idx ON foo (id) occurs while a goroutine is reading from bar. If SQLite's locking handles this gracefully, that's fine. Otherwise, consider ensuring index creation is done prior to launching long-running queries.

v2/tree_test.go (3)

53-57: Evaluate HeightFilter: 0 performance impact.
Setting HeightFilter to 0 might cause bigger on-disk footprints if partial indexing or pruning is disabled. Confirm this aligns with your data size constraints and performance expectations.


98-98: Confirm memory usage in the no-DB scenario.
Line 98 returns NewTree(nil, ...), meaning an entirely in-memory tree is used. Over large iteration sets, confirm that memory usage remains feasible since there's no backing store.


277-287: Validate versioning after reloading from disk.
When reloading at version 170 (line 287) after a previous partial ingestion, confirm no off-by-one issues with pruning or leftover state. Additional tests might be helpful to ensure correctness at these mid-version reload boundaries.

v2/tree.go (20)

14-17: Introduce constants for metrics and leaf sequence.

Defining metricsNamespace and starting leaf sequences with leafSequenceStart improves clarity and ensures consistent naming while preventing collisions in sequence generation.


37-37: Adopt a proxy-based metrics interface.

Switching from a concrete metrics pointer to a metrics.Proxy grants greater flexibility and decouples the code from a specific metrics implementation. This is a good maintainability enhancement.


58-63: Embed writeQueue and track additional fields for sequences and replay.

The new fields leafSequence, branchSequence, isReplaying, and evictionDepth neatly consolidate tree state logic. Ensure that new or existing code systematically resets or updates these fields in all relevant operations (replay, checkpoints, etc.).


80-80: Use a no-op metrics proxy by default.

Assigning &metrics.NilMetrics{} ensures that calls to the metrics interface remain valid even if the user hasn’t explicitly configured metrics. This fallback improves developer experience.


92-92: Leverage injected metrics proxy upon initialization.

The line metrics: opts.MetricsProxy supports a clean dependency injection approach for metrics. This is consistent with your move to a proxy-based metrics solution.


101-101: Initialize leaf sequence with specified start.

Assigning leafSequenceStart to leafSequence helps maintain proper key spacing for leaves. This approach is well-defined for separating branch keys from leaf keys.


170-170: Reset sequences before each save version.

Calling tree.resetSequences() at version increment ensures that fresh leaf and branch sequences are used for new nodes in the upcoming version. This avoids sequence overlap or reuse across versions.


233-237: Evict low-height leaves based on heightFilter.

These lines force immediate eviction when tree.heightFilter > 0, which might be beneficial for fine-grained memory control. Ensure that edge cases (e.g., heightFilter == 0 or negative) are handled correctly.


249-250: Checkpoint logic for higher-version nodes.

This block covers additional logic for checkpointing nodes with versions above the last checkpoint. It’s a good safeguard that ensures older versions remain unaltered.


253-254: Recursively hash left node in checkpointing.

By calling tree.deepHash(node.leftNode, depth+1), you confirm the correct versioned hash for the left subtree. Make sure to handle nil or partially loaded nodes in future expansions.


256-257: Recursively hash right node for checkpointing.

Hashing the right subtree in a similar manner ensures consistency across all branches.


260-262: Accumulate updated branches in checkpoint mode.

Appending to tree.branches for nodes past the last checkpoint is key to capturing new changes in checkpoint data.


266-267: Flag node for eviction under eviction depth.

Setting node.evict = true triggers node cleanup if depth >= tree.evictionDepth. Carefully confirm that subsequent read operations can handle an evicted node gracefully.


284-284: Return eviction status.

Returning evict ensures consistent upstream handling of memory cleanup. This is a clear and concise approach.


393-393: Create a new leaf node when inserting to the right.

Explicitly calling tree.NewLeafNode(key, value) clarifies the creation path and ensures new leaf metadata is consistently applied.


401-401: Add node to orphan tracking before mutation.

Calling tree.addOrphan(node) helps capture the node’s previous state. This is essential for consistent rollback or deletion logic.


403-403: Conditional replay hash assignment.

Assigning node.hash = value under tree.isReplaying is a clever approach to skip normal hashing while restoring historical states.


411-412: Optionally drop leaf values.

Clearing node.value if !tree.storeLeafValues can reduce storage overhead, though make sure no code path relies on node.value post creation.


416-416: Return updated node and a boolean for set operations.

Returning (node, true, nil) clarifies that an update occurred. This keeps method signatures consistent across your tree mutation logic.


470-470: Increment delete counter on removal.

tree.metrics.IncrCounter(1, metricsNamespace, "tree_delete") helps track the frequency of node deletions. This is useful for monitoring churn in the tree.

v2/sqlite.go (10)

28-29: Include logger and metrics proxy in DbOptions.

Embedding configurable Logger and Metrics in SqliteDbOptions offers transparent customization for logging and instrumentation without forcing specific implementations.


69-71: Default to a no-op metrics if none is provided.

These lines ensure the code won’t panic when metrics are omitted, simplifying usage for minimal or test setups.


74-76: Inject a no-op logger when none is given.

Similar to metrics, providing a fallback NewNopLogger() spares the rest of the code from nil checks.


145-145: Assign metrics proxy to SqliteDb.

This aligns with your broader move to metrics.Proxy across the codebase, simplifying instrumentation usage.


313-316: Measure duration in getLeaf and increment counters.

Tracking query times and incrementing db_get_leaf fosters better visibility into query performance. Worth verifying you’re not double-counting in any higher-level instrumentation.


351-351: Introduce getNode for branches.

Splitting leaf vs. branch fetch logic fosters clarity.


710-713: Use a dedicated function to detect leaf keys (isLeafSeq).

This check clarifies the difference between leaf and branch node sequences, preventing accidental cross-handling of node types.


715-717: Guard against child lookups on leaf nodes.

Returning an error if someone attempts to traverse children of a leaf is a good safeguard.


719-723: Fetch right child based on sequence type.

Switching logic to getLeaf or getNode according to isLeafSeq affirms your type distinction.


748-749: Use descriptive error message for left-node retrieval.

Raising an error with context (node_key, height, path) helps debugging node retrieval issues.

v2/cmd/root.go (2)

4-4: Import the new bench command.

Adding "github.com/cosmos/iavl/v2/cmd/bench" extends the CLI with benchmarking capabilities.


23-23: Expose bench command via root CLI.

Placing bench.Command() in the main CLI expands user options for performance testing without requiring separate tooling.

v2/iterator_test.go (1)

16-17: LGTM! Good use of DefaultTreeOptions().

The change to use DefaultTreeOptions() improves maintainability and consistency across tests.

Also applies to: 230-231, 246-246

v2/iterator.go (2)

119-120: LGTM! Consistent database access pattern in stepAscend.

The changes correctly update the node retrieval calls to use the SQL database directly, aligning with the new architecture.

Also applies to: 127-128


170-171: LGTM! Consistent database access pattern in stepDescend.

The changes maintain consistency with the stepAscend method by using direct SQL database access for node retrieval.

Also applies to: 178-179

v2/multitree.go (2)

35-37: LGTM! Proper initialization of metrics.

The code ensures metrics are always initialized by providing a default NilMetrics implementation.


123-138: LGTM! Well-encapsulated tree creation logic.

The newTree method properly encapsulates tree creation logic, ensuring consistent initialization and error handling.

v2/sqlite_writer.go (2)

444-445: LGTM! Comprehensive metrics implementation.

The changes properly track database write operations with both timing and count metrics.

Also applies to: 469-471


446-452: LGTM! Proper checkpoint handling.

The code correctly handles checkpointing by retrieving the next shard ID when required.

v2/node.go (2)

67-81: LGTM! Well-designed accessor methods.

The new methods provide a clean interface for accessing node properties, improving code readability and maintainability.


119-147: LGTM! Consistent database access pattern.

The getLeftNode and getRightNode methods correctly use the SqliteDb parameter for database access, aligning with the new architecture.

v2/migrate/go.mod (3)

3-3: Updated Go Version

The module now specifies go 1.21.0. This update ensures that the codebase can leverage the latest language improvements and performance optimizations available in Go 1.21.0.


5-5: Specified Toolchain

The addition of toolchain go1.23.6 enforces a consistent build environment by explicitly setting the toolchain version. This is helpful for ensuring reproducibility across different developer setups and CI systems.


58-58: Formatting Consistency

The trailing newline and overall formatting adjustments in the require block are in line with standard Go module practices.

Comment on lines +13 to +23
leafWrite *sqlite3.Conn
treeWrite *sqlite3.Conn
version int64
writeQueue *writeQueue
storeLatestLeaves bool
returnNode func(*Node)
shardID int64

size int64
logger Logger
metrics metrics.Proxy
⚠️ Potential issue

Check uninitialized fields for sqliteBatch.

Several new fields (version, writeQueue, metrics, etc.) are introduced. Validate that each is always set before usage. Particularly, writeQueue and metrics must not be nil if used. Potential nil pointer dereference or unintended behavior could occur otherwise.
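
A sketch of defensive defaults applied when a batch is constructed; the constructor name is hypothetical, while the field names come from the struct above:

func newSqliteBatch(b *sqliteBatch) *sqliteBatch {
    if b.writeQueue == nil {
        b.writeQueue = &writeQueue{}
    }
    if b.metrics == nil {
        b.metrics = &metrics.NilMetrics{} // no-op proxy introduced in this PR
    }
    if b.logger == nil {
        b.logger = NewNopLogger()
    }
    return b
}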

Comment on lines +233 to 234
return len(b.writeQueue.branches) > 0
}
🛠️ Refactor suggestion

Checkpoint check is simplified.

isCheckpoint() returns len(b.writeQueue.branches) > 0. If checkpoint logic or multi-factor conditions are needed (e.g. orphan branches, large sets of leaves), consider centralizing predictable checkpoint conditions to avoid accidental partial checkpoints.
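
A sketch of centralizing every condition that forces a checkpoint in one predicate; the leaf threshold and orphan condition are illustrative, not part of the current logic:

const leafCheckpointThreshold = 1 << 20 // hypothetical

func (b *sqliteBatch) isCheckpoint() bool {
    q := b.writeQueue
    return len(q.branches) > 0 ||
        len(q.orphans) > 0 || // hypothetical extra condition
        len(q.leaves) > leafCheckpointThreshold
}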

Comment on lines +81 to +89
// waitBatch waits for previous batch (if any) to finish.
func (i *Importer) waitBatch() error {
var err error
if i.inflightCommit != nil {
err = <-i.inflightCommit
i.inflightCommit = nil
}
return err
}
🛠️ Refactor suggestion

Return errors from concurrent commits during Close().
Close() discards potential errors returned via the inflight commit channel. If a commit fails, the caller never receives that error. Consider returning or logging any commit error to avoid silent failures.

 func (i *Importer) Close() error {
     if i.inflightCommit != nil {
-        <-i.inflightCommit
+        inflightErr := <-i.inflightCommit
+        if inflightErr != nil {
+            return inflightErr
+        }
         i.inflightCommit = nil
     }
     i.batch = nil
     i.tree = nil
-    return
+    return nil
 }

Comment on lines +91 to +130
// writeNode writes the node content to the storage.
func (i *Importer) writeNode(node *Node) error {
node._hash()

buf := bufPool.Get().(*bytes.Buffer)
buf.Reset()
defer bufPool.Put(buf)

if err := node.WriteBytes(buf); err != nil {
return err
}

bytesCopy := make([]byte, buf.Len())
copy(bytesCopy, buf.Bytes())

if node.isLeaf() {
i.batch.writeQueue.leaves = append(i.batch.writeQueue.leaves, node)
} else {
i.batch.writeQueue.branches = append(i.batch.writeQueue.branches, node)
}

i.batchSize++
if i.batchSize >= importBatchSize {
err := i.waitBatch()
if err != nil {
return err
}
result := make(chan error)
i.inflightCommit = result
go func(q *writeQueue) {
_, leafErr := i.batch.saveLeaves()
_, branchErr := i.batch.saveBranches()
result <- errors.Join(leafErr, branchErr)
}(i.batch.writeQueue)
i.batch.writeQueue = &writeQueue{}
i.batchSize = 0
}

return nil
}
⚠️ Potential issue

Avoid capturing i.batch inside goroutine to reduce data race risk.
The goroutine references i.batch.saveLeaves() and i.batch.saveBranches() directly instead of the local parameter q. Overwriting i.batch.writeQueue on the main thread while the goroutine operates could introduce data races. Pass the necessary references into the goroutine to ensure concurrency safety.

 go func(q *writeQueue) {
-    _, leafErr := i.batch.saveLeaves()
-    _, branchErr := i.batch.saveBranches()
+    tempBatch := i.batch
+    _, leafErr := tempBatch.saveLeaves()
+    _, branchErr := tempBatch.saveBranches()
     result <- errors.Join(leafErr, branchErr)
 }(i.batch.writeQueue)

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +150 to +158
func (p *prometheusMetricsProxy) SetGauge(val float32, keys ...string) {
k := keys[1]
switch k {
case "working_size":
p.workingSize.Set(float64(val))
case "working_bytes":
p.workingBytes.Set(float64(val))
}
}
⚠️ Potential issue

Add bounds check for array access.

The function assumes keys[1] exists without checking the length of the keys slice.

 func (p *prometheusMetricsProxy) SetGauge(val float32, keys ...string) {
+    if len(keys) < 2 {
+        return
+    }
     k := keys[1]
     switch k {
     case "working_size":
         p.workingSize.Set(float64(val))
     case "working_bytes":
         p.workingBytes.Set(float64(val))
     }
 }

Comment on lines +138 to +143
go func() {
err := http.ListenAndServe(":2112", nil)
if err != nil {
panic(err)
}
}()
⚠️ Potential issue

Handle HTTP server errors gracefully.

The current implementation panics on HTTP server errors. Consider logging the error and implementing a more graceful shutdown.

 go func() {
-    err := http.ListenAndServe(":2112", nil)
-    if err != nil {
-        panic(err)
-    }
+    if err := http.ListenAndServe(":2112", nil); err != nil && err != http.ErrServerClosed {
+        logger.Error("prometheus metrics server failed", "error", err)
+    }
 }()
