CORE-20867 Implement retry topic to handle persistent transient RPC Client errors #6385

Merged 19 commits on Nov 14, 2024

@LWogan LWogan commented Nov 11, 2024

Design: https://github.com/corda/platform-eng-design/pull/658
API: corda/corda-api#1710

The current mediator messaging pattern in Corda can enter a retry loop when it receives transient errors from other Corda workers. This retry loop blocks flow topic partitions from progressing, and the affected Corda cluster has been observed to become permanently unstable due to the resulting consumer lag. The flow worker uses this pattern to perform synchronous HTTP calls to various workers, including the verification, token, crypto, uniqueness, and persistence workers.

To address this issue, a separate Kafka topic is dedicated to handling retries. This will allow the primary ingestion topics to continue processing unaffected flows, while introducing finite retry logic for flows impacted by transient errors.
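
As a rough illustration of the idea only, the sketch below simulates the two topics with in-memory queues. It is not the code in this PR: the names `Event`, `TransientError`, `drain`, and `MAX_RETRIES`, the retry limit of 3, and the choice to mark a flow as failed once retries are exhausted are all assumptions made for the example, and real Kafka consumers and producers are replaced by `deque` objects.

```python
from collections import deque
from dataclasses import dataclass

MAX_RETRIES = 3  # hypothetical limit; the PR does not state the real value


class TransientError(Exception):
    """Stands in for a transient failure of an RPC call to another worker."""


@dataclass
class Event:
    flow_id: str
    attempts: int = 0  # number of retries made from the retry topic


def drain(main_topic, retry_topic, call_worker, max_retries=MAX_RETRIES):
    """Process both queues and return (completed, failed) flow ids."""
    completed, failed = [], []

    # Main topic: a transient error moves the event to the retry topic
    # instead of retrying in place, so later events are not blocked.
    while main_topic:
        event = main_topic.popleft()
        try:
            call_worker(event)
            completed.append(event.flow_id)
        except TransientError:
            retry_topic.append(event)

    # Retry topic: each event gets a finite number of further attempts.
    while retry_topic:
        event = retry_topic.popleft()
        event.attempts += 1
        try:
            call_worker(event)
            completed.append(event.flow_id)
        except TransientError:
            if event.attempts >= max_retries:
                # Assumed outcome once retries are exhausted.
                failed.append(event.flow_id)
            else:
                retry_topic.append(event)

    return completed, failed
```

In this simplified model, an event that keeps failing only ever delays other events on the retry queue, and it leaves the system after at most `max_retries` further attempts, which mirrors the two properties described above: unaffected flows keep progressing, and retries are finite.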

Additionally, the Avro version is bumped to fix a vulnerability.

corda-jenkins-ci02 bot commented Nov 11, 2024

Jenkins build for PR 6385 build 14

Build Successful:
Jar artifact version produced by this PR: 5.2.1.0-alpha-1731589624233
Helm chart version produced by this PR: 5.2.1-alpha.1731589624233
Helm chart pushed to: oci://corda-os-docker-dev.software.r3.com/helm-charts/pr-6385/corda
Helm chart Polaris score: 82

@LWogan LWogan requested a review from conalsmith-r3 November 14, 2024 12:42

sonarcloud bot commented Nov 14, 2024

@LWogan LWogan merged commit 2434c4a into release/os/5.2 Nov 14, 2024
5 checks passed
@LWogan LWogan deleted the lorcan/CORE-20867/retry-topic-update branch November 14, 2024 13:31