Improve exception handling when master node is missing remote_cluster_client role #121149

Open
valeriy42 opened this issue Jan 29, 2025 · 1 comment
Labels
>bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Coordination Meta label for Distributed Coordination team

Comments

@valeriy42
Contributor

Elasticsearch Version

8.18.0, 9.0.0

Installed Plugins

No response

Java Version

bundled

OS Version

Linux

Problem Description

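When the master node does not have the remote_cluster_client role and CCS (cross-cluster search) is used, RemoteClusterLicenseChecker.remoteClusterAliases fails with a no_such_remote_cluster_exception. This is misleading: the error names a missing remote cluster, while the actual cause is the missing role on the master node.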

There's probably a bug in ClusterNameExpressionResolver. We should report this case as an IllegalArgumentException, as RemoteClusterService does.
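To make the requested behavior concrete, here is a minimal, self-contained sketch of a resolver that checks the role first and only then looks up the alias. It is not the Elasticsearch code: the class names RoleAwareAliasResolver and UnknownRemoteAliasException, the constructor arguments, and the exception messages are all hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the "alias is not configured" failure. */
class UnknownRemoteAliasException extends RuntimeException {
    UnknownRemoteAliasException(String alias) {
        super("no such remote cluster: [" + alias + "]");
    }
}

/** Illustrative resolver; NOT the real ClusterNameExpressionResolver. */
class RoleAwareAliasResolver {
    private final boolean remoteClusterClientEnabled;
    private final Set<String> configuredAliases;

    RoleAwareAliasResolver(boolean remoteClusterClientEnabled, Set<String> configuredAliases) {
        this.remoteClusterClientEnabled = remoteClusterClientEnabled;
        this.configuredAliases = configuredAliases;
    }

    List<String> resolve(String alias) {
        // Check the role first, so a missing role is never reported as a missing cluster.
        if (remoteClusterClientEnabled == false) {
            throw new IllegalArgumentException(
                "this node does not have the remote_cluster_client role; cannot resolve [" + alias + "]"
            );
        }
        // Only a node that can act as a remote cluster client reports an unknown alias.
        if (configuredAliases.contains(alias) == false) {
            throw new UnknownRemoteAliasException(alias);
        }
        return List.of(alias);
    }
}

class ResolverDemo {
    public static void main(String[] args) {
        // A node without the role sees no remote clusters, as in the reported scenario.
        RoleAwareAliasResolver master = new RoleAwareAliasResolver(false, Set.of());
        try {
            master.resolve("my-remote-cluster-alias");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this ordering, a caller on a node without the role gets an error that points at the node configuration, and the "no such remote cluster" message is reserved for aliases that are genuinely not configured.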

Steps to Reproduce

  1. Create a cluster where the master node doesn't have the remote_cluster_client role, while the ML node does (probably an ECK config)
  2. Create a remote cluster with the sample index
  3. Create an anomaly detection job using CCS from the remote index
  4. Open the job and start the datafeed. The error occurs on the datafeed start.

Logs (if relevant)

Here is an example error stack trace:

POST _ml/datafeeds/datafeed-test-job/_start?error_trace

{
"error": {
"root_cause": [
{
"type": "no_such_remote_cluster_exception",
"reason": "no such remote cluster: [my-remote-cluster-alias]",
"stack_trace": """org.elasticsearch.transport.NoSuchRemoteClusterException: no such remote cluster: [my-remote-cluster-alias]
at org.elasticsearch.cluster.metadata.ClusterNameExpressionResolver.resolveClusterNames(ClusterNameExpressionResolver.java:42)
at org.elasticsearch.license.RemoteClusterLicenseChecker.lambda$remoteClusterAliases$1(RemoteClusterLicenseChecker.java:278)
at java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:289)
at java.util.stream.DistinctOps$1$2.accept(DistinctOps.java:174)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:215)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:197)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1709)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:727)
at org.elasticsearch.license.RemoteClusterLicenseChecker.remoteClusterAliases(RemoteClusterLicenseChecker.java:280)
at org.elasticsearch.xpack.ml.action.TransportStartDatafeedAction.lambda$masterOperation$3(TransportStartDatafeedAction.java:233)
at org.elasticsearch.xpack.ml.action.TransportStartDatafeedAction.lambda$masterOperation$4(TransportStartDatafeedAction.java:283)
at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257)
at org.elasticsearch.xpack.ml.job.persistence.JobConfigProvider.parseJobLenientlyFromSource(JobConfigProvider.java:763)
at org.elasticsearch.xpack.ml.job.persistence.JobConfigProvider$1.onResponse(JobConfigProvider.java:175)
at org.elasticsearch.xpack.ml.job.persistence.JobConfigProvider$1.onResponse(JobConfigProvider.java:166)
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:203)
at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:197)
at org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:336)
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
at org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:97)
at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:49)
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1500)
at org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:434)
at org.elasticsearch.transport.InboundHandler$2.doRun(InboundHandler.java:391)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.lang.Thread.run(Thread.java:1575)
@valeriy42 valeriy42 added >bug needs:triage Requires assignment of a team area label Team:Distributed Coordination Meta label for Distributed Coordination team and removed needs:triage Requires assignment of a team area label labels Jan 29, 2025
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Distributed Coordination Meta label for Distributed Coordination team labels Jan 29, 2025
@valeriy42 valeriy42 added :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Coordination Meta label for Distributed Coordination team and removed needs:triage Requires assignment of a team area label labels Jan 29, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
