Why
When deploying a new TMail with the Redis event bus from scratch, /healthcheck fails with a 500 error code.
Error log:
{
"timestamp": "2025-02-20T04:59:56.453Z",
"level": "ERROR",
"thread": "qtp829000452-191",
"mdc": {
"host": "localhost:8000",
"verb": "GET",
"action": "/healthcheck",
"protocol": "webadmin"
},
"logger": "spark.http.matching.GeneralError",
"message": "",
"context": "default",
"exception": "com.github.fge.lambdas.ThrownByLambdaException: java.io.IOException at com.github.fge.lambdas.predicates.ThrowingPredicate.test(ThrowingPredicate.java:27) at java.base/java.util.stream.ReferencePipeline$2$1.accept(Unknown Source) at java.base/java.util.stream.ReferencePipeline$3$1.accept(Unknown Source) at java.base/java.util.stream.ReferencePipeline$3$1.accept(Unknown Source) at java.base/java.util.stream.Streams$StreamBuilderImpl.tryAdvance(Unknown Source) at java.base/java.util.stream.Streams$ConcatSpliterator.tryAdvance(Unknown Source) at java.base/java.util.stream.ReferencePipeline.forEachWithCancel(Unknown Source) at java.base/java.util.stream.AbstractPipeline.copyIntoWithCancel(Unknown Source) at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source) at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source) at java.base/java.util.stream.FindOps$FindOp.evaluateSequential(Unknown Source) at java.base/java.util.stream.AbstractPipeline.evaluate(Unknown Source) at java.base/java.util.stream.ReferencePipeline.findAny(Unknown Source) at org.apache.james.events.RabbitEventBusConsumerHealthCheck.check(RabbitEventBusConsumerHealthCheck.java:74) at org.apache.james.events.RabbitEventBusConsumerHealthCheck.lambda$check$0(RabbitEventBusConsumerHealthCheck.java:60) at com.github.fge.lambdas.functions.FunctionChainer.doApply(FunctionChainer.java:20) at com.github.fge.lambdas.functions.ThrowingFunction.apply(ThrowingFunction.java:17) at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106) at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99) at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onNext(FluxRetryWhen.java:178) at reactor.core.publisher.MonoSubscribeOn$SubscribeOnSubscriber.onNext(MonoSubscribeOn.java:146) at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571) at 
reactor.core.publisher.MonoSubscribeOn$SubscribeOnSubscriber.trySchedule(MonoSubscribeOn.java:189) at reactor.core.publisher.MonoSubscribeOn$SubscribeOnSubscriber.onSubscribe(MonoSubscribeOn.java:134) at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:53) at reactor.core.publisher.Mono.subscribe(Mono.java:4568) at reactor.core.publisher.MonoSubscribeOn$SubscribeOnSubscriber.run(MonoSubscribeOn.java:126) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) Suppressed: com.rabbitmq.client.AlreadyClosedException: channel is already closed due to channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no queue 'mailboxEvent-workQueue-org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup' in vhost 'tmail', class-id=50, method-id=10) at com.rabbitmq.client.impl.AMQChannel.processShutdownSignal(AMQChannel.java:437) at com.rabbitmq.client.impl.ChannelN.startProcessShutdownSignal(ChannelN.java:295) at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:624) at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:557) at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:550) at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.lambda$close$0(AutorecoveringChannel.java:74) at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.executeAndClean(AutorecoveringChannel.java:102) at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.close(AutorecoveringChannel.java:74) at 
org.apache.james.events.RabbitEventBusConsumerHealthCheck.lambda$check$0(RabbitEventBusConsumerHealthCheck.java:59) ... 20 common frames omitted Suppressed: java.lang.Exception: #block terminated with an error at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:104) at reactor.core.publisher.Mono.block(Mono.java:1779) at org.apache.james.webadmin.routes.HealthCheckRoutes.validateHealthChecks(HealthCheckRoutes.java:128) at spark.ResponseTransformerRouteImpl$1.handle(ResponseTransformerRouteImpl.java:47) at spark.http.matching.Routes.execute(Routes.java:61) at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:134) at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1598) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:516) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) ... 1 common frames omitted Caused by: java.io.IOException: null at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:140) at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:136) at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:158) at com.rabbitmq.client.impl.ChannelN.queueDeclarePassive(ChannelN.java:1033) at com.rabbitmq.client.impl.ChannelN.consumerCount(ChannelN.java:1052) at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.consumerCount(AutorecoveringChannel.java:377) at org.apache.james.events.RabbitEventBusConsumerHealthCheck.lambda$check$1(RabbitEventBusConsumerHealthCheck.java:73) at com.github.fge.lambdas.predicates.PredicateChainer.doTest(PredicateChainer.java:21) at com.github.fge.lambdas.predicates.ThrowingPredicate.test(ThrowingPredicate.java:23) ... 34 common frames omitted"
}
Reason: RabbitEventBusConsumerHealthCheck relies on the hardcoded James GroupRegistrationHandlerGroup and therefore expects the queue mailboxEvent-workQueue-org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup to exist.
However, the Redis event bus relies on its own TmailGroupRegistrationHandler, which results in the queue mailboxEvent-workQueue-org.apache.james.events.TmailGroupRegistrationHandler$GroupRegistrationHandlerGroup instead.
As a consequence, the /healthcheck webadmin endpoint fails with a 500 error, likely because RabbitEventBusConsumerHealthCheck asserts that the James group queue always exists.
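To make the mismatch concrete, here is a minimal sketch of how the two queue names diverge. The naming scheme (prefix, then "-workQueue-", then the group class name) is only inferred from the queue names quoted above, and the class WorkQueueNameSketch and its workQueueName helper are hypothetical, not part of James or TMail.

```java
// Illustrative sketch only: WorkQueueNameSketch and workQueueName are
// hypothetical helpers, not the James/TMail API.
public class WorkQueueNameSketch {
    // Group class names as they appear in the queue names quoted in this issue.
    static final String JAMES_GROUP =
        "org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup";
    static final String TMAIL_GROUP =
        "org.apache.james.events.TmailGroupRegistrationHandler$GroupRegistrationHandlerGroup";

    // Assumed naming scheme, inferred from the log: <prefix>-workQueue-<group class name>.
    static String workQueueName(String prefix, String groupClassName) {
        return prefix + "-workQueue-" + groupClassName;
    }

    public static void main(String[] args) {
        String lookedUp = workQueueName("mailboxEvent", JAMES_GROUP);
        String declared = workQueueName("mailboxEvent", TMAIL_GROUP);
        System.out.println("health check looks up:    " + lookedUp);
        System.out.println("Redis event bus declares: " + declared);
        System.out.println("same queue? " + lookedUp.equals(declared));
    }
}
```

Because the group class name is part of the queue name, any event bus that registers under a different group class ends up with a queue the health check never looks at.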
I took the chance to review the Redis event bus more deeply. I spotted that when we use RabbitMQAndRedisEventBus, we create these unused queues by starting RabbitMQEventBus and registering some dedicated listeners:
jmapEvent-workQueue-org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup
emailAddressContactEvent-workQueue-org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup
And... ScheduledReconnectionHandler is checking the James group queues, not the TMail group queues, cf https://github.com/linagora/tmail-backend/blob/master/tmail-backend/guice/distributed/src/main/java/com/linagora/tmail/ScheduledReconnectionHandler.java#L322.
How
I propose to refactor a bit:
RabbitMQAndRedisEventBus should use the same group as RabbitMQEventBus, which results in the same group queue name as James. This makes sense IMO as the group handling is the same.
Refactor the Guice modules JMAPEventBusModule and RabbitMQEventBusModule so we can split out the RabbitMQEventBus starting part. Then, when we use RabbitMQAndRedisEventBus, we won't start RabbitMQEventBus and create unused queues and unused consumers.
DoD
In DistributedServerWithRedisEventBusKeysTest, /healthcheck passes.
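The failure and the expected outcome can be pictured with a small simulation. This is a simplified model, not the actual RabbitEventBusConsumerHealthCheck code: QueueInspector, FakeBroker and allQueuesConsumed are hypothetical names, and the fake broker only mimics the behaviour seen in the stack trace, where looking up the consumer count of a missing queue raises an IOException.

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;

// Simplified model of the consumer health check; all names here are hypothetical.
public class ConsumerHealthCheckSketch {
    static final String JAMES_QUEUE =
        "mailboxEvent-workQueue-org.apache.james.events.GroupRegistrationHandler$GroupRegistrationHandlerGroup";
    static final String TMAIL_QUEUE =
        "mailboxEvent-workQueue-org.apache.james.events.TmailGroupRegistrationHandler$GroupRegistrationHandlerGroup";

    // Stand-in for a broker channel: returns the consumer count of a queue,
    // or throws if the queue does not exist (as the 404 in the log suggests).
    interface QueueInspector {
        long consumerCount(String queueName) throws IOException;
    }

    static class FakeBroker implements QueueInspector {
        private final Map<String, Long> consumersByQueue;

        FakeBroker(Map<String, Long> consumersByQueue) {
            this.consumersByQueue = consumersByQueue;
        }

        @Override
        public long consumerCount(String queueName) throws IOException {
            Long count = consumersByQueue.get(queueName);
            if (count == null) {
                throw new IOException("NOT_FOUND - no queue '" + queueName + "'");
            }
            return count;
        }
    }

    // Healthy only if every expected queue exists and has at least one consumer.
    static boolean allQueuesConsumed(QueueInspector broker, List<String> expectedQueues)
            throws IOException {
        for (String queue : expectedQueues) {
            if (broker.consumerCount(queue) == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Today: only the TMail group queue is declared, but the check expects the James one.
        QueueInspector before = new FakeBroker(Map.of(TMAIL_QUEUE, 1L));
        try {
            allQueuesConsumed(before, List.of(JAMES_QUEUE));
        } catch (IOException e) {
            System.out.println("check fails: " + e.getMessage());
        }

        // After aligning the group: the declared queue carries the James group name.
        QueueInspector after = new FakeBroker(Map.of(JAMES_QUEUE, 1L));
        System.out.println("healthy after alignment: "
            + allQueuesConsumed(after, List.of(JAMES_QUEUE)));
    }
}
```

In this model the check only succeeds once the queue declared by the event bus and the queue the check looks up share the same group name, which is what the proposed refactoring aims for.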