  1. Spring Data Redis
  2. DATAREDIS-1056

RedisCommandTimeoutException on disrupted connection doesn't make the client recover

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.1.11 (Lovelace SR11)
    • Fix Version/s: None
    • Component/s: Lettuce Driver
    • Labels:
      None

      Description

      I am running into a RedisCommandTimeoutException that is caused by a disrupted connection from which the client does not recover. Every request that uses that connection fails with the same error until the client discards it. The only workarounds are to restart the application, or to wait until the connection is evicted by the idle timeout (if one is configured) or until the client finally detects that the connection is no longer healthy.

      My guess is that the connection is disrupted in a way that is not visible at the TCP level, so the client still believes it is healthy. Recovery can therefore take quite a while, which is not acceptable. I am wondering whether either of the following is possible:

      • Does Spring Data Redis, or the underlying Lettuce driver, offer a setting that validates a connection before a command is sent, or a background task that does the same job (a kind of validation query)?
      • Can the connection be discarded when a specific exception is raised, for example through a custom hook?
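
To make the two ideas above concrete, here is a minimal, library-independent sketch of a pool that validates a connection with a ping before handing it out and drops it when a command times out. Everything in it (Conn, CommandTimedOut, ValidatingPool, FakeConn, PoolDemo) is a hypothetical name made up for illustration. None of it is a Spring Data Redis, Lettuce, or Commons Pool type, and it does not claim to show how those libraries behave.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical unchecked timeout, standing in for a command timeout error. */
class CommandTimedOut extends RuntimeException {
    CommandTimedOut(String message) { super(message); }
}

/** Hypothetical stand-in for a Redis connection. */
interface Conn {
    void ping();             // round-trip health check; throws CommandTimedOut if no reply
    String get(String key);  // throws CommandTimedOut if no reply
    void close();
}

/** Sketch of a pool that validates on borrow and discards on timeout. */
final class ValidatingPool {
    private final Deque<Conn> idle = new ArrayDeque<>();
    private final Supplier<Conn> factory;

    ValidatingPool(Supplier<Conn> factory) { this.factory = factory; }

    /** Idea 1: ping each idle connection before handing it out. */
    Conn borrow() {
        Conn c;
        while ((c = idle.poll()) != null) {
            try {
                c.ping();
                return c;        // answered the ping, so it is usable
            } catch (CommandTimedOut e) {
                c.close();       // half-open or otherwise dead: drop it
            }
        }
        return factory.get();    // no healthy idle connection, create a new one
    }

    void release(Conn c) { idle.push(c); }

    /** Idea 2: a connection that timed out is closed, never returned to the pool. */
    String get(String key) {
        Conn c = borrow();
        try {
            String value = c.get(key);
            release(c);
            return value;
        } catch (CommandTimedOut e) {
            c.close();
            throw e;
        }
    }

    int idleCount() { return idle.size(); }
}

/** Test double: 'broken' simulates a connection that looks open but never answers. */
final class FakeConn implements Conn {
    boolean broken;
    boolean closed;

    public void ping() { if (broken) throw new CommandTimedOut("no PONG"); }

    public String get(String key) {
        if (broken) throw new CommandTimedOut("command timed out");
        return "value-of-" + key;
    }

    public void close() { closed = true; }
}

final class PoolDemo {
    public static void main(String[] args) {
        ValidatingPool pool = new ValidatingPool(FakeConn::new);
        FakeConn first = new FakeConn();
        pool.release(first);
        first.broken = true;                // goes bad silently while idle
        System.out.println(pool.get("k"));  // served by a freshly created connection
        System.out.println(first.closed);   // the stale one was discarded
    }
}
```

The trade-off in this sketch is one extra round trip per borrow in exchange for never handing out a connection that cannot answer. Whether the real driver or pool can be configured to do something equivalent is exactly the question asked above.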

      Just a little bit of context:

      • Spring WebFlux with Spring Boot 2.1.9
      • Lettuce 5.1.8
      • Commons Pool2 2.6.2
      • Redis Server 4.0.14 (over SSL). The server does not run in the same private network as the application, but it is in the same zone. It is a basic cloud service used only for development and testing, which may make the connection less than fully reliable. Even so, in my opinion this should not be a problem for a connection pool manager that handles its connections properly.

      Configuration:

      spring:
        redis:
          url: ${REDIS_URL:redis://127.0.0.1:6379}
          database: ${REDIS_DATABASE:0}
          timeout: 10s
          lettuce:
            pool:
              max-active: 50
              max-idle: 10
              max-wait: 5s
              min-idle: 5
              time-between-eviction-runs: 5m
      

      The connection pool itself works fine: with netstat I verified that a minimum of 5 connections is kept open, and that they are discarded and then re-created when the idle object evictor thread runs.

      In addition to the configuration above, I set another client option (pingBeforeActivateConnection):

      @Bean
      public LettuceClientConfigurationBuilderCustomizer lettuceClientConfigurationBuilderCustomizer() {
          ClientOptions clientOptions = ClientOptions.builder()
                  // enable command timeouts
                  .timeoutOptions(TimeoutOptions.enabled())
                  // send a PING while the connection is being established, before it is activated
                  .pingBeforeActivateConnection(true)
                  .build();
          return clientConfigurationBuilder -> clientConfigurationBuilder
                  .clientOptions(clientOptions);
      }
      

      Exception:

      org.springframework.dao.QueryTimeoutException: Redis command timed out; nested exception is io.lettuce.core.RedisCommandTimeoutException: Command timed out after 10 second(s)
      	at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:70) ~[spring-data-redis-2.1.11.RELEASE.jar!/:2.1.11.RELEASE]
      	at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41) ~[spring-data-redis-2.1.11.RELEASE.jar!/:2.1.11.RELEASE]
      	at org.springframework.data.redis.connection.lettuce.LettuceReactiveRedisConnection.lambda$translateException$1(LettuceReactiveRedisConnection.java:283) ~[spring-data-redis-2.1.11.RELEASE.jar!/:2.1.11.RELEASE]
      	at reactor.core.publisher.Flux.lambda$onErrorMap$24(Flux.java:6237) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:88) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onError(MonoFlatMapMany.java:247) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:126) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:126) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:87) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
      	at io.lettuce.core.RedisPublisher$ImmediateSubscriber.onError(RedisPublisher.java:905) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.lettuce.core.RedisPublisher$State.onError(RedisPublisher.java:703) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.lettuce.core.RedisPublisher$RedisSubscription.onError(RedisPublisher.java:349) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.lettuce.core.RedisPublisher$SubscriptionCommand.onError(RedisPublisher.java:815) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.lettuce.core.RedisPublisher$SubscriptionCommand.completeExceptionally(RedisPublisher.java:809) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.lettuce.core.protocol.CommandExpiryWriter.lambda$potentiallyExpire$0(CommandExpiryWriter.java:167) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.39.Final.jar!/:4.1.39.Final]
      	at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
      Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 10 second(s)
      	at io.lettuce.core.ExceptionFactory.createTimeoutException(ExceptionFactory.java:51) ~[lettuce-core-5.1.8.RELEASE.jar!/:na]
      	... 8 common frames omitted
      

       


            People

            • Assignee:
              mp911de Mark Paluch
              Reporter:
              cmario cmario
              Last updater:
              Christoph Strobl
