  1. Spring AMQP
  2. AMQP-750

Add an option to treat PossibleAuthenticationFailureException as non-fatal

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Complete
    • Affects Version/s: None
    • Fix Version/s: 2.0 M5, 1.7.4
    • Component/s: RabbitMQ
    • Labels:

      Description

      Using RabbitMQ, we have intermittent network issues of unknown origin that sometimes trigger PossibleAuthenticationFailureExceptions in spring-rabbit. These are not actual authentication failures: the client can successfully reconnect after retrying.
      The problem is that, since these exceptions are considered fatal, the client doesn't retry and the container is simply stopped.

      It would be useful to keep actual AuthenticationFailureExceptions fatal but make it configurable whether a PossibleAuthenticationFailureException is fatal or not, as is already done for QueuesNotAvailableException.

        Activity

        abilan Artem Bilan added a comment -

        1. How can you know that "the client can successfully reconnect after retrying" if "client doesn't retry and the container is just stopped"?

        2. Would you mind sharing a stack trace and logs on the matter? With that information we might be able to identify the problem and introduce a new exception type that won't be fatal in this case.

        Thanks

        rkasbachpictet Romain Kasbach added a comment - - edited

        1. I've made that assumption because:

        • restarting my application solves the problem
        • other quasi-identical applications using the same version of spring-rabbit with the same credentials remain connected to the same RabbitMQ server when this happens to one of them

        2. Sure, here's one:

        [2017-05-19T08:38:10.475+0200] (inner bean)#32dbc145-4 ERROR o.s.a.r.l.SimpleMessageListenerContainer - - Consumer received fatal exception on startup
        org.springframework.amqp.rabbit.listener.exception.FatalListenerStartupException: Authentication failure
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:476)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1280)
        at java.lang.Thread.run(Thread.java:745)
        Caused by: org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.PossibleAuthenticationFailureException: Possibly caused by authentication failure
        at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:65)
        at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:309)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:547)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createBareChannel(CachingConnectionFactory.java:500)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.getCachedChannelProxy(CachingConnectionFactory.java:474)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.getChannel(CachingConnectionFactory.java:467)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.access$1500(CachingConnectionFactory.java:97)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$ChannelCachingConnectionProxy.createChannel(CachingConnectionFactory.java:1084)
        at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils$1.createChannel(ConnectionFactoryUtils.java:95)
        at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.doGetTransactionalResourceHolder(ConnectionFactoryUtils.java:144)
        at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.getTransactionalResourceHolder(ConnectionFactoryUtils.java:76)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:472)
        ... 2 common frames omitted
        Caused by: com.rabbitmq.client.PossibleAuthenticationFailureException: Possibly caused by authentication failure
        at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:342)
        at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:813)
        at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:725)
        at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:296)
        ... 12 common frames omitted
        Caused by: com.rabbitmq.client.ShutdownSignalException: connection error
        at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
        at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:37)
        at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:367)
        at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:234)
        at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:212)
        at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:327)
        ... 15 common frames omitted
        Caused by: java.io.EOFException: null
        at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290)
        at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
        at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:542)
        ... 1 common frames omitted
        [2017-05-19T08:38:10.482+0200] (inner bean)#32dbc145-4 ERROR o.s.a.r.l.SimpleMessageListenerContainer - - Stopping container from aborted consumer
        

        Also, I've seen this change suggested here: https://groups.google.com/d/msg/rabbitmq-users/x_Sh316o_MA/dFSv-pGL2xIJ

        Thanks for your help

        abilan Artem Bilan added a comment -

        Please check whether this is your case: https://jira.spring.io/browse/INT-2651.

        But I see you are referring to Gary's comment from a couple of years ago:

        We could add an option to treat PossibleAuthenticationFailureException as non-fatal (and keep trying). Feel free to open up a JIRA issue and we'll take a look.

        abilan Artem Bilan added a comment -

        We will take a look at what we can do on the matter; meanwhile, here is a workaround for you.
        That FatalListenerStartupException is handled like this:

        catch (FatalListenerStartupException ex) {
        	logger.error("Consumer received fatal exception on startup", ex);
        	this.startupException = ex;
        	// Fatal, but no point re-throwing, so just abort.
        	aborted = true;
        	publishConsumerFailedEvent("Consumer received fatal exception on startup", true, ex);
        }
        ...
         
        protected final void publishConsumerFailedEvent(String reason, boolean fatal, Throwable t) {
        	if (this.applicationEventPublisher != null) {
        		this.applicationEventPublisher
        				.publishEvent(t == null ? new ListenerContainerConsumerTerminatedEvent(this, reason) :
        						new ListenerContainerConsumerFailedEvent(this, reason, t, fatal));
        	}
        }
        

        I think you can listen for that ListenerContainerConsumerFailedEvent in an ApplicationListener and restart the container according to your own logic.
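
The restart logic behind this workaround could be sketched roughly as follows. This is a library-free sketch, not Spring AMQP code: `DelayedRestarter`, its constructor parameters, and `onFatalFailure` are hypothetical names. In a real application, an `ApplicationListener<ListenerContainerConsumerFailedEvent>` would hand the event's throwable to `onFatalFailure`, and the restart action would be the container's `start()`. The restart is scheduled on another thread after a delay, on the assumption that the container is still stopping when the event is published (the log above shows "Stopping container from aborted consumer" right after the error).

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Sketch only (hypothetical helper, not part of Spring AMQP): schedules a
 * delayed restart when a failure's cause chain contains a given exception type.
 */
final class DelayedRestarter {

    private final ScheduledExecutorService scheduler;
    private final Runnable restartAction;      // e.g. container::start
    private final String retryableCauseName;   // simple class name to match
    private final long delayMillis;

    DelayedRestarter(ScheduledExecutorService scheduler, Runnable restartAction,
            String retryableCauseName, long delayMillis) {
        this.scheduler = scheduler;
        this.restartAction = restartAction;
        this.retryableCauseName = retryableCauseName;
        this.delayMillis = delayMillis;
    }

    /** Returns true if any exception in the cause chain has the given simple class name. */
    static boolean hasCause(Throwable failure, String simpleName) {
        for (Throwable current = failure; current != null; current = current.getCause()) {
            if (current.getClass().getSimpleName().equals(simpleName)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Intended to be called from the event listener with the event's throwable.
     * Returns true if a restart was scheduled, false if the failure was left alone.
     */
    boolean onFatalFailure(Throwable failure) {
        if (!hasCause(failure, retryableCauseName)) {
            return false; // other fatal failures keep the container stopped
        }
        // Run the restart later, on the scheduler's thread, not on the caller's thread.
        scheduler.schedule(restartAction, delayMillis, TimeUnit.MILLISECONDS);
        return true;
    }
}
```

A caller might create it as `new DelayedRestarter(Executors.newSingleThreadScheduledExecutor(), container::start, "PossibleAuthenticationFailureException", 5000)`. Matching on the simple class name keeps the sketch free of a compile-time dependency on the RabbitMQ client; an `instanceof` check would be the more robust choice when that library is on the classpath.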

        rkasbachpictet Romain Kasbach added a comment - - edited

        Please check whether this is your case: https://jira.spring.io/browse/INT-2651.

        I've checked; there is no other <rabbit:connection-factory/> in my configuration. Also, the behaviour doesn't seem to be the same as described, since in my case the application always starts and connects successfully to the RabbitMQ server. It only fails sporadically at runtime.

        I think you can listen for that ListenerContainerConsumerFailedEvent in an ApplicationListener and restart the container according to your own logic.

        I will have a look into this, thanks for the advice.

        sylvain.laurent Sylvain LAURENT added a comment -

        Thanks for the fix. I have 2 thoughts though:

        • Shouldn't the new flag be set to false by default? That would give production-ready settings out of the box. The issue raised by the reporter did occur for me in production; it is not hypothetical.
        • Any chance of backporting this to 1.x? Once again, as this is a production issue, it would be nice not to have to wait for a major release (and a major Spring Boot release too).
        abilan Artem Bilan added a comment -

        We are backporting this to 1.7.x, at least a setter for the property.

        Nevertheless, we can't make it false by default; that is not what this exception was designed for.

        It is not difficult to configure this property, though.

        We don't have particular plans yet for releasing 1.7.4, so meanwhile please consider using the workaround with the ListenerContainerConsumerFailedEvent mentioned above.

        It would also be great to have a stack trace from you, in the hope of understanding how you are getting this problem.

        Thanks
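
Configuring the property might look like the Java configuration fragment below. The thread does not spell out the setter name; `setPossibleAuthenticationFailureFatal` is the name assumed here and should be checked against the Javadoc of the release in use (2.0 M5 or 1.7.4 and later). The class name, queue name, and the no-op listener are placeholders for illustration only.

```java
import org.springframework.amqp.core.MessageListener;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class RabbitListenerConfig { // hypothetical configuration class

    @Bean
    public SimpleMessageListenerContainer container(ConnectionFactory connectionFactory) {
        SimpleMessageListenerContainer container =
                new SimpleMessageListenerContainer(connectionFactory);
        container.setQueueNames("example.queue"); // placeholder queue name
        // Placeholder listener that ignores every message.
        container.setMessageListener((MessageListener) message -> { });
        // Assumed setter name: treat a PossibleAuthenticationFailureException as
        // non-fatal so the container keeps trying instead of stopping.
        container.setPossibleAuthenticationFailureFatal(false);
        return container;
    }
}
```

The default stays fatal, as explained in the comment above, so this is an explicit opt-in for each container.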

        abilan Artem Bilan added a comment -

        Backported to 1.7.x as https://github.com/spring-projects/spring-amqp/commit/0cdce83763934f2c0db3cde87b95f00473a3f650

        sylvain.laurent Sylvain LAURENT added a comment -

        It would also be great to have a stack trace from you, in the hope of understanding how you are getting this problem.

        See Romain's stack trace above. It occurred in production with a 2-node RabbitMQ cluster, as we restarted the "primary" node (hence the EOFException).

        Thanks for the backport, I'll wait for 1.7.4 rather than playing with ListenerContainerConsumerFailedEvent.


          People

          • Assignee: abilan Artem Bilan
          • Reporter: rkasbachpictet Romain Kasbach
          • Votes: 1
          • Watchers: 3
