Uploaded image for project: 'Spring Data Commons'
  1. Spring Data Commons
  2. DATACMNS-518

Spring Data Infinite Loop in HashMap in PreferredConstructor and CustomConversions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6.2 (Babbage SR1)
    • Component/s: Core
    • Labels:
      None
    • Environment:

      Description

      See also post at newsgroup:

      http://forum.spring.io/forum/spring-projects/data/750072-spring-data-infinite-loop-in-hashmap-in-preferredconstructor-and-customconversions

      Spring Data Infinite Loops in Production

      This post is related to an issue we have started seeing in production at the company where I work. I am trying to figure out where to take this next. Should I proceed to filing a bug report, or do I need to try and collect more information?

      It started happening a few weeks ago. We would see a box CPU suddenly go to 100% indefinitely until the app was redeployed. Fortunately, we did take thread dumps, and in one case a heap dump. We were able to see that there is a recurring infinite loop in two places in Spring Data.

      This has happened a total of 3 times now. Given the number of nodes we run in PROD, INT and QA (some number > 30), and the length of time we have been in production on this stack, we would characterize it as a rare occurrence. But it worries us.

      One could ask "what changed in the application to make this start happening?". Ack. We are in our busy season right now, so a lot of commits went into our application for the weeks prior to the first occurrence. And given the rare nature of the problem, it would be difficult to really identify even the time range of commits that we would need to evaluate. Root Cause

      In all 3 cases, the infinite loop is coming from concurrent updates to the thread-unsafe HashMap class. We see it happening within two Spring Data classes:

      • org.springframework.data.mapping.PreferredConstructor
      • org.springframework.data.mongodb.core.convert.CustomConversions

      I could explain why HashMap is getting stuck in infinite loops, but several others have already done a great job doing it, so I refer you there:

      Stack Trace Snippets

      These stack traces come from our thread dumps. Each stack is seen multiple times when the problem occurs.

      PreferredConstructor
      "http-bio-8080-exec-235" daemon prio=10 tid=0x00007fb19c0a9800 nid=0x5337 runnable [0x00007fb228953000]
      java.lang.Thread.State: RUNNABLE
      at java.util.HashMap.getEntry(HashMap.java:469)
      at java.util.HashMap.get(HashMap.java:421)
      at org.springframework.data.mapping.PreferredConstruc tor.isConstructorParameter(PreferredConstructor.ja va:120)
      at org.springframework.data.mapping.model.BasicPersis tentEntity.isConstructorArgument(BasicPersistentEn tity.java:98)
      at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter$1.doWithPersistentProperty(Mappi ngMongoConverter.java:252)
      at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter$1.doWithPersistentProperty(Mappi ngMongoConverter.java:249)
      at org.springframework.data.mapping.model.BasicPersis tentEntity.doWithProperties(BasicPersistentEntity. java:257)
      at

      CustomConversions

      "http-bio-8080-exec-75" daemon prio=10 tid=0x00007f3688018000 nid=0x6c2e runnable [0x00007f376e9ef000]
      java.lang.Thread.State: RUNNABLE
      at java.util.HashMap.getEntry(HashMap.java:469)
      at java.util.HashMap.get(HashMap.java:421)
      at org.springframework.data.mongodb.core.convert.Cust omConversions.getCustomReadTarget(CustomConversion s.java:317)
      at org.springframework.data.mongodb.core.convert.Cust omConversions.hasCustomReadTarget(CustomConversion s.java:281)
      at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter.read(MappingMongoConverter.java: 200)
      at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter.readCollectionOrArray(MappingMon goConverter.java:791)
      at

      More Information

      I posted a lot more information about this issue here:

      Please let me know if there are things I can do to help diagnose this further. Thanks!

        Issue Links

          Activity

          Hide
          thomasd Thomas Darimont added a comment -

          CHM in CustomConversions

          Show
          thomasd Thomas Darimont added a comment - CHM in CustomConversions
          Hide
          thomasd Thomas Darimont added a comment -

          Hello Peter,

          thanks for reporting this. The issue in CustomConversions should have be fixed with an update to Babbage SR3 where we replaced the Map with a CHM.

          However the issue in PreferredConstructor is not yet fixed - since we're still using a plain map there. I'll have a look into this.
          In the mean time, could you give this a try with a newer version (e.g. Codd SR2 or Dijkstra GA)?

          Cheers,
          Thomas

          Show
          thomasd Thomas Darimont added a comment - Hello Peter, thanks for reporting this. The issue in CustomConversions should have be fixed with an update to Babbage SR3 where we replaced the Map with a CHM. However the issue in PreferredConstructor is not yet fixed - since we're still using a plain map there. I'll have a look into this. In the mean time, could you give this a try with a newer version (e.g. Codd SR2 or Dijkstra GA)? Cheers, Thomas
          Hide
          thomasd Thomas Darimont added a comment - - edited

          Instead of simply switching to a CHM in PC we choose to just guard the writes to the underlying hm with a ReadWriteLock. Potentially occuring (seldom) multiple writes are not an issue since they don't result in an infinite-loop.

          @Peter would you mind giving the attached PR a try?

          Show
          thomasd Thomas Darimont added a comment - - edited Instead of simply switching to a CHM in PC we choose to just guard the writes to the underlying hm with a ReadWriteLock. Potentially occuring (seldom) multiple writes are not an issue since they don't result in an infinite-loop. @Peter would you mind giving the attached PR a try?
          Hide
          lairdpeter Peter Laird added a comment -

          Thanks Thomas, sounds good. I will test out the PR - though it will take me a couple of days to get to it.

          Show
          lairdpeter Peter Laird added a comment - Thanks Thomas, sounds good. I will test out the PR - though it will take me a couple of days to get to it.
          Hide
          lairdpeter Peter Laird added a comment -

          Thomas - I have integrated your fix into our application, and have deployed it to several environments. Because of some other work going on, I can't widely deploy it yet, but over the next week it will roll through our other environments. I will alert you if we see anything amiss. Thanks for working on this.

          Show
          lairdpeter Peter Laird added a comment - Thomas - I have integrated your fix into our application, and have deployed it to several environments. Because of some other work going on, I can't widely deploy it yet, but over the next week it will roll through our other environments. I will alert you if we see anything amiss. Thanks for working on this.

            People

            • Assignee:
              thomasd Thomas Darimont
              Reporter:
              lairdpeter Peter Laird
              Last updater:
              Oliver Gierke
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Agile