Affects Version/s: 1.6.2 (Babbage SR1)
Environment:- Java webapp hosted as an Amazon EC2 Elasticbeanstalk app
- JVM: java version “1.7.0_51" Java(TM) SE Runtime Environment (build 1.7.0_51-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
- OS: Amazon Linux 3.2.12-3.2.4.amzn1.x86_64
- MongoDB: 2.4.6 (hosted on another box)
Pull Request URL:
Sprint:48 - Dijkstra GA, 49 - Dijkstra SR1
See also post at newsgroup:
Spring Data Infinite Loops in Production
This post is related to an issue we have started seeing in production at the company where I work. I am trying to figure out where to take this next. Should I proceed to filing a bug report, or do I need to try and collect more information?
It started happening a few weeks ago. We would see a box CPU suddenly go to 100% indefinitely until the app was redeployed. Fortunately, we did take thread dumps, and in one case a heap dump. We were able to see that there is a recurring infinite loop in two places in Spring Data.
This has happened a total of 3 times now. Given the number of nodes we run in PROD, INT and QA (some number > 30), and the length of time we have been in production on this stack, we would characterize it as a rare occurrence. But it worries us.
One could ask "what changed in the application to make this start happening?". Ack. We are in our busy season right now, so a lot of commits went into our application for the weeks prior to the first occurrence. And given the rare nature of the problem, it would be difficult to really identify even the time range of commits that we would need to evaluate. Root Cause
In all 3 cases, the infinite loop is coming from concurrent updates to the thread-unsafe HashMap class. We see it happening within two Spring Data classes:
I could explain why HashMap is getting stuck in infinite loops, but several others have already done a great job doing it, so I refer you there:
Stack Trace Snippets
These stack traces come from our thread dumps. Each stack is seen multiple times when the problem occurs.
"http-bio-8080-exec-235" daemon prio=10 tid=0x00007fb19c0a9800 nid=0x5337 runnable [0x00007fb228953000]
at org.springframework.data.mapping.PreferredConstruc tor.isConstructorParameter(PreferredConstructor.ja va:120)
at org.springframework.data.mapping.model.BasicPersis tentEntity.isConstructorArgument(BasicPersistentEn tity.java:98)
at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter$1.doWithPersistentProperty(Mappi ngMongoConverter.java:252)
at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter$1.doWithPersistentProperty(Mappi ngMongoConverter.java:249)
at org.springframework.data.mapping.model.BasicPersis tentEntity.doWithProperties(BasicPersistentEntity. java:257)
"http-bio-8080-exec-75" daemon prio=10 tid=0x00007f3688018000 nid=0x6c2e runnable [0x00007f376e9ef000]
at org.springframework.data.mongodb.core.convert.Cust omConversions.getCustomReadTarget(CustomConversion s.java:317)
at org.springframework.data.mongodb.core.convert.Cust omConversions.hasCustomReadTarget(CustomConversion s.java:281)
at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter.read(MappingMongoConverter.java: 200)
at org.springframework.data.mongodb.core.convert.Mapp ingMongoConverter.readCollectionOrArray(MappingMon goConverter.java:791)
I posted a lot more information about this issue here:
Please let me know if there are things I can do to help diagnose this further. Thanks!