While profiling the application, we found that around 19% of the time was spent in various BeanFactory methods.
See JFR snapshot:
Our application has over 10K beans, of which around 3K are loaded at startup (~10% of those during ApplicationContext refresh and the remainder immediately after).
The current implementation of DefaultListableBeanFactory::doGetBeanNamesForType invokes AbstractBeanFactory::predictBeanType twice for every bean definition it checks during a type lookup (once via isFactoryBean and once during isTypeMatch).
Because doGetBeanNamesForType iterates over all available bean definitions, the number of calls to predictBeanType in our case quickly escalates to 10K * 3K * 2 = 60M.
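To make the estimate above explicit, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and not part of Spring. It assumes every type lookup misses the cache and scans every bean definition, which is the worst case described above.

```java
// Hypothetical helper for illustration only; not part of Spring.
final class PredictBeanTypeCallEstimate {

    private PredictBeanTypeCallEstimate() {
    }

    /**
     * Worst-case number of predictBeanType calls, assuming each uncached type
     * lookup scans every bean definition and calls predictBeanType
     * callsPerDefinition times for each one.
     */
    static long estimate(long beanDefinitions, long typeLookups, long callsPerDefinition) {
        return beanDefinitions * typeLookups * callsPerDefinition;
    }
}
```

With the numbers from our application, `PredictBeanTypeCallEstimate.estimate(10_000, 3_000, 2)` returns 60,000,000.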
The existing cache inside of DefaultListableBeanFactory does help for subsequent lookups though.
I initially tried to minimize the number of calls to predictBeanType by adding a caching layer. However, the code in doGetBeanNamesForType was hard to follow given the various corner cases it deals with.
The caching layer gave an improvement of about 10 seconds, but the cache lookup itself then started showing up as the new hotspot.
Since iterating over all 10K bean definitions for every type lookup seemed to be the root cause, this is what I came up with after several attempts (see PR).
The implementation goes over all the bean definitions once and caches the type-to-name mapping. It also records each bean's supertypes, so that a lookup by a supertype does not return a partial list of bean names.
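The idea can be sketched roughly as follows. This is a simplified illustration, not the code in the PR and not Spring's API: the class `TypeToBeanNameIndex` and its methods are hypothetical, and it assumes every bean's concrete type is already known as a `Class`. It ignores FactoryBeans, generics, scopes, and lazily resolved types, which the real implementation has to handle.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Spring's API and not the code in the PR.
final class TypeToBeanNameIndex {

    // Maps a type to the names of all registered beans assignable to it.
    private final Map<Class<?>, Set<String>> namesByType = new HashMap<>();

    /** Registers a bean name under its concrete type and all of its superclasses and interfaces. */
    void register(String beanName, Class<?> beanType) {
        Deque<Class<?>> pending = new ArrayDeque<>();
        Set<Class<?>> seen = new HashSet<>();
        pending.push(beanType);
        while (!pending.isEmpty()) {
            Class<?> type = pending.pop();
            if (!seen.add(type)) {
                continue; // already visited through another path in the hierarchy
            }
            namesByType.computeIfAbsent(type, key -> new LinkedHashSet<>()).add(beanName);
            if (type.getSuperclass() != null) {
                pending.push(type.getSuperclass());
            }
            for (Class<?> iface : type.getInterfaces()) {
                pending.push(iface);
            }
        }
    }

    /** Returns the bean names registered for the given type with a single map lookup. */
    Set<String> namesForType(Class<?> type) {
        return Collections.unmodifiableSet(
                namesByType.getOrDefault(type, Collections.emptySet()));
    }
}
```

After the one-time pass that calls `register` for each bean definition, a lookup such as `index.namesForType(java.util.List.class)` is a single map access instead of a scan over all definitions. The trade-off is extra memory for the index and the need to keep it in sync when bean definitions are added or removed.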
Here are the JFR Hot Methods after applying the change:
This change alone has sped up our application by around 25 seconds.
So far, it has been working fine for us across all our automated tests.
I would like to get your opinion on the change, and on whether you think it could cause any long-term breakage as Spring evolves.
It would also be great if the core implementation itself could support more efficient type lookups for registries with a sizeable number of beans.