Spring Framework / SPR-11890

Spring-specific index file for component candidate classes

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Complete
    • Affects Version/s: None
    • Fix Version/s: 5.0 M2
    • Component/s: None
    • Labels:
      None
    • Last commented by a User:
      true

      Description

      Instead of or in addition to externally provided index arrangements (such as on JBoss), Spring could define its own index file: e.g. a "META-INF/spring.components" file listing pre-generated component candidate names (one per line) in order to shortcut the candidate component identification step in the classpath scanner. That file could get generated by a platform-specific deployer or simply by an application build task.

      Such a list in an index file wouldn't have to be definitive: Spring would still evaluate each and every one of those classes individually via ASM but wouldn't bother with evaluating any other classes in such a jar file. We could also consider using such a file for Spring-driven JPA entity scanning.
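The one-name-per-line format proposed above can be illustrated with a minimal sketch. It assumes the index content has already been read into a string; the class name `ComponentIndexSketch`, the method `parseCandidates`, and the `com.example` type names are hypothetical and not part of any Spring API.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class ComponentIndexSketch {

    // Parses index content that lists one candidate class name per line.
    // Blank lines are skipped and surrounding whitespace is trimmed.
    static Set<String> parseCandidates(String content) {
        Set<String> candidates = new LinkedHashSet<>();
        for (String line : content.split("\\R")) {
            String name = line.trim();
            if (!name.isEmpty()) {
                candidates.add(name);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        String index = "com.example.OrderService\n\ncom.example.OrderRepository\n";
        // Each listed name is only a candidate: it would still be evaluated
        // individually (for example via ASM) before being registered.
        parseCandidates(index).forEach(System.out::println);
    }
}
```

Classes in the same jar that are not listed would simply never be inspected, which is where the shortcut comes from.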

        Issue Links

          Activity

          cemokoc cemo koc added a comment -

          This approach should decrease startup times dramatically, shouldn't it? Build plugins could support it, which would make very fast startup times achievable.

          snicoll Stéphane Nicoll added a comment -

          I have a first prototype that generates a META-INF/spring.components file at build time for the current project. At compilation time, the source model is introspected and JPA entities and Spring components are flagged. I ran some experiments with Juergen yesterday on the JPA support (reading entities from the index rather than scanning the classpath). For small projects (fewer than 200 classes), the difference is insignificant. But if there is a lot of noise (lots of classes in the packages targeted for component scan), the startup time stays constant with the index, whereas it increases significantly with scanning.

          Loading the index is quite cheap but I'd like to investigate an option where it's loaded only once (and on demand). There is some hacking in this branch

          snicoll Stéphane Nicoll added a comment -

          Commit dc160f6 contains a first attempt at producing and using the index within the framework code base. Two areas are covered for the moment:

          • JPA entities (and friends) detection
          • Regular component (typically @Component) detection

          The implementation is completely transparent: you use @ComponentScan the regular way and if an index is available it will be used rather than traditional scanning.

          Performance-wise, it's not super impressive:

          • Project with 200 entities, 200 components and 1000 irrelevant classes: 4120ms (scan) vs. 3902ms (index) - 5.3%
          • Project with 200 entities, 200 components and 5000 irrelevant classes: 4697ms (scan) vs. 4229ms (index) - 10%

          That's the mean of 3 separate executions of a full (Spring Boot) application bootstrap (i.e. including the Hibernate bootstrap). The advantage of the index is that the time required to identify candidates is basically constant regardless of the size of the project.

          One thing that the implementation doesn't support is the use of custom include filters: the index only knows about a limited set of candidates, so as soon as a custom include filter is set, we have to resort to classpath scanning. In the app that was tested, several traditional classpath scans still ran even though the index was available:

          • EnableSpringDataWebSupport is looking for SpringDataWebConfigurationMixin
          • RepositoryConfigurationDelegate is looking for RepositoryFactorySupport
          • RepositoryConfigurationSourceSupport is looking for Repository and RepositoryDefinition

          The first two are looking in Spring Data's own package space, so it's not that bad since the size of the package is constrained. Still, Spring Data could generate a metadata file to look up rather than scanning the classpath for components.

          For the latter, Repository is supported but RepositoryDefinition isn't. This may be a case for moving that annotation processor to Spring Boot where Spring Data is a dependency. That will not let Spring Data use it though. Another option would be to refer to those Data types in Spring Framework (the annotation processor knows about FQN anyway) but it's a bit weird to say the least.

          snicoll Stéphane Nicoll added a comment -

          After yet another review with Juergen Hoeller I've improved the current proposal. A new Indexed annotation has been created and can instruct the annotation processor what to do. This would allow Spring Data to put Indexed on their own annotation types (typically RepositoryDefinition) to index them with that stereotype. Also, all type-level javax.* annotations are now indexed. This provides a generic solution for JPA, CDI and other needs (Spring Boot typically uses classpath scanning to get WebServlet and WebFilter types).

          The next step is to change AnnotationTypeFilter to return the annotation type of interest. That way, and if no other custom TypeFilter types are present, the underlying implementation can still use the index, using that new getter to actually query the index. This can be tricky as every include filter must have a dedicated index type. Since we have access to the source code, we could actually verify that and fall back to classpath scanning if that's not the case.
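The fallback rule described here can be sketched as a simple check. This is a simplified model under stated assumptions, not the framework's implementation: `IncludeFilter`, `annotationFilter`, `customFilter` and `canUseIndex` are hypothetical names, and a filter is reduced to "exposes an indexed stereotype or not".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class IndexFallbackSketch {

    // Hypothetical stand-in for an include filter: it either exposes the
    // stereotype it looks for, or it is opaque (a custom filter).
    interface IncludeFilter {
        Optional<String> indexedStereotype();
    }

    static IncludeFilter annotationFilter(String annotationType) {
        return () -> Optional.of(annotationType);
    }

    static IncludeFilter customFilter() {
        return () -> Optional.empty();
    }

    // The index is usable only if every include filter maps to a stereotype
    // that the index actually covers; otherwise classpath scanning is needed.
    static boolean canUseIndex(List<IncludeFilter> filters, Set<String> indexedStereotypes) {
        for (IncludeFilter filter : filters) {
            Optional<String> stereotype = filter.indexedStereotype();
            if (!stereotype.isPresent() || !indexedStereotypes.contains(stereotype.get())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> indexed = new HashSet<>(Arrays.asList("org.springframework.stereotype.Component"));
        List<IncludeFilter> filters = Arrays.asList(
                annotationFilter("org.springframework.stereotype.Component"), customFilter());
        System.out.println(canUseIndex(filters, indexed) ? "use index" : "fall back to scanning");
    }
}
```

A single opaque filter is enough to force scanning, which matches the limitation with custom include filters noted earlier in the thread.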

          @Component is meta-annotated with @Indexed and @Repository should probably follow for that reason. Then the Spring Data codebase would work transparently.
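To illustrate how a meta-annotated stereotype resolves, here is a minimal sketch. It uses runtime reflection on locally declared stand-in annotations, whereas the actual processor inspects the source model at compile time; `Indexed`, `Component` and `Service` below are hypothetical local types that only mirror the names used in this thread.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

final class IndexedStereotypeSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Indexed {}

    @Indexed
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Component {}

    @Component
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Service {}

    @Service
    static class OrderService {}

    static class PlainHelper {}

    // Walks the annotations on 'type' transitively and records every
    // annotation type that is directly marked with @Indexed.
    static Set<String> stereotypes(Class<?> type) {
        Set<String> found = new LinkedHashSet<>();
        Set<Class<? extends Annotation>> visited = new HashSet<>();
        Deque<Class<? extends Annotation>> pending = new ArrayDeque<>();
        for (Annotation annotation : type.getAnnotations()) {
            pending.add(annotation.annotationType());
        }
        while (!pending.isEmpty()) {
            Class<? extends Annotation> current = pending.poll();
            if (!visited.add(current)) {
                continue; // already handled; also guards against annotation cycles
            }
            if (current.isAnnotationPresent(Indexed.class)) {
                found.add(current.getSimpleName());
            }
            for (Annotation meta : current.getAnnotations()) {
                pending.add(meta.annotationType());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(stereotypes(OrderService.class));
        System.out.println(stereotypes(PlainHelper.class));
    }
}
```

In this model a class annotated with the stand-in `@Service` is indexed under the `Component` stereotype without `@Service` itself carrying `@Indexed`, which is why meta-annotating the base stereotype is enough.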

          It was also identified that it would be nice if the ASM MetadataReaderFactory could be cached at the context level. That way, all the components that perform scanning and still use ASM to exclude candidates can benefit from a shared cache rather than parsing the same model over and over again.
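The shared-cache idea can be sketched generically as follows. This is not Spring's MetadataReaderFactory API: `SharedMetadataCacheSketch` is a hypothetical class, and the "parser" is any function that turns a class name into metadata.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SharedMetadataCacheSketch<M> {

    private final Map<String, M> cache = new ConcurrentHashMap<>();
    private final Function<String, M> parser;

    SharedMetadataCacheSketch(Function<String, M> parser) {
        this.parser = parser;
    }

    // Returns the cached metadata for the class, parsing it only on first request.
    M metadataFor(String className) {
        return cache.computeIfAbsent(className, parser);
    }

    public static void main(String[] args) {
        AtomicInteger parses = new AtomicInteger();
        SharedMetadataCacheSketch<String> shared = new SharedMetadataCacheSketch<>(name -> {
            parses.incrementAndGet(); // stands in for an expensive ASM parse
            return "metadata:" + name;
        });
        // Two different scanning components asking for the same class share one parse.
        shared.metadataFor("com.example.OrderService");
        shared.metadataFor("com.example.OrderService");
        System.out.println("parses performed: " + parses.get());
    }
}
```

Holding one such instance per context, rather than one per scanner, is what would let every scanning component reuse the same parsed model.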

          snicoll Stéphane Nicoll added a comment -

          This feature has now been merged to master. There is still work to do in other projects, typically by adding the @Indexed annotation where it makes sense. We now support AssignableTypeFilter (by placing the @Indexed annotation on an interface or a parent class) and AnnotationTypeFilter (by placing that annotation on an annotation). All exclude filters are obviously supported since the index does not have to handle them directly.

          Juergen Hoeller could you please review and backport @Indexed to 4.3.x so that we have a chance to integrate that sooner?

          Starting a Spring Boot application with all kinds of use cases now transparently uses only the index, except when CustomRepositoryImplementationDetector#detectCustomImplementation uses a regexp to figure out if a custom repository implementation exists for the project. I've had a chat with Ollie and we're still investigating if there's a solution to that.

          There is another occurrence of ClassPathScanningCandidateComponentProvider in Spring Boot to identify the @SpringBootApplication of the current module. We could surely add @Indexed there and be done with it, but since this detection is local and is also used a lot in unit tests, I am wondering if that's worth it. Users having performance issues should probably resort to specifying the application class manually. We should make sure to document that properly as well.

          olivergierke Oliver Gierke added a comment -

          The reason for our own scanning is that we need to find out whether we have to wire a custom repository implementation to the proxy or not. If we relied on users declaring the implementation class as a Spring bean, we couldn't be sure that the bean definition is already available in the registry, and thus we couldn't reliably decide whether to configure a runtime bean reference or not.

          If the container provided a means to create an optional bean reference by name, we could drop our own scanning.

          snicoll Stéphane Nicoll added a comment -

          A first cut is available on master, see links for further tuning of this feature in future 5.0 milestones.

          ptahchiev Petar Tahchiev added a comment -

          Hey guys, is there any API to interact with the generated index? I want to select all the classes that have the javax.persistence.Entity annotation and I don't want to do it through reflection. I was hoping as those files are listed in the spring.components file I could query that index. How can I do that?
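One reflection-free way to approach this can be sketched with plain `java.util.Properties`. It rests on an assumption this thread does not confirm: that the generated index can be read as a properties file mapping each class name to a comma-separated list of stereotypes. `IndexQuerySketch`, `typesWithStereotype` and the `com.example` entries are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class IndexQuerySketch {

    // Returns the class names whose index entry lists the given stereotype.
    // Assumed entry format: <class name>=<stereotype>[,<stereotype>...]
    static Set<String> typesWithStereotype(String indexContent, String stereotype) {
        Properties index = new Properties();
        try {
            index.load(new StringReader(indexContent));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        Set<String> matches = new TreeSet<>();
        for (String type : index.stringPropertyNames()) {
            String[] stereotypes = index.getProperty(type).trim().split("\\s*,\\s*");
            if (Arrays.asList(stereotypes).contains(stereotype)) {
                matches.add(type);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String index = "com.example.Customer=javax.persistence.Entity\n"
                + "com.example.OrderService=org.springframework.stereotype.Component\n";
        System.out.println(typesWithStereotype(index, "javax.persistence.Entity"));
    }
}
```

In a real application the content would come from the META-INF/spring.components resources on the classpath rather than from an inline string.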


            People

            • Assignee:
              snicoll Stéphane Nicoll
              Reporter:
              juergen.hoeller Juergen Hoeller
              Last updater:
              Juergen Hoeller
            • Votes:
              2
              Watchers:
              11

              Dates

              • Created:
                Updated:
                Resolved:
                Days since last comment:
                14 weeks, 6 days ago