Spring Integration / INT-2662

AbstractInboundFileSynchronizingMessageSource can't process same file twice

    Details

    • Type: Refactoring
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      ... because it has AcceptOnceFileListFilter hard-coded, and this cannot be configured in any way.

          Activity

          Shawn Guo added a comment -

          Sure, especially for FtpInboundFileSynchronizingMessageSource.

          It causes int-ftp:inbound-channel-adapter to be unable to process the same file twice.

          Steps to reproduce:
          turn on remote file deletion in int-ftp:inbound-channel-adapter;
          upload FileA to the FTP directory;
          int-ftp:inbound-channel-adapter downloads FileA and passes it to the subsequent pipeline;
          upload the same file FileA to the FTP directory a second time;
          int-ftp:inbound-channel-adapter downloads FileA but cannot pass it to the subsequent pipeline,
          because FileReadingMessageSource thinks FileA has already been scanned (AcceptOnceFileListFilter).

          Shawn Guo
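The reproduction above can be illustrated with a minimal sketch. It assumes a simplified stand-in for an accept-once filter (the `SeenOnceFilter` class and the `local-dir` path are hypothetical and are not the Spring Integration implementation): the filter remembers every file it has passed, so a re-uploaded file with the same local path is rejected on the next poll.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for an accept-once filter; NOT the Spring Integration class.
final class AcceptOnceSketch {

    // Remembers every file it has passed and rejects it on later polls.
    static final class SeenOnceFilter {
        private final Set<File> seen = new HashSet<>();

        List<File> filterFiles(File[] files) {
            List<File> accepted = new ArrayList<>();
            for (File file : files) {
                // Set.add returns false when this path was already recorded.
                if (seen.add(file)) {
                    accepted.add(file);
                }
            }
            return accepted;
        }
    }

    public static void main(String[] args) {
        SeenOnceFilter filter = new SeenOnceFilter();

        // First poll: FileA is new, so it is passed downstream.
        File firstDownload = new File("local-dir", "FileA"); // hypothetical local copy
        System.out.println(filter.filterFiles(new File[] {firstDownload}).size());

        // Second poll after a re-upload: same local path, so it is filtered out.
        File secondDownload = new File("local-dir", "FileA");
        System.out.println(filter.filterFiles(new File[] {secondDownload}).size());
    }
}
```

Because `java.io.File` equality is based on the path, the second download is indistinguishable from the first as far as this kind of filter is concerned.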

          Gunnar Hillert added a comment - edited

          This seems to be an issue in 2 areas:

          1)

          The FTP/SFTP Inbound Adapter uses an AbstractInboundFileSynchronizingMessageSource, which allows you to define an AbstractInboundFileSynchronizer. The AbstractInboundFileSynchronizer is responsible for synchronizing the remote file location with the local-directory. If you don't specify a filter, then all files are taken into consideration.
          However, its synchronizeToLocalDirectory() method calls copyFileToLocalDirectory() and if the file with the name already exists, it does NOTHING (Even if the file itself changed).
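A minimal sketch of the behaviour described in 1), assuming a hypothetical helper (`copyToLocalDirectory` below is illustrative and is not the synchronizer's actual code): the copy is skipped whenever a local file with the same name exists, so changed remote content never reaches the local directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class CopyIfAbsentSketch {

    // Hypothetical helper: writes the remote content only when no local file
    // with that name exists yet.
    static boolean copyToLocalDirectory(Path localDir, String name, byte[] remoteContent)
            throws IOException {
        Path target = localDir.resolve(name);
        if (Files.exists(target)) {
            return false; // existing file is left untouched, even if the content differs
        }
        Files.write(target, remoteContent);
        return true;
    }

    // Copies "v1", then tries to copy "v2" under the same name.
    static String demo() {
        try {
            Path localDir = Files.createTempDirectory("local-dir");
            copyToLocalDirectory(localDir, "FileA", "v1".getBytes(StandardCharsets.UTF_8));
            boolean copiedAgain =
                    copyToLocalDirectory(localDir, "FileA", "v2".getBytes(StandardCharsets.UTF_8));
            String onDisk = new String(
                    Files.readAllBytes(localDir.resolve("FileA")), StandardCharsets.UTF_8);
            return copiedAgain + " " + onDisk;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "false v1"
    }
}
```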

          2)

          The second problem is in the AbstractInboundFileSynchronizingMessageSource. While you can specify a filter for retrieving remote files into the local temp directory, you CANNOT specify the filters that are responsible for retrieving the file from the local temp directory.

          When AbstractInboundFileSynchronizingMessageSource#receive() is called, it calls fileSource.receive(). BUT its filters are hard-coded by AbstractInboundFileSynchronizingMessageSource#buildFilter(): AcceptOnceFileListFilter<File>() and RegexPatternFileListFilter(completePattern).

          Therefore, even if the remote file with a different timestamp or changed contents makes it to the local-temp-directory, it would still not be picked up.
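To make the difference concrete, here is a sketch that contrasts a fixed filter chain with a caller-supplied one. Everything in it is an assumption for illustration: `hardCodedFilter`, `LocalSource`, and the `.*\.txt` pattern are hypothetical stand-ins, and the filters are modelled as plain `Predicate<File>` values rather than the framework's filter types.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class LocalFilterSketch {

    // Stand-in for a fixed chain: a name pattern AND an accept-once check.
    static Predicate<File> hardCodedFilter(String completePattern) {
        Set<File> seen = new HashSet<>();
        Pattern pattern = Pattern.compile(completePattern);
        Predicate<File> matchesPattern = file -> pattern.matcher(file.getName()).matches();
        Predicate<File> acceptOnce = seen::add; // false once the path has been seen
        return matchesPattern.and(acceptOnce);
    }

    // Hypothetical message source whose local filter is supplied by the caller.
    static final class LocalSource {
        private final Predicate<File> localFilter;

        LocalSource(Predicate<File> localFilter) {
            this.localFilter = localFilter;
        }

        boolean receive(File candidate) {
            return localFilter.test(candidate);
        }
    }

    static String demo() {
        File first = new File("local-dir", "FileA.txt");
        File reDownloaded = new File("local-dir", "FileA.txt"); // same path, newer content

        LocalSource hardCoded = new LocalSource(hardCodedFilter(".*\\.txt"));
        LocalSource configurable = new LocalSource(file -> true); // caller opts out of accept-once

        return hardCoded.receive(first) + " " + hardCoded.receive(reDownloaded) + " "
                + configurable.receive(first) + " " + configurable.receive(reDownloaded);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true false true true"
    }
}
```

With the fixed chain the re-downloaded file is dropped; once the filter is a parameter, the caller decides whether a repeated path should be processed again.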

          While you could argue that this is not a bug (We don't handle file-contents changes), it would probably be a useful feature to fix these 2 issues.

          This raises some interesting questions for improving filtering strategies - and thus further Jiras:

          When detecting duplication/file changes, should we add some other filtering strategies, such as filtering based on file hashing? That would allow us to detect file changes more effectively (I understand that this may not be the best solution for large files).
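One way the hashing idea could look, as a sketch under stated assumptions: the `ContentHashFilterSketch` class is hypothetical, it keys on the file name, and it hashes content held in memory with SHA-256. For large files, the digest would more plausibly be computed while streaming (for example with `java.security.DigestInputStream`) instead of loading the whole file.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical filter: accepts a file when it is new or its content hash changed.
final class ContentHashFilterSketch {

    private final Map<String, byte[]> lastDigestByName = new HashMap<>();

    boolean accept(String name, byte[] content) {
        byte[] digest = sha256(content);
        // put returns the digest recorded for this name on the previous call, if any.
        byte[] previous = lastDigestByName.put(name, digest);
        return previous == null || !Arrays.equals(previous, digest);
    }

    private static byte[] sha256(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a standard MessageDigest algorithm, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ContentHashFilterSketch filter = new ContentHashFilterSketch();
        byte[] v1 = "v1".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "v2".getBytes(StandardCharsets.UTF_8);

        System.out.println(filter.accept("FileA", v1)); // new file
        System.out.println(filter.accept("FileA", v1)); // unchanged content
        System.out.println(filter.accept("FileA", v2)); // changed content
    }
}
```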

          Also, should we provide Inbound FTP/SFTP Adapters that don't require synchronization to the local file system? For example, for security reasons, the remote file should never touch the hard disk unencrypted. Rather, the remote file is streamed to the local machine and we subsequently route the Stream to downstream components, e.g. an Encrypting Transformer etc., before hitting the metal.
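The streaming idea could be sketched roughly as follows. This is only an illustration under assumptions: the `encrypt` method, the in-memory streams standing in for the remote file and the local sink, and the choice of AES-GCM are all hypothetical and are not part of any existing adapter. The point is that plaintext only passes through a small buffer and is encrypted before it reaches whatever `OutputStream` represents local storage.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class StreamingEncryptSketch {

    // Pipes the remote stream through a cipher so only ciphertext reaches the sink.
    static void encrypt(InputStream remote, OutputStream sink, SecretKey key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            // Closing the CipherOutputStream finalizes the cipher and closes the sink.
            try (CipherOutputStream out = new CipherOutputStream(sink, cipher)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = remote.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encrypts a small payload and checks that it decrypts back to the original.
    static boolean demo() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey key = generator.generateKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            byte[] plain = "remote file content".getBytes(StandardCharsets.UTF_8);
            ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for local storage
            encrypt(new ByteArrayInputStream(plain), sink, key, iv);
            byte[] encrypted = sink.toByteArray();

            Cipher decryptor = Cipher.getInstance("AES/GCM/NoPadding");
            decryptor.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] roundTrip = decryptor.doFinal(encrypted);
            return !Arrays.equals(plain, encrypted) && Arrays.equals(plain, roundTrip);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```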

          Pavel Tcholakov added a comment -

          With the newly introduced pseudo-transaction support in Spring Integration 2.2.x, it would be great to be able to "roll back" by leaving the file on its source server. This is currently not a viable option thanks to the hard-coded AcceptOnceFileListFilter.

          Gary Russell added a comment -

          The ability to specify a custom filter is now in 3.0.0.BUILD-SNAPSHOT; see INT-2892.

          Gary Russell added a comment -

          Resolved in M2 - see INT-2892.

          Gary Russell added a comment -

          Note that there is a workaround: configure the (s)ftp adapter to send its files to nullChannel and configure a separate file inbound adapter (where the local filter can be configured).


            People

            • Assignee:
              Gary Russell
              Reporter:
              Ilja
            • Votes:
              1
              Watchers:
              6
