Spring Batch
BATCH-1799

Exception in flush of file output ItemWriters does not abort a step/job

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Complete
    • Affects Version/s: None
    • Fix Version/s: 2.2.0, 2.2.0 - Sprint 6
    • Component/s: Infrastructure
    • Labels:
      None
    • Environment:
      Reproduced with [Spring Batch 2.1.5 / Spring 3.0.2] and [Spring Batch 2.1.8 / Spring 3.0.2].

      Description

      Scenario:
      Using a FlatFileItemWriter to write to a file on a full disk/memory stick. (Note: there must be enough space on the disk/memory stick to create the file during the call to open().)

      What would I expect:
      The step and the job should both fail, since the data could not be written to the file because of missing space.

      What happens:
      The IOException is simply logged, and the step does not fail.

      What is the result:
      The written file is corrupt because it is incomplete. A restart is not possible, since the failing step actually ends with state COMPLETED.

      What causes the problem:
      Described in http://forum.springsource.org/showthread.php?115739-DiskFull-IOException-does-not-result-in-a-failed-job-when-writing-to-a-file

        Activity

        Hansjoerg Wingeier created issue -
        Dave Syer made changes -
        Field Original Value New Value
        Assignee Robert Kasanicky [ robert.kasanicky ]
        Robert Kasanicky added a comment -

        I'm not exactly sure how the TransactionAwareBufferedWriter was intended to work in error scenarios. Perhaps the catch is that it was meant to implement TransactionSynchronization.afterCommit() rather than TransactionSynchronization.afterCompletion() (the former propagates exceptions to the caller while the latter doesn't)?
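
The difference between the two callbacks can be sketched with a small, self-contained model. This is not Spring code: `SyncCallback`, `ToyTransaction`, and `CallbackDemo` are hypothetical names invented for illustration, and the only behavior modeled is the one stated in the comment above (and in the Spring Javadoc), namely that an exception from afterCommit() reaches the caller while an exception from afterCompletion() is only logged.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a transaction synchronization callback.
// Hypothetical type, NOT Spring's TransactionSynchronization interface.
interface SyncCallback {
    default void afterCommit() {}
    default void afterCompletion() {}
}

// Hypothetical toy transaction that mimics the propagation rules described above.
class ToyTransaction {
    final List<String> log = new ArrayList<>();

    void commit(SyncCallback callback) {
        try {
            // An exception thrown here propagates to whoever called commit().
            callback.afterCommit();
        } finally {
            try {
                callback.afterCompletion();
            } catch (RuntimeException e) {
                // An exception thrown here is logged and swallowed.
                log.add("logged: " + e.getMessage());
            }
        }
    }
}

class CallbackDemo {
    // Runs a commit and reports what the caller (the "step") observes.
    static String run(SyncCallback callback) {
        ToyTransaction tx = new ToyTransaction();
        try {
            tx.commit(callback);
            return "COMPLETED " + tx.log;
        } catch (RuntimeException e) {
            return "FAILED: " + e.getMessage();
        }
    }

    // Flush failure raised from afterCompletion(): the caller never sees it.
    static String flushInAfterCompletion() {
        return run(new SyncCallback() {
            @Override public void afterCompletion() {
                throw new IllegalStateException("disk full");
            }
        });
    }

    // Flush failure raised from afterCommit(): the caller sees it.
    static String flushInAfterCommit() {
        return run(new SyncCallback() {
            @Override public void afterCommit() {
                throw new IllegalStateException("disk full");
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(flushInAfterCompletion());
        System.out.println(flushInAfterCommit());
    }
}
```

In this model, a failure raised from afterCompletion() leaves the caller believing the work completed, which matches the symptom reported in this issue; the same failure raised from afterCommit() surfaces as an error.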

        Robert Kasanicky made changes -
        Assignee Robert Kasanicky [ robert.kasanicky ] Dave Syer [ david_syer ]
        Dave Syer made changes -
        Priority Major [ 3 ] Minor [ 4 ]
        Remaining Estimate 0d [ 0 ]
        Fix Version/s 2.2.0 [ 12109 ]
        Component/s Infrastructure [ 10251 ]
        Summary DiskFull IOException does not abort a step/job Exception in ItemStream.open() does not abort a step/job
        Original Estimate 0d [ 0 ]
        Assignee Dave Syer [ david_syer ] Robert Kasanicky [ robert.kasanicky ]
        Hansjoerg Wingeier added a comment -

        In my opinion the new title does not reflect the actual problem. An exception in ItemStream.open has always resulted in an aborted step/job; this was never an issue. The real problem is the one Robert described in his comment: the actual write is not done in the transaction context, because it happens in TransactionSynchronization.afterCompletion. I guess the title should rather be something like "actual write of TransactionAwareBufferedWriter does not happen in transaction context".

        Dave Syer made changes -
        Summary Exception in ItemStream.open() does not abort a step/job Exception in flush of file output ItemWriters does not abort a step/job
        Dave Syer added a comment -

        OK, I get it. Sorry, and thanks for pointing out the error.

        Robert Kasanicky added a comment -

        @Hansjoerg: assuming you can easily recreate the problem, can you try out whether writing out the buffer in TransactionSynchronization.afterCommit() fixes the problem for you?

        Hansjoerg Wingeier added a comment -

        @Robert: OK, I'll try to find some time to do it this week. I'll let you know as soon as I have the results.

        Robert Kasanicky added a comment -

        Hansjoerg's update and sample project copied from email:

        I had time this morning to look at the problem and to write a small test to reproduce it. Please have a look at the attached Eclipse/STS project. To reproduce the problem, just point the output resource to a disk/memory stick with less than 5MB of free space. I wanted to attach it directly to the JIRA issue, but it seems I don't have the privileges to add attachments.

        The project also includes a patched version of TransactionAwareBufferedWriter in which I trigger the flush of the writer in the afterCommit method. This works in the sense that the job does terminate with status FAILED. However, according to the Javadoc of afterCommit, it is called after the transaction has been successfully committed. This means that the position counters/pointers of the reader and writer have already been updated and persisted in the executionContext of the job/step. Hence, a restart would fail or be inconsistent, since the pointers are already positioned after the failing chunk and not at its beginning.

        I also tried doing the flush in the beforeCommit method. However, the result in that case was an Unknown exit state with the message that a restart is not possible.

        If you have further questions, don’t hesitate to contact me.
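
The restart concern raised in this comment can be illustrated with a toy model. It is a sketch under stated assumptions, not Spring Batch code: `RestartModel` and all of its members are hypothetical, the "execution context" is reduced to a single persisted position, and a restart is assumed to resume from that persisted position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a chunk-oriented step; none of these names are Spring Batch APIs.
class RestartModel {
    int persistedPosition = 0;                    // stands in for counters saved in the execution context
    final List<String> file = new ArrayList<>();  // stands in for the output file
    String status = "STARTED";

    // Processes items in chunks, resuming at persistedPosition.
    // The flush of the chunk with index failingChunk (counted per run) throws; -1 means no failure.
    void run(List<String> items, int chunkSize, int failingChunk) {
        int chunk = 0;
        for (int start = persistedPosition; start < items.size(); start += chunkSize, chunk++) {
            int end = Math.min(start + chunkSize, items.size());
            List<String> buffer = new ArrayList<>(items.subList(start, end));

            // "Commit": the position is persisted BEFORE the after-commit flush runs.
            persistedPosition = end;

            try {
                // "afterCommit": flush the buffered chunk to the file.
                if (chunk == failingChunk) {
                    throw new IllegalStateException("disk full");
                }
                file.addAll(buffer);
            } catch (IllegalStateException e) {
                status = "FAILED";
                return;
            }
        }
        status = "COMPLETED";
    }

    // First run fails while flushing the second chunk; the restart then succeeds.
    static String demo() {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e", "f");
        RestartModel step = new RestartModel();
        step.run(items, 2, 1);   // chunk [c, d] is lost, but position 4 is already persisted
        step.run(items, 2, -1);  // restart resumes at position 4
        return step.status + " " + step.file;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Under these assumptions, the step fails as desired on the first run, but the restart skips the chunk whose flush failed, so the output ends up missing those items. This is the inconsistency described above for the afterCommit-based patch.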

        Robert Kasanicky made changes -
        Attachment diskfull.zip [ 19097 ]
        Dave Syer added a comment -

        See also BATCH-1864

        Michael Minella made changes -
        Fix Version/s 2.2.0 Backlog [ 13607 ]
        Fix Version/s 2.2.0 [ 12109 ]
        Michael Minella made changes -
        Fix Version/s 2.2.0 [ 12109 ]
        Michael Minella made changes -
        Rank Ranked higher
        Michael Minella made changes -
        Fix Version/s 2.2.0 - Sprint 2 [ 13612 ]
        Fix Version/s 2.2.0 [ 12109 ]
        Fix Version/s 2.2.0 Backlog [ 13607 ]
        Michael Minella made changes -
        Assignee Robert Kasanicky [ robert.kasanicky ] Michael Minella [ mminella ]
        Michael Minella made changes -
        Fix Version/s 2.2.0 Backlog [ 13607 ]
        Fix Version/s 2.2.0 - Sprint 2 [ 13612 ]
        Michael Minella made changes -
        Fix Version/s 2.2.0 - Sprint 5 [ 13699 ]
        Fix Version/s 2.2.0 Backlog [ 13607 ]
        Michael Minella made changes -
        Fix Version/s 2.2.0 - Sprint 6 [ 13700 ]
        Fix Version/s 2.2.0 - Sprint 5 [ 13699 ]
        Michael Minella added a comment -

        Pull request: https://github.com/SpringSource/spring-batch/pull/69
        Michael Minella made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 2.2.0 [ 12109 ]
        Resolution Complete [ 8 ]
        Transition        Time In Source Status  Execution Times  Last Executer    Last Execution Date
        Open -> Resolved  401d 6h 49m            1                Michael Minella  16/Nov/12 11:42 AM

          People

          • Assignee:
            Michael Minella
            Reporter:
            Hansjoerg Wingeier
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved: