Spring Batch
  1. Spring Batch
  2. BATCH-1767

OptimisticLockingFailureException updating step execution after commit failure

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.7
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      Spring 3.0.5
      Spring Batch 2.1.7
      Eclipselink 1.1.1

      Description

      It appears that if the commit fails, spring batch will get an OptimisticLockingFailureException when it tries to revert the changes to the step execution. In my particular case, I have a callback through EclipseLink to update history tables before a transaction is committed. If a failure occurs during this callback, the commit fails.

      From looking through the code and the attached log file, the step execution is updated and committed before the main transaction is committed. When the commit fails, the old values for the step execution (including version) are updated to the values before the chuck started. When control returns to AbstractStep.execute(), the OptimisticLockingFailureException is thrown when the step execution is updated with the failed status because the new version had already been committed to the database.

      2011-07-06 17:40:29,494 ERROR SimpleAsyncTaskExecutor-1 [org.springframework.batch.core.step.AbstractStep] Encountered an error saving batch meta data. This job is now in an unknown state and should not be restarted.
      org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=3225 with wrong version (35), where current version is 36
      at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
      at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
      at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
      at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
      at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
      at $Proxy77.update(Unknown Source)
      at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:244)
      at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
      at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
      at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
      at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
      at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
      at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91)
      at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:89)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.lang.Thread.run(Thread.java:662)

        Activity

        Hide
        Philippe Mouawad added a comment -

        Hello,
        After further analysis, I don't understand why copy method when restoring also copies the version number of oldVersion to stepExecution ?

        Could you shed some light on this ?

        Thank you
        Regards

        Show
        Philippe Mouawad added a comment - Hello, After further analysis, I don't understand why copy method when restoring also copies the version number of oldVersion to stepExecution ? Could you shed some light on this ? Thank you Regards
        Hide
        Michael Minella added a comment -

        I added a comment about the pull request. In short, moving the copy to where it is won't work because the StepExecution could have already been modified in the step. There are a couple tests that are failing with that update that illustrate this.

        With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback.

        Show
        Michael Minella added a comment - I added a comment about the pull request. In short, moving the copy to where it is won't work because the StepExecution could have already been modified in the step. There are a couple tests that are failing with that update that illustrate this. With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback.
        Hide
        Philippe Mouawad added a comment - - edited

        Thanks for feedback. Further testing this morning on real world project showed me fix did not correct all cases, my real world case is still broken although attached test case was fixed by it.

        Does it mean you confirm this bug is a real one ? Test project attached reproduces it and my real world also does.
        Is there some plan to fix it ? within which delay ?

        It seems anyway that the stepexecution is modified somewhere before restoring which generates the case.

        Show
        Philippe Mouawad added a comment - - edited Thanks for feedback. Further testing this morning on real world project showed me fix did not correct all cases, my real world case is still broken although attached test case was fixed by it. Does it mean you confirm this bug is a real one ? Test project attached reproduces it and my real world also does. Is there some plan to fix it ? within which delay ? It seems anyway that the stepexecution is modified somewhere before restoring which generates the case.
        Hide
        Kristof Buts added a comment -

        @Michael Minella,
        I'm afraid there is another (hidden) issue with this code:
        When running a batch multithreaded, one thread can commit stuff from another thread which can be rollbacked at the end. Shouldn't changes to the stepExecution object be recorded and executed against the actual stepExecution object at commit time instead of instantly applying to the real StepExecution object and reverting on rollback afterwards (just wondering) ?

        Show
        Kristof Buts added a comment - @Michael Minella, I'm afraid there is another (hidden) issue with this code: When running a batch multithreaded, one thread can commit stuff from another thread which can be rollbacked at the end. Shouldn't changes to the stepExecution object be recorded and executed against the actual stepExecution object at commit time instead of instantly applying to the real StepExecution object and reverting on rollback afterwards (just wondering) ?
        Hide
        Gérald Quintana added a comment -

        We has this OptimisticLockingFailureException, when we noticed this log just before: "Commit failed while step execution data was already updated. Reverting to old version." . Turning org.springframework.batch.core.step.tasklet and org.springframework.transaction loggers level to debug helped us find a bug in our tasklet code.

        In our case:

        • The tasklet produces wrong data but the chunk is property processed and no exception is raised here
        • Chunk transaction commit triggers a deferred database constraint which prevents commit (it may also occur with JPA where transaction commit triggers flushing)
        • This exception is not logged (maybe because of the RetryTemplate even if retry is not configured)
        • The TransactionManager decides to call doRollbackOnCommitException
        • The ChunkTransactionCallback transaction synchronization is called with status=ROLLBACKED
        Show
        Gérald Quintana added a comment - We has this OptimisticLockingFailureException, when we noticed this log just before: "Commit failed while step execution data was already updated. Reverting to old version." . Turning org.springframework.batch.core.step.tasklet and org.springframework.transaction loggers level to debug helped us find a bug in our tasklet code. In our case: The tasklet produces wrong data but the chunk is property processed and no exception is raised here Chunk transaction commit triggers a deferred database constraint which prevents commit (it may also occur with JPA where transaction commit triggers flushing) This exception is not logged (maybe because of the RetryTemplate even if retry is not configured) The TransactionManager decides to call doRollbackOnCommitException The ChunkTransactionCallback transaction synchronization is called with status=ROLLBACKED

          People

          • Assignee:
            Dave Syer
            Reporter:
            Quinton McCombs
          • Votes:
            7 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated: