Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.4
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      I have no need of a skip limit, and yet Spring Batch forces me to impose an artificial one. I can set that limit suitably sky-high, but that is not an accurate representation of my client's business requirements.

      My ideal solution would be for the skip limit not to be required in XML, and for Spring Batch to assume that the batch job succeeded unless ALL records were skipped.
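
      For illustration, here is roughly the configuration I mean (bean names are placeholders and the exception class is just an example); the skip-limit attribute is the artificial sky-high ceiling:

          <step id="importVendors" xmlns="http://www.springframework.org/schema/batch">
              <tasklet>
                  <chunk reader="vendorReader" writer="vendorWriter"
                         commit-interval="100" skip-limit="999999">
                      <skippable-exception-classes>
                          org.springframework.batch.item.file.FlatFileParseException
                      </skippable-exception-classes>
                  </chunk>
              </tasklet>
          </step>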

        Activity

        Dave Syer added a comment -

        We can certainly consider this as a new feature, but I would say that, in general, skipping is for exceptions; if you expect an infinite number of them then it is not an exception, it is a business validation condition that you should deal with in business logic. The devil, as usual, will be in the detail.

        Caoilte O'Connor added a comment -

        Sorry, I really, really can't see your viewpoint on this.

        Any way I look at it, I would only be reimplementing the exact same functionality, albeit with an unlimited failure option.

        These are the skip features I need in any business validation condition handling framework:

        • wrapping readers, writers and processors
        • keeping count of the number of skips
        • stopping items which fail from being processed further (i.e. skipping them on retries)
        • a contract to only call once per skip

        But anyway, why is skipping for exceptions?

        In your own documentation you talk about a list of vendors. Say I set the skippable number of bad vendors to 10.

        On Monday I could import 11 vendors, ten of which were bad.
        On Tuesday I could import 200 vendors, 20 of which were bad.

        The skippable number doesn't make sense in either situation. It's potentially too high in one and potentially too low in the other.

        You could improve the situation by making it proportional, but in reality I expect that if the business has marked a problem as skippable, they don't care how many vendors have that problem (unless it is all of them).

        Sorry, I hope that's not too hot-tempered.

        Dave Syer added a comment -

        Your example is not about an infinite number of skips, it's about a limit which is set proportionally rather than as an absolute number. I would be much more receptive to talking about how to achieve that - you can do it with a custom SkipPolicy (it's not supported in the XML namespace right now but there's nothing stopping you from doing it).
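
        For example, a minimal sketch of such a policy (the class name, the wiring, and the source of the expected total are all hypothetical):

            import org.springframework.batch.core.step.skip.SkipPolicy;

            // Minimal sketch: tolerate skips while they stay under an agreed
            // fraction of an expected record count supplied at construction
            // time. How the expected total is obtained (job parameter, header
            // record, a pre-count of the input) is deliberately left open.
            public class ProportionalSkipPolicy implements SkipPolicy {

                private final long expectedTotal;
                private final double maxSkipFraction;

                public ProportionalSkipPolicy(long expectedTotal, double maxSkipFraction) {
                    this.expectedTotal = expectedTotal;
                    this.maxSkipFraction = maxSkipFraction;
                }

                public boolean shouldSkip(Throwable t, int skipCount) {
                    // Fail the step once skips exceed the configured proportion.
                    return skipCount < expectedTotal * maxSkipFraction;
                }
            }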

        Emerson Farrugia added a comment -

        Dave, I think the problem is that the skip-limit is completely arbitrary, and setting one just seems unnatural in certain situations.

        In my current use case, I want to skip bad call records encountered by my item reader and let my job execution complete successfully. I'll deal with the bad items later, having logged them with a SkipListener. I have no idea how many bad call records there will be: usually none, sometimes ten, sometimes a hundred, sometimes more; it depends on a ton of factors. The thing is, if I set the skip-limit too low, my only recourse when the job fails is to increase it and rerun the job. If I set the limit just right or too high, I don't get errors and I can start dealing with the individual skipped records.
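
        For context, the listener is essentially this kind of minimal sketch (the logging target is a placeholder; in practice I write to a file):

            import org.springframework.batch.core.listener.SkipListenerSupport;

            // Minimal sketch of the SkipListener mentioned above: record each
            // skipped item for later follow-up. Skips during reading only
            // expose the Throwable; for flat files it typically carries the
            // offending raw line.
            public class BadCallRecordListener extends SkipListenerSupport<Object, Object> {

                @Override
                public void onSkipInRead(Throwable t) {
                    System.err.println("Skipped unreadable call record: " + t.getMessage());
                }
            }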

        With that in mind, I have two questions. The first is, why do you differentiate between a job failing because it hits its skip-limit and a job which passed but has X number of skipped records? The second is, if the number is arbitrary, why force the skip-limit at all?

        Lucas Ward added a comment -

        'The first is, why do you differentiate between a job failing because it hits its skip-limit and a job which passed but has X number of skipped records? The second is, if the number is arbitrary, why force the skip-limit at all?'

        While I can't speak for Dave, my answer would be that most people want to know when the data can no longer be considered 'good'. In Caoilte's example, let's say that 99% of the records are skipped; at what point does that go from being acceptable to being a bad input file? In most environments I've ever supported, a bad input file would signal a particular error code, which would cause a page to go out, and someone would manually look at the file, which would usually start some type of process with the person/entity that sent the file to have them resend it, since it's obviously bad. I'm not personally against an infinite scenario, but I have trouble understanding the very-high-percentage failure scenarios. In my experience, the data that was labeled 'good' is probably bad as well for some other, more sinister reason.

        I think the real problem here is that volumes can never be known, so coming up with some number that represents the acceptable failure limit is impossible. Personally, I think the best option would be the ability to set a percentage of records that could fail. Of course, the problem there is defining how many records there are in total in any given input. The only two input types I know of where that could even be possible are fixed-width flat file input and database input. Maybe there's a more creative solution though?

        Caoilte O'Connor added a comment -

        Off the top of my head, one scenario that I deal with regularly is taking updates of ecommerce orders. Sometimes the order doesn't exist. The reason is that it's a multichannel system and not all orders are placed with the website. Generally the client's OMS cannot easily strip out the orders that aren't ecommerce, so we just throw a skippable exception and move on. Some days the non-ecommerce orders outnumber the ecommerce orders. It's an uncomfortable fact that if we ever get 99,999 non-ecommerce orders in one day, the batch job will arbitrarily fall over.

        We could find a way to strip out those orders but it would be a lot of work.

        It's not even fair to say that all of those errors could obscure other problems, because we have zero tolerance for any actual bad-data error. We use skippable batch exceptions only for lines that we cannot deal with and which we know the client doesn't care about.

        Lucas Ward added a comment -

        The scenario you describe sounds like a perfect use case for 'filtering'. If the ItemProcessor returns null, the record is considered 'filtered': the filterCount is incremented on the StepExecution, and the next record is read. I think this better fits your scenario, since you actually are filtering out records that aren't applicable. That really is the difference between skipping a record and filtering one. Skipping is really intended for records that can't be handled at all; in the case of a file, this is something that can't be parsed. Something like that needs an upper limit, because if you can't even parse a certain number of records, there's obviously something wrong. Filtering, on the other hand, is usually reserved for cases where a record is perfectly good but shouldn't be handled for business reasons, and thus has no need of a limit.
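
        For example, a minimal sketch of a filtering processor (the Order type and its accessor are hypothetical):

            import org.springframework.batch.item.ItemProcessor;

            // Returning null from an ItemProcessor marks the item as
            // "filtered": it never reaches the writer, and the step's
            // filterCount is incremented instead of the skipCount.
            public class EcommerceOnlyProcessor implements ItemProcessor<Order, Order> {

                public Order process(Order order) throws Exception {
                    return order.isEcommerce() ? order : null;
                }
            }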

        I think the real problem here is that we haven't been very good at explaining the difference between the two, and I'm willing to bet a lot of people don't even know filtering exists as an option. It was a definite gap we recognized, and it has only been added as of version 2.0. We should find a way to make this more prominent, so that people know they have both options.

        Hafiz A Haq added a comment -

        Currently I am writing the skipped records into a response file using a skip listener. If I set filter to true, would I be able to write the filtered record to the response file? Could you please point me in the right direction?

        If it's not possible to write the filtered record to a file, an infinite skip limit is a must. I would vote for it once I get clarity on the above.

        Thanks and Regards,
        Hafiz


          People

          • Assignee:
            Dave Syer
          • Reporter:
            Caoilte O'Connor
          • Votes:
            7
          • Watchers:
            6
