  1. Spring XD
  2. XD-2939

All Modules are undeployed on Zookeeper Connection Loss / GC Pause

    Details

    • Type: Bug
    • Status: Done
    • Priority: Major
    • Resolution: Deferred
    • Affects Version/s: 1.1.1
    • Fix Version/s: None
    • Component/s: Runtime
    • Labels: None
    • Story Points: 0
    • Rank (Obsolete): 9223372036854775807

      Description

      We are currently running in single-node mode and are experiencing the same problem as described here: http://stackoverflow.com/questions/28170864/spring-xd-jobs-automatic-undeployment-on-zookeeper-time-out-in-xd-singlenode-mo

      I've turned on GC logging and can see a 29.7-second GC pause around the time this happens. We've already set the ZooKeeper timeouts (as suggested in the Stack Overflow question), without effect: the ConnectionLoss errors simply start to appear once the configured timeout has elapsed.
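
      To make the relationship between the pause and the timeout concrete, here is a minimal sketch in Java. It relies on documented ZooKeeper behavior: the server clamps the session timeout a client requests to the range [minSessionTimeout, maxSessionTimeout], which default to 2 × tickTime and 20 × tickTime. Whether the embedded ZooKeeper in single-node mode uses these defaults is an assumption, and the class name, method names, and the 60-second requested value are hypothetical. Treating "pause at least as long as the timeout" as session expiry is also a simplification, since the clock effectively starts at the last heartbeat before the pause.

```java
public class SessionTimeoutCheck {

    /**
     * Session timeout the server actually grants: the requested value
     * clamped to [2 * tickTime, 20 * tickTime] (ZooKeeper defaults;
     * assumed here, not read from any real configuration).
     */
    public static long negotiatedTimeoutMs(long requestedMs, long tickTimeMs) {
        long min = 2 * tickTimeMs;
        long max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    /**
     * Simplified check: a stop-the-world pause sends no heartbeats, so a
     * pause at least as long as the granted timeout can expire the session.
     */
    public static boolean pauseOutlivesSession(long pauseMs, long negotiatedMs) {
        return pauseMs >= negotiatedMs;
    }

    public static void main(String[] args) {
        long tickTimeMs = 2000;   // assumption: ZooKeeper's default tickTime
        long requestedMs = 60000; // hypothetical client-side setting
        long pauseMs = 29700;     // the 29.7 s pause seen in the GC log

        long granted = negotiatedTimeoutMs(requestedMs, tickTimeMs);
        System.out.println("granted session timeout (ms): " + granted);
        System.out.println("pause outlives session: "
                + pauseOutlivesSession(pauseMs, granted));
    }
}
```

      If the granted value turns out to be lower than the configured one, raising the client-side timeout alone would have no effect, which might be one explanation for the behavior described above.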

      Sorry for the prioritization; for us this is currently a major issue, since we are running in single-node mode (as a starter) and our system goes down once a day. Would this behavior change if we switched to distributed mode?

      I know that a GC pause of 29 seconds is really long; however, I've seen such pauses in batch systems quite often. Long-running jobs tend to promote objects to older generations, and sometimes there is little one can do about it. So I guess it's worth accounting for this in XD's behavior?

            People

            Assignee: Unassigned
            Reporter: Peter Rietzler (prietzler)
            Votes: 0
            Watchers: 2
