Details
- Type: Bug
- Status: Done
- Priority: Major
- Resolution: Complete
- Affects Version/s: 1.1 GA
- Component/s: Runtime
- Labels: None
- Epic Link:
- Story Points: 5
- Rank (Obsolete): 9223372036854775807
- Pull Request URL:
- Sprint: Sprint 44
Description
How to reproduce:
1. Run xd-singlenode with the Spark master URL set to 'local' (a requirement), using more than one worker thread, e.g. local[4]
2. Deploy the word-count example
3. Create the stream:
stream create spark-streaming-word-count --definition "http | word-count | log" --deploy
4. Send data
xd:>http post --data "a b c d e f g"
xd:>http post --data "a b c"
5. Observe the result:
2015-02-24 15:12:46,018 1.2.0.SNAP INFO Executor task launch worker-3 sink.spark-streaming-word-count - (e,1)
2015-02-24 15:12:46,018 1.2.0.SNAP INFO Executor task launch worker-1 sink.spark-streaming-word-count - (d,1)
2015-02-24 15:12:46,019 1.2.0.SNAP INFO Executor task launch worker-2 sink.spark-streaming-word-count - (b,1)
2015-02-24 15:12:46,020 1.2.0.SNAP INFO Executor task launch worker-1 sink.spark-streaming-word-count - (g,1)
2015-02-24 15:13:40,020 1.2.0.SNAP INFO Executor task launch worker-1 sink.spark-streaming-word-count - (a,1)
2015-02-24 15:13:40,020 1.2.0.SNAP INFO Executor task launch worker-2 sink.spark-streaming-word-count - (b,1)
2015-02-24 15:13:40,021 1.2.0.SNAP INFO Executor task launch worker-3 sink.spark-streaming-word-count - (c,1)
(The last three results come from the second invocation; the counts for 'a', 'c' and 'f' from the first message are missing.)
Note: there appears to be a correlation between the number of values emitted and the number of worker threads: across all attempts, the number of emitted values never exceeded the number of workers.
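For reference, a minimal single-threaded Java sketch (not the actual XD word-count module) of the (word, count) pairs the sink would be expected to log for each posted message. With the first message, all seven words should appear, whereas the log above shows only four:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WordCountSketch {

    // Count whitespace-separated words in one message; every distinct word
    // should yield exactly one (word, count) pair at the sink.
    static Map<String, Integer> count(String message) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : message.trim().split("\\s+")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // First posted message: expect 7 pairs, each with count 1
        System.out.println(count("a b c d e f g"));
        // Second posted message: expect 3 pairs
        System.out.println(count("a b c"));
    }
}
```

Comparing this expected output against the log excerpt makes the data loss visible: only four of the seven expected pairs are emitted for the first message.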