I’m looking for a tool that lets me process real-time messages from Kafka in Python and Java. Kafka receives a new message every second, and I need N consumers to read the stream strictly message by message: never a batch of messages at once, and every message must be processed by every consumer. Is Spark Streaming suitable for this use case? The final solution must be highly available, meaning the computation script (the Kafka consumer) should have no downtime, or as little as possible. Can this be achieved in Spark, so that I don’t have to handle synchronization across multiple instances of the algorithm myself? And if messages start arriving more often than once per second, can Spark Streaming achieve sub-second latency? Apache Storm might be a more natural choice for sub-second stream processing, and it also supports Python, so I’m not sure what the pros and cons are.
Generally speaking, my use case is:
- Every consumer processes every message from Kafka, one message at a time
- Some methods of the deployed scripts need to run on a schedule, e.g. a method that runs every hour and updates the state of the running script
- The script needs to be highly available, meaning that if one instance goes down, another one takes over automatically
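
For comparison, here is a rough sketch of how these three requirements could look with a plain Kafka consumer instead of Spark or Storm. It assumes the kafka-python client; the broker address, the `processor-` group prefix, and the `handle_message` / `refresh_state` callbacks are hypothetical names, not part of any existing setup. It relies on Kafka's consumer-group semantics: each group receives every message, while instances sharing a group id split the partitions between them and take over from each other on failure (with a single-partition topic, one instance is active and the others stand by).

```python
import time

REFRESH_INTERVAL_S = 3600  # "every hour" from the list above


def consumer_config(consumer_name, bootstrap_servers="localhost:9092"):
    """Build kafka-python settings for one logical consumer (names are hypothetical)."""
    return {
        "bootstrap_servers": bootstrap_servers,  # assumed broker address
        # One distinct group per logical consumer, so each group sees every message.
        # Replicas of the same consumer share this group id and fail over to each other.
        "group_id": f"processor-{consumer_name}",
        "max_poll_records": 1,        # hand over at most one record per poll
        "enable_auto_commit": False,  # commit only after the message is handled
        "auto_offset_reset": "latest",
    }


def refresh_due(last_refresh, now, interval=REFRESH_INTERVAL_S):
    """True once `interval` seconds have passed since the last state refresh."""
    return now - last_refresh >= interval


def run(consumer_name, topic, handle_message, refresh_state):
    """Process one message at a time and refresh state roughly once per interval."""
    from kafka import KafkaConsumer  # kafka-python, imported lazily

    consumer = KafkaConsumer(topic, **consumer_config(consumer_name))
    last_refresh = time.monotonic()
    while True:
        # poll() returns {TopicPartition: [records]}; max_poll_records caps it at one.
        for records in consumer.poll(timeout_ms=1000).values():
            for record in records:
                handle_message(record.value)
                consumer.commit()  # mark this single message as done
        if refresh_due(last_refresh, time.monotonic()):
            refresh_state()
            last_refresh = time.monotonic()
```

Each of the N consumers would call `run` with its own `consumer_name`; starting a second process with the same name gives that consumer a standby. Whether this is preferable to Spark Streaming or Storm depends on how much state and coordination the processing itself needs.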