Monday, 11. October 2021., 15:55
Accurate insight into your customers' needs and desires is a key advantage for any business. Gaining it was traditionally the domain of offline, expert-assisted data mining, but modern machine learning techniques are making it ever more commoditized and affordable. The trend is now reaching the point where you must either jump on board or be left in the dust by your competition.

The key new challenge is making the trained prediction model usable in real time, while the user is interacting with your software. There is still a divide between data scientists, who work in an environment optimized for fast trial-and-error runs on historical data, and the task of getting their finished models to perform at production scale, in real time.

In this talk I will show one approach that lets you write a low-latency, auto-parallelized, distributed stream processing pipeline in Java that seamlessly integrates a data scientist's work, taken in almost unchanged form from their Python development environment.

The talk includes a live demo on the command line and walks through some Python and Java code snippets.