Bringing Machine Learning to Elasticsearch with Treasure Data

Data analytics is a battle between order and chaos.

As organizations build up their data infrastructure, we often see divergent pipelines sprouting up to accommodate each team's needs. This leads to data silos as well as analytics silos: one team builds a data pipeline for itself while another team builds a separate, disjointed one.

By way of example, let’s consider log data. DevOps teams reach for Splunk or Elasticsearch for log analysis. They set up log collectors (Splunk forwarders for Splunk, Logstash or Fluentd for Elasticsearch) and visualize and analyze the logs in those tools. Data scientists, meanwhile, work in Python, R, or Spark. Occasionally, their data engineers (yet another group!) build a pipeline for them so that they can access structured log data in databases. In either case, most data scientists hardly interact with Splunk or Elasticsearch.

The lack of a shared toolchain results in poor analytics practices, duplicated effort, and weaker insights from data. For example, suppose a DevOps engineer wants to detect abusive users who are impacting the system’s stability, and simple rule-based heuristics aren’t cutting it. This is a perfect opportunity to apply data science to a DevOps problem. Data scientists can build statistical models that identify abusive users with anomaly detection algorithms and reduce false alarms, so that DevOps engineers can sleep better at night. This kind of collaboration, however, is rare, precisely because of those very data silos and the lack of communication!
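
To make the abusive-user example a bit more concrete, here is a minimal sketch of one simple anomaly detection approach: flagging users whose request count is a robust outlier, using a modified z-score based on the median absolute deviation. This is an illustration under stated assumptions, not the method from the webinar. The function name `flag_abusive_users`, the threshold of 3.5, and the sample request counts are all hypothetical.

```python
from statistics import median


def flag_abusive_users(requests_per_user, threshold=3.5):
    """Return users whose request count is an unusually high outlier.

    Uses the modified z-score, which is built on the median and the
    median absolute deviation (MAD) so that the outliers themselves
    barely shift the baseline. The threshold is an assumed default.
    """
    counts = list(requests_per_user.values())
    med = median(counts)
    # MAD: the median distance of each count from the overall median.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # no spread in the data, so nothing stands out
    flagged = []
    for user, count in requests_per_user.items():
        score = 0.6745 * (count - med) / mad
        if score > threshold:  # only unusually HIGH activity is flagged
            flagged.append(user)
    return sorted(flagged)


# Hypothetical per-user request counts aggregated from access logs.
sample = {
    "alice": 120, "bob": 95, "carol": 110,
    "dave": 130, "eve": 105, "mallory": 4800,
}
print(flag_abusive_users(sample))  # prints ['mallory']
```

Unlike a fixed rule such as "more than 1,000 requests per hour," the cutoff here adapts to whatever typical usage looks like in the data, which is one way a statistical model can cut down on false alarms.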

So how do we make our data analytics efforts more effective and useful? In last month’s joint webinar, we explored one potential solution. By complementing the DevOps engineers’ tool with Treasure Data (the data scientists’ and analysts’ service), one can gain deeper insights into operational data.

If your team is looking to improve operational intelligence by bringing a bit of data science into DevOps, I encourage you to check it out.

John Hammink
John Hammink is Chief Evangelist for Treasure Data. An 18-year veteran of the technology and startup scene, he enjoys travel to unusual places, as well as creating digital art and world music.