Four Analytics Problems…That Our Customers Solved
When companies decide it’s time to become “data-driven,” there are a number of common pitfalls they tend to fall into. One we see quite often: building their own internal data application platform.
Want to build up your own data stack from scratch? You’ll have to:
- Choose and build your own workflow manager;
- Choose, set up, and host your own query engine (Spark? Hive? Presto?)
- Choose, set up, and possibly host your own data storage (Redshift? EMR? S3 flat files? Or…?)
- Choose, set up, and configure your own data ingestion (Kafka? Kinesis?)
Why not choose a cloud-based solution where all of this is ready to go?
Since we’re not just a pipeline (we’re a Data Application Platform, with automatic scalability, multi-tenancy, and VPC security built right in), we save our customers the time and money of building up their own data stack.
Grindr – Waking up from SDK Fatigue and Data Loss
As the first ever geolocation-based dating app, Grindr quickly grew to just under 1 million active users in under three years. However, the marketing team needed real usage data for the 198 countries they now operated in, and, at 30,000 API calls per second and 5.4 chat messages per hour, Grindr hit a number of bottlenecks.
First off, their original RabbitMQ pipelines began to lose data during periods of heavy usage, and datasets quickly scaled beyond the size limits of a single-node Spark cluster.
What’s more, multiple data collection SDKs running in the app at the same time started to cause instability and crashes, leading to a lot of frustrated Grindr users. The team needed a single way to capture data reliably from all of its sources.
With Treasure Data, they were able to eliminate these problems.
You can read more about how Grindr’s team beat SDK fatigue and data loss to deliver real usage data (in real time) here.
Mobfox – Scaling the connection pool
Mobfox, one of the largest mobile advertising platforms in Europe, had a problem. A couple of them actually. As they hit the limit of their sharded MySQL approach, they experienced:
- Difficulty sizing servers for high peak-to-average write loads
- Huge, unwieldy connection pools to a central DB server
- High-volume write loads causing buffer overruns during garbage collection
However, by partnering with Treasure Data – and making a few quick changes to their analytics pipeline – they were able to eliminate these bottlenecks entirely.
Learn more about how Mobfox aced their connection pool, volume and scaling problems with Treasure Data.
Wish – A/B testing…from A to Z
Wish, a personalized shopping mall on your phone, needed scalable data infrastructure right out of the box, so they went with Treasure Data from the very beginning. They also needed A/B testing, fast: how else would analysts understand the majority of shoppers, from many different demographics, who use the platform?
In just under 18 months, the ecommerce platform grew to 15 million daily active users. In just two hours, Wish’s co-founder integrated Treasure Data’s open source data collector — fluentd — with their backend servers to capture enriched HTTP request logs in Treasure Data’s cloud data lake.
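For reference, a fluentd pipeline of this shape is usually just a source block plus a match block. The sketch below is illustrative, not Wish’s actual configuration: the port, API-key placeholder, and buffer path are assumptions, and the output side assumes the standard fluent-plugin-td output plugin.

```
# Accept JSON log events over HTTP (the port is an illustrative choice).
<source>
  @type http
  port 8888
</source>

# Events tagged td.<database>.<table> are buffered to disk, then
# uploaded to Treasure Data by the fluent-plugin-td output plugin.
<match td.*.*>
  @type tdlog
  apikey YOUR_TD_API_KEY
  auto_create_table
  <buffer>
    @type file
    path /var/log/td-agent/buffer/td
  </buffer>
</match>
```

Because fluentd buffers to local disk before uploading, short network outages don’t drop log events — one reason it suits high-volume request logging.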
Read all about how Wish built up their data pipeline to support A/B testing from the start.
Muji – merging Point of Sale and Web data for customer loyalty
With 414 physical stores in Japan, 344 brick-and-mortar stores worldwide, and an ecommerce site, Muji, a global lifestyle retailer, had a problem: how to combine data from disparate Point of Sale systems and its ecommerce site to get unified customer analytics?
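Conceptually, that unification step is a join of both channels on a shared customer key. Here is a minimal sketch in Python — the field names and records are hypothetical, not Muji’s actual schema:

```python
from collections import defaultdict

def unify_customer_records(pos_transactions, web_events):
    """Merge point-of-sale and web records into one profile per
    customer, keyed on a shared customer ID (illustrative sketch)."""
    profiles = defaultdict(lambda: {"store_purchases": [], "web_sessions": []})
    for tx in pos_transactions:
        profiles[tx["customer_id"]]["store_purchases"].append(tx["item"])
    for ev in web_events:
        profiles[ev["customer_id"]]["web_sessions"].append(ev["page"])
    return dict(profiles)

# The same (hypothetical) customer appears in both channels:
pos = [{"customer_id": "c1", "item": "notebook"}]
web = [{"customer_id": "c1", "page": "/checkout"}]
print(unify_customer_records(pos, web))
# → {'c1': {'store_purchases': ['notebook'], 'web_sessions': ['/checkout']}}
```

The hard part in practice is not the join itself but establishing that shared key across systems that were never designed to talk to each other.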
So, how did Muji do it? Go here to find out more.
Want to get the best out of your data analytics infrastructure with the minimum of expense, hassle, and configuration? Try out Treasure Data.
Our universal data connectors can route almost any data source to almost any data destination. Want an integration we don’t support yet? Ask us!