Failure Attitude

In my experience, there are two types of companies when it comes to big data: those that don't want anything to do with it (because they think it's for somebody else) and those that desperately want to implement it but don't know where to start. Regardless of where your company falls on that spectrum, there are several attitudes I encounter regularly that can kill a big data project faster than anything else. Identifying and neutralizing these attitudes is key to getting a project off the ground and into implementation.

We are not a data company.

Every company is now a data company, and you'd better wake up to that fact. Data is everywhere and a part of everything, and I cannot think of a single industry or business that couldn't benefit from understanding more about their customers, their sales cycles, demand for their product or service, or their production inefficiencies. Just because you don't yet know how big data could benefit your company doesn't mean it won't. I am currently working with a bus and coach company that had very traditional views and didn't think data mattered to them. Now they are collecting and analyzing telematics data from their vehicles to improve driving behaviour, as well as to optimize routes and maintenance intervals. They have also started to understand their customers better by collecting and analyzing time- and location-stamped data on ticket purchases.

Too expensive.

This is a flat-out myth: you can get started with relatively cheap cloud services and open-source software. People also believe that in order to start using big data, they need to bring in expensive data scientists as full-time employees. The truth is, a good consultant can get you set up and an analyst can help you understand your data long before you need to hire a full-time data scientist. The same bus and coach company I mentioned above is now storing their data on rented, cloud-based Hadoop clusters, which keeps entry costs low. The company has also started to partner with a local university to analyze their data and develop better algorithms.

We have to collect as much data as possible.

This attitude simply leads to data obesity. In fact, in my experience, when clients ask for more data it is usually because they don't know what they need. To avoid this, start by determining the business problem the data will help you to solve. Once you have identified how data can add value to the business, go from there and find the data you need, rather than the other way around. One of my large retail clients put a hold on obsessive data collection by challenging their data team to build the smallest possible data set that would help answer their most important business questions. This shifted the focus away from a 'let's collect everything we can' attitude towards one where data is only collected if there is a clear business reason to do so.

We already have more data than we need.

It is true that most companies are already overwhelmed by the amount of data in their business, and the thought of collecting more fills many managers with dread. However, the proliferation of data means that there are many new data sources we can use, and what's more, many of those data sets can be accessed for free. A great example comes from a zoo, which was able to significantly improve its visitor and revenue predictions by pulling in free weather forecast data from the national weather service. Smart analysts will always ask what additional information they could use to solve their business problems.
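
To make the zoo example concrete, here is a minimal sketch of how an analyst might join free weather forecast data onto existing visitor counts and check whether the extra source improves a simple prediction. It is written in Python; the file names and column names are hypothetical, and this is an illustration of the idea rather than the zoo's actual pipeline.

    # Minimal sketch (hypothetical file and column names): enrich a
    # visitor-prediction model with free weather forecast data.
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    # Daily visitor counts the organisation already holds: date, visitors
    visits = pd.read_csv("daily_visitors.csv", parse_dates=["date"])
    # Free forecast data from a national weather service: date, temp_c, rain_mm
    weather = pd.read_csv("weather_forecast.csv", parse_dates=["date"])

    # Join the two sources on date and add a simple calendar feature
    df = visits.merge(weather, on="date")
    df["weekend"] = (df["date"].dt.dayofweek >= 5).astype(int)

    # Baseline model uses the calendar only; the enriched model adds the weather columns
    baseline = LinearRegression().fit(df[["weekend"]], df["visitors"])
    enriched = LinearRegression().fit(df[["weekend", "temp_c", "rain_mm"]], df["visitors"])

    print("R^2 without weather:", baseline.score(df[["weekend"]], df["visitors"]))
    print("R^2 with weather:   ", enriched.score(df[["weekend", "temp_c", "rain_mm"]], df["visitors"]))

If the weather columns noticeably raise the explained variance, that is a strong hint the free external data set is worth keeping.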

It’s only something Silicon Valley companies do.

OK, I'm not sure many still believe that, but it goes back to the first point: nearly every industry can benefit from data. Even the most traditional of companies are turning to big data. Take Midwest farm machinery manufacturer John Deere as an example: the business now collects data from sensors on its machines and probes in the soil to give farm managers insights into how much fertilizer to use, how to save money on fuel, and the crop yields they can expect. John Deere has become a big data company. Other traditional companies are doing the same: trucking companies use data to plan more efficient routes, real estate companies predict booms and busts in the market, and motor insurance companies use their customers' smartphones to track how well they really drive.

Everyone else is already ahead of us.

Putting your head in the sand now is not going to make it any better in the future. Adoption rates of big data technologies have gone up year on year, and the speed at which new companies are joining the big data movement is accelerating. Even so, while most companies have expressed an intention to use big data, the majority are still in pre-implementation or pilot stages. In other words, you might be in better company than you think.

Our customers aren’t asking for it.

Chances are, even if they're not asking for "big data" in so many words, they're asking for the kind of information and analysis it can provide. If they're looking for things like a more personalized service, comparative pricing, optimized supply chains or flexible maintenance cycles, they're asking for things that only data can help you deliver. And the hard truth is: if you don't provide it, someone else will. A wonderful example is Babolat, one of the oldest racket sport companies in the world, which now sells an innovative tennis racket that collects data from sensors inside the racket and sends it wirelessly to your smartphone, so that players can get immediate analysis of their play. Customers demand smarter products and services, which is why smart TVs, smartphones, smartwatches, and even smart diapers and smart yoga mats will be outselling their dumber counterparts.

These are just a few of the negative attitudes I've encountered when someone in a company is uncertain about implementing big data technologies. These misconceptions can only be overcome with education and concrete examples of how big data can benefit business and society.
