V-Squared Data Strategy Consulting
Machine Learning

I hear a lot of complaints about machine learning from business clients. “It’s too complicated” is one of them. Machine learning is complex by nature. The more that complexity spills over to users and executive management, the more turned off to machine learning they become. I’ve seen it happen in real time. Eyes glaze over in presentations. Users are left confused. Support staff spend a week Googling to understand the training they’ll need for the tech stack they’ll be supporting.

That reaction is starting to leave a bad taste in businesses’ mouths when it comes to machine learning. There is a lot of frustration in the machine learning community too. Believe me, I’ve been there when senior decision makers admit they haven’t read any of the supporting materials, and I’ve been there trying to explain linear algebra to developers who’ve long forgotten their math education. I look at it this way: we’re the ones trying to get them to adopt a new technology, our technology. Shouldn’t we make it as simple as possible? With our grasp of the concepts, shouldn’t we be able to explain it to anyone? If you disagree, by all means, light the fires in the comments section.

Dropping the Jargon & Becoming an Educator

This one sounds easy but it’s not. Jargon isn’t just the buzzword bingo terms like “the intersection of big data and finance” or “prescriptive analytics.” Ladies and gentlemen of the machine learning community, there are maybe 25K to 30K people on earth who know what a perceptron is. Our simplest neural network is often the most complex work term your audience will hear this year. Terms like “convolutional neural network” or “regression” count as jargon to non-machine-learning audiences.

I was in a marketing meeting where a data scientist started using terms like “regression” and “cohort analysis” without first explaining them. Almost immediately, people started reaching for their phones or checking email on their laptops. The whole room was lost in the space of 30 seconds.

We need to introduce concepts, then connect them to terms. Leading with the term almost always loses the room. We need to educate people on concepts in plain language and repeatedly reinforce the connection to the term. Once you hear your audience using the term correctly, you’re good to start using it yourself. Our role as educators cannot be overstated.

Machine Learning as a Story

Just as we’ve become excellent data storytellers, we also need to tell stories about the methodology. I use a factory floor analogy. Each piece of the machine learning architecture is a machine. A machine takes inputs, does some sort of assembly task, and outputs either a finished product or a piece for the next machine. Machines have defect rates: sometimes what comes out is faulty, and that is where our quality control process comes in.

I’ve seen a great presentation using the analogy of a route on a map. Freeways were long analytics processes and streets were quick data transformations. Picking up passengers was additional data sources. Conversations in the car were data sharing between processes.

The target audience wants to leave a presentation with a story to tell. Give them a simple story to explain a complex system and you’ve succeeded. Give them follow-up materials so they can take a deep dive if they want to, but don’t make that a prerequisite for future stories. Understand each role in the room and let people know which pieces of the story are most relevant to them.

Simplify the Tech Stack

I’ve been brought in to simplify four really bad spaghetti tech stacks. I’ve seen Python and R used at an all-Java shop. I’ve seen open source soup: four different brands of NoSQL, Hadoop and Spark, with four different programming languages. I’ve heard, “We don’t really know why that’s there but when we tried to uninstall it everything crashed.” I’ve seen comments in code like, “//Something magic happens here. Do not touch.” I’ve seen machine learning with JavaScript. I’ve seen horrors, but nothing compared with the stories of the IT groups who had to support it all.

When we architect machine learning solutions, we need to pretend that the person responsible for supporting or maintaining them is a serial killer who knows where we live. I try to keep the tech stack as close to what is already in place as possible. I know C# and Java aren’t the first choice for machine learning, but they have a lot of upside to balance out the lack of open source support. I know it can be done in JavaScript, but that doesn’t mean it should be. I understand how cool each flavor of NoSQL is, but that doesn’t mean we should use all of them.

I’m poking fun at myself as much as the rest of the community, but there’s a real cost associated with spaghetti stacks. ROI is a huge concern around machine learning. Very few machine learning projects have demonstrated the high returns promised by the technology. A big part of the cost overrun is maintenance and support. Sometimes it’s cheaper for the business to scrap the initial effort and start from scratch. Getting the green light for a second machine learning initiative is an uphill climb when the first one came in at 3X budget and was delivered a year late.

Resolving the complaints about machine learning products is what will keep businesses investing in machine learning. As we enter the hype cycle, it’s important to be focused on what our clients and users expect from machine learning. Complexity is a barrier to use and ROI. It takes work to remove the complexity but it’s part of delivering a complete product.
