The Data Science Lifecycle (DSLC) is the cornerstone of any mature data science capability. Most of what has been published on this topic is marketing material rather than methodology. Companies with a clear lifecycle can expect consistent, monetizable work products from their data science teams.
My take on the DSLC is shaped by eight years of both building data science products and working with companies to build data science capabilities. My view of many of the key elements is different and more detailed because I have had to implement them as a data scientist and build them as a data strategy consultant. I come at this with two goals:
- Give data scientists a repeatable process that results in viable data products.
- Give the business a process they can manage that allows them to monetize data science and machine learning capabilities.
From a business perspective, the DSLC allows a consistent team approach to data science and machine learning in which each role is well defined. That has advantages for hiring, strategy planning, and project management. It is a managed process that allows for oversight, with the goal of either revenue or cost savings clearly in mind.
The DSLC Package covers 4 phases in detail:
- Data Science
- Data Engineering
Each phase includes entry and exit criteria as well as clearly defined progress gates.