The Data Science Process

Below is a summary of the GABDO Process Model, which I created and introduce in my book, AI for People and Business. Data scientists usually follow a similar process, especially when building models with machine learning and related techniques. The model consists of five phases (goals, acquire, build, deliver, and optimize), which give it the acronym GABDO. The phases are iterative because any phase can loop back to one or more earlier phases. Feel free to check out the book if you’d like to learn more about the process and its details.

Process: Tasks
Goals: Identify goals; identify opportunities; create hypothesis
Acquire: Identify data; acquire data; prepare data
Build: Explore; select; prepare data
Deliver: Identify data; acquire data; train, validate, test; improve
Optimize: Monitor; analyze; improve

That’s the process in a nutshell.
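
To make the iterative part concrete, here is a minimal Python sketch of the five phases and the transitions between them. The names Phase, allowed_next, and acronym are hypothetical and used only for illustration. The rule that forward movement happens one phase at a time is my assumption; the model description only says that any phase can loop back to one or more earlier phases.

```python
from __future__ import annotations

from enum import IntEnum


class Phase(IntEnum):
    """The five GABDO phases, numbered in their forward order."""
    GOALS = 1
    ACQUIRE = 2
    BUILD = 3
    DELIVER = 4
    OPTIMIZE = 5


def allowed_next(current: Phase) -> list[Phase]:
    """Return the phases reachable from `current`.

    Any earlier phase is reachable (the loop-back property). Moving
    forward by exactly one phase is an assumption of this sketch.
    """
    earlier = [p for p in Phase if p < current]
    forward = [Phase(current + 1)] if current < Phase.OPTIMIZE else []
    return earlier + forward


def acronym() -> str:
    """Build the acronym from the first letter of each phase name."""
    return "".join(p.name[0] for p in Phase)


if __name__ == "__main__":
    print(acronym())
    for phase in Phase:
        names = ", ".join(p.name.lower() for p in allowed_next(phase))
        print(f"{phase.name.lower()} -> {names}")
```

For example, from the build phase this sketch allows a return to goals or acquire, or a step forward to deliver.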

Summary

Harvard was right about data scientists. It’s an extremely important, high-demand role that can have a significant impact on a business’s ability to achieve its goals, whether they are financial, operational, strategic, or something else. Companies collect a ton of data, and much of the time it’s neglected or underutilized. Through meaningful information extraction and the discovery of actionable insights, this data can be used to make critical business decisions and drive significant business change. It can also be used to optimize customer success and the resulting acquisition, retention, and growth. As mentioned, data scientists can have a major positive impact on a business’s success, but they can also inadvertently cause financial loss, which is one of the many reasons why hiring a top-notch data scientist is critical.

Now, it is time to test your knowledge!