Data Science — Make or Buy?

Companies are increasingly turning to data science and data analytics solutions to put the sea of available data to work for their own business.

The question surrounding data science is therefore no longer primarily whether, but how. Recent studies confirm this: the main reasons that keep companies from using these advanced technologies are, in particular, a lack of know-how and a lack of personnel.

Anyone who wants to benefit from the far-reaching advantages therefore faces the fundamental question of how to obtain the necessary skills. There are basically two options: first, developing data science resources internally (make) or, second, purchasing external data science services (buy). It is not an easy decision. In this article, we take a closer look at both options, focusing in particular on their benefits and challenges.

OPTION 1: MAKE

The make decision means developing the required data science skills internally. This can be done either by training existing employees or by hiring data experts. The plural "experts" is deliberate: because of the complexity of data science initiatives, a team of specialists often works together on such a project. In addition to the data scientist, this includes, for example, the data engineer, the machine learning engineer and the software developer.

MAKE DECISION: WHAT SPEAKS FOR IT?

The development of internal data science resources has clear advantages. The main reasons include the following:

INTERNAL COMPETENCY DEVELOPMENT

Over the long term, the company builds up competencies that can be called upon at almost any time. Data science skills can thus be used in all areas of the company and, given good internal networking, across departments.

BUSINESS KNOW-HOW

Your own employees know the business in a way that an external person rarely can. This is particularly advantageous when implementing use cases that require very specific domain knowledge.

STRATEGIC ANCHORING

Since data science resources are often scarce, it makes sense to have a data strategist in the company who can prioritize internal requirements and the associated use of resources in accordance with the data strategy. This starts with questions about the most important data sources in the company and extends to prioritizing the specific use cases to be implemented. In addition, getting off to a good start with data science requires the right tools (both IT infrastructure and technology stack). Here, too, it is an advantage if one responsible person selects the tools strategically.

Regardless of the benefits, building up data science capacities internally also poses certain challenges.

MAKE DECISION: WHAT MAKES SETTING UP A DATA SCIENCE TEAM SO DIFFICULT?

To realize the desired value from your own data science team, there are a few hurdles to overcome, which we briefly outline below.

SHORTAGE OF SKILLED WORKERS AND LESS EXPERIENCED STAFF

Hiring data experts and building your own data science resources is no easy task given the shortage of specialists. The number of vacancies for IT specialists set a new record in 2019, with more than 124,000 open positions. The specialists in demand are primarily software developers and data scientists. So anyone who wants to build up their own data science capacities is in a battle for talent.

It is even more difficult to recruit experienced data experts with several years of professional experience. The reason? The job profiles are simply still relatively new and universities and colleges are only now beginning to train data scientists.

On the other hand, training and continuing education of existing employees offers an alternative that requires an investment of time, but not necessarily a large financial outlay. In this context, people often speak of developing subject matter experts into so-called citizen data scientists, a term coined by Gartner. The term describes people who have a basic understanding of working with data as well as a certain amount of statistical and mathematical knowledge, and who are therefore able to take on certain analytical tasks of a data scientist. Citizen data scientists may not replace data scientists, but they can play a complementary role and, above all, work more productively with data experts.

THE RIGHT TEAM COMPOSITION

When setting up and assembling your data science resources, you should also consider the existing IT infrastructure and the framework conditions already in place in your company, because all of these things affect the roles required and the size of the team.

For example, if your company already has sophisticated data management and a well-maintained data warehouse, the basic requirements and the starting point for the work of a data scientist are already in place. If this is not the case, the data is often still scattered across the respective source systems, and the appropriate framework conditions must first be created before a data scientist can get started. This task requires suitable experts, whom you can of course hire, but this in turn increases the size of the team and thus the costs.

INTEGRATION OF THE DATA SCIENCE TEAM IN THE COMPANY

Anyone who decides to set up their own data science team is also faced with the question of how the team should be integrated into the company. There are a number of options for this, ranging from the completely decentralized integration of experts in the respective department to the establishment of a central unit, for example in the form of a Center of Excellence (CoE). Whether centralized or decentralized integration is the right option depends on various factors. The recommendations often differ here and it is important to weigh up the advantages and disadvantages of the various procedures individually for your company.

The decision to set up your own data science resources is primarily a strategic one. Therefore, the type of integration and the required expertise in the form of the various data experts ultimately depend largely on the goal pursued, i.e. the data strategy of your entire company.

OPTION 2: BUY

The alternative to having your own department is purchasing external services. The options here range from standard software bought as a finished product to custom software developed according to your needs. Of course, this also includes consulting and data science projects that are carried out once to answer a specific question.

BUY DECISION: WHAT SPEAKS FOR IT?

Purchasing external services provides quick access to the far-reaching benefits of data science solutions, usually at a fraction of the cost and, above all, the time that in-house development would require. However, the reasons for the buy approach are much more diverse. We have summarized the most important ones below:

EXPERTISE AND EXPERIENCE

First of all, a major advantage is that external service providers have the sought-after data experts and can bring the ideal team and therefore the right mix of required competencies to every project. In addition, external service providers can usually draw on many years of project experience with various customers and thus have a range of best practices.

LEARN FROM EXPERTS

By working with external experts, you can learn a lot on both a technical and a methodological level, for example regarding the development and evaluation of use cases or the facilitation of data workshops.

SPORADIC DEMAND

Certain, usually very specific, competencies, such as setting up a data lake or expertise in deep learning, are often needed only sporadically. Instead of building up these competencies internally, it is more economical to buy them externally.

TIME UNTIL USE

Another advantage of the buy option is the very short start-up time. Projects can start almost immediately and the onboarding process when implementing standardized solutions often takes place within a few weeks.

FASTER ROI THROUGH SCALING

Especially when starting out in data science, initial efforts are often required, such as setting up a data warehouse or data lake, before the value-adding implementation of data science use cases can begin. In order to achieve faster ROI, use case implementation can be scaled with an external service provider, so that, for example, one use case is implemented internally and one or more can be implemented externally in parallel.

LOWER INVESTMENT RISK

In contrast to the make option, the buy approach involves a significantly lower capital commitment, as no major investment in personnel, tech stack and infrastructure is required.

External service providers are the ideal option, especially for getting started and making initial contact with data science. With the help of service providers, use cases can be developed quickly and certain structures can be set up. By implementing projects that are often small and manageable at the beginning, you can build up experience and trust and create acceptance of the topic in your company.

BUY DECISION: WHAT CHALLENGES ARE THERE TO CONSIDER?

The buy approach also poses certain challenges that should not be ignored. Specifically, the following two factors are decisive here:

CHOOSE THE RIGHT PARTNER

Choosing the right data science service provider is the biggest and most important decision, and therefore a challenging task. A service provider should not only have excellent data science skills, but ideally also comprehensive industry and domain expertise. It is therefore advisable to ask for reference projects in order to find out who the service provider has worked with in the past, what specific challenges were solved and what results were achieved.

When purchasing external data science services, the following always applies: The better the service provider knows your company, your strategic goals, your internal processes, etc., the better the chances of success. A good service provider therefore takes time to exchange ideas and listens carefully in order to implement your wishes and ideas in the best possible way.

LIMITED DEVELOPMENT OF COMPETENCIES

One disadvantage of the buy approach is that the company's own competencies naturally develop to a lesser extent, which creates a certain dependence on external partners. However, good service providers pay great attention to knowledge transfer and promote knowledge development through targeted support.

CONCLUSION

As with so many things, there is no universally valid answer to the make or buy question. The short answer is: If data science plays a central role in corporate strategy and is part of the core business, then make is the way to go. In most cases, however, it is a question of implementing solutions that improve internal business processes or support activities that are not part of the core business. In these cases, buying and using already proven products is usually the better option.

The long answer to the question is: The path to the right solution depends largely on your strategic goals, your requirements and your time and budget constraints. To determine the best option for your company, you should therefore consider the following factors, among others:

  • Your business problem to be solved
  • The analytical maturity level of your company
  • The availability of employees with the relevant knowledge
  • The urgency of implementing the solution
  • Your available budget

And our final recommendation? It doesn't always have to be a make OR buy decision. Our experience has shown that the make AND buy approach often proves successful: This means building internal competencies that support daily needs and drawing on external support for more complex data science projects.