Alexei Chernobrovov, Consultant on Analytics and Data Monetization

Be Data-Driven: Who a Data Strategist Is and What a Data Strategist Does

Since 2010, a new specialty has emerged in the field of Data Science: the Data Strategist. This article covers who Data Strategists are, the benefits they bring to business, what they need to know and be able to do, and how they differ from CDOs, Data Scientists, Architects, Engineers, and Data Analysts.

Who is a data strategist and why is he needed?

Nowadays, businesses understand the value of data and strive to become data-driven. They are introducing Chief Data Officer (CDO) positions, hiring Data Scientists, and looking for Data Analysts. However, none of these professionals covers the gaps between the narrowly focused tasks of the data pipeline, and these gaps grow as the business becomes more digitalized. While there is a common understanding of information as a corporate asset, the local meaning of the term depends on the context in which it is used. For a Data Analyst, data is a set of metrics and indicators from various business applications; for a Data Engineer, it is an object of storage and transfer between different information systems; and for a Data Scientist, it is a dataset for ML models. Similarly, the Data Architect and the Data Steward each look at information from their own technical perspective, designing warehouses and data lakes or ensuring the quality and cleanliness of datasets.

Thus, none of these data specialists has a common vision of the situation, because each focuses only on their own part of the data pipeline. The general task of data management is the responsibility of the CDO, but the CDO (Data Director) is, above all, a top manager who solves managerial issues without going into too much technical detail. Meanwhile, organizing the corporate data pipeline requires continuous maintenance of information at every stage of its processing, from collection to presentation in the dashboards of BI systems (Fig. 1). This is exactly what a Data Strategist does: he knows the subject area at every point in the data pipeline and has the organizational capabilities to connect these points seamlessly. We can say that a Data Strategist transforms the business requirements for data into a product (insight), combining the roles of System Analyst, Project Manager, and Technologist in the information collection and data processing operations [1].


Figure 1. Data pipeline and information processing tasks at each of its stages
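To make the idea of a pipeline "from collection to presentation" more tangible, here is a minimal Python sketch. It is not the author's implementation: the stage names (`collect`, `clean`, `to_metrics`, `present`), the sample order records, and the revenue-per-region metric are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical raw records, as a business application might export them.
RAW_ORDERS = [
    {"region": "north", "amount": "120.0"},
    {"region": "south", "amount": "80.5"},
    {"region": "north", "amount": None},  # incomplete record
    {"region": "south", "amount": "19.5"},
]


def collect():
    """Collection stage: here it simply returns the in-memory sample."""
    return list(RAW_ORDERS)


def clean(records):
    """Quality stage: drop records without an amount and cast types."""
    return [
        {"region": r["region"], "amount": float(r["amount"])}
        for r in records
        if r["amount"] is not None
    ]


def to_metrics(records):
    """Analytics stage: aggregate revenue per region."""
    revenue = defaultdict(float)
    for r in records:
        revenue[r["region"]] += r["amount"]
    return dict(revenue)


def present(metrics):
    """Presentation stage: render lines that a dashboard could display."""
    return [f"{region}: {total:.2f}" for region, total in sorted(metrics.items())]


def run_pipeline():
    """Chain the stages in order, each consuming the previous stage's output."""
    return present(to_metrics(clean(collect())))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Each function stands in for the work of one narrow specialist; the chaining in `run_pipeline` is the part that no single stage owns, which is the gap the article assigns to the Data Strategist.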


Data Strategist vs Data Scientist and other Data Experts: similarities and differences

Therefore, the Data Strategist does not compete with the Analyst, Researcher, Engineer, Steward, or Director of Data, but rather complements the work of each of these roles. The Data Strategist implements effective design and continuous maintenance of the data pipeline, ensuring the continuous flow of information, considering business needs and technical details.

First of all, having a data strategy transforms the typical data pipeline. The Data Strategist begins designing the pipeline by asking what data and insights are important for short-, mid-, and long-term business development. While the cost of storing and processing data falls by the day, the value of information increases as the business becomes digitized, that is, as more and more processes become data-driven. Therefore, data strategy must align with business strategy, prioritizing potential opportunities and risks. Thus, a Data Strategist has to think like a business operator [1].

The Data Strategist is also an Analyst who understands whom to ask and what questions to ask in order to identify business needs and the metrics for their satisfaction. He knows how data becomes metrics and then turns into insights. The Data Strategist is well-versed in the processes of collecting and transforming data, preparing it for ML modeling, and interpreting the results. He is aware of all the bottlenecks of this pipeline, has a professional understanding of each data storage and processing platform, and knows the features of datasets and the tools for working with them. However, unlike the Data Engineer, the Data Strategist does not directly configure the software for ETL/ELT processes; unlike the Big Data Architect, he does not design data warehouses and Data Lakes; and unlike the Data Scientist, he does not develop new ML algorithms. In addition, the Data Strategist navigates the regulatory aspects of working with information. For example, he works with lawyers to develop corporate policies for handling personal data in accordance with GDPR and the local laws of individual countries. You can read more about personal data and how to protect it here.

Thus, the Data Strategist, like the CDO, has an organizational role, coordinating the transfer of data from one narrow domain to another. However, as noted above, the Data Director addresses these issues at a higher level of abstraction, while the Data Strategist dives down to the level of technical detail. In particular, the Data Strategist can distinguish a typical data processing case from a unique one and so reduce the overhead of developing a new software solution by unifying the individual steps in the pipeline. In doing so, the Data Strategist takes into account the potential usefulness of such a solution for the business, the complexity of its technical implementation, and the possible risks [2]. However, unlike the CDO, a Data Strategist is not involved in the design of the enterprise architecture, although he can advise on it in terms of information support for business processes. The Data Strategist, like other data specialists (Engineer, Analyst, Architect, Researcher), is functionally subordinate to the CDO, performing the role of a line employee, not a manager.
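The trade-off between usefulness, implementation complexity, and risk can be illustrated with a toy ranking. The sketch below is an assumption made for illustration only: the 1-to-5 scales, the `priority` formula, and the candidate solutions are hypothetical and do not come from the article or from source [2].

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate pipeline solution scored on hypothetical 1..5 scales."""
    name: str
    usefulness: int  # potential business value, 1 (low) to 5 (high)
    complexity: int  # technical implementation effort, 1 (low) to 5 (high)
    risk: int        # possible risks, 1 (low) to 5 (high)


def priority(candidate):
    """Hypothetical rule: value divided by the combined cost of effort and risk."""
    return candidate.usefulness / (candidate.complexity + candidate.risk)


def rank(candidates):
    """Return candidates ordered from highest to lowest priority."""
    return sorted(candidates, key=priority, reverse=True)


CANDIDATES = [
    Candidate("custom connector for a unique source", usefulness=5, complexity=4, risk=3),
    Candidate("shared ETL template for typical sources", usefulness=4, complexity=1, risk=1),
    Candidate("one-off report script", usefulness=2, complexity=2, risk=1),
]

if __name__ == "__main__":
    for c in rank(CANDIDATES):
        print(f"{priority(c):.2f}  {c.name}")
```

With these made-up scores the unified template for typical cases ranks first, which mirrors the point about reducing overhead by unifying pipeline steps; a real assessment would use criteria agreed with the business.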

At the operational level, the Data Strategist is at the same level as the DataOps engineer, working with a set of practices to continually integrate data across processes, teams, and systems to improve corporate governance or industry collaboration (Fig. 2). This is accomplished through Big Data technologies of distributed information collection and processing, centralized analytics, and flexible data access policies that take into account confidentiality, usage restrictions, and integrity compliance. However, the work of a DataOps engineer is the operational process of automating and monitoring data management throughout its lifecycle [3]. The Data Strategist's work involves a strategic view of Data Governance processes, taking into account current conditions and future business perspectives, and therefore is more aligned to project cycles than DataOps.

Figure 2. What a Data Strategist works with

The Data Strategist’s main functions and responsibilities

The main goal of a Data Strategist's professional practice is to transform data into business insights through the continuous maintenance of information flows and effective management of data sets using Big Data technologies and Data Science methods, as mentioned earlier. The key task is to design data pipelines, taking into account the medium and long-term prospects of data-driven business development. Data Strategist responsibilities, therefore, are as follows:

  • Expert evaluation and trend analysis in the domain and areas of Big Data, Data Science, Business Intelligence, Machine Learning, and Artificial Intelligence;
  • Identifying data requirements and processes based on business needs;
  • Identifying correlations of market and government regulations and trends with corporate challenges, opportunities, constraints, and potential risks;
  • Identifying data storage, processing, and presentation needs through interviews, brainstorming sessions, and workshops with business experts, users, and executives;
  • Developing strategic roadmaps, project plans, and detailed action plans for data processes;
  • Supporting projects to implement new technologies and tools for collecting, processing, and using data;
  • Designing and implementing data processing pipelines;
  • Analyzing, optimizing, and re-engineering data processes;
  • Managing teamwork with other Data specialists: Engineers, Scientists, Analysts, Developers, Administrators, and Users.

Knowledge and skills of a Data Strategist

On the surface, the competencies of a Data Strategist overlap with the knowledge, skills, and abilities of an Architect, Engineer, Analyst, and Data Scientist. All of these professionals need a technical background, a systems mindset, and analytical skills. However, a Data Strategist has a broader set of skills than the colleagues mentioned above. To draw an analogy with the medical profession, a Data Strategist fulfills the role of a general practitioner, while an Analyst, Engineer, and Researcher are narrow specialists. Therefore, domain knowledge and an understanding of data processes from a business perspective are much more important for a Data Strategist than, for example, being able to write an efficient ML algorithm or a HiveQL query against a Hadoop Data Lake. Nevertheless, it's worth highlighting the required basics of what a Data Strategist should know and be able to do (Fig. 3):

  • The Big Data technology stack, from Apache Hadoop to Kafka Streams, at least at the level of familiarity with the purpose and basic functionality of each framework;
  • Data Science and Data Mining techniques, from statistical analysis to neural networks and other machine learning algorithms, to understand the nature of the data used and plan the processes of their preparation and use;
  • Architectural models of enterprise warehouses, lakes, and databases to effectively design information loading and unloading pipelines: Lambda, Kappa, SQL, NoSQL, and NewSQL;
  • Data and information systems integration techniques: consolidation, federation, distribution, SOA (Service Oriented Architecture);
  • Software engineering, especially systems analysis and project management techniques, including Agile approaches;
  • Techniques for defining and identifying business needs, from interviews to brainstorming, as well as methods for assessing the potential benefits and risks of a future solution.
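Of the architectural models listed above, Lambda and Kappa are the easiest to confuse, so a small illustration may help. The following Python sketch is a simplified assumption, not a production design: the page-view events and the function names (`batch_view`, `speed_view`, `serve`, `kappa_view`) are hypothetical, and plain in-memory lists stand in for real storage and streaming systems.

```python
from collections import Counter


def batch_view(master_events):
    """Lambda batch layer: recompute counts from the full event history."""
    return Counter(e["page"] for e in master_events)


def speed_view(recent_events):
    """Lambda speed layer: count events that arrived after the last batch run."""
    return Counter(e["page"] for e in recent_events)


def serve(batch, speed, page):
    """Lambda serving layer: answer a query by merging both views."""
    return batch[page] + speed[page]


def kappa_view(event_log):
    """Kappa: a single streaming job over the log; replay the log to recompute."""
    counts = Counter()
    for event in event_log:
        counts[event["page"]] += 1
    return counts


# Hypothetical page-view events.
HISTORY = [{"page": "/home"}, {"page": "/pricing"}, {"page": "/home"}]
RECENT = [{"page": "/home"}]

if __name__ == "__main__":
    lambda_answer = serve(batch_view(HISTORY), speed_view(RECENT), "/home")
    kappa_answer = kappa_view(HISTORY + RECENT)["/home"]
    print(lambda_answer, kappa_answer)
```

Both approaches give the same answer here; the difference a Data Strategist needs to recognize is operational: Lambda maintains two code paths (batch and speed), while Kappa keeps one streaming path and relies on replaying the log.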

Finally, so-called soft skills, such as teamwork, strategic vision, and effective communication, are especially important for a Data Strategist. A Data Strategist deals first with people and only then with data, acting as a universal translator between business and technical specialists at every point in the data pipeline [4].

Figure 3. Data Strategist's knowledge areas

To sum up

The digitalization of today's world is leading to the emergence of new professions at the intersection of technology and business. The Data Strategist is one of them. And while the world has already recognized the need for Data Strategists, the demand for them is still lower than for Data Analysts and Researchers. When this article was published in April 2020, Russian job sites did not return a single relevant result for the query "Data Strategist". Nevertheless, foreign sites offer many job ads for this position with an annual salary of 50 to 160 thousand dollars, which corresponds to roughly 330 thousand to one million Russian rubles per month at the exchange rate at the time of writing. The demand comes from large companies with a developed IT infrastructure, large amounts of data, and a high level of managerial maturity and digitalization. Since Western trends reach Russia with some delay, Russian HR managers can be expected to take an interest in Data Strategists within the next 5 years.

However, because domain knowledge is so important, some companies prefer to "bring up" their own specialist rather than hire externally [5]. Thus, the position of Data Strategist can be a new step in the professional career of a Data Scientist or Data Analyst who wants to work with information at a new level [6].

Nevertheless, most domestic companies today have only just embarked on the path of digital development. Therefore, it is premature to say that every business is already in dire need of a Data Strategist. The need for such specialists arises at maturity levels 4-5 of the CMMI model, when all business processes and procedures for working with data are well established and automated. If a company is only beginning to systematize its activities and trying to monetize production data, an in-house Analyst or even an external consultant can handle describing the data and developing the Data Governance processes.