Graphical Representation Series
In a diagram adapted from an older article [1], Brent Dykes, the author of "Effective Data Storytelling" [2], draws a parallel between Data Analytics and marathon running, arguing that an organization must pass through the depicted milestones, with the percentages representing how many organizations reach each milestone:
It's a nice visualization and the metaphor makes sense: running a marathon requires a long-term strategy to close the gaps between the current and the targeted physical and mental form and skillset, both for approaching a set of marathons and for each course individually. Similarly, implementing a Data Analytics initiative requires a Data Strategy meant to address the gaps between the current and the targeted state of the art, respectively the many projects run to reach the organization's goals.
It makes sense, doesn't it? On the other hand, the devil lies in the details, and frankly the diagram raises several questions when compared with the practices and processes existing in organizations. This doesn't mean that the diagram is wrong, just that it doesn't seem to reflect reality entirely.
The percentages represent the author's perception of how many organizations reach the respective milestones, probably in a repeatable manner (as there are several projects). Thus, only 10% have a data strategy, 100% collect data and 80% of them prepare the data, while at the other end only 15% communicate insights, respectively 5% act on information.
Considering only the milestones, the diagram looks like both a funnel and a capability maturity model (CMM). Typically, CMMs are more complex than this, evolving with technologies' capabilities. Each of the mentioned milestones has a set of capabilities that increase in complexity and that usually help differentiate organizations' maturity. Therefore, the model seems too simple for an actual categorization.
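To read the milestones as a funnel, one can look at how many organizations reaching one stage also reach the next. Below is a minimal sketch (in Python, not from the article) that takes the percentages quoted above as given and derives the stage-to-stage retention:

```python
# Minimal sketch: treat the author's milestone percentages as a funnel
# and compute how many organizations that reach one stage also reach the next.
milestones = [
    ("Collect data", 100),
    ("Prepare data", 80),
    ("Communicate insights", 15),
    ("Act on information", 5),
]

previous = None
for stage, pct in milestones:
    if previous is None:
        print(f"{stage}: {pct}% of organizations")
    else:
        retention = pct / previous * 100
        print(f"{stage}: {pct}% of organizations "
              f"({retention:.0f}% of those reaching the previous stage)")
    previous = pct
```

The numbers are the author's rough estimates, so the drop-off rates are only illustrative; the point is that the steep drop occurs between preparing the data and communicating insights.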
Typically, data collection has a specific scope, limited to surveys, interviews and/or research. However, the definition can be extended to the storage of data within organizations. Thus, data collection as the gathering of raw data is mainly done as part of the value-supporting processes, and given the degree of digitization of data, one can suppose that most organizations gather data for different purposes, even if perhaps only a small part of it is digitized.
Even if many organizations build data warehouses, marts, lakehouses, meshes or whatever architecture might be in vogue these days, an important percentage of the reporting needs are covered by standard reports or reporting tools that access the source systems directly, without data preparation or even data visualization. The first important question is: what is understood by Data Analytics? Is it only the use of machine learning and statistical analysis? Is it limited to pattern and insight finding, or does it also include what is typically considered under the Business Intelligence umbrella?
Thinking pragmatically, Data Analytics should consider BI capabilities as well, as it's an extension of the current infrastructure with analytic capabilities. On the other hand, Data Warehousing and BI are considered together by DAMA as part of their Data Management methodology. Moreover, organizations may have a Data Strategy and a BI strategy, respectively a Data Analytics strategy, as these might have different goals, challenges and bodies to support them. To make it even more complicated, an organization might consider all these important topics as part of Data or even Information Governance, or consider BI or Analytics without Data Management.
So, a Data Strategy might or might not address Data Analytics at all; it's a matter of management philosophy, organizational structure, politics and other factors. Probably, having any strategy related to data should count. Even if a written and communicated data-related strategy is recommended for all medium to large organizations, only a small percentage of them have one, while small organizations might ignore the topic completely.
At least in the past, data analysis and its various subcomponents were performed before preparing and visualizing the data, or at least in parallel with data visualization. Frankly, it's a strange succession of steps. Or does it refer to exploratory data analysis (EDA) from a statistical perspective, which requires statistical experience to model and interpret the facts? Moreover, data exploration and discovery usually happen in the early stages.
The most puzzling step is the last one - what did the author intend with it? Ideally, data should be actionable, at least that's what one says about KPIs, OKRs and other metrics. Does it make sense to extend Data Analytics into the decision-making process? Where do a data professional's responsibilities end, and what are those boundaries? Or does it refer to the actions that need to be performed by data professionals?
The natural step after communicating insights is for management to take action and provide feedback. Furthermore, the decisions taken have an impact on the artifacts built, and a reevaluation of the business problem, assumptions and other components is needed.
The many steps of analytics projects are iterative, with some iterations affecting the Data Strategy as well. The diagram, however, shows the process as linear, which is not the case.
For sure there's an interface between Data Analytics and Decision-Making and the processes associated with them; however, there should be clear boundaries. E.g., it's a data professional's responsibility to make sure that the data/information is actionable and eventually to advise upon it, though whether the entitled people act on it is a management topic. Not acting upon a piece of information is also a decision. Overstepping these boundaries can put the data professional into a strange situation in which they become responsible, and eventually accountable, for an action not taken - which is unreasonable.
The final question - is the last mile representative of the analytical process? The challenge is not the analysis and communication of data, but making sure that the feedback processes work and that changes are addressed correspondingly, that value is created continuously from the data analytics infrastructure, and that data-related risks and opportunities are addressed as soon as they are recognized.
Like any model, a diagram doesn't need to be correct to be useful, and it might not even be wrong in the right context and with the right argumentation. A Data Analytics CMM might allow better estimates and comparisons between organizations, though it can easily become more complex to use. Between the two models probably lies a better solution for modeling the data analytics process.
Resources:
[1] Brent Dykes (2022) "Data Analytics Marathon: Why Your Organization Must Focus On The Finish", Forbes (link)
[2] Brent Dykes (2019) "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals" (link)