| Graphical Representation Series | 
In a diagram adapted from an older article [1], Brent Dykes, the author of "Effective Data Storytelling" [2], draws a parallel between Data Analytics and marathon running: an organization must pass through the depicted milestones, and the percentages represent how many organizations reach each of them:
| Data Analytics Marathon [1] | 
  It makes sense, doesn't it? On the other hand,
  the devil is in the details, and frankly the diagram raises several
  questions when compared with the practices and processes existing in
  organizations. This doesn't mean that the diagram is wrong, just that it
  doesn't seem to reflect reality entirely.
  The percentages represent the author's perception of how many organizations
  reach the respective milestones, probably in a repeatable manner (as there are
  several projects). Thus, only 10% have a data strategy, 100% collect data, and
  80% of them prepare the data, while at the opposite end only 15% communicate
  insight and 5% act on information.
  Considering only the milestones, the diagram looks like a funnel and a
  capability maturity model (CMM). Typically, CMMs are more complex than
  this, evolving with the technologies' capabilities. Each of the mentioned
  milestones has a set of capabilities that increase in complexity and that
  usually help differentiate organizations' maturity. Therefore, the model seems
  too simple for an actual categorization.
  Typically, data collection
  has a specific scope limited to surveys, interviews and/or research. However,
  the definition can be extended to the storage of data within organizations.
  Thus, data collection as the gathering of raw data is mainly done as part of
  their value supporting processes, and given the degree of digitization of
  data, one can suppose that most organizations gather data for different
  purposes, even if perhaps only a small part is digitized.
  Even if many organizations build data warehouses, marts, lakehouses, meshes or
  whatever architecture might be en vogue these days, an important percentage of
  the reporting needs are covered by standard reports or reporting tools that
  directly access the source systems without data preparation or even data
  visualization. The first important question is: what is understood by data
  analytics? Is it only the use of machine learning and statistical analysis?
  Is it limited to finding patterns and insights, or does it also include
  what is typically considered under the Business Intelligence umbrella?
  Pragmatically thinking, Data Analytics should consider BI capabilities as well,
  as it's an extension of the current infrastructure to include analytic
  capabilities. On the other hand, Data Warehousing and BI are considered
  together by DAMA as part of their Data Management methodology. Moreover,
  organizations may have a Data Strategy, a BI strategy and a Data
  Analytics strategy, as these might have different goals, challenges and bodies
  to support them. To make it even more complicated, an organization might even
  consider all these important topics as part of Data or even Information
  Governance, or consider BI or Analytics without Data Management.
  So, a Data Strategy might or might not address Data Analytics at all. It's a
  matter of management philosophy, organizational structure, politics and other
  factors. Probably, having any strategy related to data should count. Even if a
  written and communicated data-related strategy is recommended for all medium
  to large organizations, only a small percentage of them have one, while small
  organizations might ignore the topic completely.
  At least in the past, data analysis and its various subcomponents were
  performed before preparing and visualizing the data, or at least in parallel
  with data visualization. Frankly, it's a strange succession of steps. Or does
  it refer to exploratory data analysis (EDA) from a statistical perspective,
  which requires statistical experience to model and interpret the facts?
  Moreover, data exploration and discovery usually happen in the early stages.
  The most puzzling step is the last one - what did the author intend with
  it? Ideally, data should be actionable, at least that's what one says about
  KPIs, OKRs and other metrics. Does it make sense to extend Data Analytics into
  the decision-making process? Where do a data professional's responsibilities
  end, and what are those boundaries? Or does it refer to the actions that need
  to be performed by data professionals?
  The natural step after communicating insight is for management to take
  action and provide feedback. Furthermore, the decisions taken have an impact on
  the artifacts built, and a reevaluation of the business problem, assumptions
  and further components is needed.
  The many steps of analytics projects are iterative, some iterations
    affecting the Data Strategy as well. The diagram shows the process as linear, which is not the case.
  For sure there's an interface between Data Analytics and Decision-Making and
  the processes associated with them; however, there should be clear boundaries.
  E.g.,
  it's a data professional's responsibility to make sure that the
    data/information is actionable and eventually advise upon it, though whether
    the entitled people act on it is a management topic. Not acting upon information is also a decision. Overstepping
  boundaries can put data professionals into a strange situation in which they
  become responsible and eventually accountable for an action not taken, which
  is unrealistic.
  The final question - is the last mile representative of the analytical
  process? The challenge is not the analysis and communication of data but
  making sure that the feedback processes work and the changes are addressed
  correspondingly, that value is created continuously from the data analytics
  infrastructure, and that data-related risks and opportunities are addressed as
  soon as they are recognized.
  Like any model, a diagram doesn't need to be correct to be useful, and it
  might not even be wrong given the right context and argumentation. A data
  analytics CMM might allow better estimates and comparisons between
  organizations, though it can easily become more complex to use. A better
  solution for modeling the data analytics process probably lies somewhere
  between the two models.
 
