
18 August 2024

Business Intelligence: Mea Culpa (Part III: Problem Solving)

Business Intelligence Series

I've been working for more than 20 years in the BI and Data Analytics area, in combination with Software Engineering, ERP implementations, Project Management, IT services and several other areas, which has allowed me to look at many recurring problems from different perspectives. One of the things I learnt is that problems are more complex and more dynamic than they seem, and that they may require tailored, dynamic solutions. Unfortunately, people usually focus on one or two immediate perspectives, ignoring the dynamics and the multilayered character of the problems!

Sometimes, a quick fix and a limited perspective are what we need to get started and fix the symptoms, and problem-solvers usually stop there. When left unsupervised, the problems tend to kick back, build up momentum and reappear in more complex forms in various places. Moreover, the symptoms can remain hidden until it's too late. Add to this the political agendas and the further limitations existing in organizations (people, money, know-how, etc.).

It seems much easier to involve external people (individual experts, consultancy companies) to solve the problem(s), though unless they get a deep understanding of the business and the issues existing in it, the chances are high that they solve the wrong problems and/or implement the wrong solutions. Therefore, it's more advisable to have internal experts, when feasible, and that's the point where business people with technical expertise and/or IT people with business expertise can help. Ideally, one should have a good mix, and the so-called competency centers can do a great job in handling the challenges of organizations.

Between business and IT people there's a gap that can be wider or narrower depending on the resources' know-how or the effort made by organizations to reduce it. Added to this is the nature of the issues existing in organizations, which can vary considerably across departments, organizations or any other form of establishment. Conversely, the specific skillset can be transferred where needed, which might happen naturally, though upon case considerable effort needs to be invested in the process.

Being involved in similar tasks, one may get the impression that one can do whatever the others can do. This can happen in IT as well as on the business side. There are activities that can be done by parties from the other group, though there are also many exceptions in both directions, especially when one considers that one can't generalize the applicability and/or transferability of skillsets.

A more concrete example is the know-how needed by a businessperson to use the BI infrastructure for answering business questions, and ideally for doing all or at least most of the activities a BI professional can do. Ideally, the learning path would offer a pursuable road between the two points. The mastery of tools helps in the process, though there are different mindsets involved.

Unfortunately, the data-related fields are full of overconfident people who get the problem-solving process wrong. Data-based problem-solving comes down to gathering the right facts and data, building the right conceptual model, identifying the right questions to ask, collecting more data, refining methods and solutions, etc. There's always an easy but wrong way to solve a problem!

The mastery of tools doesn't imply the mastery of business domains! What people from the business side can bring is deeper insight into the business problems, though getting from there to implementing solutions can prove to be a long road, especially when problems require different approaches, different levels of approximation, etc. No tool alone can bridge such gaps yet! Frankly, this is the most difficult thing to learn, and unfortunately many data professionals seem to get it wrong!

Previous Post <<||>> Next Post

28 February 2024

Business Intelligence: A Software Engineer's Perspective V (From Process Management to Mental Models in Knowledge Gaps)

Business Intelligence Series

An organization's business processes are probably one of its most important assets because they reflect the business model, philosophy and culture, respectively link the material, financial, decisional, informational and communicational flows across the whole organization, with implications for efficiency, productivity, consistency, quality, adaptability, agility, control and governance. A common practice in organizations is to document the business-critical processes and manage them accordingly over their lifetime, making sure that the employees understand and respect them, respectively improve them continuously.

In what concerns the creation of data artifacts, data without their processual context are often meaningless, no matter how much a data professional knows about data structures/models. Processes allow one to delimit the flow and boundaries of data, respectively to separate the essential from the non-essential. Moreover, it's the knowledge of processes that allows one to reengineer the logic behind systems, especially when no proper documentation about that logic is available.

Therefore, the existence of documented processes makes it possible to bridge the knowledge gaps existing on the factual side, and occasionally also on the technical side. In theory, the processes should provide a complete overview of the procedures, rules, policies and responsibilities existing in the organization, respectively of how the business operates. However, even if people tend to understand how the world works locally, when broken down into parts, their understanding is systemically flawed, missing the implications of causal relationships that span time with delays, feedback, variable confusion, chaotic behavior, and/or other characteristics borrowed from the vocabulary of complex systems.

Jay W Forrester [3], Peter M Senge [1], John D Sterman [2] and several other systems-thinking theoreticians stressed the importance of mental models in making sense of the world, especially in setups that reflect the characteristics of complex systems. Mental models frame our experience of the world in congruent mental constructs that are further used to think, understand and navigate the world. They are, however, tacit, fuzzy, incomplete, imprecisely stated, inaccurate, evolving simplifications with a dual character: enabling cognitive processes like sense-making, learning, thinking or decision-making on one side, while impeding them on the other, limiting the range of action to what is familiar and comfortable.

On one side, one of the primary goals of Data Analytics is to provide new insights; on the other, the new insights often fail to be recognized and put into practice because they conflict with existing mental models, limiting employees to familiar ways of thinking and acting.

Externalizing and sharing mental models not only makes assumptions explicit and creates a shared world view, but also allows strategizing, running tests and simulations, respectively making sure that barriers and further constraints don't distort the decisional process. Senge goes further and advances that mental models, especially at management level, offer a competitive advantage, allowing an organization to maintain coherence and direction, with people becoming more perceptive and responsive to changes in environment or circumstances.

The whole process isn't about creating a single congruent mental model, even if several mental models may converge toward one or more holistic models, but about providing diverse perspectives and enabling people to make leaps in abstraction (by moving from direct observations to generalizations) while blending advocacy and inquiry to promote collaborative learning. Gradually, people and organizations should recognize a shift from mental models dominated by events to mental models that recognize longer-term patterns of change and the underlying structures producing those patterns [1].

Probably, for many the concept of mental models still seems too abstract, the effort associated with it unnecessary, or at least questionable as to whether it can make a difference. Conversely, being aware of the positive and negative implications mental models hold can make us explore, even if ad hoc, the roads they open.

Previous Post <<||>> Next Post

Resources:
[1] Peter M Senge (1990) "The Fifth Discipline: The Art & Practice of the Learning Organization"
[2] John D Sterman (2000) "Business Dynamics: Systems Thinking and Modeling for a Complex World"
[3] Jay W Forrester (1971) "Counterintuitive Behavior of Social Systems", Technology Review

21 February 2024

Business Intelligence: A Software Engineer's Perspective IV (The Loom of Interactions)

Business Intelligence Series

The process of developing or creating a report is quite simple - there's a demand for data, usually a business problem, the user (aka requestor) defines a set of requirements, the data professional writes one or more queries to address the requirements, which are then used to build one or more reports. The report(s) is/are reviewed by the requestor, and with this the process should be over in most cases. However, this is rather the exception - a long series of changes over multiple iterations is usually necessary; the queries and the reports get modified and even rewritten until they reach their final form, a lot of effort being wasted in the process on both sides.

Common practices for improving the process behind come down to assuring that the requirements are complete and understood upfront, that best practices are followed, that the user gets an early review of the work, that there's continuous communication, that the process' performance is monitored, that controls are in place, etc. Standardizing the process helps to reduce the number of iterations, but only by a factor. Unfortunately, the bigger issue - the knowledge gap - is often ignored.

There's a lot of literature on problem solving - on what steps to follow, on how to define the problem, what aspects should be considered, etc. Recipes are good when one knows how to follow them, respectively how to cook, and that can be a tedious process. It is said that framing the right problem is half the way to solving it, and that's so true. Part of the bigger issue is that users need data to better understand the problem, but the drivers can be different - sometimes it's the problem's complexity, while other times the users only start thinking seriously about the problem once they see the first set of data.

So, the first major gap is between the problem and the user's knowledge about the problem. Experience and theory can help reduce the gap, however the most important progress comes when the user understands the data behind the various processes that overlap with the problem. Sometimes it's enough to explore the data visually, while other times deeper explorations are needed. Data literacy is important, though more important are the exposure to data and problems of different variety and complexity, respectively having the time for this.

The second gap concerns the data professional - building the data model and the logic for the report requires domain knowledge. The level of knowledge needed varies from case to case, and typically what one doesn't know has the biggest impact. A data professional can help only to the degree of the information, respectively knowledge, he has about the business. The expectation to provide a report based on a set of fields might be valid for simple requirements, though the more complex a problem, the more domain knowledge is needed. Moreover, the data professional might need to reengineer the logic from the source system, which can prove challenging when all one has to look at is the data.

Ideally, the two parties should work together starting with the problem's framing and build common ground while covering the knowledge gaps on both sides. Of course, the user doesn't need to dive into the technical knowledge unless the organization leverages this interaction further by adopting the data citizen mindset. Such interactions can help build trust, respectively a basis for further collaboration. Conversely, the more isolated the two parties, the higher the chances that more iterations will occur.

Covering the knowledge gaps might look like a redistribution of the effort, though by keeping the status quo there is little chance for growth!

18 February 2024

Business Intelligence: A Software Engineer's Perspective III (More of a One-Man Show)

Business Intelligence Series

Probably, in some organizations there are still recounted stories about a hero who knew so much about the business and was so technically proficient that he/she was able to provide data-driven answers to most business questions. Unfortunately, the times of such solo performances are long gone - the world moves too fast, there are too many questions looking for an answer, many of them requiring a solution before the problem has actually been defined, a whole infrastructure is needed to harness the potential of technologies and data, the volume of knowledge required grows exponentially, etc.

One approach to handling the gap between the initial and the required knowledge in solving problems based on data is to build all the required knowledge in one person, either on the business or the technical side. More common is to hire a data analyst and build the knowledge in that resource, an approach with great chances of working until the volume of work exceeds a person's limits. The data analyst is then forced to ask for the workload to be prioritized, which might work on certain occasions, while on others one needs to compromise on quality and/or do overtime, with all the issues deriving from this.

There are also situations in which the complexity of the problem exceeds a person's ability to handle it, and that's not necessarily a matter of intelligence but of know-how. Some organizations respond with complexity to complexity, while others are more creative and break the complexity into manageable pieces. In both cases, more resources are needed to cover the knowledge and resource gap. Hiring more data analysts can get the work done, though it's not a recipe for success. The more diverse the team, the higher the chances to succeed, though again it's a matter of creativity and of covering the knowledge gaps. Sometimes it's more productive to use the resources already available in the organization, though this can involve other challenges.

Even if much of the knowledge gets documented, as soon as the data analyst leaves the organization a void is created until a similar resource is able to fill it. Organizations can better cope with these challenges if they disseminate the knowledge among data professionals, respectively within the business. The more resources are involved, the higher the level of retention and the higher the chances of reusing the knowledge. However, the more people are involved, the higher the costs, especially those associated with the waste of effort.

Organizations can compromise by choosing 1-2 resources from each department to be involved in knowledge dissemination, ideally people with an affinity for data and technology. They shall become data citizens - people who use data, data processing and visualization for building solutions that enable their job. Data citizens are expected to act as showmen in their knowledge domain and do their magic whenever such requirements arise.

Having a whole team of data citizens opens new opportunities for organizations, though such resources will need, besides domain knowledge and data literacy, also technical knowledge. Unfortunately, many people will reach their limitations in this area. Besides the learning effort, understanding what good architecture, design and techniques mean is unfortunately not for everybody, and here's where the concept of citizen data analyst or citizen scientist breaks down, independently of the tools used.

A data citizen's effort works best in data discovery, exploration and visualization scenarios, where the rapid creation of prototypes reduces the time from idea to solution. However, the results are personal solutions that need to be validated by a technical person, with pieces of the solutions possibly redesigned and moved around until enterprise solutions result.

Previous Post <<||>> Next Post

17 February 2024

Business Intelligence: A Software Engineer's Perspective II (Major Knowledge Gaps)

Business Intelligence Series

Solving a problem requires a certain degree of knowledge in the areas affected by the problem, a degree that grows exponentially with the problem's complexity. This requirement applies to scientific fields with low allowance for errors, as well as to business scenarios where the allowance for errors is in theory more relaxed. Building a report or any other data artifact is closely connected with problem solving, as the data artifacts are supposed to model the whole or parts of what is needed for solving the problem(s) in scope.

In general, creating data artifacts requires: (1) domain knowledge - knowledge of the concepts, processes, systems, data, data structures and data flows as available in the organization; (2) technical knowledge - knowledge about the tools, techniques, processes and methodologies used to produce the artifacts; (3) data literacy - critical thinking, the ability to understand and explore the implications of data, respectively communicating data in context; (4) activity management - managing the activities involved. 

At minimum, creating a report may require only narrower subsets of the areas mentioned above, depending on the complexity of the problem and the tasks involved. Ideally, a single person should be knowledgeable enough to handle all this alone, though that's seldom the case. Commonly, two or more parties are involved, though let's consider the two-party scenario: on one side is the customer, who has (in theory) a deep understanding of the domain; on the other side is the data professional, who has (in theory) a deep understanding of the technical aspects. Ideally, both parties should be data literate and have some basic knowledge of the other party's domain.

To attack a business problem that requires one or more data artifacts, both parties need a common understanding of the problem to be solved, of the requirements, constraints, assumptions, expectations, risks, and other important aspects associated with it. It's critical for the data professional to acquire the domain knowledge required by the problem, otherwise the solution has high chances of deviating from the expectations. The general issue is that there are multiple interactions, and they are iterative. Firstly, the interactions for building the needed common ground. Secondly, the interaction between the problem and reality. Thirdly, the interaction between the problem and the parties' mental models and understanding of the problem.

The outcome of these interactions is that the problem and its requirements go through several iterations in which knowledge from the previous iterations is incorporated successively. With each important piece of knowledge gained, it's important to revise and refine the question(s), respectively the problem. If each iteration also involves programming and further technical activities, the effort and costs resulting from the process can explode, while the timeline expands accordingly.

There are several heuristics that could be devised to address these challenges: (1) build all the required knowledge in one person, either on the business or the technical side; (2) make sure that the parties have the required knowledge for approaching the problems in scope; (3) make sure that the gaps between reality and the parties' mental models are minimal; (4) make sure that the requirements are complete and understood before starting the development; (5) adhere to methodologies that accommodate the necessary iterations and the endeavor's particularities; (6) make sure that there's a halt condition for regularly reviewing the progress, respectively halting the work; (7) build an organizational culture that supports all this.

The list is open, and the heuristics aren't exclusive, so in theory any combination of them can be considered. Ideally, an organization should reflect all these heuristics in one form or another. The higher the coverage, the more mature the organization is. The question is: how can organizations with a suboptimal setup change the status quo?

Previous Post <<||>> Next Post

21 March 2021

Strategic Management: The Impact of New Technologies II (The Technology-oriented Patient)

Strategic Management

Looking at the way data, information and knowledge flow through an organization, with a little imagination one can see the resemblance between an organization and the human body, in which the networks created by the respective flows spread through the organization as nervous, circulatory or lymphatic braids do, each with its own role in the good functioning of the organization. Each technology adopted by an organization taps into these flows, creating a structure that can be compared with a nerve plexus, as the various flows intersect in such points, creating an agglomeration of nerves and braids.

The size of each plexus can be considered proportional to the importance of the technology in respect to the overall structure. Strategic technologies like ERP, BI or planning systems, given their importance (gravity), resemble the organs of the human body, with complex networks of braids in their vicinity. Maybe the metaphor is too far-off, though it allows stressing the importance of each technology in respect to its role and the good functioning of the organization. Moreover, each such structure functions as a pressure point that can in extremis block any of the flows considered, a long-term blockage having important effects.

The human organism is a marvelous piece of work reflecting the grand design, however in time, especially when neglected or driven by external agents, diseases can clutch around any part of the human body, with all the consequences deriving from this. On the other side, an organization is a man-made structure in continuous expansion as new technologies or resources are added. Even if the technologies are at the periphery of the system, their good or bad functioning can have a ripple effect through the various networks.

Replacing any of the above-mentioned strategic systems can be compared with the replacement of an organ in the human body - it has a high rate of failure compared with other operations, it is complex in nature, the organism needs long periods to recover, while in extreme situations the convalescence lasts until the end. Fortunately, organizations seem to be more resilient to such operations, though that's not necessarily a rule. Sometimes all it takes is just a small mistake for the operation to fail.

The general feeling is that ERP and BI implementations are taken too lightly by management, employees and implementers. During the replacement operation one must make sure not only that the organ fits and functions as expected, but also that the vital networks regain their vitality and function as expected, and the latter is a process that spans the years to come. One needs to check the important (health) signs regularly and take the appropriate countermeasures. There must be an entity playing the role of the doctor, who/which has the skills to address the issues adequately.

Moreover, when the physical structure of an organization is affected, a series of micro-operations might be needed to address the deformities. Unfortunately, these areas are seldom seen in time and can require a sustained fixing effort, while a total reconstruction might also apply. One works, moreover, with an amorphous and ever-changing structure that requires many attempts until a remedy is found, if a remedy is possible after all.

Even if such operations are pretty well documented, what organizations often lack are the skilled resources needed during and post-implementation, resources that must also know the patient, and ideally its historical and further health preconditions. Each patient is different and quite often needs its own treatment/medication. With such changes, the organization embarks on a discovery journey in which the appropriate path can easily deviate from the well-trodden ones.

Previous Post <<||>> Next Post

20 March 2021

Business Intelligence: New Technologies, Old Challenges I (Introduction)

Business Intelligence

Each important technology has the potential of creating divides between the specialists from a given field. This aspect is more evident in data-driven fields like BI/Analytics or Data Warehousing. The data professionals (engineers, scientists, analysts, developers) skilled only in the new wave of technologies tend to disregard the former technologies and the role they played in the data landscape. The argumentation for such behavior is rooted in the belief that a new technology is better and can solve any problem better than previous technologies did. It’s a kind of mirage professionals and customers can easily fall under.

Being bigger, faster or having new functionality doesn’t make a tool the best choice by default. The choice must be rooted in the problem to be solved and the set of requirements it comes with. Just because a vibratory rammer is a new technology, is faster and can apply more pressure, this doesn’t mean that it will replace a hammer. Where a certain type of power is needed, the vibratory rammer might be the best tool, while in situations in which a minimum of power and probably more precision are needed, like driving in a nail, an adequately sized hammer will prove to be the better choice.

A technology is to be used in certain (business/technological) contexts, and even if contexts often overlap, the further details (aka requirements) should lead to the proper choice of tools. It’s part of a professional’s duties to be able to differentiate between contexts, requirements and the capabilities of the tools appropriate for each context. In this resides, partially, a professional’s mastery over his field of work and of providing adequate solutions for customers’ needs. Especially in IT, it’s not enough to master the new tools - one must also have an understanding of the preceding tools, their usage contexts, capabilities and challenges.

From a historical perspective each tool appeared to fill a demand, and even if it maybe didn’t manage to fill it adequately, the experience obtained can prove to be valuable in one way or another. Otherwise, one risks reinventing the wheel or, more dangerously, repeating the failures of the past. Each new technology seems to provide a déjà-vu from this perspective.

Moreover, a new technology provides new opportunities and may require changing our way of thinking in respect to how the technology is used and the processes or techniques associated with it. Knowledge of past technologies helps in identifying such opportunities more easily. How a tool is used is also a matter of skills, while its appropriate use and adoption imply an inherent learning curve. Having previous experience with similar tools tends to reduce the learning curve considerably, though hands-on learning is still necessary, and appropriate learning materials or tutoring are, as the case may be, needed for a smoother transition.

In what concerns the implementation of mature technologies, the challenges were seldom the technologies themselves; most were of a non-technical nature, ranging from a poor understanding/knowledge of the tools, their role and the implications they have for an organization, to an organization’s maturity in leading projects. Even the most advanced technology can fail in the hands of non-experts. Experience can’t be judged based only on the years spent in the field or the number of projects one worked on, but on the understanding acquired about the challenges of implementation and usage. These latter aspects seem to be widely ignored, even if they can make the difference between success and failure in a technology’s implementation.

Ultimately, each technology is appropriate in certain contexts and a new technology doesn’t necessarily make another obsolete, at least not until the old contexts become obsolete.

Previous Post <<||>> Next Post

04 March 2021

Project Management: Projects' Dynamics I (Introduction)

Despite the considerable collection of books on Project Management (PM) and related methodologies, and the fact that projects are inherent endeavors in professional as well as personal life (setups that would in theory give people the environment and exposure to different project types), people’s understanding of what it takes to plan and execute a project seems narrow and sometimes questionable. Moreover, their understanding diverges considerably from common sense. It’s also true that knowledge and common sense are relative when considering any human endeavor in which there are multiple roads to the same destination, or when learning requires time, effort, skills, and implies certain prerequisites; however, the lack of such knowledge can hurt when the endeavor’s success is a must and a team effort.

Even if the lack of understanding about PM can be considered as minor when compared with other challenges/problems faced by a project, when one’s running fast to finish a race, even a small pebble in one’s running shoes can hurt a lot, especially when one doesn’t have the luxury to stop and remove the stone, as it would make sense to do.

It resides in human nature to resist change, to seek information that only confirms one’s own opinions, to follow the same approach in handling challenges even if the attempts are far from optimal, even if people who walked the same path tell you that there’s a better way, and even sketch the path and provide information about what it takes to get there. As it seems, there’s a predisposition to learn the hard way, if there’s significant learning involved at all. Unfortunately, such situations occur in projects, and the solutions often overrun the boundaries of PM, where social and communication skills must be brought into play.

On the other side, there’s still hope that change can be managed optimally once the facts are explained to a level that facilitates understanding. However, such an attempt can prove to be quite a challenge, given the various setups in which PM takes place. The intersection between technologies and organizational setups leads to complex scenarios which make such work more difficult, even if projects’ challenges are of an organizational rather than technological nature.

When the knowledge we have about the world doesn’t fit our expectation, a simple heuristic is to return to the basics. A solid edifice can be built only on a solid foundation and the best foundation in coping with reality is to establish common ground with other people. One can achieve this by identifying their suppositions and expectations, by closing the gap in perception and understanding, by establishing a basis for communication, in which feedback is a must if one wants to make significant progress.

Despite being explorative and time-consuming, establishing common ground can be challenging when addressing an imaginary audience, which is quite often the situation. Practice shows, however, that progress can be made by starting with a set of well-formulated definitions, simple models, principles, and heuristics that have the potential of helping in sense-making.

The goal is thus to identify first the definitions that reflect the basic concepts that need to be considered. Once the concepts are defined, they can be related to each other with the help of a few models. Even if fictitious, as simplifications of reality, the models should allow playing with the concepts, facilitating their understanding. Principles (sets of rules for reasoning) can be used together with heuristics (rule-of-thumb methods or techniques) for explaining the ‘known’ and approaching the ‘unknown’. Even if maybe not perfect, these tools can help in building theories or explanatory constructs.

||>> Next Post

04 February 2021

Data Migrations (DM): Conceptualization VII (Data Import Layer)

Data Migrations Series

The data requirements for the Data Migration (DM) and Data Quality (DQ) are driven by the processes implemented in the target system(s). Therefore, a good knowledge of these requirements can decrease the effort needed for these two subprojects considerably. The needed knowledge basis starts with the entities and their attributes, the dependencies existing between them and the various rules that apply, and ends with the parametrization requirements, respectively the architecture(s) that can be used to import the data.

The DM process starts with defining the entities in scope and their attributes, respectively identifying the corresponding entities and attributes from the legacy systems. The attributes not having a correspondent in the legacy system need to be provided by the business and integrated into the DM logic. In addition, one needs to consider also the attributes needed by the business but not available in the target system, some of them more likely available in the legacy systems. For such attributes one needs either to misuse an attribute from the target or to extend the target system.

For each entity a data mapping is created that basically documents the data transformations needed for migrating the data. In the process one needs to consider also the attributes’ data types, the (standard) formatting, their domain of definition, as well as the various rules that apply. Their implementation belongs in the DM layer, from which the data are exported in a standard format, as needed by the target system.
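As a minimal sketch, such a mapping can be made machine-readable and executed as part of the DM layer; the entity, field names and rules below are purely illustrative:

```python
# Hypothetical mapping for a 'Customer' entity: legacy field -> target field,
# with the formatting and default-value rules documented per attribute.
CUSTOMER_MAPPING = [
    # (legacy_field, target_field, transformation)
    ("CUST_NO",   "CustomerAccount", lambda v: v.strip().zfill(10)),  # pad to 10 chars
    ("CUST_NAME", "Name",            lambda v: v.strip()[:60]),       # max. length 60
    ("CURRENCY",  "CurrencyCode",    lambda v: (v or "EUR").upper()), # default + casing
]

def map_record(legacy_record: dict) -> dict:
    """Apply the documented transformations to a single legacy record."""
    return {target: transform(legacy_record.get(source))
            for source, target, transform in CUSTOMER_MAPPING}

print(map_record({"CUST_NO": "4711 ", "CUST_NAME": " Contoso Ltd.", "CURRENCY": None}))
# {'CustomerAccount': '0000004711', 'Name': 'Contoso Ltd.', 'CurrencyCode': 'EUR'}
```

Keeping the mapping as data rather than scattered code makes it easier to review with the business and to revise as new rules are discovered.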

Exporting the data from the DM layer directly into the target system’s tables has in theory the lowest overhead, even if the rejected records are difficult to track, the rejections resulting only from the records’ validation against the database schema. For this approach to work, one must have a good knowledge of the database schema and of the business rules implemented in the target system.

To solve the issue with the logging of errors, systems have a further layer on top of the database model, which also allows running data validation against the target system’s business rules. Modern import frameworks allow loading the data via a set of standard files with a predefined structure. The data can thus be imported manually or via load jobs into the system, a log with the issues being generated in the process. Some frameworks even allow the manual editing of failed records, respectively reimporting the data. Unfortunately, calling this layer from the DM layer is not possible from a database, though this would seldom bring a benefit. Some third-party tools attempt to improve the import functionality by calling the target system’s import layer.

The import files must be generated from the DM layer in the required structure, with the appropriate formatting. The challenge, however, resides in identifying all the attributes that should make the scope of the load. It’s an iterative process which is sometimes backed by trial-and-error heuristics. Unless the target system’s validation rules are known beforehand, the rules need to be discovered in this process, which can prove time-consuming. The discoveries need to be integrated into the DM as well, and from this results the big number of changes that need to be performed.
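A sketch of generating such an import file, assuming a CSV format with a fixed column order, semicolon delimiter and ISO date formatting dictated by a hypothetical target system:

```python
import csv
from datetime import date

# Hypothetical file structure required by the target system's import framework.
IMPORT_COLUMNS = ["CustomerAccount", "Name", "CurrencyCode", "ValidFrom"]

def write_import_file(path: str, records: list[dict]) -> None:
    """Write the mapped records in the structure expected by the import layer."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=IMPORT_COLUMNS, delimiter=";")
        writer.writeheader()
        for record in records:
            row = dict(record)
            if isinstance(row.get("ValidFrom"), date):
                row["ValidFrom"] = row["ValidFrom"].isoformat()  # enforce date format
            writer.writerow(row)

write_import_file("customers.csv",
                  [{"CustomerAccount": "0000004711", "Name": "Contoso Ltd.",
                    "CurrencyCode": "EUR", "ValidFrom": date(2021, 2, 1)}])
```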

Given the dependencies existing between entities, the files need to be generated and loaded in a predefined order. These dependencies are reflected also in the data processing and the validation rules considered in the DM layer.
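The load order can be derived from the declared dependencies, for instance with a topological sort; the entities and dependencies below are, again, only illustrative:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies: an entity can be loaded only after the
# entities it references have been loaded into the target system.
dependencies = {
    "CustomerGroups": set(),
    "Currencies":     set(),
    "Customers":      {"CustomerGroups", "Currencies"},
    "SalesOrders":    {"Customers"},
}

load_order = list(TopologicalSorter(dependencies).static_order())
print(load_order)  # e.g. ['CustomerGroups', 'Currencies', 'Customers', 'SalesOrders']
```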

A quality checkpoint can be implemented between the export from the DM layer and the import, to enforce the four-eyes principle. It’s normally the last opportunity for trapping eventual issues. A further quality check is performed after the import, by validating whether the data were imported as expected.
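The post-import check often reduces to a reconciliation between what was exported and what landed in the target; a minimal sketch, assuming both sides can be keyed on the same business identifier:

```python
def reconcile(exported: dict, imported: dict) -> dict:
    """Compare exported vs. imported records, keyed by a business identifier."""
    missing    = sorted(exported.keys() - imported.keys())  # rejected/lost on import
    unexpected = sorted(imported.keys() - exported.keys())
    mismatched = sorted(key for key in exported.keys() & imported.keys()
                        if exported[key] != imported[key])  # values changed on the way
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}

exported = {"0000004711": {"Name": "Contoso Ltd."}, "0000004712": {"Name": "Fabrikam"}}
imported = {"0000004711": {"Name": "Contoso Ltd."}}
print(reconcile(exported, imported))
# {'missing': ['0000004712'], 'unexpected': [], 'mismatched': []}
```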

Previous Post <<||>> Next Post

09 January 2021

ERP Implementations: It’s all about Planning

ERP Implementation

Ideally, the total volume of work would be uniformly distributed over the whole project’s duration, though in praxis the curve representing the effort has the form of a wave, or an aggregation of waves, that tends to reach its peak shortly before or during the Go-Live(s). More importantly, high fluctuations occur in the various areas of the project over its whole duration, as there are dependencies between the various functional areas, as one needs to wait for decisions to be made, people are not available, etc. Typically, the time wasted on waiting, researching or other non-value-added activities has the potential of leading to such peaks. Therefore, the knowledge must be available, and decisions must be taken when required, which can be challenging but not impossible to achieve.

To bridge the time wasted on waiting, the team members need to work on other topics. If on the customer’s side the resources can perhaps handle other activities, on the vendor’s side the costs can be high and proportional to the volume of waiting. Therefore, the vendor’s resources must be involved in at least two projects or do work in other areas in advance, which is not always possible. However, when the vendor’s resources are involved in two or more projects, unless the planning is perfect or each resource can handle the work as it comes, further waiting times are added. The customer is then forced either to book the resources exclusively, or to wait and carry the costs associated with it.

On the other side, ERP implementations tend to become exploration projects, especially when the team has only partial knowledge about the system, or the requirements have a certain degree of specialization that deviates from the standard processes. The more unknowns an ERP implementation has, the more difficult it is to plan. To be able to plan, one must know the activities ahead and how long they take, and of course, one must adhere to the delivery dates, because each delay can have a cascading effect that can impact the project’s schedule considerably.

Probably the best approach to planning is to group the activities into packages and plan the packages, it being each subteam’s duty to handle the planning for each package, with only the open issues, risks or opportunities managed at the upper level. This shifts the burden from the Project Manager’s shoulders into the project. Moreover, even if in theory the plan can consider each activity, it would become obsolete as soon as it’s updated, given the considerable volume of work required to maintain it. Periodically, one can still revise the whole plan to identify opportunities and risks. What the team can do is to plan for a certain time interval (e.g. 4-6 weeks) and build from there. This allows focusing on the most important activities.

To further shift the burden, activities like Data Migration, Data Cleaning or the integrations with third-party systems should be treated, when possible, as subprojects. Despite having interdependencies with the main project (e.g. parameters, master data, decisions) and sharing the same resources, they have their own schedules, whose deadlines need to be aligned with the main project’s milestones.

Even if the team and the business put in all the effort to respect the plan, and even if the plan is realistic, the initial plan can seldom be respected – it’s anyway just a sketch of the road ahead that can change as the project progresses – and this aspect needs to be understood by the business, otherwise it will lead to false expectations. On the other side, the team must try to respect the deadlines and communicate in time any inability to do so. It’s an interplay in which communication is more important than ever.

Previous Post <<||>> Next Post


ERP Implementations: It’s all about Partnership I

ERP Implementation

Unless the organization (customer) implementing an ERP system has a strong IT team and the knowledge required for the implementation is already available in-house, the resources need to be acquired from the market, and probably the right thing to do is to identify a certified implementer (partner) which can fill the knowledge and skillset gaps, respectively which can help split the risks associated with such an implementation.

In theory, the customer provides knowledge about its processes, while the partner comes with expertise about the system to be implemented and further technologies, industry best practices, project methodologies, etc. Further on, the mix is leveraged to harness the knowledge and reach the project’s objectives.

In praxis, however, finding an implementer which can act as a partner might be more challenging than expected. This is because the implementer needs to understand the customer’s business and where it’s heading, bridge the gap between the functional requirements and the system’s functionality, advise on areas of improvement, prepare the customer for the project and lead the customer through the changes, respectively establish a basis for the future. Some of these implications are seldom made explicit, even if they are implied by what the project needs.

Technology is seldom the issue in an ERP implementation, the challenges residing in handling the change and the logistics required. There are so many aspects to be considered and handled, and this can be challenging for any implementer, no matter how long it has been on the market or how experienced the resources are. Somebody needs to lead the change, and the customer seldom has the knowledge to handle it. In some cases, the implementer must make the customer aware of the implications, while in others it needs to take the initiative and lead the change, though the customer needs to play along, which can be challenging as well.

Many aspects need to be handled at management level from a strategic point of view on the customer’s side. It starts with assuring that the most important aspects of the business were considered, that the goals and objectives are clear, and that the proper environment is created, and ends with timely decision-making, with assuring that the resources are available when needed, that the needed organization structures and roles are in place, that the required knowledge is available before, during and after the implementation, and that the potential brought by the ERP system is harnessed for the years to come.

A partnership allows, in theory, splitting the implementation risks, as ERP implementations have a high rate of failure. Quite often the outcomes of such projects don’t meet the expectations, the systems being in extremis unusable or a bottleneck for the organization. Ideally, one should work with the partner(s) and attempt to solve the issues, eventually split the incurred cost overruns, find a middle way. Most of the time it’s recommended to find a solution together rather than come to litigation.

Given the complex dependencies existing between the various parts of the project, the causes that lead to poor implementations are difficult to prove, as there are almost always grey areas. Moreover, litigation can require considerable time and resources to settle. These can be just extreme situations, and as long as one has a good partner, there’s no need to think that far. On the other side, even if undesirable, one must be prepared for such outcomes too, even if the countermeasures may involve additional effort. Therefore, one must address such issues in contracts by establishing the areas of accountability/responsibility for each party, documenting adequately the requirements and further (important) communication, making sure that the deliverables have the expected quality, etc.

Previous Post <<||>> Next Post

06 November 2020

Graphical Representation: Reports vs. Data Visualizations

Graphical Representation

Considering visualizations, John Tukey remarked that ‘the greatest value of a picture is when it forces us to notice what we never expected to see’, which is not always the case for many of the graphics and visualizations available in organizations, typically in the form of simple charts and dashboards, quite often with no esthetics or meaning behind them.

In general, reports are needed as a source for operational activities, in which the details in the form of raw or aggregated data are important. As one moves further toward the tactical or strategic aspects of a business, visualizations gain in importance, especially when they allow encoding data and information, respectively variations, trends or relations, in a small space with minimal loss of information.

There are also different aspects of visualizations that need to be considered. Modern tools allow rapid visualization and interactive navigation of data across different variables, which is great as long as one knows what one is searching for, which is not always the case.

There are junk charts in which the data drowns in graphical elements that bring no value to the reader, in extremis even distorting the message/meaning.

There are graphics/visualizations that attempt to bring together and encode multiple variables in respect to a theme, and for which a ‘project’ is typically needed, as the data is not available ad-hoc, doesn’t have the desired quality, or needs further transformations to be ready for consumption. Good-quality graphics/visualizations require time and a good understanding of the business, which are not necessarily available within the BI/Analytics teams, and unfortunately few organizations do something in that direction, typically ignoring such needs. In this type of environment, the stress falls on the rapid availability of data for decision-making or action-relevant insight, which typically depends on the consumer.

The storytelling capabilities of graphics/visualizations are often exaggerated. Yes, they can tell a story, though stories need to be framed in a context/problem, some background and further references need to be provided, while without detailed data the graphics/visualizations are just nice representations in which each consumer understands what he can.

In an ideal world, the consumer and the ‘designer’ would work together to identify the important data for the theme considered, to find the appropriate level of detail, respectively the best form of encoding. Such attempts can stop at table-based representations (aka reports), respectively at basic or richer forms of graphical representation. One can consider reports as an early stage of the visualization process, with the potential to derive more value when the data allow meaningful graphical representations. Unfortunately, the time, data and knowledge available seldom make this achievable.

In addition, a well-designed report can be used as a basis for multiple purposes, while a graphic/visualization can impose more limitations. Ideally, multiple forms of representation (including reports) would be combined to harness the value of the data. Navigation from visualizations to detailed data can be useful to understand what happens, learning and understanding the various aspects being an iterative process.

It’s also difficult to demonstrate the value of the insight derived from visualizations, especially when graphical literacy lags behind numeracy and statistical literacy - many consumers lack the skills needed to evaluate numbers and statistics adequately. If for a good artistic movie you need assistance to enjoy the show and understand the message(s) behind it, the same can be said about good graphics/visualizations. Moreover, this requires creativity, abstraction-based thinking, and other capabilities to harness the value of the representations.

Given the considerable volume of requirements related to the need for basis data, reports will continue to be in high demand in organizations. In exchange, visualizations can complement them by providing insights otherwise not available.

Initially published on Medium as an answer to a post on Reporting and Visualizations.

30 October 2020

Data Science: Generalists vs Specialists in the Field of Data Science

Data Science

Division of labor favors the tasks done repeatedly, where knowledge of the broader processes is not needed, and where aspects such as creativity are needed only on a small scale. Division invaded the IT domains as the tools, methodologies and demands increased in complexity, and Data Science and BI/Analytics are therefore no exception.

The scale of this development sometimes gains humorous expectations or misbeliefs, as when one hears headhunters asking potential candidates whether they are frontend or backend experts, when a good understanding of both aspects is needed for providing adequate results. The development gains tragicomical implications when one is limited in action to only a given area despite extended expertise, or when a generalist seems to step on the feet of specialists, sometimes for rightly entitled reasons.

Headhunters’ behavior is maybe rooted in a poor understanding of the domain of expertise and of the implications of the job descriptions. It’s hard to understand how people claim to have knowledge about a domain just because they heard the words flying around and got some glimpse of the connotations associated with them. Unfortunately, this extends to management and further into the business environment, with all the implications deriving from it.

As Data Science finds itself at the intersection between Artificial Intelligence, Data Mining, Machine Learning, Neurocomputing, Pattern Recognition, Statistics and Data Processing, the center of gravity is hard to determine. One way of dealing with the unknown is requiring candidates to have a few years of trackable experience in the respective fields or in the use of a few tools considered important in the respective domains. Of course, the usage of tools and techniques is important, though there’s a big difference between using a tool and understanding how, when, why, where, in which ways and by what means a tool can be used effectively to create value. This can be gained only when one is exposed to different business scenarios across industries, and it’s a tough thing to demand from a profession still in its baby steps.

Moreover, being a good data scientist involves having deep insight into the business, being able to understand data and the demands associated with data - the various qualitative and quantitative aspects. Seeing the big picture is important in defining, approaching and solving problems. The more one is exposed to different techniques and business scenarios, the more, with the right understanding and some problem-solving skills, one can transpose and solve problems across domains. However, the generalist will find his limitations as soon as a certain depth is reached, and the collaboration with a specialist is then required. A good collaboration between generalists and specialists is important in complex projects which overreach the boundaries of one person’s knowledge and skillset.

Complexity is addressed when one can focus on the important characteristics of the problem, respectively when the models built can reflect the demands. The most important skillset, besides the use of technical tools, is the ability to model problems and root the respective problems in data, to elaborate theories and check them against reality.

Complex problems can require specialization in certain fields, though seldom is one problem dependent on only one aspect of the business, as problems occur in overreaching contexts that sometimes span the borders of an organization. In addition, the ability to solve problems seems to be impacted by the diversity of the people involved in the task, sometimes even with backgrounds not directly related to the organization’s activity. As in evolution, a team’s diversity is an important factor in achievement and learning, with the most being gained when knowledge gets shared and harnessed beyond the borders of teams.

Note:
Written as an answer to a Medium post on Data Science generalists vs specialists.

14 October 2020

Strategic Management: Simplicity VI (ERP Implementations' Story II)


Besides the witty sayings and theories advanced in defining what simplicity is about, life shows that there’s a considerable gap between theory and praxis. In the attempt at a definition, one is forced to pull in more concepts like harmony, robustness, variety, balance, economy, or proportion, which can be grouped under organic unity or similar concepts. However, intuitively one can advance the idea that, from a cybernetic perspective, simplicity is achieved when the information flows are not disrupted and don’t meet unnecessary resistance. By information are considered here the various data aggregations – data, information, knowledge, and eventually wisdom (aka the DIKW pyramid) – though it can be extended to encompass materials, cash and vital energy.

One can go further and say that an organization is healthy when the various flows mentioned above run smoothly through the organization, nourishing it. The comparison with the human body can be taken further: a blockage in the flow can cause minor headaches, or states that take a period of convalescence to recover from. Moreover, the sustained effort applied by an organization can result in fatigue, or in more complex ailments or even diseases if the state is prolonged.

For example, big projects like ERP implementations tend to suck the vital energy of an organization to the degree that it will take months to recover from the effort, while the changes in the other types of flow can lead to disruptions, especially when the change is not properly managed. Even if ERP implementations provide standard solutions for the value-added processes, they represent the vendor’s perspective on the respective processes, which doesn’t necessarily fit an organization’s needs. One is forced then to make compromises, either by keeping close to the standard or by expanding the standard processes to close the gap. Either way, processual changes are implied, which affect the information flow, especially for the steps where further coordination is needed, respectively the data flow in respect to the implementation or the integration with further systems. Both a new integration and a missing one have the potential of disrupting the data and information flows.

The processual changes can imply changes in the material flow, as the handling of the materials can change, however the most important impact is caused maybe by the processual bottlenecks, which can cause serious disruptions (e.g. late deliveries, stopped production), and upon case also in the cash flow (e.g. penalties for late deliveries, higher inventory costs). The two flows can be impacted by the data and information flows independently of the processual changes (e.g. when the data have poor quality, when they are not available, respectively when they don’t reach the consumer in a timely manner).

With a new ERP solution, the organization needs to integrate the new data sources into the existing BI infrastructure, or when not possible, to design and implement a new one by taking advantage of the technological advancements. Failing to exploit this potential will impact the other flows, however the major disruptions appear when the needed knowledge about business processes is not available in-house, in explicit and/or implicit form, before, during and after the implementation. 

Independently of how it is organized – as a center of excellence or in ad-hoc form – a group of people is needed who can manage the various flows, and ideally they should have the appropriate level of empowerment. Typically, the responsibility resides with the key users, IT, and one or two people from the management. Without a form of ‘organization’ to manage the flows, the organization will rely only on individual effort, which seldom helps in reaching the potential. Independently of the number of resources involved, simplicity is achieved when the activities flow naturally.

Previous Post <<||>> Next Post

Written: Sep-2020, Last Reviewed: Mar-2024

28 June 2020

Strategic Management: Simplicity II (A System's View)

Strategic Management

Each time one talks in IT about software and hardware components interacting with each other, one talks about a composite referred to as a system. Even if the term Information System (IS) is related to it, a system is defined as a set of interrelated and interconnected components that can be considered together for specific purposes or for simple convenience.

A component can be a piece of software or hardware, as well as a person or group if we extend the definition. The consideration of people becomes relevant especially in the context of ecologies, in which systems are placed in a broader context that considers people’s interaction with them, as this gives rise to important behavior that impacts a system’s functioning.

Within a system, each part has a role or function determined in respect to the whole as well as to the other parts. The role or function of a component is typically fixed and predefined, though there are also exceptions, especially when the scope of a component is enlarged or reduced to the degree that the component can be removed or ignored. What one considers or not as part of the system defines the system’s boundaries; it’s what distinguishes it from the other systems within the environment(s) considered.

The interaction between the components comes down to the exchange, transmission and processing of data found in different aggregations, ranging from signals to complex data structures. If in non-IT-based systems the changes are driven by the inflow and outflow of energy, in IT the flow is considered in terms of data in its various aggregations (information, knowledge). The data flow (also information flow) represents the ‘fluid’ that nourishes a system’s ‘organism’.
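As a toy illustration (a sketch only; the component names and transforms are hypothetical), one can model such a system in Python as components with fixed, predefined roles through which data flows and gets aggregated:

    # A toy system: interconnected components through which data flows and is
    # transformed; the pipeline below is hypothetical, for illustration only.
    class Component:
        """A part of the system with a fixed, predefined role (its transform)."""
        def __init__(self, name, transform):
            self.name = name
            self.transform = transform
            self.next = None                  # downstream component, if any

        def receive(self, data):
            result = self.transform(data)     # process the incoming aggregation
            print(f"{self.name}: {data!r} -> {result!r}")
            return self.next.receive(result) if self.next else result

    # Signals get aggregated into information, then summarized for a consumer.
    sensor     = Component("sensor", lambda raw: [x / 10 for x in raw])
    aggregator = Component("aggregator", lambda xs: sum(xs) / len(xs))
    reporter   = Component("reporter", lambda avg: f"avg={avg:.2f}")

    # Wiring the components together defines the system's boundaries.
    sensor.next, aggregator.next = aggregator, reporter
    sensor.receive([210, 215, 220])

Removing or replacing a component only requires rewiring, which hints at why a component’s role is determined in respect to the whole.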

One grasps the complexity the moment one attempts to describe a system in terms of its components and of the dependencies existing between them in terms of data and processes. If in nature the processes are extrapolated, in IT they are predefined (even if the knowledge about them is not available). In addition, the less knowledge one has about the infrastructure, the higher the apparent complexity. Even if the system is not necessarily complex, the lack of knowledge and certainty about it makes it appear so. The more one needs to dig for information and knowledge to reach an acceptable level of understanding and logical depth, the more time is needed for designing a solution.

Saint Exupéry’s definition of simplicity applies from a system’s functional point of view, though it doesn’t address the relative knowledge about the system, which is often implicit (in people’s heads). People have only fragmented knowledge about the system, which makes it difficult to create the whole picture. It’s typically the role of system or process operational manuals, respectively of data descriptions, to make that knowledge explicit, establishing a foundation for common knowledge and further communication and understanding.

Between the apparent (perceived) and the real complexity of a system there’s an important gap that needs to be addressed if one wants to manage the systems adequately, respectively to simplify them. Often simplification happens when components or whole systems are replaced, consolidated, or migrated, though a mix of these approaches exists as well. Simplifications at the data level (aka data harmonization) or at the process level (aka process optimization and redesign) can have an important impact, being inherent to the good (optimal) functioning of systems.

Whether these changes occur as a big bang or in gradual iterations is a question of available resources and organizational capabilities, including the ability to handle such projects, respectively of the impact, opportunities and risks associated with such endeavors. Beyond this, it’s important to regard the problems from a systemic and systematic point of view, in which the ecology’s role is important.


Written: Jun-2020, Last Reviewed: Mar-2024

10 July 2019

IT: Crowdsourcing (Definitions)

"Obtaining information by tapping the collective knowledge of many people." (W Roy Schulte & K Chandy, "Event Processing: Designing IT Systems for Agile Companies", 2009)

"A model of problem solving and idea generation that marshals the collective talents of a large group of people." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"the act of outsourcing a task to an undefined, generally large group of people or community, typically in the form of some sort of post on the Internet." (Bill Holtsnider & Brian D Jaffe, "IT Manager's Handbook" 3rd Ed., 2012)

"Tapping into collective online knowledge by inviting large numbers of people, via the Internet, to contribute ideas on different aspects of a business’s operations. A related concept is 'crowdfunding', which involves funding a project or venture by raising capital from individual investors via the Internet." (DK, "The Business Book", 2014)

"The process by which ideas, services, or other needs are solicited from predominantly amorphous and undefined large groups of people." (Evan Stubbs, "Big Data, Big Innovation", 2014)

"A method of resource gathering where interested potential customers pledge money to innovators for a product that has not yet been created." (Rachel Heinen et al, "Tools for the Process: Technology to Support Creativity and Innovation", 2015)

"The practice of outsourcing organisational tasks by placing a call on the internet and inviting all-comers to post submissions, often with the lure of a prize or commission for the 'best entry'." (Duncan Angwin & Stephen Cummings, "The Strategy Pathfinder" 3rd Ed., 2017)

"Dividing the work of collecting a substantial amount of data into small tasks that can be undertaken by volunteers." (Open Data Handbook)

12 May 2019

Software Engineering: Programming (Misconceptions about Programming - Part I)

Software Engineering
Software Engineering Series

Besides equating the programming process with a programmer’s capabilities and minimizing the importance of programming and programmers’ skills in the whole process (see previous post), there are several other misconceptions about programming that influence the process’s outcomes.


Having a deep knowledge of a programming language allows programmers to approach other programming languages more easily; however, each language has its own learning curve, ranging from a few weeks to half a year or more. The learning curve depends on the complexity of the languages already known and of the language to be learned, the same applying to frameworks and architectures, the scenarios in which the languages are used, etc. One unrealistic expectation is that programmers are capable of learning a new programming language or framework overnight, an expectation that puts more pressure on programmers’ shoulders, as they need to compensate for the knowledge gap in a short time. No, programming languages are not the same, even if there’s a high resemblance between them!

There’s a lot of code available online, and many programming tasks involve writing similar code. This makes people assume that programming comes down to copy-paste activities and, in extremis, that there’s no creativity in the act of programming. Besides the fact that using others’ code comes with certain copyright limitations, copy-pasting code is in general a way of introducing bugs into software. One can learn a lot from others’ code, though the programmer’s challenge resides in writing better code, in reusing code while finding the right level of abstraction.
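A minimal sketch of the issue (hypothetical names and rates): the same computation copy-pasted twice drifts apart when only one copy gets fixed, while a small abstraction leaves a single definition to maintain and test:

    # Copy-paste drift (hypothetical example): the VAT fix landed in one copy only.
    def order_total(items):
        return sum(price * qty for price, qty in items) * 1.19   # fixed: VAT added

    def quote_total(items):
        return sum(price * qty for price, qty in items)          # stale copy, no VAT

    # The abstraction: one definition, one place to fix and to test.
    VAT_RATE = 0.19

    def total_with_vat(items, vat_rate=VAT_RATE):
        # sum of price * quantity, plus VAT - the single source of truth
        net = sum(price * qty for price, qty in items)
        return net * (1 + vat_rate)

    items = [(10.0, 2), (5.0, 1)]
    print(order_total(items), quote_total(items), total_with_vat(items))
    # quote_total silently disagrees with the other two (no VAT applied)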
 
There’s a tendency on the market to build whole applications using wizard-like functionality and to generate source code based on data or ontological models. Such approaches work in a (limited) range of scenarios, and even if the trend is to automate as much as possible in the process, this is not what programming is about. Each such tool comes with its own limitations that sooner or later will push back. Changing the generated code in order to build new functionality or to optimize it is often not a feasible solution, as it imposes further limitations.

Programming is not only about writing code. It also involves problem-solving abilities and a certain understanding of the business processes, in which conceptual creativity and ingenuity of design can prove to be good assets. Modelling and implementing processes help programmers gain a unique perspective within a business.

For a programmer, the learning process never stops. The release cycles of the established tools become shorter, each release bringing a new set of functionalities. Moreover, there are always new frameworks, environments, architectures and methodologies to learn. Expanding one’s (necessary) knowledge takes a considerable amount of effort, effort usually not planned in projects or outside of them. Trainings help in the process, though they hardly scratch the surface. Often the programmer is forced to fill the knowledge gap in his free time, which adds up to the volume of overtime one must do on projects. In the long run it becomes challenging to find the time needed for learning.

In resource planning there’s a tendency to add or replace resources on projects while neglecting the influence this has on a project and its timeline. Each new resource needs some time to settle into the role, to understand the project requirements, to take over the work of another. Moreover, resources are often replaced on a project with minimal or even no knowledge transfer for the job ahead. Unfortunately, the same behavior occurs in consultancy as well, consultants being moved from a known functional area into an unknown one, swapping resources like engines between different types of cars and expecting that everything will work like magic.

18 April 2019

Meta-Blogging: Mea Culpa (Part I: Changing the Status Quo)


During the past years I started multiple posts on various programming-related topics, though I seldom managed to bring something close to a publishable form. The main reason seems to be the lack of time needed to put an idea into words, to look at it from different perspectives in the form of a logical, meaningful unit and, last but not least, to make it count. This is accentuated by the fact that each idea pulls in another, and often there are so many things to say that it’s hard to draw a line between what should be included and what left out. In extremis one feels that something is missing.

Often a certain amount of research is required to validate or support the facts. The knowledge about SQL Server and other DBMSs is relative – it can only be relative as long as their internals are known only to a certain degree. The relativity is found also in the area of applicability, the choice of one solution over another lying in the details. Readers want solid facts, while all one can give is a dry “it depends”…

Unfortunately, for a blogger who isn’t close to the source of knowledge, the content posted tends to be third- or fourth-hand knowledge and, in one form or another, just a duplication of information. As long as content isn’t copied and there’s some personal touch, the duplication is not necessarily a bad thing. Duplication makes knowledge more likely to be found, as the content is indexed by search engines; however, it becomes more difficult to stand out from the crowd. To bring something new one must put existing knowledge into new contexts, be creative, and this takes time as well.

Without access to a pool of readers and of knowledge, it’s hard for a lone blogger to succeed; giving up is just a few posts or a few years away. Of course, life tends to take over. It’s also in human nature to be enthusiastic about an idea and give up shortly after the first difficulties are met. On the other hand, it’s often hard to keep or find the needed motivation, especially when there is little support coming from the blogging platforms, tool creators or content publishers. Not being able to monetize one’s effort makes blogging more of a hobby.

With small exceptions, the investments made in blogging tools are below expectations. It’s frustrating when the tools or the integration between them stop working and there’s no simple way to overcome this. Some aspects have changed with time; however, blogging seems to be losing ground to other forms of media content.

Despite the lack of time and other difficulties, I want to write and share my thoughts and my experience, to make the time invested in learning and solving problems count. Blogging is also a way to externalize implicit knowledge, to share, to question some of the ideas and practices, and ultimately to get feedback. In this resides the personal value of blogging!

In the fight with time and words, I found myself forced to limit the length of posts on some random nontechnical topics to 600 words. This number is rooted in the university years, representing the approximate limit for a written assignment to achieve acceptable quality and coverage while involving a bearable amount of effort. 600 is not a perfect number the way its leading digit is, though for the time being it will do.
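As a playful aside, the arithmetic checks out: 6 equals the sum of its proper divisors (1 + 2 + 3), while 600 does not. A quick sketch in Python:

    # A number is perfect when it equals the sum of its proper divisors.
    def is_perfect(n):
        return n == sum(d for d in range(1, n) if n % d == 0)

    print(is_perfect(6))    # True:  1 + 2 + 3 = 6
    print(is_perfect(600))  # False: its proper divisors sum to 1260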

The challenge is to find a context to express my thoughts and experience without being too boring, without skimming through ideas. Without carrying great expectations, it’s an attempt to change the status quo! 

30 December 2018

Data Science: Information (Just the Quotes)

"Probability, however, is not something absolute, [it is] drawn from certain information which, although it does not suffice to resolve the problem, nevertheless ensures that we judge correctly which of the two opposites is the easiest given the conditions known to us." (Gottfried W Leibniz, "Forethoughts for an encyclopaedia or universal science", cca. 1679)

"Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it." (Samuel Johnson, 1775)

"What is called science today consists of a haphazard heap of information, united by nothing, often utterly unnecessary, and not only failing to present one unquestionable truth, but as often as not containing the grossest errors, today put forward as truths, and tomorrow overthrown." (Leo Tolstoy, "What Is Art?", 1897)

"There can be no unique probability attached to any event or behaviour: we can only speak of ‘probability in the light of certain given information’, and the probability alters according to the extent of the information." (Sir Arthur S Eddington, "The Nature of the Physical World" , 1928)

"As words are not the things we speak about, and structure is the only link between them, structure becomes the only content of knowledge. If we gamble on verbal structures that have no observable empirical structures, such gambling can never give us any structural information about the world. Therefore such verbal structures are structurally obsolete, and if we believe in them, they induce delusions or other semantic disturbances." (Alfred Korzybski, "Science and Sanity", 1933)

"Much of the waste in business is due to lack of information. And when the information is available, waste often occurs because of lack of application or because of misapplication." (John R Riggleman & Ira N Frisbee, "Business Statistics", 1938)

"Upon this gifted age, in its dark hour, rains from the sky a meteoric shower of facts […] they lie, unquestioned, uncombined. Wisdom enough to leach us of our ill is daily spun; but there exists no loom to weave it into a fabric." (Edna St. Vincent Millay, "Huntsman, What Quarry?", 1939)

"Just as entropy is a measure of disorganization, the information carried by a set of messages is a measure of organization. In fact, it is possible to interpret the information carried by a message as essentially the negative of its entropy, and the negative logarithm of its probability. That is, the more probable the message, the less information it gives. Clichés, for example, are less illuminating than great poems." (Norbert Wiener, "The Human Use of Human Beings", 1950)

"Knowledge is not something which exists and grows in the abstract. It is a function of human organisms and of social organization. Knowledge, that is to say, is always what somebody knows: the most perfect transcript of knowledge in writing is not knowledge if nobody knows it. Knowledge however grows by the receipt of meaningful information - that is, by the intake of messages by a knower which are capable of reorganising his knowledge." (Kenneth E Boulding, "General Systems Theory - The Skeleton of Science", Management Science Vol. 2 (3), 1956)

"We have overwhelming evidence that available information plus analysis does not lead to knowledge. The management science team can properly analyse a situation and present recommendations to the manager, but no change occurs. The situation is so familiar to those of us who try to practice management science that I hardly need to describe the cases." (C West Churchman, "Managerial acceptance of scientific recommendations", California Management Review Vol 7, 1964)

"This is the key of modern science and it was the beginning of the true understanding of Nature - this idea to look at the thing, to record the details, and to hope that in the information thus obtained might lie a clue to one or another theoretical interpretation." (Richard P Feynman, "The Character of Physical Law", 1965)

"[...] 'information' is not a substance or concrete entity but rather a relationship between sets or ensembles of structured variety." (Walter F Buckley, "Sociology and modern systems theory", 1967)

"There are as many types of questions as components in the information." (Jacques Bertin, Semiology of graphics [Semiologie Graphique], 1967)

"The idea of knowledge as an improbable structure is still a good place to start. Knowledge, however, has a dimension which goes beyond that of mere information or improbability. This is a dimension of significance which is very hard to reduce to quantitative form. Two knowledge structures might be equally improbable but one might be much more significant than the other." (Kenneth E Boulding, "Beyond Economics: Essays on Society", 1968)

"When action grows unprofitable, gather information; when information grows unprofitable, sleep. (Ursula K Le Guin, "The Left Hand of Darkness", 1969)

"What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it." (Herbert Simon, "Computers, Communications and the Public Interest", 1971)

"What we mean by information - the elementary unit of information - is a difference which makes a difference, and it is able to make a difference because the neural pathways along which it travels and is continually transformed are themselves provided with energy. The pathways are ready to be triggered. We may even say that the question is already implicit in them." (Gregory Bateson, "Steps to an Ecology of Mind", 1972)

"Science gets most of its information by the process of reductionism, exploring the details, then the details of the details, until all the smallest bits of the structure, or the smallest parts of the mechanism, are laid out for counting and scrutiny. Only when this is done can the investigation be extended to encompass the whole organism or the entire system. So we say. Sometimes it seems that we take a loss, working this way." (Lewis Thomas, "The Medusa and the Snail: More Notes of a Biology Watcher", 1974)

"Science is not a heartless pursuit of objective information. It is a creative human activity, its geniuses acting more as artists than information processors. Changes in theory are not simply the derivative results of the new discoveries but the work of creative imagination influenced by contemporary social and political forces." (Stephen J Gould, "Ever Since Darwin: Reflections in Natural History", 1977)

"Data, seeming facts, apparent asso­ciations-these are not certain knowledge of something. They may be puzzles that can one day be explained; they may be trivia that need not be explained at all. (Kenneth Waltz, "Theory of International Politics", 1979)

"To a considerable degree science consists in originating the maximum amount of information with the minimum expenditure of energy. Beauty is the cleanness of line in such formulations along with symmetry, surprise, and congruence with other prevailing beliefs." (Edward O Wilson, "Biophilia", 1984)

"Knowledge is the appropriate collection of information, such that it's intent is to be useful. Knowledge is a deterministic process. When someone 'memorizes' information (as less-aspiring test-bound students often do), then they have amassed knowledge. This knowledge has useful meaning to them, but it does not provide for, in and of itself, an integration such as would infer further knowledge." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Information is data that has been given meaning by way of relational connection. This 'meaning' can be useful, but does not have to be. In computer parlance, a relational database makes information from the data stored within it." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Probability plays a central role in many fields, from quantum mechanics to information theory, and even older fields use probability now that the presence of 'noise' is officially admitted. The newer aspects of many fields start with the admission of uncertainty." (Richard W Hamming, "Methods of Mathematics Applied to Calculus, Probability, and Statistics", 1985)

"Probabilities are summaries of knowledge that is left behind when information is transferred to a higher level of abstraction." (Judea Pearl, "Probabilistic Reasoning in Intelligent Systems: Network of Plausible, Inference", 1988)

"Information exists. It does not need to be perceived to exist. It does not need to be understood to exist. It requires no intelligence to interpret it. It does not have to have meaning to exist. It exists." (Tom Stonier, "Information and the Internal Structure of the Universe: An Exploration into Information Physics", 1990)

"What about confusing clutter? Information overload? Doesn't data have to be ‘boiled down’ and  ‘simplified’? These common questions miss the point, for the quantity of detail is an issue completely separate from the difficulty of reading. Clutter and confusion are failures of design, not attributes of information." (Edward R Tufte, "Envisioning Information", 1990)

"Knowledge is theory. We should be thankful if action of management is based on theory. Knowledge has temporal spread. Information is not knowledge. The world is drowning in information but is slow in acquisition of knowledge. There is no substitute for knowledge." (William E Deming, "The New Economics for Industry, Government, Education", 1993)

"The science of statistics may be described as exploring, analyzing and summarizing data; designing or choosing appropriate ways of collecting data and extracting information from them; and communicating that information. Statistics also involves constructing and testing models for describing chance phenomena. These models can be used as a basis for making inferences and drawing conclusions and, finally, perhaps for making decisions." (Fergus Daly et al, "Elements of Statistics", 1995)

"[Schemata are] knowledge structures that represent objects or events and provide default assumptions about their characteristics, relationships, and entailments under conditions of incomplete information." (Paul J DiMaggio, "Culture and Cognition", Annual Review of Sociology No. 23, 1997)

"When it comes to information, it turns out that one can have too much of a good thing. At a certain level of input, the law of diminishing returns takes effect; the glut of information no longer adds to our quality of life, but instead begins to cultivate stress, confusion, and even ignorance." (David Shenk, "Data Smog", 1997)

"Each element in the system is ignorant of the behavior of the system as a whole, it responds only to information that is available to it locally. This point is vitally important. If each element ‘knew’ what was happening to the system as a whole, all of the complexity would have to be present in that element." (Paul Cilliers, "Complexity and Postmodernism: Understanding Complex Systems" , 1998)

"Complexity is that property of a model which makes it difficult to formulate its overall behaviour in a given language, even when given reasonably complete information about its atomic components and their inter-relations." (Bruce Edmonds, "Syntactic Measures of Complexity", 1999)

"A model isolates one or a few causal connections, mechanisms, or processes, to the exclusion of other contributing or interfering factors - while in the actual world, those other factors make their effects felt in what actually happens. Models may seem true in the abstract, and are false in the concrete. The key issue is about whether there is a bridge between the two, the abstract and the concrete, such that a simple model can be relied on as a source of relevantly truthful information about the complex reality." (Uskali Mäki, "Fact and Fiction in Economics: Models, Realism and Social Construction", 2002)

"Entropy is not about speeds or positions of particles, the way temperature and pressure and volume are, but about our lack of information." (Hans C von Baeyer," Information, The New Language of Science", 2003)

"The use of computers shouldn't ignore the objectives of graphics, that are: 
 1) Treating data to get information. 
 2) Communicating, when necessary, the information obtained." (Jacques Bertin, [interview] 2003)

"There is no end to the information we can use. A 'good' map provides the information we need for a particular purpose - or the information the mapmaker wants us to have. To guide us, a map’s designers must consider more than content and projection; any single map involves hundreds of decisions about presentation." (Peter Turchi, "Maps of the Imagination: The writer as cartographer", 2004)

"While in theory randomness is an intrinsic property, in practice, randomness is incomplete information." (Nassim N Taleb, "The Black Swan", 2007)

"Put simply, statistics is a range of procedures for gathering, organizing, analyzing and presenting quantitative data. […] Essentially […], statistics is a scientific approach to analyzing numerical data in order to enable us to maximize our interpretation, understanding and use. This means that statistics helps us turn data into information; that is, data that have been interpreted, understood and are useful to the recipient. Put formally, for your project, statistics is the systematic collection and analysis of numerical data, in order to investigate or discover relationships among phenomena so as to explain, predict and control their occurrence." (Reva B Brown & Mark Saunders, "Dealing with Statistics: What You Need to Know", 2008)

"Access to more information isn’t enough - the information needs to be correct, timely, and presented in a manner that enables the reader to learn from it. The current network is full of inaccurate, misleading, and biased information that often crowds out the valid information. People have not learned that 'popular' or 'available' information is not necessarily valid." (Gene Spafford, 2010) 

"We face danger whenever information growth outpaces our understanding of how to process it. The last forty years of human history imply that it can still take a long time to translate information into useful knowledge, and that if we are not careful, we may take a step back in the meantime." (Nate Silver, "The Signal and the Noise", 2012)

"Complexity has the propensity to overload systems, making the relevance of a particular piece of information not statistically significant. And when an array of mind-numbing factors is added into the equation, theory and models rarely conform to reality." (Lawrence K Samuels, "Defense of Chaos: The Chaology of Politics, Economics and Human Action", 2013)

"Complexity scientists concluded that there are just too many factors - both concordant and contrarian - to understand. And with so many potential gaps in information, almost nobody can see the whole picture. Complex systems have severe limits, not only to predictability but also to measurability. Some complexity theorists argue that modelling, while useful for thinking and for studying the complexities of the world, is a particularly poor tool for predicting what will happen." (Lawrence K Samuels, "Defense of Chaos: The Chaology of Politics, Economics and Human Action", 2013)

"One of the most powerful transformational catalysts is knowledge, new information, or logic that defies old mental models and ways of thinking" (Elizabeth Thornton, "The Objective Leader", 2015)

"The term data, unlike the related terms facts and evidence, does not connote truth. Data is descriptive, but data can be erroneous. We tend to distinguish data from information. Data is a primitive or atomic state (as in ‘raw data’). It becomes information only when it is presented in context, in a way that informs. This progression from data to information is not the only direction in which the relationship flows, however; information can also be broken down into pieces, stripped of context, and stored as data. This is the case with most of the data that’s stored in computer systems. Data that’s collected and stored directly by machines, such as sensors, becomes information only when it’s reconnected to its context."  (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

About Me

IT professional with more than 24 years of experience in IT, covering the full life cycle of Web/Desktop/Database application development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.