Showing posts with label standardization. Show all posts
Showing posts with label standardization. Show all posts

29 July 2019

💻IT: Standardization (Definitions)

"The imposition of standards which, in turn, are fixed ways of doing things that are widely recognized." (Roy Rada &  Heather Holden, "Online Education, Standardization, and Roles", 2009)

"Formulation, publication, and implementation of guidelines, rules, methods, procedures and specifications for common and repeated use, aimed at achieving optimum degree of order or uniformity in given context, discipline, or field; standards are most frequently developed on international level; there exist national standardization bodies cooperating with international bodies; standards can be either legally binding or de facto standards followed by informal convention or voluntary standards (recommendations)." (Lenka Lhotska et al,"Interoperability of Medical Devices and Information Systems", 2013)

"A framework of agreements to which all relevant parties in an industry or organization must adhere to ensure that all processes associated with the creation of a good or performance of a service are performed within set guideline." (Victor A Afonso & Maria de Lurdes Calisto, "Innovation in Experiential Services: Trends and Challenges", 2015)

"The development of uniform specifications for materials, products, processes, practices, measurement, or performance, usually via consultation with stakeholders and sanction by a recognized body, providing for improvements in productivity, interoperability, cooperation, and accountability." (Gregory A Smith, "Assessment in Academic Libraries", 2015)

"A process of developing and implementing technical standards based on consensus among various stakeholders in the field. Standardization can greatly assist with compatibility and interoperability of otherwise disparate software components, where consistent solutions enable mutual gains for all stakeholders." (Krzysztof Krawiec et al, "Metaheuristic Design Patterns: New Perspectives for Larger-Scale Search Architectures", 2018)

"The process through which a standard is developed." (Kai Jakobs, "ICT Standardization", 2018)

"Is a framework of agreements to which professionals in an organization must accept to ensure that all processes associated with the creation of a product or service are performed within set guidelines, achieving uniformity to certain practices or operations within the selected environment. It can be seen as a professional strategy to strengthen professional trust and provide a sense of certainty for professionals or it can be interpreted as a way to lose professionalization and as an adjustment to organizational demands." (Joana V Guerra, "Digital Professionalism: Challenges and Opportunities to Healthcare Professions", 2019)

"The process of making things of the same kind, including products and services, have the same basic features and the same requirements." (Julia Krause, "Through Harmonization of National Technical Regulations to More Sustainability in Engineering Business", 2019)

29 April 2018

🔬Data Science: Data Standardization (Definitions)

"The process of reaching agreement on common data definitions, formats, representation and structures of all data layers and elements." (United Nations, "Handbook on Geographic Information Systems and Digital Mapping", Studies in Methods No. 79, 2000)

[value standardization:] "Refers to the establishment and adherence of data to standard formatting practices, ensuring a consistent interpretation of data values." (Evan Levy & Jill Dyché, "Customer Data Integration", 2006)

"Converting data into standard formats to facilitate parsing and thus matching, linking, and de-duplication. Examples include: “Avenue” as “Ave.” in addresses; “Corporation” as “Corp.” in business names; and variations of a specific company name as one version." (Danette McGilvray, "Executing Data Quality Projects", 2008)

"Normalizes data values to meet format and semantic definitions. For example, data standardization of address information may ensure that an address includes all of the required pieces of information and normalize abbreviations (for example Ave. for Avenue)." (Martin Oberhofer et al, "Enterprise Master Data Management", 2008)

"Using rules to conform data that is similar into a standard format or structure. Example: taking similar data, which originates in a variety of formats, and transforming it into a single, clearly defined format." (Gregory Lampshire, "The Data and Analytics Playbook", 2016)

"a process in information systems where data values for a data element are transformed to a consistent representation." (Meredith Zozus, "The Data Book: Collection and Management of Research Data", 2017)

"Data standardization is the process of converting data to a common format to enable users to process and analyze it." (Sisense) [source]

"In the context of data analysis and data mining: Where “V” represents the value of the variable in the original datasets: Transformation of data to have zero mean and unit variance. Techniques used include: (a) Data normalization; (b) z-score scaling; (c) Dividing each value by the range: recalculates each variable as V /(max V – min V). In this case, the means, variances, and ranges of the variables are still different, but at least the ranges are likely to be more similar; and, (d) Dividing each value by the standard deviation. This method produces a set of transformed variables with variances of 1, but different means and ranges." (CODATA)

10 December 2014

✨Performance Management: Skills (Just the Quotes)

"By far the most valuable possession is skill. Both war and the chances of fortune destroy other things, but skill is preserved." Hipparchus, Commentaries, 2nd century BC)

"Let a man practice the profession which he best knows." (Cicero, "Tusculanarum Disputationum", cca. 45 BC)

"Numeracy has two facets - reading and writing, or extracting numerical information and presenting it. The skills of data presentation may at first seem ad hoc and judgmental, a matter of style rather than of technology, but certain aspects can be formalized into explicit rules, the equivalent of elementary syntax." (Andrew Ehrenberg, "Rudiments of Numeracy", Journal of Royal Statistical Society, 1977)

"Five coordinating mechanisms seem to explain the fundamental ways in which organizations coordinate their work: mutual adjustment, direct supervision, standardization of work processes, standardization of work outputs, and standardization of worker skills." (Henry Mintzberg, "The Structuring of Organizations", 1979)

"Training is the teaching of specific skills. It should result in the employee having the ability to do something he or she could not do before." (Mary A Allison & Eric Anderson, "Managing Up, Managing Down", 1984)

"The skills that make technical professionals competent in their specialties are not necessarily the same ones that make them successful within their organizations." (Bernard Rosenbaum, "Training", 1986)

"[…] data analysis in the context of basic mathematical concepts and skills. The ability to use and interpret simple graphical and numerical descriptions of data is the foundation of numeracy […] Meaningful data aid in replacing an emphasis on calculation by the exercise of judgement and a stress on interpreting and communicating results." (David S Moore, "Statistics for All: Why, What and How?", 1990)

"[By understanding] I mean simply a sufficient grasp of concepts, principles, or skills so that one can bring them to bear on new problems and situations, deciding in which ways one’s present competencies can suffice and in which ways one may require new skills or knowledge." (Howard Gardner, "The Unschooled Mind", 1991)

"Education is not the piling on of learning, information, data, facts, skills, or abilities - that's training or instruction - but is rather making visible what is hidden as a seed." (Thomas W Moore, "The Education of the Heart", 1996)

"Even when you have skilled, motivated, hard-working people, the wrong team structure can undercut their efforts instead of catapulting them to success. A poor team structure can increase development time, reduce quality, damage morale, increase turnover, and ultimately lead to project cancellation." (Steve McConnell, "Rapid Development", 1996)

"Success or failure of a project depends upon the ability of key personnel to have sufficient data for decision-making. Project management is often considered to be both an art and a science. It is an art because of the strong need for interpersonal skills, and the project planning and control forms attempt to convert part of the 'art' into a science." (Harold Kerzner, "Strategic Planning for Project Management using a Project Management Maturity Model", 2001)

"Even with simple and usable models, most organizations will need to upgrade their analytical skills and literacy. Managers must come to view analytics as central to solving problems and identifying opportunities - to make it part of the fabric of daily operations." (Dominic Barton & David Court, "Making Advanced Analytics Work for You", 2012)

"The biggest thing to know is that data visualization is hard. Really difficult to pull off well. It requires harmonization of several skills sets and ways of thinking: conceptual, analytic, statistical, graphic design, programmatic, interface-design, story-telling, journalism - plus a bit of ‘gut feel.’ The end result is often simple and beautiful, but the process itself is usually challenging and messy." (David McCandless, 2013)

"Finding the right answer is important, of course. But more important is developing the ability to see that problems have multiple solutions, that getting from X to Y demands basic skills and mental agility, imagination, persistence, patience." (Mary H Futrell)

"Productivity is the name of the game, and gains in productivity will only come when better understanding and better relationships exist between management and the work force. [...] Managers have traditionally developed the skills in finance, planning, marketing and production techniques. Too often the relationships with their people have been assigned a secondary role. This is too important a subject not to receive first-line attention." (William Hewlett, "The Human Side of Management", [speech])

"Solving problems is a practical skill like, let us say, swimming. We acquire any practical skill by imitation and practice." (George Polya)

06 April 2012

🧭Business Intelligence: Enterprise Reporting (Part X: Between Potential, Reality, Quality and Stories)

Business Intelligence
Business Intelligence Series

Have you ever felt that you are investing quite a lot of time, effort, money and other resources into your BI infrastructure, and in the end you don’t meet your expectations? As it seems you’re not the only one. The “Does your business intelligence tell you the whole story” paper released in 2009 by KPMG provides some interesting numbers to support that:
1. “More than 50% of business intelligence projects fail to deliver the expected benefit” (BI projects failure)
2. “Two thirds of executives feel that the quality of and timely access to data is poor and inconsistent” (reports and data quality)
3. “Seven out of ten executives do not get the right information to make business decisions.” (BI value)
4. “Fewer than 10% of organizations have successfully used business intelligence to enhance their organizational and technological infrastructures”  (BI alignment)
5. “those with effective business intelligence outperform the market by more than 5% in terms of return on equity” (competitive advantage)

The numbers reflect to some degree also my expectations, though they seem more pessimistic than I expected. That’s not a surprise, considering that such studies can be strongly biased, especially because in them are reflected expectations, presumptions and personal views over the state of art within an organization.

KPMG builds on the above numbers and several other aspects that revolve around the use of governance and alignment in order to increase the value provided by BI to the business, though I feel that they are hardly scratching the surface. Governance and alignment look great into studies and academic work, though they alone can’t bring success, no matter how much their importance and usage is accentuated. Sometimes I feel that people hide behind big words without even grasping the facts. The importance of governance and alignment can’t be neglected, though the argumentation provided by KPMG isn’t flawless. There are statements I can agree with, and many which are circumstantial. Anyway, let’s look a little deeper at the above numbers.

I suppose there is no surprise concerning the huge rate of BI projects’ failure. The value is somewhat close to the rate of software projects’ failure. Why would make a BI project an exception from a typical software project, considering that they are facing almost the same environments and challenges?  In fact, given the role played by BI in decision making, I would say that BI projects are more sensitive to the various factors than a typical software project.  

It doesn’t make sense to retake the motives for which software projects fail, but some particular aspects need to be mentioned. KPMG insists on the poor quality of data, on the relevance and volume of reports and metrics used, the lack of reflecting organization’s objectives, the inflexibility of data models, lack of standardization, all of them reflecting in a degree or other on the success of a BI project. There is much more to it!

KPMG refers to a holistic approach concentrated on the change of focus from technology to the actual needs, a change of process and funding.  A reflection of the holistic approach is also the view of the BI infrastructure from the point of view of the entire IT infrastructure, of the organization, network of partners and of the end-products – mainly models and reports. Many of the problems BI initiatives are confronted with refer to the quality of data and its many dimensions (duplicates, conformity, consistency, integrity, accuracy, availability, timeliness, etc.) , problems which could be in theory solved in the source systems, mainly through design. Other problems, like dealing with complex infrastructures based on more or less compatible IS or BI tools, might involve virtualization, consolidation or harmonization of such solutions, plus the addition of other tools.

Looking at the whole organization, other problems appear: the use of reports and models without understanding the whole luggage of meaning hiding behind them, the different views within the same data and models, the difference of language, problems, requirements and objectives, the departmental and organizational politics, the lack of communication, the lack of trust in the existing models and reports, and so on. What all these points have in common are people! The people are the maybe the most important factor in the adoption and effective usage of BI solutions. It starts with them – identifying their needs, and it ends with them – as end users. Making them aware of all contextual requirements, actually making them knowledge workers and not considering them just simple machines could give a boost to your BI strategy.

Partners doesn’t encompass just software vendors, service providers or consultants, but also the internal organizational structures – teams, departments, sites or any other similar structure. Many problems in BI can be tracked down to partners and the ways a partnership is understood, on how resources are managed, how different goals and strategies are harmonized, on how people collaborate and coordinate. Maybe the most problematic is the partnership between IT and the other departments on one side, and between IT and external partners on the other side. As long IT is not seen as a partner, as long IT is skip from the important decisions or isn’t acting as a mediator between its internal and external partners, there are few chances of succeeding. There are so many aspects and lot of material written on this topic, there are models and methodologies supposed to make things work, but often between theory and practice there is a long distance.

How many of the people you met were blaming the poor quality of the data without actually doing something to improve anything? If the quality of your data in one of your major problems then why aren’t you doing something to improve that?  Taking the ownership over your data is a major step on the way to better data quality, though a data management strategy is needed. This involve the design of a framework that facilitates data quality and data consumption, the design and use of policies, practices and procedures to properly manage the full data lifecycle. Also this can be considered as part of your BI infrastructure, and given the huge volume, the complexity and diversity of data, is nowadays a must for an organization.

The “right information” is an evasive construct. In order to get the right information you must be capable to define what you want, to design your infrastructure with that in mind and to learn how to harness your data. You don’t have to look only at your data and information but also at the whole DIKW pyramid. The bottom line is that you don’t have to build only a BI infrastructure but a knowledge management infrastructure, and methodologies like ITIL can help you achieve that, though they are not sufficient. Sooner or later you’ll arrive to blame the whole DIKW pyramid - the difficulty of extracting information from data, knowledge from information, and the ultimate translation into wisdom. Actually that’s also what the third and fourth of the above statements are screaming out loud – it’s not so easy to get information from the silos of data, same as it’s not easy to align the transformation process with organizations’ strategy.

Also timeliness has a relative meaning. It’s true that nowadays’ business dynamics requires faster access to data, though it requires also to be proactive, many organizations lacking this level of maturity. In order to be proactive it’s necessary to understand your business’ dynamics thoroughly, that being routed primarily in your data, in the tools you are using and the skill set your employees acquired in order to move between the DIKW layers. I would say that the understanding of DIKW is essential in harnessing your BI infrastructure.

KPMG considers that the 5% increase in return on equity associated with the effective usage of BI is a positive sign, not necessarily. The increase can be associated with hazard or other factors as well, even if it’s unlikely probable to be so. The increase it’s quite small when considered with the huge amount of resources spent on BI infrastructure. I believe that BI can do much more for organizations when harnessed adequately. It’s just a belief that needs to be backed up by numbers, hopefully that will happen someday, soon.

Previous Post <<||>> Next Post

13 January 2010

🗄️Data Management: Data Quality Dimensions (Part II: Conformity)

Data Management
Data Management Series

Conformity or format compliance, as named by [1], refers to the extent data are in the expected format, each attribute being associated with a set of metadata like type (e.g. text, numeric, alphanumeric, positive), length, precision, scale, or any other formatting patterns (e.g. phone number, decimal and digit grouping symbols).

Because distinct decimal, digit grouping, negative sign and currency symbols can be used to represent numeric values, same as different date formats could be used alternatively (e.g. dd-mm-yyyy vs. mm-dd-yyyy), the numeric and date data types are highly sensitive to local computer and general applications settings because the same attribute could be stored, processed and represented in different formats. Therefore, it’s preferable to minimize the variations in formatting by applying the same format to all attributes having the same data type and, whenever is possible, the format should not be confusing. 

For example all the dates in a data set or in a set of data sets being object of the same global context (e.g. data migration, reporting) should have the same format, being preferred a format of type dd-mon-yyyy which, ignoring the different values the month could have for different language settings, it lets no space for interpretations (e.g. 01-10-2009 vs. 10-01-2009). There are also situations in which the constraints imposed by the various applications used restraints the flexibility of working adequately with the local computer formats.

If for decimal and dates there are a limited number of possibilities that can be dealt with, for alphanumeric values things change drastically because excepting the format masks that could be used during data entry, the adherence to a format depends entirely on the Users and whether they applied the formatting standards defined. In the absence of standards, Users might come with their own encoding, and even then, they might change it over time. 

The use of different encodings could be also required by the standards within a specific country, organization, or other type of such entity. All these together makes from alphanumeric attributes the most often candidate for data cleaning, and the business rules used can be quite complex, needing to handle each specific case. For example, the VAT code could have different length from country to country, and more than one encoding could be used reflecting the changes in formatting policy.

In what concerns the format, the alphanumeric attributes offer greater flexibility than the decimal and date attributes, and their formatting could be in theory ignored unless they are further parsed by other applications. However, considering that such needs change over time, it’s advisable to standardize the various formats used within an organization and use 'standard' delimiters for formatting the various chunks of data with a particular meaning within an alphanumeric attribute, fact that could reduce considerably the volume of overwork needed in order to cleanse the data for further processing. An encoding could be done without the use of delimiters, e.g. when the length of each chunk of data is the same, though chunk length-based formatting could prove to be limited when the length of a chunk changes.

Note:
Delimiters should be chosen from the characters that will never be used in the actual chunks of data or in the various applications dealing with the respective data. For example pipe (“|”) or semicolon (“;”) could be good candidates for such a delimiter though they are often used as delimiters when exporting the data to text files, therefore it’s better to use a dash (“-”) or even a combinations of characters (e.g. “.-.”) when a dash is not enough, while in some cases even a space or a dot could be used as delimiter.


Written: Jan-2010, Last Reviewed: Mar-2024 

References:
[1] David Loshin (2009) "Master Data Management", Morgan Kaufmann OMG Press. ISBN 978-0-12-374225-4.
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.