SQL Troubles

04 May 2019

🧊Data Warehousing: Architecture (Part I: Push vs. Pull)

In data integrations, data migrations and data warehousing there is the need to move data between two or more systems. In the simplest scenario there are only two systems involved, a source and a target system, though there can be complex scenarios in which data from multiple sources need to be available in a common target system (as in the case of data warehouses/marts or data migrations), or data from one source (e.g. ERP systems) need to be available in other systems (e.g. Web shops, planning systems), or there can be complex cases in which there is a many-to-many relationship (e.g. data from two ERP systems are consolidated in other systems).

The data can flow in one direction from the source systems to the target systems (aka unidirectional flow), though there can be situations in which once the data are modified in the target system they need to flow back to the source system (aka bidirectional flow), as in the case of planning or product development systems. In complex scenarios the communication may occur multiple times within the same process until a final state is reached.

Independently of the number of systems and the type of communication involved, data need to flow between the systems as smoothly as possible, assuring that the data are consistent between the various systems and available when needed. The architectures responsible for moving data between the systems are based on two simple mechanisms, push and pull, or combinations of them.

With a push mechanism the data are pushed from the source system into the target system(s), the source system being responsible for the operation. Typically the push can happen as soon as an event occurs in the source system, an event that leads to or follows a change in the data. There can also be cases in which it is preferred to push the data at regular points in time (e.g. hourly, daily), especially when the changes aren’t needed immediately. This latter scenario still allows changes to be made to the data in the source until they are sent to the other system(s). When the ability to make changes is critical, this can be controlled through specific business rules.

With a pull mechanism the data are pulled from the source system into the target system, the target system being responsible for the operation. This usually happens at regular points in time or on demand; however, the target system has to check whether the data have changed.
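
The two mechanisms can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not an integration design: the `Source` and `PullTarget` classes, the version counter used to detect changes and the sample rows are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Source:
    """Hypothetical source system that stamps every change with a version number."""
    rows: Dict[int, Tuple[str, int]] = field(default_factory=dict)  # key -> (value, version)
    version: int = 0
    subscribers: List[Callable[[int, str], None]] = field(default_factory=list)

    def update(self, key: int, value: str) -> None:
        self.version += 1
        self.rows[key] = (value, self.version)
        # Push: the source delivers the change as soon as the event occurs.
        for deliver in self.subscribers:
            deliver(key, value)

    def changed_since(self, version: int) -> Dict[int, str]:
        # Lets a pulling target ask only for rows changed after its last pull.
        return {k: v for k, (v, ver) in self.rows.items() if ver > version}


@dataclass
class PullTarget:
    """Hypothetical target system that is responsible for fetching the changes."""
    rows: Dict[int, str] = field(default_factory=dict)
    last_seen: int = 0

    def pull(self, source: Source) -> int:
        changes = source.changed_since(self.last_seen)
        self.rows.update(changes)
        self.last_seen = source.version
        return len(changes)


source = Source()
pushed_rows: Dict[int, str] = {}                      # target fed by push
source.subscribers.append(pushed_rows.__setitem__)
puller = PullTarget()                                 # target fed by pull

source.update(1, "customer A")
source.update(2, "customer B")
first = puller.pull(source)      # both rows are new for the pulling target
source.update(1, "customer A (renamed)")
second = puller.pull(source)     # only the renamed row has changed
third = puller.pull(source)      # nothing changed since the last pull
print(first, second, third)      # prints "2 1 0"
```

The pushed target is up to date after every `update`, while the pulling target is only as current as its last `pull`, which is the timeliness trade-off between the two mechanisms.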

Hybrid scenarios may involve a middleware that sits between the systems, being responsible for pulling the data from the source systems and pushing them into the target systems. Another hybrid scenario is when the source system pushes the data to an intermediary repository, the target system(s) pulling the data on an as-needed basis. The repository can reside on the source, on the target or in between. A variation of it is when the source informs the target that a change happened and it’s up to the target to decide whether it needs the data or not.
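
The intermediary repository variant can be sketched as a simple queue that the source writes to and the target reads from. The `StagingRepository` class and its method names are hypothetical, chosen only for this illustration; in practice the repository could be a staging table, a file share or a message queue.

```python
from collections import deque
from typing import Deque, List


class StagingRepository:
    """Hypothetical intermediary store: the source pushes in, the target pulls out."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()

    def publish(self, record: str) -> None:
        # Called by the source system whenever it has data to hand over.
        self._queue.append(record)

    def pending(self) -> int:
        # Lets the target decide whether it is worth pulling at all.
        return len(self._queue)

    def fetch(self, max_records: int) -> List[str]:
        # Called by the target system; records are handed over in arrival order.
        batch: List[str] = []
        while self._queue and len(batch) < max_records:
            batch.append(self._queue.popleft())
        return batch


repository = StagingRepository()
for record in ("order 1", "order 2", "order 3"):
    repository.publish(record)

batch = repository.fetch(max_records=2)   # the target takes only what it needs now
remaining = repository.pending()          # one record is left for a later pull
```

Neither system needs to know when the other is available, which is the main reason for placing a repository in between.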

The main differentiators between the various methods are the timeliness, completeness and consistency of the data. Timeliness refers to the urgency with which data need to be available in the target system(s), completeness refers to the degree to which the data are ready to be sent, while consistency refers to the degree to which the data from the source are consistent with the data from the target systems.

Based on their characteristics, integrations seem to favor push methods, while data migrations and data warehousing favor pull methods, though which method suits best depends entirely on the business needs under consideration.

#️⃣Software Engineering: Programming (Part X: Programming as Art)

Software Engineering Series

Maybe seeing programming as an art is an idealistic thought, while attempting to describe programming as an art may seem a thankless task. However, one can talk about the art of programming the same way one can talk about the art of applying a craft. It’s a reflection of the mastery reached and what it takes to master something. Some call it art, others mastery; in the end it’s the drive that makes one surpass one’s own condition.

Besides an audience's experience with a creative skill, art means the study, process and product of a creative skill. Learning the art of programming means primarily learning its vocabulary and its grammar, the language; then one has to learn the rules, how and when to break them, and in the end how to transcend the rules to create new languages. The poet uses metaphors and rhythm to describe the world he sees, the programmer uses abstraction and patterns for the same. Programming is the art of using patterns to create new patterns, much like the poet does.

The drive of art is creativity, regardless of whether one talks about music, painting, poetry, mathematics or any other science. A programmer's creativity is reflected in the way he uses his tools and builds new ones. Despite the limits imposed by the programming languages he uses, the programmer can borrow anytime the knowledge of other sciences – mathematics, physics or biology – to describe the universe and make it understandable for machines. In fact, when we understand something well enough to explain it to a computer, we call it science [1].

Programming is both a science and an art. Paraphrasing Leonard Tippett [2], programming is a science in that its methods are basically systematic and have general application; and an art in that their successful application depends to a considerable degree on the skill and special experience of the programmer, and on his knowledge of the field of application. The programmer seems to borrow from an engineer’s natural curiosity, attention to detail, thirst for knowledge and continual improvement, though these are already in the programmer’s DNA.

In programming, aesthetics is judged by the elegance with which one solves a problem and transcribes its implementation. The programmer is on a continuous quest for simplicity, reusability, abstraction and elegance, within the limits of time and complexity. Beauty resides in the simplicity of the code, the ease with which complexity is reduced to computability, the way everything fits together in a whole. Through reusability and abstraction the whole becomes more than the sum of its parts.

Programming takes its rigor and logic from mathematics. Even if the programmer is not a mathematician, he borrows from a mathematician’s way of seeing the world in structures, patterns, order, models (approximations), connectedness, networks, the designs converging to create new paradigms. The programmer's imagery conjures some part of a mathematician's art.

In extremis, through the structures and thought patterns, the programmer is in a continuous search for meanings, for creating a meaning to encompass other meanings, meanings which will hopefully converge to a greater good. It resembles the art of the philosopher, without the historical baggage.

Between the patterns of the mathematician and the philosopher's search for truth, between the poet's artistry of manipulating the language to create new views and the engineer’s cold search for formalism and method, programming is a way to understand the world and create new worlds. The programmer becomes the creator of glimpses of universes which, when put together like the pieces of a puzzle, can create a new reality, not necessarily better, but a reality that reflects the programmer’s art. For the one who learned to master a programming language nothing is impossible.

Quotations used:
[1] “Learning the art of programming, like most other disciplines, consists of first learning the rules and then learning when to break them.” (Joshua Bloch, “Effective Java”, 2001)
[2] “[Statistics] is both a science and an art. It is a science in that its methods are basically systematic and have general application; and an art in that their successful application depends to a considerable degree on the skill and special experience of the statistician, and on his knowledge of the field of application, e.g. economics.” (Leonard Tippett, “Statistics”, 1943)

02 May 2019

#️⃣Software Engineering: Programming (Part IX: Programmer, Coder or Developer?)

Software Engineering Series

Programmer, coder or (software) developer are terms used interchangeably to denote a person who writes a set of instructions for a computer or any other electronic device. Looking at the intrinsic meaning of the three denominations, a programmer is a person who writes programs, a coder is a person who writes code, and a developer is one who develops (makes grow) a piece of software. They look like redundant definitions, don’t they?

A program is a stand-alone piece of code written for a given purpose – in general it’s used to transform inputs into outputs or specific actions, and involves a set of structures, libraries and other resources. Programming means primarily being able to write, understand, test and debug programs; however, there can be other activities like designing, refactoring and documenting programs and the resources needed. It also involves the knowledge of a set of algorithms, libraries, architectures, methodologies and practices that can be used in the process.

Code may refer to a program, as well as to parts of a program. Writing code means being able to use and understand a programming language’s instructions for a given result – validating input, acting on diverse events, formatting and transforming content, etc. The code doesn’t necessarily have to stand alone, often being incorporated inside documents like web pages, web parts or reports.

Development of software usually means more than programming, as the former is considered a process of conceiving, specifying, designing, programming, documenting, testing and maintaining software. The gap between the two is negligible, as programming typically involves in practice the other activities as well.

Programmer and coder are unfortunately often used with a pejorative connotation. Therefore the denomination of developer seems fancier. An even fancier term is that of software engineer, software engineering being the systematic application of engineering to the development of software.

In IT there are several other roles which involve tangentially the writing of instructions – database administrator, security engineer, IT analyst, tester, designer, modeler, technical writer, etc. It looks like a soup of fancy denominations chosen expressly to confuse nontechnical people. Thus a person who has covered many of the roles mentioned above sometimes finds it difficult to define the most appropriate denomination.

A person who writes such code doesn’t have to be a programmer or even an IT professional. There are many tools on the market whose basic functionality can be extended with the help of scripts - Excel, Access, SSRS or SSIS. Many tools nowadays have basic drag-and-drop and wizard-based functionality which limits the need for coding, and the trend seems to move in this direction. Another trend is minimizing the need for writing code to the degree that full applications can be built with drag and drop; however, some degree of coding is still needed. Knowledge of one or two universal scripting languages and data-interchange formats seems to be in demand.

Probably the main factor for naming somebody a programmer is whether he does this for a living. On the other hand, a person can identify himself as a programmer even if his role involves only a small degree of programming or programming is more of a hobby. One can consider programming as a way of living, as a way of understanding and modelling life. This way of life borrows a little from the way of being of the mathematician, the philosopher and the engineer.

In the end, the proper denomination is less important. More important is what one identifies oneself with and what one makes with one’s skills – the mental and machine-understandable universes one builds.

01 May 2019

Database Management: Version Management (Part I: SQL Server Feature Bloat)

Database Management Series

In an old SSWUG editorial, “SQL Server Feature Bloat” by Ben Taylor, the author raises the question of whether SQL Server features like the support for Unicode, the increase in page size for data storage to 8 KB or the storage of additional metadata and statistics create feature bloat. He further asks whether customers may consider other database solutions, and whether this aspect is important for customers.

A software or feature bloat is the “process whereby successive versions of a computer program become perceptibly slower, use more memory, disk space or processing power, or have higher hardware requirements than the previous version - whilst making only dubious user-perceptible improvements or suffering from feature creep” (Wikipedia).

Taylor’s question seems to be justified, especially when one considers the number of features added in the various releases of SQL Server. Regardless of whether they attempt to improve performance, extend existing functionality or provide new functionality, many of these features target special usage and are hardly used by average applications that use SQL Server only for data storage. Often after upgrading to a new release, customers may see no performance improvement in the features used, or the performance even degrades, while the new release needs more resources to perform the same tasks. This can make customers wonder whether all these new features bring any benefit for them.

It’s easy to neglect the fact that SQL Server is used just as the storage layer in an architecture, and it’s more likely that some of the problems reside in the business or presentation layers. In addition, a solution is not always designed to take advantage of a database’s (latest) features. Besides, it may happen that the compatibility level is set to a lower value, so the latest functionality won’t be used at all.

Probably the customers hope that the magic will happen immediately after the upgrade. Some features, like the ones regarding the engine’s optimization, are enabled by default and a performance gain is expected; however, to take advantage of the other new features the existing applications need to be redesigned. With each new release it’s important to look at the opportunities provided by the upgrade and analyze the performance benefit, as there’s often a trade-off between benefit and effort on one side, respectively between technical advantages and disadvantages on the other.

The examples used by Taylor aren’t necessarily representative because they refer to changes made prior to the SQL Server 2005 release and there are good arguments for their implementation. The storage of additional metadata and statistics is negligible in comparison with the size of the databases and the benefits, given that the database engine needs statistics so it can operate optimally. SQL Server moved from 2 KB pages to 8 KB pages between versions 6.5 and 7.0, probably because the larger page offers good performance with efficient use of space. The use of the Unicode character set became a standard given that databases needed to support multiple languages.

Feature bloat is not a problem that concerns only SQL Server but also other database products like Oracle, DB2 or MySQL, and other types of software. Customers’ choice of using one vendor’s products over another is often a strategic decision in which the database is just a piece of a bigger architecture. In the TPC-H benchmarks published during the last years, SQL Server 2014 and 2016 scored higher than the competitors. It’s more likely that customers will move to SQL Server than vice versa, when possible. Customers expect performance, stability and security and are willing to pay for them, as long as the gain is visible.

29 April 2019

💼Project Management: Planning Correctly Misunderstood - Part I

It is sometimes helpful to take a step back, observe, and then logically generalize the extremes of the observed facts; if possible, without judging people’s behavior, as there’s more to it than the eye can perceive. In some cases however one can feel that the observed situations are really close to extreme. It’s the case of some tendencies met in project planning - not planning, planning for the sake of planning, expecting a plan to be perfect, setting a plan as fixed, without the possibility of changing it in due time, respectively changing the plan too often.

There are situations in which it’s better to be spontaneous and go with the flow. Managing a project isn’t one of these situations. As Lakein’s Law formulates it succinctly: "failing to plan is planning to fail", or paraphrasing Eisenhower (1) and Clausewitz (2) - plans are useless as no plan ever survived contact with the enemy (reality), but planning is indispensable - as a plan increases awareness about the project’s scope, actions, challenges, risks and opportunities, and allows devising the tactics and logistics needed to reach the set goals. Even if the plan no longer reflects reality, it can still be adapted to fit the new requirements. The more planning experience one has, the more natural it becomes to close the gap between the initial plan and reality, and to adapt the plan as needed.

There’s an important difference between doing something because one is forced to do it and doing it because one sees and understands the value of planning. There's the tendency to plan for the sake of planning, because one feels compelled to do it. Besides the fact that it documents the what, when, why and who, and that it is used as a basis for action, the plan must reflect the project’s current status and the activities planned for the next reporting cycle. As soon as a plan is no longer able to reflect these aspects, it becomes unusable.

The enemy of a good plan can prove to be the dream of a perfect plan (3). Some may think that the holy grail of planning is the perfect plan, that the project can’t start until all the activities are listed down to the lowest detail and the effort thoroughly assigned. Few plans actually survive the contact with reality, and a lot of energy can be lost by working on the perfect plan.

Another similar behavior, rooted mainly in the methodologies used, is that of not allowing a plan to be changed for a part or the whole duration of the project. Publilius Syrus recognized more than two millennia ago that a plan that admits no modification is a bad plan (4) per se. Methodologies and practices that don’t allow a flexible way of changing the plan do projects a disservice. Often changes need to occur immediately and not at an ideal point in time, when maybe the effect is lost.

Modern Project Management tools allow building the dependencies between the various activities, and it’s inevitable that a change in one place will cause a chain reaction and lead to a contraction or extension of the plan, and this can happen with each planning iteration. In extremis the end date will oscillate like the lines of a seismograph during an earthquake. It’s natural for this to happen in the first phase of a project; however, it’s the Project Manager’s responsibility to mitigate such variations.
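
The chain reaction can be illustrated with a small forward pass over a dependency graph. The sketch below is an assumption-laden toy example, not how any particular planning tool works: the task names, the durations in days and the `project_end` helper are all hypothetical.

```python
from typing import Dict, List


def project_end(durations: Dict[str, int], predecessors: Dict[str, List[str]]) -> int:
    """Return the earliest day on which all tasks are finished (forward pass)."""
    finish: Dict[str, int] = {}

    def finish_of(task: str) -> int:
        if task not in finish:
            # A task can start only after all of its predecessors have finished.
            start = max((finish_of(p) for p in predecessors.get(task, [])), default=0)
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_of(task) for task in durations)


# Hypothetical plan: durations in days and finish-to-start dependencies.
durations = {"design": 5, "build": 10, "test": 4, "docs": 3}
predecessors = {"build": ["design"], "test": ["build"], "docs": ["design"]}

baseline = project_end(durations, predecessors)
late_build = project_end({**durations, "build": 13}, predecessors)  # 3 days more on the longest chain
late_docs = project_end({**durations, "docs": 6}, predecessors)     # 3 days more on a side activity
print(baseline, late_build, late_docs)  # prints "19 22 19"
```

A delay on the longest chain of dependent activities moves the end date by the same amount, while the same delay on a side activity is absorbed, which is why not every change needs to make the end date swing.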

The project plan is a reflection of the project and how it’s managed; therefore, one needs to give it the proper focus, as often and in as much detail as required.

Referenced quotes:
(1) "In preparing for battle I have always found that plans are useless, but planning is indispensable." (Eisenhower quoted by Nixon)
(2) "No plan ever survived contact with the enemy." (attributed here to Carl von Clausewitz, though the saying is usually credited to Helmuth von Moltke the Elder)
(3) "The enemy of a good plan is the dream of a perfect plan." (Carl von Clausewitz)
(4) "It's a bad plan that admits of no modification." (Publilius Syrus)

🗄️Data Management: Data Integration (Part I: From Disintegration to Integration)

Data Management Series

No matter how tight the integration between the various systems or processes, there will always be gaps that need to be addressed in one way or another. The problems are in general caused by design errors rooted in the complexity of the logic from the integration layer or from the systems integrated. The errors can range from missing or incorrect validation rules, mappings and parameters to data quality issues.

A unidirectional integration involves distributing data from one system (aka publisher) to one or more systems (aka subscribers), while in bidirectional integrations systems can act as both publishers and subscribers, resulting in complex data flows with multiple endpoints. In the simplest integrations the records flow one-to-one between systems, though more complex scenarios can involve logic based on business rules, mappings and other types of transformations. The challenge is to reflect the states as needed by each system with minimal involvement from the users.

Typically, it falls within the responsibilities of application/process owners or key users to make sure that the integration works smoothly. When the integration makes use of interface or staging tables, they can be used as a starting point for the troubleshooting; however, even then the troubleshooting can be troublesome and involve considerable manual effort. When possible the data can be exported manually from the various systems and matched in Excel or similar solutions. This often leads to personal or departmental solutions that are hard to maintain, control and support.

A better approach is to automate the process by importing the data from the integrated systems at regular points in time into the same database (much like in a data warehouse), modeling the entities and the needed logic in there, and reporting the differences. Even if this approach involves a small investment in the beginning and some optimization in logic or performance over time, it can become a useful tool for troubleshooting the differences. Such solutions can be used successfully in multiple integration scenarios (e.g. web shop or ERP integrations).
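
A minimal sketch of such a comparison is shown below, using Python's built-in `sqlite3` module as a stand-in for the common database. The table names `erp_items` and `shop_items`, their columns and the sample rows are assumptions made purely for illustration; a real solution would import the data from the integrated systems on a schedule.

```python
import sqlite3

# In-memory database standing in for the common reporting database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE erp_items  (item_id TEXT PRIMARY KEY, price REAL);
CREATE TABLE shop_items (item_id TEXT PRIMARY KEY, price REAL);
INSERT INTO erp_items  VALUES ('A', 10.0), ('B', 20.0), ('C', 30.0);
INSERT INTO shop_items VALUES ('A', 10.0), ('B', 25.0), ('D', 40.0);
""")

# One report per entity: records missing on either side plus attribute mismatches.
DIFF_QUERY = """
SELECT e.item_id, 'missing in shop' AS issue
FROM erp_items e LEFT JOIN shop_items s ON s.item_id = e.item_id
WHERE s.item_id IS NULL
UNION ALL
SELECT s.item_id, 'missing in ERP'
FROM shop_items s LEFT JOIN erp_items e ON e.item_id = s.item_id
WHERE e.item_id IS NULL
UNION ALL
SELECT e.item_id, 'price mismatch'
FROM erp_items e JOIN shop_items s ON s.item_id = e.item_id
WHERE e.price <> s.price
ORDER BY 1
"""

differences = conn.execute(DIFF_QUERY).fetchall()
for item_id, issue in differences:
    print(item_id, issue)
```

For the sample rows the report lists item B with a price mismatch, item C as missing in the shop and item D as missing in the ERP system, which are the kinds of differences the users would then categorize.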

A set of reports for each entity can help identify the differences between the various systems. Starting from the reported differences, the users can identify, categorize and devise specific countermeasures for the various issues. The best time to have such a solution is shortly before or during UAT. This would allow making sure that the integration layer really works, and helps correct the issues as long as they still have a small impact on the systems. Some integration issues might even lead to a postponement of the Go-Live. The second-best time is when the first important issues are found, as the issues can be used as support for a Business Case for implementing this type of solution.

In general, it’s recommended to fix the problems in the integration layer and use the reports only for troubleshooting and for assuring that the integration runs smoothly. There are however situations in which the integration problems can’t be fixed without creating more issues. It’s the case in which multiple systems are involved and integrated over an integration bus.

One extreme approach, not advisable though, is to build a second integration to correct the issues of the first. This solution might work in theory; however, the risk of multiplying the issues is really high and the complexity of troubleshooting increases with the degree of dependency between the two integrations. It would be more advisable to rebuild the integration anew, though this approach too has its advantages and disadvantages.

The bottom line is that integration issues should be addressed while they are small and that an automated solution for comparing the data can help in the process.

24 April 2019

💼Project Management: Project Execution (Part V: The Butterflies of Project Management)

Expressed metaphorically as “the flap of a butterfly’s wings in Brazil set off a tornado in Texas”, in Chaos Theory the “butterfly effect” is a hypothesis rooted in Edward N. Lorenz’s work on weather forecasting and used to depict the sensitive dependence on initial conditions in nonlinear processes, systems in which the change in output is not proportional to the change in input.
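
Sensitive dependence on initial conditions can be observed even in very simple nonlinear systems. The sketch below uses the logistic map x(n+1) = r·x(n)·(1 − x(n)) with r = 4 as a stand-in; it is not Lorenz’s weather model, and the starting value, the size of the disturbance and the number of steps are arbitrary choices made for illustration.

```python
from typing import List


def logistic_orbit(x0: float, steps: int, r: float = 4.0) -> List[float]:
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two runs whose initial conditions differ only in the tenth decimal place.
baseline = logistic_orbit(0.2, 60)
disturbed = logistic_orbit(0.2 + 1e-10, 60)

gaps = [abs(a - b) for a, b in zip(baseline, disturbed)]
# First step at which the two runs differ by more than 0.1.
first_visible = next(n for n, gap in enumerate(gaps) if gap > 0.1)
print(f"initial gap: {gaps[0]:.1e}, gap above 0.1 reached at step {first_visible}")
```

The two runs are practically identical for the first iterations, after which the tiny initial difference grows until the trajectories have nothing in common anymore.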

Even if overstated, the flapping of wings advances the idea that a small change (the flap of wings) in the initial conditions of a system cascades to a large-scale chain of events leading to large-scale phenomena (the tornado). The chain of events is known as the domino effect and represents the cumulative effect produced when one event sets off a chain of similar events. If the butterfly metaphor doesn’t catch on, maybe it’s easier to visualize the impact as a big surfing wave – it starts small and increases in size to the degree that it can bring a boat to the shore or drown an armada under its force.

Projects start as narrow activities; however, the longer they take and the broader they become, the more they tend to accumulate force and behave like a wave, having the force to push or drown an organization in the flood that comes with it. A project is not only a system but a complex ecosystem - an aggregation of living organisms and nonliving components with complex interactions forming a unified whole with emergent behavior deriving from the structure rather than its components - groups of people tend to self-organize, to swarm in one direction or another, much like birds do, while knowledge seems to converge from unrelated sources (aka consilience).

Quite often ignored, the context in which a project starts is very important, especially because these initial factors or conditions can have a considerable impact reflected in people’s perception regarding the state or outcomes of the project, perception reflected eventually also in the decisions made during the later phases of the project. The positive or negative auspices can be easily reinforced by similar events. Given the complex correlations and implications, aspects not always correctly perceived and understood can have a domino effect.

The preparations for the project start – the Business Case, setting up the project structure, communicating the project’s expectations and addressing stakeholders’ expectations, the kick-off meeting, the approval of the needed resources, the knowledge available in the team – all have a certain influence on the project. A bad start can haunt a project long after its start, even if the project is on the right track and makes a positive impact. Conversely, a good start can overshadow some mishaps on the way; however, there’s also the danger that the mishaps are ignored and have a greater negative impact on the project. It may look like common sense, but the first impression often counts and is kept in people’s memory for a long time.

As people are more perceptive of negative than of positive events, the chances are higher that a multitude of negative aspects will have a bigger impact on the project. It’s again something that one can address as the project progresses. It’s not necessarily about control but about being receptive to the messages around and about allowing people to give (constructive) feedback early in the project. It’s about using the positive force of a wave and turning a negative flow into a positive one.

Being aware of the importance of the initial context is just a first step toward harnessing the power of waves or winds; it takes action and leadership to pull the project in the right direction.

22 April 2019

💼Project Management: Tools (Part I: The Choice of Tools in Project Management)

“Beware the man of one book” (in Latin, “homo unius libri”) is a warning generally attributed to Thomas Aquinas, having a twofold meaning. In its original interpretation it referred to people mastering a single chosen discipline; however, the meaning degenerated into expressing the limitations of people who master just one book and thus have a limited toolset of perspectives, mental models or heuristics. This latter meaning is better reflected in Abraham Maslow’s adage: “If the only tool you have is a hammer, you tend to see every problem as a nail”, as people tend to use the tools they are used to also in situations in which other tools are more appropriate.

The stubbornness of people and even organizations in using the same tools in totally different scenarios while expecting the same results, as well as in similar scenarios while expecting different results, is sometimes admirable. It’s true, Mathematics has proven that the same techniques can be used successfully in different areas; however, a mathematician’s universe and models are idealized and detached to a certain degree from reality, full of simplified patterns and never-ending approximations. In contrast, the universe of Software Development and Project Management has a texture of complex patterns with multiple levels of dependencies and constraints, constraints highly sensitive to the initial conditions.

Project Management has managed to successfully derive tools like methodologies, processes, procedures, best practices and guidelines to address the realities of projects; however, their use in practice seems to be quite challenging. Probably, the challenge resides in the stubbornness of not adapting the tools to the difficulties and tasks met. Even if the same phases and multiple similarities seem to exist, the process of building a house or another tangible artefact is quite different from the approaches used in the development and implementation of software.

Software projects have high variability and are often explorative in nature. The end-product looks totally different from the initial scaffold. The technologies used come with opportunities and limitations that are difficult to predict in the planning phase. What seems to work on paper often doesn’t work in practice, as the devil typically lies in the details. The challenges and limitations vary between industries, businesses and even projects within the same organization.

Even if for each project type there’s a methodology more suitable than another, in the end project particularities might pull the choice in one direction or another. Business Intelligence projects for example can benefit from agile approaches, as they make it possible to better manage and deliver value by adapting the requirements to business needs as the project progresses. For such projects an agile approach works almost always better than a waterfall process. In contrast, ERP implementations seldom benefit from agile methodologies given the complexity of the project, which makes planning a real challenge; however, this depends also on an organization’s dynamics.

Especially when an organization has good experience with a methodology there’s the tendency to use the same methodology across all the projects run within the organization. This results in chopping down a project to fit an ideal form, which might be fine as long as the particularities of each project are adequately addressed. Even if one methodology is not appropriate for a given scenario it doesn’t mean it can’t be used for it; however, the cost, time, effort and the quality of the end-results also enter the final equation.

In general, one can cope with complexity by leveraging a broader set of mental models, heuristics and tools, and this can be done only through experimentation, through training and exposing employees to new types of experiences, through openness, through adapting the tools to the challenges ahead.

21 April 2019

💼Project Management: Project Planning (Part II: Planning Correctly Misunderstood II)

Even if planning is the most critical activity in Project Management it seems to be also one of the most misunderstood concepts. Planning is critical because it charts the road ahead in terms of what, when, why and who, being used as a basis for action, communication, for determining the current status with respect to the initial plan, as well as the critical activities ahead.

The misunderstandings perhaps also derive from the fact that each methodology introduces its own approach to planning. PMI, as the traditional approach, talks about baseline planning with respect to scope, schedule and costs, and about management plans which, besides the themes covered in the baseline, also focus on quality, human resources, risks, communication and procurement; separate plans can be developed for requirements, change and configuration management, as well as for process improvement. To these one can add action and contingency planning.

In Prince2 the product-based planning is done at three levels – project, stage and team – while separate plans are made for exceptions in case of deviations from any of these plans; in addition there are plans for communication, quality and risk management. Scrum uses an agile approach looking at the product and sprint backlogs, the progress being reviewed in stand-up meetings with the help of a burn-down chart. There are also other flavors of planning like rapid application planning considered in Extreme Programming (XP), with an open, elastic and nondeterministic approach. In Lean planning the focus is on maximizing the value while minimizing the waste, this being done by focusing on the value stream, the complete list of activities involved in delivering the end product, the value stream's flow being mapped with the help of visualization techniques such as Kanban, flowcharts or spaghetti diagrams.

With so many types of planning nothing can go wrong, can it? However, just imagine customers' confusion when dealing with a change of methodology, especially when the concepts sound fuzzy and cryptic! Unfortunately, programmers and consultants also seem to be bewildered by the various approaches and the philosophies supporting the methodologies used, their insecurity doing no service to the project or to customers’ peace of mind. A military strategist would more likely look puzzled at the whole unnecessary plethora of techniques. In the field an army has to act with the utmost concentration and speed, to which are added principles like directedness, maneuver, unity, economy of effort, collaboration, flexibility, simplicity and sustainability. It’s what Project Management fails to deliver.

As with projects, the plan made before the battle seldom matches the reality in the field. Planning is an exercise needed to divide the strategy into steps, echelon and prioritize them, evaluate the needed resources and coordinate them, understand the possible outcomes and risks, evaluate solutions and devise actions for them. With good training, planning and coordination, each combatant knows his role in the battle and has a rough idea about difficulties, targets and possible ways to achieve them, while a good combatant always knows the next action. At the same time, the leader must have visibility over how the fight unfolds, know the situation in the field and how much it has diverged from the initial plan; when the variation is considerable he must change the plan by changing the priorities and making better use of the resources available.

Even if there are multiple differences between the two battlefields, projects follow the same patterns of engagement at different scales. Probably, Project Managers can learn a great deal by studying the classical combat strategists, and hopefully the management of projects would be more effective and efficient if the imperatives of planning and of management were better understood and addressed.

#️⃣Software Engineering: Programming (Part VIII: Pair Programming)

Software Engineering Series

“Two heads are better than one” – a proverb whose wisdom is embraced today in the various forms of harnessing collective intelligence. The use of groups in problem solving is based on principles like “the collective is more than the sum of its individuals” or “crowds are on average better at estimations than the experts”. All well and good; the same reasoning has led to the idea of having two developers working together on the same piece of code – one doing the programming while the other looks over the shoulder as an observer or navigator (whatever that means), reviewing each line of code as it is written, strategizing or simply being there.

This approach is known as pair programming and is considered an agile software development technique, adhering thus to the agile principles (see the agile manifesto). Beyond some intangible benefits, its intent is to reduce the volume of defects in software and thus ensure an acceptable quality of the deliverables. It’s also an extreme form of the peer review concept.

Without considering whether pair programming adheres to the agile principles, the concept has several big loopholes. The first time I read about pair programming it took me some time to digest the idea – I was asking myself what programmer would do that on a daily basis, watching other programmers code or being watched while coding, each line of code being followed by questions, affirmative or negative nodding… Beyond their status as lone wolves, programmers can cooperate when the tasks ahead require it; however, asking a programmer to actively watch others program won’t work in the long run!

Talking from my own experience as a programmer and as a professional working together with other programmers, I know that a programmer sees each task as a challenge, a way of learning, of reaching beyond his own condition. Programming is a way of living, with its pluses and minuses.

Moreover, the complexity of the tasks doesn’t come down to handling the programming language but to solving the right problem. Solving the right problem is not something one can achieve with brute force but with intelligence. If using the programming language is the challenge then the problem lies somewhere else and other countermeasures must be taken!

Some studies have found that the use of pair programming led to a reduction of defects in software; however, the numbers are misleading as long as they compare apples with pears. To conclude statistically that one method is better than the other means doing the same experiment with the different methods using a representative population. Unless one addresses the requirements of statistics the numbers advanced are just fiction!

Just think again about the main premise! One doubles the expenditure for a theoretical reduction of the defects?! Actually, it's more than double considering that different types of communication take place. Without a proven basis, the effort can be somewhere between 2.2 and 2.5 times that of a single programmer, and for an average project this can be a lot! The costs might be bearable in situations in which labor is cheap; however, programmers’ cooperation is a must.

The whole concept of pair programming seems like a bogus idea, just like two drivers driving the same car! This approach might work when the difference in experience and skills between developers is considerable, a situation met in universities or apprenticeship environments, in which the accent is put on learning and forming. It might work for handling complex tasks, as some proponents declare; however, even then it is less likely that the average programmer will willingly do it!


19 April 2019

🌡Performance Management: Mastery (Part I: The Need for Perfection vs. Excellence)

A recurring theme I have met in various contexts over the years seems to be tied to the need for perfection, a need that sometimes goes in extremis beyond common sense. The simplest theory attempting to explain at least some of these situations is that people tend to confuse excellence with perfection, from this confusion deriving false beliefs, false expectations and unhealthy behavior.

Beyond the fact that each individual has an illusory image of what perfection is about, perfection is in certain situations a limiting force rooted in an idealistic way of looking at life. Primarily, perfection implies that we will never be good enough to reach it, as we are striving for something that doesn’t exist. From this come the external and internal criticism, criticism that instead of helping us build something drains our energy to the extent that it destroys all we have built over the years with considerable effort. Secondarily, in the long run, perfection has the tendency to steal our inner peace and balance, letting fear take over – the fear of making mistakes, of losing the acceptance and trust of others. It focuses on our faults, errors and failures instead of driving us to our goals. In extremis it reveals the worst in people, actors and spectators alike.

Close in meaning though diametrically opposed in its implications, excellence focuses on our goals, on the aspiration of aiming higher without implying a limit to it. It’s a shift of attention from failure to possibilities, to what matters, to reaching our potential, to acknowledging the long way covered. It allows us to build upon former successes and failures. Excellence is what we need to aim at in personal and professional life. Will Durant, explaining Aristotle, said that: “We are what we repeatedly do. Excellence, then, is not an act, but a habit.”

People who attempt giving 100% of their best to achieve a (positive) goal are to be admired; however, the proximity of 100% is only occasionally achievable, hopefully when needed the most. 100% is another illusory limit we force upon ourselves as it’s correlated to the degree of achievement, completeness or quality an artefact or result can ideally have. We rightly define quality as the degree to which something is fit for purpose. Again, it is a moving target that needs to be made explicit before we attempt to reach it; otherwise quality envisions perfection rather than excellence, and effort is wasted.

Considering the volume of effort needed to achieve a goal, Pareto’s principle (aka the 80/20 rule) seems to explain best its underlying forces. The rule states that roughly 80% of the effects come from 20% of the causes. A corollary is that we can achieve 80% of a goal with 20% of the effort needed altogether to achieve it fully. This means that to achieve the remaining 20% of the goal we need to invest four times the effort already spent. This rule seems to govern the elaboration of concepts, designs and other types of documents, and I suppose it can be easily extended to other activities like writing code, cleaning data, improving performance, etc.
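The arithmetic behind the corollary can be made explicit with a minimal sketch. It assumes the idealized 80/20 split holds exactly, which real work seldom does; the function name, its parameters and the dictionary keys are purely illustrative.

```python
def pareto_effort(effect_pct: int = 80, effort_pct: int = 20) -> dict:
    """Compare the effort behind the first part of a goal with the rest.

    Assumes an idealized split: `effect_pct` percent of the goal is
    reached with `effort_pct` percent of the total effort.
    """
    remaining_effect = 100 - effect_pct   # part of the goal still open
    remaining_effort = 100 - effort_pct   # effort still to be invested
    return {
        # how many times the effort already spent is still needed
        "effort_multiplier": remaining_effort / effort_pct,
        # percent of the goal gained per percent of effort, first part
        "yield_first_part": effect_pct / effort_pct,
        # percent of the goal gained per percent of effort, remaining part
        "yield_remaining_part": remaining_effect / remaining_effort,
    }


result = pareto_effort()
print(result)
```

Under these assumptions the remaining 20% of the goal costs four times the effort already spent, and each unit of effort yields a sixteenth of what it yielded in the first part (0.25 versus 4).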

Given the complexity, urgency and dependencies of the tasks or goals before us, it's probably beneficial sometimes to focus first on the 80% of their extent, so we can make progress, and to focus on the remaining 20% if needed, when needed. This approach can allow us to make progress faster in incremental steps. Also, in time, through excellence, we can bridge the gap between the two numbers as less time and effort are needed in the process.

18 April 2019

🪧Meta-Blogging: Mea Culpa (Part I: Changing the Status Quo)

During the past years I started multiple posts on various programming-related topics though I seldom managed to bring something close to a publishable form. The main reason seems to be the lack of time needed to put an idea into words, to look at it from different perspectives in the form of a logical meaningful unit and, last but not least, make it count. This is accentuated by the fact that each idea pulls another, and often there are so many things to say that it’s hard to draw the line between what to include and what to leave out. In extremis one feels that something is missing.

Often, a certain amount of research is required to validate or support the facts. The knowledge about SQL Server and other DBMS is relative – it can be only relative as long as their internals are known only to a certain degree. The relativity is found also in the area of applicability, the choice of one solution over another lying in the details. Readers want solid facts while all one can give is a dry “it depends”…

Unfortunately, for a blogger not found close to the source of knowledge, the content posted tends to be third or fourth-hand knowledge and, in one form or the other, just duplication of information. As long as content isn’t copied and there’s some personal touch the duplication is not necessarily a bad thing. Duplication makes knowledge more likely to be found as the content is indexed by search engines; however, it becomes more difficult to stand out in the crowd. To bring something new one must put existing knowledge into new contexts, be creative, and this takes time as well.

Without access to a pool of readers and of knowledge, it’s hard for a lone blogger to succeed, giving up being just a few posts or a few years away. Of course, life tends to take over. It’s also in human nature to be enthusiastic about an idea and give it up at the first difficulties met. On the other hand, it’s often hard to keep or find the needed motivation, especially when there is little support coming from the blogging platforms, tools creators or content publishers. Not being able to monetize one’s effort makes blogging more of a hobby.

With small exceptions, the investments made in blogging tools are below expectations. It’s frustrating when the tools or the integration between them stop working and there’s no simple way to overcome this. Some aspects have changed with time; however, blogging seems to be losing ground to other forms of media content.

Despite the lack of time and other difficulties I want to write and share my thoughts and my experience, and make the time invested in learning and solving problems count. Blogging is also a way of externalizing implicit knowledge, of sharing, of questioning some of the ideas and practices, and ultimately of getting feedback. In this resides the personal value of blogging!

In the fight with time and words, I found myself forced to limit the length of the posts on some random nontechnical topics to 600 words. This number is rooted in the university years, representing the approximate limit within which a written assignment can reach an acceptable quality and coverage while involving a bearable amount of effort. 600 is not a perfect number, unlike its leading digit, though for the time being it will do.

The challenge is to find a context to express my thoughts and experience without being too boring, without skimming through ideas. Without carrying great expectations, it’s an attempt to change the status quo!


12 March 2019

🧭Business Intelligence: Enterprise Reporting (Part XII: Reports’ Lifecycle)

Introduction

A report’s lifecycle is the sequence of stages through which a report goes during the timespan of its ownership. The main stages come down to the report’s definition, development, testing and deployment; however, a report’s life occurs within the context of IT processes like Change, Incident/Problem, Access, Availability, Information Security and Knowledge Management. To these can be added Data Management processes like Data Governance, Data Quality and Metadata Management. Therefore, the extended reports’ lifecycle could take the following form:

The processes can be easily tailored to an organization’s needs, even if it may take several attempts until the best mix is found. The activities introduced by the supporting processes don’t necessarily change the way reports are developed as long as the processes integrate smoothly into report authoring.

Definition Phase

The lifecycle of a report starts with a series of steps that lead to the report’s definition and the requirements associated with it:

The starting point is the identification of a need for data. It can be a business question that needs to be answered, a decision that needs to be made, data needed to keep an operational, tactical or strategic objective under control, and so on. Such business situations can be referred to simply as (business) problems.

Problem definition

Problem definition (statement) is the process by which a business issue or need is clearly and concisely stated. This step might seem trivial and implied; however, in practice it accounts for the largest share of rework.

The dictum “a problem well stated is a problem half-solved” applies as well in the BI field. Unfortunately, there are cases in which the users want something other than what they stated, or they leave important details out. Sometimes the users aren’t sure what they need/want, and it falls to the developer to help clarify the problem and put it within a context.

There are cases in which the users just request a report without specifying the problem they need to solve. This might do when the user has a good understanding of the data and the problem; however, this approach does not always work. Personally, I find it useful to also define the underlying problem for each report. I see it as a “win-win” situation in which the user invests some knowledge in the developer, the developer thus gets to understand the business better, and in time he can provide better help. A thorough understanding of the business and knowledge of the users and their needs can help minimize the volume of rework involved in reports’ development.

Requirements definition

Requirements definition is the process by which functional and non-functional expectations, targets and specifications are elicited and documented.

Functional requirements specify what the report must do - how the report is structured or formatted, how the data need to be visualized or navigated, which file formats the report needs to be exported to, whether it needs to be printed, how the data need to be grouped and in which order, in what currency/language they need to be displayed, what data sources need to be used, etc. The functional requirements are typically listed in the use case and test script.

Non-functional requirements refer to requirements related to a report’s accessibility, availability, performance, compliance, documentation, quality, maintainability, security or testability.

The degree to which a requirement can be fulfilled depends entirely on the reporting platform. One can differentiate between soft and hard constraints. Soft constraints can be overcome by adding more processing power, memory or other types of resources, while hard constraints can’t be overcome easily, or at all. Of course, not all requirements are equally important. Important requirements that are not fulfilled can make a report unusable and, in extremis, can lead to choosing one reporting platform over another.

The requirements can be elicited by a developer, an analyst/consultant or defined by the business itself. Organizations can simplify the process by defining a set of guidelines and standards that need to be considered in reports’ definition. Normally, it is enough to reference the document(s) where the guidelines and standards are found. In contrast to other software artifacts, the requirements for reports can be gathered in a simplified version of a document. Quite often a checklist can help identify these requirements upfront with a minimum of overhead.

Report definition

Report definition is the process by which a report’s content, logic and layout are explicitly defined - what attributes are needed for output and from what source, what static/dynamic parameters are needed, how the data need to be displayed/formatted, what formulas, aggregations or ordering apply.

A report’s definition can be anything between a simple statement summarizing what the report is about and complex structures (mainly in the form of a mapping) reflecting in detail each attribute, constraint, formula, grouping or sorting.

A good definition should allow a developer to create the report as needed by the users, eventually with minimal deviations implied by the user’s understanding. The holy grail in a report’s definition is finding a structure flexible enough to cover all the aspects of a report. Even if some structures allow such flexibility, sometimes it’s almost impossible not to provide additional descriptions in textual form. The less insight the developer has into the business, the more textual descriptions and visuals need to be included to bridge the knowledge gap.
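As an illustration of what such a mapping-like structure might look like, here is a minimal sketch in Python. The class names, fields, report name and source columns are all hypothetical and not taken from any reporting tool; the free-text description field stands for the textual additions that a structure alone cannot express.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeMapping:
    """One output column of the report and where it comes from."""
    output_name: str        # label shown in the report
    source: str             # hypothetical table/view and column
    formula: str = ""       # optional calculation or aggregation
    description: str = ""   # free text for what the structure can't express


@dataclass
class ReportDefinition:
    """A report's content and logic captured as a simple mapping."""
    name: str
    summary: str
    parameters: list = field(default_factory=list)
    attributes: list = field(default_factory=list)
    grouping: list = field(default_factory=list)
    sorting: list = field(default_factory=list)

    def undocumented_formulas(self) -> list:
        """Attributes that carry a formula but no textual explanation."""
        return [a.output_name for a in self.attributes
                if a.formula and not a.description]


# Illustrative definition; every name below is made up for the example
definition = ReportDefinition(
    name="Open Sales Orders",
    summary="Open order lines per customer",
    parameters=["date_from", "date_to"],
    attributes=[
        AttributeMapping("Customer", "Sales.Customers.Name"),
        AttributeMapping("Open Amount", "Sales.OrderLines.Amount",
                         formula="SUM(Amount)"),
        AttributeMapping("Margin %", "Sales.OrderLines.Margin",
                         formula="SUM(Margin) / SUM(Amount)",
                         description="Excludes cancelled lines"),
    ],
    grouping=["Customer"],
    sorting=["Customer"],
)
print(definition.undocumented_formulas())
```

The small helper only hints at how a structured definition can be checked for gaps before development starts; in this example it flags the "Open Amount" attribute, whose formula has no accompanying description.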

GAP Analysis

GAP Analysis is the iterative process by which the current state of a software artifact or situation is compared with the potential or desired state. It has become so integral to professionals’ thinking that its role as a separate process is quite often ignored. In the context of report authoring it can be used when comparing the requirements against the current infrastructure and the data available, as well as when comparing the developed report against the requirements.

It can happen that the technical and data constraints don’t allow building the report as needed by the users. The differences need to be mitigated and eventually the requirements need to be changed to accommodate the reality. In extremis, one must consider whether the report still makes sense in the light of the modified requirements.
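In its simplest form, the comparison of required against available data comes down to set operations. The following is a minimal sketch under that assumption; the attribute names are invented for illustration, and a real analysis would also look at data quality, granularity and technical constraints.

```python
def gap_analysis(required: set, available: set) -> dict:
    """Compare the attributes a report needs with those the sources offer."""
    return {
        "covered": sorted(required & available),   # can be delivered as is
        "missing": sorted(required - available),   # gaps to mitigate
        "unused": sorted(available - required),    # available but not requested
    }


# Hypothetical requirement and source attributes
required = {"customer", "order_date", "net_amount", "margin"}
available = {"customer", "order_date", "net_amount", "currency"}

gaps = gap_analysis(required, available)
print(gaps)
```

Each entry under "missing" is a candidate either for an infrastructure change or for a change in the requirements.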

Solution formulation

Solution formulation is the process by which a formal (technical) solution is defined for the given requirements. It’s a conceptualization (aka concept) of the requirements, and in many cases it’s just a short description of the means by which the report will be built and what data sources will be used. In more complex cases it can include details about the changes needed in the infrastructure to support the report (e.g. creation/extensions of tables and other database objects, ETL jobs, components, etc.), about the data that need to be collected, etc.

Of course, the conceptualization must be considered together with the report’s definition. In fact, the report’s definition can be considered as part of the conceptualization. A conceptualization can cover multiple reports, just as two or more different solutions can be provided for different sets of reports. The infrastructure can make a concept superfluous, either when there is a single reporting platform or when clear rules are in place.

Prototyping

Prototyping is the iterative process of building a simplified version of the report for demonstration and evaluation purposes, so that users can better define the requirements or to prove the concept. The prototype is a preliminary version that can be refined successively until the user’s requirements have a final form. It can take the form of a mock-up query to verify the report’s technical and logical feasibility, and/or an Excel layout to depict what the report will look like. Prototypes can facilitate the communication between the parties involved and can be considered as part of the requirements.

A prototype might be needed in 1 out of 5 cases or so; however, this number depends also on the number of queries available or on the knowledge of the source and business processes. Because a prototype can involve additional work, it’s important to identify those cases in which a prototype makes sense and keep the effort to a minimum, especially when an approval is involved in the process. Therefore, one should consider the most important characteristics that need to be proved (e.g. if the data can be aggregated, matched, displayed at the requested level of detail, or in the requested format).
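A mock-up query of the kind mentioned above can be as small as the following sketch. It uses Python's built-in sqlite3 module with an in-memory database; the orders table, its columns and the sample rows are invented for illustration and stand in for whatever source the report would actually use. The only thing it proves is that the data can be aggregated at the requested level of detail (customer and month).

```python
import sqlite3

# In-memory database standing in for the real source (assumption)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER, customer TEXT, order_month TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'A', '2019-01', 100.0),
        (2, 'A', '2019-01', 50.0),
        (3, 'B', '2019-01', 75.0),
        (4, 'A', '2019-02', 20.0);
""")

# Mock-up query: can the data be aggregated per customer and month?
rows = conn.execute("""
    SELECT customer, order_month,
           COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer, order_month
    ORDER BY customer, order_month
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

A result like this can be pasted into an Excel layout and shown to the users before any further effort is invested.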

With the help of self-service tools, the business has the capabilities to play with the data and find answers by itself, being thus able to create a prototyped version of the report. Once the report meets business needs it can be standardized so it can be used organization-wide. It’s recommended to standardize the reports that are used as part of the organization’s processes, otherwise self-service can become a bottleneck for the organization.

Change Management

Change Management is the process of ensuring that the changes made to a system, in this case a BI tool or the whole BI infrastructure, are performed with minimal disruption for the business and that risks are kept under control. Changes can be requested via standard requests or change requests. A standard request (SR) is a pre-approved change that involves low risks, is relatively common and follows a predefined procedure. In contrast to SRs, a change request (CR) requires the authorization of a board, e.g. the Change Advisory Board (CAB), often involves risks and an investment, and follows an approach that is not that common.

Both are hard-copy or electronic templates that make it possible to capture information about the changes, document them and track their status. They typically include the problem definition together with users’ requirements, the report definition and the formulation of the solution. What differentiates them is thus the approval process, which can sometimes be time-consuming, and the volume of formalism needed to manage the requests (e.g. tracking status, writing status reports, handling risks, etc.).

Unless infrastructural changes are necessary, the risks involved with the creation of reports are relatively small, especially when the reports are developed in-house. Reports developed by vendors involve more risks and imply investments that in one form or another need to be approved. Considering the particularities of the two approaches, personally I think that reports that can be developed with internal resources should be done via SRs, while reports developed externally should be done via CRs. Even if this categorization has the potential of creating some confusion, the use of SRs allows reducing the volume of effort necessary to manage the requests. I suppose solutions can be found to request external changes via SRs as well (e.g. by using contingents and a set of well-defined rules).
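The routing suggested above can be written down as a simple rule. This is only a sketch of my reading of it: the function and parameter names are hypothetical, and treating any infrastructural change as a CR is an added assumption derived from the remark about risks, not a rule taken from a change management standard.

```python
def request_type(developed_in_house: bool,
                 needs_infrastructure_change: bool = False) -> str:
    """Suggest the request type for a new report.

    SR = standard request (pre-approved, low risk)
    CR = change request (needs board approval)
    """
    # Assumption: infrastructural changes raise the risk, so they go via CR
    if developed_in_house and not needs_infrastructure_change:
        return "SR"
    return "CR"


print(request_type(developed_in_house=True))
print(request_type(developed_in_house=False))
print(request_type(developed_in_house=True, needs_infrastructure_change=True))
```

An organization that allows external changes via SRs would extend the rule with its own conditions, such as a remaining contingent.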

06 February 2019

🤝Governance: COBIT (Definitions)

"An IT governance framework and supporting toolset that allows managers to bridge the gap between control requirements, technical issues, and business risks. COBIT enables clear policy development and good practice for IT control throughout organizations. COBIT is managed by the IT Governance Institute and the Information Systems Audit and Control Foundation® (ISACF)." (Tilak Mitra et al, "SOA Governance", 2008)

"COBIT is a set of standards from the IT Governance Institute relating to IT Governance. It defines a set of governance control objectives to help guide the IT organization in making appropriate decisions for each domain." (Martin Oberhofer et al, "Enterprise Master Data Management", 2008)

"An internationally accepted IT governance and control framework that aligns IT business objectives, delivering value and managing associated risks." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"An IT framework with a focus on governance and managing technical and business risks." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A management framework used for IT governance. COBIT 5 is based on five principles and provides organizations with a set of good practices they can apply to IT management and IT governance." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"A process-based information technology governance framework that represents a consensus of experts worldwide. It was codeveloped by the IT Governance Institute and ISACA." (Robert F Smallwood, "Information Governance: Concepts, Strategies, and Best Practices", 2014)

"A framework that provides best practices for IT governance and control." (Weiss, "Auditing IT Infrastructures for Compliance" 2nd Ed., 2015)

"Provides guidance and best practice for the management of IT processes" (ITIL)

30 January 2019

🤝Governance: Compliance (Definitions)

"(1) Conforming or acquiescing to requirements from a third party. (2) A subset of data retention policies and procedures that must adhere to more rigid and rigorous conditions." (David G Hill, "Data Protection: Governance, Risk Management, and Compliance", 2009)

"The successful fulfillment of regulations, usually set by a financial institution (for borrowing purposes) or industry standards." (Annetta Cortez & Bob Yehling, "The Complete Idiot's Guide® To Risk Management", 2010)

"The process of conforming, completing, performing, or adapting actions to meet the rules, demands, or wishes of another party. Commonly used when discussing conformance to external government or industry regulations." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures 2nd Ed", 2012)

"The ability to operate in the way defined by a regulation. Many organizations are introduced to governance concepts as they begin the process of complying with business regulations, such as Sarbanes-Oxley or Basel II. These regulations are enforced by audits that determine whether business decisions were made by the appropriate staff according to appropriate policies. To pass these audits, organizations must document their decision rights, policies, and records, specifically that each of the decisions was in fact made by the appropriate person according to policy." (Paul C Dinsmore et al, "Enterprise Project Governance", 2012)

"A general concept of conforming to a rule, standard, law, or requirement such that the assessment of compliance results in a binomial result stated as 'compliant' or 'noncompliant'." (For Dummies, "PMP Certification All-in-One For Dummies, 2nd Ed.", 2013)

"Business rules enforced by legislation or some other governing body" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"Compliance refers to a strategy and a set of activities and artifacts that allow teams to apply Lean-Agile development methods to build systems that have the highest possible quality, while simultaneously assuring they meet any regulatory, industry, or other relevant standards." (Dean Leffingwell, "SAFe 4.5 Reference Guide: Scaled Agile Framework for Lean Enterprises 2nd Ed", 2018)

"Ensuring that a standard or set of guidelines is followed, or that proper, consistent accounting or other practices are being employed." (ITIL)

"The capability of the software product to adhere to standards, conventions or regulations in laws and similar prescriptions." [ISO 9126]
