Showing posts with label costs. Show all posts
Showing posts with label costs. Show all posts

11 June 2024

Business Intelligence: Is Microsoft Fabric Ready? (Part I: Some Random Thoughts)

Business Intelligence Series
Business Intelligence Series

When writing a Business Case, besides the problem and solution(s) high-level descriptions, is important to rougly estimate how much it costs, how long it takes, respectively how many resources are needed and for what activities. A proof-of-concept (PoC) might not need an explicit business case, though the same high-level information is needed at least for the planning of resources and a formal approval.

Given that there are several analytical experiences in Microsoft Fabric (MF), it’s clear that can’t be anymore a reference architecture that can be recommended for customers. Frankly, that ship has sailed even since the introduction of Microsoft Synapse, if not earlier, with the movement to the cloud. Also, there’s no one size fits all as certain building blocks make sense only in certain scenarios (e.g. organization scale, data volume or source’s type). Moreover, even if MF has been generally available for quite some time, customers and service providers ask themselves whether the available features are enough for building analytics solutions based on it. 

“Is Fabric Ready?” was the topic of today’s Explicit Measures webcast [1]. Probably the answer is as usual “it depends” and the general recommendation is to do a PoC to check solution's feasibility. Conversely, MF may be the best approach to consider if integration with other systems (e.g. Dynamics 365, Dataverse) is needed. 

What the customers need are some rough realistic estimates they can base any planning upon (at least for a PoC if not for the whole project) in terms of making the data available into OneLake, building a semantic model, respectively processing and making the data available for consumption. Ideally, one needs a translation of the various steps as done earlier. For example, how long it takes to make the data available in OneLake, how long it takes to move the data physically or logically though the various layers, to build semantic models, etc. 

Probably, some things can be achieved in a matter of days, at least if one knows what one’s doing. However, we are talking here about a new architecture that may resemble for some of an unknow territory. Even if old and new techniques can be mixed, there are further implications or improvements that can be considered. There are many webcasts, blog posts and other material on how to do things, on what’s possible, though building a functioning solution from beginning to the end, even as PoC, requires more than putting all this together. 

Just making the data flow from point A to B or C is not enough - data security, data governance and a few other topics like scalability and availability need to be considered as well. Security and governance are also the areas in which probably more features must be considered. For many customers starting now with MF, the hope is that most of these features will be available during the time the solutions are ready for production.

From a cost perspective, there’s the cost of data at rest, in transit, the licensing for MF and the other components involved. Ideally, one should start small and increase capacities as needed, though small can vary from case to case, while it’s important to find out the optimum. Starting in the middle could be an alternative approach even if may involve higher costs. If one starts small, the costs for PoC can be neglectable, though sooner or later a compromise is needed to provide an acceptable performance. 

In terms of human resources, the topic is more complex (see [2]), and it depends largely on the nature of the project. The pool of skillsets is the most important constraint or enabler such projects can have.

Previous Post <<||>> Next Post

References:
[1] Explicit Measures (2024) Power BI tips Ep.327: Is Fabric Ready? (link)
[2] Explicit Measures (2024) Power BI tips Ep.321: Building and BI Team (link)

07 March 2024

Data Migrations (DM): The SQL Server Perspective (Licensing Costs and Edition Choices)

Data Migration
Data Migration Series

A Data Migration (DM) moves all or a subset of the data available from one or more system(s) into other system(s). For this purpose, especially in ERP Implementations, one can use a SQL Server as intermediate layer, where SSIS can be used for the data extraction and exporting, SSRS for reporting the errors, while the database engine for the heavy processing. Master Data and Data Quality Services can be used as well in certain scenarios. Therefore, SQL Server allows by design to address the various challenges related to a DM. At high level the architecture can be depicted as follows:

Data Migration Architecture
Data Migration Architecture

Once the decision to go with SQL Server for the DM layer is made, one needs to define which edition to use. If the DM doesn't have special requirements, one can use for it an available SQL Server instance, as long as the cumulated workloads don't create major issues. Therefore, in the past I used existing licensed versions of SQL Server to build solutions for DMs in ERP implementations, though I evaluated in each project whether it's possible to reduce the costs and remain compliant with the license requirements. 

Of course, there's always the alternative of using SQL Server Express which supports databases with a maximum of 10 GB, which should be enough for most of DMs, though it has also further limitations (see [2]). There are also ways of moving around existing limitations, like splitting the logic across multiple databases. 

Then there's the SQL Server Developer edition, which involves no license costs, has the full SQL Server functionality available, and can be used to build and test applications. In a recent post [1], Bob Ward, principal architect at Microsoft made several clarifications on the licenses for the Developer edition, which is "licensed for development, test, and demonstration purposes only" and "may not be used in a production environment”. Bob Ward makes the following clarifications:
(1) "Production environments include any system that is accessed by end-users for anything more than acceptance testing, environments that connects to production systems (such as Linked servers), disaster recovery or backups of production systems, and environments that are 'rotated' into production at any point in time." [1]
(2) One "cannot use Developer edition to build test data and move that same data into production" [1].
(3) One can "restore a production set of data backup for testing purposes" [1].

There are two-three impediments for using the Developer edition completely for a DM. The first, at least during Go Live and UAT, one needs to work with data coming directly from the various production environments. Secondly, the data generated by the solution are used primarily for UAT and in a second step for Production, which seems to be against the rule (2), or at least it's a grey area (which might be overlooked by Microsoft). Thirdly, some data from the production environment might need to be imported back into the DM layer for validation or enhancing the entities with data generated in the target systems. 

In what concerns the first issue, the DM solution can always point to the test environments used as source, following that during UAT to copy the databases from production into the test environments. This might be anyway necessary for other purposes. Otherwise, the effort might be considerable and not working in the last phases with the data timeliness might raise other concerns. 

The second issue is a matter of interpretation. The UAT phase makes sure that the data generated by the DM solution respects the criteria for Go Live. If there are no issues, the same data can be used for Go-Live. If for this is required another licensed edition, then an environment can be built only for UAT and Go Live, project phases which usually span over a couple of weeks, unless multiple migrations need to be performed at different time intervals. If the environments are in the cloud, probably the instances can be turned on and off on a as-needed basis. 

One can plan for different environments between Production and Development and the environments can be on the same SQL Server as distinct databases, respectively use the Developer edition for Development, and use a different licensed edition for UAT and Production. This approach involves additional overhead in synchronizing the logic between environments. Conversely, in the case of the DM layer, the same environment can be used from beginning to the end, while the code should/must be backed-up periodically. For multiple migrations based on the same data, one should archive the data after each migration or important phase. 

For the scenarios in which after migration the data are copied back to the DM solution, it's enough to have these steps performed against the UAT target system(s). This should work as long there are no differences in configuration between UAT and Production. There are however exceptions, e.g. data generated by the target systems, for which the values between Prod and UAT are different. At least in Dynamics 365 one can attempt to generate the values in the DM layer and import them as they are into the target system. It worked for many scenarios, though there can be exceptions here as well. 

A more complex scenario is when data from the DM layer needs to be exported to Data Warehouses or similar solutions that can be considered as Production systems. Here a licensed edition seems to be mandatory. For other scenarios in which Master Data and/or Data Quality Services are needed, there's only the option to use the Enterprise or Developer editions.

To summarize, to reduce the overall costs for the DM, consider using an existing licensed SQL Server instance for building the solution. If separates environments need to be built, the Express edition might have some limitations though it can prove to be a viable solutions in many cases. Otherwise, consider the above workarounds for using the Developer edition, including the scenario in which distinct environments are used for Production and Development. 

Resources:
[1] Microsoft Data Platform (2024) How SQL developers can maximize savings, by Bob Ward (link)
[2] Microsoft Learn (2024) Editions and supported features of SQL Server 2022 (link)
[3] Microsoft Learn (2023) Master Data Services and Data Quality Services Features Support (link)

03 November 2007

Software Engineering: Costs (Just the Quotes)

"Clearly, methods of designing programs so as to eliminate or at least illuminate side effects can have an immense payoff in maintenance costs. So can methods of implementing designs with fewer people, fewer interfaces, and hence fewer bugs." (Fred P Brooks, "The Mythical Man-Month: Essays", 1975)

"The second fallacious thought mode is expressed in the very unit of effort used in estimating and scheduling: the man-month. Cost does indeed vary as the product of the number of men and the number of months. Progress does not. Hence the man-month as a unit for measuring the size of a job is a dangerous and deceptive myth. It implies that men and months are interchangeable." (Fred P Brooks, "The Mythical Man-Month: Essays", 1975)

"The design of a digital system starts with the specification of the architecture of the system and continues with its implementation and its subsequent realisation... the purpose of architecture is to provide a function. Once that function is established, the purpose of implementation is to give a proper cost-performance and the purpose of realisation is to build and maintain the appropriate logical organisation." (Gerrit A Blaauw, "Specification of Digital Systems", Proc. Seminar in Digital Systems Design, 1978)

"Economic principles underlie the overall structure of the software lifecycle, and its primary refinements of prototyping, incremental development, and advancemanship. The primary economic driver of the life-cycle structure is the significantly increasing cost of making a software change or fixing a software problem, as a function of the phase in which the change or fix is made." (Barry Boehm, "Software Engineering Economics", 1981)

"Poor management can increase software costs more rapidly than any other factor. Particularly on large projects, each of the following mismanagement actions has often been responsible for doubling software development costs." (Barry Boehm, "Software Engineering Economics", 1981)

"The entire idea behind prototyping is to save on the time and cost to develop something that can be tested with real users. These savings can only be achieved by somehow reducing the prototype compared with the full system: either cutting down on the number of features in the prototype or reducing the level of functionality of the features such that they seem to work but do not actually do anything." (Jakob Nielsen, "Usability Engineering", 1993)

"As a noun, design is the named (although sometimes unnamable) structure or behavior of a system whose presence resolves or contributes to the resolution of a force or forces on that system. A design thus represents one point in a potential decision space. A design may be singular (representing a leaf decision) or it may be collective (representing a set of other decisions). As a verb, design is the activity of making such decisions. Given a large set of forces, a relatively malleable set of materials, and a large landscape upon which to play, the resulting decision space may be large and complex. As such, there is a science associated with design (empirical analysis can point us to optimal regions or exact points in this design space) as well as an art (within the degrees of freedom that range beyond an empirical decision; there are opportunities for elegance, beauty, simplicity, novelty, and cleverness). All architecture is design but not all design is architecture. Architecture represents the significant design decisions that shape a system, where significant is measured by cost of change." (Grady Booch, "On design", 2006)

"Features have a specification cost, a design cost, and a development cost. There is a testing cost and a reliability cost. […] Features have a documentation cost. Every feature adds pages to the manual increasing training costs." (Douglas Crockford, "JavaScript: The Good Parts: The Good Parts", 2008)

"Good software designs accommodate change without huge investments and rework. When we use code that is out of our control, special care must be taken to protect our investment and make sure future change is not too costly."  (Robert C Martin, "Clean Code: A Handbook of Agile Software Craftsmanship", 2008)

"Just as it can accelerate the pace of a project, prototyping allows the exploration of many ideas in parallel. Early prototypes should be fast, rough, and cheap. The greater the investment in an idea, the more committed one becomes to it. Overinvestment in a refined prototype has two undesirable consequences: First, a mediocre idea may go too far toward realization - or even, in the worst case, all the way. Second, the prototyping process itself creates the opportunity to discover new and better ideas at minimal cost." (Tim Brown, "Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation", 2009)

"Modeling is the creation of abstractions or representations of the system to predict and analyze performance, costs, schedules, and risks and to provide guidelines for systems research, development, design, manufacture, and management. Modeling is the centerpiece of systems architecting - a mechanism of communication to clients and builders, of design management with engineers and designers, of maintaining system integrity with project management, and of learning for the architect, personally."  (Mark W Maier, "The Art Systems of Architecting" 3rd Ed., 2009)

"But 'average cost to fix one defect' is a stupid metric […] It makes bad projects look good, and good projects look bad. How? By failing to divide the costs of fixing into two categories: fixed costs of detecting and fixing defects - costs which are the same no matter how buggy or how good the product is - and variable costs, those which you pay for each defect." (Laurent Bossavit, "The Leprechauns of Software Engineering", 2015)

"The fact that software engineering is not like other forms of engineering should really come as no surprise. Medicine is not like the law. Carpentry is not like baking. Software development is like one thing, and one thing only: software development. We need practices that make what we do more efficient, more verifiable, and easier to change. If we can do this, we can slash the short-term cost of building software, and all but eliminate the crippling long-term cost of maintaining it." (David S Bernstein, "Beyond Legacy Code", 2015)

"Making good engineering decisions is all about weighing all of the available inputs and making informed decisions about the trade-offs. Sometimes, those decisions are based on instinct or accepted best practice, but only after we have exhausted approaches that try to measure or estimate the true underlying costs." (Titus Winters, "Software Engineering at Google: Lessons Learned from Programming Over Time", 2020)

"One of the broad truths we’ve seen to be true is the idea that finding problems earlier in the developer workflow usually reduces costs." (Titus Winters, "Software Engineering at Google: Lessons Learned from Programming Over Time", 2020)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.