02 October 2010

Is MS Access or MS Excel the Answer to your Problem?

Introduction
   That’s one of the questions that has followed me for years, as customers quite often ask me to provide an MS Access or MS Excel solution as an answer to a business need. The beauty of this question is that there is no single right answer and, as I have stressed on several occasions, there is not always a straightforward answer to such a question in IT, because the feasibility of an IT solution depends on many variables, typically formulated in terms of business and IT requirements. When a customer asks me to build an MS Access or Excel solution outside of the Office paradigm, I’m rather circumspect, not because they are not great tools, but because they are not adequate for all purposes. I even recommend the two for personal or small-scale solutions, though their applicability should stop right there.
   Maybe I need to explain the terms used in a little more detail. A personal solution is an application developed for personal use, for example to store and maintain the data for a report, to process data automatically, or to automate some other task. By small-scale solutions I’m referring to the following types of applications:
- applications of basic to average complexity that don’t require a complex design or could be developed by a developer with average skills.
- applications that target a small number of users, usually a small group of at most 10-20 concurrent users; occasionally it may be a whole department, or it could be cross-departmental, as long as the previously mentioned conditions are met.

A Short Review
   MS Excel is the perfect tool for storing non-relational tabular data, manipulating data manually or with the help of formulas, doing data analysis with pivoting and charting, or querying various data sources. Its extensibility, based on its DOM (Document Object Model), VBA (Visual Basic for Applications) and its IDE (Integrated Development Environment), Forms, add-ins, in-house or third-party libraries, and the template- and wizard-based approach, makes Excel a powerful development environment. I would say that Excel’s weakness resides in its intrinsic design: the DOM lacks a rich event model, and Excel is mainly a tool for data entry, analysis and reporting, with other types of functionality coming second. Except for a few new features built into Excel itself, the important new functionality comes as add-ons: the SQL Server-based data mining add-in, MS SharePoint Server-based Web Services features like multiuser collaboration, slicers and a few others.
   The extensibility capabilities mentioned above are not particular to Excel but apply to the whole Office family: Access, Word, Outlook, PowerPoint, and even Visio if the “extended family” is considered, each with its own role. Access’ role is that of a flexible relational data storage, querying and reporting solution; its strength lies mainly in the ease of providing a simple UI (User Interface) for maintaining and navigating the data, and in the ease of pulling data from various sources for further analysis. As in the case of Excel, Access’ weakness resides in its DOM and in the fact that it’s not a full RDBMS (Relational Database Management System), with all the consequences deriving from that.

Programming for the Masses
   The great thing about VBA is that non-developers too can successfully venture into developing Office-based applications, since the possibility of learning from the code generated by the “Record Macro” functionality keeps the learning curve gentle. Enabling “non-developers” to build applications makes Office a powerful and at the same time dangerous tool, because such applications can easily be misused. Misuse here refers to the fact that people often attempt to build complex applications in Excel or Access that sooner or later break apart under their own complexity, and that organizations end up with a multitude of such applications with no control over their existence, maintenance, security, etc. Unfortunately the downsides of such applications are discovered late in the process, when the intended functionality turns out not to be available, so people end up reinventing the wheel and patching functionality together in a jumble. With some hard work you could achieve functionality similar to that available in powerful frameworks like .NET, WPF, WCF or Silverlight, to mention the Microsoft technologies I’m somewhat acquainted with. VBA is great, but with time it has become less powerful than VB, C# or C++ (the comparison between VBA and C++ is a little forced), to mention the most important programming languages for writing managed code in .NET. The barriers between the capabilities of the two types of programming languages are somewhat broken down by the possibility of developing add-ins and libraries for MS Office or of using the Office DOM in .NET applications, though few (non-)programmers venture down this path.

The Architectural Perspective
   There is another important perspective, the architectural one: separating data storage, and possibly data processing, from presentation. When using Access or Excel the data storage can also be separated from presentation, though I’ve seen few solutions doing that, the three layers usually coexisting within the same tier. An Access solution can be split in two, one part for the database and the other for UI and processing, allowing more flexibility in terms of architecture, security, version management, etc. Access is good for data presentation and rapid prototyping, though the concept and the data controls are quite old, with several limitations when compared with similar controls available, for example, in .NET. The advantage of using simple drag-and-drop or wizards in Access is long gone, the same functionality existing also in Visual Studio (Express), an environment in which applications can be built with drag-and-drop and wizards too, while also taking advantage of additional built-in features. The database layer could be replaced with a full RDBMS, just as the presentation layer could be replaced with a .NET UI. Isn’t it much easier then to build the architecture around a .NET UI and an RDBMS?!
   Excel is considered by many to be a (relational) database, but is it really so? It’s true that data can be stored in tabular format, with a sheet playing the role of a table, and queried through the various drivers available. However, no primary key is available, there is less control over the data entered, and many other features available in an RDBMS need to be provided programmatically, again reinventing the wheel. As with Access, Excel can be considered for data storage and presentation, though its functionality is reduced compared with that of Access. Many people are used to the data entry mechanism available in Excel, especially when it comes to data manipulation, and want similar functionality in other tools. If this was Excel’s advantage some time ago, that is no longer the case: several rich data grids offer similar data entry functionality which, with some effort, can simulate Excel’s functionality to an acceptable degree, and they can also provide richer validation functionality.
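
   To make the point about primary keys and data control a bit more concrete, here is a minimal sketch in Python, with the standard sqlite3 module standing in for a full RDBMS. The product table, its columns and the sample rows are made up for illustration; the idea is only that a database rejects bad rows through declared constraints, while a sheet-like list of rows accepts anything unless you write the checks yourself.

```python
import sqlite3

# A sheet-like structure: plain rows, nothing prevents duplicates or bad values.
# (Hypothetical sample data, for illustration only.)
sheet_rows = [
    (1, "Widget", 10),
    (1, "Widget", 10),   # duplicate key, goes unnoticed in a plain sheet
    (2, "Gadget", -5),   # negative quantity, goes unnoticed in a plain sheet
]


def load_into_rdbms(rows):
    """Insert rows into a table with declared constraints.

    Returns a pair (accepted_rows, rejected_rows_with_reason).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE product (
            id       INTEGER PRIMARY KEY,
            name     TEXT    NOT NULL,
            quantity INTEGER NOT NULL CHECK (quantity >= 0)
        )
        """
    )
    accepted, rejected = [], []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO product (id, name, quantity) VALUES (?, ?, ?)", row
            )
            accepted.append(row)
        except sqlite3.IntegrityError as exc:
            # The database enforces the rules; no validation code was written.
            rejected.append((row, str(exc)))
    conn.commit()
    conn.close()
    return accepted, rejected


accepted, rejected = load_into_rdbms(sheet_rows)
print("accepted:", accepted)
print("rejected:", rejected)
```

   In a sheet the same two checks would have to be coded by hand (or approximated with data validation rules), which is exactly the wheel reinventing mentioned above.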

It’s all about Costs
   In the past Excel and Access were quite cheap as “development platforms” compared with purchasing an IDE, especially when we consider their extensibility through VBA and the availability of its IDE, hence their favorable functionality vs. extensibility ratio. Since the introduction of Express versions of powerful IDEs like the ones in Visual Studio, and the boom of open source IDEs and development frameworks that provide rich capabilities, the balance of power has changed dramatically in favor of the latter. Today you can put together a small-scale application with a minimum of investment, which sometimes makes the use of Office tools outside of Office solutions obsolete. I would say that the inventory of software tools and technologies has changed considerably in the past years, but the mentality concerning IT infrastructure and software development has changed less. It’s true that organizations sometimes lack the people who could architect and design such solutions and rely mainly on external resources, or find it much easier to rely on the programming skills of an employee who knows “exactly” what he wants; in theory it is much easier for him to attempt solving a problem directly than to write the requirements down. Another point in VBA’s favor is that software solutions normally evolve and need to be changed to reflect business or philosophy changes, and it is much easier for the employee who built the application to introduce such changes directly than to start a whole project for this purpose. This aspect is rooted in another issue: sometimes organizations ignore software needs, so it falls to employees to find cheap and fast ways of automating tasks in particular and solving work-related problems in general, Excel or Access being quite handy for this purpose. Sure, you can do almost anything in Excel/Access too, but at what cost?

The Strategic Context
   Several times I have heard people talking about replacing a collection of Excel sheets with an Access solution. I know that in the absence of adequate solutions people end up storing various types of data in Excel sheets, duplicating data, losing control over versions and data quality, and making data insecure, unavailable or unprocessable. Without a good data management and infrastructure strategy the situation doesn’t change significantly by using an Access solution. It’s true that the data could more easily be stored in a central place, some validation could result in better data quality, and security, availability and data maintainability could see some improvements too; however, the gain is insignificant when compared with the capabilities of a full-featured RDBMS. Even if a company doesn’t have the resources to invest in a mature RDBMS like Oracle or SQL Server, there are the Express versions of the respective databases, as well as several other free solutions on the market, especially in the open source area. On the other hand, it’s true that MS Access, through its easy-to-use SQL Designer, allows people to build queries with simple drag-and-drop and limited SQL knowledge, though its value is relative.

   Talking about data management strategy, it mainly concerns data quality as a function of its 6 main dimensions (accuracy, conformity, consistency, completeness, duplicates, referential integrity), to which can be added data timeliness, accessibility, security, relevance, usability, and so on. The main problem with personal solutions is that they lead to data and logic duplication, and even when such solutions are consolidated in one form or another, their consolidation and integration is quite complex, because you have to consider not only the various designs but also the overall requirements from a higher perspective. On the other hand, it’s difficult to satisfy the needs of all the people in an organization, so some duplication of data is inevitable in one form or another, with direct or indirect implications for data quality. Some effort and a good strategy are required in this respect, to find the balance between the various requirements and the number of solutions needed to satisfy them.

Reformulating the Question

   How can we determine which tool or set of tools is appropriate for our problem? Normally the answer to this question depends on the functionality needed. The hard road to answering it is to identify all the requirements and the features available in the various tools, weigh them against each other, and decide what works best. Unfortunately that’s not an easy task: not only current but also future requirements need to be considered, along with the organization’s strategy and whatever else might come around. Reports, best practices, lessons learned or other types of succinct content might also help in making a decision without analyzing features and requirements too deeply. Sometimes a gut feeling might work as well, especially when it comes from a person with experience in the field. Other times you don’t have too many options, with time, resources, knowledge, IT infrastructure, philosophy or politics reducing your room for maneuver and decision. In the end we learn by doing, by fighting with the constraints and problems we have, and hopefully we also learn from our own or others’ mistakes…
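
   The “hard road” of weighing requirements against features can be sketched as a simple weighted scoring matrix. This is only an illustration in Python: the requirement names, the weights and the 1-5 ratings below are invented for the example and are not measurements of the actual tools; in practice they would come out of your own requirements analysis.

```python
def score_tools(weights, ratings):
    """Rank tools by weighted average rating, best first.

    weights: {requirement: importance of the requirement}
    ratings: {tool: {requirement: how well the tool covers it}}
    A requirement missing from a tool's ratings counts as 0.
    """
    total_weight = sum(weights.values())
    scores = {}
    for tool, tool_ratings in ratings.items():
        weighted = sum(weights[req] * tool_ratings.get(req, 0) for req in weights)
        scores[tool] = weighted / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and ratings, for illustration only.
weights = {"concurrent users": 5, "data validation": 4, "ease of change": 3}
ratings = {
    "Excel": {"concurrent users": 1, "data validation": 2, "ease of change": 5},
    "Access": {"concurrent users": 2, "data validation": 3, "ease of change": 4},
    "RDBMS + UI": {"concurrent users": 5, "data validation": 5, "ease of change": 2},
}

for tool, score in score_tools(weights, ratings):
    print(f"{tool}: {score:.2f}")
```

   Change the weights (for example, give “ease of change” more importance than everything else) and the ranking can flip, which is just another way of saying that there is no universally right answer to the question.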

PS: Even though I have several good years of cumulative experience in developing solutions based on Excel and Access, I can’t pretend that I know their full potential, especially when judged from the perspective of the new features introduced with Excel 2007 or 2010, and even more so when considering their integration with SharePoint, SQL Server or other similar platforms. The various software tools and platforms on the market allow people to mix functionality in theoretically unlimited ways, while the separation of functionality between layers, SaaS (software as a service) and mash-ups are changing the way we program and perceive software development.
