SQL Troubles: RDBMS

Showing posts with label RDBMS. Show all posts

04 February 2017

💠🛠️SQL Server: Administration (Killing Sessions - Killing ‘em Softly and other Snake Stories)

Introduction

There are many posts on the web advising succinctly how to resolve a blocking situation by terminating a session via kill command, though few of them warn about its use and several important aspects that need to be considered. The command is powerful and, using an old adagio, “with power comes great responsibility”, responsibility not felt when reading between the lines. The easiness with people treat the topic can be seen in questions like “is it possibly to automate terminating sessions?” or in explicit recommendations of terminating the sessions when dealing with blockings.

A session is created when a client connects to a RDBMS (Relational Database Management System) like SQL Server, being nothing but an internal logical representation of the connection. It is used further on to perform work against the database(s) via (batches of) SQL statements. Along its lifetime, a session is uniquely identified by an SPID (Server Process ID) and addresses one SQL statement at a time. Therefore, when a problem with a session occurs, it can be traced back to a query, where the actual troubleshooting needs to be performed.

Even if each session has a defined scope and memory space, and cannot interact with other sessions, sessions can block each other when attempting to use the same data resources. Thus, a blocking occurs when one session holds a lock on a specific resource and a second session attempts to acquire a conflicting lock type on the same resource. In other words, the first session blocks the second session from acquiring a resource. It’s like a drive-in to a fast-food in which autos must line up into a queue to place an order. The second auto can’t place an order until the first don’t have the order – is blocked from placing an order. The third auto must wait for the second, and so on. Similarly, sessions wait in line for a resource, fact that leads to a blocking chain, with a head (the head/lead blocking) and a tail (the sessions that follow). It’s a FIFO (first in, first out) queue and using a little imagination one can compare it metaphorically with a snake. Even if imperfect, the metaphor is appropriate for highlighting some important aspects that can be summed up as follows:

Snakes have their roles in the ecosystem
Not all snakes are dangerous
Grab the snake by its head
Killing ‘em Softly
Search for a snake’s nest
Snakes can kill you in sleep
Snake taming

Warning: snakes as well blockings need to be handled by a specialist, so don’t do it by yourself unless you know what are you doing!

Snakes have their roles in the ecosystem

Snakes as middle-order predators have an important role in natural ecosystems, as they feed on prey species, whose numbers would increase exponentially if not kept under control. Fortunately, natural ecosystems have such mechanism that tend to auto-regulate themselves. Artificially built ecosystems need as well such auto-regulation mechanisms. As a series of dynamical mechanisms and components that work together toward a purpose, SQL Server is an (artificial) ecosystem that tends to auto-regulate itself. When its environment is adequately sized to handle the volume of information or data it must process then the system will behave smoothly. As soon it starts processing more data than it can handle, it starts misbehaving to the degree that one of its resources gets exhausted.

Just because a blocking occurs doesn’t mean that is a bad thing and needs to be terminated. Temporary blockings occur all the time, as unavoidable characteristic of any RDBMS with lock-based concurrency like SQL Server. They are however easier to observe in systems with heavy workload and concurrent access. The more users in the system touch the same data, the higher the chances for a block to occur. A good design database and application architecture typically minimize blockings’ occurrence and duration, making them almost unobservable. At the opposite extreme poor database design combined with poor application design can make from blockings a DBA’s nightmare. Persistent blockings can be a sign of poor database or application design or a sign that one of the environment’s limits was reached. It’s a sign that something must be done. Restarting the SQL server, terminating sessions or adding more resources have only a temporary effect. The opportunity lies typically in addressing poor database and application design issues, though this can be costlier with time.

Not all snakes are dangerous

A snake’s size is the easiest characteristic on identifying whether a snake is dangerous or not. Big snakes inspire fear for any mortal. Similarly, “big” blockings (blockings consuming an important percentage of the available resources) are dangerous and they have the potential of bringing the whole server down, eating its memory resources slowly until its life comes to a stop. It can be a slow as well a fast death.

Independently of their size, poisonous snakes are a danger for any living creature. By studying snakes’ characteristics like pupils’ shape and skin color patterns the folk devised simple general rules (with local applicability) for identifying whether snakes are poisonous or not. Thus, snakes with diamond-shaped pupils or having color patterns in which red touches yellow are likely/believed to be poisonous. By observing the behavior of blockings and learning about SQL Server’s internals one can with time understand the impact of each blocking on server’s performance.

Grab the snake by its head

Restraining a snake’s head assures that the snake is not able to bite, though it can be dangerous, as the snake might believe is dealing with a predator that is trying to hurt it, and reach accordingly. On the other side troubleshooting blockings must start with the head, the blocking session, as it’s the one which created the blocking problem in the first place.

In SQL Server sp_who and its alternative sp_who2 provide a list of all sessions, with their status, SPID and a reference with the SPID of the session blocking it. It displays thus all the blocking pairs. When one deals with a few blockings one can easily see whether the sessions form a blocking chain. Especially in environments under heavy load one can deal with a handful of blockings that make it difficult to identify all the formed blocking chains. Identifying blocking chains is necessary because by identifying and terminating directly the head blocking will often make the whole blocking chain disappear. The other sessions in the chain will perform thus their work undisturbed.

Going and terminating each blocking session in pairs as displayed in sp_who is not recommended as one terminates more sessions than needed, fact that could have unexpected repercussions. As a rule, one should restore system’s health by making minimal damage.

In many cases terminating the head session will make the blocking chain disperse, however there are cases in which the head session is replaced by other session (e.g. when the sessions involve the same or similar queries). One will need to repeat the needed steps until all blocking chain dissolve.

Killing ‘em Softly

Killing a snake, no matter how blamable the act, it is sometimes necessary. Therefore, it should be used as ultimate approach, when there is no other alternative and when needed to save one’s or others’ life. Similarly killing a session should be done only in extremis, when necessary. For example, when server’s performance has deprecated considerably affecting other users, or when the session is hanging indefinitely.

Kill command is powerful, having the power of a hammer. The problem is that when you have a hammer, every session looks like a nail. Despite all the aplomb one has when using a tool like a hammer, one needs to be careful in dealing with blockings. A blocking not addressed correspondingly can kick back, and in special cases the bite can be deadly, for system as well for one’s job. Killing the beast is the easiest approach. Kill one beast and another one will take its territory. It’s one of the laws of nature applicable also to database environments. The difference is that if one doesn’t addresses the primary cause that lead to a blocking, the same type of snake more likely will appear repeatedly.

Unfortunately, the kill command is no bulletproof for terminating a session, it may only severe the snake. As the documentation warns, there can be cases in which the method won’t have any effect on the blocking, the blocking continuing to room around. So, might be a good idea to check whether the session disappeared and keep an eye on it until it totally disappeared. Especially when dealing with a blocking chain it can happen that the head session is replaced by another session, which probably was waiting for the same resources as the previous head session. It may happen that one deals with two or more blocking chains independent from each other. Such cases appear seldom but are possible.

Killing the head session with a blocking without gathering some data provides less opportunities for learning, for understanding what’s happening in your system, of identifying what caused the blocking to occur. Therefore, before jumping to kill a session, collect the data you need for further troubleshooting.

Search for a snake’s nest

With the warning that unless one deals with small snakes, might not be advisable in searching for a snake’s nest, the idea behind this heuristic is that with a snake’s occurrence more likely there is also a nest not far away, where several other snakes might hide. Similarly, a query that causes permanent blockings might be the indication for code that generates a range of misbehaving queries. It can be same code or different pieces of code. One can attempt to improve the performance of a query that leads to blockings by adding more resources on the server or by optimizing SQL Server’s internals, though one can’t compensate for poor programming. When possible, one needs to tackle the problem at the source, otherwise performance improvements are only temporary.

Snakes can kill you in sleep

When wondering into the wild as well when having snakes as pets one must take all measures to assure that nobody’s health is endangered. Same principle should apply to databases as well, and the first line of defense resides in actively monitoring the blockings and addressing them timely as they occur. Being too confident that nothing happens and no taking the necessary precautions can prove to be a bad strategy when a problem occurs. In some situations, the damage might be acceptable in comparison with the effort and costs needed to build the monitoring infrastructure, though for critical systems it can come with important costs.

Snakes’ Taming

Having snakes as pets doesn’t seem like a good idea, and there are so many reasons why one shouldn’t do it (see PETA’s reasons)! On the other side, there are also people with uncommon hobbies, that not only limit themselves at having a snake pet, but try to tame them, to have them behave like pets. There are people who breed snakes to harness their venom for various purposes, occupation that requires handling snakes closely. There are also people who brought their relation with snakes at level of art, since ancient Egypt snake charming being a tradition in countries from Southeast Asia, Middle East, and North Africa. Even if not all snakes are tameable, snake’s taming and charming is possible. In the process the tamer must deprogram or control snakes’ behavior, following a specific methodology in a safe environment.

No matter how much one tries to avoid persistent blockings, one can learn from troubleshooting blockings, about their sources, behavior as well about own limitations. One complex blocking can be a good example with which one can test his knowledge about SQL Server internals as well about applications’ architecture. Each blocking provides a scenario in which one can learn something.

When fighting with a blocking, it’s wise to do it within a safe environment, typically a test or development environment. Fighting with it in a production environment can cause unnecessary stress and damage. So, if you don’t have a safe environment in which to carry the fight, then build one and try to keep the same essential characteristics as in production environment!

There will be also situations in which one must fight with a blocking in the production environment. Then, be careful in not damaging the data as well the environment, and take all the needed precautions!

Conclusion

The comparison between snakes and blockings might not be perfect, though hopefully it will imprint in reader’s mind the dangers of handling blockings inappropriately and increase the awareness in what concerns related topics.

12 October 2010

🔏MS Office: Why I (dis)like MS Access

In the previous post, 'The Limitations of MS Access Database", I highlighted a few of the limitations of MS Access as database, ignoring the other two or three important aspects – Access as development, reporting, respectively data analysis platform. In this post I’ll retake the topic from a general and personal perspective, considering some of the features that I like and makes from Access a useful and powerful tool, attempting also to mention some of its usage limitations I run into over the time. With the risk of repeating myself, I can’t say I’m an expert in MS Access even if I provided several solutions based on it, its use in the various contexts being not always so inspired, that being one of the reasons why in the first post "Is MS Access or MS Excel the Answer to Your Problem?" I insisted on this aspect.

Ad-Hoc Database

MS Access is a file server-based relational database, being one of the most used databases, though it can’t be compared with more mature RDBMS like Oracle, MySQL, SQL Server, Sybase, PostgreSQL, Teradata, Informix or DB2, richer in features, especially in what concerns their administration, transactional and concurrent processing, scalability, stability, availability, performance, reliability, portability, replication, integration, security, manageability, extensibility, the degree to which they fit in the overall architecture of an enterprise, of relevance being topics like Business Intelligence, Data Warehousing, SOA, Cloud Computing, etc. These are some of the reasons for which I categorized Access as a Personal or Ad-Hoc database, being, at least from my point of view, more appropriate for small-size or personal solutions. In essence Access has the characteristics of a relational database though the lack in the mentioned features makes it less desirable. Nobody denies Access’ usefulness, the point is that when compared with full-featured RDBMS Access has no chance, fact reflected also in the following market share diagrams:


DBMS Market 2006 (JoinVision e-Services via[2])	DBMS Market 2008 (Gartner via MySQL) [1]

Even if the diagrams are a few years old, I think they are still representative in what concerns the state of art in the world of databases, the first diagram providing an historical perspective, while the second the “actual” and “future” reflected tendencies. It’s not the first time I’m seeing MS Access and SQL Server represented together even if they belong to different technology stacks, Access’ strength and weakness being deeply rooted in its affiliation to MS Office set of tools. It would be interesting to know which was the ratio then between the number of Access and SQL Servers, and what’s the ratio now, SQL Server Express replacing Access’ role of personal or small-scale database.

The statistics are less representative when it comes to people, their interests and immediate needs. The bottom line is that Access is an easy to use database with pretty low learning curve, you don’t need to know the fancy stuff about databases, you could experiment and learn it as an add-on to your job, making the consumption of data much easier, at least in theory. Are you having your data stored across several Excel files? You can import or copy paste them in Access and there you have an ad-hoc database, then create several queries on top of them with the Query Designer or Query Designer, and this without any knowledge of SQL.

The saved queries could be reused much as the views, they could be parameterized, the parameters could be bounded much like the user-defined functions, and made available for further consumption. I can’t say I met any other similar software tool that simplifies so much the design and consumption of databases. The simplicity of Access query designing comes with its tribute, especially when you want to achieve more from your database, the minimum of features making difficult to design complex queries, Access requiring a different mindset in problem solving. In addition, those used with the rich features of RDBMS won’t feel too comfortable in using the Query Designer or Editor, the ANSI syntax it’s inflexible while the troubleshooting quite painful.

I used Access as database only when I had no other alternative, preferring to store the data in a RDBMS like SQL Server or Oracle. In exchange I used Access as presentation layer, allowing users for example to access and analyze the data. In many occasions I played with Access databases in small projects or enhancing existing applications, spending many hours in tweaking Access queries or on porting such queries to other RDBMS.

I had the occasion to work with several tools that were using Access as backend, one of them IQ Insight, used to assess the quality of data, was an interesting tool to work with though it was paying tribute to the stability and speed of its database, in a next implementation project deciding to take it out of the landscape, the performance of VB + SQL Server solution that replaced it, increased the performance from a matter of hours to minutes. I know that many people out there love Access as database, though once you acknowledged the performance power and flexibility of other databases, you don’t feel like returning in the past.

Data Analysis Tool

When having multiple Excel or other data sources, you don’t need to store your data in Access itself, it’s enough to link your text, Excel or any other ODBC data source, built a query on top of them, and there you have on the fly your data at your disposal, something that Data Warehousing and Business Intelligence tools hardly manage to do when considering all the people’s needs. By importing the data in your Access database, you could even correct some of the inherent issues existing in data, use some mappings in order to translate the data, use several queries in order to aggregate the data at the needed level of detail or get new insights.

From a mapping table or a query to creating a whole data analysis framework is just a small step, and this without too much involvement of the IT guys. Even more, the framework could be used by your colleagues too, they could use it directly or indirectly by re-linking the results of your analysis with a minimum of effort, they could even improve the character of your analysis or find other purposes for the data. Thus results a complex network of interconnected Access databases, and that’s a matter of time until it’s getting out of control, for example by not knowing how a change in one of the queries could impact the other known and unknown users of your data, on whether you are using the actual data, on whether the data have been tempered, and so on.

There should be no wonder when people are arriving to report different numbers, when the numbers don’t tie together, though also more modern reporting frameworks are dealing with these types of issues, isn’t it? In addition, you arrive to have multiple instances of the same data or have data distributed and isolated in a uncontrolled fashion, not the best strategy for an enterprise though…

I used Access as data access end point for data available in various data sources, allowing users to analyze and recombine the data by themselves, but this mainly in order to overcome the limitations of available standard reporting tools. This combined with the fact that has been attempted to move most of the logic created by users in a standardized form, limiting the risks of running into Access fallacies. Sure, there could be done more in order to avoid such pitfalls, for example having adequate reporting and data analysis tools, having in place a Data Management Policy which addresses common data problems, training users, etc.

Reporting

The possibility to present the data in a reporting-like fashion is one of the greatest advantages of Access, the tabular structure being easy to integrate with charting, paging, results breaking, formulas, filtering/parameterization, rich formatting, subreports and other types of report structures (e.g. footer, header), in other words the ingredients of a typical report. The combination between ad-hoc data analysis and reporting, it quite an advantage, depending on users’ skills in making most out of it. Reports’ functionality could be extended using Reports’ DOM and VBA, only the fact that a report could be entirely created and modified at runtime is quite of a deal.

I used Access reports only in the applications which were built entirely on MS Access, whenever was possible preferring to move the reports on more standardized platforms. Sometimes I find it more useful to export the data directly to Excel or to a more portable format like PDF, thing also possible with Access reports, though eliminating thus the intermediary platform. Now it depends on users’ preferences and organizations’ infrastructure.

Rapid Prototyping

Access could be used as frontend for various types of applications, and you don’t need to put too much effort in your application. Is enough to drop a form and link it to a table, then link the screens together and here you have an already functioning application, fact that makes from Access a tool ideal for rapid prototyping.

I used Access in several projects for building proof of concept prototypes, allowing customers to gather requirements, evaluate the concept and the available functionality. There were also cases in which the prototypes were comparable as performance with the applications that replaced them, from some points of view even better, though that’s a matter of architecture, skills and sometimes infrastructure.

Extensibility: VBA

A person could create in Access a data analysis framework, a report or a prototype without writing a single line of code, richer functionality being available by using VBA, which is nothing more than old-fashioned VB based on Access’ DOM. VBA extensibility refers here to the possibility of going beyond the wizarding and drag-and-drop functionality provided by Access, for example by adding complex validation into forms, linking forms, altering or creating content at runtime, etc. Not everybody needs to do go so far, however those who used formulas or have some programming experience would find VBA easy to learn.

Those wanting to change the default behavior of Access or provide missing functionality then they will have to go deeper in VBA’s secrets, in using built-in or third party developed libraries. For example in order to change the “sequential” access of data provided by Access a programmer will have to use ADO or DAO, the built-in transactional functionality provided in the two libraries could be used to cover the lack of transactional processing not built-in in Access. With some exceptions, in theory you could do with VBA anything you do with old fashioned VB, though with VB.Net the gap to VBA increased considerably ( see Converting Code from VBA to Visual Basic .NET for differences). There are also some limitations, for example the adding of controls in Access forms at runtime, and I remember I found a few other with time, some of them deriving from bugs existing in the tool itself.

Extensibility: .DLLs

I was saying that it’s possible to use third party developed libraries in Access, this functionality relying on COM+ and its predecessors DCOM, COM or ActiveX, technologies that allow the communication between components not only on the local computer but also in distributed networks or internet as in the case of ActiveX. In this way it’s possible to encapsulate functionality in libraries saved as .dlls, distribute them with your applications or reuse them in other applications. Writing COM classes is a job for programming languages like C++, VB, VB.Net, C#, etc. The old-fashioned VB was great in creating and debugging COM components in just a question of minutes, in theory any piece of code could be encapsulated in such a component. Having the possibility to extend the functionality of MS Access with such libraries open the door to an unlimited number of architectural scenarios.

Extensibility: Add-ins

Add-ins are forming a special type of components rooted in OLE, later based on COM, that use the MS Office DOM architecture, their primary utility relying in the fact that they make it possible to provide new features for MS Office itself. An example of such “bonus” features are Save as PDF add-in for Access 2007 or Open Database Connectivity add-in for Excel. I used Add-ins only to extend Excel’s UI-based functionality, therefore I can’t talk too much about their use in Access. For more see Building COM Add-ins for Office Applications material available in MSDN.

Database Templates

I observed that in MS Access 2007 are available several templates (e.g. Assets, Contacts, Sales pipeline, etc.) that could be extended or used in learning how an application is designed. Doing a little research I found out that is possible to create templates for whole databases, reports or forms. I haven’t use templates until now in Access, but it could prove to be an interesting feature when common architectural or functional characteristics are found.

References:
[1] MySQL. (2010). Market Share. [Online] Available from: http://www.mysql.com/why-mysql/marketshare/ (Accessed: 10 October 2010)
[2] Creative System Design. (2010) Databases. [Online] Available from: http://online.creativesystemdesigns.com/projects/databases.asp (Accessed: 10 October 2010)

02 October 2010

#️⃣Software Engineering: Programming (Part V: Is MS Access or Excel the Answer to your Problems?

Software Engineering Series

Introduction

That’s one of the topics that followed me for years, quite often being asked by customers to provide a MS Access or MS Excel solution as an answer to a business need. The beauty of this question is that there is no right answer and, as I stressed out in several occasions, there is not always a straightforward answer to such a question in IT, the feasibility of an IT solution relying on many variables formulated typically in term of business and IT requirements.

When a customer is requesting to built a MS Access or Excel solution outside of Office paradigm, I’m kind of circumspect, and this not because they are not great tools, but because they are not adequate for all purposes. I even recommend the two for personal or for small-scale solutions, though their applicability should stop right there.

A personal solution is an application developed for personal use, for example to store and maintain the data for a report, to process data automatically or any other attempt of automating some tasks. By small-scale solutions I’m referring to the following types of applications:
- applications of basic to average complexity, that don’t require complex design or could be developed by a developer with average skills.
- applications that target a small number of users, usually a small group of max 10-20 concurrent users, it may be occasionally a whole department or it could be cross departmental as long the previous mentioned condition are met.

A Short Review

MS Excel is the perfect tool for storing non-relational tabular data, manipulating data manually or with the help of formulas, doing data analysis with pivoting and charting, or of querying various data sources. Its extensibility based on its DOM (Document Object Model), VBA (Visual Basic for Applications) and its IDE (Integrated Development Environment), Forms, add-ins, in-house or third-party developed libraries, the template and wizard-based approach, make from Excel a powerful development environment. I would say that Excel’s weakness resides in its intrinsic design, the DOM model which lacks a rich event model, in the fact that Excel is mainly a tool for data entry, analysis and reporting, the other types of functionality coming on a secondary plan. Excepting a few new features built in Excel itself, the important new functionality comes as add-on – SQL Server-based data mining add-in, MS Sharepoint Server-based Web Services features like multiuser collaboration, slicer and a few other.

The extensibility capabilities mentioned above are not only a particularity of Excel but apply to the whole Office family: Access, Word, Outlook, Powerpoint, and even Visio if is considered the “extended family”, each of them with its role. Access’ role is that of flexible relational data storage, querying and reporting solution, its strength relying mainly in the easiness of providing a simple UI (User Interface) for maintaining and navigating the data, in the easiness of pulling data from various sources for further analysis. As in the case of Excel, Access’ weakness resides in its DOM, in the fact that it’s not a full RDBMS (Relational Database Management System) and all the consequences deriving from it.

Programming for the Masses/Citizens

The great thing about VBA is that also non-developers could successfully adventure in developing Office-based applications, the possibility of learning from the code built with “Record Macro” functionality allowing a small learning curve. Enabling “non-developers” to built applications makes from Office a powerful and altogether dangerous tool because such applications could be easily misused. Misused here refers to the fact that often is attempted to built in Excel or Access complex applications that sooner or later break apart under their complexity, that organizations arrive to have a multitude of such applications with no control over their existence, maintenance, security, etc.

Unfortunately the downsides of such applications are discovered late in the process, when intended functionality is not available, thus arriving to reinvent the wheel, patch up functionality in a jumble, in a tumble. With some hard-work you could achieve the alike functionality as the one available in powerful frameworks like .Net, WPF, WCF or Silverlight, to mention the Microsoft technologies I’m somewhat acquainted to. VBA is great but with time became less powerful than VB, C# or C++ (the comparison between VBA and C++ is a little forced), to mention the most important programming languages for writing managed code in .Net. The barriers between the capabilities of the two types of programming languages are somehow broken by the possibility of developing add-ins and libraries for MS Office or of using Office DOM in .Net applications, though few (non-) programmers adventure on this path.

The Architectural Perspective

There is another important architectural perspective – separating the data storage and eventually data processing from presentation. Also when using Access or Excel the data storage could be separated from presentation, though I’ve seen few solutions doing that, the three layers coexisting usually within the same tire. An Access solution could be split in two, one for database and other for UI and processing, allowing more flexibility in what concerns the architecture, security, version management, etc.

Access is good for data presentation and rapid prototyping, though the concept and the data controls are quite old, having several limitations when compared with similar controls available for example in .Net. The advantage of using simple drag-and-drop or wizards in Access is for long over, the same functionality existing also in Visual Studio (Express), environment in which applications could be built with drag-and-drop and wizards too, in plus taking advantage of additional built-in features. The database layer could be replaced with a full RDBMS, same as the presentation layer could be replaced with a .Net UI. It’s not much easier then to built the architecture around the .Net UI and a RDBMS?!

Excel is considered by many as a (relational) database, is it really so? It’s true the data could be stored in tabular format in which a sheet plays the role of a table and queryable through the various drivers available, though no primary key is available, less control over the data entered and many other features available in RDBMS need to be provided programmatically, again reinventing the wheel. Same as in the case of Access, Excel could be considered for data storage and presentation, its functionality being reduced when compared with the one of Access.

Many people are used with the data entry mechanism available in Excel, especially in what concerns data manipulation, wanting similar functionality in other tools. If this was Excels’ advantage some time ago, that’s no more valid, several rich data grids offering similar data entry functionality which, with some effort, could simulate to an acceptable degree the functionality of Excel, and they could provide also richer validation functionality.

It’s all about Costs

In the past MS Excel and Access were quite cheap as "development platforms" when compared with the purchasing of existing IDE, especially when we consider their extensibility through VBA and IDE’s availability, thus the functionality vs. extensibility favorable ratio. Recently were introduced express (aka community) versions of powerful IDEs for Visual Studio, respectively open source IDE and development frameworks that provide rich capabilities, the report of forces changed dramatically in the favor of the later.

Today you could put together a small-scale application with a minimum of investment, making sometimes obsolete the use of Office tools outside of the Office solutions. The pool of software tools and technologies changed in the past years considerable, but the mentality in what concerns the IT infrastructure and software development changed less. It’s true that sometimes organizations lack the resources who could architect and design such solutions, relying mainly on external resources, or being much easier to rely on an employee’s programming skills who knows “exactly” what's needed and it would be in theory much easier in order to attempt solving a problem directly rather than writing the requirements down.

In VBA’s advantage comes also the fact that normally software solutions evolve and need to be changed in order to reflect business or philosophy changes, being much easier to introduce such changes directly by the employee who built the application in contrast with starting a whole project for this purpose. This aspect is rooted in other perspective – sometimes organizations ignore the software needs, falling in employees attribution to find cheap and fast ways of automating tasks in particular, solving work-related problems in general, Excel or Access being quite handy for this purpose. Sure, you can do almost anything also in Excel/Access but with what costs?

The Strategic Context

Several times I heard people talking about replacing the collection of Excel sheets with an Access solution. I know that in the absence of adequate solutions people arrive to store various types of data in Excel sheets, duplicating data, loosing the control over versions, data quality, making data unsecure/unavailable or un-processable. Without a good data management and infrastructure strategy the situation doesn’t change significantly by using an Access solution.

It’s true that the data could be easier stored in a global place, some validation could result in better data quality, while security, availability and data maintainability could suffer some improvements too, however the gain is insignificant when compared with the capabilities of a full-featured RDBMS. Even if a company doesn’t have the resources to invest in a mature RDBMS like Oracle or SQL Server, there are also the Express versions for the respective databases, several other free solutions existing on the market especially in the area of open source. On the other side it’s true that MS Access, through its easy to use SQL Designer, allows people building queries with simple drag-and-drops and limited SQL knowledge, though its value is relative.

Talking about data management strategy, it concerns mainly the data quality as a function of its 6 main dimensions (accuracy, conformity, consistency, completeness, duplicates, referential integration) to which add data actuality, accessibility, security, relevance, usability, and so on. The main problem with personal solutions is that they lead to data and logic duplication, and even when such solutions are consolidated in one form or another, their consolidation and integration is quite complex because you have to consider not only the various designs but also the overall requirements from a higher perspective. On the other side it’s difficult to satisfy the needs of all the people in an organization, in a form or another, duplication of data being inevitable, with direct or indirect implications on data quality. It is required some effort and a good strategy in what concerns these aspects, finding the balance between the various requirements and the number of solutions to satisfy them.

Reformulating the Question

How can we determine which tool or set of tools is appropriate for our problem? Normally the answer to this question depends on the needed functionality. The hard road in answering this question is to identify all the requirements, the features available in the various tools, weight both of them, and decide what worth best. Unfortunately that’s not an easy task, it need to be considered not only actual but also future requirements, organization’s strategy, and whatever might come around.

Reports, best practices, lessons learned or other type of succinct content might help as well in taking a decision without going too deep in analyzing features and requirements thoroughly. Sometimes a gut feeling might work as well, especially when comes from a person with experience in the field. Other times you don’t have too many options – time, resources, knowledge, IT infrastructure, philosophy or politics reducing your area of maneuverability/decision. In the end we learn by doing, by fighting with the constraints and problems we have, hopefully we learn also from our or others’ mistakes…

PS: Even if I’m having several good cumulated years in developing solutions based on Excel and Access, and I can’t pretend that I know their full potential, especially when judged from the perspective of the new features introduced with Excel 2007 or 2010, even more when considering their integration with SharePoint, SQL Server or other similar platforms. The various software tools or platforms existing on the market allow people to mix functionality theoretically in unlimited ways, the separation of functionality between layers, SaaS (software as a service) and data meshes changing the way we program and perceive software development.

Previous Post <<||>> Next Post

30 March 2010

💎SQL Reloaded: The Power of Joins – Part I: Introduction

Joins are the heart of RDBMS, they allowing to retrieve data (bring data together in a single result dataset) from multiple (related) table-like objects (tables, views, user-defined functions) by defining how and which records are selected from such objects. In order to reduce the complexity of such a topic I will introduce the joins with regard to tables, the concepts applying also to views and user-defined functions, especially when they are regarded as “virtual” tables. In theory any two tables could be joined independently on whether a direct or indirect relation exists between them, it can be a foreign key, a natural key, a date or any other attribute that could be used to get a certain view into the data. Typically a join involves a minimum of two tables, but before going dipper into the topic it makes sense to look a little at the various terms used to refer to the tables joined and the anatomy of a join.

There are various ways of naming the tables involved in a join:
- left vs. right tables, where the left table is the first listed table participating in the join and the right table the second listed table participating in the join;
- parent vs. dependent tables, refers to tables between which exists a direct relationship, in other words exists a primary key and a foreign key, the dependent table being the table containing the foreign key, and the parent table the table containing the primary key;
- parent vs. child tables, is similar with the previous definition, the child table being an alternative naming for the dependent table, in addition we can talk also about grandchildren when a child table is referenced at its turn by other table called thus, when the three tables are considered together, the grandchild table;
- referent vs. referenced tables, another alternative for naming the tables between which exists a direct relationship, the referenced table being the table containing the foreign key, and the referent table the ones containing the primary key;
- inner vs. outer tables, these terms are used only in the case of outer joins, the inner table being the table from which are selected all the records, while the outer table is the table from which are selected only the matched records, therefore the two are also called sometimes the row-preserving table, respectively the null-supplying table;

None of the above pair of naming conventions are perfect because they can’t be used to address all the join situations in which tables are involved, and some of the terms may be used interchangely, especially when a term is preferred in order to send across adequate meaning, for example referring to a table as a inner table then we imply also that we deal with a left or right join. I realized that in time I used all the the above terms in a situation or another, however I prefer to use the left vs. right tables denomination, and occasionally inner vs. outer tables when referring to left or right joins.

In addition to the two tables joined a query typically includes also a join operator that specifies the join type (union, inner, outer or cross join), the join constraints that specifies which attributes are used to join the tables and the logical operators used, and eventually the non-join constraints referring to other constraints than the join constraints. I’m saying typically because either of the two could be missing, thus if the the join constraint and operator miss then we deal with a Cartesian join, while if the join constraint is appearing in the WHERE instead of FROM clause then there is no need to specify the join operator or in case of a UNION there is no need to specify the join constraint, though I will come back later on this topic.

The work with joins is deeply rooted in the Set Theory (the tables joined referred also as datasets) and the set operations – union, subtraction and intersection, their projection in world of databases coming, from my point of view and not sure if the database literature really discusses about this, into two flavors – vertical vs. horizontal joining. By vertical joining I’m referring to the intrinsic character of joins, in the result dataset the attributes from the two joined datasets appearing on the same row, while in the horizontal joining they never appear on the same row, thus a row being formed exclusively from the records of only one of the tables involved in the join. Thus we can discuss about vertical/horizontal joins as projection of union, intersection and subtraction operations, each of them with different connotations, the horizontal operations being actually similar to the ones from set theory. If we discuss about joins then we inevitably we have to mention also the anti-joins and semi-joins, a special type of queries based on NOT IN, respectively EXISTS operators.

|>> Next Post

05 February 2010

🕋Data Warehousing: Star Schema (Definitions)

"A relational database structure in which data is maintained in a single fact table at the center of the schema with additional dimension data stored in dimension tables. Each dimension table is directly related to the fact table by a key column." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A star schema is a dimensional data model implemented on a relational database." (Claudia Imhoff et al, "Mastering Data Warehouse Design", 2003)

"A star schema is a set of tables comprised of a single, central fact table surrounded by dimension tables. Each dimension is represented by a single dimension table. Star schemas implement dimensional data structures with denormalized dimensions. Snowflake schemas are an alternative to a star schema design." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A single fact table surrounded by a single hierarchical layer of dimensional tables, in a data warehouse database." (Gavin Powell, "Beginning Database Design", 2006)

"A single fact table which joins to many dimension tables, each of which is a single denormalized dimension table." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"The instantiation of a dimensional model in a relational database. A star schema consists of a fact table and the dimension tables that it references. The fact table contains facts and foreign keys; the dimension tables contain dimensional attributes by which the facts will be filtered, rolled up, or grouped." (Christopher Adamson, "Mastering Data Warehouse Aggregates", 2006)

"The implementation of a dimensional model in a relational database. The tables are organized around a single central fact table possessing a multi-part key, and each surrounding dimension table has its own primary key." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

"The basic form of data organization for a data warehouse, consisting of a single large fact table and many smaller dimension tables." (Toby J Teorey, ", Database Modeling and Design 4th Ed", 2010)

"The arrangement of the collection of fact and dimension tables in the dimensional data model, resembling a star formation, with the fact table placed in the middle surrounded by the dimension tables. Each dimension table is in a one-to-many relationship with the fact table." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"The basic form of data organization for a data warehouse, consisting of a single large fact table and many smaller dimension tables." (Toby J Teorey, ", Database Modeling and Design" 4th Ed., 2010)

"A common form of a dimensional data model, where a fact table is directly linked by foreign keys to several dimension tables." (Craig S Mullins, "Database Administration", 2012)

"A relational database structure in which data is maintained in a single fact table at the center of the schema with additional dimension data stored in dimension tables. Each dimension table is directly related to and usually joined to the fact table by a key column." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A relational schema whose design represents a dimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

"A type of relational database schema that is composed of a set of tables comprising a single, central fact table surrounded by dimension tables. See also dimension table, star join." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

03 February 2010

🕋Data Warehousing: Snowflake Schema (Definitions)

"An extension of a star schema such that one or more dimensions are defined by multiple tables. In a snowflake schema, only primary dimension tables are joined to the fact table. Additional dimension tables are joined to primary dimension tables. " (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A variation on dimensional schema design in which dimension tables are further normalized, split into multiple tables based on hierarchies in the data." (Christopher Adamson, "Mastering Data Warehouse Aggregates", 2006)

"A single fact table which joins to many dimension tables, with each dimension normalized as two or more tables." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"A variation of the star schema in which the business dimensions are implemented as a set of normalized tables. The resulting diagram resembles a snowflake." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

"A normalized version of the STAR schema in which dimension tables are partially or fully normalized. Not generally recommended because it compromises query performance and simplicity for understanding." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A star schema in which dimension tables reference other tables." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

15 February 2009

🛢DBMS: Relational Database Management System [RDBMS] (Definitions)

"A system that organizes data into related rows and columns. SQL Server is a relational database management system." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A database, built on a model of data as existing in rows and columns, intended to embody the theoretical foundations of relational data that were originally defined by Dr. E. F. Codd at IBM. The Oracle server is one example, along with IBM's DB2, Microsoft's SQL Server, and mySQL." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"Database management system based on the relational model that supports the full range of standard SQL. Uses a series of joined tables with rows and columns to organize and store data." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit 2nd Ed ", 2002)

"A system that organizes data into related rows and columns. SQL Server is a relational database management system." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A system used to create, edit, and manage relational databases." (Johannes Link & Peter Fröhlich, "Unit Testing in Java", 2003)

"The controlling software for databases in which data is organized into related objects within a database rather than tied to a file. Each of these objects is related to another in some way." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A system that uses a database that contains tables with data. The management system part is the part allowing you access to that database, and the power to manipulate both the database and the data contained within it." (Gavin Powell, "Beginning Database Design", 2006)

"Software that organizes manipulates and retrieves data stored in a relational database." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A system that organizes data into related rows and columns. SQL Server is an RDBMS." (Jim Joseph, "Microsoft SQL Server 2008 Reporting Services Unleashed", 2009)

"A database management system whose architecture is based on Dr. E. F. Codd’s relational theory. That is, it stores data in terms of simple, two dimensional tables." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"Type of DBMS that uses SQL to store data in related tables." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"A collection of programs that manages a relational database. The RDBMS software translates a user’s logical requests (queries) into commands that physically locate and retrieve the requested data. A good RDBMS also creates and maintains a data dictionary (system catalog) to help provide data security, data integrity, concurrent access, easy access, and system administration to the data in the database through a query language (SQL) and application programs." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A database system that organizes data into related rows and columns as specified by a relational model." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A database management system that organizes data in defined tables. " (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A collection of hardware and software that organizes and provides access to a relational database." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

02 February 2009

🛢DBMS: Relational Database (Definitions)

"A collection of data arranged in rows and columns and manipulated using relational algebraic operations." (Joseph P Bigus, "Data Mining with Neural Networks: Solving Business Problems from Application Development to Decision Support", 1996)

"A collection of information organized in tables, each of which models a class of objects of interest to the organization [..]" (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A collection of information organized in tables. Each table models a class of objects of interest to the organization." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A collection of tables that stores data without any assumptions as to how the data is related within the tables or between the tables." (Bob Bryla, "Oracle Database Foundations", 2004)

"A method for storing data in multiple tables, where relationships exist between tables." (Margaret Y Chu, "Blissful Data ", 2004)

"A database in which data is viewed as being stored in tables consisting of columns (data items) and rows (units of information). Data from different tables can be combined to form new data relationships." (Sybase, "Glossary", 2005)

"A collection of information organized in tables. Each table models a class of objects of interest to the organization. Each column in a table models an attribute of the object. Each row in a table represents one entity in the class of objects modeled by the table. Queries can use data from one table to find related data in other tables." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"The set of requirements and fulfilling data that is developed and documented during critical parameter management. They link many-to-many relationships up and down throughout the hierarchy of the system being developed." (Clyde M Creveling, "Six Sigma for Technical Processes: An Overview for R Executives, Technical Leaders, and Engineering Managers", 2006)

"A database that organizes data in the form of tables." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A database that stores data in tables containing rows and columns, and that allows queries representing relationships among records in different tables." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"A database where tables hold records containing fields. Fields holding the same values link related tables." (Rod Stephens, "Start Here!™ Fundamentals of Microsoft® .NET Programming", 2011)

"A database for storing structured data. This allows for a fixed format, usually a limited number of values in a defined order." (Kenneth A Shaw, "Integrated Management of Processes and Information", 2013)

"A database in which data is stored in multiple, interrelated tables." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"A database in which the information is organized into tables related to each other by specific rules." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"A database that stores related data in rows and columns in tables." (Rod Stephens, "Beginning Software Engineering", 2015)

"Essentially, a collection of tables or lists." (E C Nelson & Stephen L Nelson, "Excel Data Analysis For Dummies ", 2015)

"A database that can be perceived as a set of tables and manipulated in accordance with the relational model of data. Each database includes a set of system catalog tables that describe the logical and physical structure of the data, a configuration file containing the parameter values allocated for the database, and a recovery log with ongoing transactions and archivable transactions." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A database or database management system that stores information in tables as rows and columns of data, and conducts searches by using the data in specified columns of one table to find additional data in another table." (Microsoft Technet)

"A database that can be perceived as a set of tables and manipulated in accordance with the relational model of data." (IBM)

"A database that conforms to the relational model, storing data in a set of simple relations." (Oracle)

26 January 2009

🛢DBMS: Table-Valued Functions [TVFs] (Definitions)

"a function that returns a result set because it returns a SELECT statement from a table." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"User-defined function in which a table is returned as a result, as opposed to a single data value." (Sara Morganand & Tobias Thernstrom , MCITP Self-Paced Training Kit : Designing and Optimizing Data Access by Using Microsoft SQL Server 2005 - Exam 70-442, 2007)

"A user-defined function that returns a table." (Microsoft, "SQL Server 2012 Glossary", 2012)

04 January 2009

🛢DBMS: User-defined Functions [UDFs] (Definitions)

"A Transact- SQL function defined by a user. Functions encapsulate frequently performed logic in a named entity that can be called by Transact-SQL statements instead of recoding the logic in each statement." (Microsoft Corporation, Microsoft SQL Server 2000 system administration, 2001)

"Functions that you write as opposed to built-in functions supplied by a programming language." (Greg Perry, "Sams Teach Yourself Beginning Programming in 24 Hours" 2nd Ed., 2001)

"A user-defined function is a Transact-SQL function defined by a user. Functions encapsulate frequently performed logic in a named entity that can be called by Transact-SQL statements instead of recoding the logic in each statement." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A function that is written by an analyst, user, or database administrator and does not come as part of the default installation of the Oracle server software." (Bob Bryla, "Oracle Database Foundations", 2004)

"A collection of T-SQL statements with a well-defined set of input parameters, but only one output which can be a scalar value or a table. User-defined functions allow the encapsulation of various logical and database operations, but cannot be used to affect changes to a database." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A custom function written by developers in SQL Server." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance 70-444" , 2007)

"A Transact-SQL function defined by a user. Functions encapsulate frequently performed logic in a named entity that can be called by Transact-SQL statements instead of recoding the logic in each statement." (Microsoft Docs, SQL Server 2005)

"Extensible code added to the DBMS to be callable like built-in database functions." (Craig S Mullins, "Database Administration" 2nd Ed, 2012)

"A function that is defined to the DB2 database system by using the CREATE FUNCTION statement and that can be referenced thereafter in SQL statements." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"In SQL Server, a Transact-SQL function defined by a user." (Microsoft Technet)

"A user-defined routine that returns at least one value. (IBM)

03 January 2009

🛢DBMS: Stored Procedures [SP] (Definitions)

"A collection of SQL statements and optional control-of-flow statements stored under a name. SQL Server-supplied stored procedures are called system procedures." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"Transact-SQL statements stored under a name and processed as a unit. Stored procedures are stored within a database and can be executed with one request from an application. A stored procedure can also allow user-declared variables, conditional execution, and other powerful programming features." (Patrick Dalton, Microsoft SQL Server Black Book, 1997)

"Transact-SQL programs stored on the server in a compiled form. Processing for these objects is done at the server level, making them as fast as any other tables the procedure references on the local machine. They can be used to enforce entity integrity, referential integrity, and domain integrity." (Rob Scrimger et al, "MCSE TestPrep: SQL Server 6.5 Administration", 1998)

"A precompiled collection of Transact-SQL statements and optional control-of-flow statements stored under a name and processed as a unit. Stored procedures are stored within a database, can be executed with one call from an application, and allow user-declared variables, conditional execution, and other powerful programming features." (Microsoft Corporation, "Microsoft SQL Server 7.0 System Administration Training Kit", 1999)

"A stored procedure is a group of Transact-SQL statements compiled into a single execution plan." (Microsoft Corporation, "SQL Server 2000 Books Online", 2000)

"A program that resides and executes inside a database system. In Oracle, the term generally encompasses not only procedures but also functions and packages." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"A software program stored in the database itself to be executed on the server based on stipulated conditions." (Paulraj Ponniah, "Data Warehousing Fundamentals", 2001)

"A precompiled collection of Transact-SQL statements stored under a name and processed as a unit." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A collection of T-SQL statements with a well-defined set of inputs, called input parameters, and a well-defined set of outputs, which may be output parameters, return values, or cursors. Stored procedures allow the encapsulation of various database operations." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A batch of SQL commands that are precompiled and saved as a procedure in SQL Server" (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A collection of T-SQL statements with a well-defined set of inputs, called input parameters, and a well-defined set of outputs, which may be output parameters, return values, or cursors. Stored procedures allow the encapsulation of various database operations." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"Also called a database procedure, a chunk of code stored within and executed from within a database, typically on data stored in a database (but not always)." (Gavin Powell, "Beginning Database Design", 2006)

"An executable code module in SQL Server." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance 70-444" , 2007)

"A piece of code stored in the database that can be executed by various pieces of code such as check constraints or application code. Stored procedures are a good place to store business logic that should not be built into the database's structure." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"Centralized, server-based application code. Typically used to standardize business logic and reduce the amount of required programming effort, you can build your own stored procedures or leverage the many built-in stored procedures offered by SQL Server." (Robert D. Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-In-One Desk Reference For Dummies", 2008)

"A precompiled collection of Transact-SQL statements stored under a name and processed as a unit." (Jim Joseph et al, "Microsoft SQL Server 2008 Reporting Services Unleashed", 2009)

"A set of SQL statements (subroutine) available to applications accessing a relational database system. Stored procedures are physically stored in the database." (John Goodson & Robert A Steward, "The Data Access Handbook", 2009)

"A software program stored in the database itself to be executed on the server based on stipulated conditions." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"An executable module stored in a database instance, to be invoked as part of the verification process for particular data elements." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"A SQL program module that is invoked by an application program using the SQL CALL command." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"(1) A named collection of procedural and SQL statements. (2) Business logic stored on a server in the form of SQL code or some other DBMS-specific procedural language." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A precompiled code routine stored within a Database Management System." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A precompiled routine or program stored within the DBMS and executable using DBMS commands." (Craig S Mullins, "Database Administration" 2nd Ed, 2012)

"A precompiled collection of SQL statements and optional control-of-flow statements stored under a name and processed as a unit. They are stored in an SQL database and can be run with one call from an application." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A database object that can contain executable database server logic. Permissions can be assigned to stored procedures in order to prevent unwanted data modifications and to provide more granular control of security." (Mark Rhodes-Ousley, "Information Security: The Complete Reference" 2nd Ed., 2013)

"A block of code that is stored in an SQL database and can be called by applications. In some systems the code is written in an extended SQL dialect, in others it is written in a standard programming language and contains Embedded SQL. Stored procedures are used for one or more of the following reasons: to improve performance; to provide greater security; or to improve manageability." (Microfocus)

"A block of procedural constructs and embedded SQL statements that is stored in a database and that can be called by name. Stored procedures allow an application program to be run in two parts, one on the client and the other on the server, so that one call can produce several accesses to the database." (IBM)

"A named PL/SQL block or Java program that Oracle Database stores in the database. Applications can call stored procedures by name." (Oracle)

"A precompiled collection of code such as SQL statements and optional control-of-flow statements stored under a name and processed as a unit. Stored procedures are stored within a database; they can be executed with one call from an application and allow user-declared variables, conditional execution, and other powerful programming features." (Microsoft)

"A precompiled collection of Transact-SQL statements that are stored under a name and processed as a unit." (Microsoft Technet)

"A stored procedure is a combination of SQL statements and control and conditional handling statements that provides an interface to the database." (Teradata)

"A stored procedure is a precompiled executable object that contains one or more Transact-SQL statements." (Microsoft) [source]

"A stored procedure is an executable object stored in a database." (Microsoft) [source]

02 January 2009

🛢DBMS: Views (Definitions)

"An alternative way of looking at the data in one or more tables. Usually created as a subset of columns from one or more tables." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"Predefined query that provides an alternative method of looking at the data in one or more underlying tables. Typically views are used to focus a users view of data and for security." (Rob Scrimger et al, "MCSE TestPrep: SQL Server 6.5 Administration", 1998)

"An alternate way of looking at data from one or more tables in the database. A view is a virtual table, usually created as a subset of columns from one or more tables, which is expressed as a SELECT statement. Depending on the definition, data in base tables can be modified through views." (Microsoft Corporation, "Microsoft SQL Server 7.0 System Administration Training Kit", 1999)

"A 'virtual table', usually referring to a database object that has been named and created with SQL's CREATE VIEW statement. Usually created for read-only purposes, it is possible to update the database through some views; as of Oracle8, it is also possible to associate an INSTEAD OF trigger with a view to allow INSERT, UPDATE, and DELETE operations on the view." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"SQL statement that creates logical copies of a table or a complete query that can be used separately in a SELECT statement. Views are semantically independent, so the separate roles of a dimension usually are implemented as views." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed., 2002)

"A database object that can be referenced the same way as a table in SQL statements. Views are defined using a SELECT statement and are analogous to an object that contains the result set of this statement." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A view is a relational database object that can be referenced and built by using SELECT statements to join data from one or more base tables. Views are similar to tables in that data can be retrieved and modified and indexes can be built." (Thomas Moore, "MCSE Training Guide 70-229: Designing and Implementing Databases with Microsoft® SQL Server™ 2000 Enterprise Edition", 2003)

"A database object that is based on a SELECT statement against one or more tables or other views in the database. A regular view does not store any data in the database; only the definition is stored. Views are also known as stored queries." (Bob Bryla, "Oracle Database Foundations", 2004)

"A relational database object that can be referenced and built by using SELECT statements to join data from one or more base tables. Views are similar to tables in that data can be retrieved and modified and indexes can be built." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"A query that is stored in the data dictionary and is rerun when called for. A view appears to a user to be a table." (Sikha S Bagui & Richard W Earp, "Learning SQL on SQL Server 2005", 2006)

"virtual representations of a table whose contents are defined by a query" (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A database object in SQL Server used to encapsulate a query. Commonly referred to as a virtual table." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance 70-444" , 2007)

"A virtual table which is derived from base table using a query." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A virtual grouping of one or more tables, often done to reduce complexity while increasing security and reliability." (Robert D. Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-In-One Desk Reference For Dummies", 2008)

"A stored SQL query from which a virtual table is created for use each time the name of the view is used." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"A virtual table consisting of columns and selected instances from other tables assembled in a systematic way." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"A presentation of a set of data from one or more physical tables as one logical table. A view can include some or all the rows and columns from each contributing table, and can be defined as the result table from a SELECT statement." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A virtual table, defined as a SQL SELECT statement, to provide a subset of data from one or more tables." (Craig S Mullins, "Database Administration" 2nd Ed, 2012)

"A virtual table that uses the result set of a saved query." (Faisal Akkawi et al, "MCSA Guide to Microsoft SQL Server 2012 - Exam #70-462", 2014)

"A database object that is created from one or more tables and can be used the same as a table. A view is a virtual table that has no storage requirements of its own." (Arie D Jones, "SQL in 24 Hours" 6th Ed., 2015)

"A virtual relation defined by the database administrator in order to keep subjects from viewing certain data." (Adam Gordon, "Official (ISC)2 Guide to the CISSP CBK 4th Ed.", 2015)

SQL Troubles

Pages