SQL Troubles

26 October 2008

GSCM: Kanban (Definitions)

"In lean cellular manufacturing, a visual device, such as a card, floor space (kanban square), or production bin, which communicates to a cell that additional materials or products are demanded from the subsequent cell." (Leslie G Eldenburg & Susan K Wolcott, "Cost Management" 2nd Ed., 2011)

"A card-based techniques for authorizing the replenishment of materials." (Daryl Powell, "Integration of MRP Logic and Kanban Shopfloor Control", 2014)

"A just-in-time technique that uses kanban cards to indicate when a production station needs more parts. When a station is out of parts (or is running low), a kanban card is sent to a supply station to request more parts." (Rod Stephens, "Beginning Software Engineering", 2015)

"A note, card, or signal, a Kanban used to trigger a series of processes, usually downstream in the supply chain, in order complete tasks, products, and/or services. As part of a workflow management systems, timely Kanbans allow for efficient operations that enable agile, just-in-time (JIT), and lean philosophies to work." (Alan D Smith, "Lean Principles and Optimizing Flow: Interdisciplinary Case Studies of Best Business Practices", 2019)

"Agile method to manage work by limiting work in progress. Team members pull work as capacity permits, rather than work being pushed into the process when requested. Stimulates continuous, incremental changes. Aims at facilitating change by minimizing resistance to it." (Jurgen Janssens, "Managing Customer Journeys in a Nimble Way for Industry 4.0", 2019)

"This tool is used in pull systems as a signaling device to trigger action. Traditionally it used cards to signal the need for an item. It can trigger the movement, production, or supply of a unit in a production chain." (Parminder Singh Kang et al, "Continuous Improvement Philosophy in Higher Education", 2020)

"A signal that communicates a requirement for a quantity of product." (Microsoft, "Dynamics for Finance and Operations Glossary")

"A signaling device that gives instruction for production or conveyance of items in a pull system. Can also be used to perform kaizen by reducing the number of kanban in circulation, which highlights line problems." (Lean Enterprise Institute)

25 October 2008

GSCM: Supply Chain Management (Definitions)

"The practice of designing and optimizing supply chain business processes to provide superior service to those customers who drive the bulk of one’s profit." (Steve Williams & Nancy Williams, "The Profit Impact of Business Intelligence", 2007)

"The management of business units in the provision of products and services. It spans the movement and storage of raw materials, work-in-process inventory, and finished goods from point-of-origin to point-of-consumption." (Tony Fisher, "The Data Asset", 2009)

"Software tools or modules used in the planning, scheduling, and control of supply chain transactions (spanning raw materials to finished goods from point of origin to point of consumption), managing supplier relationships, and controlling associated business processes." (Janice M Roehl-Anderson, "IT Best Practices for Financial Managers", 2010)

"To provision products or services to a network of interconnected businesses." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"The management of all of the activities along the supply chain, from suppliers, to internal logistics within a company, to distribution, to customers. This includes ordering, monitoring, and billing." (Linda Volonino & Efraim Turban, "Information Technology for Management 8th Ed", 2011)

"The process of ensuring optimal flow of inputs and outputs." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"In basic terms, supply chain is the system of organizations, people, activities, information and resources involved in moving a product or service from supplier to customer. The configuration and management of supply chain operations is a key way companies obtain and maintain a competitive advantage." (Alan D Smith, "Lean Principles and Optimizing Flow: Interdisciplinary Case Studies of Best Business Practices", 2019)

"Supply chain management (SCM) refers to the processes of creating and fulfilling demands for goods and services. It encompasses a trading partner community engaged in the common goal of satisfying end customers." (Gartner)

24 October 2008

GSCM: Supply Chain (Definitions)

"Fulfillment process from customer purchase through manufacturing, factory, raw material, and component supplier." (Timothy J Kloppenborg et al, "Project Leadership", 2003)

"The network of suppliers that provides raw materials, components, subassemblies, subsystems, software, or complete systems to your company." (Clyde M Creveling, "Six Sigma for Technical Processes: An Overview for R Executives, Technical Leaders, and Engineering Managers", 2006)

"The supply chain refers to the processes and methods supporting the physical existence of a product from the procurement of materials through the production, storage (creating inventory), and movement (logistics) of the product into its chosen distribution channels." (Steven Haines, "The Product Manager's Desk Reference", 2008)

"A pipeline composed of multiple companies that perform any of the following functions: procurement of materials, transformation of materials into intermediate or finished products, distribution of finished products to retailers or customers, recycling or disposal in a landfill." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed, 2011)

"Flow of resources from the initial suppliers (internal or external) through the delivery of goods and services to customers and clients. (510, 646)" (Leslie G Eldenburg & Susan K Wolcott, "Cost Management" 2nd Ed, 2011)

"The optimal flow of product from site of production through intermediate locations to the site of final use." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The people and processes involved in the production and distribution of goods or services. " (DK, "The Business Book", 2014)

"The channel of distribution that enables products to be delivered from the supplier to the final buyer."(Gökçe Ç Ceyhun, "An Assessment for Classification of Distribution Network Design", 2020)

"A system of organizations, people, activities, information, and resources, possibly international in scope, that provides products or services to consumers." (CNSSI 4009-2015)

"Linked set of resources and processes between multiple tiers of developers that begins with the sourcing of products and services and extends through the design, development, manufacturing, processing, handling, and delivery of products and services to the acquirer." (NIST SP 800-37)

"The network of retailers, distributors, transporters, storage facilities, and suppliers that participate in the sale, delivery, and production of a particular product." (NIST SP 800-98)

28 September 2008

W3: Semantic Web (Definitions)

"The Web of data with meaning in the sense that a computer program can learn enough about what the data means to process it." (Tim Berners-Lee, "Weaving the Web", 1999)

"An evolving, collaborative effort led by the W3C whose goal is to provide a common framework that will allow data to be shared and re-used across various applications as well as across enterprise and community boundaries." (J P Getty Trust, "Introduction to Metadata" 2nd Ed, 2008)

"Communication protocols and standards that would include descriptions of the item on the Web such as people, documents, events, products, and organizations, as well as, relationship between documents and relationships between people." (Craig F Smith & H Peter Alesso, "Thinking on the Web: Berners-Lee, Gödel and Turing", 2008)

"The Web of data with meaning in the sense that a computer program can learn enough about what the data means to process it. The principle that one should represent separately the essence of a document and the style is presented." (Craig F Smith & H Peter Alesso, "Thinking on the Web: Berners-Lee, Gödel and Turing", 2008)

"A machine-processable web of smart data, [where] smart data is data that is application-independent, composeable, classified, and part of a larger information ecosystem (ontology)." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"An evolving extension of the Web in which Web content can be expressed not only in natural language but also in a form that can be understood, interpreted, and used by intelligent computer software agents, permitting them to find, share, and integrate information more easily." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"The next-generation Internet in which all content is tagged with semantic tags defined in published ontologies. Interlinking these ontologies will allow software agents to reason about information not directly connected by document creators." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"is a term coined by World Wide Web Consortium (W3C) director Sir Tim Berners-Lee. It describes methods and technologies to allow machines to understand the meaning - or 'semantics'- of information on the World Wide Web." (Jingwei Cheng et al, "RDF Storage and Querying: A Literature Review", 2016)

"The vision of a Semantic Web world builds upon the web world, but adds some further prescriptions and constraints for how to structure descriptions. The Semantic Web world unifies the concept of a resource as it has been developed in this book, with the web notion of a resource as anything with a URI. On the Semantic Web, anything being described must have a URI. Furthermore, the descriptions must be structured as graphs, adhering to the RDF metamodel and relating resources to one another via their URIs. Advocates of Linked Data further prescribe that those descriptions must be made available as representations transferred over HTTP." (Robert J Glushko, "The Discipline of Organizing: Professional Edition" 4th Ed., 2016)

"A collaborative effort to enable the publishing of semantic machine-readable and shareable data on the Web." (Panos Alexopoulos, "Semantic Modeling for Data", 2020)

16 September 2008

W3: Cyberspace (Definitions)

"A term used to describe the nonphysical, virtual world of computers." (Andy Walker, "Absolute Beginner’s Guide To: Security, Spam, Spyware & Viruses", 2005)

"A metaphoric abstraction for a virtual reality existing inside computers and on computer networks." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The online world of computer networks where people can interact with others without physically being with them. People commonly interact with cyberspace via the Internet." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"The interdependent network of information technology infrastructures, which includes the Internet, telecommunications networks, computer systems, and embedded processors and controllers." (Olivera Injac & Ramo Šendelj, "National Security Policy and Strategy and Cyber Security Risks", 2016)

"A complex hyper-dimensional space involving the state of many mutually dependent computer and network systems with complex and often surprising properties as compared to physical space." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"Artifacts based on or dependent on computer and communications technology; the information that these artifacts use, store, handle, or process; and the interconnections among these various elements." (William Stallings, "Effective Cybersecurity: A Guide to Using Best Practices and Standards", 2018)

"Refers to a physical and non-physical terrain created by and/or composed of some or all of the following: computers, computer systems, networks, and their computer programs, computer data, content data, traffic data, and users." (Thokozani I Nzimakwe, "Government's Dynamic Approach to Addressing Challenges of Cybersecurity in South Africa", 2018)

"Cyberspace, is supposedly 'virtual' world/network created by links between computers, Internet-enabled devices, servers, routers, and other components of the Internet’s infrastructure." (Sanjeev Rao et al, "Online Social Networks Misuse, Cyber Crimes, and Counter Mechanisms", 2021)

31 August 2008

💎SQL Server: ROWCOUNT in action 🆕

Especially when working with big tables, Query Analyzer by default does not show the output until the last record has been fetched. This can be time- and resource-consuming, which is why I’ve appreciated that TOAD and SQL Developer fetch only a limited number of records. Now I see that the same can be done starting with SQL Server 2005 by changing the ROWCOUNT setting from the Query/Query Options menu.

Query Options under SQL Server 2008

The same can be achieved by running the command: SET ROWCOUNT <number of records>; Of course, one may limit the number of records returned by a query using the TOP clause in SQL Server or ROWNUM in Oracle, though I don’t always find that handy; it depends on the case. There are also technical differences between the two approaches: SQL Server Books Online recommends using TOP with SELECT over SET ROWCOUNT with regard to scope and query optimization; in this context, however, only the latter makes sense:

"As a part a SELECT statement, the query optimizer can use the value of expression in the TOP clause as part of generating an execution plan for a query. Because SET ROWCOUNT is used outside a statement that executes a query, its value cannot be used to generate a query plan for a query."

Notes:
1. Do not confuse SET ROWCOUNT with the @@ROWCOUNT function, which returns the number of rows affected by the last statement.
2. Some of us list all the records just to see how many records a query returns, though that’s not advisable at all!
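The difference between the two approaches can be sketched as below. This is only an illustration: dbo.Orders is a hypothetical table name, and the limit of 100 is arbitrary.

```sql
-- Assumption: dbo.Orders is a hypothetical large table used only for illustration.

-- Session-level limit: every following statement stops after 100 rows.
SET ROWCOUNT 100;

SELECT *
FROM dbo.Orders;

-- @@ROWCOUNT reports the rows affected by the previous statement (at most 100 here).
SELECT @@ROWCOUNT AS RowsReturned;

-- Reset the limit, otherwise it keeps applying to later statements in the session.
SET ROWCOUNT 0;

-- Statement-level limit: only this query is affected, and the optimizer can use the value.
SELECT TOP (100) *
FROM dbo.Orders;

-- To find out how many records a query returns, count them instead of listing them.
SELECT COUNT(*) AS NumberOfRecords
FROM dbo.Orders;
```

Setting the value back to 0 matters because SET ROWCOUNT stays in effect for the rest of the session, unlike TOP, which applies to a single statement.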

💎SQL Reloaded: AdventureWorks requires FILESTREAM enabled

Surprises, surprises, surprises, the programmers’ world is full of them! Just when you say that everything is OK, you discover that something went wrong. I was expecting to have the Adventure Works database installed, though I hadn’t checked that. I realized today that it was missing, so I tried to reinstall it, this time enabling the "Restore AdventureWorks DBs" feature, though I got another nice error:

Setup failed for MSSQLSERVER. The following features are missing: FILESTREAM Fix the problems and re-run setup.

Guy Burstein wrote in his blog that FILESTREAM support can be enabled using the following SQL command:

 exec [dbo.sp_filestream_configure] @enable_level = 3;

I tried that and another error came in:

Msg 2812, Level 16, State 62, Line 1 Could not find stored procedure 'sp_filestream_configure'

Checking my local installation of SQL Server Books Online, I found no trace of the sp_filestream_configure stored procedure, but I found that I can enable FILESTREAM support using the sp_configure stored procedure as below:

EXEC sp_configure filestream_access_level, 2
RECONFIGURE
GO

Once I executed the 3 lines together, I got the following confirmation message which, amusingly, still recommends that I run the RECONFIGURE statement even though I did. Anyway, better redundant information than none…

Configuration option 'filestream access level' changed from 2 to 2. Run the RECONFIGURE statement to install.
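As far as I can tell, the message above is printed by sp_configure itself, before RECONFIGURE gets to run, which would explain the seemingly redundant advice. A minimal sketch for checking whether the setting is really in effect is given below; it uses the documented, quoted spelling of the option name and the sys.configurations catalog view.

```sql
-- Documented spelling of the option name, passed as a quoted string.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- value = configured value, value_in_use = value currently in effect.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'filestream access level';

-- Effective level, which also reflects the FILESTREAM setting of the instance itself.
SELECT SERVERPROPERTY('FilestreamEffectiveLevel') AS FilestreamEffectiveLevel;
```

If value and value_in_use match, the change has been applied and no further RECONFIGURE is needed.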

Happy coding!

30 August 2008

⌛SQL Reloaded: No records returned by queries (Checklist)

No records returned by a query even if there should be results? Usually I use the following checklist:

1. check if the tables contain data. Silly but effective, especially in Oracle APPS, in which some tables were deprecated and replaced by tables with similar names (PA_PROJECTS_ALL vs. PA_PROJECTS), though that could happen in other environments too;

2. check if the JOIN syntax is correct;

3. check if one of the columns used in the JOIN has only NULL values;

4. check if the constraints used in the WHERE clause make sense (e.g. wrong values or syntax);

5. for Oracle-flavored queries, check if the WHERE clause contains a column not qualified with the table name or alias while the column is available in more than one table used in the query. This Oracle bug is really dangerous when doing fast query checks!

6. for Oracle (APPS), check whether the query or view uses the USERENV function with the LANG or LANGUAGE text parameter, normally a constraint like: TABLE1.LANGUAGE = USERENV('LANG').

The problem with such queries comes when the user’s system language is other than the one expected, and thus the query’s output might not be as expected. Usually it is preferable to hardcode the value, when possible: TABLE1.LANGUAGE = 'US'

Note: Actually, the tools you use to run a query could also create issues; for example, a query run under Oracle’s SQL Developer was not returning records even though it did in TOAD. The problem was solved by installing a newer SQL Developer version.
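Some of the checks above can be run as quick queries. The sketch below uses placeholder names (table1, table2, column1, column2, language) that stand for whatever objects the failing query uses.

```sql
-- Assumption: table1, table2, column1, column2 and language are placeholder names.

-- Check 1: do the tables contain data at all?
SELECT COUNT(*) AS total_rows
FROM table1;

-- Check 3: COUNT(column) skips NULLs, so 0 non-null keys means the join can never match.
SELECT COUNT(*) AS total_rows
, COUNT(column1) AS non_null_keys
FROM table1;

-- Check 5: qualify every column with its table alias so nothing is resolved silently.
SELECT A.column1, B.column2
FROM table1 A JOIN table2 B
 ON A.column1 = B.column2
WHERE A.language = 'US';

-- Check 6 (Oracle): see which language code the current session reports.
SELECT USERENV('LANG') AS session_language
FROM dual;
```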

⌛ Oracle Troubleshooting: ANSI 92 JOIN syntax error

Lately I’ve been working a lot with Oracle APPS, doing mainly ad-hoc reporting. One of my nightmares is an Oracle bug related to ANSI 92 syntax:

“ORA-01445: cannot select ROWID from, or sample, a join without a key-preserved table”

Unfortunately, even if the bug was solved by Oracle, it seems the update was missed on some servers and the bug haunts my queries almost on a daily basis.

Having a SQL Server background, and for the sake of code clarity, I prefer the ANSI 92 JOIN syntax:

SELECT A.column1, B.column2
FROM table1 A JOIN table2 B
 ON A.column1 = B.column2

instead of the old-fashioned syntax:

SELECT A.column1, B.column2
FROM table1 A , table2 B
WHERE A.column1 = B.column2

In theory the two queries should provide the same output and have, hopefully, similar performance. The problem with ANSI 92 syntax is that, on some Oracle installations, when the number of joins exceeds a certain limit, usually greater than 7, the above error is thrown.

What one can do is reduce the number of joins to the main table by restructuring the query and grouping multiple tables into subqueries, which are then joined to the main table. For tables from which only one column is returned, one can move the lookup into the SELECT list as a subquery.
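The restructuring described above could look like the sketch below. All table and column names are placeholders, and whether it actually avoids the error depends on the Oracle installation, so treat it as a starting point rather than a guaranteed fix.

```sql
-- Assumption: all table and column names are placeholders.
SELECT A.column1
, B.column2
, D.column4
-- a table contributing a single column is moved into the SELECT list as a scalar subquery
, (SELECT E.description
   FROM table5 E
   WHERE E.column5 = A.column5) AS description
FROM table1 A
  JOIN table2 B
    ON A.column1 = B.column1
  -- several tables grouped into one inline view, joined once to the main table
  JOIN (
    SELECT C.column1
    , F.column4
    FROM table3 C
      JOIN table4 F
        ON C.column3 = F.column3
  ) D
    ON A.column1 = D.column1
```

Note that the scalar subquery must return at most one row per row of the main table, and that it returns NULL when there is no match, so it behaves like a LEFT JOIN rather than an inner one.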

Happy coding!

💠🛠️SQL Server: Administration (Part III: Troubleshooting Adventure Works installation error on Vista)

I tried to install the Adventure Works OLTP & DW databases from CodePlex on SQL Server 2008 RTM, though I got an error:

“The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2738.”

Initially I thought that the problem was caused by existing Adventure Works installations made on a previous CTP version of SQL Server 2008, which I had forgotten to remove when I uninstalled the CTP version. Totally wrong! Doing a little research, I first found a post on the CodePlex Discussion forum mentioning that the problem could be caused by the VBScript runtime because, as the Toms’Tricks blog highlights, VBScript and JScript are not registered on Windows Vista. Wonderful! I just ran the regsvr32 vbscript.dll command and it worked! Another situation in which regsvr32 saved the day!

I wonder why I didn’t get the same problem when I previously installed the Adventure Works database on the CTP version! Could it be because of Windows Vista SP1 changes (I installed Windows Vista SP1 after the SQL Server CTP)?

💠🛠️SQL Server: Administration (Part II: Troubleshooting Microsoft SQL Server 2008 installation error)

This week I tried to install SQL Server 2008; however, I got the following error:

“A previous release of Microsoft Visual Studio 2008 is installed on this computer. Upgrade Microsoft Visual Studio 2008 to the SP1 before installing SQL Server 2008.”

It’s true that I had previously installed Visual Studio 2008, though once I did that I checked for updates and installed SP1 too. A quick search on Google pointed me to an article on the Microsoft Help and Support website: Visual Studio 2008 SP1 may be required for SQL Server 2008 installations. It didn’t make sense; in the end I installed the server but disabled the installation of the following components:

• Management Tools (Basic or Complete)

• Integration Services

• Business Intelligence Development Studio

The server was installed without problems, so I tried to install the remaining components, getting the same error as above. I had to stop at that point; today, giving more thought to the problem, I realized that the error could be caused by the Microsoft Visual Studio 2008 Express edition, which I had installed a few months back. Instead of uninstalling Microsoft Visual Studio 2008, it looked easier to uninstall the Express version, and once I did that, I managed to install the remaining components. Actually, I had checked beforehand whether there is an SP1 for Microsoft Visual Studio 2008 Express and arrived at the Microsoft Visual Studio 2008 Express Editions with SP1 page, though I remembered that I would have to install Web Developer, Visual Basic and C# 2008 separately, so I presumed it would be easier to uninstall the existing versions and then try to install the remaining SQL Server components. I haven’t tried to install the Express editions with SP1, as I now have the Professional edition.

04 August 2008

Application Architecture: Enterprise Service Bus (Definitions)

"A layer of middleware that enables the delivery and sharing of services across and between business applications. ESBs are typically used to support communication, connections, and mediation in a service-oriented architecture." (Evan Levy & Jill Dyché, "Customer Data Integration", 2006)

"The infrastructure of a SOA landscape that enables the interoperability of services. Its core task is to provide connectivity, data transformations, and (intelligent) routing so that systems can communicate via services. The ESB might provide additional abilities that deal with security, reliability, service management, and even process composition. However, there are different opinions as to whether a tool to compose services is a part of an ESB or just an additional platform to implement composed and process services outside the ESB." (Nicolai M Josuttis, "SOA in Practice", 2007)

"A middleware software architecture construct that provides foundational services for more complex architectures via an event-driven and standards-based messaging engine (the bus). An ESB generally provides an abstraction layer on top of an implementation of an enterprise messaging system. |" (Alex Berson & Lawrence Dubov, "Master Data Management and Data Governance", 2010)

"The infrastructure of an SOA landscape that enables the interoperability of services. Its core task is to provide connectivity, data transformations, and routing so that systems can communicate via services." (David Lyle & John G Schmidt, "Lean Integration", 2010)

"A software layer that provides data between services on an event-driven basis, using standards for data transmission between the services." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A packaged set of middleware services that are used to communicate between business services in a secure and predictable manner." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

22 February 2008

Business Applications: Customer Relationship Management (Definitions)

"Operational and analytic processes that focus on better understanding and servicing customers in order to maximize mutually beneficial relationships with each customer." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed., 2002)

"a popular DSS application designed to streamline customer and/or corporate relationships." (William H Inmon, "Building the Data Warehouse", 2005)

"A database system containing information on interactions with customers." (Glenn J Myatt, "Making Sense of Data", 2006)

"The infrastructure that enables the delineation of and increase in customer value, and the correct means by which to increase customer value and motivate valuable customers to remain loyal, indeed, to buy again." (Jill Dyché & Evan Levy, Customer Data Integration, 2006)

"The tracking and management of all the organization’s interactions with its customers in order to provide better service, encourage customer loyalty, and increase the organization’s long-term profit per customer." (Steve Williams & Nancy Williams, "The Profit Impact of Business Intelligence", 2007)

"A strategy devoted to the development and management of close relationships between customers and the company. In many cases, CRM is referred to as the automation tool that helps bring about this strategy." (Steven Haines, "The Product Manager's Desk Reference", 2008)

"Software intended to help you run your sales force and customer support." (Judith Hurwitz et al, "Service Oriented Architecture For Dummies" 2nd Ed., 2009)

"The technology and processes used to capture the details of interactions with customers and analyze that data to improve customer interaction, assess customer value, and build value and further loyalty." (Tony Fisher, "The Data Asset", 2009)

"A set of technologies and business processes designed to understand a customer, improve customer experience, and optimize customer-facing business processes across marketing, sales, and servicing channels." (Alex Berson & Lawrence Dubov, "Master Data Management and Data Governance", 2010)

"Refers to the set of procedures and computer applications designed to manage and improve customer service in an enterprise. Data warehousing, with integrated data about each customer, is eminently suitable for CRM." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"This is a packaged solution that delivers an end-to-end solution around contacting, understanding, and serving particular customer needs." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"Establishing relationships with individual customers and then using that information to treat different customers differently. Customer buying profiles and churn analysis are examples of decision support activities that can affect the success of customer relationships. Effective CRM is dependent on high quality master data about individuals and organizations (customer data integration)." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The entire process of maximizing the value proposition to the customer through all interactions, both online and traditional. Effective CRM advocates one-to-one relationships and participation of customers in related business decisions." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"Software designed to help you run your sales force and customer support operations." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"The management of current and future customer interactions with a business. This can include sales support, warranty and technical support activity, Internet website, marketing, and product advertising." (Kenneth A Shaw, "Integrated Management of Processes and Information", 2013)

"Customer relationship management is a model for managing a company’s interactions with current and future customers. It involves using technology to organize, automate, and synchronize sales, marketing, customer service, and technical support." (Keith Holdaway, "Harness Oil and Gas Big Data with Analytics", 2014)

"System that manages the customer-related data along with their past, present, and future interactions with the organization." (Hamid R Arabnia et al, "Application of Big Data for National Security", 2015)

"A set of tools, techniques, and methodologies for understanding the needs and characteristics of customers in order to better serve them." (Robert M Grant, "Contemporary Strategy Analysis" 10th Ed., 2018)

"CRM is the practice of helping customers manage sales, service, and marketing processes effectively and efficiently so that they can grow their business and provide excellent customer service." (Srini Munagavalasa, "The Salesforce Business Analyst Handbook", 2022)

13 January 2008

💎SQL Reloaded: Advices on SQL logic split in Web Applications

Yesterday I remembered my first two small fights on SQL-related topics, though in both situations I had to temporarily give up my opinions for the sake of armistice. It’s understandable, we can all make mistakes; unfortunately, what we know hurts us more than what we don’t know. It’s amusing when I think now about the two issues under discussion, though back then it was a little disturbing. What was it about?! Actually, both issues are related to web applications, my first “professional encounter” with programming.

Issue 1 – using JOINs or backend vs middle tier processing

JOINs are a powerful feature of SQL for combining related data from multiple tables in only one query, this coming with a little (or more) overhead on the database server side.
Web applications make use of a lot of data access operations, data being pulled from a database each time a user requests a page, at least when the page needs data from the database(s) or executes commands on it, the whole CRUD (Create/Read/Update/Delete) gamut. That can become costly in time, depending on the requirements and on how data access was architected. The target is to pull the smallest chunk of data possible (rule 1), with a minimum of trips to the database (rule 2).

Supposing that we need Employees data from a database for a summary screen with all employees, it could contain First Name, Last Name, Department and Contact information: City, Country, Email Address and Phone Number. Normally the information could be stored in 4 tables (Employees, Departments, Addresses and Countries), as in the diagram below.

The easiest and best way to pull the needed Employee data is to do a JOIN between the tables:

-- Employee details
SELECT E.EmployeeId
, E.FirstName 
, E.LastName 
, D.Department 
, A.City 
, C.Country 
, A.Phone 
, E.EmailAddress
FROM dbo.Employees E
     JOIN dbo.Departments D
       ON E.DepartmentId = D.DepartmentId
     -- Addresses are matched to their Employee
     JOIN dbo.Addresses A
       ON E.EmployeeId = A.EmployeeId
     JOIN dbo.Countries C
       ON A.CountryId = C.CountryId

Most likely, two or more employees will share the same country or department, resulting in “duplication” of small pieces of information within the whole data set, which contradicts rule 1. Smaller chunks of data can be pulled by targeting the content of one table at a time; that means we first have to pull all Employees or the ones matching a set of constraints, then all the Departments or only the ones for which an Employee was returned, and the same with Addresses and Countries. In the end we will have 4 queries and the same number of roundtrips (or more). In the web page, the code will have to follow the steps below:

Step 1: Pull the Employee data matching the query:

SELECT E.EmployeeID
, E.DepartmentID
, E.FirstName
, E.LastName
, E.EmailAddress
FROM Employees E
WHERE …

Step 2: Build the (distinct) list of Department IDs and a (distinct) list of Employee IDs.

Step 3: Pull the Department data matching the query:

SELECT D.DepartmentID
, D.Department
FROM Departments D
WHERE DepartmentID IN (<list of Department IDs>)

Step 4: Pull the Address data matching the query:

SELECT A.EmployeeID
, A.CountryID
, A.City
, A.Phone
FROM Addresses A
WHERE EmployeeID IN (<list of Employee IDs>)

Step 5: Build the (distinct) list of Country IDs.

Step 6: Pull the Country data matching the query:

SELECT C.CountryID
, C.Country
FROM Countries C
WHERE CountryID IN (<list of Country IDs>)

And if this doesn’t look like overhead to you, take into account that for each Employee the code has to search for the right Department in the set of data returned in Step 3, and the same goes for Addresses and Countries. It’s exactly what the database server does, only done on the web server, which has no built-in capabilities for data matching.
To overcome the problems raised by matching, somebody could execute for each Employee returned in Step 1 a query like the one defined in Step 4, limited to the respective Employee, thus adding as many new roundtrips as there are Employees. Quite a monster, isn’t it? Please don’t do something like this!

It’s true that we always need to strike a balance between the minimum of data and the minimum of roundtrips to a web server, though we also have to take into account the overhead created by going to extremes, and implement the logic on the right tier. So, do the data matching as much as possible on the database server, because it was designed for that, and do, when possible, only the data enrichment (e.g. formatting) on the web server.

In theory the easiest way of achieving something is the best as long as the quality remains the same, so try to avoid writing expensive code that’s hard to write, maintain and debug!

Issue 2: LEFT vs INNER JOINs

Normally each Employee should be linked to a Department and have at least one Address, and the Address should be linked to a Country. That can be enforced at database and application level, though it’s not always the case. There could be Employees that are not assigned to a Department, or that have no Address; in such cases, instead of an INNER JOIN (the plain JOIN used in the first query) you have to consider a LEFT or, depending on the case, a RIGHT (OUTER) JOIN. So, I’ve rewritten the first query, this time using LEFT JOINs.

-- Employee details (with LEFT JOINs)
SELECT E.EmployeeId
, E.FirstName 
, E.LastName 
, D.Department 
, A.City 
, C.Country 
, A.Phone 
, E.EmailAddress
FROM dbo.Employees E
     LEFT JOIN dbo.Departments D
       ON E.DepartmentId = D.DepartmentId
     LEFT JOIN dbo.Addresses A
       ON E.EmployeeId = A.EmployeeId
     LEFT JOIN dbo.Countries C
       ON A.CountryId = C.CountryId

Important:
Don’t use LEFT JOINs unless the business case requires it, and don’t overuse them, as they can come with performance penalties!
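
To see the difference between the two join types on a small scale, here is a minimal sketch. The table variables and their sample rows are made up for illustration and are not part of the schema used above.

```sql
-- Hypothetical sample data (assumption, for illustration only)
DECLARE @Departments TABLE (DepartmentId int, Department varchar(50));
DECLARE @Employees TABLE (EmployeeId int, FirstName varchar(50), DepartmentId int NULL);

INSERT INTO @Departments (DepartmentId, Department)
SELECT 1, 'Sales';

INSERT INTO @Employees (EmployeeId, FirstName, DepartmentId)
SELECT 1, 'Anna', 1
UNION ALL
SELECT 2, 'Boris', NULL; -- not assigned to a Department

-- INNER JOIN: only Anna is returned, Boris has no matching Department
SELECT E.EmployeeId
, E.FirstName
, D.Department
FROM @Employees E
     JOIN @Departments D
       ON E.DepartmentId = D.DepartmentId;

-- LEFT JOIN: both Employees are returned, Boris with NULL as Department
SELECT E.EmployeeId
, E.FirstName
, D.Department
FROM @Employees E
     LEFT JOIN @Departments D
       ON E.DepartmentId = D.DepartmentId;
```

If the second result set is what the business case asks for, the LEFT JOIN is justified; otherwise the INNER JOIN is the safer default.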

12 January 2008

💎SQL Reloaded: SQL Server and Excel Data

While looking for information about SQL Server 2008, I stumbled over Bob Beauchemin’s blog. The first posting I read, “Another use for SQL Server 2008 row constructors”, demonstrates a new use of the VALUES clause, which allows inserting multiple rows into a table with a single query, or grouping a set of values into a table and using them as the source for a JOIN. I was waiting for this feature to be available under SQL Server 2005, though better late than never!

The feature is useful when you need to limit the output of a query based on matrix (tabular) values coming from an Excel or text file. To exemplify this I’ll use the HumanResources.vEmployee view from the AdventureWorks database that comes with SQL Server 2005; you can download it from CodePlex in case you don’t have it for SQL Server 2008.
Let’s suppose that you have an Excel file with Employees for which you need contact information from a table available on SQL Server. You have the FirstName, MiddleName and LastName, and you need the EmailAddress and Phone. In SQL Server 2008 you can do that by creating a temporary table-like structure on the fly using the VALUES clause, and then using it in JOIN or INSERT statements.

The heart of the query is a derived table B(FirstName, MiddleName, LastName) built with the VALUES clause, each row in its definition being specified as a comma-delimited triple of the form ('FirstName', 'MiddleName', 'LastName').
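
A sketch of such a query could look like the one below. It assumes the SQL Server 2005 version of HumanResources.vEmployee (with the Phone and EmailAddress columns), and the names in the VALUES list are placeholders standing in for the rows coming from Excel.

```sql
-- Contact details for the Employees listed in the VALUES construct
SELECT A.FirstName
, A.MiddleName
, A.LastName
, A.EmailAddress
, A.Phone
FROM HumanResources.vEmployee A
     JOIN ( -- rows pasted from Excel (placeholder values)
       VALUES ('Guy', 'R', 'Gilbert')
       , ('Kevin', 'F', 'Brown')
     ) B(FirstName, MiddleName, LastName)
       ON A.FirstName = B.FirstName
      AND A.MiddleName = B.MiddleName
      AND A.LastName = B.LastName;
```

Note that a plain equality on MiddleName won’t match the rows in which MiddleName is NULL, so such Employees need separate handling (e.g. by comparing IsNull(A.MiddleName, '') with an empty string on the Excel side).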

The construct is time-consuming to build manually, especially when the number of rows is considerably big, though you can get it in Excel with the help of an easy formula.

The formula from column D is = ", ('" & A2 & "','" & B2 & "','" & C2 & "')" and it can be applied to the other rows too. You then just need to copy the data from column D to SQL Server and use them in the query with a few small changes (e.g. removing the leading comma from the first row). Of course, you can also create a custom function (macro) in Excel to obtain the whole structure in a single cell.

You can do something similar under older versions of SQL Server (or other databases) using a simple trick: concatenate the values from each column, row by row, using a delimiter like “/”, “~”, “|” or any other character, though you have to be sure that the delimiter isn’t found in your data sources (Excel and table). Using “/” the formula is = ", '" & A2 & "/" & B2 & "/" & C2 & "'".

Then you have to apply the same trick to the table, concatenating its columns with the same delimiter and comparing the result against the list of concatenated values.
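
As a sketch, under the same assumptions as before (the SQL Server 2005 version of HumanResources.vEmployee and placeholder names), the query could take the following form:

```sql
-- Contact details for the Employees matching the concatenated values
SELECT A.FirstName
, A.MiddleName
, A.LastName
, A.EmailAddress
, A.Phone
FROM HumanResources.vEmployee A
WHERE A.FirstName + '/' + IsNull(A.MiddleName, '') + '/' + A.LastName IN (
  'Guy/R/Gilbert' -- values pasted from Excel (placeholders)
, 'Kevin/F/Brown');
```

The IsNull is there because concatenating a NULL MiddleName would turn the whole expression into NULL; with it, an empty cell in Excel (e.g. 'FirstName//LastName') still finds its match.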

This technique runs into small difficulties when:
• The data used for searching are of a type other than the string-derived data types; that can be overcome by casting the values to string before concatenation.
• The string values contain leading or trailing spaces, so it’s better to trim the values using the LTrim and RTrim functions.
• The values from the two sources are slightly different, for example diacritics vs. their standard Latin character equivalents, in which case the values need to be transformed to the same format.
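
The three points above can be sketched as follows. The table variable and its rows are made up for illustration, and the accent-insensitive collation is just one possible way of bringing the values to the same format, not necessarily the one that fits your data.

```sql
-- Hypothetical sample data (assumption, for illustration only)
DECLARE @Contacts TABLE (ContactId int, LastName nvarchar(50));

INSERT INTO @Contacts (ContactId, LastName)
SELECT 1, N' Müller ' -- spaces at both ends, diacritics
UNION ALL
SELECT 2, N'Brown';

SELECT C.ContactId
, C.LastName
FROM @Contacts C
WHERE ( Cast(C.ContactId as varchar(10)) -- numeric value cast to string
      + '/'
      + LTrim(RTrim(C.LastName)) -- spaces removed from both ends
      ) COLLATE Latin1_General_CI_AI -- accent-insensitive, so that 'Müller' can match 'Muller'
   IN ('1/Muller', '2/Brown');
```

The same transformations have to be mirrored on the Excel side (e.g. with the TRIM function), otherwise the two concatenated values won’t line up.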