WHERE clause constraints
If your query is just a simple statement without joins of any type, then more likely the problem resides in the constraints used in the WHERE clause, at least one of them being evaluated to false. Here are several possible reasons:
- One of the hardcoded values or one of the provided parameters is not found between the values available in the table;
- Not treated NULL cases;
- Wrong use of predicates or parenthesis.
The simplest constraint evaluated to false is “0=1” or something similar, actually can be used any similar values (e.g. ‘A’=’B’, ‘ABC’=’’, etc.). In contrast “1=1” is evaluated always to true. The below query returns no records because the constraint is evaluated for false:
-- single constraint evaluated to false SELECT * FROM Production.Product WHERE 1=0
A more complex scenario is when multiple predicates and at least one equates always to true or to false, the task in such situations being to identify such constraints:
-- multiple constraints, one evaluated to false SELECT * FROM Production.Product WHERE 1=0 AND Class = 'L' AND SafetyStockLevel >100
The constraints could be grouped also using parenthesis:
-- multiple constraints with parenthesis, one evaluated to false SELECT * FROM Production.Product WHERE 1=0 AND ((Class = 'L' AND SafetyStockLevel >100) OR (Class = 'M' AND SafetyStockLevel =100))
Dealing with multiple such constraints is not a question of guessing but pure applied Boolean algebra essential in writing accurate queries. So if you are feeling that you’re not mastering such concepts, then maybe it’s a good idea to consult some material on this topic.
There could be situations in which the total outcome of a set of constraints it’s not so easy to see, in such cases the constraint could be brought in the SELECT statement. Which constraints should be tested depends also on the particularities of your query, here is an example:
--testing constraints' outcome for each record SELECT * , CASE WHEN Class = 'L' AND SafetyStockLevel >100 THEN 1 ELSE 0 END Constraint1 , CASE WHEN Class = 'M' AND SafetyStockLevel =100 THEN 1 ELSE 0 END Constraint2 , CASE WHEN (Class = 'L' AND SafetyStockLevel >100) OR (Class = 'M' AND SafetyStockLevel =100) THEN 1 ELSE 0 END Constraint3 FROM Production.Product
The constraints in which one of the members is NULL and the IS NULL or Null functions are not used, are incorrect evaluated, actually ignored, and in certain cases they might look even correct, though this depends also on the expressions used. For example the below query will return no records, while the next one will return some records. For more on the correct handling of NULLs see Null-ifying the world or similar posts on the web.
--misusing NULL-based constraints SELECT * FROM Production.Product WHERE (NOT 1=NULL) OR (1=NULL)
JOIN constraints
JOINs based on single JOIN constraints should in theory pose no problem unless the wrong attributes are used in the JOIN and that’s so easy to do especially when no documentation is available on the topic or it is incomplete. Functional or other technical specifications, physical or semantic models, metadata or other developers’ knowledge could help the developer to figure out about the relations between the various entities. On the other side it’s in human nature to make mistakes, forgetting to add one constraint, using the wrong type of join or the wrong attributes in the join constraint. In addition all the situations described above for WHERE clauses constraints apply also to join constraints, independently on whether the ANSI or non-ANSI syntax is used, the later based on writing the JOIN constraints in the WHERE clause, making the identification of the JOIN constraints not so easy to see, this being one of the reasons for which I recommend you to use ANSI syntax.
A technique I use on a daily basis in order to test my scripts is to check the changes in the number of records with the adding of each new table to the initial query. In a first phase is enough to count the number of records from the table with the lowest level of detail, and the number of records from the final query, this ignoring the WHERE constraints. For example let’s consider the following query written some time ago (most of attributes were removed for convenience):
-- Purchase Orders SELECT POD.PurchaseOrderDetailID , POD.PurchaseOrderID , POD.ProductID , POH.ShipMethodID , POH.VendorID FROM Purchasing.PurchaseOrderDetail POD JOIN Purchasing.PurchaseOrderHeader POH ON POD.PurchaseOrderID = POH.PurchaseOrderID JOIN Purchasing.ShipMethod PSM ON POH.ShipMethodID = PSM.ShipMethodID JOIN Purchasing.Vendor SMF ON POH.VendorID = SMF.VendorID JOIN Production.Product ITM ON POD.ProductID = ITM.ProductID WHERE POD.StockedQty >100 AND ITM.Class = 'L'
Usually I write the table with the lowest level of detail (in theory also with the highest number of records) first, adding the referenced table gradually, starting with the tables with the lowest number of records, thus the lookup tables will come first. This usually doesn’t affect the way the database engine processes the query (unless special techniques are used), and it allows to use kind of a systematic approach. This allows me also to test the query without making important changes. As I wrote above, the first test is done against the first table, for this commenting the other joins and eventually the WHERE constraints:
-- testing changes in the number of records - test 1 SELECT count(1) NumberRecords FROM Purchasing.PurchaseOrderDetail POD /* JOIN Purchasing.PurchaseOrderHeader POH ON POD.PurchaseOrderID = POH.PurchaseOrderID JOIN Purchasing.ShipMethod PSM ON POH.ShipMethodID = PSM.ShipMethodID JOIN Purchasing.Vendor SMF ON POH.VendorID = SMF.VendorID JOIN Production.Product ITM ON POD.ProductID = ITM.ProductID WHERE POD.StockedQty >100 AND ITM.Class = 'L' */
In the second step is enough to check the number of records returned by the whole query without the WHERE constraints:
-- testing changes in the number of records - test 2 SELECT count(1) NumberRecords FROM Purchasing.PurchaseOrderDetail POD JOIN Purchasing.PurchaseOrderHeader POH ON POD.PurchaseOrderID = POH.PurchaseOrderID JOIN Purchasing.ShipMethod PSM ON POH.ShipMethodID = PSM.ShipMethodID JOIN Purchasing.Vendor SMF ON POH.VendorID = SMF.VendorID JOIN Production.Product ITM ON POD.ProductID = ITM.ProductID /* WHERE POD.StockedQty >100 AND ITM.Class = 'L' */
If the number of records is correct then most probably the query is correct, this because there could be incorrect LEFT JOINS that are not reflected in the number of records but in the fact that the corresponding attributes are missing. So don’t rely entirely on this type of test, but take 1-2 examples for which you are sure that records must be retrieved from the LEFT JOIN, and check whether the expected values are shown. Eventually, only for testing purposes, the LEFT JOIN could be modified as a FULL JOIN to see what records are returned.
Now if there are changes in the number of records from one test to the other, then take the first query and move the comment one join further (see below), and repeat the step until the JOIN that causes the variances is identified.
-- testing changes in the number of records - test 3 SELECT count(1) NumberRecords FROM Purchasing.PurchaseOrderDetail POD JOIN Purchasing.PurchaseOrderHeader POH ON POD.PurchaseOrderID = POH.PurchaseOrderID /* JOIN Purchasing.ShipMethod PSM ON POH.ShipMethodID = PSM.ShipMethodID JOIN Purchasing.Vendor SMF ON POH.VendorID = SMF.VendorID JOIN Production.Product ITM ON POD.ProductID = ITM.ProductID WHERE POD.StockedQty >100 AND ITM.Class = 'L' */
Once the join that causes the variances is found, then might be requested then to add a new join constraint or use a LEFT JOIN instead of a FULL JOIN. For example if in the above query there are ShipMethodID with NULL values, then the query should have been written using a LEFT JOIN. On the other side, if duplicates are found, then do a grouping by using the primary key of the table with the lowest level of detail, in this case the Purchasing.PurchaseOrderDetail.
HAVING clause constraints
HAVING clause constraints behave more like WHERE clause constraints, the difference residing in the different levels where the constraints are applied, so the same technique could be applied in this case too, with the difference that the constraints need to be accommodated to support aggregate functions.
--Vendors with total purchase value >100000 SELECT POH.VendorID , SUM(POH.SubTotal) TotalSubTotal FROM Purchasing.PurchaseOrderHeader POH WHERE POH.Status IN (2,4) -- 2-Approved, 4-Complete GROUP BY POH.VendorID HAVING SUM(POH.SubTotal)>100000
Sometimes it helps to bring the HAVING constraints into the SELECT, though that offers limited functionality, therefore it’s more productive maybe to re-link the aggregated query back to the base table(s) on which the statement is based:
--Vendors with total purchase value >100000 - base records SELECT A.* , B.TotalSubTotal , CASE WHEN B.VendorID IS NOT NULL THEN 1 ELSE 0 END ConsideredFlag FROM Purchasing.PurchaseOrderHeader A LEFT JOIN ( --aggregated data SELECT POH.VendorID , SUM(POH.SubTotal) TotalSubTotal FROM Purchasing.PurchaseOrderHeader POH WHERE POH.Status IN (2,4) -- 2-Approved, 4-Complete GROUP BY POH.VendorID HAVING SUM(POH.SubTotal)>100000 ) B ON A.VendorID = B.VendorID WHERE A.Status IN (2,4) -- 2-Approved, 4-Complete
Also this technique could prove to be limited, though I found it useful in several cases, especially when a complicated formula is included in the aggregated function. The same query could be rewritten with a window aggregate function introduced with SQL Server 2005.
--Vendors with total purchase value >100000 - window aggregate function SELECT POH.* , SUM(POH.SubTotal) OVER (PARTITION BY POH.VendorID) TotalSubTotal , CASE WHEN SUM(POH.SubTotal) OVER (PARTITION BY POH.VendorID)>100000 THEN 1 ELSE 0 END ConsideredFlag FROM Purchasing.PurchaseOrderHeader POH WHERE POH.Status IN (2,4) -- 2-Approved, 4-Complete
When working with amounts, quantities or other number values, it could be useful to check the total amount/quantity based on the query logic against the total amount/quantity based on the base table.
No comments:
Post a Comment