Showing posts with label date functions. Show all posts

06 March 2025

💎🤖SQL Reloaded: Copilot Stories (Part X: Working with the Left Function)

Left is one of the more mundane string functions in SQL Server, returning the leftmost part of a character string with the specified number of characters. Together with Substring, Right and a few other functions, Left helps retrieve the subparts of a string (see previous post). This post evaluates the answers provided by Microsoft 365 Copilot on the use of this function.

Disclaimer: The below code was generated by Copilot via prompts and is used for demonstration purposes!

Prompt:
Give me 10 scenarios with examples in which the Left function can be used in SQL Server.

Output:

-- Example 1: Extracting Initials
SELECT LEFT('John', 1) + LEFT('Doe', 1) AS Initials -- Result: 'JD'

-- Example 2: Getting the First N Characters of a String:
SELECT LEFT('Hello World', 5) AS Substring; -- Result: 'Hello'

-- Example 3: Extracting Year from a Date string
SELECT LEFT('2025-03-06', 4) AS Year; -- Result: '2025'

-- Example 4: Truncating a string for display:
SELECT LEFT('This is a long description', 10) + '...' AS ShortDescription; -- Result: 'This is a ...'

-- Example 5: Extracting Country Code from Phone Number:
SELECT LEFT('+49 1234567890', 3) AS CountryCode; -- Result: '+49'

-- Example 6: Getting the First Word of a Sentence:
SELECT LEFT('Hello World', CHARINDEX(' ', 'Hello World') - 1) AS FirstWord;-- Result: 'Hello'

-- Example 7: Formatting Product Codes:
SELECT LEFT(ProductCode, 3) AS ProductPrefix
FROM Products; -- Result: 'ABC' for ProductCode 'ABC12345'

-- Example 8: Extracting File Extension:
SELECT LEFT(FileName, CHARINDEX('.', FileName) - 1) AS FileExtension
FROM Files; -- Result: 'txt' for FileName 'document.txt'

-- Example 9: Creating Abbreviations:
SELECT LEFT('International Business Machines', 3) AS Abbreviation; -- Result: 'Int'

-- Example 10: Extracting Date Part from DateTime:
SELECT LEFT(CONVERT(VARCHAR, GETDATE(), 120), 10) AS DatePart; -- Result: '2025-03-06'

Example 1: Extracting the initials can prove to be useful when the space in a report is limited and one wants to have the initial of the person who created or updated the record, respectively who is responsible for a part of the process involved. One can also combine an initial with the full name.

Example 2: Getting the first word from a sentence is useful when an encoding can be derived from it. However, it's seldom that the first characters that form the encoding have a fixed number of letters. One can use the technique from Example 6.

Example 3: Getting the year or other date part from a string can be done in this way, at least if the dates are always in the expected format. It's recommended to get the date parts via the DatePart or DateName functions. 

Example 4: This is a generalization of Example 2.

Example 5: One can get the country code from a phone number if the string is correctly delimited, which is seldom the case. Sometimes, one can use the Country field from the address to identify the correct code from the phone number, though there can also be exceptions.

Example 6: This is usually the recommended way to extract the string before a delimiter and this technique has applicability in many scenarios. 

Example 7: Product numbers and other similar encodings are formed via general rules and the formula should take the rules into account.

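Example 8: This technique is used quite often in combination with file names and other similar encodings. Note, however, that the generated expression returns the part of the name before the first dot (i.e. 'document'), not the extension; retrieving the extension requires the Right or Substring functions.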

Example 9: The rules of abbreviations are usually more complex than this, though this may work occasionally. 

Example 10: That's a valid way to extract the date from a string that contains also the time. One should consider the alternatives as well.
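
The remarks on Examples 3, 6 and 10 can be illustrated with a short sketch. The @DateString and @Sentence variables and their values are hypothetical and used only for demonstration, and the script assumes SQL Server 2008 or later:

```sql
-- Example 3: converting the string to a date first also validates the value
DECLARE @DateString varchar(10) = '2025-03-06'; -- hypothetical value

SELECT LEFT(@DateString, 4) AS YearAsText -- '2025'
, DATEPART(year, CAST(@DateString AS date)) AS YearAsNumber -- 2025
, DATENAME(month, CAST(@DateString AS date)) AS MonthAsText; -- 'March' under an English language setting

-- Example 6: CHARINDEX returns 0 when the delimiter is missing
-- and LEFT raises an error for a negative length;
-- appending the delimiter to the searched string avoids the error
DECLARE @Sentence varchar(50) = 'Hello'; -- hypothetical value without delimiter

SELECT LEFT(@Sentence, CHARINDEX(' ', @Sentence + ' ') - 1) AS FirstWord; -- 'Hello'

-- Example 10: alternatives for getting the date from a datetime value
SELECT CAST(GETDATE() AS date) AS DateValue -- date data type
, CONVERT(char(10), GETDATE(), 120) AS DateText; -- text in the yyyy-mm-dd format
```

Which alternative fits best depends on whether the result is needed as text (e.g. for display) or as a date for further calculations.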

Prompt:
Give me 10 more scenarios with examples in which the Left function can be used in SQL Server.

Output:

-- Example 11: Extracting the First N Digits of a Number:
SELECT LEFT(CAST(1234567890 AS VARCHAR), 5) AS FirstDigits; -- Result: '12345'

-- Example 12: Trimming a String to a Fixed Length:
SELECT LEFT('This is a test string', 10) AS TrimmedString; -- Result: 'This is a '

-- Example 13: Extracting the First Part of an Email Address:
SELECT LEFT('user@example.com', CHARINDEX('@', 'user@example.com') - 1) AS Username; -- Result: 'user'

-- Example 14: Getting the First N Characters of a Column Value:
SELECT LEFT(ColumnName, 8) AS ShortenedValue
FROM TableName;-- Result: 'Example' for ColumnName 'ExampleValue'

-- Example 15: Extracting the First Few Characters of a GUID:
SELECT LEFT(NEWID(), 8) AS ShortGUID; -- Result: '550e8400'

-- Example 16: Creating a Shortened Version of a URL:
SELECT LEFT('https://example.com/page', 15) + '...' AS ShortURL; -- Result: 'https://example...'

-- Example 17: Extracting the First N Characters of a File Path:
SELECT LEFT('C:\Users\Name\Documents\File.txt', 12) AS ShortPath; -- Result: 'C:\Users\Nam'

-- Example 18: Getting the First N Characters of a Description:
SELECT LEFT(Description, 20) AS ShortDescription
FROM Products; -- Result: 'This is a product de'

-- Example 19: Extracting the First N Characters of a JSON String:
SELECT LEFT('{"name":"John","age":30}', 10) AS ShortJSON; -- Result: '{"name":"J'

-- Example 20: Creating a Shortened Version of a Log Entry:
SELECT LEFT(LogEntry, 50) + '...' AS ShortLog
FROM Logs; -- Result: 'Error: Unable to connect to the database. Please...'

Example 11: The example makes sense especially when the first n characters have a special meaning. Just selecting n characters from a random number brings little benefit, though there might also be exceptions out there.

Example 12: This is similar to Example 3.

Example 13: Extracting the first part of an email can be useful for further processing especially when the emails have a predefined form (e.g. first name and last name, usually delimited by a special character).

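Example 14: The example is a variation of 13 but with applicability to a given column from a table. Note that for the value 'ExampleValue' the expression returns 'ExampleV' (8 characters), not 'Example' as stated in the generated comment.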

Example 15: It may look like a way to generate unique values, though no part of the GUID is unique when taken in isolation from any other part!

Example 16: This can be used to extract the hostname of a URL.

Example 17: This technique can be used to provide the main path to a folder, though more complex rules to generate it usually apply.

Example 18: Descriptions tend to be long enough to provide identification information for the products. Therefore, selecting only long enough parts of a description can help minimize the volume of data stored, which can be useful in reports of any type. 

Example 19: The technique can be used indeed, though it has little value without the addition of further information (see first example).

Example 20: If log entries tend to be really long, this can help extract the essential information. It can further allow aggregating the information and identifying the number of events stored in the log. 

Not all the examples are representative of the use of the Left function, and even if they seem to repeat themselves, the context brings slight differences. In practice, Left is often met in combination with CharIndex, date or similar functions. 

Happy coding!


31 January 2025

🌌🏭KQL Reloaded: First Steps (Part II: Working with Dates)

Knowing how to work with dates and manipulate them accordingly is probably one of the important pieces of knowledge in any query language. Fortunately, KQL has many date-related functions that can help developers in the process and the available documentation is detailed enough to support users in learning their use. The current post focuses on the main uses of dates. 

Please note that the scripts consider only the main time units. Please check the documentation of each function for the other time units supported.

Create dates via make_datetime, now and ago functions:

// creating dates from parts
print datetime1 = make_datetime(2025,1,1)
, datetime2 = make_datetime(2025,1,1,0,0,0)
, datetime3 = make_datetime(2025,1,1,1,2,3)

// creating dates from values
print date1 = datetime(2024-02-01)
, datetime(2024-02-01 00:00:00)
, datetime(2024-02-01 23:59:59)
, datetime(2024-02-01 23:59:59.123)

// now vs todatetime (return the same value)
print datetime1 = now()
, datetime2 = todatetime(now())
, datetime3 = ago(0h)

Extract information from dates via datetime_part, substring and individual functions: getyear, week_of_year, monthofyear, dayofyear, hourofday, dayofmonth, dayofweek, endofyear, endofweek, endofmonth, endofday, startofyear, startofweek, startofmonth, startofday:

// date parts from date
print now = now()
, year = datetime_part('year', now())
, month = datetime_part('month', now())
, day = datetime_part('day', now())
, hour = datetime_part('hour', now())
, minute = datetime_part('minute', now())
, second = datetime_part('second', now())

// date parts from string 
let t = tostring(datetime(2024-12-31 10:35:59));
print year = substring(t, 0, 4) 
, month = substring(t, 5, 2) 
, day = substring(t, 8, 2) 
, hour = substring(t, 11, 2) 
, minute = substring(t, 14, 2) 
, second = substring(t, 17, 2) 

// date parts via functions
print year = getyear(now())
, week = week_of_year(now())//ISO 8601 compliant
, month = monthofyear(now())
, day = dayofyear(now())
, hour = hourofday(now())

// day functions
print year = dayofyear(now())
, month = dayofmonth(now())
, day = dayofweek(now())

// end of time dates
print year = endofyear(now())
, week = endofweek(now())
, month = endofmonth(now())
, day = endofday(now())

// start of time dates
print year = startofyear(now())
, week = startofweek(now())
, month = startofmonth(now())
, day = startofday(now())

//time units
print hours1 = time(1hour)
, minutes1 = time(1min)
, seconds1 = time(1second)
, hours2 = totimespan("0.01:00:00")
, minutes2 = totimespan("0.00:01:00")
, seconds2 = totimespan("0.00:00:01")

Working with dates via datetime_add and datetime_diff functions:

// adding time units
print year = datetime_add('year',1,now())
, month = datetime_add('month',1,now())
, day = datetime_add('day',1,now())
, hour = datetime_add('hour',1,now())
, minutes = datetime_add('minute',1,now())
, seconds =  datetime_add('second',1,now())

// date differences
print years = datetime_diff('year', now(), ago(1h))
, months = datetime_diff('month', now(), ago(1h))
, days = datetime_diff('day', now(), ago(1h))
, hours = datetime_diff('hour', now(), ago(1h))
, minutes = datetime_diff('minute', now(), ago(1h))
, seconds = datetime_diff('second', now(), ago(1h))

Working with time zones via datetime_local_to_utc and datetime_utc_to_local functions:

// local time across time zones
print ParisTime = datetime_local_to_utc(now(),'Europe/Paris')
, LondonTime = datetime_local_to_utc(now(),'Europe/London')
, BudapestTime = datetime_local_to_utc(now(),'Europe/Budapest')
, AthensTime = datetime_local_to_utc(now(),'Europe/Athens')

// local time across time zones
print ParisTime = datetime_utc_to_local(now(),'Europe/Paris')
, LondonTime = datetime_utc_to_local(ago(1h),'Europe/London')
, BudapestTime = datetime_utc_to_local(ago(-1h),'Europe/Budapest')
, AthensTime = datetime_utc_to_local(ago(-2h),'Europe/Athens')

Applying different formatting with tostring, format_timespan and format_datetime functions:

// date and time to string
print date1 = tostring(datetime(2024-02-01 00:00:00))
, time1 = tostring(totimespan("0.01:02:03"))
, hours1 = tostring(time(1hour))

// formatting timespans
let t = time("25.10:35:59.123456");
print date1 = format_timespan(t, 'dd.hh:mm:ss:FF')
, date2 = format_timespan(t, 'ddd.h:mm:ss [ffff]')

// formatting dates
let t = datetime("2024-12-31 10:35:59");
print date1 = format_datetime(t,'yyyy-MM-dd [HH:mm:ss]')
, date2 = format_datetime(t,'yy-MM-dd HH:mm:ss')
, date3 = format_datetime(t,'yy-MM-dd [HH:mm:ss tt]')

Considering the above functions, one has a good basis for working with dates. 

Happy coding!


05 April 2010

💎SQL Reloaded: The Power of Joins VI (Not-Equal Joins)

    The SQL Server 2008 documentation mentions not-equal joins without giving an explicit definition, the term appearing in the context of joining values in two columns that are not equal. On the other hand, the not-equal join denomination could be used more generally to refer to joins whose join constraints involve comparison operators other than equal (“=”), for example <, >, <>, BETWEEN, IN and its negation NOT IN, LIKE, etc. The difference between the two views resides in the fact that some of the operators (e.g. IN, BETWEEN) could imply the equality of values from the joined columns. Another objection could be raised on whether a not-equality join constraint could be based on the attributes of only one of the tables participating in the join, the other table thus being absent from the constraint; such constraints could also be considered join constraints, even if in some scenarios they could have been written in the main WHERE clause as well. For the sake of simplicity and flexibility, I would consider a not-equal join (or not-equality join) as a join in which at least one of the join constraints is based on an operator other than the equal operator. One could thus talk about not-equal join constraints, which involve operators other than equal, and equal join constraints, the two types of constraints being able to coexist in the same join. A not-equal join constraint often means that more records are returned than if an equal join constraint were used.

      When talking about join constraints we most likely refer to vertical joins, though anti-joins and semi-joins could also sporadically include not-equality operators. In a normal join based on equalities, each set of attribute values participating in the join from the left table is matched against all the records from the right table, even if this means retrieving more than one record. The same happens for not-equal constraints, even if the business scenarios in which such needs arise are not so intuitive. One relatively simple example I have often met is the calculation of incremental aggregations of values, for example the incremental volume of POs placed over time:
 
SELECT DII.DateSequence 
, IsNull(POH.SubTotal, 0) SubTotal 
FROM dbo.fGetDatesInInterval('2002-01-01', '2002-01-31') DII 
    CROSS APPLY (--incremental PO Volume 
        SELECT SUM(POH.SubTotal) SubTotal 
        FROM Purchasing.PurchaseOrderHeader POH 
        WHERE POH.Status IN (2, 4) -- 2-Approved, 4-Complete  
          AND DATEDIFF(d, POH.OrderDate, DII.DateSequence)>=0 --not-equal constraint 
    ) POH 

    In order to simplify the query I used the dbo.fGetDatesInInterval table-valued function, which returns a sequence of dates within a given time interval, and a CROSS APPLY that returns the aggregated value for each date returned by the function, the calculation considering all the records having an Order Date less than or equal to the given (sequence) date. As can be seen from the above query, the not-equal constraint is based on the DateDiff function, though we could have written POH.OrderDate<=DII.DateSequence instead. One of the reasons for its use is that the constraint can be easily modified to show, for example, only the aggregated values for the past 90 days, the BETWEEN operator entering the scene:
 
SELECT DII.DateSequence 
, IsNull(POH.SubTotal, 0) SubTotal 
FROM dbo.fGetDatesInInterval('2002-01-01', '2002-01-31') DII 
     CROSS APPLY (--incremental PO Volume 
    SELECT SUM(POH.SubTotal) SubTotal 
    FROM Purchasing.PurchaseOrderHeader POH 
    WHERE POH.Status IN (2, 4) -- 2-Approved, 4-Complete  
        AND DATEDIFF(d, POH.OrderDate, DII.DateSequence) BETWEEN 0 AND 90--not-equal constraint 
    ) POH 

   The not-equal constraint from the query belongs to the category of interval-based constraints, which covers the scenarios in which one of the attributes from the tables participating in a join is checked on whether it falls within a given interval. This typically applies to numeric values, as above, or to date values, as in the next query that shows the Last Standard Cost Details per Product:

-- Products & Last Standard Cost details 
SELECT ITM.ProductID  
, ITM.ProductNumber  
, ITM.StandardCost 
, PCH.StartDate 
, PCH.EndDate 
FROM Production.Product ITM 
    JOIN Production.ProductCostHistory PCH 
      ON ITM.ProductID = PCH.ProductID 
     AND GETDATE() BETWEEN PCH.StartDate AND IsNull(PCH.EndDate, GETDATE()) 

    Such queries often need to be written when the records in a table have a given validity, with no overlap existing between intervals.
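
    The validity-based join can also be tried without the AdventureWorks database. The following is only a minimal sketch in which the @Products and @CostHistory table variables, their values and the @AsOfDate parameter are hypothetical; using a parameter instead of GETDATE() allows retrieving the record valid at any given date:

```sql
-- hypothetical sample data
DECLARE @Products TABLE (ProductID int, ProductNumber varchar(25));
DECLARE @CostHistory TABLE (ProductID int, StartDate date, EndDate date NULL, StandardCost money);

INSERT INTO @Products VALUES (1, 'P-001');
INSERT INTO @CostHistory VALUES (1, '2001-01-01', '2001-12-31', 10.00)
, (1, '2002-01-01', NULL, 12.00); -- open interval, still valid

DECLARE @AsOfDate date = '2001-06-15'; -- hypothetical reference date

-- Products & Standard Cost details valid at the reference date
SELECT ITM.ProductID
, ITM.ProductNumber
, PCH.StandardCost
, PCH.StartDate
, PCH.EndDate
FROM @Products ITM
    JOIN @CostHistory PCH
      ON ITM.ProductID = PCH.ProductID
     AND @AsOfDate BETWEEN PCH.StartDate AND IsNull(PCH.EndDate, @AsOfDate); -- interval-based constraint
-- returns only the record valid between 2001-01-01 and 2001-12-31 (StandardCost 10.00)
```

    As long as the intervals don't overlap, at most one cost record is returned per product.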

    Using a list of values within a join constraint, and thus the IN or NOT IN operators, occurs when, for example, the base table stores data coming from multiple sources and several encodings are used for the same source, each of them carrying its own meaning. For example, using Production.TransactionHistory and several other tables coming with the AdventureWorks database, and supposing that there are two transaction types for Purchase Orders (P and B) and two for Sales Orders (S and I), a general-scope query on Transactions could be written as follows:
   
-- Transaction History 
SELECT PTH.TransactionID 
, PTH.ProductID 
, PTH.ReferenceOrderID 
, CASE PTH.TransactionType 
    WHEN 'W' THEN 'Work Order' 
    WHEN 'S' THEN 'Sales Order' 
    WHEN 'I' THEN 'Internal Sales Order' 
    WHEN 'P' THEN 'Purchase Order' 
    WHEN 'B' THEN 'Blanket Purchase Order' 
    ELSE 'n.a' 
  END TransactionType 
, CASE  
    WHEN PTH.TransactionType = 'W' THEN PWO.StartDate 
    WHEN PTH.TransactionType IN ('S', 'I') THEN SOH.OrderDate 
    WHEN PTH.TransactionType IN ('P', 'B') THEN POH.OrderDate 
  END ReferenceDate 
FROM Production.TransactionHistory PTH 
    LEFT JOIN Purchasing.PurchaseOrderHeader POH 
      ON PTH.ReferenceOrderID = POH.PurchaseOrderID 
     AND PTH.TransactionType IN ('P', 'B') --not-equal constraint 
    LEFT JOIN Sales.SalesOrderHeader SOH 
      ON PTH.ReferenceOrderID = SOH.SalesOrderID 
     AND PTH.TransactionType IN ('S', 'I') --not-equal constraint 
    LEFT JOIN Production.WorkOrder PWO 
      ON PTH.ReferenceOrderID = PWO.WorkOrderID 
     AND PTH.TransactionType = 'W' 

    The need for writing such a query arises seldom, and the query could have been written as 3 distinct queries whose results are combined with two unions, both approaches having their pluses and minuses. 

   One of the above queries used a constraint based on the DateDiff function, and in theory any other function could be used in a join constraint, including user-defined functions. Such function-based join constraints are handy but should be used with caution because they could impact the query’s performance. Pattern-based join constraints based on the LIKE operator could be used as well.
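
    A pattern-based join constraint could look as in the following sketch. The @Items and @Prefixes table variables and their values are invented for illustration and are not part of AdventureWorks:

```sql
-- hypothetical sample data
DECLARE @Items TABLE (ItemNumber varchar(25));
DECLARE @Prefixes TABLE (Prefix varchar(10), Category varchar(50));

INSERT INTO @Items VALUES ('AA-1001'), ('BB-2002'), ('ZZ-9009');
INSERT INTO @Prefixes VALUES ('AA-', 'Category A'), ('BB-', 'Category B');

-- Items & the category derived from the prefix of the item number
SELECT ITM.ItemNumber
, PFX.Category
FROM @Items ITM
    LEFT JOIN @Prefixes PFX
      ON ITM.ItemNumber LIKE PFX.Prefix + '%' -- pattern-based join constraint
ORDER BY ITM.ItemNumber;
-- 'ZZ-9009' matches no prefix, therefore its Category is NULL
```

    If the patterns overlap (e.g. 'A' and 'AA-'), an item is returned once for each matching pattern, which is the typical behavior of not-equal join constraints.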

    There are cases in which constraints that typically would be included in the WHERE clause are added to the body of the join, apparently without any good reason. Such denatured join constraints are based only on the attributes of one of the tables involved in the join, as in the next query:
   
-- PO for Products with StandardCost>10 
SELECT POD.PurchaseOrderDetailID  
, POD.ProductID  
, ITM.ProductNumber  
, ITM.StandardCost 
, POD.UnitPrice 
, POD.OrderQty 
FROM Purchasing.PurchaseOrderDetail POD  
    JOIN Production.Product ITM 
      ON POD.ProductID = ITM.ProductID 
     AND ITM.StandardCost>10  -- denaturated join constraint
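
    Whether such a constraint can be moved between the join and the WHERE clause depends on the join type: for an inner join the two forms return the same records, while for a left join they don't. The following sketch shows the difference; the @Details and @Products table variables and their values are hypothetical:

```sql
-- hypothetical sample data
DECLARE @Details TABLE (DetailID int, ProductID int);
DECLARE @Products TABLE (ProductID int, StandardCost money);

INSERT INTO @Details VALUES (1, 10), (2, 20);
INSERT INTO @Products VALUES (10, 5.00), (20, 15.00);

-- constraint in the join: both details are returned,
-- StandardCost being NULL for DetailID 1
SELECT POD.DetailID
, ITM.StandardCost
FROM @Details POD
    LEFT JOIN @Products ITM
      ON POD.ProductID = ITM.ProductID
     AND ITM.StandardCost > 10;

-- constraint in the WHERE clause: only DetailID 2 is returned,
-- the left join behaving like an inner join
SELECT POD.DetailID
, ITM.StandardCost
FROM @Details POD
    LEFT JOIN @Products ITM
      ON POD.ProductID = ITM.ProductID
WHERE ITM.StandardCost > 10;
```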

16 March 2010

🔏MS Office: Excel for SQL Developers IV (Differences Between Two Datasets)

    One of the tasks that lands on my desk from time to time is to determine the differences between two datasets coming from different systems, or from the same system/source but taken at different points in time. Even if it’s fairly simple to write such a query, especially when the number of attributes is quite small, there is a lot of repetitive work to be done because, typically, for each pair of matched attributes a third attribute (a difference flag) needs to be added, showing whether there is a difference or not. In time I started to use Excel formulas to create the SELECT clause for such a query, matching the tables’ attributes first and grouping the data types into 4 main categories: text, amounts, numeric (other than amounts) and dates. This categorization is required first of all by the different ways of handling NULL values for each data type, and secondly by the difference between amounts and numeric values: even if they might have the same data type, amounts might be rounded differently in each data source, therefore a difference of a few cents will not be considered a difference.

    Given two attributes ColumnX and ColumnY from tables A and B respectively, let’s look first at how the difference flag constraint could be written for each category:

--text attributes: 
CASE 
    WHEN IsNull(A.ColumnX , '')<> ISNULL(B.ColumnY, '') THEN 'Y' 
    ELSE 'N' 
END DiffColumnXYFlag 

--amount attributes: 
CASE 
    WHEN IsNull(A.ColumnX, 0) - IsNull(B.ColumnY, 0) NOT BETWEEN -0.05 AND 0.05 THEN 'Y' 
    ELSE 'N' 
END DiffColumnXYFlag 

--numeric attributes: 
CASE 
    WHEN IsNull(A.ColumnX, 0) <> IsNull(B.ColumnY, 0) THEN 'Y' 
    ELSE 'N' 
END DiffColumnXYFlag 

--date attributes: 
CASE 
   WHEN IsNull(DateDiff(d, A.ColumnX, B.ColumnY), -1)<>0 THEN 'Y' 
   ELSE 'N' 
END DiffColumnXYFlag 


Notes:
1.   Bit attributes can be treated as numeric as long as they are considered bi-state; for tri-state values, in which NULL is also considered a distinct value, the constraint must be changed, the most natural way being to translate the NULL to –1:
CASE
    WHEN IsNull(A.ColumnX, -1) <> IsNull(B.ColumnY, -1) THEN 'Y'
    ELSE 'N'
END DiffColumnXYFlag

2.   In most of the examples I worked with, the difference between two dates was calculated at day level, though it might be necessary to compare the values at smaller time intervals, to the order of hours, minutes or seconds. The only thing that needs to be changed then is the first parameter of the DateDiff function. There could also be situations in which a difference of several seconds is acceptable; a BETWEEN operator could then be used, as in the case of numeric vs. amount values.
3.    In case one of the attributes is missing, the corresponding difference flag could directly take the value ‘N’ or ‘n/a’.
4.   It could happen that there are mismatches between the attributes’ data types; in this case at least one of them must be converted to a form that could be used in further processing.
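
    For the tolerance mentioned in the second note, a minimal sketch could look as follows. The @ColumnX and @ColumnY variables, their values and the 5-second margin are assumptions made only for illustration. The NULL values are handled explicitly because a default of -1 would fall inside the tolerated interval, and treating two NULL values as equal is a choice of this sketch:

```sql
-- hypothetical values standing in for A.ColumnX and B.ColumnY
DECLARE @ColumnX datetime = '2010-03-16T10:00:00';
DECLARE @ColumnY datetime = '2010-03-16T10:00:03';

--date attributes compared at second level with an error margin
SELECT CASE
    WHEN @ColumnX IS NULL AND @ColumnY IS NULL THEN 'N'
    WHEN @ColumnX IS NULL OR @ColumnY IS NULL THEN 'Y'
    WHEN DateDiff(s, @ColumnX, @ColumnY) NOT BETWEEN -5 AND 5 THEN 'Y'
    ELSE 'N'
END DiffColumnXYFlag; -- 'N', the 3-second difference being within the margin
```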

    Thus a macro for this purpose would take as input a range with the list of attributes from the two tables, the data type category and the columns participating in the join constraint, two parameters designating the names of the left and right tables participating in the FULL OUTER JOIN, the time interval considered, the error margin value (defining the interval [-e, e]) and a flag indicating whether to show all combined data or only the records for which at least one difference was found. 

Function GetComparisonQuery(ByVal rng As Range, ByVal LeftTable As String, ByVal RightTable As String, ByVal TimeInterval As String, ByVal ErrorMargin As String, ByVal ShowOnlyDifferencesFlag As Boolean) As String 
'builds the code for a comparison query between two tables 
Dim attributes As String 
Dim differences As String 
Dim constraint As String 
Dim whereConstraints As String 
Dim joinConstraints As String 
Dim columnX As String 
Dim columnY As String 
Dim index As Integer 
For index = 1 To rng.Rows.Count 
    columnX = Trim(rng.Cells(index, 1).Value) 
    columnY = Trim(rng.Cells(index, 2).Value) 
  
    If Len(columnX) > 0 Or Len(columnY) > 0 Then 
       If Len(columnX) > 0 Then 
            attributes = attributes & ", A." & columnX & " LT" & columnX & vbCrLf 
       End If 
       If Len(columnY) > 0 Then 
            attributes = attributes & ", B." & columnY & " RT" & columnX & vbCrLf 
       End If 
       
       constraint = "" 
       If Len(Trim(rng.Cells(index, 4).Value)) = 0 Then 
            If Len(columnX) > 0 And Len(columnY) > 0 Then 
                 'creating the difference flag 
                 Select Case Trim(rng.Cells(index, 3).Value) 
                 Case "text": 
                      constraint = "CASE" & vbCrLf & _ 
                                   "     WHEN IsNull(A." & columnX & " , '') <> IsNUll(B." & columnY & ", '') THEN 'Y'" & vbCrLf & _ 
                                   "     ELSE 'N'" & vbCrLf & _ 
                                   "  END" 
                 Case "amount": 
                      constraint = "CASE" & vbCrLf & _ 
                                   "     WHEN IsNull(A." & columnX & " , 0) - IsNUll(B." & columnY & ", 0) NOT BETWEEN -" & ErrorMargin & " AND " & ErrorMargin & " THEN 'Y'" & vbCrLf & _ 
                                   "     ELSE 'N'" & vbCrLf & _ 
                                   "  END" 
                 Case "numeric": 
                      constraint = "CASE" & vbCrLf & _ 
                                   "     WHEN IsNull(A." & columnX & " , 0) <> IsNUll(B." & columnY & ", 0) THEN 'Y'" & vbCrLf & _ 
                                   "     ELSE 'N'" & vbCrLf & _ 
                                   "  END" 
                 Case "date": 
                      constraint = "CASE" & vbCrLf & _ 
                                   "     WHEN DateDiff(" & TimeInterval & ", A." & columnX & ", B." & columnY & ")<>0 THEN 'Y'" & vbCrLf & _ 
                                   "     ELSE 'N'" & vbCrLf & _ 
                                   "  END" 
                 Case Else: 'error 
                     MsgBox "Incorrect data type provided for " & index & " row!", vbCritical 
                 End Select 
                 
                
                If ShowOnlyDifferencesFlag Then 
                   whereConstraints = whereConstraints & " OR " & constraint & " = 'Y'" & vbCrLf 
                End If 
                
                differences = differences & ", " & constraint & " Diff" & columnX & "Flag" & vbCrLf 
            Else 
                differences = differences & ", 'n/a' Diff" & IIf(Len(columnX) > 0, columnX, columnY) & "Flag" & vbCrLf 
            End If 
                             
        Else 
            joinConstraints = joinConstraints & "    AND A." & columnX & " = B." & columnY & vbCrLf 
        End If 
     
     End If 
Next 
If Len(attributes) > 0 Then 
    attributes = Right(attributes, Len(attributes) - 2) 
End If 
If Len(joinConstraints) > 0 Then 
    joinConstraints = Right(joinConstraints, Len(joinConstraints) - 8) 
End If 
If Len(whereConstraints) > 0 Then 
    whereConstraints = Right(whereConstraints, Len(whereConstraints) - 4) 
End If 
'building the comparison query 
GetComparisonQuery = "SELECT " & attributes & _ 
    differences & _ 
    "FROM " & LeftTable & " A" & vbCrLf & _ 
    "     FULL OUTER JOIN " & RightTable & " B" & vbCrLf & _ 
    "       ON " & joinConstraints & _ 
    IIf(ShowOnlyDifferencesFlag And Len(whereConstraints) > 0, "WHERE " & whereConstraints, "") 
   
End Function 
 

Excel - Comparison Datasets

    The query returned by the macro for the above example, based on attributes from the Production.Product table from the AdventureWorks database and the Production.Products table created in the Saving Data With Stored Procedures post, is the following: 
 
SELECT A.ProductID LTProductID 
, B.ProductID RTProductID 
, A.Name LTName 
, B.Name RTName 
, A.ProductNumber LTProductNumber 
, B.ProductNumber RTProductNumber 
, A.MakeFlag LTMakeFlag 
, B.MakeFlag RTMakeFlag 
, A.FinishedGoodsFlag LTFinishedGoodsFlag 
, B.FinishedGoodsFlag RTFinishedGoodsFlag 
, A.Color LTColor 
, B.Color RTColor 
, A.SafetyStockLevel LTSafetyStockLevel 
, B.SafetyStockLevel RTSafetyStockLevel 
, A.ReorderPoint LTReorderPoint 
, A.StandardCost LTStandardCost 
, B.StandardCost RTStandardCost 
, A.ListPrice LTListPrice 
, B.ListPrice RTListPrice 
, A.Size LTSize 
, A.DaysToManufacture LTDaysToManufacture 
, A.ProductLine LTProductLine 
, A.Class LTClass 
, A.Style LTStyle 
, A.SellStartDate LTSellStartDate 
, B.StartDate RTSellStartDate 
, A.SellEndDate LTSellEndDate 
, B.EndDate RTSellEndDate 
, A.DiscontinuedDate LTDiscontinuedDate 
, B.CreationDate RT 
, B.CreatedBy RT 
, A.ModifiedDate LTModifiedDate 
, B.ModifiedDate RTModifiedDate 
, B.ModifiedBy RT 
, CASE 
    WHEN IsNull(A.Name , '') <> IsNUll(B.Name, '') THEN 'Y' 
    ELSE 'N' 
END DiffNameFlag 
, CASE 
    WHEN IsNull(A.ProductNumber , '') <> IsNUll(B.ProductNumber, '') THEN 'Y' 
    ELSE 'N' 
END DiffProductNumberFlag 
, CASE 
    WHEN IsNull(A.MakeFlag , 0) <> IsNUll(B.MakeFlag, 0) THEN 'Y' 
    ELSE 'N' 
END DiffMakeFlagFlag 
, CASE 
    WHEN IsNull(A.FinishedGoodsFlag , 0) <> IsNUll(B.FinishedGoodsFlag, 0) THEN 'Y' 
    ELSE 'N' 
END DiffFinishedGoodsFlagFlag 
, CASE 
     WHEN IsNull(A.Color , '') <> IsNUll(B.Color, '') THEN 'Y' 
    ELSE 'N' 
END DiffColorFlag 
, CASE 
    WHEN IsNull(A.SafetyStockLevel , 0) <> IsNUll(B.SafetyStockLevel, 0) THEN 'Y' 
    ELSE 'N' 
END DiffSafetyStockLevelFlag 
, 'n/a' DiffReorderPointFlag 
, CASE 
     WHEN IsNull(A.StandardCost , 0) - IsNUll(B.StandardCost, 0) NOT BETWEEN -0.05 AND 0.05       
     THEN 'Y' 
ELSE 'N' 
END DiffStandardCostFlag 
, CASE 
     WHEN IsNull(A.ListPrice , 0) - IsNUll(B.ListPrice, 0) NOT BETWEEN -0.05 AND 0.05 THEN 'Y' 
     ELSE 'N' 
END DiffListPriceFlag 
, 'n/a' DiffSizeFlag 
, 'n/a' DiffDaysToManufactureFlag 
, 'n/a' DiffProductLineFlag 
, 'n/a' DiffClassFlag 
, 'n/a' DiffStyleFlag 
, CASE 
    WHEN DateDiff(d, A.SellStartDate, B.StartDate)<>0 THEN 'Y' 
    ELSE 'N' 
END DiffSellStartDateFlag 
, CASE 
    WHEN DateDiff(d, A.SellEndDate, B.EndDate)<>0 THEN 'Y' 
    ELSE 'N' 
END DiffSellEndDateFlag 
, 'n/a' DiffDiscontinuedDateFlag 
, 'n/a' DiffCreationDateFlag 
, 'n/a' DiffCreatedByFlag 
, CASE 
     WHEN DateDiff(d, A.ModifiedDate, B.ModifiedDate)<>0 THEN 'Y' 
     ELSE 'N' 
END DiffModifiedDateFlag 
, 'n/a' DiffModifiedByFlag 
FROM Production.Product A 
    FULL OUTER JOIN Production.Products B 
       ON A.ProductID = B.ProductID 
WHERE CASE 
     WHEN IsNull(A.Name , '') <> IsNUll(B.Name, '') THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
    WHEN IsNull(A.ProductNumber , '') <> IsNUll(B.ProductNumber, '') THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
     WHEN IsNull(A.MakeFlag , 0) <> IsNUll(B.MakeFlag, 0) THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
     WHEN IsNull(A.FinishedGoodsFlag , 0) <> IsNUll(B.FinishedGoodsFlag, 0) THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
     WHEN IsNull(A.Color , '') <> IsNUll(B.Color, '') THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
     WHEN IsNull(A.SafetyStockLevel , 0) <> IsNUll(B.SafetyStockLevel, 0) THEN 'Y' 
     ELSE 'N' 
END = 'Y' 
OR CASE 
     WHEN IsNull(A.StandardCost , 0) - IsNUll(B.StandardCost, 0) NOT BETWEEN -0.05 AND 0.05 
      THEN 'Y' 
ELSE 'N' 
END = 'Y' 
OR CASE 
      WHEN IsNull(A.ListPrice , 0) - IsNUll(B.ListPrice, 0) NOT BETWEEN -0.05 AND 0.05 THEN 'Y' 
      ELSE 'N' 
END = 'Y' 
OR CASE 
      WHEN DateDiff(d, A.SellStartDate, B.StartDate)<>0 THEN 'Y' 
      ELSE 'N' 
END = 'Y' 
OR CASE 
      WHEN DateDiff(d, A.SellEndDate, B.EndDate)<>0 THEN 'Y' 
      ELSE 'N' 
END = 'Y' 
OR CASE 
       WHEN DateDiff(d, A.ModifiedDate, B.ModifiedDate)<>0 THEN 'Y' 
      ELSE 'N' 
END = 'Y' 


Notes:
1.   The macro doesn’t generate an ORDER BY clause, though one could easily be added manually.
2.   Not all join constraints are so simple that they can be reduced to one or more simple equalities; on the other hand, the most time-consuming task is listing the attributes and the difference flags.
3.    Sometimes it’s easier to create two extracts: the first considering all the records from the left table and the matches from the right table (left join), the second all the records from the right table and the matches from the left table (right join).
4.     Given the fact that the attributes participating in the join clause should in theory match, each pair of such attributes could be merged into one attribute using the formula: IsNull(A.ColumnX, B.ColumnY) As ColumnXY.
5.     In order to show all the data from the two tables and not only the differences, all that is needed is to change the value of the last parameter from true to false:
=GetComparisonQuery(A2:D23, "Production.Product", "Production.Products", "d", "0.05", false) 
6.     For the TimeInterval parameter only the values accepted by the DatePart parameter (first parameter) of SQL Server’s DateDiff function should be provided.
7.     Please note that no validation is made for the input parameters.
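
    The shape of the generated query, including the merged join attribute from the fourth note, can be tried on a small scale with the following sketch. The @A and @B table variables and their values are hypothetical and only stand in for the two datasets:

```sql
-- hypothetical sample data for the two datasets
DECLARE @A TABLE (ProductID int, Name varchar(50), ListPrice money);
DECLARE @B TABLE (ProductID int, Name varchar(50), ListPrice money);

INSERT INTO @A VALUES (1, 'Bike', 100.00), (2, 'Helmet', 20.00);
INSERT INTO @B VALUES (2, 'Helmet', 20.03), (3, 'Lock', 15.00);

SELECT IsNull(A.ProductID, B.ProductID) ProductID -- merged join attribute
, A.Name LTName
, B.Name RTName
, A.ListPrice LTListPrice
, B.ListPrice RTListPrice
, CASE
    WHEN IsNull(A.Name, '') <> IsNull(B.Name, '') THEN 'Y'
    ELSE 'N'
  END DiffNameFlag
, CASE
    WHEN IsNull(A.ListPrice, 0) - IsNull(B.ListPrice, 0) NOT BETWEEN -0.05 AND 0.05 THEN 'Y'
    ELSE 'N'
  END DiffListPriceFlag
FROM @A A
    FULL OUTER JOIN @B B
      ON A.ProductID = B.ProductID
ORDER BY ProductID;
-- ProductID 1 and 3 exist in only one dataset, both flags being 'Y'
-- ProductID 2 differs by 0.03, which is within the error margin, both flags being 'N'
```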