Today, as I was playing with a data model – although simplistic, the simplicity kicked back when I had to deal with fields in which several values were encoded within the same column. The “challenge” resided in the fact that the respective attributes were quintessential in analyzing and matching the data with other datasets. Therefore, was needed a mix between flexibility and performance. It was the point where the String_Split did its magic. Introduced with SQL Server 2016 and available only under compatibility level 130 and above, the function splits a character expression using specified separator.
Here’s a simplified example of the code I had to write:
-- cleaning up -- DROP TABLE dbo.ItemList -- test data SELECT A.* INTO dbo.ItemList FROM ( VALUES (1, '1001:a:blue') , (2, '1001:b:white') , (3, '1002:a:blue') , (4, '1002:b:white') , (5, '1002:c:red') , (6, '1003:b:white') , (7, '1003:c:red')) A(Id, List) -- checking the data SELECT * FROM dbo.ItemList -- prepared data SELECT ITM.Id , ITM.List , DAT.ItemId , DAT.Size , DAT.Country FROM dbo.ItemList ITM LEFT JOIN (-- transformed data SELECT DAT.id , [1] AS ItemId , [2] AS Size , [3] AS Country FROM( SELECT ITM.id , TX.Value , ROW_NUMBER() OVER (PARTITION BY ITM.id ORDER BY ITM.id) Ranking FROM dbo.ItemList ITM CROSS APPLY STRING_SPLIT(ITM.List, ':') TX ) DAT PIVOT (MAX(DAT.Value) FOR DAT.Ranking IN ([1],[2],[3])) DAT ) DAT ON ITM.Id = DAT.Id
For those dealing with former versions of SQL Server the functionality provided by the String_Split can be implemented with the help of user-defined functions, either by using the old fashioned loops (see this), cursors (see this) or more modern common table expressions (see this). In fact, these implementations are similar to the String_Split function, the difference being made mainly by the performance.
Happy coding!
No comments:
Post a Comment