23 September 2007

SQL Reloaded: 6 out of 49 (Part 2: Modeling the Data)

In a previous post I created a random set of data and stored them in LottoExtracts table, the first step is done, now we have to model the data in a form which could be useful for our analysis. Thus it makes sense to split the table in two tables:     
  • LottoExtracts containing mainly the DrawingDate and eventually additional information about extraction (e.g. Country, System, etc.). For simplicity I’m including only the Drawing Date.     
  • LottoNumbers containing the numbers and their order in extraction. 
Here are the tables:
 
-- dropping the LottoDrawings table
DROP TABLE IF EXISTS dbo.LottoDrawings
GO

--creating the LottoDrawings table 
CREATE TABLE dbo.LottoDrawings( 
  DrawingID int IDENTITY(1,1) NOT NULL
 , DrawingDate smalldatetime NULL) 
GO

 -- dropping the LottoNumbers table
DROP TABLE IF EXISTS dbo.LottoNumbers

 --creating LottoNumbers table 
 CREATE TABLE dbo.LottoNumbers(NumberID int IDENTITY(1,1) NOT NULL
 , DrawingID int NULL
 , Position smallint NULL
 , Number smallint NULL)   

 And, the view which will be used as source for analysis: 

 --creating vLottoNumbers view 
 CREATE VIEW dbo.vLottoNumbers AS 
 -- consolidated drawings by number
 SELECT LN.NumberID 
 , LN.Position 
 , LN.Number 
 , LN.DrawingID 
 , LD.DrawingDate 
 , Year(LD.DrawingDate) DrawingYear , Month(LD.DrawingDate) DrawingMonth 
 , DatePart(dd, LD.DrawingDate) DrawingDay 
 , DatePart(wk, LD.DrawingDate) DrawingWeek 
 FROM dbo.LottoNumbers LN 
      JOIN dbo.LottoDrawings LD 
	    ON LN.DrawingID = LD.DrawingID    

Now that we have the objects to model the data, we need to populate the tables with the data from LottoExtracts table: 

--populating LottoDrawings table 
INSERT dbo.LottoDrawings (DrawingDate) 
SELECT DISTINCT DrawingDate 
FROM dbo.LottoExtracts 

--populating the LottoNumbers table - first drawing number 
INSERT LottoNumbers (DrawingID, Position, Number) 
SELECT DrawingID, 1, N1 
FROM dbo.LottoExtracts LE 
     JOIN dbo.LottoDrawings LD 
	   ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0  

The above query inserted only the first number of the extraction, in order to insert the other 5 numbers, all you have to do is to replace 1 with 2 in the second line and run the query again, the insert for second number becomes: 

--populating the LottoNumbers table – second drawing number 
INSERT LottoNumbers (DrawingID, Position, Number) 
SELECT DrawingID, 2, N2 
FROM dbo.LottoExtracts LE 
       JOIN dbo.LottoDrawings LD
	     ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0

Here's the code for the remaining numbers:

--populating LottoNumbers table – the remaining drawing numbers
INSERT LottoNumbers (DrawingID, Position, Number) 
SELECT DrawingID, 3, N3 
FROM dbo.LottoExtracts LE 
       JOIN dbo.LottoDrawings LD
	     ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0
UNION ALL 
SELECT DrawingID, 4, N4
FROM dbo.LottoExtracts LE 
       JOIN dbo.LottoDrawings LD
	     ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0
UNION ALL 
SELECT DrawingID, 5, N5 
FROM dbo.LottoExtracts LE 
       JOIN dbo.LottoDrawings LD
	     ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0
UNION ALL 
SELECT DrawingID, 6, N6
FROM dbo.LottoExtracts LE 
       JOIN dbo.LottoDrawings LD
	     ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0

Checking the output, you should have 12000 records: 

--checking the data 
SELECT * 
FROM dbo.vLottoNumbers 
ORDER BY DrawingDate
, Position

Now we have the data for analysis (see the next post).

Happy coding!

Previous Post <<||>> Next Post

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.