About Me

IT Professional with more than 16 years experience in IT especially in the area of full life-cycle of Web/Desktop Applications Development, Database Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP support, etc.

Saturday, September 22, 2007

6 out of 49 – Modeling the data (Part 1)

    In a previous posting I created a random set of data and stored them in LottoExtracts table, the first step is done, now we have to model the data in a form which could be useful for our analysis. Thus it makes sense to split the table in two tables:     LottoExtracts containing mainly the DrawingDate and eventually additional information about extraction (e.g. Country, System, etc.). For simplicity I’m including only the Drawing Date.     LottoNumbers containing the numbers and their order in extraction. Here are the tables:

--creating LottoDrawings table CREATE TABLE LottoDrawings( DrawingID int IDENTITY(1,1) NOT NULL, DrawingDate smalldatetime NULL ) --creating LottoNumbers table CREATE TABLE LottoNumbers( NumberID int IDENTITY(1,1) NOT NULL, DrawingID int NULL, Position int NULL, Number int NULL )     And the view which will be used as source for analysis: --creating vLottoNumbers view CREATE VIEW vLottoNumbers AS SELECT LN.NumberID , LN.Position , LN.Number , LN.DrawingID , LD.DrawingDate , Year(LD.DrawingDate) DrawingYear , Month(LD.DrawingDate) DrawingMonth , DatePart(dd, LD.DrawingDate) DrawingDay , DatePart(wk, LD.DrawingDate) DrawingWeek FROM LottoNumbers LN JOIN LottoDrawings LD ON LN.DrawingID = LD.DrawingID     Now that we have the objects to model the data, we need to populate the tables with the data from LottoExtracts table: --pupulating LottoDrawings table INSERT LottoDrawings (DrawingDate) SELECT DISTINCT DrawingDate FROM LottoExtracts --pupulating LottoNumbers table - first extraction number INSERT LottoNumbers (DrawingID, Position, Number) SELECT DrawingID, 1, N1 FROM LottoExtracts LE JOIN LottoDrawings LD ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0     The above query inserted only the first number of the extraction, in order to insert the other 5 numbers, all you have to do is to replace 1 with 2 in the second line and run the query again, the insert for second number becomes: --pupulating LottoNumbers table – second drawing number INSERT LottoNumbers (DrawingID, Position, Number) SELECT DrawingID, 2, N2 FROM LottoExtracts LE JOIN LottoDrawings LD ON DateDiff(d, LE.DrawingDate, LD.DrawingDate) = 0

Run it again after you replaced 2 with 3, 4, 5, respectively 6. And then checking the output (you should have 200 records): --checking the data SELECT * FROM vLottoNumbers     Now we have the data, it’s time for analysis, in a next posting!

No comments: