22 September 2007

6 out of 49 – Getting the data (Part 1)

    To study numbers’ occurrences we need some data, sometimes that’s difficult to get so they satisfy our purpose. Fortunately, each lottery has a website from which data can be taken drawing by drawing or when possible, as a data dump. Some of the lotteries have variations for 6/49. Lottery Canada Lotto Germany Lotto France (*) Lotto Belgium (*) Lotto Romania (*) Lotto Austria (*) Lotto UK     For other lotteries you can check The Lottery Site.     Because I don’t want to be involved in legal litigations, I will use a self made set of data, for this let’s create a table called LottoExtracts as below: --creating the LottoExtracts table CREATE TABLE LottoExtracts(     DrawingDate datetime NULL,     N1 int NULL,     N2 int NULL,     N3 int NULL,     N4 int NULL,     N5 int NULL,     N6 int NULL )     This format will be used to upload the data in a staging table which is used later as source for our data model. As can be seen it has a Drawing Date in which is stored Drawing’s date, while in N1, …, N6 are stored the drawings’ numbers (in the order they were picked). --creating the data DECLARE @index int DECLARE @counter int DECLARE @N1 int DECLARE @N2 int DECLARE @N3 int DECLARE @N4 int DECLARE @N5 int DECLARE @N6 int SET @index = 0 WHILE @index < 200 BEGIN -- creating first number SET @N1 = Cast(1000 * RAND() as int) % 48 + 1 -- creating second number SET @counter = 1 WHILE @counter < 50 or @N1=@N2 BEGIN SET @N2 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END -- creating third number SET @counter = 1 WHILE @counter < 50 or @N3 IN (@N1, @N2) BEGIN SET @N3 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END -- creating fourth number SET @counter = 1 WHILE @counter < 50 IN (@N1, @N2, @N3) BEGIN SET @N4 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END -- creating fifth number SET @counter = 1 WHILE @counter < 50 IN (@N1, @N2, @N3, @N4) BEGIN SET @N5 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END -- creating sixth number SET @counter = 1 WHILE @counter < 50 color="#000099">IN (@N1, @N2, @N3, @N4, @N5) BEGIN SET @N6 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END --inserting a new drawing INSERT LottoExtracts(DrawingDate, N1, N2, N3, N4, N5, N6) VALUES (DateAdd(d, @index*7, '01/01/2000'), @N1, @N2, @N3, @N4, @N5, @N6) SET @index = @index + 1 END --checking the data SELECT * FROM LottoExtracts     The following statement creates a random number in 1 … 49 range: SET @N1 = Cast(1000 * RAND() as int) % 48 + 1     Thus can be created all 6 numbers, however because the same number can’t be extracted twice in the same drawing, I had to add additional duplication checks: -- creating third number SET @counter = 1 WHILE @counter < 50 or @N3 IN (@N1, @N2) BEGIN SET @N3 = Cast(1000 * RAND() as int) % 48 + 1 SET @counter = @counter + 1 END     I created only 200 records, which equates with almost 4 years of history, and should be enough for our purpose. Of course, you can increase or decrease the number of drawings! Notes:     The links could change in time, I provided them just to make easier the search, if they change in time, then tough luck!     (*) denotes that historical data can be downloaded.

No comments: