Unfortunately (or fortunately, for the challenge-searchers), not all JSON data files have a simple (matrix) structure, while the data might not even have a proper (readable) definition. It's the case of the unemployment data provided by the Cologne municipality (source). However with a language page translator and some small effort one can identify the proximate data definition:
Source Field | Target Field | Data Type |
AM_ALO_INSG_AA | ALO_Total | int |
AM_ALO_SGB2_AA | ALO_SGB2 | int |
AM_ALO_UNTER25_AA | ALO_Under25 | int |
AM_ALO_INSG_AP | ALO_Total_Perc | float |
AM_ALO_SGB2_AP | ALO_SGB2_Perc | float |
AM_ALO_UNTER25_AP | ALO_Under25_Perc | float |
AM_ALO_INSG_HA | ALO_Total_Hist | int |
AM_ALO_SGB2_HA | ALO_SGB2_Hist | int |
AM_ALO_UNTER25_HA | ALO_Under25_Hist | int |
AM_ALO_INSG_HP | ALO_Total_HistPerc | float |
AM_ALO_SGB2_HP | ALO_SGB2_HistPerc | float |
AM_ALO_UNTER25_HP | ALO_Under25_HistPerc | float |
AM_SVB_INSG_AA | SVB_Total | int |
AM_SVB_MANN_AA | SVB_Men | int |
AM_SVB_FRAU_AA | SVB_Women | int |
AM_SVB_DEUTSCH_AA | SVB_German | int |
AM_SVB_AUSLAND_AA | SVB_AUSLAND | int |
AM_SVB_U25J_AA | SVB_Under25Yo | int |
AM_SVB_UEBER55J_AA | SVB_Over55Yo | int |
AM_SVB_INSG_AP | SVB_Total_Perc | float |
AM_SVB_MANN_AP | SVB_Men_Perc | float |
AM_SVB_FRAU_AP | SVB_Women_Perc | float |
AM_SVB_DEUTSCH_AP | SVB_German_Perc | float |
AM_SVB_AUSLAND_AP | SVB_AUSLAND_Perc | float |
AM_SVB_U25J_AP | SVB_Under25Yo_Perc | float |
AM_SVB_UEBER55J_AP | SVB_Over55Yo_Perc | float |
AM_SVB_INSG_HA | SVB_Total_Hist | int |
AM_SVB_MANN_HA | SVB_Men_Hist | int |
AM_SVB_FRAU_HA | SVB_Women_Hist | int |
AM_SVB_DEUTSCH_HA | SVB_German_Hist | int |
AM_SVB_AUSLAND_HA | SVB_AUSLAND_Hist | int |
AM_SVB_U25J_HA | SVB_Under25Yo_Hist | int |
AM_SVB_UEBER55J_HA | SVB_Over55Yo_Hist | int |
AM_SVB_INSG_HP | SVB_Total_HistPerc | float |
AM_SVB_MANN_HP | SVB_Men_HistPerc | float |
AM_SVB_FRAU_HP | SVB_Women_HistPerc | float |
AM_SVB_DEUTSCH_HP | SVB_German_HistPerc | float |
AM_SVB_AUSLAND_HP | SVB_AUSLAND_HistPerc | float |
AM_SVB_U25J_HP | SVB_Under25Yo_HistPerc | float |
AM_SVB_UEBER55J_HP | SVB_Over55Yo_HistPerc | float |
SHAPE.AREA | SHAPE.AREA | int |
SHAPE.LEN | SHAPE.LEN | int |
As previously stated (see post), it makes sense to build the logic over several iterations, making first sure that the references to file's columns were used correctly (observe the way the various elements were referenced in the queries):
SELECT DAT.ObjectId , DAT.Nummer , DAT.Name , DAT.ALO_Total , DAT.ALO_SGB2 FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB) as jsonfile CROSS APPLY OPENJSON(BulkColumn,'$.features') WITH( ObjectId int '$.properties.OBJECTID' , Nummer int '$.properties.NUMMER' , Name nvarchar(max) '$.properties.NAME' , ALO_Total int '$.properties.AM_ALO_INSG_AA' , ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA' ) AS DAT;
ObjectId | Nummer | Name | ALO_Total | ALO_SGB2 |
1 | 211 | Godorf | 121 | 92 |
2 | 308 | Lövenich | 156 | 83 |
3 | 307 | Weiden | 552 | 383 |
4 | 306 | Junkersdorf | 305 | 164 |
5 | 309 | Widdersdorf | 200 | 93 |
6 | 404 | Vogelsang | 310 | 208 |
7 | 505 | Weidenpesch | 527 | 351 |
8 | 502 | Mauenheim | 190 | 127 |
12 | 207 | Hahnwald | 15 | 3 |
13 | 213 | Meschenich | 644 | 554 |
-- importing the JSON file SELECT DAT.ObjectId , DAT.Nummer , dbo.ReplaceCodes2Umlauts(DAT.Name) Name , DAT.ALO_Total , DAT.ALO_SGB2 , DAT.ALO_Under25 , DAT.ALO_Total_Perc , DAT.ALO_SGB2_Perc , DAT.ALO_Under25_Perc , DAT.ALO_Total_Hist , DAT.ALO_SGB2_Hist , DAT.ALO_Under25_Hist , DAT.ALO_Total_HistPerc , DAT.ALO_SGB2_HistPerc , DAT.ALO_Under25_HistPerc , DAT.SVB_Total , DAT.SVB_Men , DAT.SVB_Women , DAT.SVB_German , DAT.SVB_AUSLAND , DAT.SVB_Under25Yo , DAT.SVB_Over55Yo , DAT.SVB_Total_Perc , DAT.SVB_Men_Perc , DAT.SVB_Women_Perc , DAT.SVB_German_Perc , DAT.SVB_AUSLAND_Perc , DAT.SVB_Under25Yo_Perc , DAT.SVB_Over55Yo_Perc , DAT.SVB_Total_Hist , DAT.SVB_Men_Hist , DAT.SVB_Women_Hist , DAT.SVB_German_Hist , DAT.SVB_AUSLAND_Hist , DAT.SVB_Under25Yo_Hist , DAT.SVB_Over55Yo_Hist , DAT.SVB_Total_HistPerc , DAT.SVB_Men_HistPerc , DAT.SVB_Women_HistPerc , DAT.SVB_German_HistPerc , DAT.SVB_AUSLAND_HistPerc , DAT.SVB_Under25Yo_HistPerc , DAT.SVB_Over55Yo_HistPerc , DAT.Shape_Area , DAT.Shape_Len INTO dbo.Unemployment_Cologne FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB) as jsonfile CROSS APPLY OPENJSON(BulkColumn,'$.features') WITH( ObjectId int '$.properties.OBJECTID' , Nummer int '$.properties.NUMMER' , Name nvarchar(max) '$.properties.NAME' , ALO_Total int '$.properties.AM_ALO_INSG_AA' , ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA' , ALO_Under25 int '$.properties.AM_ALO_UNTER25_AA' , ALO_Total_Perc float '$.properties.AM_ALO_INSG_AP' , ALO_SGB2_Perc float '$.properties.AM_ALO_SGB2_AP' , ALO_Under25_Perc float '$.properties.AM_ALO_UNTER25_AP' , ALO_Total_Hist int '$.properties.AM_ALO_INSG_HA' , ALO_SGB2_Hist int '$.properties.AM_ALO_SGB2_HA' , ALO_Under25_Hist int '$.properties.AM_ALO_UNTER25_HA' , ALO_Total_HistPerc float '$.properties.AM_ALO_INSG_HP' , ALO_SGB2_HistPerc float '$.properties.AM_ALO_SGB2_HP' , ALO_Under25_HistPerc float '$.properties.AM_ALO_UNTER25_HP' , SVB_Total int '$.properties.AM_SVB_INSG_AA' , SVB_Men int '$.properties.AM_SVB_MANN_AA' , SVB_Women int '$.properties.AM_SVB_FRAU_AA' , SVB_German int '$.properties.AM_SVB_DEUTSCH_AA' , SVB_AUSLAND int '$.properties.AM_SVB_AUSLAND_AA' , SVB_Under25Yo int '$.properties.AM_SVB_U25J_AA' , SVB_Over55Yo int '$.properties.AM_SVB_UEBER55J_AA' , SVB_Total_Perc float '$.properties.AM_SVB_INSG_AP' , SVB_Men_Perc float '$.properties.AM_SVB_MANN_AP' , SVB_Women_Perc float '$.properties.AM_SVB_FRAU_AP' , SVB_German_Perc float '$.properties.AM_SVB_DEUTSCH_AP' , SVB_AUSLAND_Perc float '$.properties.AM_SVB_AUSLAND_AP' , SVB_Under25Yo_Perc float '$.properties.AM_SVB_U25J_AP' , SVB_Over55Yo_Perc float '$.properties.AM_SVB_UEBER55J_AP' , SVB_Total_Hist int '$.properties.AM_SVB_INSG_HA' , SVB_Men_Hist int '$.properties.AM_SVB_MANN_HA' , SVB_Women_Hist int '$.properties.AM_SVB_FRAU_HA' , SVB_German_Hist int '$.properties.AM_SVB_DEUTSCH_HA' , SVB_AUSLAND_Hist int '$.properties.AM_SVB_AUSLAND_HA' , SVB_Under25Yo_Hist int '$.properties.AM_SVB_U25J_HA' , SVB_Over55Yo_Hist int '$.properties.AM_SVB_UEBER55J_HA' , SVB_Total_HistPerc float '$.properties.AM_SVB_INSG_HP' , SVB_Men_HistPerc float '$.properties.AM_SVB_MANN_HP' , SVB_Women_HistPerc float '$.properties.AM_SVB_FRAU_HP' , SVB_German_HistPerc float '$.properties.AM_SVB_DEUTSCH_HP' , SVB_AUSLAND_HistPerc float '$.properties.AM_SVB_AUSLAND_HP' , SVB_Under25Yo_HistPerc float '$.properties.AM_SVB_U25J_HP' , SVB_Over55Yo_HistPerc float '$.properties.AM_SVB_UEBER55J_HP' , Shape_Area float '$.properties."SHAPE.AREA"' , Shape_Len float '$.properties."SHAPE.LEN"' ) AS DAT;
No comments:
Post a Comment