The basis for being able to manipulate data via SQL scripting is a good knowledge of using joins and subqueries as seems fit for the purpose and data models. The following scripts are based on the tables created in a previous post.
-- FULL JOIN SELECT * FROM dbo.T_Courses C JOIN dbo.T_Students S ON C.CourseId = S.CourseId ORDER BY C.CourseName , S.StudentName -- FULL JOIN (deprecated) SELECT * FROM dbo.T_Courses C , dbo.T_Students S WHERE C.CourseId = S.CourseId ORDER BY C.CourseName , S.StudentName --LEFT JOIN SELECT * FROM dbo.T_Courses C LEFT JOIN dbo.T_Students S ON C.CourseId = S.CourseId WHERE S.CourseId IS NULL ORDER BY C.CourseName , S.StudentName -- RIGHT JOIN SELECT * FROM dbo.T_Courses C RIGHT JOIN dbo.T_Students S ON C.CourseId = S.CourseId WHERE C.CourseId IS NULL ORDER BY C.CourseName , S.StudentName -- FULL OUTER JOIN SELECT * FROM dbo.T_Courses C FULL OUTER JOIN dbo.T_Students S ON C.CourseId = S.CourseId --WHERE C.CourseId IS NULL --WHERE S.CourseId IS NULL --WHERE (C.CourseId IS NULL OR S.StudentId IS NULL) --WHERE (C.CourseId IS NULL AND S.StudentId IS NULL) ORDER BY C.CourseName , S.StudentName
The IN, NOT IN, EXISTS and NOT EXISTS allow using correlated queries, their use being indicated when there are no actual data needed from the tables involved in the correlated queries:
-- EXISTS (correlated subquery) SELECT * FROM dbo.T_Courses C WHERE EXISTS (SELECT StudentId FROM dbo.T_Students S WHERE C.CourseId = S.CourseId) ORDER BY C.CourseName -- NOT EXISTS (correlated subquery) SELECT * FROM dbo.T_Courses C WHERE EXISTS (SELECT StudentId FROM dbo.T_Students S WHERE C.CourseId = S.CourseId) ORDER BY C.CourseName -- IN (subquery) SELECT * FROM dbo.T_Courses C WHERE CourseId IN (SELECT CourseId FROM dbo.T_Students S) ORDER BY C.CourseName
Joining multiples tables is done using the same principles as above:
-- joins with more tables SELECT A.CourseId , C.CourseName , A.StudentId , S.StudentName , A.StartDate , A.EndDate FROM dbo.T_Allocations A JOIN dbo.T_Courses C ON A.CourseId = C.CourseId JOIN dbo.T_Students S ON A.StudentId = S.StudentId ORDER BY C.CourseName , S.StudentName
One can obtain the same result via correlated subqueries (a technique often met between Oracle developers). From readability reasons I avoid writing such queries, unless there’s a special purpose to do so.
-- correlated subquery for individual values SELECT A.CourseId , (SELECT C.CourseName FROM dbo.T_Courses C WHERE A.CourseId = C.CourseId) CourseName , A.StudentId , (SELECT S.StudentName FROM dbo.T_Students S WHERE A.StudentId = S.StudentId) StudentName , A.StartDate , A.EndDate FROM dbo.T_Allocations A ORDER BY CourseName , StudentName
When displaying values within a SELECT via a correlated subqueries, some developers feel the need to use MAX or MIN functions to make sure only one value will be returned. For data analysis it may be acceptable, however if the data model imposes it, then a redesign of the solution is more likely necessary.
-- correlated subquery for individual values SELECT A.CourseId , (SELECT Max(C.CourseName) FROM dbo.T_Courses C WHERE A.CourseId = C.CourseId) CourseName , A.StudentId , (SELECT Max(S.StudentName) FROM dbo.T_Students S WHERE A.StudentId = S.StudentId) StudentName , A.StartDate , A.EndDate FROM dbo.T_Allocations A ORDER BY CourseName , StudentName
Another technique not recommended is displaying one or more attributes from the same table with the same conditions via individual correlated queries. The use of aggregate functions is more appropriate however with numerical or date values.