The basis for being able to manipulate data via SQL scripting is a good knowledge of using joins and subqueries as seems fit for the purpose and data models. The following scripts are based on the tables created in a previous post.
-- FULL JOIN
SELECT *
FROM dbo.T_Courses C
JOIN dbo.T_Students S
ON C.CourseId = S.CourseId
ORDER BY C.CourseName
, S.StudentName
-- FULL JOIN (deprecated)
SELECT *
FROM dbo.T_Courses C
, dbo.T_Students S
WHERE C.CourseId = S.CourseId
ORDER BY C.CourseName
, S.StudentName
--LEFT JOIN
SELECT *
FROM dbo.T_Courses C
LEFT JOIN dbo.T_Students S
ON C.CourseId = S.CourseId
WHERE S.CourseId IS NULL
ORDER BY C.CourseName
, S.StudentName
-- RIGHT JOIN
SELECT *
FROM dbo.T_Courses C
RIGHT JOIN dbo.T_Students S
ON C.CourseId = S.CourseId
WHERE C.CourseId IS NULL
ORDER BY C.CourseName
, S.StudentName
-- FULL OUTER JOIN
SELECT *
FROM dbo.T_Courses C
FULL OUTER JOIN dbo.T_Students S
ON C.CourseId = S.CourseId
--WHERE C.CourseId IS NULL
--WHERE S.CourseId IS NULL
--WHERE (C.CourseId IS NULL OR S.StudentId IS NULL)
--WHERE (C.CourseId IS NULL AND S.StudentId IS NULL)
ORDER BY C.CourseName
, S.StudentName
The IN, NOT IN, EXISTS and NOT EXISTS allow using correlated queries, their use being indicated when there are no actual data needed from the tables involved in the correlated queries:
-- EXISTS (correlated subquery)
SELECT *
FROM dbo.T_Courses C
WHERE EXISTS (SELECT StudentId
FROM dbo.T_Students S
WHERE C.CourseId = S.CourseId)
ORDER BY C.CourseName
-- NOT EXISTS (correlated subquery)
SELECT *
FROM dbo.T_Courses C
WHERE EXISTS (SELECT StudentId
FROM dbo.T_Students S
WHERE C.CourseId = S.CourseId)
ORDER BY C.CourseName
-- IN (subquery)
SELECT *
FROM dbo.T_Courses C
WHERE CourseId IN (SELECT CourseId
FROM dbo.T_Students S)
ORDER BY C.CourseName
Joining multiples tables is done using the same principles as above:
-- joins with more tables
SELECT A.CourseId
, C.CourseName
, A.StudentId
, S.StudentName
, A.StartDate
, A.EndDate
FROM dbo.T_Allocations A
JOIN dbo.T_Courses C
ON A.CourseId = C.CourseId
JOIN dbo.T_Students S
ON A.StudentId = S.StudentId
ORDER BY C.CourseName
, S.StudentName
One can obtain the same result via correlated subqueries (a technique often met between Oracle developers). From readability reasons I avoid writing such queries, unless there’s a special purpose to do so.
-- correlated subquery for individual values
SELECT A.CourseId
, (SELECT C.CourseName
FROM dbo.T_Courses C
WHERE A.CourseId = C.CourseId) CourseName
, A.StudentId
, (SELECT S.StudentName
FROM dbo.T_Students S
WHERE A.StudentId = S.StudentId) StudentName
, A.StartDate
, A.EndDate
FROM dbo.T_Allocations A
ORDER BY CourseName
, StudentName
When displaying values within a SELECT via a correlated subqueries, some developers feel the need to use MAX or MIN functions to make sure only one value will be returned. For data analysis it may be acceptable, however if the data model imposes it, then a redesign of the solution is more likely necessary.
-- correlated subquery for individual values
SELECT A.CourseId
, (SELECT Max(C.CourseName)
FROM dbo.T_Courses C
WHERE A.CourseId = C.CourseId) CourseName
, A.StudentId
, (SELECT Max(S.StudentName)
FROM dbo.T_Students S
WHERE A.StudentId = S.StudentId) StudentName
, A.StartDate
, A.EndDate
FROM dbo.T_Allocations A
ORDER BY CourseName
, StudentName
Another technique not recommended is displaying one or more attributes from the same table with the same conditions via individual correlated queries. The use of aggregate functions is more appropriate however with numerical or date values.
Notes:
The queries work also in SQL databases in Microsoft Fabric.
Happy coding!
Previous Post <<||>> Next Post