Data Science Series
It's strange to see the same topics from 20-25 years ago
reappearing over and over again despite the advancements made in the area of
database engines. Each version of SQL Server brought something new in terms of
performance, though without solid experience and a good understanding
of the basic optimization and troubleshooting techniques there's little overall
improvement for the average data professional in terms of writing and tuning queries!
Especially with the boom of Data Science topics, the volume
of material on SQL has increased considerably, and many discover how easy it is to
write queries, even if the start might be challenging for some. Writing a query
is easy indeed, though writing a performant query requires, besides the language
itself, some knowledge of the database engine and of the various techniques
used for troubleshooting and optimization. It's not about knowing in advance
what the engine will do - the engine will often surprise you - but about
knowing which techniques work, in which cases, what their advantages and
disadvantages are, and how they might impact the processing.
To draw a parallel with writing literature, it's not enough to
speak a language; one needs more to become a writer, and there are so many
levels of mastery! However, in the database world, even if creativity is welcome,
its role is considerably diminished by the constraints existing in the database
engine, the problems to be solved, and the time and resources available. More
importantly, one needs to understand some of the rules and know how to use the building
blocks to solve problems and build reliable solutions.
The learning process for newbies focuses mainly on the
language itself, while the exposure to complexity is kept to a minimum. For some
learners the problems start when writing queries based on multiple tables - what joins to use, in what order, how to structure
the queries, what database objects to use for encapsulating the code, etc. Even
if there are some guidelines and best practices, the learner must walk the path
and experiment, either alone or in an organized setting.
In university courses the focus is on operator algebras, algorithms, and general database technologies and architectures, without much hands-on experience. Everything is too theoretical and abstract, which is acceptable for research purposes, but not for contact with the real world out there! Some labs probably offer exposure to real-life scenarios, though what should be covered first in the few hours scheduled for them?
This was the state of the art when I started to learn SQL a quarter
century ago, and apart from the current tendency to cut corners, the increased
confidence gained from doing some tests, and the eagerness to broadcast one's shaky knowledge
and more or less orthodox ideas on the various social networks, nothing seems
to have changed! Well, something did change: the complexity of the problems
to solve has increased and, considering the recent technological advances, one can now afford
an AI learning buddy that writes some code based on the information provided in
the prompt.
This opens opportunities for learning and growth. AI can be used in the learning process by providing additional curricula for learners to dive deeper into some topics. Moreover, over time it can help us address the challenges posed by the ever-increasing complexity of the problems.