Showing posts with label DBA. Show all posts
Showing posts with label DBA. Show all posts

13 February 2024

Business Intelligence: A One Man Show V (Focus on the Foundation)

Business Intelligence Suite
Business Intelligence Suite

I tend to agree that one person can't do anymore "everything in the data space", as Christopher Laubenthal put it his article on the topic [1]. He seems to catch the essence of some of the core data roles found in organizations. Summarizing these roles, data architecture is about designing and building a data infrastructure, data engineering is about moving data, database administration is mainly about managing databases, data analysis is about assisting the business with data and reports, information design is about telling stories, while data science can be about studying the impact of various components on the data. 

However, I find his analogy between a college's functional structure and the core data roles as poorly chosen from multiple perspectives, even if both are about building an infrastructure of some type. 

Firstly, the two constructions have different foundations. Data exists in a an organization also without data architects, data engineers or data administrators (DBAs)! It's enough to buy one or more information systems functioning as islands and reporting needs will arise. The need for a data architect might come when the systems need to be integrated or maybe when a data warehouse needs to be build, though many organizations are still in business without such constructs. While for the others, the more complex the integrations, the bigger the need for a Data Architect. Conversely, some systems can be integrated by design and such capabilities might drive their selection.

Data engineering is needed mainly in the context of the cloud, respectively of data lake-based architectures, where data needs to be moved, processed and prepared for consumption. Conversely, architectures like Microsoft Fabric minimize data movement, the focus being on data processing, the successive transformations it needs to suffer in moving from bronze to the gold layer, respectively in creating an organizational semantical data model. The complexity of the data processing is dependent on data' structuredness, quality and other data characteristics. 

As I mentioned before, modern databases, including the ones in the cloud, reduce the need for DBAs to a considerable degree. Unless the volume of work is big enough to consider a DBA role as an in-house resource, organizations will more likely consider involving a service provider and a contingent to cover the needs. 

Having in-house one or more people acting under the Data Analyst role, people who know and understand the business, respectively the data tools used in the process, can go a long way. Moreover, it's helpful to have an evangelist-like resource in house, a person who is able to raise awareness and knowhow, help diffuse knowledge about tools, techniques, data, results, best practices, respectively act as a mentor for the Data Analyst citizens. From my point of view, these are the people who form the data-related backbone (foundation) of an organization and this is the minimum of what an organization should have!

Once this established, one can build data warehouses, data integrations and other support architectures, respectively think about BI and Data strategy, Data Governance, etc. Of course, having a Chief Data Officer and a Data Strategy in place can bring more structure in handling the topics at the various levels - strategical, tactical, respectively operational. In constructions one starts with a blueprint and a data strategy can have the same effect, if one knows how to write it and implement it accordingly. However, the strategy is just a tool, while the data-knowledgeable workers are the foundation on which organizations should build upon!

"Build it and they will come" philosophy can work as well, though without knowledgeable and inquisitive people the philosophy has high chances to fail.

Previous Post <<||>> Next Post

Resources:
[1] Christopher Laubenthal (2024) "Why One Person Can’t Do Everything In Data" (link)

Business Intelligence: A One Man Show IV (Data Roles between Past and Future)

Business Intelligence Series
Business Intelligence Series

Databases nowadays are highly secure, reliable and available to a degree that reduces the involvement of DBAs to a minimum. The more databases and servers are available in an organization, and the older they are, the bigger the need for dedicated resources to manage them. The number of DBAs involved tends to be proportional with the volume of work required by the database infrastructure. However, if the infrastructure is in the cloud, managed by the cloud providers, it's enough to have a person in the middle who manages the communication between cloud provider(s) and the organization. The person doesn't even need to be a DBA, even if some knowledge in the field is usually recommended.

The requirement for a Data Architect comes when there are several systems in place and there're multiple projects to integrate or build around the respective systems. It'a also the question of what drives the respective requirement - is it the knowledge of data architectures, the supervision of changes, and/or the review of technical documents? The requirement is thus driven by the projects in progress and those waiting in the pipeline. Conversely, if all the systems are in the cloud, their integration is standardized or doesn't involve much architectural knowledge, the role becomes obsolete or at least not mandatory. 

The Data Engineer role is a bit more challenging to define because it appeared in the context of cloud-based data architectures. It seems to be related to the data movement via ETL/ELT pipelines and of data processing and preparation for the various needs. Data modeling or data presentation knowledge isn't mandatory even if ideal. The role seems to overlap with the one of a Data Warehouse professional, be it a simple architect or developer. Role's knowhow depends also on the tools involved, because one thing is to build a solution based on a standard SQL Server, and another thing to use dedicated layers and architectures for the various purposes. Engineers' number should be proportional with the number of data entities involved.

Conversely, the existence of solutions that move and process the data as needed, can reduce the volume of work. Moreover, the use of AI-driven tools like Copilot might shift the focus from data to prompt engineering. 

The Data Analyst role is kind of a Cinderella - it can involve upon case everything from requirements elicitation to reports writing and results' interpretation, respectively from data collection and data modeling to data visualization. If you have a special wish related to your data, just add it to the role! Analysts' number should be related to the number of issues existing in organization where the collection and processing of data could make a difference. Conversely, the Data Citizen, even if it's not a role but a desirable state of art, could absorb in theory the Data Analyst role.

The Data Scientist is supposed to reveal the gems of knowledge hidden in the data by using Machine Learning, Statistics and other magical tools. The more data available, the higher the chances of finding something, even if probably statistically insignificant or incorrect. The role makes sense mainly in the context of big data, even if some opportunities might be available at smaller scales. Scientists' number depends on the number of projects focused on the big questions. Again, one talks about the Data Scientist citizen. 

The Information Designer role seems to be more about data visualization and presentation. It makes sense in the organizations that rely heavily on visual content. All the other organizations can rely on the default settings of data visualization tools, independently on whether AI is involved or not. 

Previous Post <<||>> Next Post

Business Intelligence: A One Man Show II (In the Cusps of Complexity)

Business Intelligence Series
Business Intelligence Series

I watched today on YouTube Power BI Tips' "One Person to Do Everything" episode I missed last week. The main topic is based on Christopher Laubenthal's article "Why one person can't do everything in the data space". Author's arguments are based on an analogy between the various data areas and a college's functional structure. Reading the article, I must say that it takes a poorly chosen analogy to mess messy things more!

One of the most confusing things is that there are so many data-related context-dependent roles with considerable overlapping, that it becomes more and more difficult to understand what they cover. The author considers the roles of Data Architect, Data Engineer, Database Administrator (DBA), Data Analyst, Information Designer and Data Scientist. However, to the every aspect of a data architecture there are also developers on the database (backend) and reporting side (front-end). Conversely, there are other data professionals on the management side for the various knowledge areas of Data Management: Data Governance, Data Strategy, Data Security, Data Operations, etc. There are also roles at the border between the business and the technical side like Data Stewards, Business Analysts, Data Citizen, etc. 

There are two main aspects here. According to the historical perspective, many of these roles appeared when a new set of requirements or a new layer appeared in the architecture. Firstly, it was maybe the DBA, who was supposed to primarily administer the database. Being a keeper of the data and having some knowledge of the data entities, it was easy for him/her to export data for the various reporting needs. In time such activities were taken over by a second category of data professionals. Then the data were moved to Decision Support Systems and later to Data Warehouses and Data Lakes/Lakehoses, this evolution requiring other professionals to address the challenges of each layer. Every activity performed on the data requires a certain type of knowledge that can result in the end in a new denomination. 

The second perspective results from the management of data and the knowledge areas associated with it. If in small organizations with one or two systems in place one doesn't need to talk about Data Operations, in big organizations, where a data center or something similar is maybe in place, Data Operations can easily become a topic on its own, a management structure needing to be in place for its "effective and efficient" management. And the same can happen in the other knowledge areas and their interaction with the business. It's an inherent tendency of answering to complexity with complexity, which on the long term can be in the detriment of any business. In extremis, organizations tend to have a whole team in each area, which can further increase the overall complexity by a small to not that small magnitude. 

Fortunately, one of the benefits of technological advancement is that much of the complexity can be moved somewhere else, and these are the areas where the cloud brings the most advantages. Parts or all architecture can be deployed into the cloud, being managed by cloud providers and third-parties on an on-demand basis at stable costs. Moreover, with the increasing maturity and integration of the various layers, the impact of the various roles in the overall picture is reduced considerably as areas like governance, security or operations are built-in as services, requiring thus less resources. 

With Microsoft Fabric, all the data needed for reporting becomes in theory easily available in the OneLake. Unfortunately, there is another type of complexity that is dumped on other professionals' shoulders and these aspects need to be furthered considered. 

Previous Post <<|||>> Next Post

Resources:
[1] Christopher Laubenthal (2024) "Why One Person Can’t Do Everything In Data" (link)
[2] Power BI tips (2024) Ep.292: One Person to Do Everything (link)


30 January 2009

DBMS: Database Administrator (Definitions)

"The person responsible for creating, installing, configuring, and maintaining databases. Often the system administrator serves as the database administrator." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"An individual responsible for management of an RDBMS. Responsibilities may include user management, indexing, tuning, and backup and recovery." (Christopher Adamson, "Mastering Data Warehouse Aggregates", 2006)

"A person or group of people responsible for the design and supervision of a data base." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"An individual responsible for the installation, configuration, administration, monitoring, and maintenance of the DBMS and its physical databases." (Craig S Mullins, "Database Administration", 2012)

"The person who manages a database. The administrator determines the content, internal structure, and access strategy for a database, defines security and integrity, and monitors performance." (Microsoft, "SQL Server 2012 Glossary", 2012)

"One who is responsible for databases used in the business environment; duties might include performing database updates, tuning database performance, coordinating on server hardware and OS issues, defining backup and restore strategies, and of course, security." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"The person (or persons) who has (have) the highest possible privileges inside the Oracle database, responsible for its care and feeding, and accountable if there is a loss or corruption of data. In charge of tasks such as space allocation, overall performance tuning, database security and privilege allocation, and physical database design. Some DBAs also perform dual duty as application developers." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"The organizational function charged with the day-to-day monitoring and care of the databases. The DBA function is more closely associated with physical database design than the DA [Data Administrator] is." (William H Inmon, "Building the Data Warehouse", 2005)

"The person in a software organization who is in charge of designing, creating, and maintaining the databases of an enterprise. The DBA makes use of a variety of software tools provided by a DBMS." (Toby J Teorey, ", Database Modeling and Design" 4th Ed., 2010)

"The IT professional role responsible for database administration, usually in the context of a particular DBMS, hence often prefixed with the name of the DBMS, e.g., Oracle DBA. Functional responsibilities vary based on further definitions." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The person responsible for planning, organizing, controlling, and monitoring the centralized and shared corporate database. The DBA is the general manager of the database administration department." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed", 2011)

"A person who is responsible for the design, development, operation, maintenance, and use of a database." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A person who has the responsibility for maintaining a database." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

20 November 2005

Database Management: Who's gonna rule the world?

Database Management
Database Management Series

This morning I started to read an article from Information Week about the future of database administrators, as it seems they will rule the world, at least the world of data. It's not a joke, their role will become more and more important over the next years, why this? 

 1. Databases become bigger and more complex      
The size and complexity of databases is increasing from year to year, same their importance in making faster decision based on current or historical data. The bigger the size and complexity, bigger also the need for ad-hoc or one mouse click reports, and who’s the person who has the knowledge and resources to do this?! 

2. Data mining      
Historical data doesn't resume only to simple reporting, but data mining is becoming more popular, having as purpose the discovery of trends in the business, facilitating somehow predictions for business' future.     

Let’s not forget that the big players on the database market started to offer data mining solutions, incorporated or not in their database platforms, the data mining features are a start in providing new views over the data.      

Big companies will opt probably for data analysts, but for medium-sized and small companies, the role of a data analyst could be easily taken by a database administrator. 

3. The knowledge that matters      
Even if databases are backend for complex UI, in time, if not immediately, the database administrator becomes aware of underlaying overall structure of the database and, why not, leaded by curiosity or business' requirements he becomes aware of the content of the data. Even if is a paradox, the database administrator has sometimes more data knowledge than a manager and the skills to do with data that magic that is important for managers. 

4. Data Cleansing      
More and more companies are becoming aware of the outliers from their data, so data cleansing is becoming a priority, and again the database administrator has an important role in this process. 

5. Database Security      
Database platforms are growing, unfortunately the same thing happens with their security holes and the number of attacks. A skilled database administrator must have the knowledge to overcome these problems and solve the issues as soon as possible. 

6. Training      
Even if can be considered a consequence of the previously enumerated facts, it deserves to be considered separately. Companies are starting to realize how important it to have a skilled database administrator, so probably they will provide the requested training or help the DBA to achieve this. 

 7. On site/offshore compromise      
If in the past years, has been popular the concept of off shoring people, including database administrators; probably soon companies will become aware that reducing direct maintenance costs doesn’t mean a reduction of overall costs, which may contain also indirect costs. The delayed response time generated by communication issues, availability or resources plays an important role in the increase indirect costs, most of the big companies will be forced then to opt for a compromise between on site and offshore.      

In this case the direct benefit for a database administrator is not so obvious, except for the financial benefit there is also an image advantage, even if is disregarded, the image and the fact a person is not anymore isolated is important.

Reviewing the post almost 20 years later with the knowledge of today, I can just smile. Besides data analysts, the DBAs were at that time the closest to the data. Probably, many DBAs walked the new paths created by the various opportunities. Trying to uncover the unknown through the known is a foolish attempt, but it’s the best we can do to prepare us for what comes next.

Created: Nov-2005, Last Reviewed: Mar-2024
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.