A Software Engineer and data professional's blog on SQL, data, databases, data architectures, data management, programming, Software Engineering, Project Management, ERP implementation and other IT related topics.
Pages
- 🏠Home
- 🗃️Posts
- 🗃️Definitions
- 🏭Fabric
- ⚡Power BI
- 🔢SQL Server
- 📚Data
- 📚Engineering
- 📚Management
- 📚SQL Server
- 📚Systems Thinking
- ✂...Quotes
- 🧾D365: GL
- 💸D365: AP
- 💰D365: AR
- 👥D365: HR
- ⛓️D365: SCM
- 🔤Acronyms
- 🪢Experts
- 🗃️Quotes
- 🔠Dataviz
- 🔠D365
- 🔠Fabric
- 🔠Engineering
- 🔠Management
- 🔡Glossary
- 🌐Resources
- 🏺Dataviz
- 🗺️Social
- 📅Events
- ℹ️ About
30 October 2018
💠🛠️SQL Server: Administration (Troubleshooting Login Failed for User)
Login failed for user '<domain>\<computer>$'. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]
Error: 18456, Severity: 14, State: 5.
From the text it seemed like a permission problem, thing confirmed by the documentation (see [1]), the Error Number and State correspond to a „User Id is not valid“ situation. In a first step I attempted to give permissions to the local account (dollar sign included). The account wasn’t found in the Active Directory (AD), though by typing the account directly in the “Login name” I managed to give temporarily sysadmin permission to the account. The error continued to appear in the error log. I looked then at the accounts under which the SQL Services run - nothing suspect in there.
Except the error message, which was appearing with an alarming frequency (a few seconds apart), everything seemed to be working on the server. The volume of records (a few hundred thousands over a few days) bloating the error log, as well the fact that I didn’t knew what’s going on made me take the time and further investigate the issue.
Looking today at the Windows Logs for Applications I observed that the error is caused by an account used for the Microsoft SQL Server IaaS Agent and IaaS Query Service. Once I gave permissions to the account the error disappeared.
The search for a best practice on what permissions to give to the IaaS Agent and IaaS Query Service lead me to [2]. To quote, the “Agent Service needs Local System rights to be able to install and configure SQL Server, attach disks and enable storage pool and manage automated security patching of Windows and SQL server”, while the “IaaS Query Service is started with an NT Service account which is a Sys Admin on the SQL Server”. In fact, this was the only resource I found that made a reference to the IaaS Query Service.
This was just one of the many scenarios in which the above error appears. For more information see for example [3], [4] or [5].
References:
[1] Microsoft (2017) MSSQLSERVER_18456 [Online] Available from: https://docs.microsoft.com/en-us/sql/relational-databases/errors-events/mssqlserver-18456-database-engine-error?view=sql-server-2017
[2] SQL Database Engine Blog (2018) SQL Server IaaS Extension Query Service for SQL Server on Azure VM, by Mine Tokus Altug [Online] Available from: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2018/10/25/sql-server-iaas-extension-query-service-for-sql-server-on-azure-vm/
[3] Microsoft Support (2018) "Login failed for user" error message when you log on to SQL Server [Online] Available from: https://support.microsoft.com/en-sg/help/555332/login-failed-for-user-error-message-when-you-log-on-to-sql-server
[4] Microsoft Technet (2018) How to Troubleshoot Connecting to the SQL Server Database [Online] Available from: Engine https://social.technet.microsoft.com/wiki/contents/articles/2102.how-to-troubleshoot-connecting-to-the-sql-server-database-engine.aspx
[5] Microsoft Blogs (2011)Troubleshoot Connectivity/Login failures (18456 State x) with SQL Server, by Sakthivel Chidambaram [Online] Available from: https://blogs.msdn.microsoft.com/sqlsakthi/2011/02/06/troubleshoot-connectivitylogin-failures-18456-state-x-with-sql-server/
29 October 2018
💠🛠️SQL Server: Administration (Searching the Error Log)
The following are the parameter available with the stored procedure:
Parameter
|
Name
|
Type
|
Description
|
1 | FileToRead | int | 0 = Current, 1 or 2, 3, … n Archive Number |
2 | Logtype | int | 1 = SQL Error Log and 2 = SQL Agent log |
3 | String1 | varchar(255) | the string to match the logs on |
4 | String2 | varchar(255) | a second string to match in combination with String1 (AND) |
5 | StartDate | datetime | beginning date to look from |
6 | EndDate | datetime | ending date to look up to |
7 | ResultsOrder | ASC or DESC sorting |
Note:
If the SQL Server Agent hasn’t been active, then there will be no Agent log and the call to the stored procedure will return an error.
Here are a few examples of using the stored procedure:
-- listing the content of the current SQL Server error log EXEC xp_readerrorlog 0, 1 -- listing the content of the second SQL Server error log EXEC xp_readerrorlog 1, 1 -- listing the content of the current SQL Server Agent log EXEC xp_readerrorlog 0, 2 -- searching for errors EXEC xp_readerrorlog 0, 1, N'error' -- searching for errors that have to do with consistency checks EXEC xp_readerrorlog 0, 1, N'error', N'CHECKDB' -- searching for errors that have to do with consistency checks EXEC xp_readerrorlog 0, 1, N'failed', N'backups' -- searching for warnings EXEC xp_readerrorlog 0, 1, N'warning' -- searching who killed a session EXEC xp_readerrorlog 0, 1, N'kill' -- searching for I/O information EXEC xp_readerrorlog 0, 1, N'I/O' -- searching for consistency checks EXEC xp_readerrorlog 0, 1, N'CHECKDB' -- searching for consistency checks performed via DBCC EXEC xp_readerrorlog 0, 1, N'DBCC CHECKDB' -- searching for failed logins EXEC xp_readerrorlog 0, 1, N'Login failed' -- searching for EXEC xp_readerrorlog 0, 1, N'[INFO]' -- searching for shutdowns EXEC xp_readerrorlog 0, 1, N'shutdown' -- searching for a database instant file initialization event EXEC xp_readerrorlog 0, 1, N'database instant file initialization'
If the error log is too big it’s important to narrow the search for a given time interval:
-- searching for errors starting with a given date DECLARE @StartDate as Date = DateAdd(d, -1, GetDate()) EXEC xp_readerrorlog 0, 1, N'error', N'', @StartDate -- searching for errors within a time interval DECLARE @StartDate as Date = DateAdd(d, -14, GetDate()) DECLARE @EndDate as Date = DateAdd(d, -7, GetDate()) EXEC xp_readerrorlog 0, 1, N'', N'', @StartDate, @EndDate, N'desc'
The output can be dumped into a table especially when is needed to perform a detailed analysis on the error log. It might be interesting to check how often an error message occurred, like in the below example. One can take thus advantage of more complex pattern searching.
-- creating the error log table CREATE TABLE dbo.ErrorLogMessages ( LogDate datetime2(0) , ProcessInfo nvarchar(255) , [Text] nvarchar(max)) -- loading the errors INSERT INTO dbo.ErrorLogMessages EXEC xp_readerrorlog 0, 1 -- checking the results SELECT * FROM dbo.ErrorLogMessages -- checking messages frequency SELECT [Text] , count(*) NoOccurrences , Min(LogDate) FirstOccurrence FROM dbo.ErrorLogMessages GROUP BY [Text] HAVING count(*)>1 ORDER BY NoOccurrences DESC -- getting the errors and their information SELECT * FROM ( SELECT * , Lead([Text], 1) OVER (PARTITION BY LogDate, ProcessInfo ORDER BY LogDate) PrevMessage FROM dbo.ErrorLogMessages ) DAT WHERE [Text] LIKE '%error:%[0-9]%' -- cleaning up --DROP TABLE IF EXISTS dbo.ErrorLogMessages
For those who don’t have admin permissions it is necessary to explicitly give execute permissions on the xp_readerrorlog stored procedure:
-- giving explicit permissions to account GRANT EXECUTE ON xp_readerrorlog TO [<account_name>]
Personally, I’ve been using the stored procedure mainly to check whether error messages were logged for a given time interval and whether the consistency checks run without problems. Occasionally, I used it to check for failed logins or sessions terminations (aka kills).
Notes:
Microsoft warns that undocumented objects might change in future releases. Fortunately, xp_readerrorlog made it since SQL Server 2005 to SQL Server 2017, so it might make it further…
The above code was tested also on SQL Server 2017.
Happy coding!
22 October 2018
💠🛠️SQL Server Administration: Troubleshooting Root Blocking Sessions
Unless special monitoring tools like Redgate’s SQL Monitor are available, when it comes to troubleshooting blocking sessions one can use the sp_who2 and sp_whoisactive stored procedures or the further available DMVs, which can prove useful upon case. In the case one just needs to find the root session that blocks other sessions, as it happens in poorly designed and/or heavily used databases, one can follow the chain of blockings provided by the sp_who2, though it requires time. Therefore, based on sp_who2 logic I put together a script that with the help of common tables expressions walks through the hierarchical chain to identify the root blocking session(s):
;WITH CTERequests AS ( -- taking a snapshot of requests SELECT ER.session_id , ER.request_id , ER.blocking_session_id , ER.database_id , ER.status , ER.sql_handle , ER.command , ER.start_time , ER.total_elapsed_time/(1000*1000) total_elapsed_time , ER.cpu_time/(1000*1000) cpu_time , ER.wait_type , ER.wait_time/(1000*1000) wait_time , ER.reads , ER.writes , ER.granted_query_memory/128.0 granted_query_memory_mb FROM sys.dm_exec_requests ER with (nolock) WHERE ER.session_id > 50 – user sessions ), CTE AS ( -- building the tree SELECT ER.session_id , ER.request_id , ER.blocking_session_id , ER.database_id , ER.status , ER.command , ER.sql_handle , Cast(NULL as varbinary(64)) blocking_sql_handle , 0 level , CAST(ER.session_id as nvarchar(max)) sessions_path , ER.start_time , ER.total_elapsed_time , ER.cpu_time , ER.wait_type , ER.wait_time , ER.reads , ER.writes , ER.granted_query_memory_mb FROM CTERequests ER -- WHERE ER.blocking_session_id <>0 -- only blocked sessions UNION ALL SELECT ER.session_id , ER.request_id , BR.blocking_session_id , ER.database_id , ER.status , ER.command , ER.sql_handle , BR.sql_handle blocking_sql_handle , ER.level + 1 level , ER.sessions_path + '/' + Cast(BR.session_id as nvarchar(10)) sessions_path , ER.start_time , ER.total_elapsed_time , ER.cpu_time , ER.wait_type , ER.wait_time , ER.reads , ER.writes , ER.granted_query_memory_mb FROM CTERequests br JOIN CTE ER ON BR.session_id = ER.blocking_session_id AND BR.session_id <> ER.session_id ) -- the final query SELECT CTE.session_id , CTE.blocking_session_id , CTE.request_id , DB_NAME(CTE.database_id) database_name , CTE.status , CTE.start_time , BES.last_request_end_time , CTE.level , CTE.sessions_path , CTE.command , CTE.sql_handle , est.text sql_text , CTE.blocking_sql_handle , best.text blocking_sql_text , CTE.total_elapsed_time , CTE.cpu_time , CTE.wait_type , CTE.wait_time , CTE.reads , CTE.writes , CTE.granted_query_memory_mb , TSU.request_internal_objects_alloc_page_count/128.0 allocated_mb , TSU.request_internal_objects_dealloc_page_count/128.0 deallocated_mb , ES.login_name , ES.[host_name] , BES.[host_name] blocking_host_name FROM CTE LEFT JOIN sys.dm_exec_sessions ES with (nolock) ON CTE.session_id = ES.session_id LEFT JOIN sys.dm_exec_sessions BES with (nolock) ON CTE.blocking_session_id = BES.session_id LEFT JOIN ( -- allocated space SELECT session_id , request_id , SUM(internal_objects_alloc_page_count) AS request_internal_objects_alloc_page_count , SUM(internal_objects_dealloc_page_count)AS request_internal_objects_dealloc_page_count FROM sys.dm_db_task_space_usage with (nolock) WHERE session_id > 50 GROUP BY session_id , request_id ) TSU ON TSU.request_id = CTE.request_id AND TSU.session_id = CTE.session_id OUTER APPLY sys.dm_exec_sql_text(CTE.sql_handle) as est OUTER APPLY sys.dm_exec_sql_text(CTE.blocking_sql_handle) as best WHERE CTE.level IN (SELECT MAX(B.level) -- longest path per session FROM CTE B WHERE CTE.session_id = B.session_id GROUP by B.session_id) ORDER BY CTE.session_id
Notes:
The script considers only the user sessions (session_id>50), however the constraint can be left out in order to get also the system sessions.
Over the years I discovered also 1-2 exceptions where the logic didn’t work as expected. Unfortunately, not having historical data made it difficult to find the cause.
Killing the root blocking session usually makes the other blockings disappear, however it’s not always the case.
There can multiple root blocking sessions as well.
For the meaning of the above attributes see the MSDN documentation: sys.dm_exec_requests, sys.dm_exec_sessions, sys.dm_db_task_space_usage, sys.dm_exec_sql_text.
Be careful when killing the sessions (see my other blog post)!
Happy coding!
About Me
- Adrian
- Koeln, NRW, Germany
- IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.