Thursday 8 August 2013

Finding Correlation Errors on an SP2013 farm

Background: SP 2013 has rich and expansive logging/tracing capabilities.  Logging is done via the Unified Logging Service (ULS).  This writes entries to the trace logs (often referred to as the ULS logs, the ULS trace logs or just ULS; the name doesn't matter, but you need to understand that the ULS service is more than just the trace log) and to the Windows Event Viewer.  Anything logged in the Event Viewer log will also be in the ULS trace logs.
It is worth checking how logging is set up on your farm.  I change the default location for my ULS trace logs.  Change the logging so it matches your farm's requirements.
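
As a rough sketch of checking and changing the logging setup, the commands below show the current diagnostic configuration and then move the trace logs. The target folder "D:\Logs\ULS" is a made-up example, and I am assuming you run this from the SharePoint 2013 Management Shell with farm admin rights.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Show the current diagnostic logging settings (log location, retention, etc.).
Get-SPDiagnosticConfig

# Move the trace logs. "D:\Logs\ULS" is a hypothetical path; the folder
# should exist on every server in the farm before you change the setting.
Set-SPDiagnosticConfig -LogLocation "D:\Logs\ULS"
```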

On a small farm, it's normally pretty easy to take a Correlation Id (the unique GUID generated for the SharePoint request), open the trace log in Notepad and find the error.  By default a new trace log is created every 30 minutes.  On busy production farms these log files hold a lot of data, and on a large farm you also have logs on multiple servers to check.  I use Microsoft's unsupported ULSViewer to look at all my logs regardless of farm size.  You can trace the logs live and then filter out what you need.  Another option is to open existing log files to investigate historical issues.  If you know the date/time and the server where the error occurred, open the correct log file (its name includes a date/time stamp) and then either filter for the Correlation Id or look around the time the error occurred.
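
If you would rather stay in PowerShell than open Notepad or ULSViewer, something like the following should pull the matching entries from the trace logs on the server you run it on. This is a sketch only: the Correlation Id is the placeholder used elsewhere in this post and the error time is invented.

```powershell
# Placeholder Correlation Id and a hypothetical time the user reported.
$correlationId = "5ca5555c-8555-4555-555b-f555af4d5555"
$errorTime     = Get-Date "2013-08-08 10:15"

# Get-SPLogEvent reads the trace logs on the local server only.
# Narrowing the time window keeps the search quick on busy farms.
Get-SPLogEvent -StartTime ($errorTime.AddMinutes(-10)) -EndTime ($errorTime.AddMinutes(10)) |
    Where-Object { $_.Correlation -eq $correlationId } |
    Select-Object Timestamp, Area, Category, Level, Message |
    Format-List
```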

Lastly, timer jobs ship entries from the ULS logs into the SharePoint Logging Database (SP_UsageandHealth).  You can directly query the SP_UsageandHealth database using T-SQL.

Tracing Correlation Errors on a SharePoint 2013 Farm.
A user passes you a Correlation Id and the date/time when the error occurred.  Find the appropriate ULS trace log, open it using ULSViewer and filter for the Correlation Id.  If you can reproduce the bug, you can selectively turn on the Developer Dashboard (there is a performance penalty).  It has a new SP2013 tab, "ULS", which shows the ULS trace snippet relating to the request.
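
One common way to switch the Developer Dashboard on is through the content web service settings, roughly as sketched below. Treat it as an illustration rather than the only way to do it, and remember to switch it off again afterwards because of the performance penalty.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# The Developer Dashboard setting lives on the farm's content web service.
$contentService = [Microsoft.SharePoint.Administration.SPWebService]::ContentService
$dashboard      = $contentService.DeveloperDashboardSettings

# Turn the dashboard on; use ::Off here to disable it again when finished.
$dashboard.DisplayLevel = [Microsoft.SharePoint.Administration.SPDeveloperDashboardLevel]::On
$dashboard.Update()
```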

On a big farm you may want to first find out which server in the farm had the error:
Merge-SPLogFile -Path ".\error.log" -Correlation "5ca5555c-8555-4555-555b-f555af4d5555"
Tip: Be aware this is a heavy process, so restrict which logs you will merge.
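
One way to restrict the merge is to pass a time window around the reported error. The sketch below assumes an invented error time and a 30 minute window; adjust both to suit.

```powershell
# Hypothetical time the user reported the error.
$errorTime = Get-Date "2013-08-08 10:15"

# Only merge entries for this Correlation Id logged within 15 minutes
# either side of the error, and overwrite any earlier error.log.
Merge-SPLogFile -Path ".\error.log" `
    -Correlation "5ca5555c-8555-4555-555b-f555af4d5555" `
    -StartTime ($errorTime.AddMinutes(-15)) `
    -EndTime ($errorTime.AddMinutes(15)) `
    -Overwrite
```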
 
Use ULSViewer to find the correlationId and review the logs.

Use the browser developer tools or Fiddler to examine the HTTP response from SharePoint to get the Correlation Id; this is the SPRequestGuid response header (assuming it is not shown on the error message).
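
The same header can also be read from a script. The sketch below assumes Windows PowerShell 3.0 or later, Windows authentication, and a made-up URL; it is not a replacement for Fiddler, just a quick check.

```powershell
# Hypothetical page that produces the error.
$url = "http://sharepoint.example.com/sites/team/default.aspx"

try {
    # Successful responses: read the header straight off the response.
    $response = Invoke-WebRequest -Uri $url -UseDefaultCredentials -UseBasicParsing
    $response.Headers["SPRequestGuid"]
}
catch [System.Net.WebException] {
    # Windows PowerShell throws on 4xx/5xx responses; the headers are
    # still available on the exception's response object, if there is one.
    $errorResponse = $_.Exception.Response
    if ($errorResponse -ne $null) {
        $errorResponse.Headers["SPRequestGuid"]
    }
}
```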

 

More Info:

Tobias Zimmergren has a great post on working with Correlation Ids.

http://www.sharepointblog.co.uk/2012/09/logging-capabilities-of-sharepoint/

http://habaneroconsulting.com/insights/An-Even-Better-Way-to-Get-the-Real-SharePoint-Error

List of the ULS viewers

http://www.sharepointblog.co.uk/2012/09/logging-capabilities-of-sharepoint/
