Showing posts with label Grafana. Show all posts

Monday 30 October 2023

Thoughts on Logging and Monitoring

Overview:  I mainly work in the Microsoft stack, so my default for logging is Azure Monitor.  Log Analytics workspaces and Application Insights both fall under the Azure Monitor umbrella.

Going forward, Microsoft stores App Insights logging data within a Log Analytics workspace.

There are 4 options for displaying and analysing logs in Azure:

  1. Azure Dashboards
  2. Power BI
  3. Grafana
  4. Workbooks

SIEM tools take in logs from various sources such as Azure Log Analytics, Defender, other vendors' tools, Prometheus, or OpenTelemetry.

Grafana can be used on top of most of these tools, including Dynatrace, New Relic, Microsoft Sentinel, and Azure Monitor.  Grafana supports PromQL and has fantastic dashboarding.

Tuesday 20 June 2023

App Insights for Power Platform - Part 7 - Monitoring Azure Dashboards

Series

App Insights for Power Platform - Part 1 - Series Overview 

App Insights for Power Platform - Part 2 - App Insights and Azure Log Analytics 

App Insights for Power Platform - Part 3 - Canvas App Logging (Instrumentation key)

App Insights for Power Platform - Part 4 - Model App Logging

App Insights for Power Platform - Part 5 - Logging for APIM 

App Insights for Power Platform - Part 6 - Power Automate Logging

App Insights for Power Platform - Part 7 - Monitoring Azure Dashboards (this post)

App Insights for Power Platform - Part 8 - Verify logging is going to the correct Log analytics

App Insights for Power Platform - Part 9 - Power Automate Licencing

App Insights for Power Platform - Part 10 - Custom Connector enable logging

App Insights for Power Platform - Part 11 - Custom Connector Behaviour from Canvas Apps Concern

Overview: Azure Dashboards are excellent, but if you want beautiful dashboards, use the Azure Grafana service or Power BI dashboards.

It's always difficult to identify KPIs and graphs that allow support and stakeholders to quickly digest monitoring information.  One option is to have an overview for the PM, PO, business owners, etc., and a separate set of dashboards for support.

Identify what is important.  End-to-end testing is always nice, especially if it runs as part of CI to detect abnormalities.

For instance, this Azure Dashboard fires off a recorded Canvas App test (done using Test Studio) and shows the speed (performance) of the run.  I then warn the user if the performance is too slow.  I also grab the oldest successful run from 7 days ago to check if performance is considerably different.

Power Automate has its own logging, but writing a log entry to Log Analytics whenever an error occurs allows me to see if any of my workflows have a problem.  This is discussed in part 6 of the logging series on Flows.
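As a rough sketch of what that could look like on the Log Analytics side, the query below counts flow errors per flow over the last 24 hours.  The table name `PowerAutomateErrors_CL` and the column `FlowName_s` are hypothetical; substitute whatever custom table and columns your flows actually write to.

```kusto
// Hypothetical custom table and column names: replace them with the ones your flows log to.
PowerAutomateErrors_CL
| where TimeGenerated > ago(24h)
// One row per flow, with the number of errors and the time of the most recent one.
| summarize errorCount = count(), lastError = max(TimeGenerated) by FlowName_s
| order by errorCount desc
```

A query like this can be pinned to an Azure Dashboard tile so support sees failing flows at a glance.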

Azure services are often used when building solutions using the Power Platform.  The most common are Functions, App Services, APIM, Service Bus, and maybe SendGrid.  So we need to know that they are working, and help the user see performance or issues.  Here are a couple of examples.







Friday 9 June 2023

App Insights for Power Platform - Part 1 - Series Overview

Overview: Microsoft has great capabilities for logging and monitoring.  In this series of posts I will be examining the various parts of logging that may be useful in building solutions that are well monitored, provide alerting and easy tracing, and identify issues or potential issues as soon as possible.

I am looking at App Insights for Power Platform monitoring.  So this includes: 

  • Power Apps (Canvas, and model apps),
  • Power Automate,
  • APIM, 
  • Azure Functions, 
  • Azure Service Bus, and
  • App Insights.

I shall be setting up a demo environment and these are the logical components being covered.


All the components making up the solution shall log into Log Analytics (left-hand side of the diagram).

For Continuous Integration, my clients will be Postman monitors (awesome, and such an easy way to reuse all those Postman collections) and Azure DevOps, which is great and which I'll use to run smoke tests after new releases.  I also use flows to report on flows (sounds nuts, but I love it).  These are at the bottom of the diagram.

Lastly, on the right of the diagram, I look at extracting logs for reporting (Power BI) and monitoring using Azure DevOps (P.S. think about Grafana instead of DevOps dashboards, it's so nice).

A couple of extras are: availability logging, alerting, automating Canvas app testing, and Playwright.

From the diagram, you can see the data is now held in Log Analytics, and it can be queried via Log Analytics or App Insights using Kusto.  Note: the syntax is slightly different between the two.
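To illustrate that difference, here is the same question (average request duration per operation over the last hour) sketched as two separate queries.  The first uses the classic App Insights table and column names (`requests`, `timestamp`, `duration`, `name`); the second uses the workspace-based Log Analytics names (`AppRequests`, `TimeGenerated`, `DurationMs`, `Name`).  The one-hour window is only an example.

```kusto
// Query 1: run from the App Insights "Logs" blade (classic table and column names).
requests
| where timestamp > ago(1h)
| summarize avgDurationMs = avg(duration), requestCount = sum(itemCount) by name
| order by avgDurationMs desc

// Query 2: run from the Log Analytics workspace (workspace-based table and column names).
AppRequests
| where TimeGenerated > ago(1h)
| summarize avgDurationMs = avg(DurationMs), requestCount = sum(ItemCount) by Name
| order by avgDurationMs desc
```

The Kusto operators are identical; only the table and column names change, which matters when copying queries between the two places.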

Series

App Insights for Power Platform - Part 1 - Series Overview (this post)

App Insights for Power Platform - Part 2 - App Insights and Azure Log Analytics 

App Insights for Power Platform - Part 3 - Canvas App Logging (Instrumentation key)

App Insights for Power Platform - Part 4 - Model App Logging

App Insights for Power Platform - Part 5 - Logging for APIM

App Insights for Power Platform - Part 6 - Power Automate Logging

App Insights for Power Platform - Part 7 - Monitoring Azure Dashboards 

App Insights for Power Platform - Part 8 - Verify logging is going to the correct Log analytics

App Insights for Power Platform - Part 9 - Power Automate Licencing

App Insights for Power Platform - Part 10 - Custom Connector enable logging

App Insights for Power Platform - Part 11 - Custom Connector Behaviour from Canvas Apps Concern

Tip: The Power Platform Admin Centre has a good overview of the Power Platform, but to make logging and monitoring better, push data into Azure Log Analytics and monitor and alert centrally.

Also see: View and download Dataverse analytics - Power Platform | Microsoft Learn

Sunday 2 January 2022

App Insights Overview for SaaS logging and tracing

Overview:  App Insights provides independent infrastructure for logging and tracing activities.  It is tightly coupled with Azure services including PaaS.  This allows for consistent, scalable logging.  App Insights now stores logs in Azure Log Analytics, and both sit under the umbrella of Azure Monitor.

On a SaaS solution, I am looking for App Insights to log any errors and to have the ability to log trace information.  I want a unique correlationId (to allow for distributed tracing) shown on the front end if there is an error, so support can identify the exact issue/transactions.  A unique correlationId in the HTTP header allows for identifying a transaction, and this is useful for tracing and performance monitoring.  Using the App Insights SDKs and implementing a common logging module is a good idea.  There are two common areas that need calling out to ensure the ability to trace transactions:

  1. SPAs (requirement to generate a unique operation/correlationId per operation, not per page view), and
  2. Long-running operations such as timer jobs or Service Bus calls.

Support & DevOps:

Having a correlationId allows first-line support to log the correlationId and quickly follow the request without asking for replication steps.  This context tracing approach is common in newer applications.  Third-line support gets full traceability of an issue and can empirically see how the perceived performance breaks down, part by part, using the correlationId in the header.
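A minimal sketch of following one request in App Insights, assuming the logging module stores the correlationId as the telemetry's `operation_Id` (that mapping is an assumption about your setup, and the id value below is a placeholder):

```kusto
// Placeholder value: paste the correlationId captured by first-line support.
let correlationId = "00000000000000000000000000000000";
// Pull every telemetry type that shares the same operation id.
union requests, dependencies, exceptions, traces
| where timestamp > ago(7d)
| where operation_Id == correlationId
// itemType tells you whether the row is a request, dependency, exception or trace.
| project timestamp, itemType, operation_Name, cloud_RoleName, duration, resultCode, message
| order by timestamp asc
```

Reading the rows in time order shows which component handled each step and where the time was spent.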

Key APIs can be continuously monitored for errors and slowdowns in performance, and alerts can be configured around this monitoring.
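One way to express such a check is a query that only returns rows when an API is failing or slow, which can then back a log alert rule.  The thresholds (5% server errors, 3 seconds at the 95th percentile) and the 15-minute window below are example values, not recommendations.

```kusto
// Example thresholds only: tune them per API.
let maxFailureRatePercent = 5.0;
let maxP95DurationMs = 3000;
requests
| where timestamp > ago(15m)
| summarize
    total = sum(itemCount),
    // Count requests that returned a 5xx result code as failures.
    failed = sumif(itemCount, toint(resultCode) >= 500),
    p95DurationMs = percentile(duration, 95)
    by name
| extend failureRatePercent = round(100.0 * failed / total, 1)
// Keep only the operations that breach either threshold.
| where failureRatePercent > maxFailureRatePercent or p95DurationMs > maxP95DurationMs
| order by failureRatePercent desc
```

An alert rule that fires when this query returns one or more rows keeps the noise down, because healthy APIs produce no output.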

Building a first-line support tool that displays the errors in a hierarchy and includes help scripts and knowledge bases is a good option for streamlining support.

App Insights has live monitoring, and its Kusto query language is useful for monitoring specific queries.


Summary Report for Support

// I'm sure there are nicer ways to write my Kusto, so please let me know where the code can be improved.
// Compares the average duration of slow requests (7 seconds to 2 minutes) across the last three 24-hour windows.
let slowBuckets = dynamic(["7sec-15sec", "15sec-30sec", "30sec-1min", "1min-2min"]);
// Window 0: the last 24 hours.
let result0 = requests
    | where timestamp > ago(24h) and timestamp < now()
    | where performanceBucket in (slowBuckets)
    | summarize requestCount = sum(itemCount), avgDuration = avg(duration) by performanceBucket;
// Window 1: 24 to 48 hours ago.
let result1 = requests
    | where timestamp > ago(48h) and timestamp < ago(24h)
    | where performanceBucket in (slowBuckets)
    | summarize requestCount1 = sum(itemCount), avgDuration1 = avg(duration) by performanceBucket;
// Window 2: 48 to 72 hours ago.
let result2 = requests
    | where timestamp > ago(3d) and timestamp < ago(2d)
    | where performanceBucket in (slowBuckets)
    | summarize requestCount2 = sum(itemCount), avgDuration2 = avg(duration) by performanceBucket;
// Inner joins keep only the buckets that have data in all three windows.
let resultTemp = result0
    | join kind=inner result1 on performanceBucket
    | project performanceBucket, ['Today'] = avgDuration, ['Yesterday'] = avgDuration1;
resultTemp
| join kind=inner result2 on performanceBucket
| project
    performanceBucket,
    // duration is in milliseconds: round to the nearest 100 ms, then convert to seconds.
    ['1) Today'] = (round(['Today'], -2) / 1000),
    ['2) Yesterday'] = (round(['Yesterday'], -2) / 1000),
    ['3) Two Days ago'] = (round(avgDuration2, -2) / 1000)
| render columnchart
    with (
    kind=unstacked,
    ytitle="Seconds Taken",
    xtitle="Performance Group",
    title="Ensure the 'Today' bar is not significantly higher than previous days");


Monitoring:  Azure dashboards are great for monitoring application health and performance.  They are easy to customise, you can make unique dashboards, and security is easy to control.  sentry.io monitors APIs, though I have not used it.  I like all the Azure tooling coming out for testing, and I feel that continuously running Postman collections and reporting the results to App Insights is the best way to go.  Azure Dashboards can be limiting, and Azure Managed Grafana can be a great alternative/enhancement.
source cloudiqtech

Alerting: All too often I see alerting overused, with recipients ending up ignoring a plethora of emails.  I believe in minimising alerts, especially via email and SMS-type messaging.  I like to create a dedicated channel for alerting that includes all DevOps members, and either notify via a Teams card or, even easier, email the channel.  This can be broken down further, but to start I create an alerting channel for each DTAP environment.

Note: The default channel setup only allows members of the team to send email to the channel, so the alerts sent by Azure Monitor rules won't be accepted.  On the channel, an admin needs to go to the "advanced settings" and change the option from "Only members of this Team" to "Anyone can send".

Options:  There are great services for logging so my default tends to be Azure Monitor.  The main players in Application & API observability and monitoring include: 

  • Microsoft: Azure Monitor includes Application Insights & Azure Log Analytics
  • Dynatrace (really good if you use multicloud), or Dynatrace with AWS CloudWatch.  The Dynatrace SaaS offering is hosted on AWS, and it can also run on-prem.  OneAgent is deployed on the compute, i.e. VMs or Kubernetes.  It can import logs from other SIEMs or Azure Monitor, so you can eventually get Azure service logs such as App Service or Service Bus.  It does full-stack monitoring covering code level, applications and infrastructure, and can also show user monitoring.  Dynatrace offers scalable APIs that sit on Kubernetes.  "Davis" is the AI engine used to help figure out problems.  Alerting is solid.
High-level Architecture

Dynatrace Admin Monitoring
  • AWS: Amazon CloudWatch Synthetics
  • AppDynamics,
  • Datadog (excellent),
  • New Relic,
  • SolarWinds (excellent)
SolarWinds admin UI from circa 2013/2014 

Dynatrace