
Monday 11 July 2022

What is technical debt? And how to handle it

Overview: Technical Debt generally refers to a buildup of deficiencies that makes changing code or optimizing systems difficult.  The key is to identify what in your organisation/program/project makes up technical debt.  

Technical Debt generally refers to poor or missing NFRs such as Performance, Security, Maintainability, Reliability, Scalability, Testability, or Resiliency.  But it can also extend into future architecture: if this part of our system is popular, can we easily adjust and keep releasing features?  So as you can see, technical debt can be very wide, and it's far better to focus on a subset; otherwise POs and PMs tend to scope everything under technical debt, and it gets nasty telling them about "additional technical debt".

I find the easiest way to define technical debt, and to avoid long discussions, is to list out what cannot be considered technical debt.  This would be my minimum starting point:

  1. Bugs (Functional defects);
  2. Technical Skill Debt;
  3. Process Defects (Lack of process or poor process, such as Configuration Management);
  4. Feature Debt (Wrong, delayed, or missing functionality.  A recent favourite example is "how can a system not have customer off-boarding? It's obviously technical debt" - this is feature debt; make sure stakeholders know, or it falls into the old "IT/Dev are weak and missed things" description.); and
  5. UI/UX Defects (Inconsistent or poor or changing user experience).

Another item is spaghetti code, which falls under the NFR of code maintainability.  With old systems you have to be pragmatic: as a general rule, if the product brings in $100k per year, it's not a good idea to spend $120k a year making the code more readable without improving the technology.  On old systems, I try to keep code maintainability out of the technical debt.  Put it in another, more detailed section; just don't lump everything, especially huge changes, under technical debt.  Dev teams lose focus and it causes problems down the line.  All too often, over-eager bundling of work into technical debt results in "even more interest to pay later".


Sunday 1 August 2021

JMeter - The basics

JMeter is an easy-to-use open source load testing tool that simulates network requests.  JMeter is good for figuring out how well the server-side responses hold up under different test conditions.  JMeter is built with Java and can run on Linux, Mac or Windows using a Java Virtual Machine (JVM). 


JMeter is Single Agent:

  • JMeter runs from the machine it is installed on, so it does not have multiple agents.  That said, it can simulate hundreds of users on fairly low-spec machines.  
  • To avoid network latency, test on the same subnet or data center.  A simple VM in Azure (with 2 vCPUs and 8 GB RAM) can mimic over a thousand requests per second.
  • You can run tests off multiple machines to generate extreme loads (I would first scale up to 8 cores and 64 GB RAM until the network traffic is maxed out); see the command sketch below.
  • Install the Windows JDK 11 before installing JMeter.
Updated 2 May 2023: The current version of JMeter is Apache JMeter 5.5
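
To run tests from multiple machines, JMeter's built-in remote testing can be used: each load-generator machine runs the jmeter-server agent, and the controller drives them in non-GUI mode with the -R flag.  A minimal sketch (the test plan name and host IPs are placeholder assumptions):

  # On each load-generator machine, start the JMeter agent:
  /bin/jmeter-server

  # On the controller, run the plan in non-GUI mode against the remote agents:
  /bin/jmeter -n -t testplan.jmx -R 10.0.0.11,10.0.0.12 -l results.jtl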

Download and record web tests using the JMeter GUI tool.
Azure Load Testing needs the recorded tests generated by the JMeter GUI.
Create a new Azure Load Test Resource and use the recorded JMX/Test script file.


JMeter GUI
Open /bin/jmeter.bat (Windows) or /bin/jmeter (Linux/Mac)
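
The GUI is best kept for recording and debugging; the actual load test runs faster and more reliably in non-GUI mode.  A minimal sketch of a command-line run (testplan.jmx and the report folder name are assumptions):

  # -n = non-GUI, -t = test plan, -l = results log, -e -o = generate the HTML dashboard report
  /bin/jmeter -n -t testplan.jmx -l results.jtl -e -o report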


Friday 13 March 2015

Capturing NFRs for SharePoint

Problem: Gathering Non-Functional Requirements (NFRs) is always tricky in IT projects, because it is difficult to estimate how the system will be used before you build it.  I often get business users stating extreme NFRs in an attempt to negotiate or to show how world class they are (I generally think the opposite when hearing unreasonable NFRs). 

An example is a CIO at a fairly small NGO telling me the on-prem. SP 2010 infrastructure needs to be up all the time, so an SLA of 99.99999%.  This equates to 3.2 seconds of downtime a year.  In reality, higher SLAs start to cost a lot of money.  SP2013 and SQL 2012 introduced AlwaysOn Availability Groups (AOAG), which help improve SLA uptime, but this costs in licensing, infrastructure, and management.  I need redundancy and the ability to deal with performance issues, so the smallest possible farm consists of 6 servers, 2 for each layer in SP, namely: WFE, App, and SQL.
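
The arithmetic behind that 3.2 seconds figure:

  Seconds in a year:  365 x 24 x 60 x 60 = 31,536,000
  Downtime allowed at 99.99999%:  31,536,000 x (1 - 0.9999999) ≈ 3.2 seconds
  By comparison, 99.9% still only allows:  31,536,000 x 0.001 = 31,536 seconds ≈ 8.8 hours a year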

Here is an old post on SP2010 SLAs that is still relevant today.

The key is to gather your NFRs and ensure all your usage/applications on the production farm meet expected behaviours.  I have a checklist below.  Going through Microsoft's SP Boundaries, Limits and Thresholds document will help highlight any issues.

The high level items I cover include the following topics:
  • Availability
  • Capacity
  • Compatibility (Browser, device, mobile)
  • Concurrency
  • Performance
  • Disaster Recovery (RTO, RPO)
  • Scalability
  • Search
  • Security
  • SLA

Capacity Example

Item                       Day 1      Year 1        Year 3       Year 5
Site Collections           10         100           250          400
Database Size              < 1 GB     490 GB        1220 GB      1960 GB
Search Index Size          < 1 GB     120 GB        310 GB       490 GB
No of Content Databases    1          1             4            8
No of Search Items         10,000     10 Million    25 Million   40 Million
No of Index Partitions     1          1             3            4


Item               Day 1     Year 1     Year 2     Year 3
Number of Users    1,000     50,000     80,000     130,000

*Also calculate peak and average concurrency numbers

Average concurrency: for 20,000 users, the assumption is that 10% (2,000 users) will be actively using the solution at the same time, and that 1% of the total user base (200 users) will be actively making requests.  So for performance testing, you are looking to handle 200 concurrent requests without delays and with a page response time of under 5 seconds.  This is based on the simple guideline I've always used from Microsoft.
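
Worked through for the 20,000-user example:

  Concurrent users (10%):   20,000 x 0.10 = 2,000
  Active requests (1%):     20,000 x 0.01 = 200
  Performance test target:  200 simultaneous requests, page response time under 5 seconds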

Peak concurrency depends on your situation.  For example, the NFL playoff game schedule when it is announced does not follow the simple rule of 4 times the average concurrency that would be suitable for most internal business applications.  Although this example may be considered a load spike rather than peak concurrency.  

It is also worth building a usage distribution pattern for your users' experience.  For example, 80% may be light users: login, read 10 pages on your site, and perform a single search, with 1 minute gaps between interactions (wait times).  The remaining 20% perform a login, upload a 100 KB document, view 10 pages, and perform 2 searches.

RPO & RTO:

RPO - Maximum amount of data loss, measured in time.  For example, a 1-hour RPO means backups (or log shipping) must run at least hourly.
RTO - Maximum time to make the system operational again (e.g., rebuild the farm and restore the latest backups).   

SQL Server Sizing:
Option 1: Work out the bytes per row for each table, multiply by the expected number of rows, and add the tables together to get the total size.
Option 2: Assume 100 bytes for each row, count the number of rows, and derive the storage requirement.
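
A worked example of Option 2 (the row count is a made-up assumption):

  Estimated rows:  2,000,000
  Bytes per row:   100
  Storage:         2,000,000 x 100 = 200,000,000 bytes ≈ 190 MB (before indexes and overhead)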

More Info:
https://technet.microsoft.com/en-us/library/ff758647.aspx