Problem: Gathering Non-Functional Requirements (NFRs) is always tricky in IT projects, because it is difficult to estimate how the system will be used before you build it. I often get business users stating extreme NFRs in an attempt to negotiate, or to show how world class they are (I generally think the opposite when I hear unreasonable NFRs).
An example is a CIO at a fairly small NGO telling me the on-premises SharePoint 2010 infrastructure needs to be up all the time, so an SLA of 99.99999%. This equates to about 3.2 seconds of downtime a year. In reality, higher SLAs start to cost a lot of money. SharePoint 2013 and SQL Server 2012 introduced AlwaysOn Availability Groups (AOAG), which help improve uptime, but this costs in licensing, infrastructure and management. I need redundancy and the ability to deal with performance issues, so the smallest possible farm consists of 6 servers, 2 for each layer in SharePoint, namely: WFE, App and SQL.
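To sanity-check an SLA figure like that, a quick back-of-the-envelope calculation of allowed downtime per year is useful. The sketch below is plain availability arithmetic, not anything SharePoint specific, and shows how quickly the allowance shrinks as you add nines:

```python
# Allowed downtime per year for a given availability percentage.
# Simple arithmetic only - a real SLA also defines maintenance windows,
# measurement periods and exclusions.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

for sla in (99.0, 99.9, 99.99, 99.999, 99.99999):
    downtime = SECONDS_PER_YEAR * (1 - sla / 100)
    print(f"{sla}% availability -> {downtime:,.1f} seconds "
          f"({downtime / 60:,.1f} minutes) downtime per year")

# 99.99999% works out to roughly 3.2 seconds per year - effectively
# unachievable for an on-premises SharePoint 2010 farm.
```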
Here is an old post on SP2010 SLAs, but it is still relevant today.
The key is to gather your NFRs and ensure all your usage/applications on the production farm meet expected behaviours. I have a checklist below. Going through Microsoft's SharePoint Boundaries, Limits and Thresholds document will help highlight any issues.
The high-level items I cover include the following topics:
- Availability
- Capacity
- Compatibility (Browser, device, mobile)
- Concurrency
- Performance
- Disaster Recovery (RTO, RPO)
- Scalability
- Search
- Security
- SLA
Capacity Example
| Item | Day 1 | Year 1 | Year 3 | Year 5 |
|---|---|---|---|---|
| Site Collections | 10 | 100 | 250 | 400 |
| Database Size in GB | > 1 GB | 490 GB | 1220 GB | 1960 GB |
| Search Index Size in GB | > 1 GB | 120 GB | 310 GB | 490 GB |
| No of Content Databases | 1 | 1 | 4 | 8 |
| No of Search Items | 10,000 | 10 Million | 25 Million | 40 Million |
| No of Index Partitions | 1 | 1 | 3 | 4 |
| Item | Day 1 | Year 1 | Year 2 | Year 3 |
|---|---|---|---|---|
| Number of Users | 1,000 | 50,000 | 80,000 | 130,000 |
*Also calculate peak and average concurrency numbers
Average concurrency: for 20,000 users, the assumption is that 10% (2,000 users) will be actively using the solution at the same time, and that 1% of the total user base (200 users) will be actively making requests. So for performance testing you are looking to handle 200 concurrent requests without delays and with a page response time of under 5 seconds. This is based on the simple guideline I've always used from Microsoft.
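As a rough illustration of that guideline (the 10% and 1% ratios are the assumptions stated above, not fixed rules), the arithmetic looks like this:

```python
# Rough average-concurrency estimate based on the assumptions above:
# 10% of users active at the same time, 1% actively making requests.
total_users = 20_000
active_ratio = 0.10        # assumption: users with the solution open
requesting_ratio = 0.01    # assumption: users making a request right now

active_users = int(total_users * active_ratio)              # 2,000
concurrent_requests = int(total_users * requesting_ratio)   # 200

print(f"Active users: {active_users}")
print(f"Concurrent requests to test for: {concurrent_requests}")
print("Target: page response time under 5 seconds at this load")
```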
Peak concurrency depends on your situation. For example, the load when the NFL playoffs game schedule is announced is not the simple 4 times the average concurrency that would be suitable for most internal business applications, although this example may be considered a load spike rather than peak concurrency.
It is also worth doing a usage distribution pattern for your users' experience. For example, 80% may be light users who log in, read 10 pages on your site and perform a single search, with 1 minute gaps between interactions (wait times); the remaining 20% log in, upload a 100 KB document, view 10 pages and perform 2 searches.
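One way to capture that distribution is as a simple profile a load-test script can consume. This is only a sketch of the mix described above; the field names are my own and not tied to any particular load-testing tool:

```python
# Hypothetical usage-distribution profile mirroring the mix described above.
usage_profile = [
    {
        "name": "light_user",
        "share": 0.80,                 # 80% of the user base
        "actions": ["login", "read 10 pages", "1 search"],
        "think_time_seconds": 60,      # 1 minute between interactions
    },
    {
        "name": "contributor",
        "share": 0.20,                 # remaining 20%
        "actions": ["login", "upload 100 KB document", "view 10 pages", "2 searches"],
        "think_time_seconds": 60,
    },
]

# Split the 200 concurrent requests from the earlier estimate across the profiles.
concurrent_requests = 200
for profile in usage_profile:
    print(f"{profile['name']}: {int(concurrent_requests * profile['share'])} virtual users")
```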
RPO & RTO:
RPO - Maximum amount of data loss that can be tolerated, expressed in time.
RTO - Maximum time allowed to make the system operational again (rebuild the farm and restore the latest backups).
RPO & RTO:
RPO - Max amount of lost data (in time)
RTO - Max time lost (rebuild farm and get the latest backups restored) to make the system operational again.
SQL Server Sizing:
Option 1: Work out the bytes per row for each table, multiply by the expected number of rows, and then add the tables together to get the total size.
Option 2: Assume 100 bytes per row, count the number of rows and derive the storage requirement.
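A minimal sketch of both options follows; the table names, row sizes and row counts are made-up assumptions purely for illustration:

```python
# Option 1: per-table estimate - bytes per row x expected rows, summed.
# All figures below are illustrative assumptions, not real SharePoint sizes.
tables = {
    "documents":  {"bytes_per_row": 800, "rows": 2_000_000},
    "list_items": {"bytes_per_row": 400, "rows": 5_000_000},
    "audit_log":  {"bytes_per_row": 200, "rows": 10_000_000},
}
option1_bytes = sum(t["bytes_per_row"] * t["rows"] for t in tables.values())
print(f"Option 1 estimate: {option1_bytes / 1024**3:.1f} GB")

# Option 2: flat 100 bytes per row across the total row count.
total_rows = sum(t["rows"] for t in tables.values())
option2_bytes = total_rows * 100
print(f"Option 2 estimate: {option2_bytes / 1024**3:.1f} GB")
```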
More Info:
https://technet.microsoft.com/en-us/library/ff758647.aspx