Saturday 11 January 2014

SharePoint 2013 Search Limits with an example

Overview:  This post aims to provide guidelines for building SharePoint 2013 Search farms.  There are 6 search components (labelled C1-C6 below) and 4 database types (labelled DB1-DB4).  Index partitions are a big factor in search planning.

Example: Throughout this post I provide an example of a 60 million item search farm with redundancy/High Availability (HA).

Index partitions: The MS recommendation is to add 1 index partition per 10 million items, though the right number really depends on IOPS and how search is queried.  A mirrored copy of each partition (an index replica) is needed for HA, and it also improves query time over a single copy.
Example: Assuming a maximum of 10 million items per partition, an HA farm for 60 million items requires 6 partitions, each held by 2 index components.

Index component (C1): Use 2 index components (two replicas) for each partition.
Example: 12 index components (6 partitions x 2); the topology sketch after the server placement list below shows the cmdlets that create them.
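Once the topology is in place you can check partition layout and replica health from the SharePoint Management Shell. A minimal sketch, assuming the farm has a single default Search service application:

$ssa = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text   # lists each component, its partition and its state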

Query processing component (C2): Use 2 query processing components for HA/redundancy, and add another 2 query processing components for roughly every additional 80 million items.
Example: 2 Query processing components.

Crawl database (DB1): Use 1 crawl database per 20 million items.  This is probably the most commonly overlooked item in search farms.  The crawl database contains tracking and historical information about the crawled items, such as the last crawl id, crawl times and crawl history.  The crawl component feeds into the crawl database.  Medium usage should stay under 100GB.  Add another crawl database before you reach 20 million items or a 100GB database size.  My initial size is mdf 100 MB (growth 50MB) and the ldf is 300 MB (growth 50MB).
Example: 3 crawl databases at 20 million items each allow for a search farm containing 60 million items.
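For the example, the two extra crawl stores can be added with New-SPEnterpriseSearchCrawlDatabase. A minimal sketch, assuming the default Search service application; the database names and the SQL alias SQLAOAG01 are hypothetical:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
$ssa = Get-SPEnterpriseSearchServiceApplication
New-SPEnterpriseSearchCrawlDatabase -SearchApplication $ssa -DatabaseName "SP_Search_CrawlStore2" -DatabaseServer "SQLAOAG01"
New-SPEnterpriseSearchCrawlDatabase -SearchApplication $ssa -DatabaseName "SP_Search_CrawlStore3" -DatabaseServer "SQLAOAG01"

The crawl components spread items across all available crawl stores automatically (see C6 below); the analogous cmdlet for extra link databases is New-SPEnterpriseSearchLinksDatabase.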

Link database (DB2): Use 1 link database per 60 million items; I believe 1 link database will handle up to 100 million items.  Initial size: mdf 100 MB (growth 50 MB), ldf 25 MB (growth 25 MB).
Example: 1 link database.

Analytics reporting database (DB3): Add 1 analytics reporting database for every 500,000 unique items viewed each day, or for every 10-20 million total items.  This is the heaviest-used search database.  Add a new database to keep each analytics reporting database under roughly 250GB.  Initial size: mdf 100 MB (growth 50 MB), ldf 25 MB (growth 25 MB).
Example: Start with 1 and grow as needed.

Analytics Processing Component (C3): runs the search analytics and usage analytics jobs that feed relevance, recommendations and the usage reports, writing its results to the analytics reporting database and the link database.
Example: 2-4 Analytics Processing components.

Content Processing Component (C4): processes crawled items and passes the item data to the index component.  Its function is to parse documents, perform property mapping and entity extraction, apply language processing, and ultimately turn crawled items into indexed items.
Example: 4 Content Processing components.

Admin component (C5): Use 1 administration component, or 2 for redundancy/HA; this applies to all farm sizes.
Example: 2 Admin components.

Admin database (DB4): Low usage; even in big farms you only need 1 database, and it should stay well under 100GB.  It holds crawled and managed properties, query rules, the topology and history.  My initial size is mdf 100 MB (growth 10MB) and the ldf is 100 MB (growth 50MB).
Example: 1 Admin database.

Crawl Component (C6):  The crawl component crawls content sources and delivers crawled items, including metadata, to the Content Processing component.  In SP2013 you don't specify the relationship between the crawl database and the crawl component; the crawl component distributes items across all available crawl databases.  The 3 types of crawls available in SP2013 are Full, Incremental and Continuous (continuous only works for SP2013 content sources).  Schema changes still require a full crawl to pick up the change in SP2013.  The crawl component does far less analysis than in SP2010, so crawling is a much lighter/faster process.
Example: 2 Crawl components allow for HA and improved performance.
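A minimal sketch of driving the three crawl types from PowerShell (one at a time in practice), assuming the default "Local SharePoint sites" content source:

$ssa = Get-SPEnterpriseSearchServiceApplication
$cs  = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"
$cs.StartFullCrawl()                   # full crawl; required after schema changes
$cs.StartIncrementalCrawl()            # incremental crawl
$cs.EnableContinuousCrawls = $true     # continuous crawl; SP2013 content sources only
$cs.Update()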

Database Hardware: for the example use 8 CPUs and 16GB of RAM; disk size depends on content, but the search databases are smaller than in SP2010.

Placing components on VMs for the example
Group your search roles onto servers:
  • Index & Query Processing
  • Analytics & Content Processing
  • Crawl, Content processing & Search Admin 
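A minimal topology sketch for this grouping, sized for the 60 million item example (12 index, 2 query processing, 2 analytics processing, 4 content processing, 2 crawl and 2 admin components).  The host names IDX01-IDX04, APC01-APC02 and CRL01-CRL02 are hypothetical; substitute your own VMs:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
$ssa   = Get-SPEnterpriseSearchServiceApplication
$clone = $ssa.ActiveTopology.Clone()

# Resolve and start the search service instance on each host
function Get-StartedInstance([string]$server) {
    $si = Get-SPEnterpriseSearchServiceInstance -Identity $server
    Start-SPEnterpriseSearchServiceInstance -Identity $si
    return $si
}
$idx = "IDX01","IDX02","IDX03","IDX04" | ForEach-Object { Get-StartedInstance $_ }
$apc = "APC01","APC02"                 | ForEach-Object { Get-StartedInstance $_ }
$crl = "CRL01","CRL02"                 | ForEach-Object { Get-StartedInstance $_ }

# Index & query processing servers: 6 partitions x 2 replicas = 12 index components
# (partitions 0-2 on IDX01/IDX02, partitions 3-5 on IDX03/IDX04), plus 2 query components
foreach ($p in 0..5) {
    $pair = if ($p -lt 3) { $idx[0..1] } else { $idx[2..3] }
    foreach ($si in $pair) {
        New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $si -IndexPartition $p
    }
}
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $idx[0]
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $idx[2]

# Analytics & content processing servers
foreach ($si in $apc) {
    New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $si
    New-SPEnterpriseSearchContentProcessingComponent   -SearchTopology $clone -SearchServiceInstance $si
}

# Crawl, content processing & search admin servers
foreach ($si in $crl) {
    New-SPEnterpriseSearchCrawlComponent             -SearchTopology $clone -SearchServiceInstance $si
    New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $si
    New-SPEnterpriseSearchAdminComponent             -SearchTopology $clone -SearchServiceInstance $si
}

# Wait for all instances to report Online, then activate the new topology
Set-SPEnterpriseSearchTopology -Identity $clone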

Note: I have included suggested mdf and ldf sizing and growth assuming the full recovery model, as I use AOAG (if you are using the default simple recovery model, only worry about the mdf sizes).  These numbers are based on my farms' usage, so yours will vary, but they are a good starting point until you can monitor your own database growth patterns.  Change the ldf and mdf settings, as the default database settings are completely inappropriate: growth must be in fixed MB (never percentages), and you want as little ldf growth happening on the fly as possible.  A good guideline is a 100 MB initial mdf size with 50 MB growth, with the ldf at 25-50% of the size of the mdf (I would use 50 MB as a minimum for the initial ldf size), and ldf growth set to 50 MB on all 4 search databases.  Search ldfs are pretty hectic, so in this post you will notice higher ldf settings than this generic guideline.  Also check out the SQL Checklist for SP2013 post.  Backup frequency affects log/ldf file usage, so check out this post to understand how your databases need to be set up (in short, more frequent backups allow smaller ldfs).

Database                            mdf     mdf growth  ldf     ldf growth
SP_Search_Admin                     100 MB  10 MB       100 MB  50 MB
SP_Search_CrawlStore                100 MB  50 MB       300 MB  100 MB
SP_Search_AnalyticsReportingStore   100 MB  50 MB       25 MB   25 MB
SP_Search_LinksStore                100 MB  50 MB       25 MB   25 MB
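A hedged sketch of applying these sizes with T-SQL from PowerShell.  It assumes the logical file names follow the default pattern (database name and database name + "_log"; check sys.master_files for the real names first) and uses the hypothetical SQL alias SQLAOAG01.  Note ALTER DATABASE ... MODIFY FILE can only grow a file, never shrink one:

Import-Module SQLPS -DisableNameChecking   # the module is named SqlServer on newer SQL versions
$sql = @'
ALTER DATABASE [SP_Search_Admin]                   MODIFY FILE (NAME = N'SP_Search_Admin',                       SIZE = 100MB, FILEGROWTH = 10MB);
ALTER DATABASE [SP_Search_Admin]                   MODIFY FILE (NAME = N'SP_Search_Admin_log',                   SIZE = 100MB, FILEGROWTH = 50MB);
ALTER DATABASE [SP_Search_CrawlStore]              MODIFY FILE (NAME = N'SP_Search_CrawlStore',                  SIZE = 100MB, FILEGROWTH = 50MB);
ALTER DATABASE [SP_Search_CrawlStore]              MODIFY FILE (NAME = N'SP_Search_CrawlStore_log',              SIZE = 300MB, FILEGROWTH = 100MB);
ALTER DATABASE [SP_Search_AnalyticsReportingStore] MODIFY FILE (NAME = N'SP_Search_AnalyticsReportingStore',     SIZE = 100MB, FILEGROWTH = 50MB);
ALTER DATABASE [SP_Search_AnalyticsReportingStore] MODIFY FILE (NAME = N'SP_Search_AnalyticsReportingStore_log', SIZE = 25MB,  FILEGROWTH = 25MB);
ALTER DATABASE [SP_Search_LinksStore]              MODIFY FILE (NAME = N'SP_Search_LinksStore',                  SIZE = 100MB, FILEGROWTH = 50MB);
ALTER DATABASE [SP_Search_LinksStore]              MODIFY FILE (NAME = N'SP_Search_LinksStore_log',              SIZE = 25MB,  FILEGROWTH = 25MB);
'@
Invoke-Sqlcmd -ServerInstance "SQLAOAG01" -Query $sql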

More Info:

Troubleshooting Crawl
