Showing posts with label Content databases. Show all posts

Saturday 7 June 2014

Content database Sizing & Cleanup

Problem:  At a customer site, a content database is massive.  Various site collections are using the same content database.

Initial Hypothesis:  Smaller site collections can be moved to new separate content databases.  This reduces the size to some degree.  The SQL log (ldf) is in good shape.  The excessive size is due to 3 issues.  First, multiple versions of large blobs (we need the versioning, so removing versions is not an option).  Second, the recycle bin is set to the default 50%, and dumping older content from it brings down the size.  Lastly, deleted site collections are still stored within SQL.

Resolution: Remove the previously deleted site collections from the content database.  Using CA, I tried to run the "Gradual Site Delete" timer job, but it made no difference; the deleted site was still lingering about.

Used PowerShell to remove the deleted site collections instead.  E.g. PS> Remove-SPDeletedSite -Identity ""

More Info:
http://blogbaris.blogspot.co.uk/2013/01/delete-sharepoint-2010-site-collection.html
 

Saturday 28 May 2011

Scanning, Storage & RBS

Problem:  The client has millions of physical documents that need to be available via SharePoint.  Additionally, documentation still arrives in physical form and needs to be scanned and classified.

Initial Hypothesis: SP2010 can store documents in the SQL database in blob format; however, it is not really made for large blob storage performance-wise, and SQL storage is expensive (RAID, HA). Remote Blob Storage (RBS) helps with storing blobs but does not get around the limitations imposed by MS guidance.  RBS can reduce storage and improve performance if your data storage involves a lot of large blobs (over 256 KB is a good size).  My rough sums show a huge data requirement: for example, 600,000 customers transact with the client, and on average each customer has 3 physical documents a year, so we are talking about 1.8 million scanned documents a year.

Documents need to be scanned at 300 dpi so they can be printed and stored adequately.  With compression and conversion of these files into TIFF/PDF, we are assuming an average of 1 MB per file.  So 1.8 million scanned documents a year at 1 MB per file means a storage requirement of about 1,800 GB per year.

We have a restriction of 200GB per content database in SP2010 (the threshold that MS will support up to), so we would require 9 new site collections, each on a new content db, per year to meet this requirement.
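The rough sums above can be reproduced in a few lines of Python. This is only a sketch: the function names are made up for illustration, the figures are the estimates from this post, and it follows the post in treating 1,000 MB as 1 GB.

```python
import math

def yearly_storage_gb(customers, docs_per_customer, mb_per_doc):
    """Return (documents per year, storage per year in GB)."""
    docs = customers * docs_per_customer
    # 1 GB is taken as 1,000 MB, matching the rough sums in the post.
    return docs, docs * mb_per_doc / 1000

def content_dbs_needed(storage_gb, max_db_gb=200):
    """Content databases needed if each one is capped at max_db_gb."""
    return math.ceil(storage_gb / max_db_gb)

docs, storage_gb = yearly_storage_gb(600_000, 3, 1)
print(docs, storage_gb, content_dbs_needed(storage_gb))
```

Changing the average file size or the number of documents per customer shows quickly how sensitive the number of content databases per year is to the scanning settings.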

Tip: Also worth considering are the thresholds and boundaries provided by the SharePoint team.  Maximum site collection size is 100GB; this scenario has a caveat in that a single site collection using a single document library/site supports up to 1TB in the content database.  You can have subsites nested in a site collection, but a maximum of 2,000 per view is recommended.  Max of 300 content databases per web application.  Max of 5,000 site collections per content database.

Our storage cost is much higher as our disks are RAID, so at a minimum we would use 3 times this in actual physical disk space.  On top of this, my indexes will be about 25% of the storage requirement.  So price and performance are getting out of control pretty quickly.
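As a rough illustration of how quickly this adds up, here is a small Python sketch. The function name is hypothetical, and it assumes the 25% index overhead is added to the logical requirement before the factor of 3 for RAID is applied; the post does not say which way round these combine.

```python
def physical_storage_gb(logical_gb, index_ratio=0.25, raid_factor=3):
    """Estimate raw disk space from a logical storage requirement.

    Assumption: the index overhead sits on the same RAID disks,
    so it is multiplied by the RAID factor as well.
    """
    index_gb = logical_gb * index_ratio
    return (logical_gb + index_gb) * raid_factor

# 1,800 GB of scanned documents per year, as estimated earlier in the post.
print(physical_storage_gb(1800))
```

If the indexes live on separate, non-RAID storage, the estimate would be lower, so treat the result as an upper-end figure.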

Resolution: My estimate is that using RBS will reduce the data held in the content database itself by 90%.  However, content database size is calculated including the blobs held in RBS.  So resilient storage will be cheaper using RBS, but the content database size counted for sizing purposes will not be reduced by using RBS.

Updated: 21/07/2011 - RBS sizing Calc

Scanning tips for SP:
  • TIFF or PDF are the common base storage file types;
  • 300 dpi is good print quality; most requirements can be lower;
  • Black and white is far smaller than grey scale scanning;
  • PDFs, if stored correctly, can be indexed by the search crawler.
More Info Scanning:
http://www.psigen.com/ - scanning and capture for SP2010.
Capturx from www.adapx.com/sharepoint is a pen that automates data capture on forms.
CoSign does digital signatures and looks to have pretty decent integration with SharePoint: http://www.arx.com/digital-signature/sharepoint
www.kodak.com/go/sharepoint
http://www.goscan.com/connectors-sharepoint.php
http://www.kofax.com/solutions/microsoft.asp

More Info Sizing:
HP Sizer for SP2010
Capacity management for SP2010 - Software boundaries

Wednesday 15 December 2010

SharePoint 2010 boundaries and thresholds

I attended a suguk.org event in London about a week ago.  John Timney did the 1st presentation session and asked a couple of questions on SharePoint limits.  I didn't know the answers, tried to think back to MOSS and what I'd seen previously.  The simplest question that I should know the answer to:

Qu: What is the maximum content database size supported by SharePoint 2010?
Ans: Microsoft supports content databases up to 200GB in size.  In MOSS it was 100GB.  It is fairly common to see content databases considerably bigger than 100GB in MOSS that work.   The issue is how long it takes to perform operations on these content databases, such as backups or moving content databases.  If you have a dedicated SAN, there is no reason not to go to much larger content databases; however, they are not supported by MS.

More info on SharePoint's boundaries and thresholds from MS

Qu: What I/O speed does MS recommend for your SharePoint 2010 SQL database?
Ans:  The measure is I/O operations per second (IOPS).  The faster SQL can handle requests, the faster the return time and the fewer queued requests, so it is pretty important and a fairly common bottleneck.  This is often a reason why people choose not to virtualise SQL Server: it is I/O intensive in SharePoint and really needs to be fast.  Tip: Ensure VMs are thick provisioned for SQL Server.

To determine your IOPS, use the SQLIO Disk Subsystem Benchmark Tool (http://go.microsoft.com/fwlink/?LinkID=105586).

I guess the answer is as fast as possible, but you can determine your IOPS requirement using the tool and your usage.  I put the TempDB ldf files on the fastest disk, followed by the ldf files for the content dbs on spinning disks.

Update 09/06/2011
Qu: Should I use separate disks for mdf (data files) & ldf (transaction logs)?
Ans:  On small SQL Server farms, ensure that the transaction logs are stored on a different physical drive to the content databases, as this will reduce contention and increase performance significantly.  Larger SQL instances on SANs have multiple disks, so there is no need to separate the files as this is already done by the number of disk readers.  You can also check the performance of a drive by watching the "disk seconds per read/write" counters, which should be less than 20ms.  If the disk seconds per read/write is approaching 20ms, consider improving the disk speed or increasing the number of read points.
Update 22/08/2012 - Bigger architectures may use SSD/Flash memory as opposed to disks.  The IOPS are hugely improved as there is no disk seek time.  http://technet.microsoft.com/en-us/library/cc298801.aspx#Section1_5a

Qu: What are the default SQL Server database growth settings?
Ans: SQL Server 2008 will grow data files in 1MB increments and transaction logs in 10% increments.  I would start with an initial content database size of 100MB (adjust according to your anticipated demand) and set autogrowth to 50MB (adjust according to your system).  This general principle has three benefits.  Growth of the databases is infrequent, so the associated performance hit is reduced.  Unused space is kept in check, because percentage growth on the transaction log produces huge increments that are generally never used after the initial growth.  And less fragmented databases result in faster performance.
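To make the difference concrete, the following Python sketch counts autogrowth events for a file growing from 100MB to 1,100MB. The helper names and the 1,100MB target are arbitrary values chosen for illustration; only the 1MB, 50MB and 10% increments come from the answer above.

```python
import math

def fixed_growth_events(start_mb, target_mb, increment_mb):
    """Autogrowth events needed to grow from start_mb to at least target_mb."""
    if target_mb <= start_mb:
        return 0
    return math.ceil((target_mb - start_mb) / increment_mb)

def percent_growth_steps(start_mb, target_mb, percent):
    """Size in MB of each autogrowth step under percentage growth."""
    steps, size = [], start_mb
    while size < target_mb:
        step = size * percent / 100
        steps.append(step)
        size += step
    return steps

print(fixed_growth_events(100, 1100, 1))    # default 1MB data file growth
print(fixed_growth_events(100, 1100, 50))   # suggested 50MB growth
steps = percent_growth_steps(100, 1100, 10)
print(len(steps), steps[0], round(steps[-1]))  # step size keeps increasing
```

The fixed 1MB increment triggers a growth event for every megabyte added, while the percentage setting triggers fewer events but each one is larger than the last, which is where the oversized, largely unused log allocations come from.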

More Info:
Summary of limits and thresholds
http://blah.winsmarts.com/2010-5-How_big_can_my_SharePoint_2010_installation_be.aspx
SQL Checklist for SharePoint 2013

Tuesday 15 June 2010

Moving a site collection to a new database

Problem: The IT department created my development machine using a base image with a 20GB C:\ drive. The company insists I use SQL Server Express for development. I installed SQL Express onto the C:\ drive. The databases are all stored on the same C:\ drive along with the Windows footprint of 13GB. Very quickly I ran out of space.
Initial Hypothesis: When creating content databases thru Central Administration (CA), SharePoint will use the SQL default file location for *.mdf and *.ldf files. Therefore I need to change the location where the data and log files will be setup thru the UI.
I have a large D:\ drive so I should move the data and log files to the D:\ drive. I need to:
  • Change the default location of the files to the D:\ drive for data\log files;
  • Create a new content database to host the existing site collections; and
  • Move the existing data (site collections) to the newly created content database.
Resolution:
The base image is causing issues. This, coupled with me putting the default location on my C:\ drive and the inability of this environment to resize virtual machine drives, meant I had to use the resolution below.

1.> Change the default location using T-SQL (I'm sure there is a better solution using Power Shell for Windows using SMO);
SQL Server Management Studio


2.> Open Power Shell (PS) for SharePoint



PS> Move-SPSite -Identity http://mysharepointsite.com.au/sites/user -destinationdatabase WSS-Content-NewUserDB


This should work; I will solve the SQL permissions issue in my next post.

More Info:
SQL default location info

Moving site collections using PowerShell or move the entire Content DB