
Thursday 22 August 2013

SharePoint BLOB storage for Dummies

Overview: SharePoint 2007 used External BLOB Storage (EBS). SP2010 introduced Remote BLOB Storage (RBS) and stayed backward compatible with EBS; SP2013 does not support EBS. A common problem with RBS on SharePoint is orphaned blobs: blobs that have been superseded by a newer version or deleted are kept in the blob store in case you need to do a restore, so they accumulate and you need to clean them up periodically. RBS is enabled at content database level, whereas EBS is per farm. 3rd party RBS providers only need to be installed and configured on the WFEs, not the app servers.
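To make the clean-up idea concrete, here is a minimal Python sketch. It is not how any RBS provider actually does garbage collection; the blob IDs, the `find_orphans` helper and the 30 day retention window are all assumptions for illustration. A blob is only treated as removable if no stub references it and it is older than the retention window, so recently deleted content can still be restored.

```python
from datetime import datetime, timedelta

def find_orphans(store_blobs, referenced_ids, now, retention_days=30):
    """Return blob IDs that no stub references and that are past retention.

    store_blobs:    dict of blob_id -> datetime the blob was last referenced
    referenced_ids: set of blob IDs still pointed to by stubs in the content DB
    (Both structures are hypothetical stand-ins, not a real RBS API.)
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(
        blob_id
        for blob_id, last_referenced in store_blobs.items()
        if blob_id not in referenced_ids and last_referenced < cutoff
    )

now = datetime(2013, 8, 22)
store = {
    "blob-a": datetime(2013, 8, 1),   # still referenced by a stub
    "blob-b": datetime(2013, 5, 1),   # unreferenced and old: an orphan
    "blob-c": datetime(2013, 8, 20),  # unreferenced but inside retention
}
orphans = find_orphans(store, {"blob-a"}, now)
print(orphans)  # ['blob-b']
```

In a real farm you would rely on the provider's own maintenance or retention jobs rather than anything hand-rolled like this.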

FileStream (local) RBS: Microsoft SQL Server provides FILESTREAM technology, which effectively allows SQL to move blobs above a size threshold out to separate storage (so they are no longer stored in the SQL database). You need to manage backup and recovery yourself, and this tends to lead to unused and orphaned blobs. It also adds overhead on SQL's resources. The diagram below is how I see FILESTREAM working: the entire SP record is pushed to SQL, SQL adds the stub to the content database and the blob is moved onto a network file share.
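The size-based decision can be sketched in a few lines of Python. This is only an illustration of the idea; the 1 MB threshold and the `choose_storage` name are my assumptions, not SQL Server or SharePoint settings.

```python
MIN_BLOB_SIZE = 1024 * 1024  # assumed 1 MB threshold, purely illustrative

def choose_storage(size_bytes, threshold=MIN_BLOB_SIZE):
    """Decide where a blob lives: small ones stay in the DB, large ones move out."""
    return "externalised" if size_bytes >= threshold else "inline"

small_doc = choose_storage(200 * 1024)       # 200 KB document
large_doc = choose_storage(50 * 1024 * 1024) # 50 MB video
print(small_doc, large_doc)  # inline externalised
```

Small files stay in the database because the overhead of externalising them outweighs the saving.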



Why Remote RBS?
Microsoft does not have any remote/externalised blob providers. Without one, all data is stored in your SQL relational databases, which are not efficient at storing large BLOBs.

With large BLOBs in your content databases:
  • the size becomes harder to manage (think backup and recovery times);
  • database I/O is much more intensive when storing BLOBs; and
  • databases use expensive storage.
Enter the 3rd party RBS providers: DocAve (AvePoint) has a nice BLOB provider. It can work in real time or retrospectively, backups are cleanly handled and the retention policy is great at cleaning up the old blobs. This diagram shows how the AvePoint provider stores the blob: instead of letting SQL handle the entire record storage, the provider, working at the SharePoint level, first adds the blob to external storage and the location/stub info is sent back to the SP server. SP then sends the record, including the stub location, to SQL. SQL does a lot less work.
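The order of operations described above can be sketched in Python. This is a toy model under my own assumptions, not AvePoint's implementation or any real provider API: `ExternalBlobStore`, `ContentDatabase` and `save_document` are hypothetical names, and both stores are just in-memory dictionaries.

```python
import uuid

class ExternalBlobStore:
    """Hypothetical stand-in for the external storage a provider writes to."""
    def __init__(self):
        self.blobs = {}

    def put(self, data):
        blob_id = str(uuid.uuid4())
        self.blobs[blob_id] = data
        return blob_id  # plays the role of the stub/location info

class ContentDatabase:
    """Hypothetical stand-in for the SQL content DB: metadata and stubs only."""
    def __init__(self):
        self.rows = {}

    def insert(self, name, stub, size_bytes):
        self.rows[name] = {"stub": stub, "size_bytes": size_bytes}

def save_document(name, data, store, db):
    stub = store.put(data)            # 1. blob goes to external storage first
    db.insert(name, stub, len(data))  # 2. record plus stub location goes to SQL
    return stub

store, db = ExternalBlobStore(), ContentDatabase()
stub = save_document("report.docx", b"x" * 5000, store, db)
```

The point to notice is that the database row never holds the 5000 bytes of content, only the stub and the metadata (including the real file size).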

Other vendors include: Metalogix, Stealth Software, StorageEdge (from Alachisoft) and SimpleStor.

Note: RBS does not change the content database size restrictions; externalised BLOBs still count towards the content DB size calculation. The 200GB/4TB limits still include the external BLOB size, because the record's metadata reports the real/external file size, not the 1 KB stub size.
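A quick worked example of that size calculation in Python. The 20 GB and 190 GB figures are made up for illustration, and `size_counted_towards_limit` is just a hypothetical helper.

```python
GB = 1024 ** 3

def size_counted_towards_limit(in_db_bytes, externalised_blob_bytes):
    """Externalised blobs count at their real size, not their stub size."""
    return in_db_bytes + externalised_blob_bytes

in_db = 20 * GB      # what physically sits in the SQL data file (assumed)
external = 190 * GB  # blobs moved out to the blob store (assumed)

counted = size_counted_towards_limit(in_db, external)
over_200gb = counted > 200 * GB
print(counted // GB, over_200gb)  # 210 True
```

So a content DB whose SQL file is only 20 GB can still be over the 200GB guidance once the externalised blobs are counted.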

The 200GB/4TB content DB limits were chosen because of database density, not database size, i.e. the average number of objects (docs, items, lists etc.) at 200GB/4TB approaches the point where indexes and query performance suffer. Externalising blobs does not reduce the density, as each stub is still an indexed object in the DB. However, a farm that uses blobs heavily probably has fewer objects on average than another farm of the same size whose content DBs hold smaller objects. My head hurts just writing this.
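Some rough arithmetic in Python may help with the density argument. The average object sizes here (100 KB and 10 MB) are assumptions I picked for illustration, not measured figures.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def approx_object_count(db_bytes, avg_object_bytes):
    """Rough object count for a content DB with a given average object size."""
    return db_bytes // avg_object_bytes

small_objects = approx_object_count(200 * GB, 100 * KB)  # lots of small docs
large_objects = approx_object_count(200 * GB, 10 * MB)   # blob-heavy content
print(small_objects, large_objects)  # 2097152 20480
```

Both databases are the same size, but the first has roughly a hundred times as many objects to index and query, which is where the performance concern comes from.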

The advantages of RBS on your farm are that fragmentation is reduced, storage is cheaper and performance/stability under higher concurrency is improved. DR times are also greatly improved.

High Availability: Blobs are not HA; if the blob store/file share goes down you lose the blobs. DFS can be used to replicate the blobs. Note: your blob storage pointers in the SQL content databases will point to the DFS share. The image below shows you how to set up a DFS share. Tip: use domain-based DFS.


Windows Distributed File System (DFS) provides a logical share for adding files/blobs. The blobs/files are replicated to all physical disks. If one goes down you don't notice and service continues.


More Info:
http://www.metalogix.com/Libraries/Product_Collateral/Using_Windows_DFS_with_SharePoint_Remoted_BLOBs.pdf
http://technet.microsoft.com/en-us/library/cc771058.aspx