Infrastructure considerations must be addressed before launching your Sitecore architecture. This first post of a three-part series covers database sizing requirements for Sitecore analytics.

The Sitecore Customer Engagement Platform (CEP) provides rich capabilities that can transform the way you engage with your customers online. The technical foundation of that capability is Sitecore’s ability to capture and store visitor behavior. Planning for the storage of this invaluable data should be top of mind for any Sitecore architect. At first glance, the Analytics database of the Digital Marketing Suite (DMS) is a very simple data model.

While the data model is simple, the sizing characteristics of the database are not as clear-cut. Because size is driven primarily by the features you have deployed on your site and by your visitors’ behavior, some level of modeling is required to estimate capacity needs. To tackle the sizing of the database, I have split the topic into a series of posts. This first post focuses on some basic lookup tables and the data required to track visits, visitors and page views.

Technical note: the information in this post is based on Sitecore 6.5 revision 111230.

Lookup Data

The following DMS database tables track data sets that, in most instances, will have little influence on the size of the database. Modeling these tables is likely not worth the effort, but they are included so you can judge for yourself.

Table Name | Purpose
Browsers | Contains a list of distinct web browsers detected in use by website visitors. The browser name, major version and minor version are tracked.
ItemUrls | Contains the GUID and full path of each item. The table will only be of concern for installations with a large number of content items.
OS | Contains a list of distinct operating systems in use by website visitors.
ReferringSites | Contains a list of referring sites. If a large amount of traffic comes in from external sources, this table can reach several hundred MB.
Screens | Contains a list of distinct screen resolutions in use by website visitors.
TrafficType | Contains a pre-populated list of traffic types.
UserAgents | Contains a list of user agents. Because of the relationship with visitor classifications and visitors, some index space may need to be considered.

Visits, Visitors and Page Views

In the simplest deployment, the DMS capabilities will track sessions (visits), visitors and the pages viewed.  For high volume sites the amount of data stored can be quite substantial.  Our testing indicates that the following data footprint is required:

  • 0.0888 KB / visitor
  • 1.45 KB / visit
  • 0.5909 KB / page view

Sitecore recommends that the Analytics database be provisioned with 3-6 months of space pre-allocated to keep fragmentation of the database to a minimum. For a moderately busy site with 100,000 visitors / month, one visit per visitor and an average of 10 page views / session (1,000,000 page views in total), the data required for a single month would be: (100,000 x 0.0888) + (100,000 x 1.45) + (1,000,000 x 0.5909) = 744,780 KB or 727 MB. Giving room for growth over 3-6 months, you would provision the initial database with roughly 2.2 GB to 4.3 GB.
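
The arithmetic above is easy to rerun for your own traffic numbers. The following is a minimal Python sketch, not a Sitecore tool: the function names and the visits_per_visitor parameter are hypothetical and introduced here only for illustration, and the per-record footprints are simply the test figures quoted above. It assumes 1 MB = 1,024 KB and 1 GB = 1,024 MB.

```python
# Per-record footprints in KB, copied from the test figures in this post.
KB_PER_VISITOR = 0.0888
KB_PER_VISIT = 1.45
KB_PER_PAGE_VIEW = 0.5909


def monthly_kb(visitors, visits_per_visitor, page_views_per_visit):
    """Estimate one month of Analytics data in KB.

    visits_per_visitor is an illustrative parameter; the worked example
    in this post effectively assumes one visit per visitor.
    """
    visits = visitors * visits_per_visitor
    page_views = visits * page_views_per_visit
    return (visitors * KB_PER_VISITOR
            + visits * KB_PER_VISIT
            + page_views * KB_PER_PAGE_VIEW)


def provisioned_gb(kb_per_month, months):
    """Convert a monthly KB estimate into GB for a number of months."""
    return kb_per_month * months / (1024 * 1024)


if __name__ == "__main__":
    kb = monthly_kb(visitors=100_000, visits_per_visitor=1,
                    page_views_per_visit=10)
    print(f"Monthly: {kb:,.0f} KB ({kb / 1024:,.0f} MB)")
    for months in (3, 6):
        print(f"{months} months: {provisioned_gb(kb, months):.2f} GB")
```

With the example inputs this reproduces the 744,780 KB monthly figure. The 3 and 6 month results come out slightly below the rounded 2.2 GB and 4.3 GB quoted above because the sketch divides by 1,024 at each step; rounding up when pre-allocating is the safer choice.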

Conclusion

For the initial launch of the CEP capabilities, modeling Visits, Visitors and Page Views should provide enough information to ensure a successful launch. For high-volume sites, for deployments of the more robust CEP capabilities, and for long-term planning, you will also need to consider:

  1. How other features of CEP, such as goals and page events, impact size
  2. Additional considerations for dedicated reporting, availability, backup and disaster recovery
  3. Growth over time and archival
