Friday 4 April 2014

Database Snapshots


A Snapshot is a read only copy of the source database as it existed when we created snapshot. 

The following list are some of the properties of database snapshots:
  • Snapshot database is completely dependent of the source database.
  • If a source database corrupts, the snapshot will also be corrupted.
  • We can even restore the entire source database from the snapshot to the time of snapshot creation with below syntax
RESTORE DATABASE db_name
FROM DATABASE_SNAPSHOT = 'snapshotdb_name'

We can run the below query to list all the database snapshots created in a particular database instance.
SELECT * FROM sys.databases 

  •  In the result of the above query, the value of 'source_db_id' column will be NULL for all the databases. Because these are original databases.
  • So, if the column 'source_db_id' has some value other than null, it indicates that that is a database snapshot. 
We can directly list the database snapshots with the below query.
SELECT * FROM sys.databases 
WHERE 
source_database_id IS NOT NULL 

 Concept of Database Snapshots

  • When a database snapshot is created, an empty 'sparse file' will be created with exactly the number of     pages that the source database consists.
  • So, essentially a snapshot starts up empty.
  • The database snapshot preserves each page of the source database when the time it (snapshot) was created.
  • If few pages are modified in the source database, then those pages ( just before writing the modifications to the disk ) will be copied to the snapshot ( into empty space of those particular pages in the previously created sparse file ).
You can visualize the before mentioned concepts in the below figure:


Not able to understand? Try seeing the below figure:


Got it..!
You can also visualize the above concept in the below animation.

  • Snapshot size on disk increases with the updates or modifications made on the source database.
  • Source database and the snapshot database will be accessing the same data files.
  • That is the reason why, we can run the snapshot on a different host ( source database should be present in this host up and running or available always ) so that the work load can be shared between the source and snapshot database as long as the read-only operations as concern.
  • Database snapshots not only give the point in time view of the source database, it also gives point in time transactionally consistent view of the source database.
i.e, if there are any active transactions at the time of snapshot creation, then they must be rolled back.
If a page is copied into the snapshot because of a transaction, then if this transaction rolls back, then the page that is added into the snapshot will not be removed ( One-way operation ). 
This page copying is done synchronously which affects the performance.  

  •  Source database and the snapshot database should be located in the same instance. 
Now when a query is submitted on the snapshot database and if the required pages are not present in it, then those pages are brought from source database to serve the query.
No locks taken on reads from the source database when copying the page to snapshots.

GROUPBY VS PARTITIONBY