SQL Server Column Level Encryption and Query Performance

— by Lori Brown  @SQLSupahStah

In my last blog post I showed how to implement TDE encryption on a database and the performance impact of doing so. https://sqlrx.wordpress.com/2015/06/25/sql-server-tde-encryption-and-query-performance/ Today’s post will show how to implement encryption at the column level and the performance impact as well as a way to help make queries using encrypted columns faster.

When should data be encrypted?

Data is usually encrypted when a database holds sensitive information. The common example of sensitive data is credit card numbers (CCN) or in the US a Social Security Number (SSN). However, with identity theft on the rise many companies are finding it necessary to protect all of their customer’s personally identifiable information (PII) such as name, address, phone number, etc.   Basically any information held in a database that can be used to identify or find an individual is a good candidate for encryption.

How to encrypt columns of a table

To demonstrate, I created a database for my ABCompany then grabbed some data out of an AdventureWorks database to create a fictional Employee table that contains EmpID, FirstName, LastName, Address, City, StateProv, PostalCode and SSN. Notice all the PII info that we will need to encrypt. And finally I created a couple of indexes as well. The table holds 8,688 records so it is not big but should work for this demo.

USE [ABCompany]

GO

CREATE TABLE [dbo].[Employees](

[EmpID] [int] IDENTITY(1,1) NOT NULL,

[FirstName] [varchar](60) NULL,

[LastName] [varchar](60) NULL,

[Address] [varchar](60) NULL,

[City] [varchar](30) NULL,

[StateProv] [varchar](3) NULL,

[PostalCode] [varchar](15) NULL,

[SSN] [varchar](12) NULL

) ON [PRIMARY]

GO

CREATE CLUSTERED INDEX [CIX_Employees] ON [dbo].[Employees]

([EmpID]) ON [PRIMARY]

GO

CREATE NONCLUSTERED INDEX [IX_Employees_Lname_Fname] ON [dbo].[Employees]

([LastName], [FirstName]

) ON [PRIMARY]

GO

Demo query

For this demo I am going to use a simple query and will set STATISTICS IO ON and STATISTICS TIME ON as well as have the execution plan displayed after each run of the query.

SELECT EmpID, FirstName, LastName, StateProv, SSN

FROM Employees

WHERE LastName = ‘Bell’

Here are the results for this query with no encryption on columns:

SQL Server parse and compile time:

CPU time = 10 ms, elapsed time = 10 ms.

(29 row(s) affected)

Table ‘Employees’. Scan count 1, logical reads 70, physical reads 0

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 83 ms.

Blog_20150709_1

Encryption

Now let’s begin the process to encrypt PII columns. We have to create a database master key, a certificate and a symmetric key with passwords. All have to be opened and the database master key and certificate should be backed up.

— Create Database Master Key

USE ABCompany

GO

CREATE MASTER KEY ENCRYPTION BY PASSWORD = ‘Pa$$w0rd’

GO

— Have to open the master key before you can do anything else.

OPEN MASTER KEY DECRYPTION BY PASSWORD = ‘Pa$$w0rd’

GO

— Backup DMK

BACKUP MASTER KEY

TO FILE = ‘C:\KeyBkup\ExportedMK.dmk’

ENCRYPTION BY PASSWORD = ‘Sq1&x!’;

GO

— Create cert for column encryption

CREATE CERTIFICATE ABCoCert

WITH SUBJECT = ‘Cert for Column Level Encryption’;

GO

— Backup cert

BACKUP CERTIFICATE ABCoCert

TO FILE = ‘C:\KeyBkup\Certbkup.cer’

WITH PRIVATE KEY

(FILE = ‘C:\KeyBkup\Certbkup.pvk’,

ENCRYPTION BY PASSWORD = ‘St40ngPw6!’);

GO

— Create symmetric key

CREATE SYMMETRIC KEY ABCoSymKey

WITH ALGORITHM = AES_256

ENCRYPTION BY CERTIFICATE ABCoCert

GO

— Grant control permission to a user or a role so the key and certificate can be used

GRANT CONTROL ON CERTIFICATE::[ABCoCert] TO [Domain\User]

GRANT VIEW DEFINITION ON SYMMETRIC KEY::[ABCoSymKey] TO [Domain\User]

GO

— Open the symmetric key

USE ABCompany

GO

OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION

BY CERTIFICATE ABCoCert

GO

This sets the stage to encrypt things. Next we have to add columns that will hold encrypted data. The new columns must be the VARBINARY data type. You can do this either in SSMS or through T-SQL. I actually use a hybrid approach. I will generally add the new columns which by default will be created at the end of the list of columns already present. Once data from the target columns are encrypted into the new ones I will then drop the original unencrypted column and then rename the new encrypted column to the original column’s name. However this will make it so that the new columns will not be in the same ordinal position which if application or stored procedure code uses anything like “SELECT * FROM Employees” or like “…ORDER BY 3,2,5” then code is likely to break because the columns will not be in the expected order any more. Some people may want to put their columns back in the original order which is more complicated.

— Add new columns for encryption

USE ABCompany

GO

ALTER TABLE dbo.Employees ADD FName varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD LName varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD Add1 varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD cty varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD st varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD zip varbinary(128) NULL

GO

ALTER TABLE dbo.Employees ADD ss varbinary(128) NULL

GO

After the new columns are added, I will generally use SSMS to reorder the columns. Just right click on the table then go to Design and move the columns into the order that you want them. Yes this does simply drop and recreate the table but it does it for you including dropping and recreating column defaults, indexes, foreign keys and anything else on the table. If you don’t want to have SSMS do it all then you can make it generate the code to do all the work at a later time. I also tend to change the column order before adding encrypted data to the new columns so that it is having to work on as little data as possible. You will have to change the column order when users are not in the database so plan an outage.

Blog_20150709_2

Blog_20150709_3

Next we encrypt all PII data by pushing data from the unencrypted columns and using ENCRYPTBYKEY to encrypt and update the new varbinary columns. I found that this step is what can take quite a while. The last time I did this exercise I was working on a 10 million record table and encrypted about the same number of columns as in this example. It took over 6 hours to complete. So, plan on lots of time to do this if you have a large table.

— Encrypt PII columns

UPDATE Employees

SET FName = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, FirstName) )

GO

UPDATE Employees

SET LName = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, LastName) )

GO

UPDATE Employees

SET Add1 = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, Address) )

GO

UPDATE Employees

SET cty = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, City) )

GO

UPDATE Employees

SET st = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, StateProv) )

GO

UPDATE Employees

SET zip = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, PostalCode) )

GO

UPDATE Employees

SET ss = ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, SSN) )

GO

Now it’s time to drop the old unencrypted columns and rename the new encrypted columns to the original name. Be aware that if you drop a column that is part of an existing index, you will have to either remove the column from the index or you will have to drop the index first before you can drop and rename the column. In my case I have to drop my non-clustered index then I can rename things. I really am not going to need those columns to be part of an index any longer since indexing an encrypted column will not help query performance.

DROP INDEX [IX_Employees_Lname_Fname] ON [dbo].[Employees]

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [FirstName]

GO

EXECUTE sp_rename N’dbo.Employees.FName’, N’FirstName’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [LastName]

GO

EXECUTE sp_rename N’dbo.Employees.LName’, N’LastName’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [Address]

GO

EXECUTE sp_rename N’dbo.Employees.Add1′, N’Address’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [City]

GO

EXECUTE sp_rename N’dbo.Employees.cty’, N’City’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [StateProv]

GO

EXECUTE sp_rename N’dbo.Employees.st’, N’StateProv’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [PostalCode]

GO

EXECUTE sp_rename N’dbo.Employees.zip’, N’PostalCode’, ‘COLUMN’

GO

ALTER TABLE [dbo].[Employees] DROP COLUMN [SSN]

GO

EXECUTE sp_rename N’dbo.Employees.ss’, N’SSN’, ‘COLUMN’

GO

Demo Query – with encryption

The query being used to test performance now has to change to be able to display encrypted data. We have to use the DECRYPTBYKEY function to decrypt data especially so that we can search for values. If we don’t decrypt then searching for LastName = ‘Bell’ will return 0 records. The query now looks like this:

OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION

BY CERTIFICATE ABCoCert

GO

SELECT EmpID,

CONVERT(VARCHAR(60),DECRYPTBYKEY(FirstName)),

CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)),

CONVERT(VARCHAR(3),DECRYPTBYKEY(StateProv)),

CONVERT(VARCHAR(12),DECRYPTBYKEY(SSN))

FROM Employees

WHERE CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)) = ‘Bell’

Don’t forget … Before running any query that encrypts or decrypts you have to open the symmetric key. I did this before running the above query. Here are the performance results of the query with encrypted columns and no indexes.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 3 ms.

(29 row(s) affected)

Table ‘Employees’. Scan count 1, logical reads 872, physical reads 0

SQL Server Execution Times:

CPU time = 32 ms, elapsed time = 107 ms.

Blog_20150709_4

The logical reads increased by 800 reads and the Execution time for both CPU and elapsed went up. This could be more efficient if it had a better index to use. The clustered index was used to scan the table. However searches are commonly performed on last name and\or first name and those columns are now encrypted and not good candidates for indexes. In my case in working with the 10 million record table, the query response went from 200 ms to 3 minutes to get my query to return results. This was unacceptable!! What to do????……

Partial plain-text value solution (PPV)

In researching this problem I came across the partial plain-text value solution that worked really well. Partial plain-text means to take part of the value in the encrypted column(s), obfuscating the value by hashing it in some way and placing the obfuscated value into a searchable unencrypted column that can be indexed. Searches would be done using the searchable column and will make performance better.

Add a column to hold the searchable value.

— This column will be used for name searches

ALTER TABLE dbo.Employees ADD NSrch varchar(50) NULL

GO

Add a stored procedure or function that is used when a new value is put into the column. In my case I made a random string that can be used to make the partial values virtually unrecognizable.

CREATE PROCEDURE dbo.StrRand

@RandStr VARCHAR(44) OUTPUT

AS

DECLARE @Len INT

DECLARE @PoolLength INT

DECLARE @LoopCount INT

DECLARE @CharPool VARCHAR(200)

SET @Len = RAND() * 20 + 15 — Set length here – min length 15, max length 35

— define allowable characters – easy to read this way and easy to omit easily confused chars

— can be a string of any characters

SET @CharPool = ‘abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ’ — Alpha characters only

SET @PoolLength = LEN(@CharPool)

SET @LoopCount = 0

SET @RandStr = ”

— loop through grabbing random chars and adding to the random string until it reaches max length

WHILE (@LoopCount < @Len) BEGIN

SET @RandStr = @RandStr + SUBSTRING(@Charpool, CONVERT(INT,RAND()*@PoolLength), 1)

SET @LoopCount = @LoopCount + 1

END

GO

Now do a mass update using the code below of the NSrch column to create a column that can be used in searches.   In this case I have decided to concatenate the first 3 characters of the last name + the random string + the last 3 characters of the first name as my searchable value. A record that holds info for the employee Julia Bell would have a searchable value that would not be encrypted that looks like this – BeldewAqpqbxBRfjsjdSBASJul. I know…I know…This is done using a dreaded cursor but it works for a mass update. I ran this a few times to get all the records in the table set up with a searchable column.

OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION

BY CERTIFICATE ABCoCert

GO

CREATE TABLE #namework (empid INT, namesearchable VARCHAR(50))

DECLARE @col1 INT

DECLARE @col2 VARCHAR(60), @col3 VARCHAR(60), @strrand VARCHAR(50)

DECLARE srchcur CURSOR FOR

SELECT TOP 1000 EmpID, — set TOP x number for the number of records to work on

LEFT(CONVERT(VARCHAR(60),DECRYPTBYKEY(FirstName)),3),

LEFT(CONVERT(VARCHAR(60),DECRYPTBYKEY(LastName)),3)

FROM Employees

WHERE NSrch IS NULL

OPEN srchcur

WHILE (1=1)

BEGIN

EXECUTE StrRand @RandStr = @strrand OUT

FETCH NEXT FROM srchcur INTO @col1, @col2, @col3

IF @@FETCH_STATUS < 0 BREAK

IF LEN(@col2) < 3  –check to see if FirstName is less than 3 characters, if so then add a space

SET @col2 = @col2+SPACE(1)

IF LEN(@col3) < 3  –check to see if LastName is less than 3 characters, if so then add a space

SET @col3 = @col3+SPACE(1)

SET @strrand = @col3+@strrand+@col2

INSERT INTO #namework VALUES (@col1, @strrand)

END

CLOSE srchcur

DEALLOCATE srchcur

UPDATE t1

SET t1.NSrch = t2.namesearchable

FROM Employees t1

JOIN #namework t2

ON t1.EmpID = t2.empid

DROP TABLE #namework

GO

And finally, add an index to the NSrch column.

— add index to NSrch column to make searches on column faster

CREATE NONCLUSTERED INDEX IX_Employees_NSrch ON [Employees](NSrch)

GO

Here is what the NSrch column looks like with the encrypted first and last name columns…

Blog_20150709_5

Demo Query – with encryption and searchable column

The query that was originally looking for employees with the last name = Bell has to be changed a bit to use the searchable column in the WHERE clause.

OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION

BY CERTIFICATE ABCoCert

GO

SELECT EmpID,

CONVERT(VARCHAR(100),DECRYPTBYKEY(FirstName)) AS FirstName,

CONVERT(VARCHAR(100),DECRYPTBYKEY(LastName)) AS LastName,

CONVERT(VARCHAR(100),DECRYPTBYKEY(StateProv)) AS StateProv,

CONVERT(VARCHAR(100),DECRYPTBYKEY(SSN)) AS SSN

FROM Employees

WHERE NSrch like ‘Bel%’

Here are the performance results of the query with encrypted columns and using the index on the NSrch column.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

(29 row(s) affected)

Table ‘Employees’. Scan count 1, logical reads 100, physical reads 0

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 95 ms.

Blog_20150709_6

Performance is much better! It is not as good as when there was no encryption but that is to be expected. The index on the NSrch column was used and less work was performed which is important. I put together a table of all the query runs so it is easier to see. On my 10 million record table that was encrypted using a searchable column in the query made response time go from 3 minutes back to a little over 320 ms. On larger tables using an unencrypted searchable column in query WHERE clauses will be very important.

Blog_20150709_7

Changes to code because of encryption

Your application or database code would have to be changed to decrypt encrypted data and encrypt new data anywhere that encrypted columns are used in code. Additionally the symmetric key must be opened and as a best practice should be closed when sessions are finished querying data.

— Open symmetric key

OPEN SYMMETRIC KEY ABCoSymKey DECRYPTION BY CERTIFICATE ABCoCert

— Decrypt

CONVERT(VARCHAR(50),DECRYPTBYKEY(<<encrypted column>>))

— Encrypt

ENCRYPTBYKEY(KEY_GUID(‘ABCoSymKey’),CONVERT(varbinary, <<encrypted column>>))

— Close symmetric key

CLOSE SYMMETRIC KEY ABCoSymKey

Clean up

As a last thing I always like to give the statements that remove keys should you ever want to.

— Remove encryption keys

USE ABCompany

GO

CLOSE SYMMETRIC KEY ABCoSymKey

GO

DROP SYMMETRIC KEY ABCoSymKey

GO

DROP CERTIFICATE ABCoCert

GO

DROP MASTER KEY

GO

Original PPV reference – https://wateroxconsulting.com/archives/optimizing-protected-indexes/

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRxSupport@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. We love to talk tech with anyone in our SQL family!

Both comments and trackbacks are currently closed.
%d bloggers like this: