Cloud Confidentiality

Today, no CSP can guarantee that your data will be secured “For Your Eyes Only.” Encryption algorithms and compliance policies can only achieve so much.

From the CSP perspective, we have to take reasonable measures to ensure the provider does not use customer data in any way the customer did not intend. As a way to mitigate exposure of customer data, some CSPs encrypt data at rest using encrypted hard drives or encrypted file systems. The other part of the equation for risk mitigation is proper device destruction, either logically, using an appropriate sanitization method such as DoD 5220.22-M, or physically, following guidance such as the DSS Clearing and Sanitization Matrix and NIST Special Publication 800-88: Guidelines for Media Sanitization.

And then, there are the backups. CSPs keep several copies of data in both onsite and offsite facilities to prevent total failure, and more than likely, the data stored on tape or hard drives is encrypted.

Once you have encrypted drives and encrypted backups, you have to deal with a little problem called key management. This turns out to be a problem because some CSPs use only a handful (if that many) of encryption keys, and those keys are known to a select few of the system administrators in charge of ops and compliance. That raises the question: If a system administrator leaves (of their own free will or not), are the keys changed? More than likely, yes. But what about the backups that were encrypted with the old keys? How long is the time window during which the now departed administrator can access customer data? I usually hear crickets when I ask this question of CSPs, even those with plenty of acronyms and certifications behind their name.
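The size of that window can be estimated with simple date arithmetic. The sketch below assumes a hypothetical policy: keys are rotated some time after the administrator leaves, and backups written under the old key are kept for a fixed retention period without being re-encrypted. The function name and the figures are illustrative and not taken from any CSP.

```python
from datetime import date, timedelta


def exposure_window(departure: date, key_rotated: date,
                    backup_retention: timedelta) -> timedelta:
    """Return how long the departed administrator's keys stay useful.

    Assumption: backups written before the rotation remain encrypted under
    the old key until they expire, so the last of them expires at
    key_rotated + backup_retention.
    """
    if key_rotated < departure:
        raise ValueError("key rotation predates the departure")
    last_old_backup_expires = key_rotated + backup_retention
    return last_old_backup_expires - departure


# Hypothetical figures: rotation two days after departure, 90-day retention.
window = exposure_window(
    departure=date(2024, 1, 10),
    key_rotated=date(2024, 1, 12),
    backup_retention=timedelta(days=90),
)
print(f"Old keys remain useful for {window.days} days")
```

Under these assumptions the live keys are exposed for two days, but the backups stretch the window to the full retention period on top of that.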

From the customer perspective, I don't want any of my data readable by anyone except me. If my CSP encrypts my data when it's at rest, I want to make sure no one can access it in other ways, such as from a backup my CSP performs regularly. What this means to me is that before data leaves my control and goes to my CSP, all of it must be encrypted using standard encryption algorithms and key(s) that I manage.
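A minimal sketch of encrypting on the customer side, assuming the third-party `cryptography` package (`pip install cryptography`) and AES-256-GCM as the standard algorithm. `upload_to_csp` is a hypothetical placeholder for whatever upload call your provider offers, and the object name is made up for illustration; the key is generated and kept by the customer.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # bytes; the nonce length commonly used with GCM


def encrypt_for_upload(key: bytes, plaintext: bytes, object_name: str) -> bytes:
    """Encrypt locally and return nonce + ciphertext (auth tag included)."""
    nonce = os.urandom(NONCE_SIZE)  # must never repeat under the same key
    # The object name is bound in as associated data, so a ciphertext cannot
    # be silently swapped between objects.
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, object_name.encode())
    return nonce + ciphertext


def decrypt_after_download(key: bytes, blob: bytes, object_name: str) -> bytes:
    """Reverse encrypt_for_upload; raises if the blob or name was altered."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, object_name.encode())


def upload_to_csp(object_name: str, blob: bytes) -> int:
    """Hypothetical stand-in for a real upload; the CSP only sees ciphertext."""
    return len(blob)


key = AESGCM.generate_key(bit_length=256)  # stays with the customer
blob = encrypt_for_upload(key, b"quarterly numbers", "reports/q1.csv")
upload_to_csp("reports/q1.csv", blob)
restored = decrypt_after_download(key, blob, "reports/q1.csv")
```

With this arrangement, any backup the CSP takes contains only ciphertext. The key management problem does not disappear, though. It moves to the customer, who now has to store, back up, and rotate `key`.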

For example, think of S3 as an HDFS-like entity made up of NameNodes and DataNodes.

Currently, Amazon offers Server-Side Encryption (SSE) for S3. When data is put into S3, a process splits your data into blocks (X MB chunks), and each block is encrypted with a key chosen by that process. Every file has a separate key, and after T amount of time, the file is re-encrypted with a new key.
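To make that description concrete, here is a rough sketch of block-wise encryption with a per-file key and periodic re-encryption. It is not Amazon's implementation; it only mirrors the steps described above. It assumes the third-party `cryptography` package, and `BLOCK_SIZE`, `MAX_KEY_AGE`, and all function names are hypothetical stand-ins for X, T, and the unnamed process.

```python
import os
from datetime import datetime, timedelta

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

BLOCK_SIZE = 8  # bytes; tiny stand-in for "X MB" so the demo stays readable
MAX_KEY_AGE = timedelta(days=30)  # stand-in for "T"


def encrypt_file(data: bytes, key: bytes) -> list:
    """Split data into blocks and encrypt each one under the file's key."""
    aead = AESGCM(key)
    blocks = []
    for index, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        nonce = os.urandom(12)
        # The block index is authenticated so blocks cannot be reordered.
        sealed = aead.encrypt(nonce, data[start:start + BLOCK_SIZE],
                              index.to_bytes(8, "big"))
        blocks.append((nonce, sealed))
    return blocks


def decrypt_file(blocks: list, key: bytes) -> bytes:
    """Decrypt every block in order and join the plaintext back together."""
    aead = AESGCM(key)
    return b"".join(
        aead.decrypt(nonce, sealed, index.to_bytes(8, "big"))
        for index, (nonce, sealed) in enumerate(blocks)
    )


def rotate_if_due(blocks: list, key: bytes, key_created: datetime, now: datetime):
    """Re-encrypt under a fresh key once the current key is older than T."""
    if now - key_created < MAX_KEY_AGE:
        return blocks, key, key_created
    new_key = AESGCM.generate_key(bit_length=256)
    return encrypt_file(decrypt_file(blocks, key), new_key), new_key, now


created = datetime(2024, 1, 1)
file_key = AESGCM.generate_key(bit_length=256)  # one key per file
stored = encrypt_file(b"customer record 0001", file_key)
# Sixty days later the key is past its maximum age, so the file is re-encrypted.
stored, file_key, created = rotate_if_due(stored, file_key, created,
                                          datetime(2024, 3, 1))
```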

Except that we don’t know where the keys are stored. Are there backups of the keys for disaster recovery and high availability purposes (think of HDFS NameNodes)? We are back in the same conundrum where we started with encrypted backups.

What Can Be Done Today

Customers have two options, depending on their tolerance for risk:

Leave data unencrypted at rest and trust the CSP
Encrypt all data before it is sent to the CSP

In this post, I wanted to bring out the issues with confidential customer data in the cloud, as this is not a solved problem. Encryption is only meant to provide some assurance that no one other than key holders can view the data.

Big Data and Cloud Computing
