There’s a change coming in public cloud computing. During the first few years of the public cloud being a viable option for IT architects, the main concern for businesses was security: “If I can’t touch the servers where my stuff is, I’m worried.” Now we’ve come to a place where most businesses see that, overall, the security of the major public clouds is better than what they can achieve on-premises. Data at rest is protected with encryption, with various levels of customer control, and data in transit is protected with TLS encryption, again with various levels of customer control. But there’s still one area where valid security concerns exist: controlling access to data while it’s being processed.
Azure Confidential Computing (ACC) aims to address this problem. ACC isn’t a single service or product; rather, it’s a collection of tools and services that businesses can use to protect the confidentiality of their data as it’s being processed in Azure. ACC is in public preview at the time of writing.
Terminology and Concepts
Intel’s Software Guard Extensions (SGX) is a feature on some newer Xeon processors that provides a protected part of the processor, called an enclave, where only specially authorized code is allowed to run. No other code, no matter its privilege level, has access to the enclave, and the memory it uses is encrypted and visible only to the application running inside it. These protected places where you can run secure code are also known as Trusted Execution Environments (TEEs).
No one except the application’s creators has access to the data in the enclave: not the administrators of the virtual machine (VM) where the code is running, nor Microsoft’s engineers (if you run the code in Azure VMs). There’s a new series of VMs that supports enclaves, the DC series, which incidentally is the first appearance of Hyper-V Generation 2 VMs in Azure. Generation 2 is needed to provide the security (SecureBoot with virtual UEFI) and the pass-through to the underlying hardware that enclaves require.
Attestation is another important part of ACC. Not only do you need to write your application to use the enclave for the sensitive processing, you also need to guarantee that the code is actually running in an enclave you trust. And if that enclave is updated with new code, you need to verify that it’s still trusted. The mechanisms here use quotes or reports (based on standard JSON Web Tokens [JWTs]), along with policies that you can set to the standard your business requires. Finally, you also need to ensure that the data you upload into the enclave and the data you get back is protected. Microsoft is doing a lot of work in this area to reduce friction and make enterprise scenarios more accessible.
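To make the attestation idea concrete, here is a minimal Python sketch of checking an enclave’s reported claims against a business policy. It is not the Azure attestation API. The claim names (enclave_measurement, debug_mode), the policy shape and every helper function are hypothetical, and the token is deliberately unsigned. A real verifier would validate the token’s signature against the hardware vendor’s certificate chain before trusting any claim.

```python
import base64
import hashlib
import json


def measure(enclave_code: bytes) -> str:
    """Return a SHA-256 digest standing in for the enclave measurement."""
    return hashlib.sha256(enclave_code).hexdigest()


def make_unsigned_token(claims: dict) -> str:
    """Build a JWT-shaped token (header.payload.signature) with an empty signature.

    Illustration only: a real attestation token is signed.
    """
    def b64(obj: dict) -> str:
        raw = json.dumps(obj, sort_keys=True).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{b64({'alg': 'none', 'typ': 'JWT'})}.{b64(claims)}."


def read_claims(token: str) -> dict:
    """Decode the payload segment of a JWT-shaped token WITHOUT verifying it."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def verify_claims(claims: dict, policy: dict) -> bool:
    """Accept the enclave only if every claim named in the policy matches."""
    return all(claims.get(name) == expected for name, expected in policy.items())


# The code the business has reviewed and agreed to trust (hypothetical).
trusted_code = b"def process(record): ..."
policy = {"enclave_measurement": measure(trusted_code), "debug_mode": False}

good = make_unsigned_token(
    {"enclave_measurement": measure(trusted_code), "debug_mode": False}
)
updated = make_unsigned_token(
    {"enclave_measurement": measure(trusted_code + b"# v2"), "debug_mode": False}
)

print(verify_claims(read_claims(good), policy))     # True
print(verify_claims(read_claims(updated), policy))  # False
```

The second check fails because changing the enclave code changes its measurement, which is why an updated enclave has to be reviewed and the policy refreshed before it is trusted again.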
Not all applications are candidates for enclaves. There are many reasons for this, but one is definitely that you have to write the application from scratch (or substantially refactor it), architecting it into a part that runs in the host (normal code and data) and a part that runs in the enclave (the code or data that needs to be secured).
The most obvious use case is a business that needs to process sensitive data while guaranteeing the privacy of that data or of the algorithm that’s used for the processing. Notable candidates are the financial and medical industries.
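The host/enclave split can be pictured with a small Python simulation. This is only a sketch of the architecture, not enclave code: plain Python provides no isolation, and the SimulatedEnclave class, its tag_record and check_record entry points and the host_main function are all hypothetical names chosen for illustration. The point is that the secret and the sensitive logic sit behind a narrow interface, while parsing and I/O stay in the host.

```python
import hashlib
import hmac
import secrets


class SimulatedEnclave:
    """Stands in for the trusted part: holds the secret and the sensitive logic.

    Plain Python offers no real isolation; the private attribute only marks
    what would live in protected memory on real hardware.
    """

    def __init__(self) -> None:
        self.__key = secrets.token_bytes(32)  # never returned to the host

    def tag_record(self, record: bytes) -> str:
        """Entry point the host may call: authenticate a record with the secret."""
        return hmac.new(self.__key, record, hashlib.sha256).hexdigest()

    def check_record(self, record: bytes, tag: str) -> bool:
        """Entry point the host may call: confirm a record matches its tag."""
        expected = hmac.new(self.__key, record, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tag)


def host_main(lines: list, enclave: SimulatedEnclave) -> list:
    """Untrusted part: parsing, I/O and formatting stay outside the enclave."""
    cleaned = [line.strip() for line in lines if line.strip()]
    return [(line, enclave.tag_record(line.encode())) for line in cleaned]


enclave = SimulatedEnclave()
tagged = host_main([" alice,100 ", "", "bob,250"], enclave)

print([record for record, _ in tagged])                  # ['alice,100', 'bob,250']
print(enclave.check_record(b"alice,100", tagged[0][1]))  # True
print(enclave.check_record(b"alice,999", tagged[0][1]))  # False
```

Keeping the enclave interface this small is a design choice: the less code and data that cross the boundary, the less there is to review and attest.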
Another use case is enterprise blockchain scenarios where several businesses need to share data or code, but they don’t fully trust the other partners in the consortium. Here, all partners can inspect the code that runs in the enclave and agree that they trust it and, thus, trust whatever results the code gives.
SQL Server is another use case as today Microsoft provides Always Encrypted, where a special driver runs on the client and only decrypts the data as required. This technology will be extended to use enclave technology to protect sensitive data on the server, even from a DBA with full access to the SQL Server. This is coming first in SQL Server 2019 on-premises and then in Azure SQL Database.
A fourth use case is where several different entities have data that could be useful for training a more accurate machine learning (ML) model, for instance medical data from several different hospitals. Understandably, regulations prevent private medical data from being shared in this manner. Using enclaves, each entity can encrypt its data before uploading it, ensuring that no other organization has access to it. The data is then decrypted only inside the enclave and analyzed to build a more accurate ML model, while each organization’s data stays private and inaccessible to the other entities.
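The data flow in this scenario can be sketched in Python under some loud assumptions. The cipher below is a toy XOR keystream used only so the example runs on the standard library; it is not secure, and a real system would use an authenticated cipher with keys released only after attestation. The SimulatedEnclave class, the hospital names and the readings are invented for illustration, and a pooled mean stands in for actual model training.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR with a hash-derived keystream. Illustration only, NOT secure."""
    stream = _keystream(key, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, stream))


toy_decrypt = toy_encrypt  # XOR with the same keystream undoes the encryption


class SimulatedEnclave:
    """Hypothetical stand-in for enclave code; Python gives no real isolation."""

    def __init__(self) -> None:
        self._keys = {}

    def provision_key(self, party: str) -> bytes:
        """Hand a party its own upload key (in practice, after attestation)."""
        key = secrets.token_bytes(32)
        self._keys[party] = key
        return key

    def pooled_mean(self, uploads: dict) -> float:
        """Decrypt every upload inside the enclave and return only the aggregate."""
        values = []
        for party, blob in uploads.items():
            values.extend(json.loads(toy_decrypt(self._keys[party], blob)))
        return sum(values) / len(values)


enclave = SimulatedEnclave()
records = {"hospital_a": [120, 130], "hospital_b": [110, 140, 150]}

uploads = {}
for party, readings in records.items():
    key = enclave.provision_key(party)
    uploads[party] = toy_encrypt(key, json.dumps(readings).encode())

print(enclave.pooled_mean(uploads))  # 130.0
```

Each hospital sees only its own key and ciphertext, and the only value that leaves the simulated enclave is the combined result.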
Yet another potential use case is in IoT edge scenarios where a device needs to process sensitive data before aggregating it in the cloud. A practical example could be sales terminals where the part that manages credit card data is executed in an enclave.
Open Enclave SDK
As is so often the case with Microsoft, the company is building a platform for developers to build on top of. In this case it’s the Open Enclave SDK (which is going to be released as open source). Currently, it supports C and C++, with other languages coming. You can build an application with it and it’ll run on any SGX hardware, not just in Azure. Today, Azure supports software enclaves built on Hyper-V Virtualization Based Security (VBS) and hardware enclaves built on Intel SGX. Support for ARM TrustZone is also planned as that work matures. It’s probably safe to assume that AMD will develop comparable technology for its CPUs.
One of the reasons that Microsoft is developing this SDK is to abstract the different underlying hardware implementations to make your code more portable. Another reason is to make moving code between Linux and Windows easier, the aim being to require a simple recompile.
Intel’s current SGX SDK causes some friction when developing applications as you need a business relationship with Intel to acquire certificates for your applications, something that’s not required for the Open Enclave SDK (the certificates used still map back to Intel’s root certificate).
The trick (and developers who learn this early will be able to command large salaries) will be exactly how to architect applications with the right proportion of code and data in the enclave and the rest in the host application. There are limits on the size of the data and code that can live in an enclave, and code also runs more slowly inside one.
It’s early days yet for ACC, and because the technology requires writing applications from scratch or significantly refactoring existing programs, I don’t expect it to be an overnight success. However, for the right scenarios it’ll be another tool in your belt, and I wouldn’t be surprised if standards such as PCI come to require secure processing environments in the future.