Two vulnerabilities found by Quarkslab in the TPM 2.0 reference implementation and reported in November 2022 have now been publicly disclosed and could affect billions of devices.
Who can be affected?
-Large tech vendors
-Organizations using enterprise PCs, servers and embedded systems that include a TPM
What can you do next?
Last Tuesday, February 28th 2023, after a lengthy coordinated disclosure process, both CERT/CC and TCG published security advisories describing the issues and the solutions to consider.
We published the technical details of the TPM 2.0 vulnerabilities discovered last November by Quarkslab engineer Francisco Falcón.
To set the context: in October 2021, Microsoft released Windows 11. One of the installation requirements that stood out was the need for a Trusted Platform Module (TPM) 2.0. An implication of this requirement is that, in order to run Windows 11 within a virtual machine, virtualization software must provide a TPM to the VMs, either by doing passthrough to the hardware TPM on the host machine, or by supplying a virtual (software-only) TPM to them.
We found this to be an interesting topic for vulnerability research, since the addition of virtual TPMs means an extended attack surface on virtualization software that can be reached from within a guest, and so it could potentially be used for a virtual machine escape. As we found out, the bugs we discovered have a far larger reach than we initially thought: because they originate in the reference implementation code published by the Trusted Computing Group, these security issues affect not only every virtualization software we tested, but hardware implementations as well.
In the blog post, we discuss the details of the vulnerabilities we discovered in the Trusted Platform Module (TPM) 2.0 reference implementation code, which is provided by the Trusted Computing Group (TCG), the nonprofit organization that publishes and maintains the TPM specification. These two vulnerabilities, an out-of-bounds write and an out-of-bounds read identified as CVE-2023-1017 and CVE-2023-1018 respectively, can be triggered from user-mode applications by sending malicious TPM 2.0 commands with encrypted parameters to a TPM whose firmware is based on the TCG reference implementation. The bugs affected several TPM 2.0 software implementations (such as the ones used by virtualization software) as well as a number of hardware TPMs. Depending on the case, the impact may range from making the TPM stop working until the system is rebooted (a denial of service), to extracting sensitive data such as encryption keys and other cryptographic material from the chip, or even to running an attacker's program on the TPM.
However, getting visibility of what's happening at runtime in the firmware of a TPM, running in a separate chip, is difficult. Even doing static analysis of the firmware of a hardware TPM proved to be difficult: the few TPM firmware updates we attempted to analyze happened to be encrypted. Therefore, the lack of specific assessment on hardware TPMs doesn't mean that they are not affected; it just means that we couldn't evaluate how they are impacted due to the lack of observability.
Interestingly, although all affected TPMs share the exact same vulnerable function, which stems from the reference implementation code, the likelihood of successful exploitation depends on how the command buffer is implemented, and that part is left to each implementation. From what we saw, everyone seems to handle it in a different way: some clear out the command buffer between received requests, but others don't; some allocate the command buffer on the heap, while others use a global variable for it.
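To illustrate why command buffer handling matters, here is a minimal Python sketch of a reused, fixed-size command buffer and a parser that reads a 2-byte size field past the end of the current command. It is a toy model written for this post, not the TCG reference code: the `CommandBuffer` class, the helper functions, the 16-byte capacity and the offsets are all hypothetical, and a Python `bytearray` stands in for the memory that a real out-of-bounds read would touch.

```python
import struct


class CommandBuffer:
    """Fixed-size buffer reused for every incoming command (hypothetical model)."""

    def __init__(self, capacity: int, clear_between_requests: bool) -> None:
        self.data = bytearray(capacity)
        self.length = 0  # number of valid bytes in the current command
        self.clear_between_requests = clear_between_requests

    def receive(self, command: bytes) -> None:
        if len(command) > len(self.data):
            raise ValueError("command larger than buffer")
        if self.clear_between_requests:
            self.data[:] = bytes(len(self.data))  # zero the whole buffer
        self.data[: len(command)] = command
        self.length = len(command)


def read_size_unchecked(buf: CommandBuffer, offset: int) -> int:
    """Read a 2-byte big-endian size field without checking the command length."""
    (size,) = struct.unpack_from(">H", buf.data, offset)
    return size


def read_size_checked(buf: CommandBuffer, offset: int) -> int:
    """Same read, but refuse to go past the end of the current command."""
    if offset < 0 or buf.length - offset < 2:
        raise ValueError("command too short for a 2-byte size field")
    (size,) = struct.unpack_from(">H", buf.data, offset)
    return size


def demo(clear_between_requests: bool) -> int:
    buf = CommandBuffer(16, clear_between_requests)
    buf.receive(b"\xaa" * 8)           # earlier, longer request
    buf.receive(b"\x01\x02\x03\x04")   # short request: only 4 valid bytes
    # The parser wrongly assumes a size field follows the 4 valid bytes.
    return read_size_unchecked(buf, 4)
```

In this model, `demo(False)` returns a value built from bytes left over from the earlier request, while `demo(True)` returns zero because the buffer was cleared. `read_size_checked` rejects the read in both cases, which is the kind of bounds check that removes the problem regardless of how the buffer is managed.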
We were able to verify that these vulnerabilities are present in the software TPMs included in major desktop virtualization solutions such as VMware Workstation, Microsoft Hyper-V and QEMU. Virtual TPMs available in major cloud computing providers were also likely affected. Finally, we expected most TPM hardware vendors to be affected too, but this proved difficult to verify. TPM chips are not built for users to be able to inspect them and see what computations they are performing. They do not provide debugging or monitoring interfaces. They are meant to be resistant to any attempts to inspect or tamper with their internals. That lack of a debugging setup to get visibility on what's going on in the TPM firmware at runtime made it harder to confirm the presence of the vulnerabilities in a physical chip. Static analysis could be an alternative to assess whether a hardware TPM is vulnerable or not, but the few TPM firmware updates we managed to get our hands on happened to be encrypted.
Faced with the possibility that many TPM hardware vendors could be affected but lacking the capacity to verify the status of all of them, in November 2022 we decided to contact CERT/CC to ask for help in reaching out to as many potentially affected organizations as possible. Over the next 90 days, CERT/CC coordinated with TCG and reached out to an increasing number of vendors; eventually more than 600 got access to the report. On February 28th, both TCG and CERT/CC published their security bulletins disclosing the issues. The full scope and impact are not yet fully known, and more information to complete the picture will appear over the next months, as different vendors issue statements and/or fixes.
There are several takeaways from this experience:
- Example code or a reference implementation of a standard widely used in the industry should be scrutinized very carefully and audited regularly and in-depth because any flaw found in it is very likely to cascade into the products of many vendors, making it very difficult and costly to address in a timely and orderly manner afterwards.
- Even the most secure and trustworthy component of a computing system is prone to vulnerabilities. There is no perfection in technology, and anchoring the trust of an entire system on a single component is risky. Don't take the security of secure elements for granted.
- The impact of a vulnerability is very specific to how the technology is implemented and the environment in which it is used. Generic impact metrics, whether qualitative (such as "high", "medium", "low") or quantitative (such as CVSS) are useful but they don’t necessarily map accurately to your specific IT environment.