HSM vs TPM
The great debate of our time. (Or not)
This is a security theory article, take it for what it is.
What's a TPM?
A Trusted Platform Module (TPM) is a hardware chip on the computer's motherboard that stores cryptographic keys used for encryption. TPMs provide a unique, locally accessible root-of-trust relationship with their host, which is an advantage if they are configured for boot-time security.
The TPM includes a unique endorsement key burned into it (traditionally RSA), which is used for asymmetric encryption. Additionally, it can generate, store, and protect other keys used in the encryption and decryption process.
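To make the boot-time security angle a little more concrete, here is a rough sketch of the "extend" operation a TPM uses for its Platform Configuration Registers (PCRs): the new register value is a hash of the old value concatenated with the digest of whatever was just measured. This is a pure software model using SHA-256, not code that talks to a real TPM, and the component names in the boot chain are made up for illustration.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 PCR bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value after extending it with one measurement."""
    digest = hashlib.sha256(measurement).digest()
    # new PCR = H(old PCR || digest of the measured component)
    return hashlib.sha256(pcr + digest).digest()


def replay(measurements: list[bytes]) -> bytes:
    """Replay an ordered list of measurements starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for m in measurements:
        pcr = extend(pcr, m)
    return pcr


# Hypothetical boot components, purely for illustration.
boot_chain = [b"firmware-v1", b"bootloader-v1", b"kernel-v1"]
tampered = [b"firmware-v1", b"bootloader-EVIL", b"kernel-v1"]

expected = replay(boot_chain)
print(replay(boot_chain) == expected)  # True: same chain, same final value
print(replay(tampered) == expected)    # False: one changed stage changes the result
```

Because each value depends on everything measured before it, a register can only be extended, never set directly, which is what makes the final value useful as evidence of what actually booted.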
What's an HSM?
A hardware security module (HSM) is a security device you can add to a system to manage, generate, and securely store cryptographic keys.
High-performance HSMs are external devices connected to a network using TCP/IP. Smaller HSMs come as expansion cards you install within a server, or as devices you plug into computer ports.
How do they differ?
For the purpose of this conversation we will focus on TPM 2.0 and HSM specifications that are FIPS 140-2 compliant.
Well, that's the interesting part of this conversation. In all reality, they don't. HSMs are 'securely networked' and are capable of performing many more cryptographic operations per second, since they really just act as a deployed TPM for multiple computers. You could still use TPMs to store keys in a PKI environment if your root and issuing CAs have TPMs. You don't gain much from an HSM if you already have your primary keys protected.
You could argue "but what about the clients?" and I'd say... what about them? If a client is compromised and its keys are in a position to be stolen, and the attacker's only goal was compromising that client... they've already succeeded. Popping a local computer's private key isn't going to gain them access to anything else. You can still revoke any X.509 certificates from the issuing CA, which is how your clients should be receiving their key pairs.
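As a toy illustration of why revocation contains a client compromise, here's a minimal sketch of an issuing CA that tracks revoked serial numbers. Everything in it is hypothetical (the `Cert` and `IssuingCA` classes, the serials, the host names); a real deployment would publish a CRL or answer OCSP queries rather than keep a Python set.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for the few X.509 fields that matter here."""
    serial: int
    subject: str
    not_after: datetime


@dataclass
class IssuingCA:
    """Hypothetical CA that only tracks which serials it has revoked."""
    revoked: set[int] = field(default_factory=set)  # stand-in for a CRL

    def revoke(self, cert: Cert) -> None:
        self.revoked.add(cert.serial)

    def is_trusted(self, cert: Cert, now: datetime) -> bool:
        # Trusted only if not revoked and not expired.
        return cert.serial not in self.revoked and now <= cert.not_after


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
ca = IssuingCA()
laptop = Cert(serial=1001, subject="laptop-17", not_after=now + timedelta(days=365))
server = Cert(serial=1002, subject="web-01", not_after=now + timedelta(days=365))

ca.revoke(laptop)  # the laptop's private key was stolen
print(ca.is_trusted(laptop, now))  # False: the stolen key's cert is dead
print(ca.is_trusted(server, now))  # True: nothing else is affected
```

The point is simply that the blast radius of one stolen client key is one certificate, as long as relying parties actually check revocation status.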
You could further argue that an attacker could directly attack the TPM if they compromise the OS of the machine that has the TPM. Which, sure, but you could also compromise the HSM.
You may also get some additional crypto integration packages out of an HSM that you'd be hard-pressed to get in a TPM. The ability to craft PKCS#11, KMIP, and other protocol implementations can provide unique use cases for on-the-fly key generation and storage across the enterprise. But if you're not using a networked HSM the value goes down considerably, since you can't establish quorums or backups of your keys as easily. Though if you're comfortable with a bit of risk of the CAs needing a very sudden reissuance due to key loss, the YubiHSMs are pretty cool... keeping in mind they're only marginally better than a TPM and should be inside of the machine.
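For a feel of what a key quorum buys you, here's a small sketch of m-of-n key splitting using Shamir's secret sharing. To be clear, this is a textbook technique picked for illustration, not how any particular HSM vendor implements quorum controls, and the function names and the choice of prime are my own assumptions.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split secret into `shares` points; any `threshold` of them recover it."""
    if not (1 <= threshold <= shares):
        raise ValueError("need 1 <= threshold <= shares")
    if not (0 <= secret < PRIME):
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(256)              # pretend this is a CA key backup secret
parts = split(key, threshold=3, shares=5)
print(combine(parts[:3]) == key)         # True: any 3 of 5 custodians suffice
print(combine(parts[1:4]) == key)        # True: a different 3 also work
```

With a 3-of-5 split, losing two shares doesn't lose the key and stealing two shares doesn't reveal it, which is roughly the backup-plus-separation-of-duties property a networked HSM gives you out of the box.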
Should I use a virtual HSM?
Short Answer: Maybe?
Long Answer: Maybe. It really generates a lot of questions that you'd have to answer appropriately. Remember, outside of the mathematics of it all, crypto security is nearly completely up to how much YOU are willing to trust it. Are you backing its primary keys with a TPM? Are you using virtualization protection tools like Shielded VMs in Hyper-V, so the VMs are effectively isolated from their fabric? How well was your shielding built? I'd argue you shouldn't do this, for the same reason you shouldn't virtualize DCs. Unless you've spent a lot of money, time, and effort building a properly architected secure administration environment on top of production... if they compromise your virtualization platform, they essentially compromise everything by proxy, since you've got higher-security platforms virtualized on top of it.
Side note: Your DCs, offline roots, and virtual HSM (if you have one) should be in a "global admin" level zone. They should be virtualized independently if you're going to virtualize them. The virtual platform they are hosted on should ideally not be accessible from anywhere but locally. In truth you should never virtualize any DC or extremely high-security platform, but if you do, virtualize them securely. There are exceptions to this: if you use a properly defined, tiered, and shielded environment, it should be fairly secure against most hardware-to-virtualization attacks.
Companies like HashiCorp and Unbound are making a lot of progress with multi-party computation functions that show a lot of promise. The latter supports direct CryptoAPI and OpenSSL integrations, which could make developing on top of them a breeze and allow flexibility in the normally very inflexible crypto world.
So what does this mean? Should I get an HSM or TPM?
In this article I've mainly focused on the general X.509 PKI aspects of both of these items. Other uses become more complex and murky very quickly, which is why HSMs are very popular in the manufacturing and financial industries. If you're on the murky side of these industries you likely don't need my ideas on the subject. Otherwise... You should almost always have a TPM on every device in your network that supports it, because why not? They're major requirements for tons of security solutions in multiple operating systems and provide a difficult-to-manipulate layer of hardware attestation and trust when combined with UEFI.
An HSM? Maybe. Do you have the resources to actually support it? Are you alright with additional network complexity that could fail and wreck your entire environment in one fell swoop? Can you afford the cost of it? And is it worth it compared to doing something that provides much more security, like a fully deployed MFA or PAM?
Unless you plan on using X.509 certificates for encrypting your data at rest, authentication and authorization, document encryption, signing (code/doc)... and full drive encryption keys, you're really not going to get all the benefits that HSMs typically tout. But you should never deploy a PKI without some form of hardware cryptographic root. So in a net new crypto build it may be beneficial to look at an HSM so you can scale into those potential benefits later.
Keep in mind an HSM is an amazing solution if you've got the resources to maintain it and the ability to retrofit the CAs/RAs in your environment to make it worthwhile. Otherwise, for a low-volume internal PKI solution, TPM-based key protection could be exactly what you need. If you'd like to apply this to Active Directory Certificate Services, here's a pretty good write-up on it: https://blogs.technet.microsoft.com/pki/2014/06/05/setting-up-tpm-protected-certificates-using-a-microsoft-certificate-authority-part-1-microsoft-platform-crypto-provider/ .