Cryptographic Best Practices in 2023

2023 Apr 17

Satoshi Nakamoto
https://satoshinakamoto.network

Putting cryptographic primitives together is a lot like putting a jigsaw puzzle together, where all the pieces are cut exactly the same way, but there is only one correct solution. Thankfully, there are some projects out there that are working hard to make sure developers are getting it right.

The following advice comes from years of research by leading security researchers, developers, and cryptographers. This Gist was forked from Thomas Ptacek's Gist to be more readable. Additions have been made from Latacora's Cryptographic Right Answers.

Some others have been added from years of discussions on Twitter, IRC, and mailing lists that would be too difficult to pin down, or exhaustively list here.

If at any point, I disagree with some of the advice, I will note it and provide reasoning why. If you have any questions, or disagreements, let me know.

TL;DR

If you take only one thing away from this post, it should be to use a library that puts these puzzle pieces together correctly for you. Pick one of these for your project:

  1. NaCl - by cryptographer Daniel Bernstein
  2. libsodium - a NaCl fork by developer Frank Denis
  3. monocypher - a libsodium-inspired library by developer Loup Vaillant

Throughout this document, when I refer to "just use NaCl", I mean one of these libraries.

Symmetric Encryption

If you are in a position to use a key management system (KMS), then you should use KMS. If you are not in a position to use KMS, then you should use authenticated encryption with associated data (AEAD).

The CAESAR competition, held to find AEAD algorithms without some of the sharp edges of AES-GCM while also improving performance, concluded in 2019 with a final portfolio that includes Ascon and AEGIS-128.

Some notes on AEAD:

The NaCl libraries will handle AEAD for you natively.
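
For a concrete picture of what that looks like, here is a minimal sketch using PyNaCl, the Python binding for libsodium (the choice of binding and the names below are mine, not something this document prescribes). SecretBox generates the nonce and appends the authentication tag for you:

    # Secret-key authenticated encryption via PyNaCl (libsodium binding).
    import nacl.secret
    import nacl.utils

    # 256-bit key straight from the OS CSPRNG.
    key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
    box = nacl.secret.SecretBox(key)

    # encrypt() picks a random nonce and appends a Poly1305 tag;
    # decrypt() verifies the tag and raises CryptoError on tampering.
    ciphertext = box.encrypt(b"attack at dawn")
    assert box.decrypt(ciphertext) == b"attack at dawn"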

Use, in order of preference:

  1. KMS, if available.
  2. The NaCl, libsodium, or monocypher default
  3. ChaCha20-Poly1305
  4. AES-GCM
  5. AES-CTR with HMAC

Avoid:

  1. AES-CBC, AES-CTR by itself
  2. Block ciphers with 64-bit blocks, such as Blowfish.
  3. OFB mode
  4. RC4, which is comically broken

Symmetric Key Length

See my blog post about The Physics of Brute Force to understand why 256-bit keys are more than sufficient. But remember: your AES key is far less likely to be broken than your public key pair, so the latter key size should be larger if you're going to obsess about this.

If your symmetric key is based on user input, such as a passphrase, then it should provide at least as many bits of theoretical entropic security as the symmetric key length. In other words, if your AES key is 128 bits and is built from a password, then that password should provide at least 128 bits of entropy.
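
As a quick sketch of that arithmetic (the alphabet and word-list sizes below are my own illustration): a passphrase whose symbols are chosen uniformly at random provides about length × log2(alphabet size) bits of entropy.

    # Rough entropy estimate for randomly generated passphrases.
    # Only valid when every symbol is chosen uniformly at random.
    import math

    def entropy_bits(alphabet_size: int, length: int) -> float:
        return length * math.log2(alphabet_size)

    print(entropy_bits(94, 20))    # ~131 bits: 20 printable-ASCII characters
    print(entropy_bits(7776, 10))  # ~129 bits: 10 words from a 7776-word diceware list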

As with asymmetric encryption, symmetric encryption key length is a vital security parameter. Academic, private, and government organizations provide different recommendations, with mathematical formulas to approximate the minimum key size requirement for security. See BlueKrypt's Cryptographic Key Length Recommendation for other recommendations and dates.

To protect data through 2050, it is recommended to meet the minimum requirements for symmetric key lengths; see the references below for specific figures.

See also the NSA Fact Sheet Suite B Cryptography and RFC 3766 for additional recommendations and math algorithms for calculating strengths based on calendar year.

Personally, I don't see any problem with using 256-bit key lengths. So, my recommendation would be:

Use:

  1. Minimum: 128-bit keys
  2. Maximum: 256-bit keys

Avoid:

  1. Constructions with huge keys
  2. Cipher "cascades"
  3. Key sizes under 128 bits

Symmetric Signatures

If you're authenticating but not encrypting, as with API requests, don't do anything complicated. There is a class of crypto implementation bugs that arises from how you feed data to your MAC, so, if you're designing a new system from scratch, Google "crypto canonicalization bugs". Also, use a secure compare function.

If you use HMAC, people will feel the need to point out that SHA-3 (and the truncated SHA-2 hashes) can do "KMAC", which is to say you can just concatenate the key and data, and hash them to be secure. This means that in theory, HMAC is doing unnecessary extra work with SHA-3 or truncated SHA-2. But who cares? Think of HMAC as cheap insurance for your design, in case someone switches to non-truncated SHA-2.
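
To make the "secure compare function" point concrete, here is a small sketch using Python's standard library (the helper names are mine). It uses keyed BLAKE2b from the alternative list below; hmac.compare_digest gives you the constant-time comparison.

    # Keyed BLAKE2b as a MAC, verified with a constant-time comparison.
    # hmac.new(key, message, hashlib.sha256) would work the same way for HMAC.
    import hashlib
    import hmac

    def tag(key: bytes, message: bytes) -> bytes:
        # BLAKE2b has a built-in keyed mode, so no HMAC construction is needed.
        return hashlib.blake2b(message, key=key, digest_size=32).digest()

    def verify(key: bytes, message: bytes, candidate: bytes) -> bool:
        # compare_digest avoids leaking where the two values first differ.
        return hmac.compare_digest(tag(key, message), candidate)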

Use:

  1. HMAC-SHA-512/256
  2. HMAC-SHA-512/224
  3. HMAC-SHA-384
  4. HMAC-SHA-224
  5. HMAC-SHA-512
  6. HMAC-SHA-256

Alternately, use in order of preference:

  1. Keyed BLAKE2b
  2. Keyed BLAKE2s
  3. Keyed SHA3-512
  4. Keyed SHA3-256

Avoid:

  1. HMAC-MD5
  2. HMAC-SHA1
  3. Custom "keyed hash" constructions
  4. Complex polynomial MACs
  5. Encrypted hashes
  6. Anything CRC

Hashing

If you can get away with it, you want to use hashing algorithms that truncate their output and sidestep length-extension attacks. Meanwhile: it's less likely that you'll upgrade from SHA-2 to SHA-3 than it is that you'll upgrade from SHA-2 to BLAKE2, which is faster than SHA-3, and SHA-2 looks great right now, so get comfortable and cuddly with SHA-2.
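
For example, from Python's standard library (my example, not part of the original recommendations):

    # BLAKE2b truncated to 256 bits, and plain SHA-256 for comparison.
    import hashlib

    data = b"hello world"
    print(hashlib.blake2b(data, digest_size=32).hexdigest())
    # SHA-256 is fine for most uses, but it is length-extendable, so don't
    # build hash(secret || message) constructions with it; use HMAC instead.
    print(hashlib.sha256(data).hexdigest())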

Use (pick one):

  1. SHA-2 (fast, time-tested, industry standard)
  2. BLAKE2 (fastest, SHA-3 finalist)
  3. SHA-3 (slowest, industry standard)

Avoid:

  1. SHA-1
  2. MD5
  3. MD6
  4. EDON-R (I'm looking at you OpenZFS)

Random Numbers

When creating random IDs, numbers, URLs, initialization vectors, or anything else that needs to be random, you should always use your operating system's kernelspace CSPRNG. On GNU/Linux (including Android), BSD, or macOS (including iOS), this is /dev/urandom. On Windows, this is CryptGenRandom.

NOTE: /dev/random is not more secure than /dev/urandom. They use the same CSPRNG. The only time you would obsess over this is when working on an information-theoretic cryptographic primitive that exploits the blocking behavior of /dev/random, which you aren't doing (you would know it if you were).

The only time you should ever use a userspace RNG is when you're in a constrained environment, such as embedded firmware, where the OS RNG is not available. In that case, use fast key erasure. The problem here, however, is making sure that it is properly seeded with entropy on each boot. This is harder than it sounds, so really, this should only be used as a last-resort fallback.
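
In Python, for example, all of these read from the OS CSPRNG (a sketch, not a prescription):

    # Random keys, tokens, and IDs from the operating system's CSPRNG.
    import secrets

    key = secrets.token_bytes(32)          # 256-bit random key
    url_token = secrets.token_urlsafe(32)  # random URL-safe identifier
    hex_id = secrets.token_hex(16)         # 128-bit value as hex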

Use:

  1. Your operating system's CSPRNG.
  2. Fast key erasure (as a fallback).

Create: 256-bit random numbers

Avoid:

  1. Userspace random number generators
  2. /dev/random

Password Hashing

When using scrypt for password hashing, be aware that it is very sensitive to its parameters, making it possible to end up weaker than bcrypt, and it suffers from a time-memory trade-off (source #1 and source #2). When using bcrypt, make sure to use the following construction to work around the leading NULL byte problem and the 72-character password limit:

bcrypt(base64(sha-512(password)))
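
A sketch of that construction with the common Python bcrypt package (the package choice and helper names are mine):

    # bcrypt(base64(sha-512(password))): the SHA-512 pre-hash removes the
    # length limit on user passwords, and base64 keeps NULL bytes out of
    # the bcrypt input.
    import base64
    import hashlib
    import bcrypt

    def _prehash(password: bytes) -> bytes:
        # base64 of a 64-byte digest is 88 characters; bcrypt only uses the
        # first 72 bytes, so truncate explicitly to keep behavior consistent.
        return base64.b64encode(hashlib.sha512(password).digest())[:72]

    def hash_password(password: bytes) -> bytes:
        return bcrypt.hashpw(_prehash(password), bcrypt.gensalt(rounds=12))

    def check_password(password: bytes, stored: bytes) -> bool:
        return bcrypt.checkpw(_prehash(password), stored)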

Initially, I was hesitant to recommend Argon2 for general production use. I no longer feel that way. It was the winner of the Password Hashing Competition, has had ample analysis, even before the competition finished, and is showing no signs of serious weaknesses.

Each password hashing algorithm requires a "cost" to be set correctly. For Argon2, this means sufficient time on the CPU and a sufficient amount of RAM. For scrypt, this is using at least 16 MB of RAM. For bcrypt, this is a cost of at least 5. For sha512crypt and sha256crypt, this is at least 5,000 rounds. For PBKDF2, this is at least 1,000 rounds.
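
For Argon2, a package like argon2-cffi exposes those cost knobs directly; the specific numbers below are illustrative, so tune them for your own hardware:

    # Argon2 via argon2-cffi. memory_cost is in KiB (64 MB here).
    from argon2 import PasswordHasher
    from argon2.exceptions import VerifyMismatchError

    ph = PasswordHasher(time_cost=3, memory_cost=64 * 1024, parallelism=4)

    stored = ph.hash("correct horse battery staple")
    try:
        ph.verify(stored, "correct horse battery staple")
    except VerifyMismatchError:
        print("wrong password")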

Jeremi Gosney, a professional password cracker, publishes benchmarks with Nvidia GPU clusters, such as with 8x Nvidia GTX 1080 Ti GPUs. It's worth looking over those numbers.

Use, in order of preference:

  1. Argon2 (tune appropriately)
  2. scrypt (>= 16 MB)
  3. bcrypt (>= 5)
  4. sha512crypt (>= 5,000 rounds)
  5. sha256crypt (>= 5,000 rounds)
  6. PBKDF2 (>= 1,000 rounds)

Avoid:

  1. Plaintext
  2. Naked SHA-2, SHA-1, MD5
  3. Complex homebrew algorithms
  4. Any encryption algorithm

Asymmetric Encryption

It's time to stop using anything RSA, and start using NaCl. Of all the cryptographic "best practices", this is the one you're least likely to get right on your own. NaCl has been designed to prevent you from making stupid mistakes, it's highly favored among the cryptographic community, and it focuses on modern, highly secure cryptographic primitives.

It's time to start using ECC. There are several reasons to stop using RSA and switch to elliptic-curve software, not least that ECC gives you equivalent security with much smaller keys.

If you absolutely have to use RSA, do use RSA-KEM. But don't use RSA. Use ECC.
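
To show what "just use NaCl" looks like for public-key encryption, here is a sketch with PyNaCl (again, my choice of binding):

    # Public-key authenticated encryption: Curve25519 + XSalsa20-Poly1305.
    from nacl.public import PrivateKey, Box

    alice = PrivateKey.generate()
    bob = PrivateKey.generate()

    # Alice encrypts to Bob; Bob decrypts and authenticates Alice.
    ciphertext = Box(alice, bob.public_key).encrypt(b"hi Bob")
    plaintext = Box(bob, alice.public_key).decrypt(ciphertext)
    assert plaintext == b"hi Bob"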

Use: NaCl, libsodium, or monocypher

Avoid:

  1. Really, anything RSA
  2. ElGamal
  3. OpenPGP, OpenSSL, BouncyCastle, etc.

Asymmetric Key Length

As with symmetric encryption, asymmetric encryption key length is a vital security parameter. Academic, private, and government organizations provide different recommendations, with mathematical formulas to approximate the minimum key size requirement for security. See BlueKrypt's Cryptographic Key Length Recommendation for other recommendations and dates.

To protect data up through 2050, it is recommended to meet the minimum requirements for asymmetric key lengths:

Method            RSA (bits)   ECC (bits)   D-H Key (bits)   D-H Group (bits)
Lenstra/Verheul   4047         206          193              4047
Lenstra Updated   2440         203          203              2440
ECRYPT II         15424        512          512              15424
NIST              7680         384          384              7680
ANSSI             3072         256          200              3072
BSI               3000         250          250              3000

See also the NSA Fact Sheet Suite B Cryptography and RFC 3766 for additional recommendations and math algorithms for calculating strengths based on calendar year.

Personally, I don't see any problem with using 2048-bit RSA/DH group and 256-bit ECC/DH key lengths. So, my recommendation would be:

Use:

  1. 256-bit minimum for ECC/DH Keys
  2. 2048-bit minimum for RSA/DH Group (but you're not using RSA, right?)

Avoid: Not following the above recommendations.

Asymmetric Signatures

The two dominating use cases within the last 10 years for asymmetric signatures are cryptocurrencies and forward-secret key agreement, as with ECDHE-TLS. The dominating algorithms for these use cases are all elliptic-curve based. Be wary of new systems that use RSA signatures.

In the last few years there has been a major shift away from conventional DSA signatures and towards misuse-resistant "deterministic" signature schemes, of which EdDSA and RFC 6979 are the best examples. You can think of these schemes as "user-proofed" responses to the PlayStation 3 ECDSA flaw, in which reuse of a random number leaked secret keys. Use deterministic signatures in preference to any other signature scheme.

Ed25519, the NaCl default, is by far the most popular public key signature scheme outside of Bitcoin. It's misuse-resistant and carefully designed in other ways as well. You shouldn't freelance this either; get it from NaCl.
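
For instance, with PyNaCl (a sketch; the message is a placeholder):

    # Ed25519 signatures with PyNaCl.
    from nacl.signing import SigningKey

    signing_key = SigningKey.generate()
    verify_key = signing_key.verify_key

    signed = signing_key.sign(b"release-1.2.3.tar.gz")
    verify_key.verify(signed)  # raises BadSignatureError if tampered with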

Use, in order of preference:

  1. NaCl, libsodium, or monocypher
  2. Ed25519
  3. RFC6979 (deterministic DSA/ECDSA)

Avoid:

  1. Anything RSA
  2. ECDSA
  3. DSA

Diffie-Hellman

Developers should not freelance their own encrypted transports. To get a sense of the complexity of this issue, read the documentation for the Noise Protocol Framework. If you're doing a key-exchange with Diffie-Hellman, you probably want an authenticated key exchange (AKE) that resists key compromise impersonation (KCI), and so the primitive you use for Diffie-Hellman is not the only important security concern.

It remains the case: if you can just use NaCl, use NaCl. You don't even have to care what NaCl does. That's the point of NaCl.

Otherwise: use Curve25519. There are libraries for virtually every language. In 2015, we were worried about encouraging people to write their own Curve25519 libraries, with visions of Javascript bignum implementations dancing in our heads. But really, part of the point of Curve25519 is that the entire curve was carefully chosen to minimize implementation errors. Don't write your own! But really, just use Curve25519.

Don't do ECDH with the NIST curves, where you'll have to carefully verify elliptic curve points before computing with them to avoid leaking secrets. That attack is very simple to implement, easier than a CBC padding oracle, and far more devastating.

The previous edition of this document included a clause about using DH-1024 in preference to sketchy curve libraries. You know what? That's still a valid point. Valid and stupid. The way to solve the "DH-1024 vs. sketchy curve library" problem is the same as the "should I use Blowfish or IDEA?" problem. Don't have that problem. Use Curve25519.
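
If you do end up doing the exchange yourself, here is a sketch of X25519 plus a KDF using the pyca/cryptography package (my choice of library; a real protocol also needs authentication, per the AKE/KCI point above):

    # X25519 key agreement, with the shared secret run through HKDF.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    alice = X25519PrivateKey.generate()
    bob = X25519PrivateKey.generate()

    shared_a = alice.exchange(bob.public_key())
    shared_b = bob.exchange(alice.public_key())
    assert shared_a == shared_b

    # Never use the raw DH output directly; derive the session key with a KDF.
    key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
               info=b"example handshake").derive(shared_a)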

Use, in order of preference:

  1. NaCl, libsodium, or monocypher
  2. Curve25519
  3. 2048-bit Diffie-Hellman Group #14

Avoid:

  1. Conventional DH
  2. SRP
  3. J-PAKE
  4. Handshakes and negotiation
  5. Elaborate key negotiation schemes that only use block ciphers
  6. srand(time())

Website security

By "website security", we mean "the library you use to make your web server speak HTTPS". If you can pay a web hosting provider to worry about this problem for you, then you do that. Otherwise, use OpenSSL.

There was a dark period between 2010 and 2016 where OpenSSL might not have been the right answer, but that time has passed. OpenSSL has gotten better, and, more importantly, OpenSSL is on-the-ball with vulnerability disclosure and response.

Using anything besides OpenSSL will drastically complicate your system for little, no, or even negative security benefit. This means avoid LibreSSL, BoringSSL, or BearSSL for the time being. Not because they're bad, but because OpenSSL really is the Right Answer here. Just keep it simple; use OpenSSL.

Speaking of simple: Let's Encrypt is free and automated. Set up a cron job to renew certificates regularly, and test it.

Use:

  1. A web hosting provider, like AWS.
  2. OpenSSL with Let's Encrypt

Avoid:

  1. PolarSSL
  2. GnuTLS
  3. MatrixSSL
  4. LibreSSL
  5. BoringSSL
  6. BearSSL

Client-Server Application Security

What happens when you design your own custom RSA protocol is that 1-18 months afterwards, hopefully sooner but often later, you discover that you made a mistake and your protocol had virtually no security. A good example is Salt Stack. Salt managed to deploy e=1 RSA.

It seems a little crazy to recommend TLS given its recent history. But you should still use TLS for your custom transport problem: it has received far more scrutiny, analysis, and real-world hardening than anything you will design yourself, and its flaws get found and fixed.
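
In practice that means using a real TLS library, keeping certificate verification on, and (for a private protocol) trusting only your own CA rather than the default trust store. Here is a sketch with Python's standard ssl module; the file and host names are placeholders of mine.

    # TLS client for a custom client-server protocol, trusting only your own CA.
    import socket
    import ssl

    context = ssl.create_default_context(cafile="my-internal-ca.pem")
    with socket.create_connection(("api.internal.example", 8443)) as sock:
        with context.wrap_socket(sock, server_hostname="api.internal.example") as tls:
            print(tls.version())  # e.g. "TLSv1.3"
            tls.sendall(b"hello\n")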

Use: TLS

Avoid:

  1. Designing your own encrypted transport, which is a genuinely hard engineering problem;
  2. Using TLS but in a default configuration, like, with "curl"
  3. Using "curl"
  4. IPSEC

Online Backups

Of course, you should host your own backups in-house. The best security is the security where others just don't get access to your data.

The best solution, IMO, is OpenZFS. Not only do you get data integrity with 256-bit checksums, but you get redundancy, volume management, network transport, and many other options, all for free. FreeNAS makes setting this up trivial. Setting it up with Debian GNU/Linux isn't too difficult.

If using an online backup service, rather than hosting your own, use Tarsnap. It's withstood the test of time.

Alternatively, Keybase has its own keybase filesystem (KBFS) that supports public, private, and team repositories. The specification is sound, but they only provide 10 GB for free, without any paid plans currently. All data is end-to-end encrypted in your KBFS client before being stored on the filesystem, using the NaCl library.

Use:

  1. OpenZFS
  2. Tarsnap
  3. Keybase

Avoid:

  1. Google
  2. Apple
  3. Microsoft
  4. Dropbox
  5. Amazon S3