Cryptography is a fascinating subject at the intersection of pure math and computer science that has become nearly ubiquitous over the past several years. Similar to functional programming, cryptography holds a particular interest to me because of the pure math connections, and my work in security software engineering has brought me back into the topic over the past year or so. Diving in to the standard Java cryptography APIs, I quickly discovered a tangle of strange naming conventions and poorly documented cryptographic primitives. There have been numerous academic papers written studying the widespread phenomenon of misuse of Java cryptography APIs, and after spending a bit of time with them, I can easily see why this is. Java, in its quest to remain low level and generic, oftentimes provides overly complicated APIs that, while flexible for low level use, offer very little guidance in their correct use. Some of this might be attributed to typical overengineering common to Java APIs prior to Java 8, but this particular API is more like a C API ported directly to Java. In fact, there’s a fairly innocuous explanation for it: it’s the same basic pattern present in most historical cryptographic libraries, most of which are indeed written in C by academics with little care for engineering. Combined with the historical baggage of having to deal with cryptographic export controls back in the 1990’s and early 2000’s, the Java cryptography API, like most cryptographic software written before export controls were relaxed, suffered from security vulnerabilities by design.
Much has been written and discovered since these dark days, though most software still struggles with outdated cryptography practices and misuse. In an effort to help software developers incorporate strong cryptography into their applications, an effort must be made to create and use cryptographic software components that are both easy to use and hard to misuse. One of the longest running efforts with a similar philosophy is the OpenBSD operating system which has incorporated strong cryptography and security by default for the past 25 years. Unconstrained by American cryptographic export controls in its home of Canada, the project has been one of the great examples of pervasive use of cryptography and secure design throughout its codebase. With simplified export controls, particularly for free and open source software, this philosophy must expand and help improve the security, privacy, and safety of the software we all write.
Using a corpus of public domain algorithms and cryptographic knowledge, I’ve started the O(1) Cryptography Project where I’m developing an opinionated cryptography library that aims to be easy to use and hard to misuse. O(1) Cryptography is a Java and C library that bundles the latest best practices in cryptography by providing abstractions for common cryptographic use cases. The name is a pun on the idea that cryptographic algorithms should generally run in constant-time and with minimal differentiation in its observable state. These properties seem to be somewhat modern to cryptography in that computers are fast enough and powerful enough now to statistically differentiate non-constant-time cryptographic algorithmic output, though they weren’t as big a concern back at the turn of the 21st century when AES was standardized. In fact, most standardized cryptography until recently was developed almost entirely detached from the reality that one day, non-cryptographers might need to use this stuff, too. There simply aren’t enough cryptographers in existence to write custom cryptographic routines for every application that needs it, and older cryptographic libraries were not designed for non-experts. Maybe you’ve heard of the various acronyms from yesteryear like AES, DES, RC4, RSA, DSA, MD5, SHA1, and many more. Some of these primitives are still useful and secure, but they all have problems of one form or another. In particular, they all fail the two-prong test of being easy to use and hard to misuse.
For example, AES is hard to use: as it was standardized before the consensus formed that authenticated encryption was the way to go, it requires pairing with an authentication algorithm which is commonly forgotten in practice. Implementers of AES are not discouraged from writing various optimizations that leak information about the underlying encryption key. It is fairly easy to misuse AES on both the implementer side and the application side. Being a block cipher, in order to encrypt data longer than the length of the encryption key, a mode of operation must be specified, and many commonly implemented modes have their own security vulnerabilities besides a lack of authentication. Another example is RSA, the pair of signature and asymmetric encryption algorithms that are frequently misused and improperly implemented. Due to its large key size, temptations to cut corners have run rampant through history in many implementations. Misunderstandings in the use of symmetric versus asymmetric encryption have led to people using RSA to directly encrypt data, direct use of the Diffie-Hellman shared secret result for encryption, duplicate use of keys, and many other security failures.
Between 2005 and 2010, Prof. Daniel J. Bernstein at University of Illinois at Chicago published a few papers that form the basis for much of the underlying cryptographic primitives central to O(1) Cryptography. In particular, the Salsa20/ChaCha20 family of stream ciphers and extended nonce versions, the Poly1305 one-time authenticator, and the elliptic curve Curve25519, were all detailed during this time. A more detailed listing of these various foundational papers are listed in the O(1) wiki, though the common theme behind the choice of primitives for this library are that they, too, are designed with the philosophy that they should be easy to use and hard to misuse. Another interesting theme is that many of these algorithm choices have been included into various IETF standards such as TLS 1.3 and SSH, so their use has clearly become far more widespread than their initial years.
Naturally, I am not the first to develop such a library, and there is prior art that inspires this library. The general idea behind making a cryptographic library that is easy to use and hard to misuse is the central concept of the polyglot Themis cryptographic framework. The choice of algorithms featured have been strongly influnced by Prof. Bernstein’s old NaCl library which formed the basis for libsodium, the essential C library with a similar philosophy (or at least as easy to use as C can be given that it’s C). Special thanks to Frank Denis, the maintainer of libsodium and developer of much of the zig standard crypto library, for further widening the rabbit hole of cryptographic primitives to explore and support, particularly in the field of lightweight cryptography.
This library is still under heavy development, particularly in the area of documentation and the higher level APIs. Much of the primary cryptographic primitives have been ported to Java where needed, and native implementations are also available. While the primary concern is to develop the Java API first, due to the eventual inclusion of C code from the Fiat Cryptography project for formally verified elliptic curve functions, I am also considering what other languages make sense to provide facades for, especially non-JVM ones that would benefit more from the native code. Pure Java versions of the algorithms are all available, though optimized native versions are also provided as an option. Similar to Apache Log4j, it may make sense to create Scala and Kotlin APIs, but those are currently out of scope for initial release.