JWT
src: JWT is Awesome: Here’s Why – The HFT Guy - 2022-03-12
Here’s my take on JWT and why I think it’s amazing*, after having migrated hundreds of applications in JP Morgan from a legacy authentication solution.
*JWT = JSON Web Token
1) Pro: JWT is standardized and supported in most languages
Adding authentication to an application requires developers to integrate it tightly while writing the application. Authentication code is tough work, as anybody who ever had to do it can attest.
JWT can take a lot of that pain away. JWT has ready-to-use libraries for most languages and they pretty much just work. See a list on https://jwt.io/#libraries-io
Even better, it is a standardized format with examples and documentation freely available on the internet. Any trouble with it and help is only a Google search away. That wouldn’t be the case with any in-house developed alternative.
2) Pro: JWT supports user attributes out-of-the box
Authentication is there to guarantee who the user is. A user is usually represented by some sort of unique user identifier.
Most authentication solutions provide the user identifier, only the user identifier, and that’s exactly where they fall short.
Turns out a user is more than an identifier! From a decade working around development and authentication, I can tell you there are at least 3 things needed to cover most application use cases.
- A unique user identifier. A unique identifier per user, meant for computers. Usually an id number straight out of a user database.
- A display name. A string to display back to the user that should make sense to themselves, meant for human comprehension. Usually a combination of first/middle/last names or an email address.
- An email address. The email address of the user. Usually needed by a fair amount of applications/reports/systems to email the user after some actions.
Having additional user attributes is trivial with JWT tokens and they can easily be used (by developers) in applications. A JWT token is basically a JSON key=value structure, with some of the keys being standardized.
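To make the "JSON key=value structure" concrete, here is a minimal sketch that reads the three attributes out of a token using only the Python standard library. The token, the user values and the fake signature are made up for illustration, and `read_claims` is a hypothetical helper name. It only parses the payload and does not verify the signature, so it is not an authentication check by itself.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url without padding, the encoding JWT uses for each segment."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Reverse of b64url_encode; restores the padding that JWT strips."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_claims(token: str) -> dict:
    """Return the payload of a header.payload.signature token.

    This only parses. It does NOT check the signature, so the result
    must not be trusted for authentication on its own.
    """
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Hypothetical token: the values and the placeholder signature are made up.
header = {"alg": "RS256", "typ": "JWT"}
claims = {
    "sub": "1234567",                 # unique user identifier
    "name": "Jane Doe",               # display name
    "email": "jane.doe@example.com",  # email address
}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(claims).encode("utf-8")),
    "signature-omitted",
])

user = read_claims(token)
print(user["sub"], user["name"], user["email"])
```

In a real application a JWT library would do the parsing and the signature check in one call; the point here is only that the attributes are plain JSON keys.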
If it were not for JWT, developers would have to integrate not only some made-up authentication system, but also some made-up user attributes system, in order to perform the simplest of tasks, like showing users their own name.
3) Pro: JWT is unicode friendly out-of-the-box
JWT and JSON underneath are explicitly utf-8. They handle non-English names well.
Actually it’s more than that. They have been strictly unicode from day one, so they pretty much always work, and in all programming languages.
For example, JWT libraries in python (2.7) and C++ work with unicode strings. This is great because it doesn’t leave developers a choice: they MUST handle unicode somehow or the application WILL blow up right away (type error and such).
This and the rise of python 3 will eventually get the world to a place where applications work with non-English characters. Well, at least not mangling the user name.
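As a small illustration of the UTF-8 point, the sketch below pushes a non-English name through the same steps a JWT payload goes through (JSON, UTF-8 bytes, base64url) and back. The name is an arbitrary example, and this is a stdlib-only approximation of what a JWT library does internally.

```python
import base64
import json

name = "Zoë Müller 山田太郎"  # arbitrary example name

# JSON text is serialized as UTF-8 bytes, then base64url-encoded for transport.
payload = json.dumps({"name": name}, ensure_ascii=False).encode("utf-8")
segment = base64.urlsafe_b64encode(payload).rstrip(b"=")

# The receiving side reverses both steps and gets the same string back.
padded = segment + b"=" * (-len(segment) % 4)
decoded = json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))

print(decoded["name"] == name)
```

What travels over the wire is pure ASCII, so headers and cookies never see the non-English characters directly.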
To be clear. The biggest obstacle to internationalization is programming languages themselves, see str() and std::string(), that don’t support non-English out-of-the-box at all. Even if one is willing to go the extra mile to make it work, they quickly hit a wall because every other function and library is implicitly limited to “standard” strings.
4) Pro: JWT doesn’t DDoS authentication servers
What is the most common recurring source of company wide outage?
[To clarify, I mean truly major outages, not like just one application or ten being down.]
.
.
.
[building up the tension]
.
.
.
You guessed it, it’s the authentication system. If authentication services are down, everything else is down. Well, it’s up but may not be quite usable.
Needless to say, this happened before and will happen again. Buggy applications DDoSing critical services in an infinite loop are a recurring source of issues.
JWT can totally alleviate this. JWT tokens are signed with an asymmetric key and can be verified offline by applications using the matching public key. No need to call an external service and have a gigantic single point of failure.
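Here is a rough sketch of what "verified offline" means in code. One loud assumption: the Python standard library has no RSA or ECDSA, so this sketch swaps in HS256 (HMAC-SHA256), which is a symmetric scheme where the verifier holds the same secret as the issuer. With asymmetric keys the verifier would hold only a public key, but the offline property is the same: everything below runs locally, with no call to an authentication server. The secret, the timestamps and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-secret-not-for-production"  # hypothetical key


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign(claims: dict, key: bytes) -> str:
    """Issuer side: build header.payload.signature with HS256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify(token: str, key: bytes, now: Optional[float] = None) -> dict:
    """Application side: check signature and expiry locally, no network call."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":  # accept only the algorithm we expect
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


issued_at = 1_700_000_000  # fixed timestamp so the example is reproducible
token = sign({"sub": "1234567", "iat": issued_at, "exp": issued_at + 300}, SECRET)
claims = verify(token, SECRET, now=issued_at + 10)
print(claims["sub"])
```

A production application would use one of the JWT libraries mentioned earlier rather than hand-rolled code like this.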
Authentication is the single most important thing in the universe and without authentication the universe stops running. source
5) Pro: Dramatic decrease in latency
Putting aside DDoS scenarios, latency during normal operation is an important concern.
Consider that everything nowadays is a mesh of services (#Microservice). When a user sends a request, the web server calls another service, that calls another service, that calls a database…
Each step may involve some authentication check. Even if a single check is quick (below 100 milliseconds), this is not as quick when repeated N times. What about 99th percentile latency? What if systems are under load and take whole second(s) to complete? You guessed it, latency can add up really quickly and it’s not great at all.
JWT tokens can be verified offline by applications, making the latency effectively zero. Once again, JWT shines.
How long should an HTTP call take, by the way? One may be tempted to say a few milliseconds, roughly the time to open a connection and check whether the token is in Redis. But that doesn’t account for multi-datacenter scenarios. Global companies (and I’d hope most companies aim to serve users in the US and the EU, if not Asia) have datacenters within and across continents, meaning latency in the tens and in the hundreds of milliseconds respectively.
Fun fact: Did you know that Australia is a continent? People always forget about it. Being 12 000 km from North America, network round trip time is a minimum of 80 milliseconds. Bet users there can feel the latency every time they click a button on a US website.
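The 80 milliseconds figure is simply the speed of light over 12 000 km and back. The back-of-the-envelope sketch below reproduces it and also adds up the per-hop authentication checks discussed above. Two assumptions that are not in the original: light in optical fiber travels at roughly two thirds of its vacuum speed, and the request chain has 5 hops.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum
FIBER_FACTOR = 2 / 3  # assumption: light in fiber travels at roughly 2/3 of c


def min_round_trip_ms(distance_km: float, speed_factor: float = 1.0) -> float:
    """Physical lower bound on round trip time, ignoring routing and processing."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * speed_factor)
    return 2 * one_way_s * 1000


def chained_checks_ms(hops: int, per_check_ms: float) -> float:
    """Total latency added when every hop performs one remote auth check."""
    return hops * per_check_ms


vacuum_ms = min_round_trip_ms(12_000)
fiber_ms = min_round_trip_ms(12_000, FIBER_FACTOR)
added_ms = chained_checks_ms(hops=5, per_check_ms=100)  # hypothetical 5-hop chain

print(f"vacuum: {vacuum_ms:.0f} ms, fiber: {fiber_ms:.0f} ms, 5 checks: {added_ms:.0f} ms")
```

So 80 ms is the floor that physics allows; over real fiber the floor is closer to 120 ms, before any routing or server time.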
6) Pro: JWT is secure
HTTPS, HMAC and asymmetric keys. It’s all basic and proven.
If you tried to roll your own, either you’d end up with pretty much the same, or you’d follow top Stack Overflow answers (that have not been updated in the past 10 years) and end up with obsolete insecure stuff (#MD5).
7) Pro: Don’t roll your own
Alright, let’s roll our own authentication tokens.
So… quick thinking… we need to have some sort of tokens with:
- a user identifier in the token,
- a way to retrieve user attributes (another system?)
- when was the token created (iat) and is it still valid (exp)
- which settings were used (alg + kid) (really hard to do future upgrades if missing)
- what created the token and why (issuer + aud) (probably don’t care about that)
Sounds about right, so let’s put that in a dict and pickle it.
OMG DON’T PICKLE DATA. PICKLING IS INSECURE. Pickling doesn’t generate an innocent string but instructions to (re)generate the object, and those instructions can call arbitrary python functions. Parsing back means executing them. Wherever pickle is used to load data that users can influence, they can simply send arbitrary commands and get them executed on the authentication servers.
Pickle, the #1 source of critical vulnerabilities in python applications since 1991.
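For the skeptics, here is a harmless demonstration. The class name `Payload` is made up, and the embedded call is a trivial `eval("6 * 7")` standing in for whatever an attacker would really send. Loading the bytes does not rebuild an object, it runs the call.

```python
import json
import pickle


class Payload:
    """An object whose pickle tells the loader to call a function."""

    def __reduce__(self):
        # On load, pickle calls eval("6 * 7"). A real attacker would
        # substitute something far less innocent.
        return (eval, ("6 * 7",))


malicious_bytes = pickle.dumps(Payload())

# Loading does not give back a Payload: it executes the embedded call.
result = pickle.loads(malicious_bytes)
print(result)  # 42

# JSON parsing only ever yields dicts, lists, strings, numbers, booleans, None.
data = json.loads('{"sub": "1234567"}')
print(type(data).__name__)  # dict
```

That difference is why a signed JSON document is a saner token format than a pickled dict.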
Anyway, moving on… you’re probably fine if not working in Python, or Ruby.
</end of rant>
The end result is something like a key=value structure to be serialized and signed somehow.
Guess what JWT is? JWT is basically a JSON encoded object that’s signed. Might as well go with that, no need to reinvent the wheel. They probably gave some thought to common use cases and edge cases, unlike this 5-minute brainstorming.
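The brainstormed list maps almost one to one onto standard JWT fields. The sketch below spells that mapping out and adds a toy validity check. All values (key id, issuer, audience, one-hour lifetime) are hypothetical, `check_claims` is an illustrative helper rather than any library's API, and it treats `aud` as a single string although the standard also allows a list.

```python
# Header: which settings were used.
header = {
    "alg": "RS256",        # signature algorithm
    "kid": "2022-03-key",  # hypothetical key identifier
}

now = 1_700_000_000  # fixed timestamp so the example is reproducible

# Payload: who the user is, plus validity and provenance.
claims = {
    "sub": "1234567",                     # user identifier
    "name": "Jane Doe",                   # user attributes travel in the token
    "email": "jane.doe@example.com",
    "iat": now,                           # when the token was created
    "exp": now + 3600,                    # assumption: one-hour lifetime
    "iss": "https://example.com/oauth2",  # hypothetical issuer
    "aud": "reporting-app",               # hypothetical audience
}


def check_claims(claims, expected_iss, expected_aud, at):
    """Return a list of problems; an empty list means the claims look fine."""
    problems = []
    if claims.get("exp", 0) <= at:
        problems.append("expired")
    if claims.get("iat", 0) > at:
        problems.append("issued in the future")
    if claims.get("iss") != expected_iss:
        problems.append("wrong issuer")
    if claims.get("aud") != expected_aud:
        problems.append("wrong audience")
    return problems


print(check_claims(claims, "https://example.com/oauth2", "reporting-app", now + 10))
print(check_claims(claims, "https://example.com/oauth2", "reporting-app", now + 7200))
```

These checks come on top of the signature check, which is left out here to keep the mapping readable.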
8) Myth: JWT is decentralized. (It is not unless you make it so)
I’ve said that JWT allows verifying tokens offline, in a decentralized fashion. Fact is, JWT can be fully centralized. Here’s how.
JWT is most often used alongside OpenID Connect. JWT defines a token format and OpenID Connect defines an API to initiate authentication over HTTP. See related article on the difference between id_token and access_token in OpenID Connect.
When using OpenID Connect, the server usually provides a /userinfo endpoint like https://example.com/oauth2/v2.0/userinfo. This endpoint will return user attributes when called with a user token. Note that few attributes -if any- are usually stored inside tokens in order to keep them small.
It’s possible to operate in a centralized fashion by (exclusively) verifying tokens through the /userinfo endpoint, rather than verifying token signatures offline.
Both modes of operation are supported in most setups. To prevent either, disable the /userinfo endpoint or do not distribute public signing keys.
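A minimal sketch of the centralized mode, using only the Python standard library: the application sends the token to /userinfo as a Bearer credential and lets the server decide. The URL is the example one from above, the function names and the 2 second timeout are assumptions, and `fetch_userinfo` is not exercised here since it needs a live OIDC server.

```python
import json
import urllib.request

USERINFO_URL = "https://example.com/oauth2/v2.0/userinfo"  # example URL from the text


def build_userinfo_request(access_token: str) -> urllib.request.Request:
    """Prepare the /userinfo call; the token travels as a Bearer credential."""
    return urllib.request.Request(
        USERINFO_URL,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_userinfo(access_token: str, timeout_s: float = 2.0) -> dict:
    """Centralized check: the server validates the token on every call.

    urllib raises HTTPError on a 4xx/5xx answer, which is how a rejected
    (expired, logged out, invalid) token shows up to the caller.
    """
    request = build_userinfo_request(access_token)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)


request = build_userinfo_request("hypothetical-access-token")
print(request.get_method(), request.full_url)
```

This is exactly the network round trip that offline verification avoids, which is the trade-off between the two modes.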
9) Myth: JWT doesn’t support logout or invalidation. (It can with OpenID Connect)
JWT tokens are emitted and valid for some amount of time. See the issued at and expires on attributes in the JWT standard (iat, exp). It’s fair to say that tokens are not meant to be invalidated. They’re rather intended to live for a defined period of time, that ought to be reasonably short.
It’s entirely possible to have centralized logout and invalidation if that’s all one wishes to have. See above “JWT is (not) decentralized”. This simply requires operating centrally with the /userinfo endpoint. The OIDC server should also be configured to expose the /logout endpoint to be able to close sessions (not all OIDC providers support it). That’s it. It’s built in.
Note: One stupid alternative that I strongly advise against, but that people often talk about on the internet, is to maintain some blacklist of tokens (in some database or Redis). Then comes the trouble of maintaining another anti-authentication system and integrating it in libraries and applications (All applications will connect to a Redis somehow? That is always online and globally distributed?). If centralized logout is an absolute must have, then set up OIDC to work centrally and let the OIDC server do the work it’s meant to do.
Last but not least. It’s possible to invalidate all (current) authentication tokens at once. Simply rotate the signing keys. Previously created tokens won’t be accepted as valid anymore.
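A toy sketch of why rotation invalidates everything at once. Assumptions: HMAC from the standard library stands in for the real asymmetric signature, the `keyring` dict stands in for the set of keys that verifiers currently trust, and the key ids and secrets are made up.

```python
import hashlib
import hmac

# Hypothetical keyring: key id (kid) -> key currently trusted by verifiers.
keyring = {"key-1": b"old-secret"}


def sign(message, kid):
    """Return (kid, mac); the kid plays the role of the JWT header's kid."""
    return kid, hmac.new(keyring[kid], message, hashlib.sha256).digest()


def is_valid(message, kid, mac):
    key = keyring.get(kid)
    if key is None:  # the key was rotated out: nothing signed with it passes
        return False
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)


message = b'{"sub":"1234567"}'
kid, mac = sign(message, "key-1")
before = is_valid(message, kid, mac)

# Rotation: retire the old key and start signing with a new one.
del keyring["key-1"]
keyring["key-2"] = b"new-secret"
after = is_valid(message, kid, mac)

print(before, after)  # True False
```

One caveat: applications that cache signing keys keep accepting old tokens until they refresh their copy, so the cut-off is only as immediate as that refresh.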
10) Con: JWT token Size
One minor downside of JWT tokens is their size. A typical token is 500 to 1000 bytes in practice. It could be considered fairly large when compared to a blackbox token like a 128-bit integer or a GUID.
Arbitrary user attributes can be set in tokens. They could easily consume multiple kilobytes and upward if not careful and that will cause issues down the line.
Attributes should be kept to a minimum to limit token size. Enabling all optional attributes is an anti-pattern; only enable the attributes that are needed. Tokens are not suited to store long strings. For example a postal address, let alone multiple addresses, is not a good fit for token attributes.
Over HTTP, authentication tokens are passed through the Cookie or the Authorization header. HTTP request headers are limited to 8 kB in most web servers, covering normal HTTP request headers and all user cookies and the token if any. The limit can easily be reached accidentally. Applications running on the same domain, as in https://internal.example.com/..., are sharing the same cookie namespace and are more likely to run into issues from numerous or large cookies.
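To get a feel for how quickly the budget goes, here is a rough size estimate. The 8 kB limit is the figure from the paragraph above (actual limits depend on the server and its configuration), and the header values are invented: a shared-domain cookie jar that has grown to 6000 bytes, plus a token carrying 2 kB of attributes.

```python
HEADER_LIMIT_BYTES = 8 * 1024  # the 8 kB figure from the text; servers vary


def base64url_length(raw_bytes: int) -> int:
    """Length of unpadded base64url output: 4 characters per 3 input bytes."""
    return (4 * raw_bytes + 2) // 3


def request_header_bytes(headers: dict) -> int:
    """Approximate on-the-wire size: 'Name: value' plus CRLF for each header."""
    return sum(
        len(f"{name}: {value}\r\n".encode("utf-8"))
        for name, value in headers.items()
    )


# Hypothetical request: many cookies on a shared domain plus a fat token.
token_chars = base64url_length(2000)  # 2 kB of claims grows by a third
headers = {
    "Host": "internal.example.com",
    "Cookie": "session=" + "x" * 6000,
    "Authorization": "Bearer " + "y" * token_chars,
}

total = request_header_bytes(headers)
print(total, total > HEADER_LIMIT_BYTES)
```

Neither the cookies nor the token is outrageous on its own; it is the sum that crosses the limit.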
Conclusion
That was a lot of pros and two myths busted.
Did I say JWT is awesome?
Long live JWT! …unless the token was logged out.