Should your next web-based login form avoid sending passwords in clear text?

TL;DR: The answer to the question in the title is most likely “no.” While the OPAQUE protocol is a fascinating approach to authentication, for web applications it doesn’t provide any security advantages.

I read an interesting post by Matthew Green where he presents ways to authenticate users by password without actually transmitting the password to the server, in particular a protocol called OPAQUE. It works roughly like this:

The server has the user’s salt and public key, the client knows the password. Through application of some highly advanced magic, a private key materializes in the client, matching the public key known to the server. This only works if the password known to the client is correct, yet the client doesn’t learn the salt and the server doesn’t learn the password in the process. From that point on, the client can sign any requests sent to the server, and the server can verify them as belonging to this user.

The fact that you can do it like this is amazing. Yet the blog post seems to suggest that websites should adopt this approach. I wrote a comment pointing out that this would be pointless. The resulting discussion with another commenter made it obvious that the fundamental issues of browser-based cryptography, which I first saw mentioned in Javascript Cryptography Considered Harmful (2011), still aren’t widely known.

What are we protecting against?

Before we can have a meaningful discussion on the advantages of an approach we need to decide: what are the scenarios we are protecting against? In 2018, there is no excuse for avoiding HTTPS, so we can assume that any communication between the client and the server is encrypted. Even if the server receives the password in clear text, a responsible implementation will always hash the password before storing it in the database. So the potential attacks seem to be:

  • The server is compromised, either because of being hacked or as an inside job. So the attackers already have all the data, but they want to have your password as well. The password is valuable to them either because of password reuse (they could take over accounts on other services) or because parts of the data are encrypted on the server and the password is required for decryption. So they intercept the password as it comes in, before it is hashed.
  • Somebody succeeded with a Man-in-the-Middle attack on your connection, despite HTTPS. So they can inspect the data being sent over the connection and recover your password in particular. With that password they can log into your account themselves.
  • A rather unlikely scenario: a state-level actor recorded the (encrypted) contents of your HTTPS connection and successfully decrypted them after a lengthy period of time. They can now use your password to log into your account.

Does OPAQUE help in these scenarios?

With OPAQUE, the password is never sent to the server, so it cannot be intercepted in transit. However, with web applications the server controls both the server and the client side. So all it has to do is give you a slightly modified version of its JavaScript code on the login page. That code can then intercept the password as you enter it into the login form. The user cannot notice this manipulation: with JavaScript code often running into megabytes these days, inspecting it every time just isn’t possible. Monitoring network traffic won’t help either if the data being sent is obfuscated.

The Man-in-the-Middle attack is no different: somebody who managed to break into your HTTPS connection will also be able to modify JavaScript code in transit. So OPAQUE only helps in the scenario where the attacker has to be completely passive, typically because they only manage to decrypt the data after the fact. With this scenario being extremely uncommon compared to compromised servers, it doesn’t justify the significant added complexity of the OPAQUE protocol.

What about leaked databases?

Very often however, the attackers will not compromise a server completely but “merely” extract its database, e.g. via an SQL injection vulnerability. The passwords in this database will hopefully be hashed, so the attackers will run an offline brute-force attack to extract the original passwords: hash various guesses and test whether the resulting hash matches the one in the database. Whether they succeed depends largely on the hashing function used. While storing passwords hashed with a fast hashing function like SHA-1 is only marginally better than storing passwords as clear text, a hashing function that is hard to speed up such as scrypt or argon2 with well-chosen parameters will be far more resilient.
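The offline attack described above can be sketched in a few lines. This is a minimal illustration, not anybody’s actual tooling: the record layout (one salt and one hash per user), the word list and the scrypt parameters are all assumptions made for the example.

```python
import hashlib
import os

# Hypothetical scrypt parameters; a real deployment would tune these to its hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def slow_hash(password: str, salt: bytes) -> bytes:
    """Memory-hard hash, the kind one hopes to find in a leaked database."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)


def fast_hash(password: str, salt: bytes) -> bytes:
    """Salted SHA-1: cheap to compute, therefore cheap to brute-force."""
    return hashlib.sha1(salt + password.encode("utf-8")).digest()


def crack(stored_hash: bytes, salt: bytes, guesses, hash_func):
    """Offline attack: hash each guess and compare it with the leaked hash."""
    for guess in guesses:
        if hash_func(guess, salt) == stored_hash:
            return guess
    return None


# Simulate a leaked database record for a user with a weak password.
salt = os.urandom(16)
leaked = slow_hash("letmein", salt)

wordlist = ["123456", "password", "letmein", "qwerty"]
recovered = crack(leaked, salt, wordlist, slow_hash)
print(recovered)
```

The attack loop is identical for both hash functions. The only thing that changes is the cost of each call to `hash_func`, which is why the choice of function and parameters decides how many guesses an attacker can afford.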

It is a bit surprising at first, but using OPAQUE doesn’t really change anything here. Even though the database no longer stores the password (not even as a hash), it still contains all the data that attackers would need to test their password guesses. If you think about it, there is nothing special about the client. It doesn’t know any salts or other secrets, it only knows the password. So an attacker could just do everything that the client does to test a password guess. And the only factor slowing them down is again the hashing function, only that with OPAQUE this hashing function is applied on the client side.

In fact, it seems that OPAQUE might make things worse in this scenario. The server’s capability for hashing is well-known. It is relatively easy to tell what parameters will be doable, and it is also possible to throw more hardware at the problem if necessary. But what if hashing needs to be done on the client? We don’t know what hardware the client is running, so we have to assume the worst. And the worst is currently a low-end smartphone with a browser that doesn’t optimize JavaScript well. So chances are that a website deploying OPAQUE will choose comparatively weak parameters for the hashing function rather than risk upsetting some users with long delays.

Can’t OPAQUE be built into the browser?

Adding OPAQUE support to the browsers would address a part of the performance concerns. Then again, browsers that would add this feature should have highly-optimized JavaScript engines and the Web Crypto API already. But the fundamental issue is passwords being entered into an untrusted user interface, so the browser would also have to take over querying the password, probably the way it is done for HTTP authentication (everybody loves those prompts, right?). A compromised web server could still show a regular login form instead, but maybe the users will suspect something then? Yeah, not likely.

But wait, there is another issue. The attacker in the Man-in-the-Middle scenario doesn’t really need your password, they merely need a way to access your account even after they have lost access to your connection. The OPAQUE protocol results in a private key on the client side, and having that private key is almost as good as having the password: it means permanent access to the account. So the browser’s OPAQUE implementation doesn’t merely have to handle the password entry, it also needs to keep the private key for itself and sign requests made by the web application to the server. Doable? Yes, should be. Likely to get implemented and adopted by websites? Rather questionable.

Comments

  • a

    We use a similar algorithm to OPAQUE, and it has one big advantage. When an error occurs we log the request – and this means we don’t see the user password in our logs.

    Wladimir Palant

    Well, I am pretty certain that this can be solved by other means as well :-)

  • D

    This is a good analysis, but I think you slightly overstate things—commenter “a” is, in fact, spot-on.

    The Facebook plaintext password issue (https://krebsonsecurity.com/2019/03/facebook-stored-hundreds-of-millions-of-user-passwords-in-plain-text-for-years/) appears to be exactly what “a” was talking about. Yes, there are other ways to solve the problem, but there’s some advantage in the plaintext HTTP requests not containing plaintext passwords—both for the case that servers mistakenly log request data, and for the case that clients running debuggers do similarly.

    I can’t say this changes the argument you already made about the complexity of implementing OPAQUE vs the probability of passive MITM attacks. That’s a qualitative argument, and I won’t say you are wrong—or right. But attacks or—as in the FB case—accidental disclosures of this nature do occur. It’s undeniably an advantage—perhaps not one sufficient to counter the complexity cost—that OPAQUE mitigates against such risks.

    Wladimir Palant

    Yes, that Facebook issue is quite remarkable. Whether OPAQUE is the best approach to avoid it, is doubtful. I strongly suspect that there would have been easier ways – if anybody thought about it in the first place.

  • T

    Overall a good analysis, very useful - thanks.

    Similar to a, I think that passwords, session tokens, and API keys are susceptible to being logged, and so to the extent that they are long-lived, they probably should not be sent over the wire.

    I wanted to point out that many web properties are fronted by TLS termination devices for various reasons. Common ones include Akamai, MaxCDN, CloudFlare, CloudFront, and Amazon ELB - which together probably account for >50% of internet HTTPS traffic. This presents a noticeable weak point in the TLS assurances, though I'm not at the moment aware of any well-known breaches. On the client side, enterprise TLS-MITM boxes with built in CAs in HSMs certainly exist for corporate customers.

    Furthermore, the last few years have been full of TLS-busting bugs like LOGJAM, BREACH, SMACK-TLS, FREAK, POODLE, 3SHAKE, DROWN, et al. I've been attempting to keep track of all the TLS-breaking bugs and my summary excerpts have reached 150 pages already. Surprisingly, none of them have substantially disrupted the entire ecosystem.

    Your points about the abject failure of IETF HTTP-AUTH are well received. I suspect that a next-gen layer 7 authentication would want to talk directly to the Chrome team and get it implemented there to penetrate the ecosystem - Mozilla would quickly follow and then Microsoft and Apple would grudgingly comply a few years later.

  • c

    Not to mention that we users have no way to tell whether the server implements password hashing or not. Just like in the Facebook case: everyone thought they had server-side password hashing, but they didn't. Why can't FB have some simple password hashing (scrypt probably) on the client side? Then we users could be more confident in trusting FB.

  • Emelia Smith

    If I'm reading your arguments correctly, the main one is that you can't trust that the javascript on the page hasn't been tampered with by a MITM or bad actor; I think the way to mitigate this would be to employ a Content Security Policy that prevents javascript from being run directly on the page; i.e., no script-src=self; Furthermore, you can have the server include Subresource Integrity checksums into the HTML of the page, which the browser will then verify match the checksums calculated for the downloaded javascript.

    Using CSP, you can also assert that all scripts and styles on the page have SRIs, and then have the browser report back to your server or a reporting service when/if the SRIs fail or are missing.

    https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy

    Wladimir Palant

    No, CSP and SRI do not protect against either of these scenarios. If the server is compromised or a MITM attack is ongoing, the attackers can modify CSP and SRI declarations as well, the same way they can manipulate JavaScript code.
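To illustrate the reply above: an SRI value is only a digest of the script, embedded in the HTML that references it. The sketch below computes such a value; the script contents and the `/login.js` path are made up for the example. Whoever can rewrite the HTML can recompute the digest for a modified script just as easily.

```python
import base64
import hashlib


def sri_value(script: bytes) -> str:
    """Return the Subresource Integrity value (SHA-384) for a script body."""
    digest = hashlib.sha384(script).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def script_tag(url: str, script: bytes) -> str:
    """Build a script tag whose integrity attribute matches the given body."""
    return (
        f'<script src="{url}" integrity="{sri_value(script)}" '
        'crossorigin="anonymous"></script>'
    )


# Hypothetical script contents: the legitimate one and a tampered one.
original = b"login();"
tampered = b"stealPassword(); login();"

# An attacker controlling the server or the connection serves the tampered
# script together with a freshly computed, perfectly valid integrity value.
print(script_tag("/login.js", original))
print(script_tag("/login.js", tampered))
```

Both tags pass the browser's integrity check for their respective script bodies, so the check only helps when the HTML comes from a source the attacker cannot modify.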

  • Emelia Smith

    “No, CSP and SRI do not protect against either of these scenarios. If the server is compromised or a MITM attack is ongoing, the attackers can modify CSP and SRI declarations as well, the same way they can manipulate JavaScript code.”

    Well, only If the HTML and scripts come from the same server. If they're different hosts..

    Wladimir Palant

    If they are different hosts, CSP is still irrelevant - it either allows a third party or not, it doesn't help with MITM or compromise of that host. SRI on the other hand solves the issue I mentioned in "Please don't use externally hosted JavaScript libraries" six years ago, namely well-known libraries hosted on untrusted CDNs. I'd argue however that the more relevant issue here is third-party tracking/analytics scripts. Even if the website owner uses SRI to ensure that these don't change, expecting them to check every modification for malicious changes is unrealistic.

  • Adam

    Hey Wladimir, I have been developing an online application system for a small non-profit for international students to fill out application forms for our programs, in which we ask for various sensitive information that needs to be stored securely. The project is still in development phase.

    What are your thoughts on our current security model below?

    Given your thoughts in this article (and other places), would I be correct to assume that this model is more secure than any SRP or PAKE browser-based implementation out there (such as Thinbus SRP)?

    Our Current Security Model:

    • when registering, client is forced to create strong password for account
    • plaintext password is then sent to server from client over HTTPS
    • server immediately hashes pw with SHA-256, and pw hash is used to encrypt/decrypt a user-specific encryption key (randomly generated via libsodium) that unlocks the user's application data
    • our org's registrar admins will also have a (very strong) password (hashed) that can decrypt this same encryption key to access user application data once the user has submitted their app form
    • the admin pw hash is also used to do password resets, since we don't store the user's password anywhere on the server to do an automatic reset

    Thank you for your helpful articles and comments, I am learning a lot!

    Wladimir Palant

    First of all, please don't hash passwords using SHA-256, this offers very little protection against bruteforce attacks. At the very least you should use bcrypt with a reasonable work factor, scrypt or Argon2 are better if you have the necessary server-side resources. Random salt for each user as well please. This way, if your data leaks the attackers will have a much harder time recovering the original passwords from it.

    In case you are thinking that you aren't storing the user's password anywhere, not even in hashed form -- yes, you are, indirectly. A potential bruteforce attack would go against these encrypted keys. A password that can decrypt the key is the correct one.

    What you have here is encryption at rest. It makes sense in case an unauthorized person gains access to the raw data but isn't able to modify server-side code. What it cannot protect against is full server compromise of course -- somebody able to modify the code will intercept admin password hash and use it to decrypt everything.

    But I agree with you that SRP or PAKE won't offer any additional protection in this scenario, that's the whole point of this blog post.
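The hashing advice in this reply (a slow hash function plus a random salt per user) could look roughly like the sketch below. It uses scrypt from the Python standard library because that needs no extra packages. The parameters and the function names `hash_password` and `verify_password` are assumptions for illustration, not a vetted configuration.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Hypothetical parameters; pick the highest cost your server can afford.
SCRYPT_N, SCRYPT_R, SCRYPT_P, KEY_LEN = 2**14, 8, 1, 32


def _scrypt(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=KEY_LEN,
    )


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = os.urandom(16)  # fresh random salt for every user
    return salt, _scrypt(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(_scrypt(password, salt), expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Because every user gets a different salt, two users with the same password end up with different digests, and an attacker has to pay the full scrypt cost for every guess against every account.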

  • Adam

    “...please don't hash passwords using SHA-256...At the very least you should use bcrypt with a reasonable work factor, scrypt or Argon2 are better if you have the necessary server-side resources. Random salt for each user as well please.”

    I'll admit I didn't do my research on this one, so I will definitely take this advice and implement Argon2.

    “What it cannot protect against is full server compromise of course...”

    Indeed. I suppose I could force the user to store their encrypted keys and data on their own computer's file system, but since all crypto would still be server-side, that would probably only increase the chances of a data leak via logs or the very unlikely MITM/state-actor interception over HTTPS, and wouldn't really do anything to mitigate a full server compromise. Not to mention it would degrade the user experience a bit. What are your thoughts on this?

    Something that I forgot to mention originally was that I also am storing an encrypted version of the user's password hash in the user's browser cookies after successful login (destroyed at logout). The key used to encrypt it is a server-wide key stored in plaintext in a config file in a private directory. This key is only used for encrypting/decrypting the user's password hash for further decryption of the user's data key so they can edit their application forms. Is this secure?

    Thank you for your time and advice!

    Wladimir Palant

    You cannot protect against a full server compromise as long as crypto is done on the server side. And even if you were able to move all crypto to the client side, doing so would make little sense as long as "client side" means a web page - the server can change the JavaScript code running there at any time. So this is simply something where you have to keep in mind that you cannot fix it. It matters because it means that there is no point protecting against this scenario elsewhere either.

    That's also why I think that encrypting cookie values makes little sense, as long as the server doesn't store the cookies anywhere (logs, database, any other kind of permanent storage). Cookies being sent to the server is pretty much the same as the password itself being sent, no reason to treat them differently.

    Maybe you are concerned about the cookies being compromised on the user's machine. But you aren't actually protecting against this scenario -- whoever manages to steal cookies doesn't need to decrypt them. They can just send the encrypted cookie and the server will decrypt it for them. So if that's your concern, you had better use session-only cookies if possible (prevents them from being stored on disk) and you should set the HttpOnly attribute (prevents them from being read out by malicious JavaScript code). Plus the usual recommendation of using HTTPS along with Strict-Transport-Security, making sure that cookies cannot be intercepted on the network level.
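The cookie recommendations above can be sketched with the Python standard library. The cookie name `session`, the token value and the HSTS max-age are placeholders chosen for the example. A real application would normally set these through its web framework.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str) -> str:
    """Build the value of a Set-Cookie header for a session-only cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token
    morsel = cookie["session"]
    morsel["path"] = "/"
    morsel["secure"] = True    # only ever sent over HTTPS
    morsel["httponly"] = True  # not readable from JavaScript
    # No Expires or Max-Age attribute: the browser treats it as a session cookie.
    return morsel.OutputString()


# Hypothetical HSTS policy: one year, including subdomains.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")

print("Set-Cookie:", session_cookie_header("hypothetical-random-token"))
print(f"{HSTS_HEADER[0]}: {HSTS_HEADER[1]}")
```

None of these attributes stops somebody who has stolen the cookie from replaying it. They only reduce the number of places from which it can be stolen.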

  • ADA

    I'd like to expand on what 'a' and 'D' said. While Wladimir is correct that you can solve the issue in 'a' using other means, the benefit to never sending the password over the wire in plaintext is that you can implement it once and not have to worry about properly configuring everything that may log the HTTP request to successfully mask out passwords.

    With modern web infrastructure there may be a half dozen to a dozen points where an HTTP request might be logged, some of them not under your control. APMs running on the client may log requests, server-side APMs, web server logs, load balancer logs, logs on the web server itself, etc. Then you get into things like if your traffic passes through a CDN like CloudFront to your origin so that cached responses can be served faster. Again, the CDN could log an HTTP request which includes a password.

    So approach A is to configure all of those to mask out passwords, and hope there are no bugs (which of course there will be), and that the software doesn't change and break your masking configuration. Approach B is not to send the password in clear text to begin with, so that you don't have to lose sleep over something getting misconfigured and logging plaintext passwords like at Facebook.

    And there's yet another recent example of this, with logs for multiple VPN services being exposed due to a misconfigured Elasticsearch (https://www.vpnmentor.com/blog/report-free-vpns-leak/). Cleartext passwords were among those things in the logs.

    In a world where requests may be logged in multiple places and by third parties, it's probably a good practice to review how to keep data secret between just the client and the intended recipient. Relying on just HTTPS secures only the leg between the client and a server, but provides no guarantee that the data will remain known only to the final destination server and won't be logged along the way. Not to mention employees of one of those other services or intermediary servers going rogue, like what may have happened in the recent mass compromise of verified Twitter accounts.

  • BOB

    I agree with ADA. Sending clear-text passwords back to the servers opens up a lot of vulnerabilities, especially in more complex architectures and larger companies, where requests can pass through various intermediary spots. These intermediaries can be part of infrastructure services like CDNs as well as organization services like API gateways. Plus, you have all the code that these requests still have to go through once they reach the origin server. No wonder these issues have affected Facebook and Twitter (and probably many other companies that either never detected the issue, or didn't report it once detected).

    Instead of relying on the security of the whole communication chain, trying to patch every possible vulnerable spot that you have control over and relying on the security of all vulnerable spots that you don't control, as well as trusting hundreds if not thousands of employees, it may be a better idea to not have to deal with the clear-text password hot potato in the first place.

    And this is from the perspective of service providers. From the user perspective, trusting all these services to properly handle their passwords and not store them (either intentionally or unintentionally) can be quite a strong requirement, especially for services aiming to minimize trust requirements, for example by offering edge-to-edge encryption.

    In view of these points, saying that the OPAQUE protocol doesn't offer any security advantage that is worth the implementation complexity, seems like a rather bold statement worth reviewing in more detail.

  • titeuf

    Hi, the problem of logging the cleartext password by mistake is way more common than you think. The same goes for a tcpdump run to debug things between two hosts, after the TLS gateway.

  • J

    I find these concerns very unconvincing. The point of OPAQUE is that you only need to trust that there's no vulnerabilities in the software running directly on the client, whereas with traditional methods you 'also' need to trust there's no vulnerabilities in the server and the whole TLS infrastructure.

    It's certainly not perfect, but even if the benefits were small, any serious security practitioner should at least consider it. We have a responsibility to protect user info

  • Tobias

    “With OPAQUE, the password is never sent to the server, so it cannot be intercepted in transit. However, with web applications the server controls both the server and the client side. So all it has to do is giving you a slightly modified version of its JavaScript code on the login page. That code can then intercept the password as you enter it into the login form. The user cannot notice this manipulation, with JavaScript code often going into megabytes these days, inspecting it every time just isn’t possible.”

    On the web, code signing has long been considered dead. However, with the attention that the web apps of encrypted messengers are getting, it seems to be experiencing something of a resurgence. Some are using the Signed Pages browser plugin: https://chrome.google.com/webstore/detail/signed-pages/pdhofgeoopaglkejgpjojeikbdmkmkbp So, for the login form (where the password is entered by the user) you could make the web app direct to a code-signed login page (where the user would have to pay attention to the "green check mark", and to the URL and TLS of course) and from there redirect back.

    I personally would prefer the aPAKE login feature to be standardized and baked into the browser like FIDO2/WebAuthn – or even better: extend those FIDO2 plugins already in use by an aPAKE/OPAQUE feature (ideally including HMAC Authentication Header for the authenticated session's requests to be signed with the exchanged key K).

  • Mitar

    I am curious what you think of arguments listed here FOR using SPA/OPAQUE (e.g., Heartbleed and TLS-stripping plugins in enterprise settings).

  • Mitar

    After reading through the OPAQUE spec draft myself, I tend to agree that OPAQUE makes a trade-off and increases security of the password in transit for less security of the password in the database. I made an issue about this with more details.

    If the goal is really to prevent logging passwords in transit (by accident) or intercepting them in TLS stripping proxies, I think a much simpler solution is to establish a Diffie–Hellman key between client and server, encrypt the password with the key, and send it over. The server then decrypts it, hashes it, and stores it or compares it with the one in the database. So in a way, if you do not trust TLS for confidentiality, you can create your own encrypted channel. You can still rely on TLS for integrity and authentication of the server for the use cases I listed above.

  • Joel

    About storing passwords in the database. Would the following be a good solution to make the stored password more secure in the event of a database hack? We can create several columns for storing the password, e.g. columns password_1 to password_7 (or they can be named "email" or "username" to confuse, and then the email address/username would be stored in the "password" column).

    The real password can be stored in one of the password columns. In the remaining six we can store a hashed random string of characters. Secondly, when hashing, we can alternate between various hashing functions and use them multiple times, e.g.

    Our password is $password. Just a simple example:

    $password = hash('sha384', $password);
    $password = hash('sha256', $password);
    $password = hash('sha512', $password);
    $password = hash('gost', $password);
    $h1 = hash('md5', $password);
    $h2 = hash('md5', $h1);
    $h3 = hash('md5', $h2);
    $h4 = hash('md5', $h3);
    $hash_db = $h1 . $h2 . $h3 . $h4;

    (Alternatively, the last hash function could be 'sha512', and we would then take a substring of, for example, 15 characters and store that in the database.)

    We store $hash_db in the database. The hash is 128 characters long, it looks like the sha512 algorithm is used.

    If the database were compromised, wouldn't such a simple solution make it significantly more difficult for hackers to guess the password?

    Wladimir Palant

    No, this solution would make it considerably easier to retrieve the passwords.

    1. Never hash passwords with fast hashing functions like MD5 or SHA2, and especially never hash them without a salt. Use a slow hashing function like bcrypt, scrypt or Argon2, and use a salt that is unique for each user.
    2. Do not assume that your database will be compromised but the code used to write into the database won’t be. Typically, the hackers will take a look at your code and laugh at your silly attempts to fool them.
    3. Even if the hackers don’t get your code, all they need is the data of their own account where they know the password. They hash their own password and see immediately where/how that hash is stored.

  • Joel

    You clearly have a good head on your shoulders. These weak algorithms are just an example. In fact, we can use a strong algorithm first and then use weaker ones. Does multiple hashing with alternating algorithms make sense?

    My antivirus detected a threat under the link 'Markdown syntax'.

    Wladimir Palant

    No, multiple hashing doesn’t make any sense. It’s security by obscurity, not helping at all.

    Antivirus software always detects threats, that’s what it is there for. Whether any of these threats are real is a different question.