more rearranging

Brian Warner 2013-12-16 15:00:01 -08:00
Parent c0d48a40b7
Commit 6e8450526f
1 changed file: 105 additions and 122 deletions

@@ -34,7 +34,7 @@ After the client submits /account/create, it performs the "/session/auth" login
# Login: Obtaining the sessionToken
To connect a browser to an existing account, we use the following login protocol to transform an email+password pair into a sessionToken. This will be used in the next section to obtain signed certificates.
To connect a browser to an existing account, we use the following login protocol to transform an email+password pair into a sessionToken. The sessionToken will be used in the next section to obtain signed certificates.
This protocol starts by feeding the password and email address into 1000 rounds of PBKDF2 to obtain "stretchedPW", feeding stretchedPW into HKDF to get "authPW", then delivering email+authPW to the server's `/auth/password` endpoint.
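The client-side derivation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: using the email address as the PBKDF2 salt, the choice of SHA-256, the 32-byte output lengths, and the `example/authPW` info label are all placeholders, not values defined by this document.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def client_auth_pw(email: str, password: str) -> bytes:
    """Derive authPW from email+password (hypothetical parameter choices)."""
    # Assumption: the email address acts as the PBKDF2 salt.
    stretched_pw = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), email.encode("utf-8"), 1000, dklen=32
    )
    # Assumption: "example/authPW" is a placeholder HKDF info label.
    return hkdf_sha256(stretched_pw, info=b"example/authPW", length=32)
```

Only the email address and the resulting authPW would be sent to the server; stretchedPW itself stays on the client.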
@@ -42,45 +42,15 @@ This protocol starts by feeding the password and email address into 1000 rounds
The server uses the email address to look up the database row, extracts authSalt, performs 64k/8/1 scrypt stretching to obtain "bigStretchedPW", feeds bigStretchedPW into HKDF to obtain "verifyHash", then compares verifyHash against the stored value. If they match, the client has proved knowledge of the password, and the server creates a new session. The server returns the sessionToken to the client, along with its account identifier (uid).
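A minimal sketch of the server-side check, assuming "64k/8/1" means scrypt with N=65536, r=8, p=1. The row layout, the 32-byte salt, the `example/verifyHash` info label, and the function names are hypothetical; the constant-time comparison is a conventional precaution, not something this document specifies.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def verify_hash(auth_pw: bytes, auth_salt: bytes, n: int = 64 * 1024) -> bytes:
    """scrypt (N=n, r=8, p=1) followed by HKDF, as in the paragraph above."""
    big_stretched_pw = hashlib.scrypt(
        auth_pw, salt=auth_salt, n=n, r=8, p=1,
        maxmem=256 * n * 8,  # headroom above the roughly 128*n*r bytes scrypt needs
        dklen=32,
    )
    # Assumption: placeholder HKDF info label.
    return hkdf_sha256(big_stretched_pw, info=b"example/verifyHash", length=32)


def new_account_row(auth_pw: bytes, n: int = 64 * 1024) -> dict:
    """Hypothetical database row holding the salt and the stored verifier."""
    auth_salt = os.urandom(32)
    return {"authSalt": auth_salt, "verifyHash": verify_hash(auth_pw, auth_salt, n)}


def check_auth_pw(row: dict, auth_pw: bytes, n: int = 64 * 1024) -> bool:
    """Recompute verifyHash and compare it against the stored value."""
    candidate = verify_hash(auth_pw, row["authSalt"], n)
    return hmac.compare_digest(candidate, row["verifyHash"])
```

The `n` parameter is exposed only so the sketch can be exercised quickly with a smaller cost; a real deployment would fix the parameters.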
In the future, the `/auth/password` endpoint may also accept two-factor authentication data. If so, it is likely to return a "2FA-required" error to the first request, with information on what additional UI should be displayed to solicit the additional data.
# Creating a Session
The `/auth/password` call should also include information about the client device, such as a host name, profile name, model number, etc. This will be used to describe the session to the user later, when they enumerate their active sessions (for review and possible revocation).
For calls which accept an authToken, the client uses authToken to derive three values:
## Creating a Session
* tokenID
* reqHMACkey
* requestKey
Each successful `/auth/password` call results in a new session (with a unique+unguessable sessionToken). The server can support multiple sessions per account (typically one per client device, plus perhaps others for account-management portals). The sessionToken lasts forever (until revoked by a password change or explicit revocation command), and can be used an unlimited number of times.
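The session lifecycle described here (one unguessable token per login, several sessions per account, revocation individually or all at once) could be modelled with an in-memory table like the one below. The class and method names are hypothetical, and a real server would persist this and index by tokenID rather than by the raw token.

```python
import os
from typing import Dict, List, Optional, Tuple


class SessionStore:
    """Hypothetical in-memory session table: sessionToken -> (uid, device info)."""

    def __init__(self) -> None:
        self._sessions: Dict[bytes, Tuple[str, str]] = {}

    def create(self, uid: str, device_info: str) -> bytes:
        # Each successful login gets a fresh random 32-byte token.
        session_token = os.urandom(32)
        self._sessions[session_token] = (uid, device_info)
        return session_token

    def uid_for(self, session_token: bytes) -> Optional[str]:
        entry = self._sessions.get(session_token)
        return entry[0] if entry else None

    def devices(self, uid: str) -> List[str]:
        # Lets the user enumerate active sessions for review.
        return [info for (owner, info) in self._sessions.values() if owner == uid]

    def revoke(self, session_token: bytes) -> None:
        # Explicit revocation of a single session.
        self._sessions.pop(session_token, None)

    def revoke_all(self, uid: str) -> None:
        # E.g. after a password change: every session for the account ends.
        for token in [t for t, (owner, _) in self._sessions.items() if owner == uid]:
            del self._sessions[token]
```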
The client uses tokenID and reqHMACkey for a HAWK (https://github.com/hueniverse/hawk/) request to the "POST /session/create" API, using tokenID as "credentials.id" and reqHMACkey as "credentials.key". The server uses tokenID to look up the corresponding token, then derives reqHMACkey to validate the request.
Each authToken-using call then derives additional API-specific values from requestKey. /session/create uses two derived values:
* respHMACkey
* respXORkey
When the server receives a valid /session/create request, it allocates sessionToken and keyFetchToken, concatenates them, encrypts the pair by XORing it with the derived respXORkey, and attaches a MAC generated with respHMACkey. The encrypted MACed bundle is returned to the client.
The client recomputes the MAC, compares it (throwing an error if it doesn't match), extracts the ciphertext, XORs it with the derived respXORkey, then splits it into the separate keyFetchToken and sessionToken values.
The server can support multiple sessions per account (typically one per client device, plus perhaps others for account-management portals). There can also be multiple outstanding keyFetchTokens. The sessionToken lasts forever (until revoked by a password change or explicit revocation command), and can be used an unlimited number of times. The keyFetchToken expires after 60 seconds, and is single-use.
# Signing Certificates
The sessionToken is used to derive two values:
* tokenID
* request HMAC key
[[images/IdPAuth-use-session.png]]
The requestHMACkey is used in a HAWK request to provide integrity over many APIs, including /certificate/sign. requestHMACkey is used as credentials.key, while tokenID is used as credentials.id. HAWK includes the URL and the HTTP method ("POST") in the HMAC-protected data, and will optionally include the HTTP request body (payload) if requested.
For /certificate/sign, it is critical to enable payload verification by setting options.payload=true (on both client and server). Otherwise a man-in-the-middle could submit their own public key, get it signed, and then delete the user's data on the storage servers.
The following keyserver APIs require a HAWK-protected request that uses the sessionToken. In addition, some require that the account be in the "verified" state:
Many keyserver APIs require a HAWK-protected request that uses the sessionToken. Some of them require that the account be in the "verified" state:
* GET /account/devices
* POST /session/destroy
@@ -88,31 +58,31 @@ The following keyserver APIs require a HAWK-protected request that uses the sess
* POST /recovery_email/resend_code
* POST /certificate/sign (requires "verified" account)
# Signing Certificates
Clients who have an active sessionToken, for an account on which the email address has been verified, can use the `/certificate/sign` endpoint to obtain a signed BrowserID/Persona certificate. This certificate can then be used to produce signed Persona assertions for delivery to RPs.
The sessionToken is used to derive two values:
* tokenID
* request HMAC key
[[images/IdPAuth-use-session.png]]
The requestHMACkey is used in a HAWK request to provide integrity over many APIs, including /certificate/sign. requestHMACkey is used as credentials.key, while tokenID is used as credentials.id. HAWK includes the URL and the HTTP method ("POST") in the HMAC-protected data, and will optionally include the HTTP request body (payload) if requested.
For /certificate/sign, it is critical to enable payload verification by setting options.payload=true (on both client and server). Otherwise a man-in-the-middle could submit their own public key, get it signed, and control the user's account when speaking to other relying parties (including deleting the user's data on the Sync storage servers).
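To see why payload coverage matters, consider the simplified request MAC below. It is deliberately not HAWK's real normalized-string format (no timestamp, nonce, host, or port); it only models the one property under discussion: whether the body is included in the HMAC-protected data. All names are hypothetical.

```python
import hashlib
import hmac


def request_mac(key: bytes, method: str, url: str, body: bytes,
                cover_payload: bool) -> bytes:
    """Simplified stand-in for a HAWK-style MAC (NOT the real HAWK format)."""
    parts = [method, url]
    if cover_payload:
        # With payload verification, a hash of the body is part of the MAC input.
        parts.append(hashlib.sha256(body).hexdigest())
    return hmac.new(key, "\n".join(parts).encode("utf-8"), hashlib.sha256).digest()


def server_accepts(key: bytes, method: str, url: str, body: bytes,
                   mac: bytes, cover_payload: bool) -> bool:
    expected = request_mac(key, method, url, body, cover_payload)
    return hmac.compare_digest(mac, expected)


key = b"k" * 32
url = "/certificate/sign"
honest_body = b'{"publicKey": "client-key"}'
attacker_body = b'{"publicKey": "attacker-key"}'

# Without payload coverage, the MAC for the honest request also validates
# a request whose body was swapped in transit.
mac_no_payload = request_mac(key, "POST", url, honest_body, cover_payload=False)
swapped_ok = server_accepts(key, "POST", url, attacker_body, mac_no_payload, False)

# With payload coverage, the swapped body no longer matches the MAC.
mac_payload = request_mac(key, "POST", url, honest_body, cover_payload=True)
swapped_rejected = not server_accepts(key, "POST", url, attacker_body, mac_payload, True)
```

In this model `swapped_ok` and `swapped_rejected` both come out true, which is the substitution attack the paragraph above warns about and the reason both sides need payload verification enabled.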
# Fetching Sync Keys
If the client also wants kA/kB for Sync, it adds `?service=sync` to the endpoint URL (thus `/auth/password?service=sync`). When the server sees this, in addition to creating a sessionToken, it also creates a `keyFetchToken` and extracts a second value from its HKDF call named `stretchWrap`. It then returns sessionToken, keyFetchToken, and stretchWrap to the client.
If the client also wants kA/kB for Sync, it adds `?service=sync` to the endpoint URL during initial login (thus `/auth/password?service=sync`). When the server sees this, in addition to creating a sessionToken, it also creates a `keyFetchToken` and extracts a second value from its HKDF call named `stretchWrap`. It then returns sessionToken, keyFetchToken, and stretchWrap to the client.
The client will use keyFetchToken below to obtain kA and wrap(kB). It will then combine another derivative of stretchedPW with stretchWrap to derive `unwrapBKey`, from which it can obtain the unwrapped kB.
# After Login: Using the keyFetchToken
During login the server allocates and returns a new (random 32-byte) token: the long-lived sessionToken.
If the client asks for sync keys, it allocates and returns another token named keyFetchToken. This is a single-use token; its expiration time has not yet been decided.
If the client asks to change the password, it allocates and returns a new accountResetToken. This is a single-use token which expires quickly, perhaps within 10 minutes.
After the authToken is acquired, the client can create a session and fetch the encryption keys. The high-level flow looks like this:
[[images/IdPAuth-session-start.png]]
# Obtaining keys kA and kB
The single-use keyFetchToken allows the client to retrieve kA and wrap(kB), which enables the client to encrypt and decrypt browser data (bookmarks, open-tabs, etc) correctly. As above, the keyFetchToken is used to derive tokenID, reqHMACkey, respHMACkey, and respXORkey, which are used in a HAWK request to the "GET /account/keys" API.
[[images/justauth-overview-keys-weak.png]]
The keyFetchToken is used to derive tokenID and reqHMACkey, which are used in a HAWK request to the "GET /account/keys" API. It is also used to derive keyRequestKey, from which respHMACkey and respXORkey are derived.
The server pulls kA and wrap(kB) from the account table, concatenates them, encrypts the pair by XORing it with the derived respXORkey, and attaches a MAC generated with respHMACkey.
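The derive, XOR, and MAC steps for the key bundle might look like the sketch below. The HKDF info labels, the 32-byte sizes of each derived value, and the `ciphertext || mac` wire layout are assumptions made for illustration; the document only fixes the names and the order of operations.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def derive_keys(key_fetch_token: bytes) -> Tuple[bytes, bytes, bytes, bytes]:
    """Return (tokenID, reqHMACkey, respHMACkey, respXORkey)."""
    # Assumption: placeholder info labels and a 3x32-byte split.
    out = hkdf_sha256(key_fetch_token, info=b"example/keyFetchToken", length=96)
    token_id, req_hmac_key, key_request_key = out[:32], out[32:64], out[64:]
    out2 = hkdf_sha256(key_request_key, info=b"example/account/keys", length=96)
    return token_id, req_hmac_key, out2[:32], out2[32:]


def server_encrypt_keys(key_fetch_token: bytes, k_a: bytes, wrap_kb: bytes) -> bytes:
    _, _, resp_hmac_key, resp_xor_key = derive_keys(key_fetch_token)
    ciphertext = xor(k_a + wrap_kb, resp_xor_key)  # 64 bytes of plaintext
    mac = hmac.new(resp_hmac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + mac


def client_decrypt_keys(key_fetch_token: bytes, bundle: bytes) -> Tuple[bytes, bytes]:
    _, _, resp_hmac_key, resp_xor_key = derive_keys(key_fetch_token)
    ciphertext, mac = bundle[:64], bundle[64:]
    expected = hmac.new(resp_hmac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC mismatch")  # check the MAC before decrypting
    plaintext = xor(ciphertext, resp_xor_key)
    return plaintext[:32], plaintext[32:]  # (kA, wrap(kB))
```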
[[images/IdPAuth-keys-server.png]]
@@ -129,25 +99,21 @@ Finally, the server-provided wrap(kB) value is simply XORed with the password-de
Note that /account/keys will not succeed until the account's email address has been verified. Also note that each keyFetchToken is single-use and short-lived. The token is consumed even if the request fails (e.g. the MAC does not match).
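One way to get the "consumed even if the request fails" behaviour is to remove the token from the table before doing any other validation. The sketch below assumes a 60-second lifetime purely as a placeholder, and the class, its methods, and the injectable clock are hypothetical.

```python
import os
import time
from typing import Callable, Dict, Optional, Tuple


class KeyFetchTokens:
    """Hypothetical single-use, short-lived token table."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._tokens: Dict[bytes, Tuple[str, float]] = {}

    def issue(self, uid: str) -> bytes:
        token = os.urandom(32)
        self._tokens[token] = (uid, self._clock() + self._ttl)
        return token

    def consume(self, token: bytes) -> Optional[str]:
        # pop() removes the token up front, so it is spent no matter
        # what happens to the rest of the request.
        entry = self._tokens.pop(token, None)
        if entry is None:
            return None
        uid, expires_at = entry
        return uid if self._clock() < expires_at else None


def handle_keys_request(store: KeyFetchTokens, token: bytes, mac_ok: bool) -> Optional[str]:
    uid = store.consume(token)  # the token is gone from here on
    if uid is None or not mac_ok:
        return None
    return uid
```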
Crypto note: while the two returned keys are encrypted keyFetchToken, the keyFetchToken itself is sent over the (TLS-protected) wire without additional protection. This superfluous encryption will be useful in a future protocol, in which SRP is used to protect the delivery of keyFetchToken. We retain this encryption step to minimize the changes to our existing (SRP-based) code.
Crypto note: while the two returned keys are encrypted with (a derivative of) keyFetchToken, the keyFetchToken itself is sent over the (TLS-protected) wire without additional protection. This superfluous encryption will be useful in a future protocol, in which SRP is used to protect the delivery of keyFetchToken. We retain this encryption step to minimize the changes to our existing (SRP-based) code.
# Resetting The Account
The account may be reset in two circumstances: when the user changes their password, or when the user forgets their password. In both cases, the client first obtains an "accountResetToken". This token is then used to change the verifierHash and either reset or replace the wrap(kB) value.
## changing the password
## Changing The Password
To change the password, the client uses an alternate endpoint named `/auth/change_password`. This accepts the same email+authPW that `/auth/password` takes, and returns two values: `stretchWrap` and an accountResetToken. The accountResetToken can then be used below to reset the account's password and wrapped kB.
To change the password, the client uses an alternate endpoint named `/auth/change_password`. This accepts the same email+authPW that `/auth/password` takes, and returns three values: `stretchWrap`, `keyFetchToken`, and a new `accountResetToken`. The accountResetToken can then be used below to reset the account's password and wrapped kB. This is a single-use token which expires quickly, perhaps within 10 minutes.
## Changing the Password
When the user wishes to change their password (i.e. they still know the old password), they first use the `/auth/password` API to obtain an accountResetToken and a keyFetchToken.
Clients should then use the `keyFetchToken` to obtain kB, so the subsequent account reset can replace wrap(kB) with a new value. This allows the password-changing client to retain their class-B data.
[[images/IdPAuth-encrypt-passwordChange.png]]
The accountResetToken will be used below to set the new password. The keyFetchToken should be used first, to obtain kB, so the subsequent account reset can replace wrap(kB) with a new value. This allows the password-changing client to retain their class-B data.
Using the `/auth/password` API proves that the user has provided the correct account password recently. When the account is reset, all active sessions and tokens will be cancelled (disconnecting all devices from the account). The client should immediately establish a new session as described above.
This API is only used when the user knows their old password: if they have forgotten the password, use the "/password/forgot" APIs below.
@@ -206,66 +172,6 @@ The device submits `authPW` to the `/account/destroy` endpoint. This request con
[[images/IdPAuth-deleteAccount.png]]
## deleting the account
To delete the account entirely, email+authPW are delivered to the `/account/delete` endpoint.
To mitigate DoS abuse, /auth/start may also require a proof-of-work string, described below.
## Client-Side Key Stretching
"Key Stretching" is the practice of running a password through a computationally-expensive one-way function before using it for encryption or authentication. The goal is to make brute-force dictionary attacks more expensive, by raising the cost of testing each guess.
To protect the user's class-B data against active compromise of our keyserver (in which the attacker gets to see `authPW` as it is sent to the server), we perform some key stretching on the client. To further improve protection against static compromise (where the attacker sees the stored verify-hash and wrap(kB) in the server's database), we do additional stretching on the server.
On the server, we use the memory-hard "scrypt" function (pronounced "ess-crypt") for this purpose, as motivated by the attacker-cost studies in [Identity/CryptoIdeas/01-PBKDF-scrypt](https://wiki.mozilla.org/Identity/CryptoIdeas/01-PBKDF-scrypt).
After "stretchedPW" is derived, a second HKDF call is used to derive "srpPW" and "unwrapBKey" which will be used later.
[[images/IdPAuth-main-KDF.png]]
All tokens have an associated tokenID, described below. The server needs to maintain a table that maps the tokenID to the token itself, so it can derive other values from the token later. The tokens are also associated with a specific account, so later API requests do not specify an email address or account ID.
# Crypto Notes
Strong entropy is needed in the following places:
* (server) initial creation of kA and wrap(kB)
* (server) creation of signToken and resetToken
On the server, code should get entropy from /dev/urandom via a function that uses it, like "crypto.randomBytes()" in node.js or "os.urandom()" in python.
An HKDF-based stream cipher is used to protect the contents of some requests. HKDF is used to create a number of random bytes equal to the length of the message, then these are XORed with the plaintext to produce the ciphertext. An HMAC is then computed from the ciphertext, to protect the integrity of the message.
HKDF, like any KDF, is defined to produce output that is indistinguishable from random data ("The HKDF Scheme", http://eprint.iacr.org/2010/264.pdf , by Hugo Krawczyk, section 3). XORing a plaintext with a random keystream to produce ciphertext is a simple and secure approach to data encryption, epitomized by AES-CTR or a stream cipher (http://cr.yp.to/snuffle/design.pdf). HKDF is not the fastest way to generate such a keystream, but it is safe, easy to specify, and easy to implement (just HMAC and XOR).
Each keystream must be unique. We define keyFetchToken to be a single-use randomly-generated value, to ensure our HKDF-XOR keystreams will be unique.
A slightly more-traditional alternative would be to use AES-CTR (with the same HMAC-SHA256 used here), with a randomly-generated IV. This is equally secure, but requires implementors to obtain an AES library (with CTR mode, which does not seem to be universal). An even more traditional technique would be AES-CBC, which introduces the need for padding and a way to specify the length of the plaintext. The additional specification complexity, plus the library load, leads me to prefer HKDF+XOR.
kB is equal to the XOR of wrapKey (which is a deterministic function of the user's email address, password, salt, and the server-side stretching parameters) and the server's randomly-generated wrap(kB) value, making kB a random value too. Using XOR as a wrapping function allows us to avoid sending kB or wrap(kB) in the initial createAccount arguments.
To make this technique safe, any time kB or the password is changed, the mainSalt should be changed too. Otherwise knowledge of both wrap(old-kB) and old-kB would reveal wrapKey, making it easy to deduce the new kB. Changing mainSalt causes wrapKey to change too, preventing this.
There is no MAC on wrap(kB). If the keyserver chooses to deliver a bogus wrap(kB) or kA, the client will discover the problem a moment later when it talks to a storage server and attempts to retrieve data from an unrecognized collection-ID (since we intend to derive collection-IDs from the key used to encrypt their data, which will be derived from kA or kB as appropriate). It might be useful to add a checksum to kA and wrap(kB) to detect accidental corruption (e.g. store and deliver kA+SHA256(kA)), but this doesn't protect against intentional changes. We omit this checksum for now, assuming that disks will be reliable enough to let us never experience such failures.
HAWK provides one thing: integrity/authentication for the request contents (URL, method, and optionally the body). It does not provide confidentiality of the request, or integrity of the response, or confidentiality of the response.
For /certificate/sign, we do not need request confidentiality or response confidentiality, since the client's pubkey and the resulting certificate will both be exposed over a similar SSL connection to the storage server later. And it is sufficient to rely on the response integrity provided by SSL, since the client can verify the returned certificate for itself. For the other keyserver APIs protected by HAWK, these properties are either unnecessary, or are provided by additional mechanisms.
# Glossary
This defines some of the jargon we've developed for this protocol.
* data classes: each type of browser data (bookmarks, passwords, history, etc) can be assigned, by the user, to either class-A or class-B
* class-A: data assigned to this class can be recovered, even if the user forgets their password, by proving control over an email address and resetting the account. It can also be read by Mozilla (since it runs the keyserver and knows kA), or by the user's IdP (by resetting the account without the user's permission).
* class-B: data in this class cannot be recovered if the password is forgotten. It cannot be read by the IdP. Mozilla (via the keyserver) cannot read this data, but can attempt a brute-force dictionary attack against the password.
* kA: the master key for data stored as "class-A", a 32-byte binary string. Individual encryption keys for different datatypes are derived from kA.
* kB: the master key for data stored as "class-B", a 32-byte binary string.
* wrap(kB): an encrypted copy of kB. The keyserver stores wrap(kB) and never sees kB itself. The client (browser) uses a key derived from the user's password to decrypt wrap(kB), obtaining the real kB.
* sessionToken: a long-lived per-device token which allows the device to obtain signed BrowserID certificates for the account's identity (GUID@picl-something.org). This token remains valid until the user revokes it (either by changing their password, or triggering some kind of "revoke a specific device" or "revoke all devices" function).
# Keyserver Protocol Summary
* POST /account/create (email,srpV,srpSalt) -> ok (server sends verification email)
@@ -341,3 +247,80 @@ Change Password
* GET /account/keys [keyFetchToken] () -> kA/wrap(kB)
* POST /account/reset [authed+encrypted by accountResetToken] (wrap(kB),srpV,srpSalt) -> ok
* GOTO "Attach to new device"
# misc
For calls which accept an authToken, the client uses authToken to derive three values:
* tokenID
* reqHMACkey
* requestKey
The client uses tokenID and reqHMACkey for a HAWK (https://github.com/hueniverse/hawk/) request to the "POST /session/create" API, using tokenID as "credentials.id" and reqHMACkey as "credentials.key". The server uses tokenID to look up the corresponding token, then derives reqHMACkey to validate the request.
Each authToken-using call then derives additional API-specific values from requestKey. /session/create uses two derived values:
* respHMACkey
* respXORkey
When the server receives a valid /session/create request, it allocates sessionToken and keyFetchToken, concatenates them, encrypts the pair by XORing it with the derived respXORkey, and attaches a MAC generated with respHMACkey. The encrypted MACed bundle is returned to the client.
The client recomputes the MAC, compares it (throwing an error if it doesn't match), extracts the ciphertext, XORs it with the derived respXORkey, then splits it into the separate keyFetchToken and sessionToken values.
## Client-Side Key Stretching
"Key Stretching" is the practice of running a password through a computationally-expensive one-way function before using it for encryption or authentication. The goal is to make brute-force dictionary attacks more expensive, by raising the cost of testing each guess.
To protect the user's class-B data against active compromise of our keyserver (in which the attacker gets to see `authPW` as it is sent to the server), we perform some key stretching on the client. To further improve protection against static compromise (where the attacker sees the stored verify-hash and wrap(kB) in the server's database), we do additional stretching on the server.
On the server, we use the memory-hard "scrypt" function (pronounced "ess-crypt") for this purpose, as motivated by the attacker-cost studies in [Identity/CryptoIdeas/01-PBKDF-scrypt](https://wiki.mozilla.org/Identity/CryptoIdeas/01-PBKDF-scrypt).
After "stretchedPW" is derived, a second HKDF call is used to derive "srpPW" and "unwrapBKey" which will be used later.
[[images/IdPAuth-main-KDF.png]]
All tokens have an associated tokenID, described below. The server needs to maintain a table that maps the tokenID to the token itself, so it can derive other values from the token later. The tokens are also associated with a specific account, so later API requests do not specify an email address or account ID.
# Crypto Notes
Strong entropy is needed in the following places:
* (server) initial creation of kA and wrap(kB)
* (server) creation of signToken and resetToken
On the server, code should get entropy from /dev/urandom via a function that uses it, like "crypto.randomBytes()" in node.js or "os.urandom()" in python.
An HKDF-based stream cipher is used to protect the contents of some requests. HKDF is used to create a number of random bytes equal to the length of the message, then these are XORed with the plaintext to produce the ciphertext. An HMAC is then computed from the ciphertext, to protect the integrity of the message.
HKDF, like any KDF, is defined to produce output that is indistinguishable from random data ("The HKDF Scheme", http://eprint.iacr.org/2010/264.pdf , by Hugo Krawczyk, section 3). XORing a plaintext with a random keystream to produce ciphertext is a simple and secure approach to data encryption, epitomized by AES-CTR or a stream cipher (http://cr.yp.to/snuffle/design.pdf). HKDF is not the fastest way to generate such a keystream, but it is safe, easy to specify, and easy to implement (just HMAC and XOR).
Each keystream must be unique. We define keyFetchToken to be a single-use randomly-generated value, to ensure our HKDF-XOR keystreams will be unique.
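The uniqueness requirement can be demonstrated with a few lines: if the same token (and therefore the same keystream) encrypts two messages, XORing the two ciphertexts cancels the keystream and exposes the XOR of the plaintexts. The info label and function names are placeholders, and the HMAC step is left out to keep the focus on the keystream.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def hkdf_xor_encrypt(token: bytes, plaintext: bytes) -> bytes:
    # Keystream as long as the message; XOR is its own inverse, so the
    # same function also decrypts.
    keystream = hkdf_sha256(token, info=b"example/stream", length=len(plaintext))
    return xor(plaintext, keystream)


p1 = b"first secret message 32 bytes..."
p2 = b"other secret message 32 bytes!!!"

# Reusing one token: the keystream cancels out of c1 XOR c2.
reused = b"\x07" * 32
c1 = hkdf_xor_encrypt(reused, p1)
c2 = hkdf_xor_encrypt(reused, p2)
leaks = xor(c1, c2) == xor(p1, p2)

# Fresh token per message: no such relation holds.
c3 = hkdf_xor_encrypt(b"\x08" * 32, p2)
independent = xor(c1, c3) != xor(p1, p2)
```

Because each keyFetchToken is random and single-use, every response is encrypted under a keystream that is never used again.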
A slightly more-traditional alternative would be to use AES-CTR (with the same HMAC-SHA256 used here), with a randomly-generated IV. This is equally secure, but requires implementors to obtain an AES library (with CTR mode, which does not seem to be universal). An even more traditional technique would be AES-CBC, which introduces the need for padding and a way to specify the length of the plaintext. The additional specification complexity, plus the library load, leads me to prefer HKDF+XOR.
kB is equal to the XOR of wrapKey (which is a deterministic function of the user's email address, password, salt, and the server-side stretching parameters) and the server's randomly-generated wrap(kB) value, making kB a random value too. Using XOR as a wrapping function allows us to avoid sending kB or wrap(kB) in the initial createAccount arguments.
To make this technique safe, any time kB or the password is changed, the mainSalt should be changed too. Otherwise knowledge of both wrap(old-kB) and old-kB would reveal wrapKey, making it easy to deduce the new kB. Changing mainSalt causes wrapKey to change too, preventing this.
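The XOR relationship, and the reason wrapKey must change whenever kB or the password changes, can be checked directly. The function names below are hypothetical; only the XOR construction comes from the text above.

```python
import os


def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def unwrap_kb(wrap_kb: bytes, wrap_key: bytes) -> bytes:
    # kB = wrapKey XOR wrap(kB)
    return xor(wrap_kb, wrap_key)


def wrap_existing_kb(k_b: bytes, wrap_key: bytes) -> bytes:
    # Used on password change, when the client wants to keep the same kB.
    return xor(k_b, wrap_key)


# Account creation: the server picks wrap(kB) at random, so the client
# never has to upload kB or wrap(kB).
wrap_key = os.urandom(32)
wrap_old = os.urandom(32)
old_kb = unwrap_kb(wrap_old, wrap_key)

# Someone who learns both wrap(old-kB) and old-kB recovers wrapKey ...
recovered_wrap_key = xor(wrap_old, old_kb)

# ... and, if wrapKey is NOT changed, can unwrap any later kB as well.
new_kb = os.urandom(32)
wrap_new = wrap_existing_kb(new_kb, wrap_key)
stolen_new_kb = unwrap_kb(wrap_new, recovered_wrap_key)
```

Here `recovered_wrap_key` equals `wrap_key` and `stolen_new_kb` equals `new_kb`, which is the failure that changing mainSalt (and therefore wrapKey) prevents.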
There is no MAC on wrap(kB). If the keyserver chooses to deliver a bogus wrap(kB) or kA, the client will discover the problem a moment later when it talks to a storage server and attempts to retrieve data from an unrecognized collection-ID (since we intend to derive collection-IDs from the key used to encrypt their data, which will be derived from kA or kB as appropriate). It might be useful to add a checksum to kA and wrap(kB) to detect accidental corruption (e.g. store and deliver kA+SHA256(kA)), but this doesn't protect against intentional changes. We omit this checksum for now, assuming that disks will be reliable enough to let us never experience such failures.
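If the optional checksum were adopted, the kA+SHA256(kA) idea mentioned above could be sketched like this. The helper names are hypothetical, and as the text notes, this detects accidental corruption only: anyone able to modify the stored key can recompute the digest.

```python
import hashlib


def with_checksum(key: bytes) -> bytes:
    """Store or deliver key || SHA256(key)."""
    return key + hashlib.sha256(key).digest()


def strip_checksum(blob: bytes, key_len: int = 32) -> bytes:
    """Return the key, raising if the trailing digest does not match."""
    key, digest = blob[:key_len], blob[key_len:]
    if hashlib.sha256(key).digest() != digest:
        raise ValueError("checksum mismatch: stored key appears corrupted")
    return key
```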
HAWK provides one thing: integrity/authentication for the request contents (URL, method, and optionally the body). It does not provide confidentiality of the request, or integrity of the response, or confidentiality of the response.
For /certificate/sign, we do not need request confidentiality or response confidentiality, since the client's pubkey and the resulting certificate will both be exposed over a similar SSL connection to the storage server later. And it is sufficient to rely on the response integrity provided by SSL, since the client can verify the returned certificate for itself. For the other keyserver APIs protected by HAWK, these properties are either unnecessary, or are provided by additional mechanisms.
# Glossary
This defines some of the jargon we've developed for this protocol.
* data classes: each type of browser data (bookmarks, passwords, history, etc) can be assigned, by the user, to either class-A or class-B
* class-A: data assigned to this class can be recovered, even if the user forgets their password, by proving control over an email address and resetting the account. It can also be read by Mozilla (since it runs the keyserver and knows kA), or by the user's IdP (by resetting the account without the user's permission).
* class-B: data in this class cannot be recovered if the password is forgotten. It cannot be read by the IdP. Mozilla (via the keyserver) cannot read this data, but can attempt a brute-force dictionary attack against the password.
* kA: the master key for data stored as "class-A", a 32-byte binary string. Individual encryption keys for different datatypes are derived from kA.
* kB: the master key for data stored as "class-B", a 32-byte binary string.
* wrap(kB): an encrypted copy of kB. The keyserver stores wrap(kB) and never sees kB itself. The client (browser) uses a key derived from the user's password to decrypt wrap(kB), obtaining the real kB.
* sessionToken: a long-lived per-device token which allows the device to obtain signed BrowserID certificates for the account's identity (GUID@picl-something.org). This token remains valid until the user revokes it (either by changing their password, or triggering some kind of "revoke a specific device" or "revoke all devices" function).