Introduction to OAuth

OAuth allows you to approve access by any application to your private data stored a website without being forced to disclose your username or password. If you think about it, the practice of handing over your username and password for sites like Yahoo Mail or Twitter has been endemic for quite a while. This has raised some serious concerns because there's nothing to prevent other applications from misusing this data. Yes, some services may appear trustworthy but that is never guaranteed. OAuth resolves this problem by eliminating the need for any username and password sharing, replacing it with a user controlled authorization process.

This authorization process is token based. If you authorize an application (and by application we can include any web based or desktop application) to access your data, it will be in receipt of an Access Token associated with your account. Using this Access Token, the application can access your private data without continually requiring your credentials. In all this authorization delegation style of protocol is simply a more secure solution to the problem of accessing private data via any web service API.

OAuth is not a completely new idea, rather it is a standardized protocol building on the existing properties of protocols such as Google AuthSub, Yahoo BBAuth, Flickr API, etc. These all to some extent operate on the basis of exchanging user credentials for an Access Token of some description. The power of a standardized specification like OAuth is that it only requires a single implementation as opposed to many disparate ones depending on the web service. This standardization has not occurred independently of the major players, and indeed many now support OAuth as an alternative and future replacement for their own solutions.

Zend Framework's Zend_Oauth currently implements a full OAuth Consumer conforming to the OAuth Core 1.0 Revision A Specification (24 June 2009) via the Zend_Oauth_Consumer class.

Protocol Workflow

Before implementing OAuth it makes sense to understand how the protocol operates. To do so we'll take the example of Twitter which currently implements OAuth based on the OAuth Core 1.0 Revision A Specification. This example looks at the protocol from the perspectives of the User (who will approve access), the Consumer (who is seeking access) and the Provider (who holds the User's private data). Access may be read-only or read and write.

By chance, our User has decided that they want to utilise a new service called TweetExpress which claims to be capable of reposting your blog posts to Twitter in a manner of seconds. TweetExpress is a registered application on Twitter meaning that it has access to a Consumer Key and a Consumer Secret (all OAuth applications must have these from the Provider they will be accessing) which identify its requests to Twitter and that ensure all requests can be signed using the Consumer Secret to verify their origin.

To use TweetExpress you are asked to register for a new account, and after your registration is confirmed you are informed that TweetExpress will seek to associate your Twitter account with the service.

In the meantime TweetExpress has been busy. Before gaining your approval from Twitter, it has sent a HTTP request to Twitter's service asking for a new unauthorized Request Token. This token is not User specific from Twitter's perspective, but TweetExpress may use it specifically for the current User and should associate it with their account and store it for future use. TweetExpress now redirects the User to Twitter so they can approve TweetExpress' access. The URL for this redirect will be signed using TweetExpress' Consumer Secret and it will contain the unauthorized Request Token as a parameter.

At this point the User may be asked to log into Twitter and will now be faced with a Twitter screen asking if they approve this request by TweetExpress to access Twitter's API on the User's behalf. Twitter will record the response which we'll assume was positive. Based on the User's approval, Twitter will record the current unauthorized Request Token as having been approved by the User (thus making it User specific) and will generate a new value in the form of a verification code. The User is now redirected back to a specific callback URL used by TweetExpress (this callback URL may be registered with Twitter or dynamically set using an oauth_callback parameter in requests). The redirect URL will contain the newly generated verification code.

TweetExpress' callback URL will trigger an examination of the response to determine whether the User has granted their approval to Twitter. Assuming so, it may now exchange it's unauthorized Request Token for a fully authorized Access Token by sending a request back to Twitter including the Request Token and the received verification code. Twitter should now send back a response containing this Access Token which must be used in all requests used to access Twitter's API on behalf of the User. Twitter will only do this once they have confirmed the attached Request Token has not already been used to retrieve another Access Token. At this point, TweetExpress may confirm the receipt of the approval to the User and delete the original Request Token which is no longer needed.

From this point forward, TweetExpress may use Twitter's API to post new tweets on the User's behalf simply by accessing the API endpoints with a request that has been digitally signed (via HMAC-SHA1) with a combination of TweetExpress' Consumer Secret and the Access Key being used.

Although Twitter do not currently expire Access Tokens, the User is free to deauthorize TweetExpress from their Twitter account settings. Once deauthorized, TweetExpress' access will be cut off and their Access Token rendered invalid.

Security Architecture

OAuth was designed specifically to operate over an insecure HTTP connection and so the use of HTTPS is not required though obviously it would be desireable if available. Should a HTTPS connection be feasible, OAuth offers a signature method implementation called PLAINTEXT which may be utilised. Over a typical unsecured HTTP connection, the use of PLAINTEXT must be avoided and an alternate scheme using. The OAuth specification defines two such signature methods: HMAC-SHA1 and RSA-SHA1. Both are fully supported by Zend_Oauth.

These signature methods are quite easy to understand. As you can imagine, a PLAINTEXT signature method does nothing that bears mentioning since it relies on HTTPS. If you were to use PLAINTEXT over HTTP, you are left with a significant problem: there's no way to be sure that the content of any OAuth enabled request (which would include the OAuth Access Token) was altered en route. This is because unsecured HTTP requests are always at risk of eavesdropping, Man In The Middle (MITM) attacks, or other risks whereby a request can be retooled so to speak to perform tasks on behalf of the attacker by masquerading as the origin application without being noticed by the service provider.

HMAC-SHA1 and RSA-SHA1 alleviate this risk by digitally signing all OAuth requests with the original application's registered Consumer Secret. Assuming only the Consumer and the Provider know what this secret is, a middle-man can alter requests all they wish - but they will not be able to validly sign them and unsigned or invalidly signed requests would be discarded by both parties. Digital signatures therefore offer a guarantee that validly signed requests do come from the expected party and have not been altered en route. This is the core of why OAuth can operate over an unsecure connection.

How these digital signatures operate depends on the method used, i.e. HMAC-SHA1, RSA-SHA1 or perhaps another method defined by the service provider. HMAC-SHA1 is a simple mechanism which generates a Message Authentication Code (MAC) using a cryptographic hash function (i.e. SHA1) in combination with a secret key known only to the message sender and receiver (i.e. the OAuth Consumer Secret and the authorized Access Key combined). This hashing mechanism is applied to the parameters and content of any OAuth requests which are concatenated into a "base signature string" as defined by the OAuth specification.

RSA-SHA1 operates on similar principles except that the shared secret is, as you would expect, each parties' RSA private key. Both sides would have the other's public key with which to verify digital signatures. This does pose a level of risk compared to HMAC-SHA1 since the RSA method does not use the Access Key as part of the shared secret. This means that if the RSA private key of any Consumer is compromised, then all Access Tokens assigned to that Consumer are also. RSA imposes an all or nothing scheme. In general, the majority of service providers offering OAuth authorization have therefore tended to use HMAC-SHA1 by default, and those who offer RSA-SHA1 may offer fallback support to HMAC-SHA1.

While digital signatures add to OAuth's security they are still vulnerable to other forms of attack, such as replay attacks which copy earlier requests which were intercepted and validly signed at that time. An attacker can now resend the exact same request to a Provider at will at any time and intercept its results. This poses a significant risk but it is quiet simple to defend against - add a unique string (i.e. a nonce) to all requests which changes per request (thus continually changing the signature string) but which can never be reused because Providers actively track used nonces within the a certain window defined by the timestamp also attached to a request. You might first suspect that once you stop tracking a particular nonce, the replay could work but this ignore the timestamp which can be used to determine a request's age at the time it was validly signed. One can assume that a week old request used in an attempted replay should be summarily discarded!

As a final point, this is not an exhaustive look at the security architecture in OAuth. For example, what if HTTP requests which contain both the Access Token and the Consumer Secret are eavesdropped? The system relies on at one in the clear transmission of each unless HTTPS is active, so the obvious conclusion is that where feasible HTTPS is to be preferred leaving unsecured HTTP in place only where it is not possible or affordable to do so.

Getting Started

With the OAuth protocol explained, let's show a simple example of it with source code. Our new Consumer will be handling Twitter Status submissions. To do so, it will need to be registered with Twitter in order to receive an OAuth Consumer Key and Consumer Secret. This are utilised to obtain an Access Token before we use the Twitter API to post a status message.

Assuming we have obtained a key and secret, we can start the OAuth workflow by setting up a Zend_Oauth_Consumer instance as follows passing it a configuration (either an array or Zend_Config object).

  1. $config = array(
  2.     'callbackUrl' => 'http://example.com/callback.php',
  3.     'siteUrl' => 'http://twitter.com/oauth',
  4.     'consumerKey' => 'gg3DsFTW9OU9eWPnbuPzQ',
  5.     'consumerSecret' => 'tFB0fyWLSMf74lkEu9FTyoHXcazOWpbrAjTCCK48A'
  6. );
  7. $consumer = new Zend_Oauth_Consumer($config);

The callbackUrl is the URI we want Twitter to request from our server when sending information. We'll look at this later. The siteUrl is the base URI of Twitter's OAuth API endpoints. The full list of endpoints include http://twitter.com/oauth/request_token, http://twitter.com/oauth/access_token, and http://twitter.com/oauth/authorize. The base siteUrl utilises a convention which maps to these three OAuth endpoints (as standard) for requesting a request token, access token or authorization. If the actual endpoints of any service differ from the standard set, these three URIs can be separately set using the methods setRequestTokenUrl(), setAccessTokenUrl(), and setAuthorizeUrl() or the configuration fields requestTokenUrl, accessTokenUrl and authorizeUrl.

The consumerKey and consumerSecret are retrieved from Twitter when your application is registered for OAuth access. These also apply to any OAuth enabled service, so each one will provide a key and secret for your application.

All of these configuration options may be set using method calls simply by converting from, e.g. callbackUrl to setCallbackUrl().

In addition, you should note several other configuration values not explicitly used: requestMethod and requestScheme. By default, Zend_Oauth_Consumer sends requests as POST (except for a redirect which uses GET). The customised client (see later) also includes its authorization by way of a header. Some services may, at their discretion, require alternatives. You can reset the requestMethod (which defaults to Zend_Oauth::POST) to Zend_Oauth::GET, for example, and reset the requestScheme from its default of Zend_Oauth::REQUEST_SCHEME_HEADER to one of Zend_Oauth::REQUEST_SCHEME_POSTBODY or Zend_Oauth::REQUEST_SCHEME_QUERYSTRING. Typically the defaults should work fine apart from some exceptional cases. Please refer to the service provider's documentation for more details.

The second area of customisation is how HMAC operates when calculating/comparing them for all requests. This is configured using the signatureMethod configuration field or setSignatureMethod() . By default this is HMAC-SHA1. You can set it also to a provider's preferred method including RSA-SHA1. For RSA-SHA1, you should also configure RSA private and public keys via the rsaPrivateKey and rsaPublicKey configuration fields or the setRsaPrivateKey() and setRsaPublicKey() methods.

The first part of the OAuth workflow is obtaining a request token. This is accomplished using:

  1. $config = array(
  2.     'callbackUrl' => 'http://example.com/callback.php',
  3.     'siteUrl' => 'http://twitter.com/oauth',
  4.     'consumerKey' => 'gg3DsFTW9OU9eWPnbuPzQ',
  5.     'consumerSecret' => 'tFB0fyWLSMf74lkEu9FTyoHXcazOWpbrAjTCCK48A'
  6. );
  7. $consumer = new Zend_Oauth_Consumer($config);
  8.  
  9. // fetch a request token
  10. $token = $consumer->getRequestToken();

The new request token (an instance of Zend_Oauth_Token_Request ) is unauthorized. In order to exchange it for an authorized token with which we can access the Twitter API, we need the user to authorize it. We accomplish this by redirecting the user to Twitter's authorize endpoint via:

  1. $config = array(
  2.     'callbackUrl' => 'http://example.com/callback.php',
  3.     'siteUrl' => 'http://twitter.com/oauth',
  4.     'consumerKey' => 'gg3DsFTW9OU9eWPnbuPzQ',
  5.     'consumerSecret' => 'tFB0fyWLSMf74lkEu9FTyoHXcazOWpbrAjTCCK48A'
  6. );
  7. $consumer = new Zend_Oauth_Consumer($config);
  8.  
  9. // fetch a request token
  10. $token = $consumer->getRequestToken();
  11.  
  12. // persist the token to storage
  13. $_SESSION['TWITTER_REQUEST_TOKEN'] = serialize($token);
  14.  
  15. // redirect the user
  16. $consumer->redirect();

The user will now be redirected to Twitter. They will be asked to authorize the request token attached to the redirect URI's query string. Assuming they agree, and complete the authorization, they will be again redirected, this time to our Callback URL as previously set (note that the callback URL is also registered with Twitter when we registered our application).

Before redirecting the user, we should persist the request token to storage. For simplicity I'm just using the user's session, but you can easily use a database for the same purpose, so long as you tie the request token to the current user so it can be retrieved when they return to our application.

The redirect URI from Twitter will contain an authorized Access Token. We can include code to parse out this access token as follows - this source code would exist within the executed code of our callback URI. Once parsed we can discard the previous request token, and instead persist the access token for future use with the Twitter API. Again, we're simply persisting to the user session, but in reality an access token can have a long lifetime so it should really be stored to a database.

  1. $config = array(
  2.     'callbackUrl' => 'http://example.com/callback.php',
  3.     'siteUrl' => 'http://twitter.com/oauth',
  4.     'consumerKey' => 'gg3DsFTW9OU9eWPnbuPzQ',
  5.     'consumerSecret' => 'tFB0fyWLSMf74lkEu9FTyoHXcazOWpbrAjTCCK48A'
  6. );
  7. $consumer = new Zend_Oauth_Consumer($config);
  8.  
  9. if (!empty($_GET) && isset($_SESSION['TWITTER_REQUEST_TOKEN'])) {
  10.     $token = $consumer->getAccessToken(
  11.                  $_GET,
  12.                  unserialize($_SESSION['TWITTER_REQUEST_TOKEN'])
  13.              );
  14.     $_SESSION['TWITTER_ACCESS_TOKEN'] = serialize($token);
  15.  
  16.     // Now that we have an Access Token, we can discard the Request Token
  17.     $_SESSION['TWITTER_REQUEST_TOKEN'] = null;
  18. } else {
  19.     // Mistaken request? Some malfeasant trying something?
  20.     exit('Invalid callback request. Oops. Sorry.');
  21. }

Success! We have an authorized access token - so it's time to actually use the Twitter API. Since the access token must be included with every single API request, Zend_Oauth_Consumer offers a ready-to-go HTTP client (a subclass of Zend_Http_Client) to use either by itself or by passing it as a custom HTTP Client to another library or component. Here's an example of using it standalone. This can be done from anywhere in your application, so long as you can access the OAuth configuration and retrieve the final authorized access token.

  1. $config = array(
  2.     'callbackUrl' => 'http://example.com/callback.php',
  3.     'siteUrl' => 'http://twitter.com/oauth',
  4.     'consumerKey' => 'gg3DsFTW9OU9eWPnbuPzQ',
  5.     'consumerSecret' => 'tFB0fyWLSMf74lkEu9FTyoHXcazOWpbrAjTCCK48A'
  6. );
  7.  
  8. $statusMessage = 'I\'m posting to Twitter using Zend_Oauth!';
  9.  
  10. $token = unserialize($_SESSION['TWITTER_ACCESS_TOKEN']);
  11. $client = $token->getHttpClient($configuration);
  12. $client->setUri('http://twitter.com/statuses/update.json');
  13. $client->setMethod(Zend_Http_Client::POST);
  14. $client->setParameterPost('status', $statusMessage);
  15. $response = $client->request();
  16.  
  17. $data = Zend_Json::decode($response->getBody());
  18. $result = $response->getBody();
  19. if (isset($data->text)) {
  20.     $result = 'true';
  21. }
  22. echo $result;

As a note on the customised client, this can be passed to most Zend Framework service or other classes using Zend_Http_Client displacing the default client they would otherwise use.

blog comments powered by Disqus