Introduction
============

On our last conference call I explained that there were some lingering issues with our proposed session-based authentication. It was hard to summarize the issues in a few sentences, so I promised a write-up; here it is.

Before reviewing the issues below I will point out that I am not an Apache guru, so it is possible I have drawn an incorrect conclusion or failed to see a solution that an Apache expert would be aware of. Please keep that in mind as you read the material and offer constructive corrections where appropriate. Of course general comments and questions are sought and welcomed.

Overview
========

This describes how we currently authenticate and how we plan to improve authentication performance. First some definitions.

There are 4 major players:

  1. client
  2. mod_auth_kerb (in Apache process)
  3. wsgi handler (in IPA wsgi python process)
  4. ds (directory server)

There are several resources:

  1. /ipa/ui (unprotected)
  2. /ipa/xmlrpc (protected)
  3. /ipa/json (protected)
  4. ds (protected, wsgi acts as proxy)

Both /ipa/xmlrpc and /ipa/json are RPC URIs and are dispatched by our wsgi handler. The two are nearly identical for this discussion, so to simplify let's just use /ipa/rpc to mean any protected RPC resource.

Current Model
=============

This describes how things work in our current system.

  1. Client requests /ipa/ui, which is unprotected, static, and contains no sensitive information. Apache replies with html and javascript. The javascript requests /ipa/rpc.
  2. Client sends post to /ipa/rpc.
  3. mod_auth_kerb is configured to protect /ipa/rpc, replies 401 authenticate negotiate.
  4. Client resends with credentials.
  5. mod_auth_kerb validates credentials
     a. if invalid replies 403 access denied (stops here)
     b. if valid creates temporary ccache, adds KRB5CCNAME to request headers
  6. Request passed to wsgi handler
     a. validates request, KRB5CCNAME must be present, referrer, etc.
     b. ccache saved and used to bind to ds
     c. routes to specified RPC handler.
  7. wsgi handler replies to client

Proposed new session based optimization
=======================================

The round trip negotiate and credential validation in steps 3, 4 and 5 is expensive. This can be avoided if we cache the client credentials. With client sessions we can store the client credentials in the session bound to the client.

A few notes about the session implementation.

  * based on session cookies, cookies must be enabled
  * session cookie is secure, only passed on secure connections, only passed to our URL resource, never visible to client javascript etc.
  * session cookie has a session id which is used by wsgi handler to retrieve client session data from shared multi-process cache.

Changes to Apache's resource protection
---------------------------------------

  * /ipa/rpc is no longer protected by mod_auth_kerb. This is necessary to avoid the negotiate expense in steps 3, 4 and 5 above. Instead the /ipa/rpc resource will be protected in our wsgi handler via the session cookie.
  * A new protected URI is introduced, /ipa/login. This resource does not serve any data, it is used exclusively for authentication.

The new sequence is:

  1. Client requests /ipa/ui, this is unprotected. Apache replies with html and javascript. The javascript requests /ipa/rpc.
  2. Client sends post to /ipa/rpc, which is unprotected.
  3. wsgi handler obtains session data from session cookie.
     a. if ccache is present in session data and is valid
        - request is further validated
        - ccache is established for bind to ds
        - request is routed to RPC handler
        - wsgi handler eventually replies to client
     b. if ccache is not present or not valid processing continues ...
  4. wsgi handler sends temporary redirect to protected /ipa/login
  5. client sends request to /ipa/login
  6. mod_auth_kerb replies 401 negotiate on /ipa/login
  7. client sends credentials to /ipa/login
  8. mod_auth_kerb validates credentials
     a. if valid
        - mod_auth_kerb permits access to /ipa/login. wsgi handler is invoked and does the following:
          * establishes session for client
          * retrieves the ccache from KRB5CCNAME and stores it
          * sends temporary redirect back to /ipa/ui.
     b. if invalid
        - mod_auth_kerb sends 403 access denied (processing stops)
  9. client now requests /ipa/rpc again due to redirect from step 8 and includes session cookie. Processing repeats starting at step 3 and since the session data now contains a valid ccache step 3a executes, a successful reply is sent to client.

Note: Web clients can start at the /ipa/login URI or the /ipa/ui URI, either one will work the same.

Problems to be solved
=====================

We have added one new URI, /ipa/login, which is essentially invisible. For web clients with cookies enabled nothing has changed other than better performance. However, if cookies are disabled, or if web clients such as curl or our command line tools are used to access the /ipa/rpc resource, a problem occurs. The /ipa/rpc resource is no longer protected by Kerberos; it is protected by session data. Access can only be granted by presenting session data obtained through a sequence of request/reply interactions that need to follow redirects.

Ideally we would like the /ipa/rpc resource to be transparently accessible no matter whether you're using sessions or straight Kerberos authentication. But this is impossible. Why?

The key to understanding the problem is that the locations where the various steps occur are distinct and do not interact with one another. Rather, it is a series of linear sequential steps with no branching control. Let's examine this in more detail.

  1. mod_auth_kerb protection is on a file system resource, not a web URI. This is because many URIs may map to a file system resource (uniquely identified by a URL).
  2. authentication in Apache occurs after any translations on the incoming URI, thus aliases and rewrites are fully processed to obtain a file system resource.
  3. Once a file system resource is determined Apache then asks the question: is this file system resource protected? If so it checks access and deny rules and invokes authentication.
  4. The application of authentication to a file system resource is never conditional, the authentication requirement is established when Apache initializes and configures itself. The resource is either protected or it's not, period.
  5. If our wsgi handler file system resource is protected by mod_auth_kerb, we will only be invoked after the entire negotiate process has completed and succeeded. Thus any logic in our wsgi handler cannot influence the authentication performed by mod_auth_kerb, because our wsgi handler runs only after the entire mod_auth_kerb authentication has succeeded.

Point 5 above is critical to understand: the /ipa/rpc resource is either Kerberos protected or it's not. You can't have the situation where /ipa/rpc is not Kerberos protected if a valid ccache is in the session data but is Kerberos protected otherwise.

Can we just have non session based clients point to /ipa/login?
---------------------------------------------------------------

At the moment this is not clear to me. The role of /ipa/login is to obtain credentials, store them in a client session cache, and then redirect back to the UI. If we want /ipa/login to process RPC requests instead of setting up session data and redirecting, then /ipa/login needs to know whether the request arrived as the result of a failed authentication redirect or as an original request. It's possible there may be information in the request header (such as a referer) that could be used to distinguish between the two cases and to take conditional action (note this is occurring in the wsgi handler).

But what's the point? It's no longer transparently the same /ipa/rpc URI. The client needs to know to use a different URI to access the RPC interface. It can't be the /ipa/rpc URI because of point 5 above.

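To make the role of /ipa/login concrete, here is a minimal sketch of what the handler behind it might look like. It is not our implementation. It assumes KRB5CCNAME is visible to the handler through the WSGI environ, and the names SESSION_CACHE, SESSION_COOKIE and login are hypothetical. A plain dict stands in for the shared multi-process cache, and the sketch stores only the ccache name, whereas a real handler would presumably have to copy the credentials because the ccache created by mod_auth_kerb is temporary.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical stand-in for the shared multi-process session cache.
SESSION_CACHE = {}
# Hypothetical cookie name, chosen only for this sketch.
SESSION_COOKIE = "ipa_session"


def login(environ, start_response):
    """Sketch of the /ipa/login handler.

    Because /ipa/login is Kerberos protected, this code is only reached
    after mod_auth_kerb has accepted the client's credentials.
    """
    # Assumption: the ccache name is exposed through the WSGI environ.
    ccache_name = environ.get("KRB5CCNAME")
    if not ccache_name:
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")])
        return [b"KRB5CCNAME missing\n"]

    # Establish a session for the client and remember the ccache in it.
    session_id = secrets.token_urlsafe(32)
    SESSION_CACHE[session_id] = {"ccache": ccache_name}

    # Session cookie: secure connections only, limited to our URL space,
    # and not visible to client javascript.
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE] = session_id
    cookie[SESSION_COOKIE]["path"] = "/ipa"
    cookie[SESSION_COOKIE]["secure"] = True
    cookie[SESSION_COOKIE]["httponly"] = True

    # Temporary redirect back to the UI (302 is an assumption here).
    start_response("302 Found", [
        ("Location", "/ipa/ui"),
        ("Set-Cookie", cookie[SESSION_COOKIE].OutputString()),
    ])
    return [b""]
```

Note that nothing in this handler can tell mod_auth_kerb to skip the negotiate step, which is point 5 again: by the time the handler runs, authentication is already over.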
Can mod_rewrite determine which authentication to use?
------------------------------------------------------

Is it possible to utilize mod_rewrite to examine the incoming request and direct it to either a Kerberos protected resource or a session protected resource depending on the contents of the request? Doubtful.

To decide which authentication to perform it is necessary to know if a session cookie is available. mod_rewrite gives you access to the request headers but very limited ability to parse them, so it could be difficult to robustly inspect the cookie header components and conditionally operate on them. mod_rewrite also has the ability to call out to an external process, but it appears that only the URI is passed, not full information about the request, such as the headers. It might be possible to construct something with mod_rewrite, but it's not immediately obvious it can be done or be done robustly. Maybe this should be investigated further (any mod_rewrite gurus want to chime in?).

A possible simple solution
--------------------------

Since a non session based client *must* know to use a different URI, why overload the /ipa/login URI? We would establish another URI specifically for Kerberos protected RPC resources, for example /ipa/krb/rpc. This would be used by curl and/or our command line tools exactly like /ipa/rpc is used today. The only change is a different URI.

For web browsers the javascript would decide which RPC URI to use depending on whether cookies are enabled or not. For cookie enabled browsers /ipa/rpc would be used; when cookies are disabled /ipa/krb/rpc would be used instead.
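
As a rough illustration of how the wsgi handler could serve both URIs, here is a minimal sketch under several assumptions. It assumes Apache is configured so that /ipa/krb/rpc is protected by mod_auth_kerb while /ipa/rpc is not, and that KRB5CCNAME reaches the handler through the WSGI environ. The names SESSION_CACHE, SESSION_COOKIE, session_ccache and run_rpc are hypothetical, and a plain dict stands in for the shared multi-process session cache. Checking that a cached ccache is still valid is left out.

```python
from http.cookies import SimpleCookie

# Hypothetical stand-in for the shared multi-process session cache.
SESSION_CACHE = {}
# Hypothetical cookie name, chosen only for this sketch.
SESSION_COOKIE = "ipa_session"


def session_ccache(environ):
    """Return the ccache stored in the client's session, or None."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(SESSION_COOKIE)
    if morsel is None:
        return None
    return SESSION_CACHE.get(morsel.value, {}).get("ccache")


def run_rpc(start_response, ccache):
    """Placeholder for binding to ds with the ccache and routing the RPC."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("RPC executed using " + ccache + "\n").encode("utf-8")]


def application(environ, start_response):
    path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")

    if path == "/ipa/krb/rpc":
        # Kerberos protected: we only get here after mod_auth_kerb succeeded.
        ccache = environ.get("KRB5CCNAME")
        if not ccache:
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [b"KRB5CCNAME missing\n"]
        return run_rpc(start_response, ccache)

    if path == "/ipa/rpc":
        # Session protected: the handler itself decides whether to proceed.
        ccache = session_ccache(environ)
        if ccache is None:
            # No usable session, send the client off to authenticate.
            start_response("302 Found", [("Location", "/ipa/login")])
            return [b""]
        return run_rpc(start_response, ccache)

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]
```

The two branches never need to know about each other, which is the attraction of the separate URI: the choice between Kerberos and session protection is made by the client when it picks the URI, not by Apache or by the handler.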