Capturing incoming HTTP requests and outgoing responses

Logback ships with a module called logback-access which integrates with Servlet containers such as Jetty or Tomcat to provide rich and powerful HTTP-access log functionality.

When diagnosing issues in a web-application, it is very helpful to be able to capture incoming HTTP requests and the outgoing response. This is particularly true for a web-applications offering web-services using SOAP. Having access to the "raw" SOAP message makes it much easier to diagnose errors caused by misspelled namespaces or missing fields.

Certain Web-Service stacks can be easily configured to log SOAP traffic. For instance, with JBoss' Web-services stack, you can set the level of the org.jboss.ws.core.MessageTrace logger to TRACE in order to enable SOAP message tracing as documented in the Jboss wiki.

If you are using a different Web-Services stack, e.g. Metro, it might not be very convenient to enable SOAP message tracing. In that case, and assuming your application is deployed on Tomcat, you might want to enable RequestDumperFilter. Unfortunately, not only is RequestDumperFilter rather hard to install, it does not dump the payload of the incoming request nor the outgoing response.

With the help of logback-access its TeeFilter you can capture the full input and output for each request as explained below.

Capturing

The TeeFilter, as any other servlet filter, needs to be declared in your web-application's web.xml file. Here is the declaration to add to your web-application's web.xml file.

<filter>
  <filter-name>TeeFilter</filter-name>
  <filter-class>ch.qos.logback.access.servlet.TeeFilter</filter-class>
</filter>

<filter-mapping>
  <filter-name>TeeFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

The TeeFilter requires the logback-access module to be installed in your web-server. The installation of logback-access is explained elsewhere. Once you have installed logback-access into your Servlet container, e.g. Tomcat or Jetty, you can configure logback-access according to your wishes with the help of a configuration file named logback-access.xml.

Here is a sample logback-access.xml configuration file which will output the full contents of the request and response on the console.

<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>      
      <pattern>%fullRequest%n%n%fullResponse</pattern>
    </encoder>
  </appender>
	
  <appender-ref ref="STDOUT" />
</configuration>

For the list of conversion words supported by logback-access' PatternLayout please refer to its documentation.

Here is the output generated when accessing the logback-demo application configured as shown above, yields:

GET /logback-demo/index.jsp HTTP/1.1 Host: localhost:8080 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20070312 Firefox/1.5.0 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Referer: http://localhost:8080/logback-demo/login.jsp Cookie: JSESSIONID=15c7tqi9ehlwk; OID324nkzcmr=null; OID32862zgoa=null; HTTP/1.1 200 OK Content-Type: text/html; charset=iso-8859-1 Expires: Thu, 01 Jan 1970 00:00:00 GMT Set-Cookie: JSESSIONID=bgebt99ce9om;path=/logback-demo <html> <head> <LINK REL=StyleSheet HREF="css/pk.css" /> </head> <body> <h2>Logback demo center</h2> [snip, so that text is reasonably sized]

 

Disabling TeeFilter in the production environment

Due to its intrusive nature, TeeFilter can slow down performance. Moreover, although we have fixed all currently known bugs, TeeFilter has broken otherwise correctly behaving applications in the past. Thus, while extremely useful during problem hunting, we do not recommend having TeeFilter active in production systems. In order to avoid shipping different code for test and production environments, TeeFilter supports includes and excludes parameters. TeeFilter will be active if the current host is listed in the includes list and absent in the excludes list. By special convention, an empty includes list is interpreted as to contain all possible host names in the universe.

Assume we wish to capture HTTP traffic on all hosts except on orion and gemini, the hostnames of the the production systems, we would write:

<filter>
  <filter-name>TeeFilter</filter-name>
  <filter-class>ch.qos.logback.access.servlet.TeeFilter</filter-class>
  <init-param>
    <!-- exclude captures on production systems -->
    <param-name>excludes</param-name>
    <param-value>orion, gemini</param-value>
  </init-param>
</filter>  

If it is easier to explicitly name the integration machines, you could list them in the includes list and omit the excludes list.

Filtering captured requests

Let assume that our web-application is deployed in a cluster. The cluster is located behind a load-balancer which probes each member of the cluster once or twice a second to check whether it is alive. Whenever a member becomes unavailable the load-balancer will immediately (well, after at most one second) divert traffic from that member.

While such a load-balancing strategy will ensure high-availability of your web-application, it will also seriously pollute the access-logs with the contents of each probe made by the load-balancer.

We need to a way to filter-out these probes so that they no longer contaminate the log output. In other words, we need to distinguish probes emanating from the load-balancer from other requests. The contents of the probes can give us clues about possible distinguishing criteria.

Here is a sample probe as logged by logback-access:

HEAD /myapp/probe HTTP/1.1
connection: Close
host: 192.168.1.1

HTTP/1.1 200 OK
Expires: Thu, 01 Jan 1970 01:00:00 CET
X-Powered-By: Servlet 2.4;
Cache-Control: no-cache
Pragma: No-cache 

From the above, we can see that the load-balancer probes employ the HTTP HEAD method instead of the usual GET or POST. We can also see that the request URI for the probes contains the string "probe".

Here is a logback-access.xml configuration file which will deny any AccessEvent where the method is HEAD. Note that events are evaluated using JaninoEventEvaluator which requires Janino.

<configuration>
  <!-- always a good idea to install OnConsoleStatusListener -->
  <statusListener class="ch.qos.logback.core.status.OnConsoleStatusListener" />

  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <filter class="ch.qos.logback.core.filter.EvaluatorFilter">
      <!-- in the absence of a class attribute the <evaluator> element 
           defaults to ch.qos.logback.access.boolex.JaninoEventEvaluator -->
      <evaluator>
        <expression>event.getMethod().equals("HEAD")</expression>
      </evaluator>
      <onMismatch>NEUTRAL</onMismatch>
      <onMatch>DENY</onMatch>
    </filter>

    <encoder>
      <pattern>%fullRequest%n%n%fullResponse</pattern>
    </encoder>
  </appender>

  <appender-ref ref="STDOUT" />
</configuration>