Overview

This tutorial is designed to provide a basic overview of how to use HttpClient. When you have completed the tutorial you will have written a simple application that downloads a page using HttpClient.

It is assumed that you have an understanding of how to program in Java and are familiar with the development environment you are using.

Getting Ready

The first thing you need to do is get a copy of HttpClient and its dependencies. This tutorial was written for HttpClient 3.0. You will also need JDK 1.3 or above.

Once you've downloaded HttpClient and dependencies you will need to put them on your classpath. There is also an optional dependency on JSSE which is required for HTTPS connections; this is not required for this tutorial.

Concepts

The general process for using HttpClient consists of a number of steps:

  1. Create an instance of HttpClient.
  2. Create an instance of one of the methods (GetMethod in this case). The URL to connect to is passed in to the the method constructor.
  3. Tell HttpClient to execute the method.
  4. Read the response.
  5. Release the connection.
  6. Deal with the response.

We'll cover how to perform each of these steps below. Notice that we go through the entire process regardless of whether the server returned an error or not. This is important because HTTP 1.1 allows multiple requests to use the same connection by simply sending the requests one after the other. Obviously, if we don't read the entire response to the first request, the left over data will get in the way of the second response. HttpClient tries to handle this but to avoid problems it is important to always release the connection.

Upon the connection release HttpClient will do its best to ensure that the connection is reusable.

It is important to always release the connection regardless of whether the server returned an error or not.

Instantiating HttpClient

The no argument constructor for HttpClient provides a good set of defaults for most situations so that is what we'll use.

HttpClient client = new HttpClient();

Creating a Method

The various methods defined by the HTTP specification correspond to the various classes in HttpClient which implement the HttpMethod interface. These classes are all found in the package org.apache.commons.httpclient.methods.

We will be using the Get method which is a simple method that simply takes a URL and gets the document the URL points to.

HttpMethod method = new GetMethod("http://www.apache.org/");

Execute the Method

The actual execution of the method is performed by calling executeMethod on the client and passing in the method to execute. Since networks connections are unreliable, we also need to deal with any errors that occur.

There are two kinds of exceptions that could be thrown by executeMethod, HttpException and IOException.

The other useful piece of information is the status code that is returned by the server. This code is returned by executeMethod as an int and can be used to determine if the request was successful or not and can sometimes indicate that further action is required by the client such as providing authentication credentials.

HttpException

An HttpException represents a logical error and is thrown when the request cannot be sent or the response cannot be processed due to a fatal violation of the HTTP specification. Usually this kind of exceptions cannot be recovered from. For a detailed discussion on protocol exceptions please refer to the HttpClient exception handling guide. Note that HttpException actually extends IOException so you can just ignore it and catch the IOException if your application does not distinguish between protocol and transport errors.

IOException

A plain IOException (which is not a subclass of HttpException) represents a transport error and is thrown when an error occurs that is likely to be a once-off I/O problem. Usually the request has a good chance of succeeding on a second attempt, so per default HttpClient will try to recover the request automatically. For a detailed discussion on transport exceptions please refer to the HttpClient exception handling guide.

Method recovery

Per default HttpClient will automatically attempt to recover from the not-fatal errors, that is, when a plain IOException is thrown. HttpClient will retry the method three times provided that the request has never been fully transmitted to the target server. For a detailed discussion on HTTP method recovery please refer to the HttpClient exception handling guide

// set per default
client.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, 
  new DefaultHttpMethodRetryHandler());

Default recovery procedure can be replaced with a custom one. The number of automatic retries can be increased. HttpClient can also be instructed to retry the method even though the request may have already been processed by the server and the I/O exception has occurred while receiving the response. Please exercise caution when enabling auto-retrial. Use it only if the method is known to be idempotent, that is, it is known to be safe to retry multiple times without causing data corruption or data inconsistency.

The rule of thumb is GET methods are usually safe unless known otherwise, entity enclosing methods such as POST and PUT are usually unsafe unless known otherwise.

DefaultMethodRetryHandler retryhandler = new DefaultMethodRetryHandler(10, true);
client.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryhandler);

Read the Response

It is vital that the response body is always read regardless of the status returned by the server. There are three ways to do this:

  • Call method.getResponseBody(). This will return a byte array containing the data in the response body.
  • Call method.getResponseBodyAsString(). This will return a String containing the response body. Be warned though that the conversion from bytes to a String is done using the default encoding so this method may not be portable across all platforms.
  • Call method.getResponseBodyAsStream() and read the entire contents of the stream then call stream.close(). This method is best if it is possible for a lot of data to be received as it can be buffered to a file or processed as it is read. Be sure to always read the entirety of the data and call close on the stream.

For this tutorial we will use getResponseBody() for simplicity.

byte[] responseBody = method.getResponseBody();

Release the Connection

This is a crucial step to keep things flowing. We must tell HttpClient that we are done with the connection and that it can now be reused. Without doing this HttpClient will wait indefinitely for a connection to free up so that it can be reused.

method.releaseConnection();

Deal with the Repsonse

We've now completed our interaction with HttpClient and can just concentrate on doing what we need to do with the data. In our case, we'll just print it out to the console.

It's worth noting that if you were retrieving the response as a stream and processing it as it is read, this step would actually be combined with reading the connection, and when you'd finished processing all the data, you'd then close the input stream and release the connection.

Note: We should pay attention to character encodings here instead of just using the system default.

System.err.println(new String(responseBody));

Final Source Code

When we put all of that together plus a little bit of glue code we get the program below.

import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.methods.*;
import org.apache.commons.httpclient.params.HttpMethodParams;

import java.io.*;

public class HttpClientTutorial {
  
  private static String url = "http://www.apache.org/";

  public static void main(String[] args) {
    // Create an instance of HttpClient.
    HttpClient client = new HttpClient();

    // Create a method instance.
    GetMethod method = new GetMethod(url);
    
    // Provide custom retry handler is necessary
    method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, 
    		new DefaultHttpMethodRetryHandler(3, false));

    try {
      // Execute the method.
      int statusCode = client.executeMethod(method);

      if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Method failed: " + method.getStatusLine());
      }

      // Read the response body.
      byte[] responseBody = method.getResponseBody();

      // Deal with the response.
      // Use caution: ensure correct character encoding and is not binary data
      System.out.println(new String(responseBody));

    } catch (HttpException e) {
      System.err.println("Fatal protocol violation: " + e.getMessage());
      e.printStackTrace();
    } catch (IOException e) {
      System.err.println("Fatal transport error: " + e.getMessage());
      e.printStackTrace();
    } finally {
      // Release the connection.
      method.releaseConnection();
    }  
  }
}