Friday, March 23, 2012

Working with Stream Filters

    I love StreamFilters. They're great. I frequently use them for compression, for keeping statistics about how much data has been written through a pipe, for producing MD5s of the data written, or even as event producers, so you can be notified when a stream has been closed. You can stack them like Legos and do all sorts of wonderful things.

    There are three things to keep in mind when working with StreamFilters. First, always be conscious of the order in which you stack them. Don't try to compress an encrypted stream, for example, since well-encrypted data looks random and barely compresses; always encrypt your compressed stream. Second, life is simpler if you push all of your different read or write methods into just one of them, and call your actual filtering code from there. Third, you generally get much better performance from a stream if you use the byte array read and write methods as opposed to the simple read() or write() methods that work with ints.
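    To see why the stacking order matters, here is a minimal sketch that pushes the same highly repetitive data through both orderings and compares the output sizes. The class name StackOrderDemo, its helper methods, and the choice of AES/CBC with a throwaway random key are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical demo class; names and cipher settings are illustrative only.
class StackOrderDemo {

    // Builds an encrypting AES/CBC cipher with a random, throwaway key and IV.
    static Cipher newCipher() throws GeneralSecurityException {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher;
    }

    // Written bytes hit gzip first; the compressed bytes are then encrypted.
    static int compressThenEncrypt(byte[] data) throws IOException, GeneralSecurityException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(new CipherOutputStream(sink, newCipher()))) {
            out.write(data, 0, data.length);
        }
        return sink.size();
    }

    // Written bytes are encrypted first; gzip then sees only random-looking ciphertext.
    static int encryptThenCompress(byte[] data) throws IOException, GeneralSecurityException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new CipherOutputStream(new GZIPOutputStream(sink), newCipher())) {
            out.write(data, 0, data.length);
        }
        return sink.size();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[100_000];
        Arrays.fill(data, (byte) 'a'); // very compressible input
        System.out.println("compress then encrypt: " + compressThenEncrypt(data) + " bytes");
        System.out.println("encrypt then compress: " + encryptThenCompress(data) + " bytes");
    }
}
```

    Note that the outermost wrapper is the first filter your data passes through, so the constructor calls read in the opposite order from the way the bytes flow.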

    When I write a stream filter for a single purpose, I can generally just override a single read or write method and ignore the other two, but when writing a library, you don't get that luxury. So I sat down to write a complete implementation in an abstract class that I could use all over the place.
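    On the write side, a base class along these lines might look something like the sketch below: every write variant funnels into write(byte[], int, int), and subclasses only fill in one hook. The names FunnelingOutputFilter, onWrite, and CountingOutputFilter are made up for illustration, and the shared one-byte buffer means this sketch is not thread-safe.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical base class: all three write methods end up in write(byte[], int, int).
abstract class FunnelingOutputFilter extends FilterOutputStream {
    private final byte[] smallBuffer = new byte[1];

    protected FunnelingOutputFilter(OutputStream out) {
        super(out);
    }

    // The single hook a subclass implements; it sees every byte that is written.
    protected abstract void onWrite(byte[] b, int off, int len) throws IOException;

    @Override
    public void write(int b) throws IOException {
        smallBuffer[0] = (byte) b; // only the low 8 bits of the int are written
        write(smallBuffer, 0, 1);
    }

    @Override
    public void write(byte[] b) throws IOException {
        write(b, 0, b.length);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        onWrite(b, off, len);
        out.write(b, off, len); // pass the data through unchanged
    }
}

// Example subclass: keeps a running total of bytes written through the filter.
class CountingOutputFilter extends FunnelingOutputFilter {
    private long count;

    CountingOutputFilter(OutputStream out) {
        super(out);
    }

    @Override
    protected void onWrite(byte[] b, int off, int len) {
        count += len;
    }

    long getCount() {
        return count;
    }
}

class CountingDemo {
    public static void main(String[] args) throws IOException {
        CountingOutputFilter filter = new CountingOutputFilter(new ByteArrayOutputStream());
        filter.write('A');
        filter.write(new byte[] {1, 2, 3});
        System.out.println("bytes written: " + filter.getCount());
    }
}
```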

    The idea was simple enough: use a byte[] of length 1 to implement the read() method, by calling the read(byte[],offset,length) method with a length of 1 and an offset of 0. This worked fine almost all of the time, but something wasn't right on occasion: my streams would close early. After way too much debugging and not much help from the internet, I found the problem. Bytes stored in a byte[] are signed (-128 to 127), while the simple read() method is expected to return the byte as an unsigned value from 0 to 255, with -1 reserved for end of stream. Returning the raw byte meant that anything from 0x80 up came back negative, and a 0xFF byte came back as exactly -1, which the calling code took to mean the stream had ended. A simple bit mask fixes the problem, if you know that it's the cause. The following code should give anyone enough information to write their own InputStreamFilter and not run into any obscure sign issues.

    //one-byte scratch buffer used by read(); declared as a field of the filter class
    private final byte[] smallBuffer = new byte[1];

    @Override
    public int read(byte[] b) throws IOException
    {
        return read(b,0,b.length);
    }

    @Override
    public int read() throws IOException
    {
        int bytesRead = read(smallBuffer, 0, 1);
        if (bytesRead > 0)  //because we always ask for exactly one byte, this should never be zero
        {
            //bytes in the array are signed, so mask to 0-255 so that code using the int read() method always gets a non-negative result as expected
            return smallBuffer[0] & 0xff;
        }
        else //end of stream, so return -1 to indicate as much
        {
            return -1;
        }
    }
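
    If you want to watch the bug happen, here is a small self-contained sketch. SignExtensionDemo, OneByteFilter, and the mask flag are hypothetical names invented for this demo; the flag only exists so the broken and fixed behaviour can be compared side by side.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: shows how an unmasked byte can be mistaken for end of stream.
class SignExtensionDemo {

    // A filter whose read() delegates to the array read, with the mask switchable.
    static class OneByteFilter extends FilterInputStream {
        private final byte[] smallBuffer = new byte[1];
        private final boolean mask;

        OneByteFilter(InputStream in, boolean mask) {
            super(in);
            this.mask = mask;
        }

        @Override
        public int read() throws IOException {
            int bytesRead = read(smallBuffer, 0, 1);
            if (bytesRead <= 0) {
                return -1; // genuine end of stream
            }
            // Without the mask, the byte 0xFF widens to the int -1.
            return mask ? (smallBuffer[0] & 0xff) : smallBuffer[0];
        }
    }

    // Counts bytes the way typical calling code does: read until -1.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x01, (byte) 0xFF, 0x02};
        int unmasked = countBytes(new OneByteFilter(new ByteArrayInputStream(data), false));
        int masked = countBytes(new OneByteFilter(new ByteArrayInputStream(data), true));
        System.out.println("unmasked read() saw " + unmasked + " of " + data.length + " bytes");
        System.out.println("masked read() saw " + masked + " of " + data.length + " bytes");
    }
}
```

    The unmasked version stops at the 0xFF byte because it is indistinguishable from the -1 end-of-stream marker, which matches the "streams close early" symptom described above.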