Tuesday, October 07, 2008

Using dynamic proxies for cache implementation

The act of storing a copy of data (which is usually expensive to fetch or compute) from the original source, near (ideally, though not necessary) the users is called caching. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fetching or re-computing the original data, so that the average access time is shorter, leading to performance gains. There are various cache implementations in java e.g. EHCache, Memcached etc.

Usually we interleave cache lookup calls for objects in our business logic as shown in following example-

public Employee getEmployee(long id) {
// See if it is available in cache
Employee employee = MyCache.get(Long.toString(id));
if (employee != null)
return employee;

// Construct new objects and put in cache
employee = new Employee(id, "Vinod", "Singh");
MyCache.put(Long.toString(id), employee);
return employee;
}
Where we first look in cache for the object and return the same if available. If not found then create the new one and put that in cache before returning to client. This kind of implementation looks superfluous and adds to clutter in the code. In my opinion business classes should concentrate on the relevant logic only and developers should not be bothered about caching. The caching logic should be moved out of business logic and kept at one centralized place and intercept the calls to business logic and take up the task of looking for objects in cache or putting newly created objects in cache. This will provide great flexibility to enable/disable and/or change the caching implementation at will with almost no changes in code. In rest of the post I will try to explain how with the help of dynamic proxies we can separate out the caching code from business logic.

Usually the returned object of some method is the thing what we keep in cache. So let us create an annotation, which can be used on any method to indicate that the returned object needs to be cached.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface Cacheable {
}
The Cacheable annotation will be applied as shown in following example-

public interface HrmItf {

@Cacheable
Employee getEmployee(long id);
}
Now create a proxy, which will look for the object in cache or put it in cache whenever a method with @Cacheable annotation is called.

public class CacheProxy implements InvocationHandler {

private Object obj;

public static T newInstance(Class cls, Object obj) {
return cls.cast(Proxy.newProxyInstance(
obj.getClass().getClassLoader(),
obj.getClass().getInterfaces(), new CacheProxy(obj)));
}

private CacheProxy(Object obj) {
this.obj = obj;
}

@Override
public Object invoke(Object proxy, Method method, Object[] args)
throws Throwable {
// If method is not annotated, don't look in cache
Cacheable cacheable = method.getAnnotation(Cacheable.class);
if (cacheable == null)
return method.invoke(obj, args);

// try to get from cache
String key = args[0].toString();
Object value = MyCache.get(key);
if (value != null)
return value;

// Invoke the actual method and put the result in cache
value = method.invoke(obj, args);
MyCache.put(key, value);
return value;
}
}
When we get a handle to our business interface, actually we will be get a proxy encapsulating the implementation. Here is how we do that-

public void proxyCacheTest() {
HrmItf hrm = CacheProxy.newInstance(HrmItf.class, new HrmImpl());
hrm.getEmployee(1);
}
Now business layer is concerned only about the task it is supposed to do. It looks much cleaner now without any noise.

public Employee getEmployee(long id) {
return new Employee(id, "Vinod", "Singh");
}
If we want to use the in-process caching solution like EHCache or the distributed one like Memcached or want to disable the cache at all, it is the one place CacheProxy.java where we have to make changes.

2 Comments:

Matthew McEachen said...

Thanks for taking the time to write this post!

Your use of only the first parameter in the method call (in CacheProxy, line 24), has a bug if you wrap a method that takes 2 or more parameters.

As an example, let's try to cache a new method whose object has a composite key, getComposite(int key1, int key2).

getComposite(1, 1) will fetch (1, 1), cache the result and associate it to new Integer(1).

If you subsequently call getComposite(1, 2), it will incorrectly return (1, 1) because the first argument matched your cache backing store.

You could serialize the argument array and using that byte array as the key, or something else that looks at the argument list comprehensively to address this issue.

Anonymous said...

Mathew,

Yes, your observations are valid and I am aware of. This post just demonstrates the concept and far form being used as it is in production system. Besides the 'get' one has to handle 'update' and 'delete' functionality as well, so that cache is always in sync with database.