Caucho maker of Resin Server | Application Server (Java EE Certified) and Web Server


 

Resin Documentation

home company blog wiki docs 
app server web server 
health cloud java ee pro 
 Resin Server | Application Server (Java EE Certified) and Web Server
 

server caching


Server caching can speed dynamic pages to near-static speeds. When pages created by database queries only change every 15 minutes, e.g. CNN or Wikipedia or Slashdot, Resin can cache the results and serve them like static pages. Because Resin's caching only depends on HTTP headers, it will work for any JSPs, servlet or PHP page.response.

Resin's caching operates like a proxy cache, looking at HTTP headers to compare hash codes or digests or simply caching for a static amount of time. Since the proxy cache follows the HTTP standards, applications like Mediawiki will automatically see dramatic performance improvement with no extra work. You can even cache REST-style GET requests.

Because the cache supports advanced headers like "Vary", it can cache different versions of the page depending on the browser's capabilities. Gzip-enabled browsers will get the cached compressed page while more primitive browsers will see the plan page. With "Vary: Cookie", you can return a cached page for anonymous users, and still return a custom page for logged-in users.

Overview

For many applications, enabling the proxy cache can improve your application's performance dramatically. When Quercus runs Mediawiki with caching enabled, Resin can return results as fast as static pages. Without caching, the performance is significantly slower.

Mediawiki Performance
CACHE PERFORMANCENON-CACHE PERFORMANCE
4316 requests/sec29.80 requests/sec

To enable caching, your application will need to set a few HTTP headers. While a simple application can just set a timed cache with max-age, a more sophisticated application can generate page hash digests with ETag and short-circuit repeated If-None-Match responses.

If subsections of your pages are cacheable but the main page is not, you can cache servlet includes just like caching top-level pages. Resin's include processing will examine the headers set by your include servlet and treat them just like a top-level cache.

HTTP Caching Headers

HTTP Server to Client Headers
HEADERDESCRIPTION
Cache-Control: privateRestricts caching to the browser only, forbidding proxy-caching.
Cache-Control: max-age=nSpecifies a static cache time in seconds for both the browser and proxy cache.
Cache-Control: s-maxage=nSpecifies a static cache time in seconds for the proxy cache only.
Cache-Control: no-cacheDisables caching entirely.
ETag: hash or identifierUnique identifier for the page's version. Hash-based values are better than date/versioning, especially in clustered configurations.
Last-Modified: time of modificationAccepted by Resin's cache, but not recommended in clustered configurations.
Vary: header-nameCaches the client's header, e.g. Cookie, or Accept-encoding
HTTP Client to Server Headers
HEADERDESCRIPTION
If-None-MatchSpecifies the ETag value for the page
If-Modified-SinceSpecifies the Last-Modified value for the page

Cache-Control: max-age

Setting the max-age header will cache the results for a specified time. For heavily loaded pages, even setting short expires times can significantly improve performance. Pages using sessions should set a "Vary: Cookie" header, so anonymous users will see the cached page, while logged-in users will see their private page.

Example: 15s cache
<%@ page session="false" %>
<%! int counter; %>
<%
response.addHeader("Cache-Control", "max-age=15");
%>
Count: <%= counter++ %>

max-age is useful for database generated pages which are continuously, but slowly updated. To cache with a fixed content, i.e. something which has a valid hash value like a file, you can use ETag with If-None-Match.

ETag and If-None-Match

The ETag header specifies a hash or digest code for the generated page to further improve caching. The browser or cache will send the ETag as a If-None-Match value when it checks for any page updates. If the page is the same, the application will return a 304 NOT_MODIFIED response with an empty body. Resin's FileServlet automatically provides this capability for static pages. In general, the ETag is the most effective caching technique, although it requires a bit more work than max-age.

To handle clustered servers in a load-balanced configuration, the calculated ETag should be a hash of the result value, not a timestamp or version. Since each server behind a load balancer will generate a different timestamp for the files, each server would produce a different tag, even though the generated content was identical. So either producing a hash or ensuring the ETag value is the same is critical.

ETag servlets will often also use <cache-mapping> configuration to set a max-age or s-maxage. The browser and proxy cache will cache the page without revalidation until max-age runs out. When the time expires, it will use If-None-Match to revalidate the page.

When using ETag, your application will need to look for the If-None-Match header on incoming requests. If the value is the same, your servlet can return 304 NOT-MODIFIED. If the value differs, you'll return the new content and hash.

Example: ETag servlet
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class MyServlet extends HttpServlet
{
  public void doGet(HttpServletRequest req, HttpServletResponse res)
  {
    String etag = getCurrentEtag();

    String ifNoneMatch = req.getHeader("If-None-Match");

    if (ifNoneMatch != null && ifNoneMatch.equals(etag)) {
      res.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
      return;
    }

    res.setHeader("ETag", etag);

    ... // generate response
  }
}
Example: HTTP headers for ETag match
C: GET /test-servlet HTTP/1.1

S: HTTP/1.1 200 OK
S: ETag: xm8vz29I
S: Cache-Control: max-age=15s
S: ...

C: GET /test-servlet HTTP/1.1
C: If-None-Match: xm8vz29I

S: HTTP/1.1 304 Not Modified
S: Cache-Control: max-age=15s
S: ...
Example: HTTP headers for ETag mismatch
C: GET /test-servlet HTTP/1.1
C: If-None-Match: UXi456Ww

S: HTTP/1.1 200 OK
S: ETag: xM81x+3j
S: Cache-Control: max-age=15s
S: ...

Expires

Although max-age tends to be easier and more flexible, an application can also set the Expires header to enable caching, when the expiration date is a specific time instead of an interval. For heavily loaded pages, even setting short expires times can significantly improve performance. Sessions should be disabled for caching.

The following example sets expiration for 15 seconds. So the counter should update slowly.

Example: expires
<%@ page session="false" %>
<%! int counter; %>
<%
long now = System.currentTimeMillis();
response.setDateHeader("Expires", now + 15000);
%>
Count: <%= counter++ %>

Expires is useful for database generated pages which are continuously, but slowly updated. To cache with a fixed content, i.e. something which has a valid hash value like a file, you can use ETag with If-None-Match.

If-Modified-Since

The If-Modified-Since headers let you cache based on an underlying change date. For example, the page may only change when an underlying source page changes. Resin lets you easily use If-Modified by overriding methods in HttpServlet or in a JSP page.

Because of the clustering issues mentioned in the ETag section, it's generally recommended to use ETag and If-None-Match and avoid If-Modified-Since. In a load balanced environment, each backend server would generally have a different Last-Modified value, while would effectively disable caching for a proxy cache or a browser that switched from one backend server to another.

The following page only changes when the underlying 'test.xml' page changes.

<%@ page session="false" %>
<%!
int counter;

public long getLastModified(HttpServletRequest req)
{
  String path = req.getRealPath("test.xml");
  return new File(path).lastModified();
}
%>
Count: <%= counter++ %>

If-Modified pages are useful in combination with the cache-mapping configuration.

Vary

In some cases, you'll want to have separate cached pages for the same URL depending on the capabilities of the browser. Using gzip compression is the most important example. Browsers which can understand gzip-compressed files receive the compressed page while simple browsers will see the uncompressed page. Using the "Vary" header, Resin can cache different versions of that page.

Example: vary caching for on gzip
<%
  response.addHeader("Cache-Control", "max-age=3600");
  response.addHeader("Vary", "Accept-Encoding");
%>

Accept-Encoding: <%= request.getHeader("Accept-Encoding") %>

The "Vary" header can be particularly useful for caching anonymous pages, i.e. using "Vary: Cookie". Logged-in users will get their custom pages, while anonymous users will see the cached page.

Included Pages

Resin can cache subpages even when the top page can't be cached. Sites allowing user personalization will often design pages with jsp:include subpages. Some subpages are user-specific and can't be cached. Others are common to everybody and can be cached.

Resin treats subpages as independent requests, so they can be cached independent of the top-level page. Try the following, use the first expires counter example as the included page. Create a top-level page that looks like:

Example: top-level non-cached page
<% if (! session.isNew()) { %>
<h1>Welcome back!</h1>
<% } %>

<jsp:include page="expires.jsp"/>
Example: cached include page
<%@ page session="false" %>
<%! int counter; %>
<%
response.setHeader("Cache-Control", "max-age=15");
%>
Count: <%= counter++ %>

Caching Anonymous Users

The Vary header can be used to implement anonymous user caching. If a user is not logged in, he will get a cached page. If he's logged in, he'll get his own page. This feature will not work if anonymous users are assigned cookies for tracking purposes.

To make anonymous caching work, you must set the Vary: Cookie If you omit the Vary header, Resin will use the max-age to cache the same page for every user.

Example: 'Vary: Cookie' for anonymous users
<%@ page session="false" %>
<%! int _counter; %>
<%
response.addHeader("Cache-Control", "max-age=15");
response.addHeader("Vary", "Cookie");

String user = request.getParameter("user");
%>
User: <%= user %> <%= counter++ %>

The top page must still set the max-age or If-Modified header, but Resin will take care of deciding if the page is cacheable or not. If the request has any cookies, Resin will not cache it and will not use the cached page. If it has no cookies, Resin will use the cached page.

When using Vary: Cookie, user tracking cookies will make the page uncacheable even if the page is the same for all users. Resin chooses to cache or not based on the existence of any cookies in the request, whether they're used or not.

Configuration

proxy-cache

child of cluster

<proxy-cache> configures the proxy cache (requires Resin Professional). The proxy cache improves performance by caching the output of servlets, jsp and php pages. For database-heavy pages, this caching can improve performance and reduce database load by several orders of magnitude.

The proxy cache uses a combination of a memory cache and a disk-based cache to save large amounts of data with little overhead.

Management of the proxy cache uses the ProxyCacheMXBean.

<proxy-cache> Attributes
ATTRIBUTEDESCRIPTIONDEFAULT
pathPath to the persistent cache files.resin-data/
disk-sizeMaximum size of the cache saved on disk.1024M
enableEnables the proxy cache.true
enable-rangeEnables support for the HTTP Range header.true
entriesMaximum number of pages stored in the cache.8192
max-entry-sizeLargest page size allowed in the cache.1M
memory-sizeMaximum heap memory used to cache blocks.16M
rewrite-vary-as-privateRewrite Vary headers as Cache-Control: private to avoid browser and proxy-cache bugs (particularly IE).false
<proxy-cache> schema
element cache {
  disk-size?
  & enable?
  & enable-range?
  & entries?
  & path?
  & max-entry-size?
  & memory-size?
  & rewrite-vary-as-private?
}
Example: enabling proxy cache
<resin xmlns="http://caucho.com/ns/resin">
    <cluster id="web-tier">
        <proxy-cache entries="16384" disk-size="2G" memory-size="256M"/>

        <server id="a" address="192.168.0.10"/>

        <host host-name="www.foo.com">
    </cluster>
</resin>

rewrite-vary-as-private

Because not all browsers understand the Vary header, Resin can rewrite Vary to a Cache-Control: private. This rewriting will cache the page with the Vary in Resin's proxy cache, and also cache the page in the browser. Any other proxy caches, however, will not be able to cache the page.

The underlying issue is a limitation of browsers such as IE. When IE sees a Vary header it doesn't understand, it marks the page as uncacheable. Since IE only understands "Vary: User-Agent", this would mean IE would refuse to cache gzipped pages or "Vary: Cookie" pages.

With the <rewrite-vary-as-private> tag, IE will cache the page since it's rewritten as "Cache-Control: private" with no Vary at all. Resin will continue to cache the page as normal.

cache-mapping

cache-mapping assigns a max-age and Expires to a cacheable page, i.e. a page with an ETag or Last-Modified setting. It does not affect max-age or Expires cached pages. The FileServlet takes advantage of cache-mapping because it provides the ETag servlet.

Often, you want a long Expires time for a page to a browser. For example, any gif will not change for 24 hours. That keeps browsers from asking for the same gif every five seconds; that's especially important for tiny formatting gifs. However, as soon as that page or gif changes, you want the change immediately available to any new browser or to a browser using reload.

Here's how you would set the Expires to 24 hours for a gif, based on the default FileServlet.

Example: caching .gif files for 24h
<web-app xmlns="http://caucho.com/ns/resin">

  <cache-mapping url-pattern='*.gif'
                 expires='24h'/>
</web-app>

The cache-mapping automatically generates the Expires header. It only works for cacheable pages setting If-Modified or ETag. It will not affect pages explicily setting Expires or non-cacheable pages. So it's safe to create a cache-mapping for *.jsp even if only some are cacheable.

Debugging caching

When designing and testing your cached page, it's important to see how Resin is caching the page. To turn on logging for caching, you'll add the following to your resin.xml:

Example: adding caching log
<resin xmlns="http://caucho.com/ns/resin">

  <logger name="com.caucho.server.httpcache" level="fine"/>

  ...

</resin>

The output will look something like the following:

[10:18:11.369] caching: /images/caucho-white.jpg etag="AAAAPbkEyoA" length=6190
[10:18:11.377] caching: /images/logo.gif etag="AAAAOQ9zLeQ" length=571
[10:18:11.393] caching: /css/default.css etag="AAAANzMooDY" length=1665
[10:18:11.524] caching: /images/pixel.gif etag="AAAANpcE4pY" length=61

...

[10:18:49.303] using cache: /css/default.css
[10:18:49.346] using cache: /images/pixel.gif
[10:18:49.348] using cache: /images/caucho-white.jpg
[10:18:49.362] using cache: /images/logo.gif

Caching Proxy

It's important to realize that Resin's <proxy-cache> can be applied to any Resin layer and will operate on that layer locally. For example, consider a typical 3-tier architecture where a Resin "web-tier" serves static requests and load-balances dynamic requests to a resin "app-tier" which executes business logic and connects to the database. In this architecture, <proxy-cache> is typically only enabled on the web-tier, where you may choose to cache not only static content but dynamic responses from the app-tier also. What you cache is always configurable via <cache-mapping>. However the cache is still a "local" cache, in that it's caching responses that generated by the local server or have flowed through the local server.

Based on the example architecture above, some may consider Resin's <proxy-cache> a misnomer, as it's not actually proxying to another server, in the traditional sense of the term. More technically, it's a proxy in the servlet filter chain; it intercepts requests, responds with cached data, and halts the execution of the servlet filter chain before the actually business logic is executed.

HTTP Proxy

Resin includes an HTTP Proxy that acts as an intermediary for client requests to one or more backend HTTP servers. However using the HTTP Proxy does not imply caching.

It's critical to understand that <proxy-cache> and <resin:HttpProxy> are two separate distict components which can be enabled and disabled separately and used independently!

resin.xml comes packaged with a proxycache tier, which includes a preconfigured <resin:HttpProxy regexp=".*"> element, set to proxy all requests to backend HTTP servers. If <proxy-cache> is also enabled on this tier, then it will act as a caching proxy server. It is up to you to configure HttpProxy and proxy-cache as appropriate for HTTP proxying and caching independently.

Note: backend server do not have to be Resin instances; the requests are sent using standard HTTP protocol with no Resin specific attributes.

<resin:HttpProxy>

Proxies the request to a backend server using HTTP as a proxy protocol.

<resin:HttpProxy> Attributes
ATTRIBUTEDESCRIPTIONDEFAULT VALUE
regexpThe regular expression of the URL to match.none
addressaddress:port of an HTTP servernone
addressesaddress:port of HTTP servers (space separated)none
backenda <backend> with address and optional timeouts (see below)none
strategyround-robin or adaptiveadaptive
connect-timeoutthe maximum time a proxy connection to a backend should take5s
socket-timeoutthe maximum time a the proxy will wait for a read or write to a backend30s
idle-timethe maximum time the proxy will leave an idle socket before closing it (must be less than the keepalive value of the target backend)10s
warmup-timetime the proxy uses to throttle connections to a backend server that's just starting up60s
recover-timethe maximum time the proxy will consider the backend dead after a failure before retrying15s
<backend> Attributes
ATTRIBUTEDESCRIPTIONDEFAULT VALUE
addressaddress:port of an HTTP servernone
connect-timeoutthe maximum time a proxy connection to the backend should take5s
socket-timeoutthe maximum time a the proxy will wait for a read or write to the backend30s
idle-timethe maximum time the proxy will leave an idle socket before closing it (must be less than the keepalive value of the target backend)10s
warmup-timetime the proxy uses to throttle connections to the backend server that's just starting up60s
recover-timethe maximum time the proxy will consider the backend dead after a failure before retrying15s
weightassigns a load-balance weight to a backend server. Servers with higher values get more requests. Servers with lower values get fewer requests100
Proxying requests to local HTTP servers
<web-app xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin">

   <resin:HttpProxy regexp="\.cgi$">
     <strategy>adaptive</strategy>
     <backend address="127.0.0.1:12345" weight="100">
     <backend address="127.0.0.1:65432" weight="200">
   </resin:HttpProxy>

</web-app>

Administration

/resin-admin

The /resin-admin URL provides an overview of the current state of Resin.

block cache miss ratio

The block cache miss ratio tells how often Resin needs to access the disk to read a cache entry. Most cache requests should come from memory to improve performance, but cache entries are paged out to disk when the cache gets larger. It's very important to keep the <memory-size> tag of the <proxy-cache> large enough so the block cache miss ratio is small.

proxy cache miss ratio

The proxy cache miss ratio measures how often cacheable pages must go to their underlying servlet instead of being cached. The miss ratio does not measure the non-cacheable pages.

invocation miss ratio

The invocation miss ratio measures how often Resin's invocation cache misses. The invocation cache is used for both cacheable and non-cacheable pages to save the servlet and filter chain construction. A miss of the invocation cache is expensive, since it will not only execute the servlet, but also force the servlet and filter chain to be rebuilt. The <entries> field of the <proxy-cache> controls the invocation miss ratio.

BlockManagerMXBean

BlockManagerMXBean returns statistics about the block cache. Since Resin's block cache is used for the proxy cache as well as clustered sessions and JMS messages, the performance of the block cache is very important. The block cache is a memory/paging cache on top of a file-based backing store. If the block cache misses, a request needs to go to disk, otherwise it can be served directly from memory.

BlockManagerMXBean ObjectName
resin:type=BlockManager
BlockManagerMXBean.java
public interface BlockManagerMXBean {
  public long getBlockCapacity();

  public long getHitCountTotal();
  public long getMissCountTotal();
}

ProxyCacheMXBean

The ProxyCacheMXBean provides statistics about the proxy cache as well as operations to clear the cache. The hit and miss counts tell how effectively the cache is improving performance.

ProxyCacheMXBean ObjectName
resin:type=ProxyCache
ProxyCacheMXBean.java
public interface ProxyCacheMXBean {
  public long getHitCountTotal();
  public long getMissCountTotal();

  public CacheItem []getCacheableEntries(int max);
  public CacheItem []getUncacheableEntries(int max);

  public void clearCache();
  public void clearCacheByPattern(String hostRegexp, String urlRegexp);
  public void clearExpires();
}

Copyright © 1998-2012 Caucho Technology, Inc. All rights reserved. Resin ® is a registered trademark. Quercustm, and Hessiantm are trademarks of Caucho Technology.

Cloud-optimized Resin Server is a Java EE certified Java Application Server, and Web Server, and Distributed Cache Server (Memcached).
Leading companies worldwide with demand for reliability and high performance web applications including SalesForce.com, CNET, DZone and many more are powered by Resin.

home company blog wiki docs 
app server web server 
health cloud java ee pro