Overview
For Jetspeed to support subscription to and publication of remote content
(XML), that content must be buffered locally. The DiskCache mechanism
handles fetching content and ensuring that it is kept up to date. All
Jetspeed code should use this mechanism to ensure performance and
reliability.
Operation
The disk cache downloads remote URLs and stores them as files on the
local filesystem. Local URLs are served from the webserver. Static URLs
served by the webserver perform better than URLs served from dynamic
content engines (PHP, ASP, JSP, Servlet, etc.).
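As a rough illustration of the split between local and remote URLs, the sketch below classifies a URL by host name and derives a file path for a remote URL inside a cache directory. The class name CachePathSketch, the host comparison, and the SHA-256 naming scheme are all assumptions made for this example; they do not describe how the DiskCache actually names its files.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not part of the Jetspeed API. */
final class CachePathSketch {

    /** Assumption: a URL is "local" when its host matches the portal's own host name. */
    static boolean isLocal(URI url, String portalHost) {
        String host = url.getHost();
        return host != null && host.equalsIgnoreCase(portalHost);
    }

    /** Derives a stable file name for a remote URL by hashing the URL text with SHA-256. */
    static Path cacheFileFor(Path cacheDir, URI url) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(url.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder name = new StringBuilder();
            for (byte b : hash) {
                name.append(String.format("%02x", b)); // two hex digits per byte
            }
            return cacheDir.resolve(name.toString());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Hashing the URL avoids problems with characters that are not valid in file names, at the cost of file names that are not human readable.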
Strategy
Jetspeed employs numerous strategies for ensuring that content is
continually updated:
-
DiskCacheDaemon: Goes over every entry within the cache at a set interval
(the frequency is controlled by the daemon.diskcachedaemon.interval
property; you can also start this process manually using the DaemonAdminPortlet).
If an entry is found to be out of date (either by HTTP header info or by a
MAX interval), it is fetched and the updated version is placed within the cache.
All Portlets and code that use this URL will be updated. An external URL entry that
returns an "Expires:" header will use the returned value. A URL entry that specifies
"0" or does not have an "Expires:" header will be expired based on the
cache.default.expiration property (every 15 minutes by default).
-
Request Trigger: If Jetspeed notices that a URL was requested but is not
within the cache, it will throw an IOException which states that this URL
is currently not available. The DiskCache will then asynchronously update
itself so that subsequent requests will have this URL available.
-
Asynchronous Bulk Update: The OCS support in Jetspeed will bulk update
the cache with remote content. It is a fair assumption that if OCS uses
this content, it will be needed within a Portlet at some time in the
future.
-
OCS interval monitoring: If an OCS entry specifies that it should be
updated at a given interval, Jetspeed will honor this and fetch the
updated content.
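The expiration rule used by the DiskCacheDaemon can be sketched as follows. This is a minimal illustration and not the Jetspeed implementation: the class ExpirationSketch and its method names are hypothetical, and the header value is assumed to arrive as epoch milliseconds with 0 meaning "absent or zero", which is the convention java.net.URLConnection.getExpiration() follows.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of the Jetspeed API. */
final class ExpirationSketch {

    /** Assumed fallback, mirroring the 15 minute default of cache.default.expiration. */
    static final Duration DEFAULT_EXPIRATION = Duration.ofMinutes(15);

    /**
     * Computes when a cache entry should expire.
     *
     * @param fetchedAt           when the entry was last fetched
     * @param expiresHeaderMillis "Expires:" value in epoch milliseconds, or 0 if the
     *                            header is missing or specifies "0"
     * @param defaultExpiration   fallback lifetime when no usable header exists
     */
    static Instant expiresAt(Instant fetchedAt, long expiresHeaderMillis, Duration defaultExpiration) {
        if (expiresHeaderMillis <= 0) {
            // No usable header: fall back to the configured default lifetime.
            return fetchedAt.plus(defaultExpiration);
        }
        // The server supplied an expiration time: use it as returned.
        return Instant.ofEpochMilli(expiresHeaderMillis);
    }

    /** An entry is out of date once the current time reaches its expiration time. */
    static boolean isExpired(Instant expiresAt, Instant now) {
        return !now.isBefore(expiresAt);
    }
}
```

A daemon pass would call isExpired for each entry and refetch only those for which it returns true.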
There are also some performance features that ensure that Jetspeed NEVER wastes
CPU time during content updates:
-
One update per interval: Jetspeed will NEVER update content more than once
per interval. Otherwise, more than one resource could trigger a content
fetch, which would waste both CPU and bandwidth.
-
Bad URL monitoring: Jetspeed keeps track of URLs that it can't fetch. If
it is ever restarted, it will reload this list so that time isn't wasted on
failed content.
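Both performance features can be illustrated with a small guard object that decides whether a fetch may start. This is a sketch under stated assumptions: the class UpdateGuardSketch, its methods, and the one-URL-per-line file format for the bad URL list are invented for this example and are not taken from the Jetspeed sources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration; not part of the Jetspeed API. */
final class UpdateGuardSketch {
    private final Duration interval;
    private final Map<String, Instant> lastUpdate = new ConcurrentHashMap<>();
    private final Set<String> badUrls = ConcurrentHashMap.newKeySet();

    UpdateGuardSketch(Duration interval) {
        this.interval = interval;
    }

    /** Grants at most one update per interval for a URL; known bad URLs are always refused. */
    boolean tryBeginUpdate(String url, Instant now) {
        if (badUrls.contains(url)) {
            return false;
        }
        boolean[] granted = {false};
        // compute() runs atomically per key, so concurrent triggers cannot both win.
        lastUpdate.compute(url, (key, previous) -> {
            if (previous == null || !now.isBefore(previous.plus(interval))) {
                granted[0] = true;
                return now;
            }
            return previous;
        });
        return granted[0];
    }

    /** Records a URL that could not be fetched. */
    void markBad(String url) {
        badUrls.add(url);
    }

    /** Writes the bad URL list to a file, one URL per line (assumed format). */
    void saveBadUrls(Path file) {
        try {
            Files.write(file, badUrls);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reloads a previously saved bad URL list, for example after a restart. */
    void loadBadUrls(Path file) {
        try {
            if (Files.exists(file)) {
                badUrls.addAll(Files.readAllLines(file));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would invoke tryBeginUpdate before every fetch and skip the fetch when it returns false, so that several triggers arriving within one interval result in a single download.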