Question:
I am running a small number of web sites with Apache on Red Hat Linux
and I have what should be a simple problem. The pages are simple HTML
documents and I want Apache to direct browsers to refresh the
pages (or expire their locally cached copies) after 3 hours. Right
now, when we update a page on somebody's site, the people who have
visited the site earlier don't see the update unless they force their
browser to refresh. This is a serious problem given that some of
these pages need to change every couple of days and most browsers
cache content for 15 days or more. On IIS you can establish this
setting by going to the 'HTTP Headers' configuration tab and setting
the appropriate content expiration settings. I can't seem to find the
equivalent settings for httpd.conf (or any configuration file) in
Apache.
Answer:
Do you really want that? Browsers send an If-Modified-Since request at
least once per session to check whether the page has been modified,
unless they are explicitly configured not to.
If so, you can use the "Expires*" directives. But be aware that storing
pages which have expired is considered unreasonable by some search engines.
I'd recommend visiting the excellent caching tutorial at
http://www.mnot.net/cache_docs/
and also follow the link to the "cacheability engine" where you can
test the properties of your own pages.
There's something wrong with your report here. I'd find it hard to
believe that a page that has been changed within the last 3 days would
be cached for 15 days - that's a caching strategy that's aggressive
beyond all reason. Even the infamous old AOL cache didn't cache
volatile pages for more than 24 hours, as I understand it.
On the other hand if a page has been unchanged for 12 months then
it wouldn't be surprising if cached copies were used for a day or two
thereafter.
The hard part is deciding what's "appropriate". And don't forget that
the client also gets a choice (typical browsers have some kind of
configuration option, e.g. to check unexpired pages for an update at
every access, or once per session, or not at all), at their sole
discretion. By trying too aggressively to expire pages, it's possible
to make a web site so depressingly slow to access that people will
wander off somewhere else.
Hmmm? It's not as if Apache comes without any documentation - I would
have thought that mod_expires and the Expires* directives were rather
easy to find in the documentation. There are a couple of other ways
to do it as well, but if there are no other constraints, I think
you'd find the mod_expires features the easiest to use.
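
For what it's worth, a minimal sketch of the mod_expires approach for the
3-hour case in the question might look like the following. It assumes
mod_expires is already loaded and that the pages are served as text/html;
where the block belongs (httpd.conf, a <VirtualHost> section, or a
.htaccess file) depends on your setup. The commented-out part is one
possible alternative using mod_headers, also an assumption about what is
loaded on your server.

```apache
<IfModule mod_expires.c>
    # Enable generation of Expires and Cache-Control: max-age headers
    ExpiresActive On
    # HTML pages become stale 3 hours after the client fetched them
    ExpiresByType text/html "access plus 3 hours"
</IfModule>

# Alternative (assumes mod_headers is loaded): set the header directly.
# 10800 seconds = 3 hours.
# <FilesMatch "\.html?$">
#     Header set Cache-Control "max-age=10800"
# </FilesMatch>
```

After reloading Apache, the response headers for an HTML page should carry
an Expires date roughly 3 hours ahead along with Cache-Control:
max-age=10800. The cacheability checker mentioned above is one way to
confirm that.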