When a client makes a request to the web site, the request goes to the proxy server. The proxy server then sends the client's request [...] to the content server. The content server passes the result [...] back to the proxy. The proxy sends the retrieved information to the client as if the proxy were the actual content server.

In RFC 2616, on HTTP/1.1, a 'gateway', what we call a 'reverse proxy server', is defined as follows:

A server which acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway.

The following figure shows how the EUROPA reverse proxy works:
![a reverse proxy server appears to be the real content server](https://intracomm.ec.europa.eu/Publishing/images/revpxya1.gif)
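
The request flow described above can be sketched in a few lines of Python. This is only an illustration, not the configuration of the EUROPA proxy: the host names `proxy.example` and `content.internal.example`, the `Response` class and the document table are all hypothetical, and no real network traffic is involved.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str
    server: str  # the host the client believes has answered


# Hypothetical content server: a simple path -> document table.
CONTENT_SERVER = {"/index.htm": "<h1>Welcome</h1>"}


def content_server_handle(path: str) -> Response:
    """The content server answers the proxy, never the client directly."""
    if path in CONTENT_SERVER:
        return Response(200, CONTENT_SERVER[path], server="content.internal.example")
    return Response(404, "not found", server="content.internal.example")


def reverse_proxy_handle(path: str, public_name: str = "proxy.example") -> Response:
    """Forward the request, then present the result under the proxy's own name."""
    upstream = content_server_handle(path)
    return Response(upstream.status, upstream.body, server=public_name)
```

A client calling `reverse_proxy_handle("/index.htm")` gets the document back with `server` set to the proxy's name, so it has no way of telling that a separate content server produced it.
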
The overhead generated by reverse proxy servers can be justified as follows:
- the proxy server caches the responses to all requests for static data (HTML pages, GIF files, ...) and tries to serve subsequent requests from its fast cache.
- one document can be made available on several sites without the need to run several physical copies (example: the same IDEA database is accessible via IntraComm and EUROPA).
- the 'split DNS' and 'round robin DNS' techniques allow us to balance the load over several servers and networks.
- access to the same data can be restricted differently depending on the source of the request (example: Internet users might need to identify themselves in order to access a certain document while users located on the Intranet have free access to it).
- the data that is to be published can reside anywhere and can be moved around without any visible changes for the public.
- most of the data that is to be published is physically located on the Intranet, behind the firewall which is set up to allow only HTTP connections between the reverse proxy server and the content servers, while still allowing protocols like FTP to be used to maintain the contents of web sites.
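
As an illustration of the source-dependent access restrictions mentioned in the list above, here is a minimal Python sketch. The Intranet address range and the restricted path prefix are assumptions made up for the example; the real values are not given in this document, and the actual proxies are configured rather than programmed this way.

```python
import ipaddress

# Hypothetical Intranet address range (assumption, not the real network).
INTRANET = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical path prefixes that Internet users may only read after
# identifying themselves.
RESTRICTED_PREFIXES = ("/restricted/",)


def needs_identification(client_ip: str, path: str) -> bool:
    """Return True when the client must identify itself before getting `path`."""
    if ipaddress.ip_address(client_ip) in INTRANET:
        return False  # Intranet users have free access
    return path.startswith(RESTRICTED_PREFIXES)
```

The same document is thus served from a single copy, while the decision to ask for credentials depends only on where the request comes from.
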
Please note that the reverse proxy servers EUROPA and IntraComm are only used as "stand-ins" for content servers that are physically located at the Data Centre. We do not make content servers located elsewhere available through these reverse proxy servers because we cannot guarantee that these foreign servers are available at any given time.
The use of reverse proxies does generate some overhead when it comes to creating pages and CGI scripts for our sites. We must therefore insist that you:
use relative links
Allow me to repeat this, because there are quite a few people out there that didn't seem to get this message: USE RELATIVE LINKS

A link should never contain a host name or an IP address, and most certainly not a locally defined machine name or a test server. There is only one reason to include a host name or an IP address in a link: when you point to another web site.

When you need to point back to the home page of the current site you might use something like: <a href="/">home</a>.
Try as much as possible to make all links relative to
the address of the current page. This will make it
possible to move your set of pages to somewhere else in
the case of a reorganization of the web site. Very
often sites are developed with one location in mind but
at the time they are made available to the public this
location might have changed.
Another reason for insisting on the strict application
of relative links is that your document is published,
or might be published in the future, on two or more of
our sites. Our web sites have different structures
which makes it impossible in most cases to use the same
URL path for your document on every site.
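
If you generate links from a script, a relative link can be computed from the two page paths. The helper below is a small Python sketch; the function name `relative_link` and the example paths are hypothetical, and both arguments are assumed to be site-absolute paths starting with "/".

```python
import posixpath


def relative_link(from_page: str, to_page: str) -> str:
    """Return the link to `to_page` relative to the directory of `from_page`."""
    start = posixpath.dirname(from_page)
    return posixpath.relpath(to_page, start=start)
```

For a page "/dept/index.htm" pointing to "/dept/reports/2001.htm" the helper returns "reports/2001.htm", and it returns the same link after both pages have been moved under, say, "/archive". That is the property that makes a set of pages relocatable.
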
There are three ways to reference a page "/some/path/page.htm" from a page "/some/page.htm":
- absolute links including host name and, possibly, port number: not to be used. In our example: <a href="http://www.cc.cec/some/path/page.htm"> will not work when the page is accessed via the secure interface to IntraComm (https://intracomm.cec.eu.int),
- absolute links: can be used, but this limits the portability of the document. In our example: <a href="/some/path/page.htm">,
- relative link: the preferred method. In our example: <a href="path/page.htm">.
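
The difference between the three forms can be checked with Python's standard `urljoin`, which resolves a link against the address of the current page in the same way a browser does. This is a sketch for illustration: the "/archive" location is a hypothetical relocation of the pages, not an existing path.

```python
from urllib.parse import urljoin

LINKS = {
    "absolute with host": "http://www.cc.cec/some/path/page.htm",
    "absolute path": "/some/path/page.htm",
    "relative": "path/page.htm",
}


def resolve_all(page_url: str) -> dict:
    """Resolve each link form for a page located at `page_url`."""
    return {kind: urljoin(page_url, href) for kind, href in LINKS.items()}


# The page as seen through the secure interface to IntraComm.
via_secure = resolve_all("https://intracomm.cec.eu.int/some/page.htm")

# Hypothetical relocation of the same set of pages under "/archive".
after_move = resolve_all("https://intracomm.cec.eu.int/archive/some/page.htm")
```

In `via_secure`, the link with a host name still points to http://www.cc.cec and so leaves the secure interface, while the other two stay on https://intracomm.cec.eu.int. In `after_move`, the absolute path still points to the old "/some/path/page.htm", and only the relative link follows the pages to their new location.
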
When accessing via https://intracomm.cec.eu.int you would see: "by http://intracomm.cec.eu.int:443 (iPlanet-Web-Proxy-Server/3.6), by http://www.cc.cec:80 (Netscape-Proxy/3.52)".
Use this variable instead of "HTTP_HOST" or "SERVER_NAME", because those variables will contain the host name and port number of the web server, which in most cases will not be directly accessible to the public. This can be used in case you need to generate an HTML "base" tag. For an example, check out the demo script used in the chapter on CGI scripts and programs.
"home" content server versus other content servers
Every one of our web sites consists of at least a
reverse proxy server and a "home" content server. The
"home" content server contains mostly static web
pages, like the home page of the site, ColdFusion
pages, and possibly CGI scripts.
There are a few things that will be different
depending on where your pages and CGI scripts will be
located.
The reverse proxy servers are programmed to map requested URLs by default to the corresponding "home" content server.
Most applications that generate web pages dynamically,
and that are not using the ColdFusion server that is
attached to the "home" content server, will be located
on content servers other than the "home" content
server. As a rule, all the references within an
application will have the same "root". In both proxy
servers we only defined URL mappings that map all
requests starting with "/idea" to
"http://[databasehost:port]/idea", and
back.
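
The mapping logic described above can be sketched as follows in Python. This is not the configuration syntax of the proxy servers, only an illustration of the idea. "[databasehost:port]" is the placeholder used in the text, kept as-is; the `HOME` host name is hypothetical; and matching "/idea" only on a path segment boundary is an assumption of this sketch.

```python
# Hypothetical name for the "home" content server.
HOME = "http://home-content-server.example"

# URL path prefix -> content server base URL (placeholder host from the text).
MAPPINGS = {"/idea": "http://[databasehost:port]/idea"}


def map_request(path: str) -> str:
    """Map a public URL path to the content server URL that serves it."""
    for prefix, target in MAPPINGS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return target + path[len(prefix):]
    return HOME + path  # default: the "home" content server


def map_back(internal_url: str) -> str:
    """Reverse mapping, for URLs that the content server sends back."""
    for prefix, target in MAPPINGS.items():
        if internal_url == target or internal_url.startswith(target + "/"):
            return prefix + internal_url[len(target):]
    if internal_url.startswith(HOME):
        return internal_url[len(HOME):] or "/"
    return internal_url  # not one of our content servers: leave untouched
```

With a single prefix per application, one mapping entry is enough for all of its pages, which is why every reference within the application needs to share the same "root".
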
Here again, we must insist on the use of relative
links. Very often applications are developed with one
URL path in mind but in the end, for one reason or
another, a different URL path is chosen. So, in order to
save yourself a lot of future trouble and extra
expense, make sure that the web application or web
site that you are about to develop uses relative
links. (See the preceding chapter.)
This problem with coordinating URLs and mappings does not exist for CFM pages that are located on the "home" content server, because these pages are located right next to the related HTML pages and images. We therefore recommend using ColdFusion applications, where possible, to access Oracle databases, rather than creating an application that runs behind a separate web server.