
Publishing Data Bases on the Web

warning: this page needs to be completed with detailed info on how to set up and maintain DBs (Oracle and Fulcrum), and with references to the use of WebLogic.

Information stored in data bases is being published on web servers more and more often. The main advantage of using data bases to publish information on the web is the possibility to create information dynamically, in a way that is better suited to the needs of individual users. Sometimes, however, the advantages do not outweigh the overhead (development, maintenance, system and network resources). In other cases using data bases is the best, or the only possible, choice due to the complexity of the information.

An important factor when choosing between dynamic data, generated on the fly, and static data is the visibility of your data. Data that is generated on the fly will not be indexed by (most) search engines and will therefore not be visible to users trying to locate information using those search engines.

Two types of data bases can be published: Oracle and Fulcrum DBs.
At this moment we are running Oracle 8i and Fulcrum SearchServer V4, both on dedicated servers. A more complete description of our environment can be found in the chapter on configuration related issues. Please note that all data that are to be published through one of the web servers at Admin-DI-DC-D must reside at the Admin-DI-DC premises.

We strongly urge you to contact the Admin-DI-DC-D web team and DG PRESS before starting new developments. It can save you a lot of future trouble and resources (external contractors do not rewrite applications for free, do they?).

Preferred architecture

A multi-tiered architecture has been put in place to allow the publication of database contents on EUROPA and IntraComm:

[Figure: standard architecture for publishing database contents]

All access to the EUROPA and IntraComm contents is transparently channeled through reverse proxy servers. The reverse proxy servers map incoming HTTP requests to the appropriate web servers. This is described in more detail in a dedicated chapter.
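The mapping step can be illustrated with a minimal sketch. It is not the configuration of our proxy servers: the host names, the URL paths and the longest-prefix rule are all assumptions chosen for the illustration.

```python
# Hypothetical routing table: public URL path prefix -> internal web server.
# None of these names come from the real proxy configuration.
ROUTES = {
    "/": "http://contents.internal.example",
    "/comm/example_db/": "http://dbweb.internal.example/database/",
}


def route(path):
    """Return the internal URL for a public path (which must start with "/").

    The longest matching prefix wins, so a specific mapping overrides
    the catch-all "/" entry.
    """
    prefix = max((p for p in ROUTES if path.startswith(p)), key=len)
    return ROUTES[prefix].rstrip("/") + "/" + path[len(prefix):]


static_page = route("/index_en.htm")
db_script = route("/comm/example_db/cgi-bin/search.pl")
```

In this sketch an ordinary page goes to the main contents server, while anything under the assumed path /comm/example_db/ is forwarded to a different server and a different path. The browser only ever sees the public path.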

In our standard architecture most of the requests will be mapped to the main contents web servers. Each of these servers has been configured to run with a ColdFusion Enterprise server that is located on the same platform.
Oracle DBs for which a 'data source' has been defined in the ColdFusion servers can be accessed using the standard native Oracle drivers.

The described architecture should be followed when possible. Because the ColdFusion applications reside on the same platforms as the 'static' contents we believe this model allows for easier maintenance of the application logic, and for easier integration with the related web pages and images.

Unfortunately, access to Fulcrum DBs doesn't seem to fit in this picture at this moment. Access to Fulcrum DBs via ColdFusion and ODBC is being investigated, and will be integrated into this architecture when possible.

Alternative solutions are available or can be investigated if necessary.

How to set up a new data base

Here we will describe the procedure to get a new data base published on EUROPA or IntraComm within the boundaries of the described 'preferred architecture'.

Notify web site management

The first step, as with all data that are being published on EUROPA, is to contact the EUROPA team (DG Press). We (Data Centre) only make data bases available through the web servers, both in staging and production, after acceptance by the EUROPA team. For IntraComm the same applies: contact the IntraComm team (DG Admin) first.
They will provide you with, or agree with you on, a URL path under which the data base will be published.

When this is done you can contact the Data Centre webteam (dcdiffusion.europa@cec.eu.int) via e-mail. This e-mail message should contain a description of the application: the URL path provided by the web site managers, disk space required, contact person(s), and other useful information.

ColdFusion application

As indicated in the chapter on using ColdFusion, the cfm pages can be uploaded to the staging server using the existing FTP access. If necessary a dedicated FTP userid can be created.

The purpose of the staging server is to allow verification of web contents before they are copied onto the public web servers. Here you can verify links and ColdFusion pages in an environment that is as close as possible to the real, publicly accessible, environment.
The staging server allows the editorial committee to view new contents before publication and verify conformance to the respective information provider guides. After acceptance by the editorial committee, the web site managers (the EUROPA and IntraComm teams respectively) will request that we (Data Centre webteam) make the contents available to the public.

Oracle DB creation

information to be added

Alternative architecture

For cases where it is absolutely impossible to apply the preferred architecture we propose to use dedicated web servers.

[Figure: alternative architecture for publishing database contents]

The HTTP daemon can be an iPlanet Web server. For the CGI scripts Perl V5.6 can be used with the DBI and DBD modules.

Where possible, the related static files (HTML pages and images) should be located on the main contents web server.

As indicated before, the database web servers are accessed via reverse proxy servers. This will have an influence on how the web interface to a database will need to be programmed.

In order to reduce integration problems with the rest of the web site we propose to set up each database environment and web interface so that it is located completely under one URL path. An example with (almost) real configuration statements for our iPlanet Enterprise and Netscape Proxy servers might clarify this:

The most important consequence is that you cannot use hyperlinks like "/database/cgi-bin/script.pl" in your pages. A link like this would only work on the database web server itself, not on the reverse proxy server.
In other words: use relative links, or it might not work.

To simplify things for those 'programmers' that have problems with relative links we might consider using the same URL path on both the database web server and the reverse proxy server. However, this is not always possible as the relative location of the database to the web site might change.
In other words: use relative links, or it might stop working later.
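The effect can be reproduced with a small sketch that resolves a link the way a browser would and then checks whether the reverse proxy has a mapping for the result. The host name and both public URL paths are hypothetical; only "/database/cgi-bin/script.pl" is taken from the text above.

```python
from urllib.parse import urljoin, urlsplit

# Path of the application on the database web server (from the example link above).
BACKEND_PREFIX = "/database/"


def backend_path(public_prefix, page_url, href):
    """Resolve href relative to page_url, then apply the proxy mapping.

    Returns the path requested on the database web server, or None when
    the resolved link falls outside the mapped public prefix.
    """
    target = urlsplit(urljoin(page_url, href)).path
    if target.startswith(public_prefix):
        return BACKEND_PREFIX + target[len(public_prefix):]
    return None


# Assumed public locations of the same application, before and after a move.
old_prefix = "/comm/example_db/"
new_prefix = "/comm/dg_example/db/"
old_page = "http://proxy.example" + old_prefix + "index.html"
new_page = "http://proxy.example" + new_prefix + "index.html"

absolute_old = backend_path(old_prefix, old_page, "/database/cgi-bin/script.pl")
relative_old = backend_path(old_prefix, old_page, "cgi-bin/script.pl")
relative_new = backend_path(new_prefix, new_page, "cgi-bin/script.pl")
```

Under these assumptions the absolute link has no mapping on the proxy (absolute_old is None), while the relative link reaches /database/cgi-bin/script.pl both before and after the public path changes.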

Testing does help

Some developers have major problems understanding how to work with reverse proxy servers. A few times this lack of understanding has led to the publication of a new application on our web servers being postponed. One way to avoid running into these problems when going live is to reserve some time to try the application in our staging environment first. Sounds obvious, doesn't it?
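Part of this check can be done before uploading anything. The following is a rough sketch, not a tool provided by the Data Centre: it scans a page for href, src and action attributes that start with "/", which are the links most likely to break behind a reverse proxy. The sample page is invented for the illustration.

```python
from html.parser import HTMLParser

# Attributes that usually carry a link to another resource.
LINK_ATTRS = {"href", "src", "action"}


class AbsolutePathFinder(HTMLParser):
    """Collect (tag, attribute, value) for every link that starts with "/"."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in LINK_ATTRS and value and value.startswith("/"):
                self.found.append((tag, name, value))


def find_absolute_paths(html):
    """Return the server-absolute links found in an HTML string."""
    finder = AbsolutePathFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# Invented sample page: one absolute link, two relative ones.
sample = (
    '<a href="/database/cgi-bin/script.pl">search</a>'
    '<a href="cgi-bin/script.pl">search</a>'
    '<img src="../images/logo.gif">'
)
problems = find_absolute_paths(sample)
```

For the sample page only the first link is reported. Links generated at run time by CGI scripts or ColdFusion pages are not covered by a static scan like this, so it complements a test on the staging server rather than replacing it.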

Data Centre