
=Introduction=

This document defines the architecture for the Lampadas Document Management System.
It is intended for developers working on the Lampadas System.


=Module List=

The following modules comprise the system:

*Config

This module parses the configuration file and provides those settings to the rest
of the system. It can be used by any other module.

*Log

This module writes to the system log file. It can be used by any other module.

*Database

This module accesses the back-end RDBMS, such as PostgreSQL or MySQL. It is to be
used only by DataLayer.

*DataLayer

This module provides an object hierarchy to access all of the data in the
database. This is the core module upon which all additional functionality is built.

*Lintadas

The Lintadas module is named after the Lintian system, which Debian uses to
check packages automatically for validity. Both names refer to the program lint, which
checks C source code for suspicious constructs.

Lintadas performs a series of checks on the source files which make up each document.
For example, it checks XML files to be sure they are valid XML, and it checks whether
the document contains stale WWW links. Any errors it finds are stored in the database,
where system administrators, authors, and editors can manage them.

It also performs checks on the data in the DataLayer module, and therefore the data in
the back-end RDBMS. For example, it identifies when referenced files do not exist in
the filestore.
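
The two kinds of checks described above can be sketched as follows. This is only an
illustration, not the Lintadas implementation: the function name check_source_file and
the error strings are hypothetical, and the standard-library parser used here tests
only that the XML is well-formed, not that it is valid against a DTD.

```python
import os
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def check_source_file(path):
    """Return a list of error strings for one source file (empty if clean).

    Hypothetical helper: the real module would store these errors in the
    database through the DataLayer rather than return them.
    """
    # Data check: the referenced file must exist in the filestore.
    if not os.path.isfile(path):
        return ['file not found in filestore: ' + path]
    # Source check: XML files must at least be well-formed.
    if path.lower().endswith('.xml'):
        try:
            xml.dom.minidom.parse(path)
        except ExpatError as err:
            return ['not well-formed XML: %s' % err]
    return []
```

Returning a list rather than raising an exception lets the caller collect every error
for a document before recording them.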

*AutoMirror

The AutoMirror module automatically and selectively mirrors articles from Wikipedia and
other external repositories. This is more than copying a single article: AutoMirror
knows how to select the correct individual pages and parse them into one of the source
formats the system knows how to convert.

This is the module where any special-case processing should be implemented, for example
parsing a Wiki binary file to extract the WikiText itself.

It is anticipated that other types of source repositories will be supported in the
future, but currently support is available only for Wiki articles.

*Mirror

The Mirror module knows how to download documents which reside in external repositories,
like the GNU manuals, and store them in the local filesystem. It uses data in the
database to know which documents to mirror, and updates the database so information
about these mirrored files is available to the Converter module.

Mirror only copies and stores documents. If more sophisticated types of processing
are required, such as a Wiki article which must be parsed out of a binary data file,
then the AutoMirror module handles it.

Mirror is intended to download files via HTTP or FTP.
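
A minimal sketch of the copy-and-store step might look like this. The function name
mirror_file, its parameters, and the 'index.html' fallback filename are assumptions
made for illustration. The sketch also leaves out the database lookup and update that
the real module performs.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def mirror_file(url, dest_dir, opener=urlopen):
    """Copy the document at url into dest_dir and return the local path.

    opener is a hypothetical hook so the download can be replaced in tests.
    """
    parts = urlparse(url)
    # The text names only HTTP and FTP, so other schemes are rejected here.
    if parts.scheme not in ('http', 'ftp'):
        raise ValueError('unsupported protocol: %r' % parts.scheme)
    # Assumed fallback name for URLs that end in a slash.
    filename = os.path.basename(parts.path) or 'index.html'
    local_path = os.path.join(dest_dir, filename)
    # Copy the bytes unchanged; any parsing belongs in AutoMirror.
    with opener(url) as response, open(local_path, 'wb') as out:
        shutil.copyfileobj(response, out)
    return local_path
```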

*HTML

The HTML module generates HTML primitives, such as comboboxes, that are used to
make up the web pages. HTML generation methods require the caller to specify the
desired language as a two-character ISO code.

It also generates complete web pages for serving over HTTP to web browsers.
The content of the pages themselves is held in the database, to support online editing of
page content by system administrators, and also to support internationalization.

There is a templating system for the web pages and their text strings, which is
explained in the Strings section.
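
As an illustration of an HTML primitive that takes a language code, here is a sketch
of a combobox generator. The function name combobox, its signature, and the shape of
the options dictionary are assumptions. In the real system the labels would come from
the DataLayer rather than from a literal dictionary.

```python
from html import escape


def combobox(name, options, selected=None, lang='EN'):
    """Build an HTML <select> element with labels in the requested language.

    options maps each option value to a dictionary of labels keyed by
    two-character language code (a hypothetical structure).
    """
    lines = ['<select name="%s">' % escape(name)]
    for value, labels in options.items():
        mark = ' selected' if value == selected else ''
        lines.append('<option value="%s"%s>%s</option>'
                     % (escape(value), mark, escape(labels[lang])))
    lines.append('</select>')
    return '\n'.join(lines)
```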

*Converter

The Converter module knows how to take documents in various formats and convert them
into the Lampadas standard output format, DocBook XML.


=Module Dependencies=

<programlisting>
Website----HTML----------+
                         |
           Converter-----+
                         |
           AutoMirror----+
                         |
           Mirror--------+
                         |
           Lintadas---DataLayer-----Database
</programlisting>

The Log and Config modules can be freely referenced and used by any other module, so
to keep things simple they have been omitted from this diagram.
The other modules, however, have distinct relationships that must be understood and
followed.

The highly modular structure of the system was carefully designed to isolate
functionality. This makes the system easier to extend and more flexible.

This diagram shows the relationships between the core modules that make up Lampadas.
As the diagram indicates, the DataLayer is the key module. It provides access to
the underlying database through an object hierarchy. Additional modules are built
on top of the DataLayer which implement specific functionality. Each of these can
be considered an "extension" of the system, and any number of them could be constructed.
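
One way to make these relationships checkable is to write them down as data. The table
below is one reading of the diagram, in which every module on the vertical line depends
directly on DataLayer. The names ALLOWED, SHARED, and may_use are hypothetical and do
not exist in the code base.

```python
# Allowed direct dependencies, transcribed from the diagram (one reading of it).
ALLOWED = {
    'Website': {'HTML'},
    'HTML': {'DataLayer'},
    'Converter': {'DataLayer'},
    'AutoMirror': {'DataLayer'},
    'Mirror': {'DataLayer'},
    'Lintadas': {'DataLayer'},
    'DataLayer': {'Database'},
    'Database': set(),
}

# Log and Config may be used by any module, so they are kept out of the table.
SHARED = {'Log', 'Config'}


def may_use(module, dependency):
    """Return True if module is allowed to use dependency directly."""
    return dependency in SHARED or dependency in ALLOWED.get(module, set())
```

For example, may_use('Mirror', 'Database') is false, which reflects the rule that only
DataLayer talks to the Database module.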


=Internationalization|i18n=

Internationalization, or i18n for short, means support for multiple languages.
Localization, or l10n for short, is the process of adding the translations for a particular
language.

Lampadas fully supports i18n by storing strings in the database in Unicode. The DataLayer
module does not perform any localization, but returns in its data sets all possible
translations and strings. Objects which have multiple translated strings will have a
child object called I18n, which is a dictionary object. That dictionary contains an object
for each translation that is available. Each of those objects has a property for each
string the parent object needs. For example, a Class object has the following structure:

Class['HOWTO'].I18n['EN'].Name = 'HOWTO'
Class['HOWTO'].I18n['EN'].Description = 'HOWTO do something in Linux'
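
The structure above can be modelled in a few lines of Python. This is a sketch only:
the class names DocClass and ClassI18n, the localize helper, and the French
translation are hypothetical. It shows the division of labour described here, where
the data layer returns every translation and the caller chooses one.

```python
class ClassI18n:
    """Translated strings for one Class in one language."""

    def __init__(self, name, description):
        self.Name = name
        self.Description = description


class DocClass:
    """Stand-in for a DataLayer Class object."""

    def __init__(self):
        # Maps a language code to the strings for that language.
        self.I18n = {}


Class = {'HOWTO': DocClass()}
Class['HOWTO'].I18n['EN'] = ClassI18n('HOWTO', 'HOWTO do something in Linux')
# Hypothetical translation, for illustration only.
Class['HOWTO'].I18n['FR'] = ClassI18n('HOWTO', 'Comment faire quelque chose sous Linux')


def localize(obj, lang, fallback='EN'):
    """Pick one translation; the caller, not the DataLayer, makes this choice."""
    if lang in obj.I18n:
        return obj.I18n[lang]
    return obj.I18n[fallback]
```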


=Strings=

The string and string_i18n tables hold localized strings that are displayed on the web
page and elsewhere. Each string is named with a short text name, and the names use the
following conventions:

tpl-*       a page template
pg-*        a page for the website
mi-*        a menu item

We may need to go with something more sophisticated, such as a separate table for pages
and templates, but we'll see about that as development goes along.

When a string contains a name delimited with pipe characters, like "|header|", the
system replaces that token with the string of that name. This is how pages are built
from page templates. Nesting can go to any depth, but a string cannot include itself,
directly or indirectly, because the expansion would never terminate.
If you need to use an actual pipe character, escape it like this: "\|".
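
The substitution rules can be sketched as a small recursive function. This is not the
actual implementation: the names TOKEN and expand are hypothetical, the strings are
held in a plain dictionary instead of the string tables, and the sketch assumes that
string names consist of letters, digits, hyphens, and underscores.

```python
import re

# Matches either the escape sequence \| or a token such as |header|.
TOKEN = re.compile(r'\\\||\|([A-Za-z0-9_-]+)\|')


def expand(name, strings, _stack=()):
    """Return the named string with all |tokens| expanded.

    _stack holds the names currently being expanded, so a string that
    includes itself, directly or indirectly, is reported as an error.
    """
    if name in _stack:
        chain = ' -> '.join(_stack + (name,))
        raise ValueError('recursive string reference: ' + chain)

    def replace(match):
        if match.group(1) is None:
            # The escape sequence \| becomes a literal pipe character.
            return '|'
        return expand(match.group(1), strings, _stack + (name,))

    return TOKEN.sub(replace, strings[name])
```

With strings such as {'tpl-default': '<html>|header||body|</html>'}, calling
expand('tpl-default', strings) substitutes the header and body strings in turn.
Tracking the chain of names is one simple way to enforce the no-recursion rule. A
depth limit would also work but would give a less helpful error message.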