<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
<TITLE>Apache Overview HOWTO: XML projects</TITLE>
<LINK HREF="Apache-Overview-HOWTO-17.html" REL=next>
<LINK HREF="Apache-Overview-HOWTO-15.html" REL=previous>
<LINK HREF="Apache-Overview-HOWTO.html#toc16" REL=contents>
</HEAD>
<BODY>
<A HREF="Apache-Overview-HOWTO-17.html">Next</A>
<A HREF="Apache-Overview-HOWTO-15.html">Previous</A>
<A HREF="Apache-Overview-HOWTO.html#toc16">Contents</A>
<HR>
<H2><A NAME="s16">16. XML projects</A></H2>
<P>Directly from the Apache XML project website, its goals are:
<UL>
<LI><EM>To provide commercial-quality standards-based XML solutions that
are developed in an open and cooperative fashion.</EM></LI>
<LI><EM>To provide feedback to standards bodies (such as IETF and W3C) from
an implementation perspective.</EM></LI>
<LI><EM>To be a focus for XML-related activities within Apache projects</EM></LI>
</UL>
<P>The project homepage is located at
<A HREF="http://xml.apache.org">http://xml.apache.org</A>.
It is an umbrella for a variety of subprojects.
<H2><A NAME="ss16.1">16.1 Introduction to XML</A>
</H2>
<P>This is a quick introduction to XML. To learn more about XML, a good starting
point is
<A HREF="http://www.xml.com">http://www.xml.com</A>. XML is a markup language (think
HTML) for describing structured content using tags and attributes. Once
content is separated from presentation, you can choose how to display it
(cellphone, HTML, text) or exchange it. The XML standard only describes how
the tags and attributes can be arranged, not their names or what they mean.
Apache provides the tools described in the following sections.
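<P>As a rough illustration of separating content from presentation, the
sketch below renders one small XML document in two ways. It uses Python's
standard library rather than any Apache tool, and the document, its tag names
(<CODE>addressbook</CODE>, <CODE>contact</CODE>) and the helper functions are
made up for this example.

```python
import xml.etree.ElementTree as ET

# A made-up document: the tag and attribute names are our own choice.
# XML itself only requires that they nest and close properly.
DOC = """<addressbook>
  <contact type="work">
    <name>Ada</name>
    <phone>555-0100</phone>
  </contact>
  <contact type="home">
    <name>Grace</name>
    <phone>555-0199</phone>
  </contact>
</addressbook>"""


def as_text(root):
    """Render the contacts as plain text lines."""
    return [
        f"{c.findtext('name')} ({c.get('type')}): {c.findtext('phone')}"
        for c in root.findall("contact")
    ]


def as_html(root):
    """Render the same contacts as an HTML list."""
    items = "".join(
        f"<li>{c.findtext('name')}: {c.findtext('phone')}</li>"
        for c in root.findall("contact")
    )
    return f"<ul>{items}</ul>"


root = ET.fromstring(DOC)
print("\n".join(as_text(root)))
print(as_html(root))
```

<P>The content is written once; each renderer decides how it is presented.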
<P>
<H2><A NAME="ss16.2">16.2 Xerces</A>
</H2>
<P>The Xerces project provides XML parsers for a variety of languages, including
Java, C++ and Perl. The Perl bindings are based on the C++ sources.
There are also Tcl bindings for Xerces in version 2.0 of
<A HREF="http://tclxml.sourceforge.net/">TclXML</A>, by Steve
Ball. Version 2.0 is available through the
<A HREF="http://sourceforge.net/projects/tclxml">SourceForge</A> project page.
An XML parser is a tool used for programmatic access to XML documents.
Xerces supports the following standards:
<UL>
<LI>
<A HREF="http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html">DOM</A>: DOM stands for Document Object Model. XML documents
are hierarchical by nature (nested tags), so they can be accessed through
a tree-like interface. The process is as follows:
<UL>
<LI>Parse document</LI>
<LI>Build tree</LI>
<LI>add/delete/modify nodes</LI>
<LI>Serialize tree</LI>
</UL>
</LI>
<LI>
<A HREF="http://www.megginson.com/SAX/index.html">SAX</A>: Simple API for XML. This is a stream-based API: the
application receives callbacks as elements are encountered. These callbacks
can be used, for example, to construct a DOM tree.</LI>
<LI>
<A HREF="http://www.w3.org/TR/REC-xml-names/">XML Namespaces</A></LI>
<LI>XML Schema: The XML standard provides the syntax for writing documents. XML
Schema provides the tools for defining the <EM>contents</EM> of the XML
document (semantics). It lets you specify, for example, that a certain element in the
document must be an integer between 10 and 20.</LI>
</UL>
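<P>The four DOM steps listed above (parse, build tree, modify nodes,
serialize) can be sketched as follows. This is only an illustration: it uses
the DOM implementation in Python's standard library
(<CODE>xml.dom.minidom</CODE>), not Xerces, and the <CODE>order</CODE>
document and the <CODE>edit_order</CODE> helper are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical input document for the example.
SOURCE = "<order><item>tea</item><item>milk</item></order>"


def edit_order(xml_text):
    # Steps 1 and 2: parse the text and build the in-memory tree.
    doc = parseString(xml_text)
    root = doc.documentElement

    # Step 3a: add a node (a new <item> holding the text "sugar").
    new_item = doc.createElement("item")
    new_item.appendChild(doc.createTextNode("sugar"))
    root.appendChild(new_item)

    # Step 3b: modify a node (set an attribute on the first item).
    items = root.getElementsByTagName("item")
    items[0].setAttribute("qty", "2")

    # Step 3c: delete a node (the second item, "milk").
    root.removeChild(items[1])

    # Step 4: serialize the tree back to text.
    return root.toxml()


result = edit_order(SOURCE)
print(result)
```

<P>The whole document is held in memory as a tree, which makes edits easy but
can be costly for very large documents.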
The initial code base of the Xerces project was donated by IBM. You can find more
information on the
<A HREF="http://xml.apache.org/xerces-j/index.html">Xerces Java</A>,
<A HREF="http://xml.apache.org/xerces-c/index.html">Xerces C++</A> and
<A HREF="http://xml.apache.org/xerces-p/index.html">Xerces Perl</A> homepages.
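<P>To make the callback style of SAX more concrete, here is a minimal sketch
using the SAX interface in Python's standard library (<CODE>xml.sax</CODE>),
not Xerces itself. The <CODE>EventRecorder</CODE> class and the
<CODE>collect_events</CODE> helper are names invented for this example. The
handler simply records each event it receives instead of building a tree.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Record start, text and end events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []  # character data seen since the last start tag

    def startElement(self, name, attrs):
        self._text = []
        self.events.append(("start", name))

    def characters(self, content):
        # The parser may deliver text in several chunks, so buffer it.
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        if text:
            self.events.append(("text", text))
        self._text = []
        self.events.append(("end", name))


def collect_events(xml_text):
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


for event in collect_events("<order><item>tea</item><item>milk</item></order>"):
    print(event)
```

<P>Because nothing is kept unless the handler stores it, this style suits
large documents that are read once from start to end.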
<P>
<H2><A NAME="ss16.3">16.3 Xalan</A>
</H2>
<P>Xalan is an XSLT processor available for Java and C++.
XSL is a stylesheet language for XML; the T stands for Transformation. XML
is good at storing structured data, but we sometimes need to
display this data to the user or transform it in some other way.
Xalan takes the original XML document, reads the transformation rules
(the stylesheet) and outputs HTML, plain text or another XML document.
You can learn more about Xalan at the
<A HREF="http://xml.apache.org/xalan-j/index.html">Xalan Java</A> and
<A HREF="http://xml.apache.org/xalan-c/index.html">Xalan C++</A> project homepages.
<H2><A NAME="ss16.4">16.4 FOP</A>
</H2>
<P>From the website: <EM>FOP is a Java application that reads a formatting
object tree and then turns it into a PDF document</EM>. In other words, FOP takes an
XML document (a tree of XSL formatting objects) and outputs PDF, much as Xalan
outputs HTML or text. You can learn more about FOP at the
<A HREF="http://xml.apache.org/fop">project homepage</A>.
<P>
<P>
<H2><A NAME="cocoon"></A> <A NAME="ss16.5">16.5 Cocoon</A>
</H2>
<P>Cocoon leverages other Apache XML technologies like Xerces, Xalan and FOP
to provide a comprehensive publishing framework. Cocoon is based around
XML and XSL and is targeted at sites of medium to high complexity.
It separates content, logic and presentation, as described on the website:
<UL>
<LI><B>XML creation</B>: <EM>the XML file is created by the content owners.
They do not require specific knowledge on how the XML content is further
processed rather than the particular chosen DTD/namespace.
This layer is always performed by humans directly through normal text editors
or XML-aware tools/editors.</EM></LI>
<LI><B>XML process generators</B>:<EM> the logic is separated from the content
file.</EM></LI>
<LI><B>XSL rendering</B>:<EM> The created document is then rendered by applying an
XSL stylesheet to it and formatting it to the specified resource type (HTML,
PDF, XML, WML, XHTML)</EM></LI>
</UL>
You can learn more about Cocoon at the
<A HREF="http://xml.apache.org/cocoon/index.html">project homepage</A>.<P>
<P>
<P>
<H2><A NAME="ss16.6">16.6 Xang</A>
</H2>
<P>The goal of the Xang project is to <EM>make it easy for developers to build
commercial quality XML aware applications for the Web.</EM> The application
logic is defined in a hierarchical XML file which can be scripted via
JavaScript. This file defines how to access the data (which can be other XML
files, Java plug-ins, etc.). The Xang engine takes care of mapping HTTP
requests to the appropriate handlers.
You can learn more about Xang at the
<A HREF="http://xml.apache.org/xang">project homepage</A>.
<H2><A NAME="ss16.7">16.7 SOAP</A>
</H2>
<P><EM>Apache SOAP ("Simple Object Access Protocol") is an implementation of
the
<A HREF="http://www.w3.org/TR/SOAP">SOAP submission</A> to W3C.
It is based on, and supersedes, the IBM SOAP4J implementation</EM>.
<P><EM>From the draft W3C specification: SOAP is a lightweight protocol for
exchange of information in a decentralized, distributed environment. It is an
XML based protocol that consists of three parts</EM>:
<UL>
<LI><EM>An envelope that defines a framework for describing what is in a
message and how to process it</EM>,</LI>
<LI><EM>a set of encoding rules for expressing instances of
application-defined datatypes</EM>, and</LI>
<LI><EM>a convention for representing remote procedure calls and
responses</EM>. </LI>
</UL>
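<P>As a hedged sketch of the envelope and the remote procedure call
convention listed above, the following code builds and reads back a SOAP 1.1
request using only Python's standard library. It does not use Apache SOAP and
does not send anything over HTTP. The application namespace
<CODE>urn:example:stockquote</CODE>, the method name and the parameter are
illustrative assumptions, and the parameters are written as plain string
elements without the SOAP encoding rules.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, made up for this example.
APP_NS = "urn:example:stockquote"

# Use a readable prefix when serializing the envelope namespace.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_request(method, params):
    """Wrap a method call and its parameters in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the method name and parameters, as a server would."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the call element
    method = call.tag.split("}", 1)[1]  # strip the "{namespace}" part
    return method, {child.tag: child.text for child in call}


request = build_request("GetLastTradePrice", {"symbol": "DIS"})
print(request)
print(parse_request(request))
```

<P>In a real exchange the request text would be sent as the body of an HTTP
POST, and the response would come back in a similar envelope.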
Think of SOAP as an XML-based remote procedure call or CORBA
system. It is based on HTTP and XML. On the one hand, this means it is
verbose and slow compared to other systems. On the other hand, it eases
interoperability, debugging and development of clients and servers
in a variety of languages (C, Java, Perl, Python, Tcl, etc.), since
most modern languages have HTTP and XML modules. You can learn more at
the
<A HREF="http://xml.apache.org/soap/">Apache SOAP homepage</A>.<P>Related talk
<UL>
<LI>W02: Rub-a-dub-dub-dubya: SOAP and the Web</LI>
</UL>
<P>
<H2><A NAME="ss16.8">16.8 Batik</A>
</H2>
<P><EM>Batik is a Java based toolkit for applications that want to use images in the
<A HREF="http://www.w3.org/TR/SVG/">Scalable Vector Graphics (SVG)</A> format for various
purposes, such as viewing, generation or manipulation.</EM>
<P> It is XML centric and compliant with the W3C specification. It differs from most other Apache
projects in that it provides a graphical component. Batik provides hooks to extend the
framework through custom tags, and it allows conversion from SVG to other formats like JPEG or PNG.
<P>
<A HREF="http://xml.apache.org/batik/index.html">Batik homepage</A><P>Related talk
<UL>
<LI>W14: Introduction to the Batik project.</LI>
</UL>
<P>
<H2><A NAME="ss16.9">16.9 Crimson</A>
</H2>
<P> Crimson is an alternative, Java-based XML parser with support for XML 1.0 through a variety
of interfaces. It is the parser currently shipping in Sun products, and an interim
solution until version 2 of Xerces is released.
<P>
<A HREF="http://xml.apache.org/crimson/index.html">Crimson homepage</A><P>
<P>Related talk
<UL>
<LI>TH08: Java API for XML processing (JAXP) version 1.1</LI>
</UL>
<P>
<H2><A NAME="ss16.10">16.10 Other XML projects</A>
</H2>
<P>There are other projects based on Apache and XML that do not live under the
Apache XML umbrella:
<UL>
<LI>
<A HREF="http://modxslt.sourceforge.net/">mod_xslt</A> is a C-based
module for delivering XML/XSL-based content. It is licensed under the
GPL.</LI>
<LI>
<A HREF="http://axkit.org">AxKit</A>
<A NAME="axkit"></A> is
an XML-based application server for mod_perl and Apache. It allows
separation of content and presentation.</LI>
</UL>
<P>
<P>Related talk
<UL>
<LI>TH04: AxKit - An XML Application server for Apache</LI>
</UL>
<P>
<HR>
<A HREF="Apache-Overview-HOWTO-17.html">Next</A>
<A HREF="Apache-Overview-HOWTO-15.html">Previous</A>
<A HREF="Apache-Overview-HOWTO.html#toc16">Contents</A>
</BODY>
</HTML>