<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
<TITLE>Apache Overview HOWTO: XML projects</TITLE>
<LINK HREF="Apache-Overview-HOWTO-17.html" REL=next>
<LINK HREF="Apache-Overview-HOWTO-15.html" REL=previous>
<LINK HREF="Apache-Overview-HOWTO.html#toc16" REL=contents>
</HEAD>
<BODY>
<A HREF="Apache-Overview-HOWTO-17.html">Next</A>
<A HREF="Apache-Overview-HOWTO-15.html">Previous</A>
<A HREF="Apache-Overview-HOWTO.html#toc16">Contents</A>
<HR>
<H2><A NAME="s16">16. XML projects</A></H2>
<P>According to the Apache XML project website, its goals are:
<UL>
<LI><EM>To provide commercial-quality standards-based XML solutions that
are developed in an open and cooperative fashion.</EM></LI>
<LI><EM>To provide feedback to standards bodies (such as IETF and W3C) from
an implementation perspective.</EM></LI>
<LI><EM>To be a focus for XML-related activities within Apache projects</EM></LI>
</UL>
<P>The project homepage is located at
<A HREF="http://xml.apache.org">http://xml.apache.org</A>.
It is an umbrella for a variety of subprojects.
<H2><A NAME="ss16.1">16.1 Introduction to XML</A>
</H2>
<P>This is a quick introduction to XML. To learn more about XML, a good starting
point is
<A HREF="http://www.xml.com">http://www.xml.com</A>. XML is a markup language (think
HTML) for describing structured content using tags and attributes. Once
content is separated from presentation, you can choose how to display it
(cellphone, HTML, text) or exchange it. The XML standard only describes how
the tags and attributes can be arranged, not their names or what they mean.
Apache provides the tools described in the following sections.
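<P>As a rough illustration of tags and attributes, the following Python sketch
parses a small document and lists what it contains. The document, its element
names (<CODE>contact</CODE>, <CODE>name</CODE>, <CODE>phone</CODE>) and the
<CODE>summarize</CODE> helper are invented for this example. Python's standard
<CODE>xml.etree.ElementTree</CODE> module stands in for a parser here; it is not
one of the Apache tools.
<PRE>
```python
import xml.etree.ElementTree as ET

# A made-up document: the tag and attribute names are chosen by the author
# of the document, not by the XML standard.
SOURCE = """<contact type="work">
  <name>Jane Doe</name>
  <phone country="1">555-0100</phone>
</contact>"""


def summarize(xml_text):
    """Return the root tag, the root attributes and a {child tag: text} map."""
    root = ET.fromstring(xml_text)
    children = {child.tag: child.text for child in root}
    return root.tag, dict(root.attrib), children


tag, attrs, children = summarize(SOURCE)
print(tag, attrs, children)
```
</PRE>
<P>The parser only checks that the tags are properly nested. Deciding that
<CODE>phone</CODE> holds a telephone number is left to the application.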
<P>
<H2><A NAME="ss16.2">16.2 Xerces</A>
</H2>
<P>The Xerces project provides XML parsers for a variety of languages, including
Java, C++ and Perl. The Perl bindings are based on the C++ sources.
There are also Tcl bindings for Xerces in version 2.0 of
<A HREF="http://tclxml.sourceforge.net/">TclXML</A>, by Steve
Ball. Version 2.0 is available through the
<A HREF="http://sourceforge.net/projects/tclxml">SourceForge</A> project page.
An XML parser is a tool that gives programs access to XML documents.
Xerces supports the following standards:
<UL>
<LI>
<A HREF="http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html">DOM</A>: DOM stands for Document Object Model. XML documents
are hierarchical by nature (nested tags), so they can be accessed through
a tree-like interface. The process is as follows:
<UL>
<LI>Parse the document</LI>
<LI>Build the tree</LI>
<LI>Add, delete or modify nodes</LI>
<LI>Serialize the tree</LI>
</UL>
</LI>
<LI>
<A HREF="http://www.megginson.com/SAX/index.html">SAX</A>: SAX stands for Simple API for XML. It is a stream-based API: the
application receives callbacks as elements are encountered while the document
is read. These callbacks can be used, for example, to construct a DOM tree.</LI>
<LI>
<A HREF="http://www.w3.org/TR/REC-xml-names/">XML Namespaces</A></LI>
<LI>XML Schema: The XML standard provides the syntax for writing documents. XML
Schema provides the tools for defining the <EM>contents</EM> of the XML
document (semantics). For example, it can require that a certain element in the
document be an integer between 10 and 20.</LI>
</UL>
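<P>The DOM steps listed above (parse, build the tree, change nodes, serialize)
can be sketched in a few lines of Python. This is only an illustration under
stated assumptions: it uses Python's standard <CODE>xml.dom.minidom</CODE> module
rather than Xerces, and the <CODE>order</CODE> document and the
<CODE>double_quantities</CODE> helper are invented for the example.
<PRE>
```python
from xml.dom.minidom import parseString

# Hypothetical input document used only for this sketch.
SOURCE = "<order><item qty='1'>pen</item><item qty='2'>ink</item></order>"


def double_quantities(xml_text):
    """Parse, modify and serialize a document through the DOM interface."""
    # Steps 1 and 2: parse the document and build the tree in memory.
    doc = parseString(xml_text)

    # Step 3a: modify existing nodes (double every qty attribute).
    for item in doc.getElementsByTagName("item"):
        item.setAttribute("qty", str(int(item.getAttribute("qty")) * 2))

    # Step 3b: add a new node at the end of the root element.
    note = doc.createElement("note")
    note.appendChild(doc.createTextNode("doubled"))
    doc.documentElement.appendChild(note)

    # Step 4: serialize the tree back to text.
    return doc.documentElement.toxml()


print(double_quantities(SOURCE))
```
</PRE>
<P>Because the whole tree is held in memory, this style is convenient for
editing documents but can be costly for very large ones.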
The initial code base of the Xerces project was donated by IBM. You can find more
information on the
<A HREF="http://xml.apache.org/xerces-j/index.html">Xerces Java</A>,
<A HREF="http://xml.apache.org/xerces-c/index.html">Xerces C++</A> and
<A HREF="http://xml.apache.org/xerces-p/index.html">Xerces Perl</A> homepages.
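<P>The callback style of SAX mentioned in the list of standards above can be
sketched as follows. Again this is an illustration, not Xerces code: it uses
Python's standard <CODE>xml.sax</CODE> module, and the <CODE>EventLogger</CODE>
class, the <CODE>log_events</CODE> helper and the sample document are invented
for the example.
<PRE>
```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record every callback the parser makes while it reads the stream."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.text = []

    def startElement(self, name, attrs):
        # Called when an opening tag is encountered.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Called when the matching closing tag is encountered.
        self.events.append(("end", name))

    def characters(self, content):
        # May be called several times for one run of text.
        self.text.append(content)


def log_events(xml_bytes):
    """Parse xml_bytes and return (list of events, concatenated text)."""
    handler = EventLogger()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events, "".join(handler.text)


# Hypothetical input document used only for this sketch.
SOURCE = b"<order><item qty='1'>pen</item><item qty='2'>ink</item></order>"

events, text = log_events(SOURCE)
for event in events:
    print(event)
```
</PRE>
<P>No tree is built unless the handler builds one itself, which is why a
stream-based parser can process documents that do not fit in memory.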
<P>
<H2><A NAME="ss16.3">16.3 Xalan</A>
</H2>
<P>Xalan is an XSLT processor available for Java and C++.
XSL is a style sheet language for XML; the T stands for Transformation. XML
is good at storing structured data (information), but we sometimes need to
display this data to the user or apply some other transformation.
Xalan takes the original XML document, reads the transformation rules
(the stylesheet) and outputs HTML, plain text or another XML document.
You can learn more about Xalan at the
<A HREF="http://xml.apache.org/xalan-j/index.html">Xalan Java</A> and
<A HREF="http://xml.apache.org/xalan-c/index.html">Xalan C++</A> project homepages.
<H2><A NAME="ss16.4">16.4 FOP</A>
</H2>
<P>From the website: <EM>FOP is a Java application that reads a formatting
object tree and then turns it into a PDF document</EM>. So FOP takes an
XML document and outputs PDF, much as Xalan outputs HTML
or text. You can learn more about FOP on the
<A HREF="http://xml.apache.org/fop">FOP homepage</A>.
<P>
<P>
<H2><A NAME="cocoon"></A> <A NAME="ss16.5">16.5 Cocoon</A>
</H2>
<P>Cocoon leverages other Apache XML technologies like Xerces, Xalan and FOP
to provide a comprehensive publishing framework. Cocoon is built around
XML and XSL and is aimed at sites of medium to high complexity.
It separates content, logic and presentation, as described on the website:
<UL>
<LI><B>XML creation</B>: <EM>the XML file is created by the content owners.
They do not require specific knowledge on how the XML content is further
processed rather than the particular chosen DTD/namespace.
This layer is always performed by humans directly through normal text editors
or XML-aware tools/editors.</EM></LI>
<LI><B>XML process generators</B>:<EM> the logic is separated from the content
file.</EM></LI>
<LI><B>XSL rendering</B>:<EM> The created document is then rendered by applying an
XSL stylesheet to it and formatting it to the specified resource type (HTML,
PDF, XML, WML, XHTML)</EM></LI>
</UL>
You can learn more about Cocoon at the
<A HREF="http://xml.apache.org/cocoon/index.html">project homepage</A>.<P>
<P>
<P>
<H2><A NAME="ss16.6">16.6 Xang</A>
</H2>
<P>The goal of the Xang project is to <EM>make it easy for developers to build
commercial quality XML aware applications for the Web.</EM> The application
logic is defined in a hierarchical XML file which can be scripted via
JavaScript. This file defines how to access the data (which can be other XML
files, Java plug-ins, etc.). The Xang engine takes care of mapping HTTP
requests to the appropriate handlers.
You can learn more about Xang at the
<A HREF="http://xml.apache.org/xang">project homepage</A>.
<H2><A NAME="ss16.7">16.7 SOAP</A>
</H2>
<P><EM>Apache SOAP ("Simple Object Access Protocol") is an implementation of
the
<A HREF="http://www.w3.org/TR/SOAP">SOAP submission</A> to W3C.
It is based on, and supersedes, the IBM SOAP4J implementation</EM>.
<P><EM>From the draft W3C specification: SOAP is a lightweight protocol for
exchange of information in a decentralized, distributed environment. It is an
XML based protocol that consists of three parts</EM>:
<UL>
<LI><EM>An envelope that defines a framework for describing what is in a
message and how to process it</EM>,</LI>
<LI><EM>a set of encoding rules for expressing instances of
application-defined datatypes</EM>, and</LI>
<LI><EM>a convention for representing remote procedure calls and
responses</EM>. </LI>
</UL>
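<P>To make the three parts listed above more concrete, the following Python
sketch builds a minimal SOAP 1.1 request and reads it back. It is an
illustration only and does not use Apache SOAP. The two
<CODE>schemas.xmlsoap.org</CODE> URIs are the envelope and encoding namespaces
defined by SOAP 1.1; the application namespace <CODE>urn:example:quotes</CODE>,
the method name <CODE>GetLastTradePrice</CODE>, its <CODE>symbol</CODE> argument
and both helper functions are assumptions made for the example.
<PRE>
```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"  # SOAP 1.1 encoding rules
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_request(symbol):
    """Return the text of a SOAP envelope carrying one remote procedure call."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    # Part 1: the envelope frames the message.
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    # Part 2: the encodingStyle attribute names the encoding rules in use.
    envelope.set("{%s}encodingStyle" % SOAP_ENV, SOAP_ENC)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    # Part 3: the call is an element named after the method, with one
    # child element per argument.
    call = ET.SubElement(body, "{%s}GetLastTradePrice" % APP_NS)
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Return (method name, {argument: value}) from a request envelope."""
    root = ET.fromstring(xml_text)
    call = root.find("{%s}Body" % SOAP_ENV)[0]
    method = call.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    args = {child.tag: child.text for child in call}
    return method, args


message = build_request("DIS")
print(message)
print(read_request(message))
```
</PRE>
<P>In a real exchange the envelope text would be sent as the body of an HTTP
POST request, and the response would come back in an envelope of the same shape.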
Think of SOAP as an XML-based remote procedure call or CORBA-like
system. It is based on HTTP and XML. On the one hand, this means it is
verbose and slow compared to other systems. On the other hand, it eases
interoperability, debugging and development of clients and servers
for a variety of languages (C, Java, Perl, Python, Tcl, etc.), since
most modern languages have HTTP and XML modules. You can learn more at
the
<A HREF="http://xml.apache.org/soap/">Apache SOAP homepage</A>.<P>Related talk
<UL>
<LI>W02: Rub-a-dub-dub-dubya: SOAP and the Web</LI>
</UL>
<P>
<H2><A NAME="ss16.8">16.8 Batik</A>
</H2>
<P><EM>Batik is a Java based toolkit for applications that want to use images in the
<A HREF="http://www.w3.org/TR/SVG/">Scalable Vector Graphics (SVG)</A> format for various
purposes, such as viewing, generation or manipulation.</EM>
<P> It is XML-centric and compliant with the W3C specification. It differs from most other Apache
projects in that it provides a graphical component. Batik provides hooks to extend the
framework through custom tags, and it allows conversion from SVG to other formats like JPEG or PNG.
<P>
<A HREF="http://xml.apache.org/batik/index.html">Batik homepage</A><P>Related talk
<UL>
<LI>W14: Introduction to the Batik project.</LI>
</UL>
<P>
<H2><A NAME="ss16.9">16.9 Crimson</A>
</H2>
<P> Crimson is an alternative, Java-based XML parser that supports XML 1.0 through a variety
of interfaces. It is the parser currently shipping in Sun products, and an intermediate
step until version 2 of Xerces is released.
<P>
<A HREF="http://xml.apache.org/crimson/index.html">Crimson homepage</A><P>
<P>Related talk
<UL>
<LI>TH08: Java API for XML processing (JAXP) version 1.1</LI>
</UL>
<P>
<H2><A NAME="ss16.10">16.10 Other XML projects</A>
</H2>
<P>There are other projects based on Apache and XML that do not live under the
Apache XML umbrella:
<UL>
<LI>
<A HREF="http://modxslt.sourceforge.net/">mod_xslt</A> is a module written in C
for delivering XML/XSL based content. It is licensed under the
GPL.</LI>
<LI>
<A HREF="http://axkit.org">AxKit</A>
<A NAME="axkit"></A> is
an XML-based application server for mod_perl and Apache. It allows
separation of content and presentation.</LI>
</UL>
<P>
<P>Related talk
<UL>
<LI>TH04: AxKit - An XML Application server for Apache</LI>
</UL>
<P>
<HR>
<A HREF="Apache-Overview-HOWTO-17.html">Next</A>
<A HREF="Apache-Overview-HOWTO-15.html">Previous</A>
<A HREF="Apache-Overview-HOWTO.html#toc16">Contents</A>
</BODY>
</HTML>