XML and Project Gutenberg

Here's a project that will allow you to get XML documents and examples as you work.

Ed Tittel

So you're interested in XML, but you need some examples and some guidelines to get going. Ed's tip this week tells you about a project that is a huge source of XML documents and the style sheets that go with them.


Surely one of the most interesting, educational, and worthwhile projects underway on the World Wide Web has to be Project Gutenberg. Started by Michael Hart in 1971 after he obtained a monster grant of computer time from the Materials Research Lab at the University of Illinois, this project has morphed into a worldwide volunteer effort to convert as many books as possible into electronic, or e-book, form. You'll find the home page (http://gutenberg.hwg.org/index.html) quite interesting and informative in its own right. But as an example of why XML is a good idea and what it can do for the Web, and as a learning and teaching tool, this site is hard to beat.

Before I explain what you can find here, and what makes it so great, here's a quote from the history of the project (http://promo.net/pg/history.html) that describes the impetus behind Project Gutenberg:
"The premise on which Michael Hart based Project Gutenberg was: anything that can be entered into a computer can be reproduced indefinitely. . .what Michael termed 'Replicator Technology.' The concept of Replicator Technology is simple; once a book or any other item (including pictures, sounds, and even 3-D items) can be stored in a computer, then any number of copies can and will be available. Everyone in the world, or even not in this world (given satellite transmission) can have a copy of a book that has been entered into a computer."

Thus, all books are available in plain ASCII text, suitable for delivery to any kind of computer, and also in HTML, XHTML, or XML form (depending on what format text coders choose when creating an e-book from a printed original). Note also that Project Gutenberg's goal is wonderfully ambitious: to make all printed materials available for free online as soon as their copyrights expire and the materials fall into the public domain. For everything but evanescent works (such as time-sensitive materials about technology, medicine, computing, and so forth), this is an incredible gift to all mankind.

Why am I expounding like this in a tip about XML? Because most of the new encoding work for this project uses XML to capture book content. But also because the availability of simple encoding schemes and accompanying CSS style sheets means that people looking for XML content to view in their Web browsers will find it in significant volume on the FTP servers from which Project Gutenberg titles may be downloaded. In other words, here is a great source for XML content you can see in readable form!
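To give a rough idea of what "simple encoding plus a CSS style sheet" can look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the element names (book, title, author, chapter, heading, para), the book.css file name, and the summarize helper are all hypothetical and are not taken from any actual Project Gutenberg DTD or title. The xml-stylesheet processing instruction is the standard way to tell a browser which CSS file to use when displaying an XML document, and Python's standard xml.etree.ElementTree module is used to read the markup back.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names and "book.css" are illustrative
# only and do not come from any actual Project Gutenberg DTD.
SAMPLE_BOOK = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="book.css"?>
<book>
  <title>A Sample Title</title>
  <author>An Example Author</author>
  <chapter number="1">
    <heading>Chapter One</heading>
    <para>First paragraph of the text.</para>
    <para>Second paragraph of the text.</para>
  </chapter>
</book>
"""

# A matching (equally hypothetical) style sheet. XML elements have no
# built-in presentation, so each one is given an explicit display rule.
SAMPLE_CSS = """
title   { display: block; font-size: 2em; text-align: center; }
author  { display: block; font-style: italic; text-align: center; }
heading { display: block; font-weight: bold; margin-top: 1em; }
para    { display: block; margin: 0.5em 2em; }
"""


def summarize(xml_text):
    """Parse the book markup and report a few basic facts about it."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "author": root.findtext("author"),
        "chapters": len(root.findall("chapter")),
        "paragraphs": len(root.findall("chapter/para")),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_BOOK).items():
        print(key, "=", value)
```

Saving the two strings as a .xml file and as book.css in the same folder, then opening the .xml file in a browser that supports CSS for XML, should show the text in readable form rather than as raw tags.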

Combine those great examples with informative tutorials on XML, DTDs, Cascading Style Sheets, and more, and you've got a great resource where newbies can not only learn about XML and related technologies, but can also see some interesting and useful results from such work. In fact, if you or your colleagues need training and exposure in this topic area, you can get plenty of both while also adding to the collection of available titles: sign up (as an individual or as a group) to encode a title for the Project Gutenberg collection as part of climbing the XML learning curve.

It's nice to be able to do some good for others while you learn by doing with XML, style sheets, and more. For a complete catalog of what's available in XML format, please visit http://gutenberg.hwg.org/checkdoc1.html; at the same URL, you can also look for titles to check out and mark up. Please consider contributing your time and effort to this worthwhile project!


Ed Tittel is a principal at LANWrights, Inc., a wholly owned subsidiary of LeapIt.com. LANWrights offers training, writing, and consulting services on Internet, networking, and Web topics (including XML and XHTML), plus various IT certifications (Microsoft, Sun/Java, and Prosoft/CIW).


Related Book

XML Internationalization and Localization
Author : Yves Savourel
Publisher : SAMS Publishing
Published : Jun 2001
Summary :
The purpose of this book is twofold: First to describe what needs to be done to internationalize XML documents and applications; second to describe how the XML data can be localized efficiently.
There is currently almost no information on these two topics grouped and organized in a single reference. In addition, while XML has evolved a lot over the past two years, it has now reached a point of global acceptance, as evidenced by the many international XML working groups addressing trading partner agreements, electronic document exchange, business processes, and eBusiness.


This was first published in July 2001
