Jesse Goerz
Revision History
| Revision | Date | Revised by | Description |
|---|---|---|---|
| v0.1 | 23 March 2001 | jwg | Initial release. |
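What's in an sgml document?
An SGML document in its most basic form looks very similar to an HTML document. In fact, HTML is derived from SGML. Below is an example of what an SGML document looks like. Note that this example contains mistakes in its id attributes; the parser output after it shows how they are reported.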
    <!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook V3.1//EN">

    <sect1 id="hello world"><title>Hello world</title>

    <para>
    Hello world!
    </para>

    <sect2 id="goodbye_world"><title>Goodbye world</title>
    <para>
    Goodbye world.
    </para>
    </sect2>

    <sect2 id="goodbye_world"><title>Goodbye again</title>
    <para>
    Goodbye again.
    </para>
    </sect2>

    <para>
    Hello again!
    </para>

    </sect1>
    jesse@storm:~/test$ sgmltools -b html simple.sgml
    /usr/bin/jade:<OSFD>0:2:11:E: value of attribute "ID" must be a single token
    /usr/bin/jade:<OSFD>0:8:15:E: character "_" is not allowed in the value of attribute "ID"
    /usr/bin/jade:<OSFD>0:15:23:E: ID "GOODBYE_WORLD" already defined
    /usr/bin/jade:<OSFD>0:9:15: ID "GOODBYE_WORLD" first defined here
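The three kinds of complaint in the jade output (an id that is not a single token, a character that is not allowed, and an id defined twice) can be caught before running the parser. The following is only a rough sketch, not a replacement for jade. It assumes that a valid id is a letter followed by letters, digits, "." or "-", and that ids are compared without regard to case, which is what the upper-cased GOODBYE_WORLD in the output suggests. The function name `check_ids` is made up for this example.

```python
import re

# Assumption: an id must be a letter followed by letters, digits, "." or "-".
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.-]*\Z")
# Finds id="..." attributes; this is a plain text search, not a real SGML parse.
ID_ATTR_RE = re.compile(r'\bid\s*=\s*"([^"]*)"', re.IGNORECASE)


def check_ids(sgml_text):
    """Return a list of human-readable problems found in id attributes."""
    problems = []
    seen = set()
    for value in ID_ATTR_RE.findall(sgml_text):
        if not NAME_RE.match(value):
            problems.append(f'invalid id "{value}"')
        key = value.upper()  # assumption: ids are case-insensitive
        if key in seen:
            problems.append(f'duplicate id "{value}"')
        seen.add(key)
    return problems


sample = '''
<sect1 id="hello world"><title>Hello world</title>
<sect2 id="goodbye_world"><title>Goodbye world</title></sect2>
<sect2 id="goodbye_world"><title>Goodbye again</title></sect2>
</sect1>
'''

for problem in check_ids(sample):
    print(problem)
```

A file that passes this check can still fail to build, so it is best treated as a quick first pass before the real parser run.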
What's so special about our sgml documents?
Well, nothing really, but we do organize them in a specific way to make keeping them up to date as easy as possible. First of all, we use three types of documents here at NewbieDoc: books, chapters, and, like the one you just saw, sections. We keep them in separate files so they are easier for individual writers to maintain. Then we use entity declarations to link them all together. Don't worry if you don't know what those are; we will cover them a little later.
Books, Chapters, and Sections.
Our documents are organized in a simple fashion. DocBook SGML makes this easy with its book paradigm. If you think about it, it makes sense: a book has several chapters, and each chapter has many pages (in our case we use sections rather than pages). Our entire archive is represented by the book. The main file that holds the book together is index.sgml, located in the root of /newbiedoc. Think of this as the cover and table of contents of our book.
Next we have many chapters. The chapters of our book represent different categories. For instance, because we are writing about Linux, we may have chapters on system administration, networking, text editors, and utilities. Each of these chapters would itself contain many sections. For example, the text editors chapter could contain sections on vi, joe, emacs, and ed. Putting all these sections in the text editors chapter makes them easy to organize and easy to change if we need to.
Entities: The linkers.
Entity is just a fancy name for something that behaves like a cross between a variable and a symlink. Here is what a chapter looks like with some entity declarations:
    <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN"
    [
    <!ENTITY sourceforge-guide SYSTEM "metadoc/sourceforge-guide.sgml">
    <!ENTITY template SYSTEM "metadoc/template.sgml">
    ]>

    <chapter id="gendoc"><title>General Documentation</title>

    <!-- Section 1 sourceforge-guide -->
    &sourceforge-guide;

    <!-- Section 2 document template -->
    &template;

    </chapter>
Since I didn't go into too much detail earlier, I'm going to cover a little more than just the entities. The first line is the document type declaration (the DOCTYPE line). It names the DTD, or Document Type Definition, that the document follows, and it says what type of document this is. In this case it's a chapter. In the earlier example we looked at a section. Take a minute and compare the two. Not only are the document types different (sect1 versus chapter), but the section's declaration ends with a ">" right after "//EN". The chapter's declaration, however, does not end until after all the entities have been declared. Also note that the entity declarations sit inside their own little container: they are enclosed by an opening "[" and a closing "]", and only then does the declaration terminate with a ">".
Note: Entity declarations take the form: <!ENTITY entity-name SYSTEM "path/to/file.sgml">
Once the entities are declared, think of them as variables that hold a pointer to a file. You can then import the file an entity points to by invoking the entity: precede its name with an "&" (ampersand) and end it with a ";" (semicolon). Once again, look at the chapter file above to see an example.
Note: It is important to distinguish between entity declarations and entities. An entity declaration holds the directions to a file, but it does absolutely nothing until it is told to. An entity invoked within the document, like "&sourceforge-guide;" in the preceding example, tells its declaration to import the file into the document right at that point.
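To make the split between declaring and invoking concrete, here is a toy model of what happens to an entity during a build. It is a simplified sketch and not how jade works internally: it finds the declarations with a regular expression and pastes in the text of the file each one names wherever the matching "&name;" appears. The function name `expand_entities` and the `files` dictionary, which stands in for files on disk, are assumptions made for this example.

```python
import re

# Matches declarations such as: <!ENTITY template SYSTEM "metadoc/template.sgml">
DECL_RE = re.compile(r'<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"([^"]+)"\s*>')
# Matches invocations such as: &template;
REF_RE = re.compile(r"&([\w.-]+);")


def expand_entities(document, files):
    """Replace each &name; with the text of the file its declaration names.

    `files` maps a path to file contents; it stands in for the file system.
    """
    declared = dict(DECL_RE.findall(document))

    def substitute(match):
        name = match.group(1)
        if name not in declared:
            return match.group(0)  # leave undeclared references untouched
        return files[declared[name]]

    return REF_RE.sub(substitute, document)


chapter = '''<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN"
[
<!ENTITY template SYSTEM "metadoc/template.sgml">
]>
<chapter id="gendoc"><title>General Documentation</title>
&template;
</chapter>
'''
files = {
    "metadoc/template.sgml":
        '<sect1 id="template"><title>Template</title></sect1>',
}

expanded = expand_entities(chapter, files)
print(expanded)
```

The declaration on its own changes nothing in the output. The section text only appears at the spot where "&template;" was written.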
Tying it all together.
You need to understand entities because doing so will help you avoid simple errors that might cause the book build to fail. The book build is when the administrators build all the sections into their appropriate chapters, and all the chapters into the book. Most of us will only need to edit the book and chapter files when we are adding a new document. After that you can pretty much leave them alone unless you change the name of your SGML document file. I'll cover what you need to do to add or update a document in a later section.
Directory/File names.
Directory names under /newbiedoc should be pretty self-explanatory. Most directory names will be the same as the chapter names. For example, the system-admin directory contains all the section documents that cover subjects related to system administration. In the root of /newbiedoc there would be a system-admin.sgml, a chapter document with entities pointing at all the section documents within the system-admin directory. Here's what it would look like in your working directory:
Note: Remember index.sgml in the root of /newbiedoc? That is the cover and table of contents of the book; it is what holds it all together. The administrators use that file to do the book build. All the entity declarations are contained inside index.sgml.
    jesse@storm:~/cvs/newbiedoc$ ls -l
    total 84
    drwxr-sr-x    2 jesse    jesse        4096 Mar 23 19:15 general-doc
    -rw-r--r--    1 jesse    jesse           0 Mar 23 19:17 general-doc.sgml
    -rw-r--r--    1 jesse    jesse        4110 Mar  1 16:56 index.sgml
    drwxr-sr-x    2 jesse    jesse        4096 Mar 23 19:14 system-admin
    -rw-r--r--    1 jesse    jesse           0 Mar 23 19:16 system-admin.sgml
    drwxr-sr-x    2 jesse    jesse        4096 Mar 23 19:16 text-editors
    -rw-r--r--    1 jesse    jesse           0 Mar 23 19:17 text-editors.sgml
    drwxr-sr-x    2 jesse    jesse        4096 Mar 23 19:16 utils
    -rw-r--r--    1 jesse    jesse           0 Mar 23 19:17 utils.sgml

    ===================Directory listings below here ===========================

    jesse@storm:~/cvs/newbiedoc$ ls -l general-doc
    total 36
    -rw-r--r--    1 jesse    jesse       13126 Mar  1 16:55 sourceforge-guide.sgml
    -rw-r--r--    1 jesse    jesse       19285 Mar  1 16:56 template.sgml

    jesse@storm:~/cvs/newbiedoc$ ls -l system-admin
    total 24
    -rw-r--r--    1 jesse    jesse       23566 Mar  1 16:56 runlevels-intro.sgml

    jesse@storm:~/cvs/newbiedoc$ ls -l text-editors
    total 32
    -rw-r--r--    1 jesse    jesse       11139 Mar  1 16:56 joe.sgml
    -rw-r--r--    1 jesse    jesse       18598 Mar  1 16:56 vi.sgml

    jesse@storm:~/newbiedoc$ ls -l utils
    total 8
    -rw-r--r--    1 jesse    jesse        6390 Mar  1 16:56 grep.sgml
Entity names.
An entity name should be the same as the name of the file it points to in its declaration. When two files in different subdirectories happen to have the same name, the subdirectory name is included in the entity name. For example:
    <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V3.1//EN"

    [
    <!-- These entities import the chapter files. -->
    <!ENTITY general-doc SYSTEM "general-doc.sgml">
    <!ENTITY system-admin SYSTEM "system-admin.sgml">
    <!ENTITY text-editors SYSTEM "text-editors.sgml">
    <!ENTITY utils SYSTEM "utils.sgml">

    <!-- These entities import the section files for -->
    <!-- General Documentation -->
    <!ENTITY sourceforge-guide SYSTEM "general-doc/sourceforge-guide.sgml">
    <!ENTITY template SYSTEM "general-doc/template.sgml">

    <!-- These entities import the section files for -->
    <!-- System Administration -->
    <!ENTITY runlevels-intro SYSTEM "system-admin/runlevels-intro.sgml">

    <!-- These entities import the section files for -->
    <!-- Text Editors -->
    <!ENTITY text-editors-joe SYSTEM "text-editors/joe.sgml">
    <!ENTITY text-editors-vi SYSTEM "text-editors/vi.sgml">

    <!-- These entities import the section files for -->
    <!-- Utilities -->
    <!ENTITY utils-grep SYSTEM "utils/grep.sgml">
    ]>

    <book id="index" lang="en">
Strictly speaking, vi.sgml and joe.sgml would already get distinct entity names (vi and joe, respectively), but I chose to include their subdirectory name in their entity names as a precaution. Both of these editors come up very often in HOWTOs and usually have their own section with a link, and the name of that link is usually vi or joe. To avoid confusion and a possible conflict, we use the subdirectory name along with the filename as the entity name.
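The naming rule is mechanical enough to write down as a small function. This is only an illustration of the convention described above, not a tool the project uses. The function name `entity_name` and its `qualify` flag are invented for the example.

```python
from pathlib import PurePosixPath


def entity_name(path, qualify=False):
    """Derive an entity name from the path of the file it points to.

    By default the name is the file name without ".sgml". With qualify=True
    the subdirectory name is put in front, as was done for vi and joe.
    """
    p = PurePosixPath(path)
    stem = p.stem
    if qualify and p.parent.name:
        return f"{p.parent.name}-{stem}"
    return stem


print(entity_name("general-doc/template.sgml"))
print(entity_name("text-editors/vi.sgml", qualify=True))
```

Whether to qualify a name is still a judgment call: the convention asks for it when two files share a name, and allows it as a precaution for names that are likely to collide later.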
URL/link names.
The best way to avoid duplicate link names is to append the filename to the id you are creating. That way, when the section documents are imported into the book, the administrators won't have to fix five or six section documents that all have a link named "intro". As an example, we will fix the first section document I showed you at the beginning of this tutorial. We will assume the file is called howdy.sgml.
    <!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook V3.1//EN">

    <sect1 id="intro-howdy"><title>Hello world introduction</title>

    <para>
    Hello world!
    </para>

    *this file left incomplete intentionally*
As you can see, I have appended "-howdy" to the end of the link name. When the administrators do a book build and all the section documents are imported into one file for processing, it is now highly unlikely that the links will conflict.
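The reason the suffix helps is easy to show with two hypothetical section files that both contain an introduction. The sketch below assumes the ids of each file have already been collected into a dictionary. The helper names `qualified_id` and `find_id_conflicts` are made up for this example, and ids are compared in upper case on the assumption that the parser ignores case in ids.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def qualified_id(base, filename):
    """Append the file name (without extension) to a link name."""
    return f"{base}-{PurePosixPath(filename).stem}"


def find_id_conflicts(ids_by_file):
    """Map each id used in more than one file to the sorted list of files."""
    owners = defaultdict(set)
    for filename, ids in ids_by_file.items():
        for id_ in ids:
            owners[id_.upper()].add(filename)
    return {k: sorted(v) for k, v in owners.items() if len(v) > 1}


# Two section files that both use a bare "intro" collide in the book build.
plain = {"vi.sgml": ["intro"], "joe.sgml": ["intro"]}
print(find_id_conflicts(plain))

# With the file name appended, the ids no longer collide.
suffixed = {name: [qualified_id(i, name) for i in ids]
            for name, ids in plain.items()}
print(find_id_conflicts(suffixed))
```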
Sections.
Please notice that these templates all include the GNU Free Documentation License notice at the top. This is required so that each individual section is covered by our license. If you use your own templates, you must include this license notice at the top of your SGML file. Please make sure it is commented out. If this license is unacceptable to you, then you should publish your documents elsewhere.
    <!--
    Copyright (c) 2001 NewbieDoc project at
    http://sourceforge.net/projects/newbiedoc
    Permission is granted to copy, distribute and/or modify this
    document under the terms of the GNU Free Documentation License,
    Version 1.1 or any later version published by the Free Software
    Foundation; with no Invariant Sections, with no Front-Cover
    Texts, and with no Back-Cover Texts. A copy of the license can
    be found at http://www.fsf.org/copyleft/fdl.html.
    -->

    <!-- Want to build this document? Uncomment the line -->
    <!-- directly below. Make sure you comment it back out -->
    <!-- before you commit it back to cvs. -->

    <!--
    <!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
    -->
    <sect1 id="name-of-document"><title>Name of document</title>

    <sect2 id="intro-nameoffile"><title>Introduction</title>
    <para>
    some filler
    </para>
    </sect2>

    <sect2 id="changelog-nameoffile"><title>Changelog</title>
    <para>
    some filler
    </para>
    </sect2>

    </sect1>
Chapters.
    <!--
    Copyright (c) 2001 NewbieDoc project at
    http://sourceforge.net/projects/newbiedoc
    Permission is granted to copy, distribute and/or modify this
    document under the terms of the GNU Free Documentation License,
    Version 1.1 or any later version published by the Free Software
    Foundation; with no Invariant Sections, with no Front-Cover
    Texts, and with no Back-Cover Texts. A copy of the license can
    be found at http://www.fsf.org/copyleft/fdl.html.
    -->

    <!-- Uncomment the declaration directly below this when you -->
    <!-- want to build this entire chapter without building -->
    <!-- the entire book. Make sure you comment it back out -->
    <!-- before you commit it back to cvs. -->
    <!-- Don't forget to add an entity declaration to index.sgml -->

    <!--
    <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN"
    [
    <!ENTITY sample SYSTEM "put_your_doc_here.sgml">
    <!ENTITY sample1 SYSTEM "put_your_doc_here.sgml">
    <!ENTITY sample2 SYSTEM "put_your_doc_here.sgml">
    ]>
    -->

    <chapter id="name-of-chapter"><title>Name of Chapter</title>

    <!-- Section 1 sample -->
    &sample;

    <!-- Section 2 sample1 -->
    &sample1;

    </chapter>
New documents should be added to the appropriate chapter subdirectory. Using the previous example, if I added a new section document for emacs, it would go into the text-editors directory. I would then add an entity to the text-editors.sgml file pointing to this new file, and an entity declaration to the index.sgml file. Before committing to CVS, it is a good idea to "build" the chapter with your SGML parser and make sure everything builds without errors.
Here's a quick checklist:
1. Compose your new section document. Include the GNU Free Documentation License notice at the top of the SGML file. Run your SGML parser on it and confirm there are no errors.
2. Add an entity and an entity declaration to the appropriate chapter file. Run your SGML parser on the entire chapter and make sure it and your new section document build without errors.
3. Add an entity declaration to the index.sgml file.
4. Comment out the DOCTYPE declaration at the top of your section document and commit it to CVS. If you ran your SGML parser on the entire chapter, also comment out the DOCTYPE declaration in the chapter file, including the entity declarations inside it, before you commit it to CVS. Comment out only the declarations; leave the entities themselves in the chapter.
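The last item on the checklist is the easiest one to forget, so a small check before committing can help. The sketch below is one possible way to do it and is not part of the NewbieDoc tooling. It treats everything between "<!--" and "-->" as a comment, which is a simplification of the real SGML comment rules, and then looks for a DOCTYPE declaration that is still active. The name `has_active_doctype` is invented for this example.

```python
import re

# Simplification: everything between "<!--" and the next "-->" is a comment.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def has_active_doctype(sgml_text):
    """Return True if a DOCTYPE declaration exists outside of comments."""
    without_comments = COMMENT_RE.sub("", sgml_text)
    return "<!DOCTYPE" in without_comments


ready_to_commit = '''<!--
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
-->
<sect1 id="name-of-document"><title>Name of document</title>
</sect1>
'''

still_uncommented = '''<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<sect1 id="name-of-document"><title>Name of document</title>
</sect1>
'''

print(has_active_doctype(ready_to_commit))
print(has_active_doctype(still_uncommented))
```

A file for which the function returns True still has its declaration uncommented and should be fixed before it goes into CVS.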
The only difference between updating a document and adding a new one is that, if you changed a filename, you have to track down and update all the entities and entity declarations that refer to it.
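Tracking down declarations that still point at an old filename can be partly automated. The following sketch lists every entity declaration whose target is not among the files that exist. It is a hypothetical helper: the name `stale_declarations` is invented, and the set of existing files is passed in by hand here, although in practice it could be gathered from the working directory.

```python
import re

# Matches declarations such as: <!ENTITY vi SYSTEM "text-editors/vi.sgml">
DECL_RE = re.compile(r'<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"([^"]+)"\s*>')


def stale_declarations(sgml_text, existing_files):
    """Return (entity name, path) pairs whose path is not an existing file."""
    return [(name, path)
            for name, path in DECL_RE.findall(sgml_text)
            if path not in existing_files]


index_text = '''
<!ENTITY text-editors-joe SYSTEM "text-editors/joe.sgml">
<!ENTITY text-editors-vi SYSTEM "text-editors/vi.sgml">
'''

# Suppose vi.sgml was renamed to vim.sgml without updating index.sgml.
on_disk = {"text-editors/joe.sgml", "text-editors/vim.sgml"}

for name, path in stale_declarations(index_text, on_disk):
    print(f"{name} points at missing file {path}")
```

Because the search is a plain text match, it also reports declarations that are currently commented out, which is useful here since those need updating too.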