Chapter Three: Begin with the Basics

So after yesterday, with lots of text to read and digest, you're probably wondering when you're actually going to get to write an actual Web page.

Welcome to Day 2! Today you'll learn about HTML, the language WWW hypertext documents are written in, and specifically about the following things:

What HTML is and why you have to use it
What you can and cannot do when you design HTML documents
HTML tags: what they are and how to use them
Tags for titles, headings, and paragraphs: <TITLE>, <H1>...<H6>, <P>
Tags for comments

What HTML Is...and What It Isn't

But before you dive into actually writing some HTML, you should know what HTML is, what it can do, and most specifically what it can't do.

HTML stands for HyperText Markup Language. HTML is based on SGML (the Standard Generalized Markup Language), which is used to describe the general structure of various kinds of documents. It is not a page description language like PostScript, nor is it a language that can be easily generated from your favorite page layout program. The focus of HTML is the content of the document, not its appearance. This section explains a little bit more about that.

HTML Describes the Structure of a Document

HTML, by virtue of its SGML heritage, is a language for describing structured documents. The theory behind this is that most documents have common elements--for example, titles, paragraphs, or lists--and if you define a set of elements that a document has before you start writing, you can label those parts of the document with the appropriate names. (See Figure 3.1.) And once you've labeled a document in terms of its structure, you can then write tools that do things like automatic indexing, or footnotes, or cross-references.

Figure 3.1. Document elements (3K GIF)

If you've worked with word processing programs that use style sheets (such as Microsoft Word) or paragraph catalogs (such as FrameMaker), then you've done something similar; each section of text conforms to one of a set of styles that are pre-defined before you start working.

The elements of a document are labeled through the use of HTML tags. It is the tags that describe the document; anything that is not a tag is part of the document itself.
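
As a rough illustration of that split between tags and text, here is a small made-up fragment. The recipe wording is invented for this sketch, and the tags themselves are explained later in this chapter. Everything inside angle brackets labels a part of the document; everything else is the document's own text.

```html
<TITLE>Grandma's Apple Pie</TITLE>
<H1>Grandma's Apple Pie</H1>
<P>This recipe has been in the family for three generations.</P>
<H2>Ingredients</H2>
<P>Six tart apples, one cup of sugar, and a double pie crust.</P>
```

Note that nothing here says how big the heading should be or which font to use. The tags only say what each piece is.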

HTML Does Not Describe Page Layout

What style sheets and templates do provide that HTML doesn't is the appearance of each part of the document on the page or screen. For example, styles in Microsoft Word not only have a name ("heading1," for example, for a heading), they also describe the font, the size, and the indentation, among other things, of that heading.

With a few minor exceptions, HTML does not describe the appearance or layout of a document. The designers of HTML did this on purpose. Why? Because if you separate the structure of a document and its appearance, you can then quickly and easily change the appearance of that document without a lot of tinkering. You can format the document in different ways for different audiences or for different purposes (printed or online documents, quick reference cards, help systems). Also, you or the readers of your document can reformat any text on the fly to different styles as desired. All that is needed is a formatting tool that can interpret the tags.

Web browsers, in addition to providing the networking functions to retrieve documents over the Net, are also HTML formatters. When you load an HTML document into a browser such as Mosaic or Lynx, that browser reads, or parses, the HTML information and formats the text and images on the screen. If you use different browsers, you may notice that the same document may appear differently in each browser--the headings may be centered in one, or in a larger font.

This does put a wrinkle in how you write and design your Web documents, however, and it may often frustrate you. The number one prevailing rule of designing documents for the Web, as I have mentioned before and will mention throughout this book, is this:

Do NOT design your documents based on what they look like in one browser. Focus instead on providing clear, well-structured content that is easy to read and understand.

HTML Is Limited

So now you've realized that you won't be able to do really interesting visual things on the World Wide Web; that someone else is controlling what your document will look like. There's more bad news.

In the current state of HTML, the choices you have for the elements in your document (the tags) are also very limited. There are very few kinds of elements to choose from: headings, paragraphs, and a few kinds of lists are essentially it. You can include images, but you can't align a column of text next to an image. You can't indent text, or center it, or format it into tables.

You also cannot make up your own elements (tags); if you could, how would browsers know how to interpret them?

So the answer is, this is what you're stuck with.

HTML's Advantages (Yes, There Are Some)

Working in a text-only markup language, with little control over the appearance of the text and limited tags to choose from, may seem frustratingly archaic in this age of fully-WYSIWYG desktop publishing. But for the kind of environment that the Web provides, HTML does have advantages over other forms of document publishing language that would include more features and allow you more control. For example:

Each HTML document is small, so it can be transferred over the Net as fast as possible. You don't have to include font or formatting information that would slow down the time it would take to load and display the document.
HTML documents are device-independent. This is a fancy way of saying that they can be displayed on any platform; all you need is a browser for that platform that understands HTML. You don't have to worry about font formats (or font names or whether a font is installed), or display resolutions, or whether you have a color monitor or not. The browser worries about that.

Also, although HTML is a markup language, it is an especially small and simple-to-learn markup language. There are very few tags to memorize, and there are simple editors that can even insert HTML tags into text for you. Other markup or page layout languages (such as the PostScript page description language, or troff on the UNIX system) are much larger and require a lot of initial learning before you can write simple documents. With HTML you can get started right away, as you'll find out later in this chapter.

Will It Get Better?

Yes, it will.

Most browsers available now support what is called HTML Level One--the first version of the HTML specification (consider it to be something like the 1.0 release of a software program). HTML Level One is the base standard for Web documents; a browser must support most, if not all, HTML Level One tags. This book focuses primarily on HTML Level One.

Two other levels of HTML have been proposed. HTML Level Two is similar to HTML Level One, but has additional features to support interactive forms. (You'll learn more about forms in Chapter 13, "Forms and Image Maps.") By the time you read this, most browsers should be able to handle HTML Level Two; many of the more popular browsers support it now.

HTML Level Three, often called HTML+, is proposed as the next major release of the language. HTML+ includes elements for such things as

Centered and right-aligned text
Tables
Mathematical equations
The alignment of text and images next to each other

HTML+ is, at the time this is being written, still very much open for discussion, and even when the specification is settled, quite a few elements are complex enough that it may take some time for browsers to be able to implement them effectively. But the future looks bright for a more flexible and general HTML language.

What HTML Files Look Like

Documents written in HTML are in plain text (ASCII), and contain two things:

The text of the document itself
HTML tags that indicate document elements, structure, formatting, and hypertext links to other documents or to included media.

Most HTML tags look something like this:

<TheTagName> affected text </TheTagName>

The tag name itself (here, TheTagName) is enclosed in brackets (<>).

HTML tags generally have a beginning and an ending tag, surrounding the text that they affect. The beginning tag "turns on" a feature (such as headings, bold, and so on), and the ending tag turns it off. Closing tags generally have the tag name preceded by a slash (/).

Not all HTML tags have a beginning and an end. Some tags are only one-sided, and still other tags are "containers" that hold extra information and text inside the brackets. You'll learn about these tags as the book progresses.

All HTML tags are case-insensitive; that is, you can specify them in upper or lower case, or in any mixture. So, <HTML> is the same as <html> is the same as <HtMl>. I like to put my tags in all caps (<HTML>) so I can pick them out from the text better. That's how I'll show them in the examples in this book.
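
To see case-insensitivity at work, you could try a small test like the following. The heading text is only a placeholder, and the <H1> heading tag is covered later in this chapter. A browser should treat all three lines the same way, including the last one, where the opening and closing tags differ in case.

```html
<H1>Welcome to My Page</H1>
<h1>Welcome to My Page</h1>
<h1>Welcome to My Page</H1>
```

Whichever style you choose, sticking to one throughout a document makes the tags easier to spot when you come back to edit it.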

Exercise 3.1. Take a Look at HTML Sources

Before you actually start writing your own HTML documents, it helps to get a feel for what an HTML document looks like. Luckily, there's plenty of source out there for you to look at, since every document that comes over the wire to your browser is in HTML format. (You usually only see the formatted version after the browser gets done with it.)

Most Web browsers have a way of viewing the HTML source of the Web page you're currently looking at. You may have a menu item or a button for View Source or View HTML. In Lynx, the \ (backslash) command toggles between source view and formatted view.

Some browsers do not have the capability to directly view the source of a Web document, but do allow you to save the current page as a file to your local disk. Under a dialog box for saving the file, there may be a menu of formats; for example, Text, PostScript, or HTML. You can save the current page as HTML and then open that file in a text editor or word processor to see the HTML source.

Try going to a typical home page, then viewing the source for that page. For example, Figure 3.2 shows what the normal NCSA Mosaic home page (URL http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/NCSAMosaicHome.html) looks like.

Figure 3.2. Mosaic home pages (35K GIF)

The HTML source of that page should look something like Figure 3.3.

Figure 3.3. Some HTML source (8K GIF)

Try viewing the source of your own favorite Web pages. You should start seeing some similarities in the way pages are organized, and get a feel for the kinds of tags that HTML uses. You can learn a lot about HTML by comparing the text on the screen with the source for that text.

Exercise 3.2. Creating an HTML Document

You've seen what HTML looks like--now it's your turn. Let's start with a really simple example so you can get a basic feel for what HTML looks like.

To write an HTML document, all you really need is an editor that can write text (ASCII) files. You can use a plain old text editor (for example, TeachText on the Mac or vi on UNIX), or you can use a full-featured word processor, as long as it can save the files as text only, with no control codes or funny characters.

Open up that text editor, and type the following code. You don't have to understand what any of this means at this point; you'll learn about it later in this chapter. This is just a simple example to get you started:

<HTML><HEAD>
<TITLE>My Sample HTML Document</TITLE></HEAD>
<BODY>
<H1>This is an HTML Document</H1>
</BODY></HTML>

After you create your HTML file, save it to disk--and remember to save it as a text-only file if you're using a word processor. One other thing to note when you save your file: Many HTML browsers use an extension to determine whether the file is an HTML or a plain file. So when you name the file, give it an extension of .html (.htm on DOS systems); for example, myexample.html or homepage.htm.

Now, start up a Web browser such as Mosaic. You don't have to be connected to the network, since you're not going to be opening documents at any other site (although your browser may require you to be on a network; since this varies from browser to browser, give it a try and see what happens). Look in your browser for a menu item or button for Open Local.... (In Lynx, simply use the command lynx myfile.html from a command line.) The Open Local command (or its equivalent) tells the browser to read in an HTML file from a local disk, parse it, and display it, as if it were a page already out on the Web. Using your browser and the Open Local command, you can write and test your HTML files on your computer in the privacy of your own home.

Try opening up the little file you just created in your browser. You should see something like the picture shown in Figure 3.4.

Figure 3.4. The sample HTML file (6K GIF)

If you don't see something like what's in the picture, go back into your text editor, compare your file against the example, and fix any differences you find. You don't have to quit your browser; just fix the file and save it again under the same name.

Then, in your browser, choose Reload or its equivalent. (In Lynx, it's Control+R.) The browser will read the new version of your file, and voila, you can edit and preview and edit and preview until you get it right.

A Note About Formatting

When an HTML document is parsed by a browser, any formatting you may have done by hand--that is, any extra spaces, tabs, returns, and so on--is ignored. The only thing that formats an HTML document is an HTML tag. If you spend hours carefully editing a plain text file to have nicely formatted paragraphs and columns of numbers, but do not include any tags, then when you read the document into an HTML browser, all the text will flow into one paragraph, and all your work will have been in vain.
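
Here is a small sketch of that situation. The page title and the figures are invented for illustration. The body contains a hand-aligned column of numbers and a blank line meant to start a new paragraph, but no tags to mark either one.

```html
<HTML>
<HEAD>
<TITLE>Monthly Totals</TITLE>
</HEAD>
<BODY>
January      120
February      95
March        143

A new paragraph was intended here, but there is no tag to say so.
</BODY>
</HTML>
```

When you open a file like this, expect the browser to run the months, the numbers, and the final sentence together into one block of text, with each run of spaces and returns treated as a single space.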

NOTE: There's one exception to this rule; a tag called <PRE>. You'll learn about this tag tomorrow in Chapter 5, "More HTML."

The advantage of having all white space (spaces, tabs, returns) ignored is that you can put your tags wherever you want to. The following examples all produce the same output. (Try it!)

<H1>If music be the food of love, play on.</H1>

<H1>
If music be the food of love, play on.
</H1>

<H1>
If music be the food of love, play on.                    </H1>

<H1>  If  music   be   the   food    of   love,  play  on. </H1>

Programs to Help You Write HTML

You may be thinking that all this tag stuff is a real pain, especially if you didn't get that small example right the first time. (Don't fret about it; I didn't get that example right the first time, and I created it.) You have to remember all the tags. And you have to type them in right and close each one. What a hassle.

There are programs that can help you write HTML. These programs tend to fall into two categories: editors, in which you write HTML directly, and converters, which convert the output of some other word processing program into HTML.

Editors

Many freeware and shareware programs are available for editing HTML files. Most of these programs are essentially text editors with extra menu items or buttons that insert the appropriate HTML tags into your text. HTML-based text editors are particularly nice for two reasons: You don't have to remember all the tags, and you don't have to take the time to type them all in.

I discuss some of the available HTML-based editors in Chapter 14, "HTML Assistants: Editors and Converters." For now, if you have an HTML editor, feel free to use it for the examples in this book. If all you have is a text editor, no problem; it just means you'll have to do a little more typing.

What about WYSIWYG editors? The problem is that there's really no such thing as WYSIWYG when you're dealing with HTML, since WYG varies wildly based on the browser that someone is using to read your document. So you could spend hours in a so-called WYSIWYG HTML editor (say, one that makes your documents look just like Mosaic), only to discover that when the output of that editor is read on some other browser, it looks truly awful.

The best way to deal with HTML is not to get too hung up on its appearance. Write clear HTML code and make sure your writing is clear and well-organized, and the appearance will take care of itself.

Converters

In addition to the HTML editors, there are also converters, which take files from many popular word-processing programs and convert them to HTML. This is the closest thing HTML gets to being WYSIWYG; with a simple set of templates, you could write your documents entirely in the program that you're used to, and then convert the result, and almost never have to deal with all this non-WYSIWYG text-only tag nonsense.

In many cases, converters can be extremely useful, particularly for putting existing documents on the Web as fast as possible.

However, converters are in no way an ideal environment for HTML development. What converter programs exist (and most of them are shareware or public domain, with little support) are fairly limited, not necessarily by their own features, but mostly by the limitations in HTML itself. No amount of fancy converting is going to make HTML do things that it can't yet do. If a particular capability doesn't exist in HTML, there's nothing the converter can do to solve that.

The other problem with converters is that even though you can do most of your writing and development in a converter with a simple set of formats and low expectations, you will eventually have to go "under the hood" and edit the HTML text yourself. Most converters do not convert images. No converter that I have seen will automate links to documents out on the Web, although a few do links to related local documents.

In other words, even if you've already decided that you want to do the bulk of your Web work using a converter, you'll need to know HTML anyhow. So press onward; there's not that much to learn.

Structuring Your HTML

HTML defines three tags that are used to describe the document's overall structure and provide some simple "header" information that browsers or HTML parsers can use to figure out what your document is, or to find out simple information about the document (such as its title or who wrote it) before loading the entire thing. The document structure tags don't affect what the document looks like when it's formatted; they're only there to help tools that interpret or filter HTML files.

Although a "correct" HTML document will always contain these structure tags, if your document does not contain them, most browsers will be able to read it anyway. However, because it is possible that in the future the document structure tags might become required elements, or that tools may come along that require them, if you get in the habit of including the document structure tags now, you won't have to worry about updating all your files later on.

<HTML>

The first document structure tag in every HTML document is the <HTML> tag, which indicates that the content of this file is in the HTML language.

All the text and HTML commands in your HTML document should go within the beginning and ending HTML tags, like this:

<HTML>
...your document...
</HTML>

<HEAD>

The <HEAD> tag specifies that the lines within the beginning and ending points of the tag are the prologue to the rest of the file. There are generally only a few tags that go into the <HEAD> portion of the document (most notably, the document title, described below). You should never put any of the text of your document into the header.

Here's a typical example of how you would properly use the <HEAD> tag (you'll learn about TITLE later on):

<HTML>
<HEAD>
<TITLE>This is the Title.</TITLE>
</HEAD>
....
</HTML>

<BODY>

The remainder of your HTML document, including all the text and other content (links, pictures, and so on), is enclosed within a <BODY> tag. In combination with the <HTML> and <HEAD> tags, the result looks like this:

<HTML>
<HEAD>
<TITLE>This is the Title. It will be explained later on</TITLE>
</HEAD>
<BODY>
....
</BODY>
</HTML>

The Title

Each HTML document needs a title. To give a document a title, use the <TITLE> HTML tag. <TITLE> tags always go inside the document header (the <HEAD> tags) and describe the contents of the page, like this:

<HTML>
<HEAD>
<TITLE>The Lion, The Witch, and the Wardrobe</TITLE>
</HEAD>
<BODY>
...
</BODY>
</HTML>

You can only have one title in the document, and that title can only contain plain text; that is, there shouldn't be any other tags inside the title.
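
As a sketch of the plain-text rule, compare the two alternatives below. They are two versions of the same title, not lines from one document, and the heading tag inside the first is just a stand-in for any tag you might be tempted to add (headings are described later in this chapter).

```html
<!-- Not recommended: another tag inside the title -->
<TITLE>The <H1>Lion</H1>, The Witch, and the Wardrobe</TITLE>

<!-- Better: plain text only -->
<TITLE>The Lion, The Witch, and the Wardrobe</TITLE>
```

How a browser handles the first version is anyone's guess, which is reason enough to avoid it.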

When you pick a title, try to pick one that is both short and descriptive of the content on the page. Additionally, your title should also be relevant out of context. If someone browsing on the Web followed a random link and ended up on this page, or if they found your title in a friend's browser history list, would they have any idea what this page is about? You may not intend the page to be used independently of the documents you specifically linked to it, but because anyone can link to any page at any time, be prepared for that consequence and pick a helpful title.

Additionally, because many browsers put the title in the title bar of the window, you may have a limited number of words available. (Although the text within the <TITLE> tag can be of any length, it may be cut off by the browser when it's displayed.) Here are some other examples of good titles:

<TITLE>Poisonous Plants of North America</TITLE>
<TITLE>Image Editing: A Tutorial</TITLE>
<TITLE>Upcoming Cemetery Tours, Summer 1995</TITLE>
<TITLE>Installing The Software: Opening the CD Case</TITLE>
<TITLE>Laura Lemay's Awesome Home Page</TITLE>

And some not-so-good titles:

<TITLE>Part Two</TITLE>
<TITLE>An Example</TITLE>
<TITLE>Nigel Franklin Hobbes</TITLE>
<TITLE>Minutes of the Second Meeting of the Fourth Conference 
of the Committee for the Preservation of English Roses, Day Four, 
After Lunch</TITLE>

HTML Input and Output

Input:

<TITLE>Poisonous Plants of North America</TITLE>

Output: See Figures 3.5 and 3.6.

Figure 3.5. The output in Mosaic (5K GIF)

Figure 3.6. The output in Lynx (4K GIF)

Headings

Headings are used to divide sections of text, just like this book is divided. ("Headings," above, is a heading.) HTML defines six levels of headings. Heading tags look like this:

<H1>Installing Your Safetee Lock</H1>

The numbers indicate heading levels (H1 through H6). The headings themselves, when they're displayed, are not numbered; they're displayed either in bigger or bolder text or centered or underlined or in all caps--in some way that makes them stand out from regular text.

Think of the headings as though they were items in an outline; if the text you're writing about has a structure, use the headings to indicate that structure, as shown in the next code lines. (Note that I've indented the headings in this example to show the hierarchy better. They don't have to be indented in your document, and, in fact, the indenting will be ignored by the browser.)

<H1>Engine Tune-Up</H1>
    <H2>Change The Oil</H2>
    <H2>Adjust the Valves</H2>
    <H2>Change the Spark Plugs</H2>
        <H3>Remove the Old Plugs</H3>
        <H3>Prepare the New Plugs</H3>
            <H4>Remove the Guards</H4>
            <H4>Check the Gap</H4>
            <H4>Apply Anti-Seize Lubricant</H4>
            <H4>Install the Plugs</H4>
    <H2>Adjust the Timing</H2>

Note that unlike titles, headings can be any length you want them to be, including lines and lines of text (although because headings are emphasized, having lines and lines of emphasized text may be tiring for your reader).

It's a common practice to use a first-level heading at the top of your document to either duplicate the title (which is usually displayed elsewhere), or to provide a shorter or less contextual form of the title. For example, if you had a page that showed several examples of folding bedsheets, part of a long document on how to fold bedsheets, the title might look something like this:

<TITLE>How to Fold Sheets: Some Examples</TITLE>

The top-most heading, however, might just say:

<H1>Examples</H1>

Don't use headings to do boldface, or to make certain parts of your document stand out more. Although it may look cool on your browser, you don't know what it'll look like when other people use their own browsers to read your document (and, in fact, it may look really stupid).

Also, it's a good idea to use headings hierarchically; that is, to start your document with a first-level heading and to use the headings in order. Don't skip levels. If you follow a first-level head with a fourth-level head, for example, readers will probably wonder what happened to the second and third level headings in between. Even though you may prefer the look of certain headings in certain places in your browser, they may look entirely different and be confusing to someone else using another browser.
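
To make the point about skipped levels concrete, here is a short sketch that reuses headings from the tune-up outline shown earlier. The first group jumps from level one to level four; the second walks down through the levels in order.

```html
<!-- Skips from level one straight to level four -->
<H1>Engine Tune-Up</H1>
<H4>Check the Gap</H4>

<!-- Uses the levels in order -->
<H1>Engine Tune-Up</H1>
<H2>Change the Spark Plugs</H2>
<H3>Prepare the New Plugs</H3>
<H4>Check the Gap</H4>
```

In the second group, a reader can tell where "Check the Gap" belongs in the outline no matter how a particular browser chooses to draw each heading level.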

HTML Input and Output

Input:

<H1>Engine Tune-Up</H1>
    <H2>Change The Oil</H2>
    <H2>Change the Spark Plugs</H2>
        <H3>Prepare the New Plugs</H3>
            <H4>Remove the Guards</H4>
            <H4>Check the Gap</H4>

Output: See Figures 3.7 and 3.8.

Figure 3.7. The output in Mosaic (8K GIF)

Figure 3.8. The output in Lynx (5K GIF)

Paragraphs

Now that you have a document title and several headings, let's add some ordinary paragraphs to the document.

Unfortunately, paragraphs in HTML are slippery things. Between the three versions of HTML, the definition of a paragraph has changed. The only thing they agree on is the fact that you indicate a plain text paragraph using the <P> tag.

The first version of HTML specified the <P> tag as a one-sided tag. There was no corresponding </P>, and the <P> tag was used to indicate the end of a paragraph, not the beginning. So paragraphs in the first version of HTML looked like this:

The blue sweater was reluctant to be worn, and wrestled with 
her as she attempted to put it on. The collar was too small, 
and would not fit over her head, and the arm holes moved seemingly 
randomly away from her searching hands.<P>
Exasperated, she took off the sweater and flung it on the floor. 
Then she vindictively stomped on it in revenge for its recalcitrant 
behavior.<P>

Most browsers that were written at the time of HTML 1 assume that paragraphs will be formatted this way. When they come across a <P> tag, they start a new line and add some extra vertical space between the line they just ended and the one that they just began, as shown in Figure 3.9.

Figure 3.9. How paragraphs are formatted (8K GIF)

In the HTML Level 2 specification and the proposed Level 3 (HTML+) tags, the paragraph tag has been revised. In these versions of HTML, the paragraph tags are two-sided (<P>...</P>), but <P> indicates the beginning of the paragraph. Also, the closing tag (</P>) is optional, presumably to be backwards-compatible with the original version of HTML. So the sweater story would look like this in the newer versions of HTML:

<P>The blue sweater was reluctant to be worn, and wrestled
with her as she attempted to put it on. The collar was too small, 
and would not fit  over her head, and the arm holes moved seemingly 
randomly away from her searching hands.</P>
<P>Exasperated, she took off the sweater and flung it on the 
floor. Then she vindictively stomped on it in revenge for its 
recalcitrant behavior.</P>

The good news is that if you want to use the new version of the paragraph tag (as I do in all the examples throughout this book), most, if not all, browsers will accept it without complaint. (I haven't found any that have a problem with it.)

However, note that because many browsers expect <P> to indicate the end of a paragraph, if you use it at the beginning you may end up with extra space in between the first paragraph and the element before it, as shown in Figure 3.10.

Figure 3.10. Extra space before paragraphs (8K GIF)

If this bothers you overly much, you can do one of the following:

Go back to the old style of defining paragraphs.
Use <P> as a paragraph separator, rather than indicating the beginning or ending of a paragraph.
Leave off the first <P> in each set of paragraphs.
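
The last two workarounds might look something like the sketch below. The old style was already shown above, so it is not repeated here. The "The Sweater" heading is invented for this illustration, and the second version reflects one reading of "leave off the first <P>," in which the first paragraph simply carries no paragraph tags at all.

```html
<!-- Separator style: one paragraph tag between the two paragraphs -->
<H2>The Sweater</H2>
The blue sweater was reluctant to be worn.
<P>
Exasperated, she took off the sweater and flung it on the floor.

<!-- New style, with the paragraph tags left off the first paragraph -->
<H2>The Sweater</H2>
The blue sweater was reluctant to be worn.
<P>Exasperated, she took off the sweater and flung it on the floor.</P>
```

Either way, the first paragraph follows the heading directly, so a browser that adds space at every <P> has nothing to add there.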

Some people like to use <P> tags to pad extra space around other tags to spread out the text on the page. Once again, the cardinal reminder: Design for content, not for appearance. Someone with a text-based browser is not going to care much about the extra space you so carefully put in.

HTML Input and Output

Input:

<P>The sweater lay quietly on the floor, seething from its 
ill treatment. It wasn't its fault that it didn't fit right. It 
hadn't wanted to be purchased by this ill-mannered woman.</P>

Output: Figures 3.11 and 3.12 show the output.

Figure 3.11. The output in Mosaic (6K GIF)

Figure 3.12. The output in Lynx (5K GIF)

Comments

You can put comments into HTML documents to describe the document itself or to provide some kind of indication of the status of the document; some source code control programs can put document status into comments, for example. Text in comments is ignored when the HTML file is parsed; comments don't ever show up on screen--that's why they're comments. Comments look like this:

<!-- This is a comment -->

Each line should be individually commented, and it's usually a good idea not to include other HTML tags within comments. (Although this practice isn't strictly illegal, many browsers may get confused when they encounter HTML tags within comments and display them anyway.)

Here are some examples:

<!-- Rewrite this section with less humor -->
<!-- Neil helped with this section -->
<!-- Go Tigers! -->

Exercise 3.3. Creating a Real HTML Document

At this point you should know enough to get started creating simple HTML documents: You understand what HTML is, you've been introduced to a handful of tags, and you've even tried browsing an HTML file. You haven't done any links yet, but you'll get to that soon enough, in the next chapter.

This last exercise in this chapter shows you how to create an HTML file that uses the tags you've learned about in this chapter, so you can get a feel for what they look like when they're displayed on-screen and for the sorts of typical mistakes you're going to make. (Everyone makes them, and that's why it's often useful to use an HTML editor that does the typing for you. The editor doesn't forget the closing tags, or leave off the slash, or misspell the tag itself.)
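
For reference, here is roughly what those three typical mistakes look like in a file. This is a deliberately broken sketch that borrows text from earlier examples in this chapter; it is not something to copy.

```html
<!-- Closing tag forgotten: nothing tells the browser where the heading ends -->
<H1>Installing Your Safetee Lock

<!-- Slash left off: this starts a second heading instead of ending the first -->
<H2>Change The Oil<H2>

<!-- Tag name misspelled: a browser is unlikely to recognize it -->
<TITEL>Poisonous Plants of North America</TITEL>
```

If your page comes out with far more heading-sized text than you expected, or with a missing title, one of these is a good first thing to check.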

So. Create a simple example in that text editor of yours. It doesn't have to say much of anything; in fact, all it needs to include are the structure tags, a title, a couple of headings, and a paragraph or two. Here's an example:

<HTML>
<HEAD>
<TITLE>Company Profile, Camembert Incorporated</TITLE>
</HEAD>
<BODY>
<H1>Camembert Incorporated</H1>
<P>"Many's the long night I dreamed of cheese -- toasted, mostly." 
-- Robert Louis Stevenson</P>
<H2>What We Do</H2>
<P>We make cheese. Lots of cheese; more than eight tons of cheese 
a year. Your Brie, your Gouda, your Havarti, we make it all.</P>
<H2>Why We Do It</H2>
<P>We are paid an awful lot of money by people who like 
cheese. So we make more.</P>
</BODY>
</HTML>

Save it to an .html file, and open it in your browser and see how it came out.

If you have access to another browser on your platform, or on another platform, I highly recommend that you try opening the same HTML file there so you can see the differences in appearance between browsers. Sometimes the differences can surprise you; lines that looked fine in one browser will look strange in another browser.

For example, the cheese factory example looks like Figure 3.13 in NCSA Mosaic (the Macintosh version).

Figure 3.13. The cheese factory in Mosaic (9K GIF)

And it looks like Figure 3.14 in the character-based Lynx browser.

Figure 3.14. The cheese factory in Lynx (5K GIF)

See what I mean?

Summary

HTML, a text-only markup language used to describe hypertext documents on the World Wide Web, describes the structure of a document, not its appearance.

In this chapter, you've learned what HTML is and how to write and preview simple HTML files. You've also learned about the HTML tags shown in Table 3.1.

Table 3.1. HTML tags from Chapter 3.

Tag                         Use
________________________________________________________________________________
<HTML> ... </HTML>          The entire HTML document
<HEAD> ... </HEAD>          The head, or prologue, of the HTML document
<BODY> ... </BODY>          All the other content in the HTML document
<TITLE> ... </TITLE>        The title of the document
<H1> ... </H1>              First-level heading
<H2> ... </H2>              Second-level heading
<H3> ... </H3>              Third-level heading
<H4> ... </H4>              Fourth-level heading
<H5> ... </H5>              Fifth-level heading
<H6> ... </H6>              Sixth-level heading
<P> ... </P>                Paragraph
<!-- ... -->                Comment

Questions and Answers

Why was HTML chosen as the language for the WWW when it's so limited?

At the time, the goal was simply to put hypertext information up on the Net so that it could be easily downloaded and formatted on the fly in a simple, device-independent way. Given those goals, HTML was an ideal language: simple, small, fast to download, and easy to parse. Since then, new features like images and forms and other media have been added. The limitations of HTML didn't become readily apparent until these new browsers and capabilities came along, and more and more people wanted to publish other kinds of information. And it happened so fast!

HTML Level Three should solve many of these limitations. But there's a long way to go yet before HTML allows full control over formatting and layout, simply because of the speed with which it needs to be downloaded over the Net and formatted. If each Web page you viewed took half an hour to load, would you want to read it?

Can I do any formatting of text in HTML?

You can do some formatting to strings of characters; for example, making a word or two bold. You'll learn about this tomorrow, in Chapter 5.

I've noticed in most Web pages that the document structure tags (<HTML>, <HEAD>, <BODY>) aren't often used. Do I really need to include them if pages work just fine without them?

You don't need to, no. Most browsers will handle plain HTML without the document structure tags. But including the tags will allow your documents to be read by more general SGML tools, and to take advantage of features of future browsers. And, it's the "correct" thing to do if you want your documents to conform to true HTML format.

I've seen comments in some HTML files that look like this:
<!-- this is a comment >
Is that legal?

That's the old form of comments that was used in very early forms of HTML. Although many browsers may still accept it, you should use the new form (and comment each line individually) in your documents.

For more information about this Web book, contact lemay@lne.com
