Things you should know about HTML • James Harbeck

This is the unedited script for the presentation at the 2004 EAC conference. It probably contains a few typos and may in some cases be less explicit than the presentation would have been. But it should nonetheless serve the purpose.


<style> and <img> and <p> and <br>,
<blockquote> and <table>, <td> and <tr>,
<ul> but also <ol> and <dl>...
things you should know about HTML...

Padding or margin? Use float or align?
What special symbols will always work fine?
[Will] someone's old browser send your page to... well...
Things you should know about HTML...

Do your tags close?
Are they nesting?
Do you use CSS?
Once we’ve gone over these things you should know,
you will not have... to guess!


OK, to start with, how familiar are you all with HTML? Raise your hand if you know pretty much nothing. ... Raise your hand if you work with it fairly regularly.

I'm going to be covering mainly things that an editor's likely to need to know (or to find useful to know). I'll cover the basics, but I'll also cover some things that will be useful for people who already use HTML a fair bit. And I'll certainly cover a lot of turf. Let me just say – I'm glad I got a 90-minute session.

For more on what I won't be covering and a solid reference for everything HTML, I highly recommend

The most fundamental principle of HTML can be expressed in one word: nesting. An HTML tag is like a container – or, well, it's like a set of parentheses. Most HTML tags have an opening and a closing, and it's like open and close parentheses. You put in the opening tag... you can then put another tag inside it, but you have to close the inner tag before you close the outer tag!
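
To make the parentheses idea concrete, here is a minimal sketch. The sentences are made up for illustration. The first paragraph nests correctly; the second closes the outer tag while the inner one is still open, which is the mistake to avoid.

```html
<!-- Correct: the inner <i> closes before the outer <b> -->
<p><b>Bold text with <i>italics inside it</i> and more bold.</b></p>

<!-- Wrong: <b> closes while <i> is still open -->
<p><b>Bold text with <i>italics inside it</b> and more italics.</i></p>
```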


You see how you close a tag: with a slash. See, a tag closing is just like a store closing: "We're slashing everything!"

Not all tags have to be closed, though. There are some that represent elements like line breaks, horizontal rules, and images, things that just sit there. They don't close, because they're not containers; they don't affect stuff inside them. If you want to italicize text, you have to say where the italics start, and where they stop, so you're using the tag to affect other material. But you don't put anything inside a picture or a horizontal rule.


You'll notice that some of these tags have attributes. These are things that give further details as to the nature of the tag. And I'll be covering those, too. Attributes go in the opening tag only; when you close a tag, you don't repeat them.
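
A small illustration of both points, with hypothetical file names and values. The first three tags stand alone and never close; the last one is a container, and its attribute appears only in the opening tag.

```html
<!-- Stand-alone tags: nothing goes inside them, so they have no closing tag -->
<br>
<hr width="50%">
<img src="logo.gif" alt="Company logo" width="120" height="60">

<!-- A container tag: the align attribute is in the opening tag only -->
<p align="center">This paragraph is centred.</p>
```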

The second thing that's really important in HTML is that it uses tables. Almost every website you visit today is made up of tables within tables within tables. You're all familiar with tables that look like tables...


...but this is a table too.


And this is made up of a bunch of tables! [show home page]

Oh, there's one other thing I might as well mention now: an HTML file is really just a text file that contains text using markup that a browser can read. Y'all probably know that, and there's a good likelihood that y'all know that all the images are just called by the file; they're not embedded in it – actually, I just showed you that. But one noteworthy consequence of their being plain old text files is that HTML reads double or multiple spaces as single spaces. See, the HTML files tend to be formatted in a way that a human can read without going blind and crazy, with nice indents and so forth. [show code of page] One thing that follows from this is that those indents have to be ignored. How does that happen? Your browser interprets any run of spaces, tabs, and line breaks as a single space. One nice result of this is that when you paste text into HTML, those annoying double spaces that you can't get authors to stop putting after periods are automatically rendered as single spaces. To make a double space – should you for some bizarre reason wish to do so – use a non-breaking space character, which is normally made with Ctrl+Shift+Space (and shows up as &nbsp; in the code). This is a character that looks like a space but acts like a letter, e.g., A or X. So if you put it between two words, the two of them and the space are treated as one word and won't be split over a line break. For a double space, use one of these and a regular space.
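
Here is a minimal sketch of that behaviour; the sentences are invented for illustration.

```html
<!-- The extra spaces and the line break in the source collapse to single spaces -->
<p>These    words   are
   separated by single spaces on screen.</p>

<!-- &nbsp; keeps "Dr." and "Harbeck" together on one line -->
<p>Ask Dr.&nbsp;Harbeck about it.</p>

<!-- One non-breaking space plus one regular space displays as a double space -->
<p>End of sentence.&nbsp; Start of the next.</p>
```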


[music]But let's start at the very beginning... for that's a very good place to start... in words, you begin with A-B-C; in HTML it's H-E-A-D! H-E-A-D! The very first tag just happens to be... head![/music]

Actually, the very first tag is <html>, but that's actually optional, and you don't need to worry about it. You should worry about the headmatter in your HTML documents.


See this? [point to title bar of browser] That shows up on top of every browser, and it prints in the page header when you print the page out. There are a whole lot of web pages out there that have no title, or inappropriate titles, or... When you're putting out a book, you check your headers and footers, right? Same thing here. This tag is, suitably, <title>.

The other important tag for the head is <meta>. You'll see four meta tags here. A meta tag is really just information about the document, useful for search engines, mainly, and in some cases for browsers. In the first meta tag, I specify what kind of document it is and what the character set is. This tag is actually added automatically by Dreamweaver and, I think, most other authoring applications. The next three tags give information about the document: who made it, what it is, and what it's about. The meta description tag is often what's used for the description in a search engine listing, so it's important. It's also important to make sure it's short enough not to be cut off in the middle in the search engine listings! The meta keywords tag is used by some search engines to categorize information. Keywords are not as important as they used to be, as people were doing all sorts of naughty little things with them to improve search engine rankings. Google doesn't really use them at all.

You'll see one other tag in the head: link. This tag associates the document with another document. This isn't a hyperlink tag – it's not what you click on to go to another page. In this case, what it is is a specification of another document as the style sheet for this document. The browser knows to go to eg.css to look for the styles for use in this document.
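
The slide itself isn't reproduced in this script, so here is a sketch of what such a head might look like. Only the file name eg.css comes from the talk; the title, the character set, and the meta contents are placeholder assumptions.

```html
<html>
<head>
<!-- Shows in the browser's title bar and in the printed page header -->
<title>Things you should know about HTML</title>
<!-- Document type and character set (the charset here is an assumption) -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<!-- Who made it, what it is, what it's about -->
<meta name="author" content="James Harbeck">
<meta name="description" content="A short summary that fits in a search listing.">
<meta name="keywords" content="HTML, CSS, editing">
<!-- Not a hyperlink: this names the style sheet for the page -->
<link rel="stylesheet" type="text/css" href="eg.css">
</head>
<body>
<p>Everything you see on the page goes here.</p>
</body>
</html>
```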

There are a few other tags that can go in the head as well. I'll mention one of them – style – when I talk about cascading style sheets, anon. The others aren't really useful for our purposes, and you can find out about them at

Below the head comes – can you guess? (There is no <neck> tag.) It's the body. All the text, images, everything you see in the document, is in the body. So you start the body with <body>, and you end it with </body>.

In the body tag, you can specify some overall things for the page. You can specify a background image, for instance. Some people do some very ill-judged things with background images. Remember that the page has to be legible, no matter how attractive the background image is.


You can also specify a background colour if you don't have a background image. If you want it to be white, specify it as white – you usually use the hexadecimal code for white, "#FFFFFF". Not everybody does this. I highly recommend it. Here's why: not everyone has their default application background colour on their computer set to white. The average person programming a page will see the background as white and will assume that everybody will see it that way even if they don't specify it. Nuh-uh. You get people like me. I set my application background colour to a sort of beige – it's easier on my eyes. And when someone builds a page without specifying the background colour, expecting it to be white, I see this kind of thing.


And I'm not the only weirdo out there! So make sure your background colour is specified, even if you just want it to be white.

You can also specify what colour your links are. In fact, you can specify what colour a link appears as before a person clicks on it, while it's being clicked on, and after it's been clicked on. And you can specify what colour your text is to be. But you may prefer to use style sheets for that. Which I'm getting to...

So here's what you can specify in the body tag.
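
As a rough sketch, a body tag using the HTML 4 presentation attributes might look like this. The colour values and file names are invented for illustration.

```html
<!-- bgcolor: page background; text: text colour;
     link / alink / vlink: links before, during, and after a click -->
<body bgcolor="#FFFFFF" text="#000000" link="#0000CC" alink="#CC0000" vlink="#660099">
<p>Black text on an explicitly white page, with <a href="page2.html">a link</a>.</p>
</body>
```

A background image would go in the same tag as background="tile.gif", where tile.gif is a hypothetical file.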


And I know you're all burning to learn the arcane mystery of specifying colours. Well, here it is. [Show from eg8]


Now, there are more tags that can be used in the body than I'm actually going to cover here. I want to cover the stuff that you all are going to be using, and using a lot. Text formatting. Let's begin with some of the more common tags, and I'll talk about a couple of clever things you can do with them. And before too long I'll also be talking about HTML characters and about cascading style sheets.

Your number one most common text tag is probably <p>. This means paragraph. Now, you can have text that's not in a paragraph tag or any other tag. If there's only one block of text on the page, a <p> tag won't be necessary. But I recommend it anyway, especially if you're using style sheets. Here's an example of why.


The <p> tag doesn't actually need to be closed. You can start a new <p> without closing the previous one. Most of the time, it won't make a difference on most browsers. But it will make a difference on some browsers, and there are some times when it will be a real problem. I'm thinking of a website I work on that's database-driven. We put the content in HTML formatting into the database, and it's pumped in to fill different parts under different headings on the page. The programmers didn't use heading or paragraph tags for the headings, because as far as they knew all the paragraph tags were closing. Tsk tsk tsk. Sometimes they weren't, thanks to text importing and exporting quirks in Dreamweaver.


You can see there is a tag on those headings – but it's <span>, which simply applies attributes. We had to get them to go back and fix this!

It can also affect things like horizontal rules. They may show up closer to the line above them if you don't close the paragraph tag. Tables may also behave differently after an unclosed paragraph.

If you want an extra blank paragraph line, by the way, make sure to put a non-breaking space in it. If you just have an empty <p> tag, it may not show up as an extra space. But it may... so just don't have empty <p> tags anywhere.
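
A minimal sketch of the safe pattern: every paragraph closed, and the deliberately blank one holding a non-breaking space.

```html
<p>First paragraph, properly closed.</p>
<!-- A blank paragraph needs something in it, or it may not display -->
<p>&nbsp;</p>
<p>Third paragraph, after the deliberate gap.</p>
```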


Another popular tag is <br>, the line break tag. Where a <p> tag normally means a full line space after it, a <br> tag just puts in a single line break. Some people like to use two <br>s in place of a <p>. I recommend against this in most circumstances. Say, for instance, I apply a style to a paragraph. It will apply to all the "new paragraphs" below if they're really only separated by <br> tags. And if that style specifies a different space between paragraphs than normal, well, I've defeated it by using the <br>s. Also, because a <br> isn't a new paragraph, it won't get any indent you specify.
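
To see the indent problem, here is a small sketch, assuming a style that indents the first line of every paragraph. The style values are made up.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
p { text-indent: 2em; margin-top: 0; margin-bottom: 0 }
</style>

<p>This paragraph gets the indent.</p>
<p>So does this one, because it is a real paragraph.</p>
<p>This one is indented too,<br>
but the line after the break is not: it is still the same paragraph.</p>
```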


Basic formatting for such things as bolding and italics is, of course, quite common. For bold, the tag is <b>; for italics, it's <i>. There are other ways of getting bold and italics. You can use style sheets – but if all you want is to bold or italicize, a simple <b> or <i> tag will be more space-efficient and easier to apply, and it will normally override your established style (I'll show you a way of getting some insurance on that in your style sheet). The people who are in charge of HTML standards have come up with a whole bunch of little tags that indicate just why you're bolding or italicizing text – for emphasis, for instance, or as a citation. Here they are:
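
The slide's exact list isn't in this script. As a stand-in, these are phrase tags defined in HTML 4; the sample sentences are invented, and how each one displays is up to the browser.

```html
<!-- Typically shown in italics and bold respectively -->
<p><em>Emphasis</em> and <strong>strong emphasis</strong>.</p>
<!-- A citation, and the defining instance of a term -->
<p><cite>The Chicago Manual of Style</cite> says so.</p>
<p><dfn>Leading</dfn> is the space between lines of type.</p>
<!-- Computer-related text, usually shown in a monospaced font -->
<p><code>code</code>, <samp>sample output</samp>, <kbd>keyboard input</kbd>, <var>variable</var></p>
<!-- Abbreviations, with the expansion in the title attribute -->
<p><abbr title="HyperText Markup Language">HTML</abbr></p>
```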


Now, these tags are all wonderful and precise and all that stuff, but most web authoring software doesn't make it easy to use them – you have to hand-code them, which slows you down – and for most purposes it makes no difference what tag you use to produce a given visual effect, whether or not you think it should. Moreover, while you can specify specific appearances for them in your style sheets, you do run a risk that a given browser won't support the tag in question – they're not universally supported, and some are more broadly supported than others – and so you might not get the desired results.

I recently wished aloud that there was a way of tracking changes in HTML. Well, in fact, there is – there are tags to mark insertion and deletion, and you can specify the source of the insertion or the reason for the deletion, and the time and date you inserted or deleted the text. Now, not all browsers recognize these as valid tags, and if a browser doesn't recognize a tag, it just ignores it, so if you're using these to track changes, you may want to use other formatting to go with them, for instance strikethrough for deletions. Or put it inside a comment tag, which hides anything from view. (Just remember that it's still there in the code, and anyone can view your code.)
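
A sketch of how that might look, using the <ins> and <del> tags with their cite and datetime attributes. The file name, date, and wording are hypothetical.

```html
<p>The meeting is on
<!-- cite points to a document explaining the change; datetime records when -->
<del cite="changes.html" datetime="2004-06-01T10:30:00-04:00"><strike>Tuesday</strike></del>
<ins cite="changes.html" datetime="2004-06-01T10:30:00-04:00">Wednesday</ins>.</p>

<!-- This note is hidden on the page but visible to anyone who views the code -->
```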

There is also, by the way, a tag that will show the text with the spacing and line breaks exactly as you have them, without any reflowing of the text, which browsers usually do. The tag is <pre>. I honestly can't think of anywhere I'd use it, however. There are other, more pleasing, ways of controlling appearance.

There's another word-level formatting tag that some people like to use: <u>. What does it stand for? [Underline!] Now tell me when you should use it. [Never!] "Never" is correct! It's an established norm in web text for underlining to mean a hyperlink. If you underline text that's not a hyperlink, you're going to confuse people. Also, you are no longer working on a typewriter. Underlining is in general a nasty thing to do to type and was really brought in to signify italics. But we can do italics here. And underlining bold or italics is really overkill.


There is one other similar formatting tag you can use, though it's not quite universally supported: <strike>, for strikethrough. Not that you'll get too many chances to use it. Unless you're trying to be cute.

You may not know this, but you can also do subscript and superscript. Note, however, two things: first of all, it doesn't automatically change the character size; you have to do that separately. Secondly, it can disrupt the line spacing. This is a bit of a pity, since the ... um ... people of questionable character who designed fonts for computers thought that the registered trademark symbol is a full-height character, when in fact it should be superscripted. But if you superscript it, even if you make it smaller first, you can screw up the line height or interfere with the line above.


One other real drag about these tags is that, depending on the web authoring software you're using, you may find yourself having to hand-code them in, rather than having the keystroke convenience you get with, for instance, italics and bold. So you may want to use styles instead. And styles actually give you more control over the size and vertical alignment. We'll get to them... anon!

There are some cases, however, where you think you'll want to use <sup> where in fact you don't have to because there are special pre-superscripted characters available that won't muck up your formatting. Here are the stars of the day.


Look! They actually got the TM right!

Beware of the difference between the masculine ordinal and the degree sign! The masculine ordinal is not necessarily perfectly round and may have an underline.

Also beware that, with the exception of the degree sign, these may not display properly on Macs, especially depending on how you encode them, thanks to differences in the character sets. You'll notice that there are two ways of encoding a given special character (well, actually, there are more than two, but there are two reliable ways), by name and by number. And there are some cases where the numbered way will be more guaranteed to work than the named way – even though in some cases the numbered way is not official (!).
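
The slide isn't reproduced here, so the following is only a sketch of the named and numbered forms for the characters mentioned. The number 153 for the trademark sign is the unofficial kind: it comes from the Windows character set rather than the standard.

```html
<!-- Trademark: named, official number, unofficial number -->
<p>&trade; &#8482; &#153;</p>
<!-- Superscript 1, 2, 3: named, then numbered -->
<p>&sup1; &sup2; &sup3; and &#185; &#178; &#179;</p>
<!-- Feminine and masculine ordinals: named, then numbered -->
<p>1&ordf; 1&ordm; and 1&#170; 1&#186;</p>
<!-- Degree sign: named, then numbered -->
<p>20&deg;C and 20&#176;C</p>
```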

That brings us to the whole matter of characters. There are many characters which should be rendered in HTML as character codes rather than simply typed in as you see them. In some cases, notably angle brackets, this is because if you have them as they are in the code it will go meshugah. In other cases, it's because a character that looks fine on one computer with one character set will look completely wrong in another computer with another character set. You've probably seen sites that have, say, a capital O with a grave accent in place of a punctuation mark. Using the right character codes will prevent this. Most of the time, anyway. There are still those characters missing on the Mac...
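
A minimal sketch of the idea, with invented sentences: the angle brackets and ampersand are written as codes so the browser doesn't mistake them for markup.

```html
<!-- Displays as: To start a paragraph, type <p>. -->
<p>To start a paragraph, type &lt;p&gt;.</p>
<!-- Ampersand, quotation marks, copyright sign, e with acute accent -->
<p>Fish &amp; chips, &quot;quoted&quot;, &copy; 2004, caf&eacute;</p>
```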


Here are all the characters that should work everywhere.


Here are the characters that are likely to work pretty much everywhere, but you can't be 100% sure, and you'll probably do better using the unofficial number codes.


Here are the characters that the Mac doesn't have. Note that the fractions should display, they just won't display as single-character fractions. Some of the other characters will show up in somewhat crappy-looking versions in Internet Explorer on the Mac; they're provided by the browser in spite of the computer.

There are also a whole lot of other characters that have been introduced in the HTML 4.0 standard, characters for symbolic logic and algebra, for instance, including the whole Greek alphabet. And many of them will show up on recent versions of Internet Explorer on PC, so you'll have most of your audience covered – but people with older browsers or with Macs won't see what you want them to. At least not yet. You can find out about these characters on

Now let's go back to text formatting. There are a few things I haven't dealt with yet. Headings, for instance. HTML allows six different levels of heading. I think a lot of people veer away from them because they look kind of ugly in their default presentation on most browsers.


They start way too big and they get way too small really quickly. But remember, you can apply styles to them – yes, yes, I'm getting to style sheets, right after I finish with the text tags. And since web authoring software pretty much universally makes it easy to apply heading styles, you've got a pretty good little style tool here. If I just specify in the style sheet how they look...
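
A sketch of what such heading rules might look like. The fonts and sizes are assumptions, not the ones from the presentation.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
h1 { font-family: Arial, Helvetica, sans-serif; font-size: 150% }
h2 { font-family: Arial, Helvetica, sans-serif; font-size: 125% }
h3 { font-family: Arial, Helvetica, sans-serif; font-size: 110%; font-style: italic }
</style>

<h1>Main heading</h1>
<h2>Section heading</h2>
<h3>Subsection heading</h3>
<p>Body text.</p>
```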



Now, there's one aspect of formatting that's an absolute staple on the web that I haven't even mentioned yet. Anyone? [lists] Lists! Yes, lists. How many kinds of lists are there in HTML? [...] In fact, HTML offers five different list tags. However, two of them – directory list and menu list – are really redundant – just less-useful variants of the unordered list – and even the guys who set the standards now say don't use them. But here are the other three:


unordered list, ordered list, and definition list. How many of you knew about definition lists? A definition list just allows two levels of indent: none, and some. You open it with a <dl> tag, and the term – no indent – is <dt> and the definition – indent – is <dd>. There are no bullets or numbers or anything. It can be useful on occasion, and not necessarily just for definitions. But don't use it just to get indents. <dt> by itself won't give you any indent – it doesn't give a hanging indent – and <dd> just gives you the same indent as <blockquote> does, and <blockquote> is more reliable.
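
For anyone who hasn't met one, a definition list might look like this. The terms are just illustrative.

```html
<dl>
<!-- dt: the term, not indented -->
<dt>HTML</dt>
<!-- dd: the definition, indented -->
<dd>HyperText Markup Language</dd>
<dt>CSS</dt>
<dd>Cascading style sheets</dd>
</dl>
```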

Let's turn to unordered lists. There are a few things that not everyone knows about unordered lists. First, you can specify what kind of bullets to use: disc, square, or circle. Actually, you can also specify an image, but you have to do that with CSS. (Yes! I'm getting to CSS! Right after the lists.) You can specify bullet type for the whole list, or you can specify it item by item.
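
A quick sketch of the type attribute, set once on the list and then on individual items; the item text is invented.

```html
<!-- type on the <ul> sets the bullet for the list -->
<ul type="square">
<li>Square bullet, from the list setting</li>
<!-- type on an <li> sets the bullet for that item -->
<li type="circle">Circle bullet, set on this item</li>
<li type="disc">Disc bullet, set on this item</li>
</ul>
```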


What fun. You should also know that the <ul> tag by itself serves as an indent marker, basically the same kind of thing as a <blockquote>. Text that's in a <ul> but not in an <li> is just indented. So you can get subheads in lists. In fact, here are a couple of ways of doing roughly the same thing – though note that your style settings may affect whether it's exactly the same thing.


You see that I can start a new list, or I can just put in line breaks. I could put in a <p> tag in place of the line breaks, but the gap would be much larger – rather too large, in fact.
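
The original example isn't in this script, so this is a sketch of the two approaches with made-up list contents. One caution: text sitting directly inside a <ul> relies on browsers being lenient, since strictly speaking a <ul> is only supposed to contain <li> items.

```html
<!-- Way 1: close the list and start a new one for each subhead -->
<ul>
Fruit
<li>Apples</li>
<li>Pears</li>
</ul>
<ul>
Vegetables
<li>Carrots</li>
</ul>

<!-- Way 2: one list, with line breaks before the second subhead -->
<ul>
Fruit
<li>Apples</li>
<li>Pears</li>
<br><br>
Vegetables
<li>Carrots</li>
</ul>
```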

Lists, by the way, are one place where going into the code can save you time over using a program like Dreamweaver. Dreamweaver won't even do that little subhead stunt I just showed you. And Dreamweaver will also make more bother if you want to rearrange items in a list. Best to go into your code view and just drag and drop. [Demonstrate?]

Ordered lists offer similar kinds of fun. In fact, with ordered lists, you have your choice of five different types of list, and, yes, yet again, you can specify them on the item rather than on the list... if you're a demented psycho.


You can also specify where to start the list – the list doesn't have to start at 1. You can do this with any type of ordered list, but you always specify the start in Arabic numerals, whatever the numbering type, or the browser ignores it. But do be sensible about it – the indent isn't going to adjust automatically for your silliness.
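
A minimal sketch of the type and start attributes; the item text is invented.

```html
<!-- The five types are 1, a, A, i, and I -->
<ol type="i">
<li>First item, lowercase Roman</li>
<li>Second item</li>
</ol>

<!-- start is always an Arabic numeral: with type="A", 3 means C -->
<ol type="A" start="3">
<li>This item is lettered C</li>
<li>This item is lettered D</li>
</ol>
```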

There is one more very, very popular tag for text formatting. It's used everywhere. But I'm going to tell you to avoid using it. Guess what tag it is. [...] <font>! Yes, I'm going to tell you not to use the tag that everyone likes to use to specify their font face, size, and colour. And why am I going to tell you not to use it? [...] Because you should be using cascading style sheets!

[music]Yes! Yes! It's time for CSS! It'll clean up the mess! By now you need not guess... it's time - for - C - S - S![/music]

I might as well start by saying why they're called cascading style sheets. It's because you can apply them at several different levels, and the more local levels override the more global levels. So I can specify a style sheet document for a page. And then I can specify local styles for that page that may override some of the styles in the style sheet document. And then I can specify on a specific line a style that will override the page styles for that line. And within a style tag, I can apply another style tag to override the first style tag. In fact, look at the code on example 25 here. I've applied a style to a whole stretch of the document using the <div> tag – don't worry, I'll get to the <div> tag. Then I've applied a style to this paragraph to display the code. Then I've applied another style to this section of the paragraph to change the colour.

But note that if you override a tag with another tag, only the things actually specified in the overriding tag will be altered.


Just because the text in a given tag normally looks black because black is the default colour doesn't mean that the tag will make text black when it's applied. If it doesn't specify, it doesn't specify.
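
This is not the actual example 25, just a sketch of the same layering with invented fonts and colours. Each inner style changes only what it names and inherits the rest.

```html
<!-- Outermost: a font and a grey colour for the whole stretch -->
<div style="font-family: Verdana, sans-serif; color: #333333">
<!-- The paragraph overrides the font but says nothing about colour, so it stays grey -->
<p style="font-family: Courier, monospace">
This text is in Courier and still grey,
<!-- The span overrides only the colour; the font is still Courier -->
<span style="color: #CC0000">and this part is red Courier.</span>
</p>
</div>
```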

But let's get down to what you can do with CSS. And when it comes to text formatting – style and making things look pretty – the answer is pretty much everything you can do with HTML, and then some, and often more efficiently. This is not to say that CSS is the perfect solution for every bit of style and prettification. But it's as useful as style sheets in Word or Quark – in fact, it's a bit more useful in some ways, since you can also specify styles for tables and other graphic elements.

To specify a style for a given tag in HTML, you have a couple of options: you can set in the style sheet that all instances of a specific tag will have the attributes you specify, or you can set a named style that can be applied to one or all tags, even overriding the standard style for a given tag.

Now, in your handout, I refer to "selectors" and "subclasses". I do this mainly to confuse you and impress on you that this is difficult stuff you shouldn't play with. NO I DON'T! Actually, this is very easy and straightforward, especially if you have a nice reference sheet... which you now do. I put in those terms just because they're the technical terms for these things and it saves a bit of space to use them. But you'll probably find it easier to think of setting a style for a tag, creating styles that can be applied to a specific tag, and creating styles that can be applied anywhere.

[eg27] [go through examples]

You can see the syntax of the style definition: your selector or subclass or whatever you feel like calling it, and the definition following inside curly brackets. You don't need to put a space between the curly brackets and what they enclose; I just do it for visibility and tidiness. In the definition, you put the property, colon, values, and, as necessary, a semicolon before the next property, and so on. I'll tell you all about the style elements available in a few minutes.
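
The handout isn't included in this script, so here is a sketch of the three kinds of style definition with made-up names and values: one for a tag, one for a specific tag with a class, and one that can be applied anywhere.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
/* A style for every <p> */
p { font-family: Georgia, serif; font-size: 12px }
/* A named style that applies only to <p> tags */
p.note { color: #666666; font-style: italic }
/* A named style that can be applied to any tag */
.warning { color: #CC0000; font-weight: bold }
</style>

<p>An ordinary paragraph.</p>
<p class="note">A paragraph with the note style.</p>
<p>A paragraph with <span class="warning">a warning</span> inside it.</p>
```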

You'll notice on the sheet that there are also a few extra little tricks for specifying styles. You can set a tag to show one way inside one tag and another inside another.


You can also specify the appearance of links, as you see on your handouts, and, yes, you can apply first-line and first-character styles!
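
A sketch of those extra tricks, with invented colours and sizes. Older browsers may not support all of them.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
/* Bold text looks one way inside a paragraph, another inside a list item */
p b { color: #000066 }
li b { color: #660000 }
/* Links: unvisited, visited, while being clicked */
a:link { color: #0000CC }
a:visited { color: #660099 }
a:active { color: #CC0000 }
/* First line and first character of every paragraph */
p:first-line { font-variant: small-caps }
p:first-letter { font-size: 200% }
</style>

<p>A paragraph with <b>bold text</b> and <a href="page2.html">a link</a>.</p>
<ul>
<li>A list item with <b>bold text</b></li>
</ul>
```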

How and where do you define these styles? There are three places you can do it: in a separate style sheet, in the head of your document, or right in the location you're applying the style. You'll find setting a separate style sheet document optimal, because it allows you to set a style sheet that you can apply to any number of HTML pages. You just put all the styles in the sheet in the standard syntax, with nothing else – no hello-how-ya-doin', no this-is-this, no so-long-see-ya-later. Then you specify what style sheet you're using in the head of your HTML page. The way to do that is in your handout. (Did everyone get a handout? It's also available for downloading from my website.) I don't bother specifying the media, but you can, and cross your fingers while you're at it, because you're still at the mercy of the compliance of the browser you're using.

You can also stuff the styles right in the head. Just open a style tag, list them, and close the tag. It's generally recommended to put a comment tag around the styles so really old browsers won't interpret them as text. But this isn't normally an issue.

And you can also specify them on the spot, if you have a one-off style you need just for the nonce and you don't want to have to bother setting it up in the head or the style sheet and then calling it forth. You just use the attribute style on any tag and put inside the quotes what you'd normally put in curly brackets – oh, but note that you'll run into trouble with fonts with multi-word names, as they also use quotes.
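
Here is a sketch of all three places in one page. The file name eg.css is the one used earlier in the talk; the style values are invented. For a multi-word font name in a style attribute, single quotes inside the double quotes are one way around the clash.

```html
<html>
<head>
<title>Three places to put styles</title>
<!-- 1. A separate style sheet, optionally with a media type -->
<link rel="stylesheet" type="text/css" href="eg.css" media="screen">
<!-- 2. Styles in the head, wrapped in a comment for very old browsers -->
<style type="text/css">
<!--
p { margin-left: 1em }
-->
</style>
</head>
<body>
<!-- 3. A one-off style right on the tag -->
<p style="font-family: 'Times New Roman', Times, serif; color: #333333">
A paragraph with its own style.</p>
</body>
</html>
```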

Oh, and how do you actually apply the styles? There are two tags and one attribute that you'll find useful. The tags are <span> and <div>. <span> is for little bits, inside a paragraph, say. In spite of its name, it doesn't span block-level tags! For that, you need <div>, which is short for division. You can set a style for as much of the document as you want with <div> – but remember that it has to obey the rules of nesting. You can't open a <div> before a table and close it in the middle of the table. Your browser will be unhappy with you. You can also apply the style directly to any HTML tag. And the attribute you use on these tags to specify your style? Naturally, whoever has style has class. So if you want to apply a bit of style... you use class. [music] If your tag is in some doubt, a plain brown bag, make it stick out for half a mile... use a little style. Set your style, give it a name, and then while you play the game, you make your pass... slip it in with class. [/music]
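
A minimal sketch of class on <div>, <span>, and an ordinary tag. The style name and its values are hypothetical.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
.aside { color: #336699; font-style: italic }
</style>

<!-- div: the style covers several paragraphs at once -->
<div class="aside">
<p>First paragraph of the aside.</p>
<p>Second paragraph of the aside.</p>
</div>

<!-- span: the style covers a few words inside a paragraph -->
<p>Normal text with <span class="aside">a styled phrase</span> in it.</p>

<!-- class applied directly to a tag -->
<p class="aside">A whole paragraph in the aside style.</p>
```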

But just how much style can you have? [suave] How much style can a person possibly have? [/suave] Well, just with the examples I've shown so far, you get some idea of what can be done with CSS. But wait – there's more! I'd like to do a full run-down of the things you can do. Follow along on your handouts.

With fonts, you can specify the kind of font, and you can specify several variants in order of priority: if it doesn't have the first one, it'll use the second; if it doesn't have the second, it'll try the third, and so on. Use quotes to enclose font families with names of more than one word. You can specify bold, italic, or oblique. You can specify small caps. You have absolute and relative ways of specifying font weight, and, as long as the browser cooperates, you can specify up to nine different weights.
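
A sketch of those font properties, with made-up class names and font choices.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
/* Fonts in order of preference; multi-word names go in quotes */
p { font-family: "Trebuchet MS", Verdana, Arial, sans-serif }
.slanted { font-style: oblique }
.caps { font-variant: small-caps }
/* Weight: an absolute keyword, a relative keyword, or a number from 100 to 900 */
.heavy { font-weight: bold }
.heavier { font-weight: bolder }
.medium { font-weight: 500 }
</style>

<p>Plain, <span class="slanted">oblique</span>, <span class="caps">small caps</span>,
<span class="heavy">bold</span>, <span class="medium">weight 500</span>.</p>
```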


You can also specify up to... how many different sizes? Is the answer seven? Nooo... it's not! You can specify sizes in several different ways. It really comes down to what your screen resolution is. It has to be big enough to be legible, and small enough to fit on the screen. Between those two parameters, you really have several hundred possible sizes, though you'll probably want to stick within a normal range.


And you can specify all of these properties in a bunch if you want, using the font property. The order of the style, variant, and weight values doesn't matter, but the size has to come after them and the font family has to come last. You can also specify line height in that bunch, right after the size. And line height is basically what you'd call leading in typesetting. You can also specify word spacing, letter spacing, text decoration – yeah, this is where people get that annoying blinking text. Ever wonder who the perverted psycho was who came up with that one? I do. Because I want to track him down and stick a big flashing neon sign outside his front window.
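
A sketch of the bunched-up form, with invented values. Here the line height rides along with the size, separated by a slash.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
p.lead {
  /* style, variant, weight, size/line-height, family */
  font: italic small-caps bold 14px/20px Georgia, "Times New Roman", serif;
  letter-spacing: 1px;
  word-spacing: 2px;
}
/* Text decoration: here, a link with its underline removed */
a.plain { text-decoration: none }
</style>

<p class="lead">A lead paragraph with <a class="plain" href="page2.html">a plain link</a>.</p>
```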

The vertical-align property has a number of different values, and this can really be quite useful. Here's an example of how: You will remember I ranted a bit about how the registered trademark symbol isn't superscripted in the character set as it ought to be, yes?
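
The original example isn't reproduced here. This is one possible sketch, with a hypothetical class name and values you would want to adjust by eye for your own font.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
/* Shrink the symbol and align its top with the top of the surrounding text */
.reg { font-size: 60%; vertical-align: text-top }
</style>

<p>Acme<span class="reg">&reg;</span> widgets</p>
```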


Now you have a bit more control!

You can also set text to be all uppercase, all lowercase, or – and really, don't do this – capitalizing every word, meaning every word.

With bullet lists, you have the same set of options as with HTML, plus two: you can use nothing, or you can use an image.
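
A short sketch of the two extra options; bullet.gif is a hypothetical image file.

```html
<!-- No bullets at all -->
<ul style="list-style-type: none">
<li>No bullet here</li>
</ul>

<!-- An image as the bullet, with square as the fallback if the image is missing -->
<ul style="list-style-image: url(bullet.gif); list-style-type: square">
<li>Image bullet here</li>
</ul>
```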


Now, moving to the general properties, the first thing we see is display. In general you don't need to use this. But if for some reason you want some block-level tag – a tag that's supposed to get a line space before and after – to become an in-line tag, getting no space, or to become a list item, or not to display, well, you can do it with this. You could specify here that text inside a <del> tag – deleted text – doesn't display, just as a bit of backup.
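
A sketch of both uses, with invented text: hiding deleted text, and making a heading run in line with what follows.

```html
<!-- In a real page the style block belongs in the head -->
<style type="text/css">
/* Deleted text doesn't display at all */
del { display: none }
/* A heading that sits in the line instead of getting its own block */
h4.runin { display: inline }
</style>

<p>The fee is <del>$50</del> $60.</p>
<h4 class="runin">Note.</h4> This text follows the heading on the same line.
```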

Whitespace refers to the treatment of spaces between words. Normal means that multiple spaces display as one, as usual with HTML. Pre means preformatted text, just like the <pre> tag. Nowrap means it doesn't put in line breaks unless you explicitly put them in.

Colour can be specified in any of quite a few different ways. You can use the normal way I've already shown you; you can use only 16 levels of each colour (so instead of 33 you put 3, instead of CC you put C, and so forth); you can specify 256 levels in decimal rather than hexadecimal; you can specify in percent; you can even use one of sixteen attractive colour names.
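
For illustration, here is one blue written five ways. The first four give exactly the same colour; the name blue is a different, brighter blue.

```html
<!-- Six hexadecimal digits: two each for red, green, blue -->
<p style="color: #3366CC">Normal hexadecimal</p>
<!-- Three digits: 3 stands for 33, 6 for 66, C for CC -->
<p style="color: #36C">Short hexadecimal</p>
<!-- Decimal, 0 to 255 for each colour -->
<p style="color: rgb(51, 102, 204)">Decimal</p>
<!-- Percent -->
<p style="color: rgb(20%, 40%, 80%)">Percent</p>
<!-- One of the sixteen colour names -->
<p style="color: blue">Colour name</p>
```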


These are actually HTML standard colour names; you can use them wherever you specify colour in HTML.

For the last two, you'll notice I've specified background-color as black so you can see them. Yeah, that's right – you can highlight text. That's the background-color property, a couple of lines down.

Width and height are self-explanatory. You can set the size of whatever you want with these, though generally people will use them on tables or images. You can set the width of a text element. But it won't squish the text down – if the longest word is too wide, it's too wide, and the margin gets pushed out.


And it won't stretch the line spacing reliably if you change the height, either. It'll stretch and squish pictures, though, just like it should.

Float is a way of getting text to flow around a picture or table – or whatever you want. It can be very useful for things like pull quotes, especially when combined with some of the other properties below: margin, padding, border.

[eg35] [show code]

What's the difference between margin and padding? Margin goes outside the border, and padding goes inside it.
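
Here's one way a pull quote could be put together, as a sketch only: the class name, the sizes, and the colour are all assumptions, not the code from the example on screen.

```html
<style type="text/css">
  .pullquote {
    float: right;               /* sits at the right; the text flows around it on the left */
    width: 200px;
    border: 1px solid #336699;
    padding: 6px;               /* space inside the border, between border and words */
    margin: 12px;               /* space outside the border, between border and surrounding text */
  }
</style>

<div class="pullquote">Margin goes outside the border, and padding goes inside it.</div>
<p>The main text of the article starts here and wraps around the pull quote...</p>
```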

Now, by the way, you can do this stuff with tables, without CSS, and it'll work even on the most ornery non-compliant browser – it's a bit less elegant, though. I'll show you that when I get to tables.

Clear means that an element won't wrap around a floating element – it'll just duck past it.
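
A one-rule sketch, assuming you want headings to start below anything floated:

```html
<style type="text/css">
  h2 { clear: both; }  /* start below floated elements on either side; left and right are also allowed */
</style>
```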

While I'm talking borders, I might as well show you what the different border styles look like.


I've set the width to 6 and the colour to lime so you can see what's going on. Note that where you see the black parts, it's always black no matter what, so if you set the colour to black, it just looks solid black.

As to background images, it's the same deal as with HTML, but, again, you have more choices. You can repeat the image, you can repeat it vertically but not horizontally, you can repeat it horizontally but not vertically, you can use the image only once. You can make the image scroll up or stay put. And you can apply background images to whatever you want.


Whether you should is another question.
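
The choices could be written out like this. It's a sketch: tile.gif and the class names are made up for the example.

```html
<style type="text/css">
  .tiled  { background-image: url(tile.gif); background-repeat: repeat; }     /* both directions (the default) */
  .column { background-image: url(tile.gif); background-repeat: repeat-y; }   /* vertically only */
  .band   { background-image: url(tile.gif); background-repeat: repeat-x; }   /* horizontally only */
  .once   { background-image: url(tile.gif); background-repeat: no-repeat; }  /* a single copy */

  /* fixed makes the image stay put while the page scrolls; scroll is the default */
  body { background-image: url(tile.gif); background-attachment: fixed; }
</style>
```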

So you get the idea of the kind of power you get with CSS. I encourage you to play around with these things, to try them out in various situations so you know what they can do for you. I sure had a lot of fun doing it.

Now I think it's time for tables. In HTML, [music] you do as you can, you do as you're able, but you can't do too much without a table. [/music] If you look at your sheet, you'll see a whole host of tags involved in making tables. In truth, most people in most cases use exactly three of those: <table>, <tr> – which defines a table row – and <td> – which defines a table cell (why not <tc>? the d stands for data). You open a table with <table>, then you start a table row with <tr>, and you start a cell with <td>. You put the contents in the cell (text, image, whatever). You close the cell, you start another, until you've gotten to the end of the row, then you close the row. And so forth. You just have to remember to put the right number of cells in each row. If you don't, you get a bit of a mess.


The browser will usually assume the table is as wide as the widest row and fill in each row from the left.
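
For reference, a minimal well-formed table with the same number of cells in every row might look like this (the contents are placeholders):

```html
<table border="1">
  <tr>                      <!-- first row -->
    <td>Apples</td>
    <td>Oranges</td>
    <td>Pears</td>
  </tr>
  <tr>                      <!-- second row: three cells again -->
    <td>12</td>
    <td>7</td>
    <td>30</td>
  </tr>
</table>
```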

You can have a cell span multiple columns or multiple rows, or both.
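
A sketch of spanning in a three-column table. Note how a spanning cell counts toward the cell total of every column and row it covers.

```html
<table border="1">
  <tr>
    <td colspan="3">This cell runs across all three columns</td>
  </tr>
  <tr>
    <td rowspan="2">This cell runs down two rows</td>
    <td>A</td>
    <td>B</td>
  </tr>
  <tr>
    <!-- only two cells here: the first column is already taken by the rowspan above -->
    <td>C</td>
    <td>D</td>
  </tr>
</table>
```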

Now, if you're using a lower-end authoring program, you can have some frustration with tables, because they handle them in very unsubtle ways. Some programs and a lot of people think tables always look something like this.


But, as I mentioned at the beginning, tables are actually the basis of almost everything you see on a website – tables within tables, assorted multicellular things holding images in each cell... Here, look at this.

[go back to, briefly describe table structure; perhaps have someone suggest a site to look at]

Now, for your purposes in handling text, and especially to start with, you're not going to need to think about tables of such all-fired complexity. But you will need to think about tables. And you'll especially be wondering about positioning them and getting the right kind of space in and around them.

To start with, most of the time, you're not going to have a visible border on your table. Even if you're presenting an actual table of items, there are more visually agreeable ways of doing so. Hmm... like this, perhaps...

[click "or, on the other hand, this" to bring up eg17 again]

Of course, that's using CSS. You won't always want to rely on CSS. You can get other usable effects with fairly basic table properties.


Note that I've done something here that's not official standard HTML: I've specified the background colour in the table row tag. Actually, they're encouraging people to use CSS to specify background colours now, and so I didn't include the bgcolor attribute in your handout. But it is nonetheless available.

Now, to the matter of putting text in tables. Let's say you have text that, for some reason, you want to show up in two columns on the page. First of all, there's no way of having the text actually flow from one column to another. Whatever text is in one table cell stays in the table cell. Sooner or later they'll work out a way of letting it flow, but it ain't here yet. So you have to be ready for the possibility of uneven columns. And don't break in the middle of the sentence, because you don't have final control over what size the font is in the person's browser, so you can't guarantee you'll end up at the end of a line.


Now, you see that nice little column down the middle. What's the best way of getting that? Here, you can see it better when I use align=justify. Oh, by the way, when is setting alignment to "justify" appropriate in HTML? [...] The answer is "practically never." Why? Because the words don't break in the middle. So you get these awful stretched spaces in some places. They did invent a character called "soft hyphen" – &shy; – to allow words to break, but most browsers don't use it properly, and you have to insert it manually into every word you want to allow to break. Forget it.

Anyway, how do you get that nice gutter, if you want it? (And why can I never keep my mind out of the gutter?) Do you use cellspacing? Cellpadding? Let's see...


Now, this gutter is 12 pixels wide. First we try with cellspacing=12. What does cellspacing do? It puts space between the cells, and also between the cells and the outside. It's useful for some things – I used it for those little spaces between the cells in the table I just showed you. But look what it does to the alignment: you're miles away from the left edge.

So we try with cellpadding=6. Six? Yes, because cellpadding is the space between the edge of the cell and its contents. Six here, six there – twelve. This can be quite useful in some circumstances. But here it pushes us six pixels from the margin. So what do we do?

We insert another cell and specify the width as 12. And we set the cellpadding and cellspacing – and border – to 0. Could we just leave them out of the specification? [...] No. Why not? Because if you don't say you want them to be zero, the browser will just set them at whatever it feels like setting them at. Usually this is cellspacing at 1 or 2 and the others at 0, but it varies.
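
Put together, the two-column layout with its gutter might be sketched like this (the text is a placeholder, and valign="top" is an added assumption so that uneven columns both start at the top):

```html
<table border="0" cellpadding="0" cellspacing="0">
  <tr>
    <td valign="top">Text of the first column...</td>
    <td width="12"></td>   <!-- the extra cell: an empty 12-pixel gutter -->
    <td valign="top">Text of the second column...</td>
  </tr>
</table>
```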

Now, here's another thing you can do with tables: you can make lists that even CSS won't help you with.


See, you can't use tabs in HTML. Pity. You can't set up a nice hanging indent like you can in Word or Quark or whatever. And, though you have several options of bullet styles, you do not – not yet, at least – have the option to replace the bullets with words. So how do I do this? I make the table three cells wide, with the middle cell empty. I set the left cell to align right and vertical align top, and the right cell to align left. The usual defaults are left and middle. And I stick an empty row in, six pixels high, for the space between lines. I do not put a nonbreaking space in it, because then it will be as high as the text! It must be as empty as the essential nature of everything. Sorry, Zen moment there.
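
Here's a sketch of that kind of list. The labels, the text, and the 12-pixel middle cell are assumptions; the alignments and the empty six-pixel row follow the description above.

```html
<table border="0" cellpadding="0" cellspacing="0">
  <tr>
    <td align="right" valign="top">Padding:</td>  <!-- the word standing in for a bullet -->
    <td width="12"></td>                          <!-- empty middle cell -->
    <td align="left">the space inside the border, which can run on to a second line and still stay lined up.</td>
  </tr>
  <tr>
    <td colspan="3" height="6"></td>              <!-- truly empty spacer row, no nonbreaking space -->
  </tr>
  <tr>
    <td align="right" valign="top">Margin:</td>
    <td width="12"></td>
    <td align="left">the space outside the border.</td>
  </tr>
</table>
```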

Now, about aligning. You'll see that you can set align for table cell, table row, and table. When you set it for table row, it sets for all cells in that row unless otherwise specified. But when you set it for table, it means something else altogether. It means how the table will align with the text around it.


So align=left means float left, and align=right means float right. And align=center means hang in the centre while the text just goes beneath.
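
A bare-bones sketch of a floated table (the width and the contents are placeholders):

```html
<table align="right" width="200" border="0" cellpadding="0" cellspacing="0">
  <tr>
    <td>A small sidebar table</td>
  </tr>
</table>
<p>This paragraph flows down the left side of the table, because align="right" on the table tag floats the table to the right.</p>
```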

I've done something else with these tables, by the way. You see how they have a nice bit of breathing space around them? How do you think I did that? [show code]

Now, what are all these other tags for tables on your handout? Well, some of them aren't even very widely supported by browsers. But they can come in handy on occasion. <th> is... well, it's in the wrong place, sorry, for a start; it should be under <tr>. It's like a <td> but it's meant for the header row of the table. It's useful if you want to have a different style for table header rows; just set a style for <th> and put that in place of <td> in the first row. Or, on the other hand, you can just make the top row <td>s and assign them a different class. Most browsers, by the way, bold content in <th> cells by default. <caption> is meant to assign a caption to the table, either on the top or on the bottom. <tbody> says where the main body of the table starts, but it's not necessary and it's rarely used these days. And <col> and <colgroup> are intended to allow specification of properties such as width and alignment for a whole column. They don't replace <tr>; you still have to build the table row by row; but the idea is to specify things like width and alignment for the columns at the top of the table, before getting to the details, rather than just having to code it in each individual cell of a column. Not a bad idea, but not so widely supported yet, as far as I know. <thead> is meant to specify certain rows as header rows, with the idea that if the table is really long and you print it out, the header rows will appear on each new page. <tfoot> does the same for footer rows. They have some charming features, such as an approximation of a decimal tab. But they're not widely supported yet. If you want to know more about them, I recommend

That brings us to images. I've covered them to some extent already. But there are advantages to putting in an image with its own tag rather than as a background image or what have you. For one thing, you can control the size. You see, images all have their intrinsic size, and when you import them as background images or without specifying the size, they show up at that size. But you can resize images with the <img> tag. You can make a larger image smaller. It'll take the same amount of time to load, but it'll show up smaller. This can be useful, for instance, if you want to use an image more than once on a page, at least once at full size and one or more other times at smaller sizes. You see, your browser only needs to load an image onto a page once, no matter how many times it shows up. So let's say some uxorious gent wishes to do up his wife thus:


Naturally, there are other applications for this as well.

Do beware, however, that the browser will not always make the shrunken version look just as you'd like.
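
As a sketch, assuming a hypothetical file portrait.jpg whose intrinsic size is 300 by 400 pixels:

```html
<!-- Full size: width and height match the image's own dimensions -->
<img src="portrait.jpg" width="300" height="400" alt="Portrait">

<!-- Same file again, shown at a quarter of the width and height; it is loaded only once -->
<img src="portrait.jpg" width="75" height="100" alt="Portrait, small version">
```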

Now, as tempting as it may be to just use a smaller image and stretch it, you have to remember that it won't get any sharper – it won't fill in the missing detail. So this...


becomes this...


In fact, I'd advise resisting the temptation to resize pictures even just a little bit to make them fit. See, if a picture is 100 pixels by 100 pixels, and you want it to be 102 pixels by 98 pixels, the browser will just pick two columns of pixels to duplicate to make it two wider, and two rows of pixels to delete to make it two shorter. It's not going to average the information and do some amazing little figuring out like you see on cop shows where they say "enhance" and suddenly a blob of pixels becomes a parking validation sticker with a little smear on the date stamp. A picture is just a bunch of pixels as far as a computer is concerned, and that's all it has to work with. It has no awareness of the forms it's representing.

So why, then, outside of a few unusual uses, do I think resizing pictures is a good thing to have available? Well, look at this again – I showed it to you early on.


Those lines are made of a GIF that's actually one pixel square, stretched to suit. Now, I could have just specified the table cell width and the background colour, but your results are more guaranteed if you wedge everything with images set to specific size. Otherwise sometimes it just lets go a bit.
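
A sketch of the stretched-GIF trick, assuming a hypothetical one-pixel file called dot.gif in the colour you want the line to be:

```html
<!-- The same 1-by-1 GIF stretched into a horizontal rule, 400 pixels wide and 1 high -->
<img src="dot.gif" width="400" height="1" alt="">

<!-- And into a vertical rule, 1 pixel wide and 200 high -->
<img src="dot.gif" width="1" height="200" alt="">
```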

And this fact – that you can stretch a GIF to whatever size you want – can combine with a table structure to let you produce big images using rather small images.


So this, for instance, 482 pixels by 613 pixels, is made of an image 300 by 100 pixels, another 65 by 65, another one by one, and another one by 24. [show what's where] The fade at the bottom is 48 pixels high by 432 wide, and it uses a GIF just one pixel wide and 24 high. [show code] Total of less than 20 kilobytes. Please remember that there are still people out there with dial-up accounts!

You can set an alt for an image, too. What does that do? If the image fails to load, or if it's taking too long to load and the site visitor wants to click through to somewhere, it says what the image is of or what it's linked to or whatever else you want. You'll see the text in the place where the image should be – until the image is there. Since many images used on websites are actually linked buttons with text on them (in type faces that aren't reliably available on all browsers), this can be useful.

You'll notice that images can have hspace and vspace – that means padding on the sides and padding on the top and bottom. They can also align. But the people who set the standards don't recommend using those anymore, and neither do I. You get more control by putting the image in a table – or by using CSS. The border attribute is for if you use a picture as a link. The browser will draw a border of whatever thickness you specify around the picture in the link colour. The default is not zero, so if you want your linked images not to have borders, you have to explicitly set the border to zero.
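
For a linked button image, that might look like the following sketch (the file names and sizes are invented for the example):

```html
<!-- border="0" switches off the link-coloured border; alt supplies the button's wording if the image isn't there -->
<a href="contact.html"><img src="contact-button.gif" width="90" height="24" border="0" alt="Contact us"></a>
```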

Oh – now, there's just one tag I haven't really covered yet that is really essential. Now, what tag would that be? [a] It's <a>. What does <a> do? [It's a link.] Right, you use it to link to things. If you want a link, you put in <a href="whatever URL you want">, then what you want linked – either text or an image – and then, of course, </a>. Now, there's one attribute of the <a> tag that I forgot to put on your sheet because it's not actually official standard – but it works pretty much everywhere. That's target. You will have noticed that I opened a few examples in their own new windows. If you want to do that, you specify target="_blank".
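
In its simplest form, with a placeholder URL:

```html
<!-- An ordinary link: opens in the same window -->
<a href="http://www.example.com/">Visit the example site</a>

<!-- The same link, opened in a new window -->
<a href="http://www.example.com/" target="_blank">Visit the example site in a new window</a>
```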


There are other things you can specify the link to open up into that are useful if you're using frames. But since I don't recommend using frames, I'm not going to talk about them. So there.

Now, as to link paths, bear in mind that they're always relative to the page you're on.

[eg51] [go through eg51]
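
As a sketch of how relative paths read (the file and folder names here are hypothetical, not the ones in the example on screen):

```html
<!-- Each path is worked out from the folder that holds the current page -->
<a href="recipes.html">A page in the same folder</a>
<a href="images/map.gif">A file in a folder called images, inside this one</a>
<a href="../index.html">A page one folder up</a>
```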


There's one other thing that's not official but that you can do: if you put mailto: in front of an email address, the person's web browser will try to find your email program, and, if successful, your email program will start a message to that address.
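
For example, with a placeholder address:

```html
<a href="mailto:someone@example.com">Send us an email</a>
```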

Now, is that all you can use <a> for? [...] Anchor. You can use it as an anchor. What's an anchor? It's a thing that allows you to go to a specific spot on a page. You set the spot by putting in an <a name="whatever"> tag – you have to close it too – and then you link to it by linking to the page plus number sign anchor name (if it's the same page, of course, you can just put number sign anchor name). So, now, this link goes to the next page, but not the top of the next page – it goes to an anchor partway down.


Fortunately, I've put in a link to go to the top of the page. [click to top]

As you may have noticed, it puts the anchor right at the top of the page. So if I put the anchor not in this section break but in the first line of text.... [go to Third] It kind of crowds the text a bit. Better to put the anchor a line or two above where you're going most of the time.
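
Here's a sketch of the whole arrangement. The page name page2.html and the anchor name third are assumptions for the example; the anchor sits above the heading so the text isn't crowded against the top of the window.

```html
<!-- In page2.html: the named anchor, placed just above the section it marks -->
<a name="third"></a>
<h2>Third section</h2>
<p>Text of the third section...</p>

<!-- On another page: link to the page plus number sign plus anchor name -->
<a href="page2.html#third">Go to the third section</a>

<!-- On page2.html itself: the number sign and anchor name are enough -->
<a href="#third">Jump to the third section</a>
```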

Now... I was going to give you a made-up exercise, but what I'd like to do instead is go to a couple of real sites and hit some specific pieces and see if you can say how to do them.

[do a recipe from tonyaspler]

[do a drug article with lots of side effects etc. from mediresource.sympatico]

[questions if time]
