Stuart Moulthrop & Nancy Kaplan
University of Baltimore
School of Communications Design · 2000

Week 1: Starting Points


1.1 | FUNDAMENTAL CONCEPTS


1.1.1 · Hypertext Transfer Protocol (HTTP)

    Various protocols define types of Internet service; for instance, file transfer (FTP), USENET news (NNTP), and e-mail retrieval (POP).

    Hypertext Transfer Protocol is a set of rules governing the transmission of words, images, and other forms of information that make up pages on the World Wide Web.

1.1.2 · Clients and Servers

CLIENT: a personal computer connected to the Internet and running a client or browser program through which it issues requests for information to other computers (servers).

SERVER: a computer (dedicated PC, mini, mainframe) connected to the Internet and running a Hypertext Transfer Protocol (HTTP) server program, allowing it to interpret and answer requests for information from other (client) computers.

Popular Web Client Programs: Netscape Navigator, Microsoft Internet Explorer (together about 90% of the market).

Popular Web Server Programs: iPlanet [formerly Netscape] (Windows and UNIX); Microsoft Internet Information Server (Windows); Apache (all platforms).

1.1.3 · Hypertext Markup Language (HTML)

    Content and layout of Web pages are controlled by markup documents written in Hypertext Markup Language (HTML).

    To see what Web markup looks like, use the View Document Source feature of your Web browser to examine the markup for this page.

    Standards for HTML are maintained by the World Wide Web Consortium ("W3C"), a committee of academic and industry officials, including Tim Berners-Lee, the computer scientist who invented HTML. However, the W3C has no formal authority and software companies have extended the standard language considerably (see below).

    HTML is a set of instructions that tell browser programs how to display information. These instructions are similar to the so-called invisible commands used in older word processing programs like WordStar and WordPerfect.

    HTML is much simpler than any programming language. You can learn the basics in a few hours: the hard part is knowing what to do with them.

    In the early Web days it took some effort to keep current with the HTML standard. In their attempt to dominate Internet software, Netscape and Microsoft added new browser features that relied on extensions to HTML. Design practices tended to change radically as new versions of Web browsers came into the market. That competition has died down now (actually it's moved on to the far more complex realm of Extensible Markup Language or XML, which is beyond the purview of this course). The basic outlines of HTML, embodied in the HTML 3.2 and HTML 4.0 standards published by W3C, are generally accepted throughout the Web world.

    Nonetheless it's useful to review the brief, interesting history of HTML:

    • HTML 1 (1991-93) -- In the initial scheme of things, Web pages looked like typewritten documents with graphics awkwardly stuck in.

    • HTML 2.0 (1993-95) -- Circulated in draft for some time and not formally published until November 1995 (as RFC 1866), this first revision of the language concentrated on interactive forms and did little to improve layout or graphics.

    • HTML 3 (1994-95) -- Netscape greatly expanded the range of HTML commands with Navigator 1.0 (1994) and 1.1 (1995), adding centering, background images, page color, tables, and dynamic documents. At first these features were considered suspect and non-standard. The HTML 3.0 standard was never formally approved.

    • HTML 3.2 (1996) -- When Microsoft entered the Web field midway through 1995, Netscape's "enhancements" showed up on the Internet Explorer feature list as well; this, along with the huge popularity of Netscape's innovations, prompted the W3C to issue a standard including all major additions except the advanced feature called frames.

    • HTML 4.0 (1997-99) -- HTML 4.0, perhaps the last major revision of HTML for a long while, supports an important new control system for typography and layout called Cascading Style Sheets (CSS). Along with stylesheets, two new tags, DIV and SPAN, were added to make styling elements more flexible. HTML 4 also incorporates the Document Object Model, a powerful method for combining scripting languages like JavaScript and VBScript with elements of standard HTML.

    • Plug-ins and Auxiliaries -- In addition to HTML itself, a number of auxiliary technologies have appeared on the Web, including the programming language Java, Common Gateway Interface scripting (CGI), and various browser plug-ins such as Shockwave from Macromedia, QuickTime from Apple, and RealPlayer from RealNetworks. These features greatly extend the function of Web pages, though of course they add technical requirements and narrow the eligible audience. They're also not covered in this basic course.

1.1.4 · Deprecated Tags

    Plug-ins and browser-specific tags add to the range of function in HTML, but every flow implies an ebb: deprecation is the process by which tags and practices are removed from the HTML universe.

    Adding to HTML is easy. In theory, anyone who programs a Web browser can propose a new tag (though you still have to convince other people to use it). Subtracting is another matter. Tags can be formally deprecated only by the World Wide Web Consortium (W3C), and then only after a period of public comment. Once a tag is deprecated, it is assumed that the tag will not be developed further and that future versions of browsing software need not recognize it.

    Since W3C standards are only recommendations, however, software makers are not really obliged to drop support for obsolete but popular tags: they might lose market share by doing so. Like old soldiers, deprecated tags never die, they just fade away.

    In most cases a tag becomes deprecated only when a new construction can do the same thing more efficiently or powerfully. Removing deprecated tags therefore should not impair the function of your pages.

    You should avoid using deprecated tags and you should probably remove them from any existing pages you have published. How fast you do this depends on your situation. In some cases, the substitutes for deprecated tags depend on the Cascading Stylesheets Standard (CSS1), which at this writing is not fully supported by all major browsers. If you're serving a large, technologically diverse population, you should probably take your time.

    When we discuss deprecated tags in this course, we'll note them as such. Here are some of the major cases:

    <CENTER></CENTER> A Netscape "enhancement" to HTML. Since you can set ALIGN="center" attributes for headings, horizontal rules, paragraphs, and tables, there is virtually no need for this container anymore.
    <FONT></FONT> This container was once the only way to set text and link colors locally. It's superseded by stylesheet techniques. We do discuss this tag in Week 3, but do not recommend continued use.
    <LAYER></LAYER> Netscape introduced layers with version 4.0 of Navigator as a way to add a third dimension (among other things) to Web layout; but these innovations were effectively trumped by the Document Object Model and DHTML, and did not become part of the HTML 4.0 specification. Netscape unilaterally deprecated the layer construction and its related parts in 1998, so we don't cover this material.
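    As a rough sketch of what replacing deprecated tags can look like, compare the two fragments below. The heading text and the color value are invented for illustration, and the second version assumes a browser with at least partial stylesheet support, so treat it as one possible substitute and not the only one.

```html
<!-- Older style: deprecated CENTER and FONT containers -->
<CENTER><FONT COLOR="#990000">Spring Schedule</FONT></CENTER>

<!-- One possible replacement: a paragraph carrying an inline style rule -->
<P STYLE="text-align: center; color: #990000">Spring Schedule</P>
```

    Both fragments aim at the same result (centered, dark red text), but the second keeps the appearance in a style rule that can later be moved into a stylesheet.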

1.1.5 · Other Key Terms

    "Page"
    A page is a collection of information (words, images, other media types) distinguished as a single entity. Pages are the basic units of information in Web publishing. Though Web pages often do look something like pages in a desktop publishing program, they can contain much more information than any printed page. Remember, the Web differs significantly from print.

    "Site"
    A Web site is a collection of pages connected by coordinated hypertext links. Typically a site serves a single purpose or expresses a unified concept -- corporate identity, information services, publication, etc.

    Uniform Resource Locator (URL)
    Every page on the Web has a distinct electronic address that may be written as a Uniform Resource Locator. The URL for the Communications Design home page, for instance, is:

    http://raven.ubalt.edu/departments/comDesign/index.htm

    Here's what the elements of this string mean:

    • http:// -- signals that this document uses Hypertext Transfer Protocol (HTTP); in other words, it's a Web page;

    • raven.ubalt.edu -- names the server (raven) and the domain (ubalt.edu) where this document is located;

    • /departments/comDesign/ -- specifies the path, that is the nested series of directories, in which the document is stored -- there's much more on this subject in Week 2;

    • index.htm -- gives the name of the page; note the .htm extension which is one of the two file extensions that may be used for Web pages (the other is .html).



1.2 | INTRODUCTION TO HTML


1.2.1 · Tags

    Tags are the basic commands or verbs of HTML.

    Tags are identified by the special symbols < and >, which are called angle brackets (or, more familiarly, "less than" and "greater than" signs). The first word after the opening bracket is the tag's identifier, or the name of its HTML element (e.g., STRONG, BR, IMG). Tags also have formal names (e.g., "strong emphasis tag"), though most designers use short forms or nicknames.

    MARKUP SHOWING THE "EMPHASIS" TAG
    Before connecting the main relay to the fusion power reactor, make <em>very certain</em> that the primary switch is open.
    OUTPUT FROM THIS MARKUP (the words "very certain" appear in italics)
    Before connecting the main relay to the fusion power reactor, make very certain that the primary switch is open.

1.2.2 · Containers

    Most but not all tags come in pairs and may be thought of as binary, "on/off" switches; a related pair of tags is called a container because it affects anything contained between its two parts.

    In the second tag of the container, the "/" character (slash or stroke) indicates reversal of state. In the example above, it turns emphasis off.
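    To make the distinction concrete, here is a small sketch showing one container and one solitary tag side by side. The sentences are invented for illustration; the STRONG and BR elements are both named earlier in these notes.

```html
<!-- A container: the opening tag switches strong emphasis on,
     and the closing tag (with its slash) switches it off -->
Turn the dial <STRONG>clockwise</STRONG> until it clicks.

<!-- A solitary tag: BR stands alone and has no closing partner -->
First line<BR>
Second line
```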

1.2.3 · Attributes

    If tags are verbs, attributes are adverbs -- they modify the function of the tag or container.

    Tags can carry a great deal of internal information in the form of attributes. Attributes follow the tag identifier and have the general form:

    ATTRIBUTE="ARGUMENT"

    Though most browser programs let you omit the quotation marks most of the time, in some cases they are absolutely required (in the IMG tag, for instance, which we'll discuss in Week 3): so get in the habit of putting all your attribute values in quotes.

    TAGS WITH AND WITHOUT ATTRIBUTES
    <HR>

    <HR SIZE="8" WIDTH="50%" ALIGN="RIGHT">
    The first tag inserts a horizontal rule (shadowed line) using default settings; the second tag inserts a rule 8 pixels high, half the current window width, aligned on the right of the window.



1.3 | ESSENTIAL HTML TAGS




1.3.1 · Document Definition Containers
    <HTML>
    Almost all Web pages begin with this tag and end with its closing counterpart, </HTML>. Everything inside this container is identified as markup in Hypertext Markup Language.

    When other markup languages appear on the Web, this container will become crucial. Right now, nothing important will break if you leave it out, but the distinction is important and you should observe it faithfully.

    <HEAD>
    The HEAD container usually begins immediately inside the HTML container.

    The head of an HTML document holds a number of special elements for document definition:

      <TITLE> -- The contents of this container appear in the title bar of most browser programs (look at the top of your screen) and are also recorded in the history list (see the Go menu in Netscape Navigator).

      <BASE> -- This solitary tag (there is no </BASE>) gives a base URL for the current document in its HREF attribute -- we'll return to this concept in Week 2.

    <BODY>
    On conventional pages (i.e., those that do not use frames), the BODY container holds the majority of page content -- virtually all the written text and tags.

    Attributes to the BODY tag can set the background image and the color of key layout elements; we'll discuss these in Week 3.

    One identifying container belongs in the BODY rather than the HEAD: <ADDRESS>, which conventionally holds postal and e-mail addresses for the author of the page.

    SCHEMATIC VIEW OF DOCUMENT STRUCTURE CONTAINERS

    <HTML>

      <HEAD>
        <TITLE> ... </TITLE>
        <BASE HREF="...">
      </HEAD>

      <BODY>
        Content of the page goes here...
        <ADDRESS> ... </ADDRESS>
      </BODY>
    </HTML>

    These containers form the 'skeleton' of the Web page; note that HEAD and BODY containers are separate, parallel divisions within the HTML container.
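    Filled in with actual content, the skeleton might look something like the sketch below. The title and the body text are placeholders made up for this example, and the heading and paragraph containers used in the body are explained in the next section.

```html
<HTML>

  <HEAD>
    <TITLE>Week 1 Practice Page</TITLE>
  </HEAD>

  <BODY>
    <H1>My First Page</H1>
    <P>This paragraph lives inside the BODY container.</P>
  </BODY>

</HTML>
```

    Saved as a text file with an .htm or .html extension, markup along these lines should open in any Web browser.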

1.3.2 · Format Tags and Containers

    <BR>
    The BR tag introduces a line break.

    Note that, with a few exceptions, line breaks typed into the markup are ignored when the page is presented to the viewer; line breaks must be encoded with specific tags.

    <BR> is one of the few solitary tags in HTML -- there is no </BR>.

    THE LINE BREAK PROBLEM EXEMPLIFIED

    MARKUP
    This line was broken
    by typing Return.
    OUTPUT
    This line was broken by typing Return.

    MARKUP
    This line is broken<BR> with a BR tag.
    OUTPUT
    This line is broken
    with a BR tag.

    <P>
    The P tag signals the beginning of a paragraph by inserting a blank line. In the early days of Web design, before stylesheets arrived, <P> was used as a solitary tag even though properly speaking it is a container. You'll still see lots of markup in which the <P> tag is used simply to insert a blank line. However, stylesheets require that the <P> container be closed with </P>. Even if you are not using stylesheets in a particular project, you should close your paragraph containers because at some point you may wish to edit or re-use your code in a way that requires this feature.

    Note that repeating <P> does not create two blank lines -- browsers register only the first in any sequence of P tags. To skip multiple lines, the safest approach is a series of BR tags, since <BR> is treated cumulatively:

      <BR><BR> = one skipped line
      <BR><BR><BR> = two skipped lines, and so forth.
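    The fragment below sketches both habits recommended above: paragraph containers that are properly closed, and BR tags used when extra vertical space is wanted. The wording of the text is invented for the example.

```html
<P>This paragraph is opened and closed properly.</P>
<P>So is this one; the browser skips a line between the two.</P>

<!-- Extra vertical space: BR tags accumulate, repeated P tags do not -->
End of one section.<BR><BR><BR>
Start of the next, some distance further down the page.
```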


    <H1>, <H2>, etc.
    "H" stands for heading: the H container creates a heading of a given size from 1 (maximum) to 6 (minimum), thus:

    A heading in <H1>

    A heading in <H2>

    A heading in <H3>

    A heading in <H4>

    A heading in <H5>
    A heading in <H6>
    Notice the skipped lines between the examples above; these are not caused by <P> or <BR> tags. The H container forces a skipped line both before and after the heading; as a result, when you want display type without the extra space, it is often preferable to set the text with a rule in your stylesheets (see Week 5 for more about this procedure).

    Note also that heading levels 5 and 6 are often rendered smaller than ordinary body type, which makes them of little practical use.
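    In markup, a short hierarchy of headings might be written as in the sketch below. The heading text is made up for illustration; note that each H container is closed with a matching tag of the same level.

```html
<H1>Annual Report</H1>
<H2>Part One: Finances</H2>
<H3>Quarterly Summary</H3>
<P>Ordinary body text follows the headings.</P>
```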

    Simple List Containers: <UL> and <OL>
    The containers UL (unordered list) and OL (ordered list) create simple lists of the following types:


    SIMPLE LISTS

    UNORDERED <UL>
    MARKUP
    <UL>
    <LI>lions
    <LI>tigers
    <LI>beers
    </UL>
    OUTPUT
    • lions
    • tigers
    • beers

    ORDERED <OL>
    MARKUP
    <OL>
    <LI>liars
    <LI>talkers
    <LI>boars
    </OL>
    OUTPUT
    1. liars
    2. talkers
    3. boars


    The primary difference is that items in the ordered list are numbered, while items in the unordered list are marked with "bullets" (simple dingbats).

    By adding a TYPE attribute to the initial UL tag, you can set the bullet shape to "SQUARE," "CIRCLE," or "DISC" (the default); likewise TYPE can be set in the OL tag to "I" (Roman numerals, uppercase), "i" (Roman numerals, lowercase), "A" (uppercase letters), "a" (lowercase letters), or "1" (Arabic numerals, the default).
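    As an illustration of the TYPE attribute, the sketch below reuses the list items from the examples above; only the TYPE values differ from the earlier markup.

```html
<!-- Unordered list with square bullets instead of the default discs -->
<UL TYPE="SQUARE">
  <LI>lions
  <LI>tigers
  <LI>beers
</UL>

<!-- Ordered list lettered A, B, C instead of 1, 2, 3 -->
<OL TYPE="A">
  <LI>liars
  <LI>talkers
  <LI>boars
</OL>
```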

    Descriptive List: <DL>
    The DL container (formally, a "definition list") creates a more complex list that is useful for glossaries, commentaries, bibliographies, and other two-decked structures:


    DESCRIPTIVE LIST

    MARKUP
    <DL>
    <DT>Non-combatant:
    <DD>A dead Quaker
    </DL>
    OUTPUT
    Non-combatant:
        A dead Quaker


    In the descriptive list, two elements take the place of <LI>: <DT> (the term being described) and <DD> (the description of that term).

    You can nest one type of list within another. The descriptive data following a term in a descriptive list could contain an ordered list, or an unordered list could be inserted within an ordered list, and so forth. These course notes contain several examples of this technique.
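    Here is one way nesting might look in practice: an ordered list whose items each contain an unordered list. The category names are invented for the example, and the indentation is there only to make the structure easier to read.

```html
<OL>
  <LI>Big cats
    <UL>
      <LI>lions
      <LI>tigers
    </UL>
  <LI>Other animals
    <UL>
      <LI>boars
    </UL>
</OL>
```

    Each inner list is opened and closed entirely inside the item it belongs to, before the next item of the outer list begins.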

    <HR>
    This solitary tag (there is no </HR>) inserts a horizontal rule to mark a division of the page.

    You may also add SIZE (height) and WIDTH attributes. SIZE is given in pixels; WIDTH may be given either as a percentage of available space or in pixels. Browsers may interpret these variations somewhat differently, but the general effect is fairly constant.

    The NOSHADE attribute for HR converts the default shaded line into a solid bar.
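    Combining these attributes, a rule might be written as in the sketch below; the particular values are arbitrary choices for illustration.

```html
<!-- A solid bar, 4 pixels high, three quarters of the window width, centered -->
<HR NOSHADE SIZE="4" WIDTH="75%" ALIGN="CENTER">
```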

    <PRE>
    The PRE or preformatted text tag is the most common exception to the general rule about line breaks in markup (see above): with PRE, breaks typed in the markup ARE carried through.


    PREFORMATTED TEXT CONTAINER

    MARKUP
    <PRE>
    Here at ZipDotCom
    we are ready
    to serve YOU,
    the savvy Internet
    consumer!!! </PRE>
    OUTPUT
    Here at ZipDotCom
    we are ready
    to serve YOU,
    the savvy Internet
    consumer!!!


    Many beginners see the PRE container as a formatting shortcut. But of course there is a catch. Everything within the PRE tag appears in a mono-spaced or "teletype" font.

    Seasoned Web users will thus see that you have used the PRE tag and will be apt to regard your content as merely "dumped" from older formats into your Web page.

    Shortcuts like this are generally regarded as slacking. Electronic publishers like to refer to shovelware, meaning content transferred without modification from old to new media -- and they don't mean shoveling snow. Use the PRE tag only when there are no reasonable alternatives.

1.3.3 · Character Style Containers

    <TT>
    Of course, you might wish to use a "teletype" font on occasion, perhaps to mark a change of tone or a discursive shift. The TT container does this, but doesn't pass along any line breaks from the markup.

    Related to TT is the <CODE> container, which is used to set examples of computer code for documentation or discussion; it is preferable to TT for specialized uses because it signals content type as well as defining appearance.
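    A brief sketch of the two containers in use follows. The sentences are invented, and the example assumes the standard character entities &lt; and &gt;, which let angle brackets appear in the text without being read as a tag.

```html
<!-- CODE marks the content as computer code; the entities display as < and > -->
<P>To insert a line break, type <CODE>&lt;BR&gt;</CODE> where the line should end.</P>

<!-- TT changes only the typeface -->
<P>The terminal replied: <TT>READY.</TT></P>
```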

    <EM> and <STRONG>
    These containers set emphasis and strong emphasis; ordinarily they are interpreted as italics and bold, but substitutions may be made on systems that do not support these options.

    The EM and STRONG containers are preferable to the more precise <I> and <B> containers, partly because they do not lock in only one style (bold or italic) and partly because they convey information about content as well as appearance.

    EM and STRONG may be used together to produce doubly strong emphasis. For example:

      This is <STRONG><EM>really</EM></STRONG> important!

1.4 | WRITING MARKUP




    Menu-driven editors
    These simple programs (such as HoTMetaL, HTMLEditor, or Alpha) let you select HTML tags from menus and other visual arrays. They can be useful but also confusing, since there are so many tags and variations to choose from. Since in many cases you'll have to add attributes by hand, why not just learn the tags?

    WYSIWYG HTML "assistants"
    Programs like Microsoft's FrontPage, Claris HomePage, Adobe PageMill, and Macromedia's Dreamweaver promise Web authoring without the bother of coding. They can be valuable if you have to convert large amounts of regularly structured text or if you find yourself working with people who have not learned HTML.

    On the other hand, most serious Web authors avoid these tools, largely because they generate highly complicated and often non-standard code. Some of these programs exasperatingly re-generate code structures after the user edits them out.

    Storyspace
    This is one of the few Web authoring programs that graphically maps hypertext structures as you build them -- a crucial asset for large, complex hyperdocuments. Follow the link above for more details and for examples of Web projects created with Storyspace.

    Simple text editors
    All you really need to write Web markup is a text editor such as WordPad or (for Macintosh) BBEdit. WordPad is a Windows system utility; a "lite" version of BBEdit circulates free on the Internet.



1.5 | ELEMENTS OF HTML STYLE




    Case sensitivity
    With two important exceptions, HTML makes no distinction between upper and lowercase letters -- so <STRONG>, <strong>, and <StRoNg> are all valid ways to write the same tag.

    The exceptions are NAME anchors (see notes for Week 2) and URLs for external pages stored on UNIX systems (also discussed in the next session).

    Begin tags on separate lines
    This helps separate tags from content, which helps when you're trying to edit one or the other.

    Skip lines and indent for clarity
    Browsers disregard skips and indentations just as they ignore line breaks, except of course inside a PRE container.

    Comment your markup!
    Comments in HTML are enclosed in a special container, thus:

      <!-- insert comment here -->

    Since HTML markup can be bewilderingly complex, you should write comments to help yourself and others understand what's going on. Comments are invisible to the casual reader and do not add significantly to load time.
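    The style points above (tags on their own lines, indentation, and comments) might be combined as in this sketch. The heading and paragraph text are placeholders invented for the example.

```html
<!-- Page heading: change the text here when the page is renamed -->
<H1>
  Field Notes
</H1>

<!-- Opening paragraph -->
<P>
  Browsers ignore the indentation and line breaks used here for clarity.
</P>
```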

    Keep embedded structures in order
    Containers are frequently nested one inside another, as in this example:

      <FONT SIZE="5"><STRONG>Big and Bold</STRONG></FONT>

    In this example, containers are closed on a last-opened, first-closed basis. Always close containers according to this logical sequence.
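    For contrast, here is a sketch of the same rule followed and broken, using the EM and STRONG containers from earlier in these notes. The text is invented for illustration.

```html
<!-- Correct: STRONG was opened last, so it is closed first -->
<EM><STRONG>Handle with care</STRONG></EM>

<!-- Incorrect: the closing tags are out of sequence, so the containers overlap -->
<EM><STRONG>Handle with care</EM></STRONG>
```

    Many browsers will quietly display the second version anyway, but you should not count on it: results vary from one program to the next.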

END OF NOTES FOR WEEK 1



Course and Materials ©1997 - 2000 by Stuart Moulthrop and Nancy Kaplan