1.1 · FUNDAMENTAL CONCEPTS
1.1.1 · Hypertext Transfer Protocol (HTTP)
Various protocols define types of Internet service: for instance,
file transfer (FTP), USENET news (NNTP), and e-mail (POP).
Hypertext Transfer Protocol is a set of rules governing the
transmission of words, images, and other forms of information that make up
pages on the World Wide Web.
1.1.2 · Clients and Servers
CLIENT: a personal computer connected
to the Internet and running a client or browser program
through which it issues requests for information to other computers (servers).
SERVER: a computer (dedicated PC, mini, mainframe)
connected to the Internet and running a Hypertext Transfer Protocol (HTTP)
server program, allowing it to interpret and answer requests for information from other
(client) computers.
Popular Web Client Programs: Netscape Navigator,
Microsoft Internet Explorer (together about 90% of the market).
Popular Web Server Programs: iPlanet [formerly Netscape]
(Windows and UNIX);
Microsoft Internet Information Server (Windows);
Apache (all platforms).
1.1.3 · Hypertext Markup Language (HTML)
Content and layout of Web pages are controlled by markup documents
written in Hypertext Markup Language (HTML).
To see what Web markup looks like, use the View Document Source
feature of your Web browser to examine the markup for this page.
Standards for HTML are maintained by the World Wide Web Consortium ("W3C"),
a committee of academic and industry officials, including Tim Berners-Lee, the
computer scientist who invented HTML. However, the W3C has no formal authority
and software companies have extended the standard language considerably (see below).
HTML is a set of instructions that tell browser programs how to display information.
These instructions are similar to the so-called invisible commands used in older word processing
programs like WordStar and WordPerfect.
HTML is much simpler than any programming language. You can learn the basics in a
few hours: the hard part is knowing what to do with them.
In the early Web days it took some effort to keep current with the HTML standard.
In their attempt to dominate Internet software, Netscape and Microsoft added
new browser features that relied on extensions to HTML.
Design practices tended to change radically as new versions of Web browsers
came into the market.
That competition
has died down now (actually it's moved on to the far more complex realm of
Extensible Markup Language or XML, which is beyond the
purview of this course).
The basic outlines of HTML, embodied in the HTML 3.2 and HTML 4.0 standards
published by W3C, are generally accepted throughout the Web world.
Nonetheless it's useful to review the brief, interesting history of HTML:
- HTML 1 (1991-93) --
In the initial scheme of things, Web pages looked like typewritten
documents with graphics awkwardly stuck in.
- HTML 2.0 (1993-95) -- Drafted from 1993 and published by the IETF as
RFC 1866 in November 1995, this first formal specification
of the language concentrated on interactive forms and did little to improve
layout or graphics.
- HTML 3 (1994-95) -- Netscape greatly expanded the range of HTML
commands with Navigator 1.0 (1994) and 1.1 (1995), adding centering, background images,
page color, tables, and dynamic documents. At first these features were considered
suspect and non-standard. The HTML 3.0 standard was never formally approved.
- HTML 3.2 (1996) -- When Microsoft entered the Web field midway
through 1995,
Netscape's "enhancements" showed up on the Internet Explorer feature list as well; this,
along with
the huge popularity of Netscape's innovations, prompted the W3C to issue a
standard including all major additions except
the advanced feature called frames.
- HTML 4.0 (1997-99) -- HTML 4.0,
perhaps the last major revision of HTML for a long while,
supports an important new control system for typography and layout
called Cascading Style Sheets (CSS).
Along with style sheets, two generic containers, DIV and SPAN,
make styling elements more flexible (SPAN is new in HTML 4.0; DIV first appeared in HTML 3.2).
HTML 4 also incorporates the Document Object Model,
a powerful method for combining scripting languages like JavaScript
and VBScript with elements of standard HTML.
- Plug-ins and Auxiliaries --
In addition to HTML itself, a number of auxiliary technologies have appeared
on the Web, including the programming language Java,
Common Gateway Interface scripting (CGI), and
various browser plug-ins such as Shockwave from
Macromedia, QuickTime from Apple, and RealPlayer from RealNetworks.
These features greatly extend the function of Web pages, though
of course they add technical requirements and narrow the
eligible audience. They're also not covered in this basic course.
1.1.4 · Deprecated Tags
Plug-ins and browser-specific tags add to the range of function
in HTML, but every flow implies an ebb: deprecation
is the process by which tags and practices are removed from the HTML universe.
Adding to HTML is easy.
In theory, anyone who programs a Web browser can propose a new
tag (though you still have to convince other people to use it).
Subtracting is another matter.
Tags can be formally deprecated only by the World Wide Web Consortium (W3C),
and then only after a period of public comment.
Once a tag is deprecated, it is assumed that
the tag will not be developed further and that future versions
of browsing software need not recognize it.
Since W3C standards are only recommendations, however, software makers
are not really obliged to drop support for obsolete but popular tags: they might
lose market share by doing so. Like old soldiers, deprecated tags
never die, they just fade away.
In most cases a tag becomes deprecated only when a new construction
can do the same thing more efficiently or powerfully.
Removing deprecated tags therefore should not impair the
function of your pages.
You should avoid using deprecated tags and you should probably remove them
from any existing pages you have published. How fast you do this depends
on your situation. In some cases, the substitutes for deprecated
tags depend on the Cascading Style Sheets standard (CSS1), which at this writing
is not fully supported by all major browsers.
If you're serving a large, technologically diverse population, you should
probably take your time.
When we discuss deprecated tags in this course, we'll note them as such.
Here are some of the major cases:
- <CENTER></CENTER>
- A Netscape "enhancement" to HTML. Since you can set ALIGN="center"
attributes for headings, horizontal rules, paragraphs, and tables,
there is virtually no need for this container anymore.
- <FONT></FONT>
- This container was once the only way to set text and link colors locally.
It's superseded by stylesheet techniques. We do discuss this tag
in Week 3, but do not recommend continued use.
- <LAYER></LAYER>
- Netscape introduced layers with version 4.0 of Navigator as a way
to add a third dimension (among other things) to
Web layout; but these innovations were effectively trumped by
the Document Object Model and DHTML, and did not become part
of the HTML 4.0 specification. Netscape unilaterally deprecated
the layer construction and its related parts in 1998,
so we don't cover this material.
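As a rough sketch of what these substitutions look like, the markup below sets the same heading and colored word twice: first with the deprecated containers, then with an ALIGN attribute and an inline style. The wording and the color value are invented for illustration, and the STYLE attribute assumes a browser with at least partial CSS1 support.

```html
<!-- Deprecated: CENTER and FONT containers -->
<CENTER><H2>Spring Schedule</H2></CENTER>
<P>Registration closes <FONT COLOR="#CC0000">Friday</FONT>.</P>

<!-- Substitutes: ALIGN attribute and an inline style rule -->
<H2 ALIGN="center">Spring Schedule</H2>
<P>Registration closes <SPAN STYLE="color: #CC0000">Friday</SPAN>.</P>
```

Both versions should look alike in a browser that understands stylesheets; in an older browser the second paragraph simply appears without the color.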
1.1.5 · Other Key Terms
- "Page"
- A page is a collection of information (words, images, other media types)
distinguished as a single entity. Pages are the basic units of information in Web
publishing.
Though Web pages often do look something like
pages in a desktop publishing program, they can contain much more information
than any printed page. Remember, the Web differs significantly from print.
- "Site"
- A Web site is a collection of pages connected by coordinated hypertext links.
Typically a site serves a single purpose or expresses a unified concept -- corporate identity,
information services, publication, etc.
- Uniform Resource Locator (URL)
- Every page on the Web has a distinct electronic address that may be
written as a Uniform Resource Locator. The URL for the Communications Design
home page, for instance, is:
http://raven.ubalt.edu/departments/comDesign/index.htm
Here's what the elements of this string mean:
- http:// -- signals that this document uses
Hypertext Transfer Protocol (HTTP); in other words, it's a Web page;
- raven.ubalt.edu -- names the server (raven) and the domain (ubalt.edu) where this document is located;
- /departments/comDesign/ -- specifies the path, that is the nested series of directories, in which the
document is stored -- there's much more on this subject in
Week 2;
- index.htm -- gives the name of the page; note
the .htm extension which is one of the two file extensions that may be used
for Web pages (the other is .html).
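Links are a Week 2 topic, but as a preview, here is a sketch of the same URL at work inside a link. The comment labels the four parts described above; the link text is only an illustration.

```html
<!-- protocol:          http://
     server and domain: raven.ubalt.edu
     path:              /departments/comDesign/
     page:              index.htm -->
<A HREF="http://raven.ubalt.edu/departments/comDesign/index.htm">Communications Design home page</A>
```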
1.2 · INTRODUCTION TO HTML
1.2.1 · Tags
Tags are the basic commands or verbs of HTML.
Tags are identified by the special symbols < and >, which are called angle brackets (or, more familiarly, "less than" and "greater than" signs).
The first word after the opening bracket is
the tag's identifier, or the name of its HTML element
(e.g., STRONG, BR, IMG). Tags also have formal names (e.g., "strong emphasis tag"), though
most designers use short forms or nicknames.
MARKUP SHOWING THE "EMPHASIS" TAG
Before connecting the main relay to the fusion power reactor, make
<em>very certain</em> that
the primary switch is open.
OUTPUT FROM THIS MARKUP
Before connecting the main relay to the fusion power reactor, make
very certain that the primary switch is open.
(In the browser, the words "very certain" appear in italics.)
1.2.2 · Containers
Most but not all tags come in pairs and may be thought of as binary, "on/off" switches;
a related pair of tags is called a container because it affects
anything contained between its two parts.
In the second tag of the container, the "/" character (slash or stroke) indicates
reversal of state. In the example above, it turns emphasis off.
1.2.3 · Attributes
If tags are verbs, attributes are adverbs -- they modify the function of the tag or
container.
Tags can carry a great deal of internal information in the form of
attributes. Attributes follow the tag identifier and have the
general form:
ATTRIBUTE="ARGUMENT"
Though most browser programs let you omit the quotation marks
most of the time, in some cases they are absolutely required
(in the IMG tag, for instance, which we'll discuss in
Week 3): so get in the habit of putting all
your attribute values in quotes.
TAGS WITH AND WITHOUT ATTRIBUTES
<HR>
<HR SIZE="8" WIDTH="50%" ALIGN="RIGHT">
The first tag inserts a horizontal rule (shadowed line) using default settings;
the second tag inserts a rule 8 pixels high, half the current window width,
aligned on the right of the window.
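One case where the quotation marks clearly matter is an attribute value that contains a space. The sketch below uses the IMG tag with a made-up image file name (logo.gif) and ALT text; both are assumptions for illustration.

```html
<!-- Without quotes the browser takes only "Our" as the ALT value
     and treats "logo" as a separate, unknown attribute. -->
<IMG SRC="logo.gif" ALT=Our logo>

<!-- With quotes, the whole phrase is the value. -->
<IMG SRC="logo.gif" ALT="Our logo">
```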
1.3 · ESSENTIAL HTML TAGS
1.3.1 · Document Definition Containers
- <HTML>
- Almost all Web pages begin with this tag and end with its closing
counterpart, </HTML>.
Everything inside this container is identified as markup in Hypertext Markup Language.
- When other markup languages appear on the Web, this container will
become crucial. Right now, nothing important will break if you leave it out,
but the distinction is important and you should observe it faithfully.
- <HEAD>
- The HEAD container usually begins immediately inside the HTML container.
- The head of an HTML document contains a number of special elements for document
definition:
<TITLE> -- the contents of this container appear
in the title bar of most browser programs (look at the top of
your screen) and are also recorded in the history list (see the Go
menu in Netscape Navigator).
<BASE> -- this solitary tag (there is no </BASE>) gives a base URL
for the current document in its HREF attribute -- we'll return to this concept in
Week 2.
- <BODY>
- On conventional pages (i.e., those that do not use frames),
the BODY container holds the majority of page content -- virtually all the written
text and tags.
- Attributes to the BODY tag can set background image and the color of key layout elements;
we'll discuss these in Week 3.
- <ADDRESS> -- this container conventionally holds postal and
e-mail addresses for the author of the page. Although it describes the document,
it belongs inside the BODY (usually at the foot of the page), not in the HEAD.
SCHEMATIC VIEW OF DOCUMENT STRUCTURE CONTAINERS
<HTML>
<HEAD>
<TITLE> ... </TITLE>
<BASE HREF="...">
</HEAD>
<BODY>
Content of the page goes here...
<ADDRESS> ... </ADDRESS>
</BODY>
</HTML>
These containers form the 'skeleton' of the Web page; note that HEAD and BODY containers
are separate, parallel divisions within the HTML container.
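Filled in, the skeleton might look like the sketch below. The title and the text are invented for illustration, and the H1 and P containers are explained in the next section; save the markup in a text file with an .htm extension and open it in your browser to see the result.

```html
<HTML>
<HEAD>
<TITLE>Week 1 Practice Page</TITLE>
</HEAD>
<BODY>
<H1>Hello, Web</H1>
<P>This is my first page.</P>
</BODY>
</HTML>
```

The TITLE text shows up in the browser's title bar, not on the page itself; only what sits inside BODY appears in the window.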
1.3.2 · Format Tags and Containers
- <BR>
- The BR tag introduces a line break.
- Note that,
with a few exceptions,
line breaks typed into the markup are ignored when
the page is presented to the viewer; line breaks must be encoded with
specific tags.
- <BR> is one of the few solitary tags in HTML --
there is no </BR>.
THE LINE BREAK PROBLEM EXEMPLIFIED
MARKUP:
This line was broken
by typing Return.
OUTPUT:
This line was broken by typing Return.
MARKUP:
This line is broken<BR> with a BR tag.
OUTPUT:
This line is broken
with a BR tag.
- <P>
- The P tag signals the beginning of a paragraph by inserting a blank line.
In the early days of Web design, before stylesheets arrived, <P> was used
as a solitary tag even though properly speaking it is a container.
You'll still see lots of markup in which the <P> tag is used simply
to insert a blank line.
However, stylesheets require that the <P> container be closed with
</P>. Even if you are not using stylesheets in a particular
project, you should close your paragraph containers because at some
point you may wish to edit or re-use your code in a way that requires
this feature.
Note that repeating <P> does not create two blank lines --
browsers register only the first in any sequence of P tags.
To skip multiple lines, the safest approach is a series of BR tags,
since <BR> is treated cumulatively:
<BR><BR> = one skipped line
<BR><BR><BR> = two skipped lines, and so forth.
- <H1>, <H2>, etc.
- "H" stands for heading: the H container creates a
heading of a given size from 1 (maximum) to 6 (minimum), thus:
A heading in <H1>
A heading in <H2>
A heading in <H3>
A heading in <H4>
A heading in <H5>
A heading in <H6>
- Notice the skipped lines between the examples above; these are not caused by <P> or <BR> tags.
The H
container forces a skipped line both before and after the heading; as a result, it is
often preferable to use the font property and assign it in a stylesheet rule (see Week 5 for more about this procedure).
- Note also that headings numbered above 4 (H5 and H6) are often smaller than the ordinary
body type, thus useless.
- Simple List Containers: <UL> and <OL>
- The containers UL (unordered list) and OL (ordered list) create simple lists
of the following types:
SIMPLE LISTS
UNORDERED <UL>
Markup:
<UL>
<LI>lions
<LI>tigers
<LI>beers
</UL>
Output:
• lions
• tigers
• beers
ORDERED <OL>
Markup:
<OL>
<LI>liars
<LI>talkers
<LI>boars
</OL>
Output:
1. liars
2. talkers
3. boars
- The primary difference is that items in the ordered list are numbered, while
items in the unordered list are marked with "bullets" (simple dingbats).
- By adding a TYPE attribute to the initial UL tag, you can set the bullet shape
to "SQUARE," "CIRCLE," or "DISC" (the default); likewise TYPE can be
set in the OL tag to "I" (Roman numerals, uppercase), "i" (Roman numerals, lowercase),
"A" (uppercase letters), "a" (lowercase letters), or "1" (Arabic numerals, the default).
- Descriptive List: <DL>
- The DL container creates a more complex list that is useful for glossaries,
commentaries, bibliographies, and other two-decked structures:
DESCRIPTIVE LIST
Markup:
<DL>
<DT>Non-combatant:
<DD>A dead Quaker
</DL>
Output:
Non-combatant:
    A dead Quaker
- In the descriptive list (formally called a definition list), two elements take the place of <LI>:
<DT> (definition term) and <DD> (definition description).
- You can nest one type of list within another. The
descriptive data following a term in a descriptive list could
contain an ordered list, or an unordered list could be inserted within
an ordered list, and so forth. These course notes contain several examples
of this technique.
- <HR>
- This solitary tag (there is no </HR>) inserts a horizontal rule to mark
a division of the page.
- You may also add SIZE (height) and WIDTH attributes. SIZE is given in pixels;
WIDTH may be given either as a percentage of available space or in pixels.
Browsers may interpret these variations somewhat differently, but the general
effect is fairly constant.
- The NOSHADE attribute for HR
converts the default shaded line into a solid bar.
- <PRE>
- The PRE or preformatted text tag is the most common exception to the general
rule about line breaks in markup (see above): with PRE,
breaks typed in the markup ARE carried through.
PREFORMATTED TEXT CONTAINER
Markup:
<PRE>
Here at ZipDotCom
we are ready
to serve YOU,
the savvy Internet
consumer!!!
</PRE>
Output:
Here at ZipDotCom
we are ready
to serve YOU,
the savvy Internet
consumer!!!
- Many beginners see the PRE container as a formatting shortcut.
But of course there is a catch. Everything within the PRE tag appears in a
mono-spaced or "teletype" font.
- Seasoned Web users will thus see that you have used the PRE tag and
will be apt to regard your content as merely "dumped" from
older formats into your Web page.
Shortcuts like this are generally regarded as slacking. Electronic publishers
like to refer to shovelware, meaning content transferred
without modification from old to new media -- and they don't mean
shoveling snow. Use the PRE tag only when there are no
reasonable alternatives.
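To pull the format tags together, here is a sketch of a page fragment that uses a heading, a closed paragraph with a line break, a horizontal rule, a nested list, and a descriptive list. All of the text is invented for illustration; the fragment would sit inside the BODY container of a page.

```html
<H2>Field Notes</H2>

<P>
Paragraphs are closed containers.<BR>
A BR tag breaks the line without starting a new paragraph.
</P>

<HR SIZE="4" WIDTH="50%" NOSHADE>

<!-- An unordered list nested inside the first item of an ordered list -->
<OL TYPE="A">
<LI>Mammals
    <UL TYPE="square">
    <LI>lions
    <LI>tigers
    </UL>
<LI>Reptiles
</OL>

<DL>
<DT>Shovelware:
<DD>Content transferred without modification from old to new media
</DL>
```

The indentation of the inner list is there only to help whoever reads the markup; the browser ignores it and indents the nested list on its own.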
1.3.3 · Character Style Containers
- <TT>
- Of course, you might wish to use a "teletype" font on occasion, perhaps to
mark a change of tone or a discursive shift. The TT container does this, but doesn't
pass along any line breaks from the markup.
- Related to TT is the <CODE> container, which is used
to set examples of computer code for documentation or discussion; it is preferable to
TT for specialized uses because it signals content type as well as defining appearance.
- <EM> and <STRONG>
- These containers set emphasis and strong emphasis; ordinarily they are
interpreted as italics and bold, but substitutions
may be made on systems that do not support these options.
- The EM and STRONG containers are preferable to the more precise
<I> and <B> containers, partly
because they do not lock in only one style (bold or italic) and partly because they
convey information about content as well as appearance.
- EM and STRONG may be used together to produce doubly
strong emphasis. For example:
This is
<STRONG><EM>really</EM></STRONG>
important!
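A short sketch showing the character style containers side by side; the sentences (and the dir command) are invented for illustration.

```html
<P>
Type <CODE>dir</CODE> at the prompt to list your files.
The manual calls this step <EM>essential</EM>; we call it
<STRONG>mandatory</STRONG>.
</P>

<P>
<TT>THIS NOTICE WAS PRINTED BY MACHINE.</TT>
</P>
```

CODE and TT will usually look the same on screen; the difference is that CODE also tells the reader (and any program reading the markup) that the text is computer code.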
1.4 · WRITING MARKUP
- Menu-driven editors
- These simple programs (such as HoTMetaL, HTMLEditor, or Alpha)
let you select HTML tags from
menus and other visual arrays. They can be useful but also confusing, since
there are so many tags and variations to choose from. Since in many cases
you'll have to add attributes by hand, why not just learn the tags?
- WYSIWYG HTML "assistants"
- Programs like Microsoft's FrontPage, Claris HomePage, Adobe PageMill,
and Macromedia's Dreamweaver
promise Web authoring without the bother of coding. They can be
valuable if you have to convert large amounts of regularly
structured text or if you find yourself working with people who
have not learned HTML.
- On the other hand, most serious Web authors avoid these tools, largely
because they generate highly complicated and often non-standard code.
Some of these programs exasperatingly re-generate code structures after
the user edits them out.
- Storyspace
- This is one of the few Web authoring programs that graphically maps
hypertext structures as you build them -- a crucial asset for large, complex
hyperdocuments. Follow the link above for more details and for
examples of Web projects created with Storyspace.
- Simple text editors
- All you really need to write Web markup
is a text editor such as WordPad or (for Macintosh) BBEdit.
WordPad is a Windows system utility; a "lite" version of BBEdit
circulates free on the Internet.
1.5 · ELEMENTS OF HTML STYLE
- Case sensitivity
- With two important exceptions, HTML makes no distinction between
upper and lowercase letters -- so <STRONG>,
<strong>, and <StRoNg> are
all valid ways to write the same tag.
- The exceptions are NAME anchors (see notes for
Week 2) and URLs for
external pages stored on UNIX systems (also discussed in the next session).
- Begin tags on separate lines
- This helps separate tags from content, which helps when you're trying
to edit one or the other.
- Skip lines and indent for clarity
- Browsers disregard skips and indentations just as they
ignore line breaks, except of course inside a PRE container.
- Comment your markup!
- Comments in HTML are enclosed in a special container, thus:
<!-- insert comment here -->
- Since HTML markup can be bewilderingly complex, you should write
comments to help yourself and others understand what's going on.
Comments are invisible to the casual reader and do not add significantly
to load time.
- Keep embedded structures in order
- Containers are frequently nested one inside another, as in this example:
<FONT SIZE="5"><STRONG>Big
and Bold</STRONG></FONT>
- In this example, containers are closed on a
last-opened, first-closed basis.
Always close containers according to this logical sequence.
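The sketch below applies these habits to one short passage: tags begin on their own lines, a comment explains what is going on, and containers close in the reverse of the order in which they were opened. The sentence is invented; the second version is included only to show the mistake to avoid.

```html
<!-- Right: the last container opened (EM) is the first one closed -->
<P>
<STRONG><EM>Read this first.</EM></STRONG>
</P>

<!-- Wrong: STRONG is closed before EM, though EM was opened last -->
<P>
<STRONG><EM>Read this first.</STRONG></EM>
</P>
```

Many browsers will display the second version without complaint, but the result is not guaranteed, and badly ordered containers become much harder to untangle once stylesheets enter the picture.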
END OF NOTES FOR WEEK 1
Course and Materials ©1997 - 2000 by Stuart Moulthrop
and Nancy Kaplan