Guidelines for HTML Standards
This article concludes our introduction to HTML with a presentation of some valuable guidelines for working with HTML documents and code that will help maximize their maintainability and reusability. Of central importance is the need to understand HTML and its role in Web applications, to plan ahead for maintainable and reusable code, and to adopt a consistent policy on coding style.
Coding Style Guidelines
Consistency is absolutely a prerequisite for maximizing maintainability and reusability. These general guidelines for coding style can form the basis of a set of standards that will help ensure that all developers in a project—or, better, in all projects across an organization—write code consistently.
Use well-formed HTML.
Pick good names and ID values.
Limit line length.
Standardize character case.
Use comments judiciously.
Use Well-formed HTML
Although Web browsers are generally forgiving and can ignore many mistakes, rendering most HTML as the document author intended, it is still a good idea to use well-formed HTML code, for a number of reasons.
Well-formed markup code is a concept that has gained importance with increased implementation of XML. While browsers did not, in general, enforce HTML language rules very closely, XML parsers do. Code is considered well formed when it is structured according to the rules for XML 1.0. These rules relate to character case, tags, nesting, and attribute values.
In general, when most browsers encounter an unrecognized or extraneous tag, they ignore it. However, different browsers might render such code in different, and unpredictable, ways. In addition, future versions of browsers might adhere to standards more closely than current versions do. Finally, code that includes such elements can be harder to read and understand, making maintenance more difficult.
Lowercase names: Element and attribute names should be all lowercase. HTML through version 4.01 is not case-sensitive, but XML is, and the XHTML 1.0 recommendation, which is built on XML, requires lowercase element and attribute names. To ensure that code keeps working and to maximize reusability, plan for this from the start.
Closing tags: All nonempty elements must have corresponding closing tags. Empty elements (those previously signified with a single tag) must either be followed immediately by a corresponding closing tag or have a tag that ends with "/". Both forms are well-formed code.
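As a minimal sketch of the closing-tag rule (the content and the file name logo.gif are made up for illustration), nonempty elements close explicitly, and empty elements use either of the two accepted forms:

```html
<!-- Nonempty elements: every start tag has a matching end tag -->
<p>First paragraph.</p>
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>

<!-- Empty elements: either form below is well formed -->
<hr></hr>
<img src="logo.gif" alt="Company logo" />
```

The compatibility guidelines in the XHTML 1.0 recommendation suggest the second form, with a space before the "/", because older HTML browsers tend to handle it more reliably.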
Nested elements: All nested elements must be properly nested. The start tag of an inner element and its corresponding closing tag must both sit inside the element that encloses it.
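A small illustration of proper nesting, using made-up content, might look like this:

```html
<!-- Properly nested: <b> opens and closes inside <p> -->
<p>This text is <b>bold</b>, and this text is not.</p>
```

Here the b element begins and ends entirely within the p element that contains it.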
If elements overlap, so that an inner element's closing tag appears after the closing tag of the element in which it was opened, then they are not properly nested.
While many browsers have accepted overlapping elements and given the expected results, they have always been, strictly speaking, illegal in HTML, and future versions of browsers might not support them.
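To make the difference concrete, here is a sketch (with made-up content) of overlapping elements and one way to rewrite them so that they nest properly:

```html
<!-- Overlapping (not well-formed): <b> opens inside <i> but closes outside it -->
<p><i>italic <b>bold italic</i> bold only</b></p>

<!-- Properly nested equivalent: <b> is closed and reopened around the boundary -->
<p><i>italic <b>bold italic</b></i><b> bold only</b></p>
```

An XML parser would reject the first paragraph outright, while the second gives the same visual result with no overlap.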
Attribute values: Attribute values, even numeric ones, should be quoted.
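As an illustration of quoting, the table below is a made-up example in which every attribute value, numeric or not, is enclosed in quotation marks:

```html
<!-- Unquoted form that browsers often tolerate: <table border=1 cellpadding=4> -->

<!-- Quoted form, including the numeric values -->
<table border="1" cellpadding="4" width="100%">
  <tr>
    <td colspan="2">Quoted numeric attribute values</td>
  </tr>
</table>
```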
Code validation: Another step toward improving HTML code is to validate it against a formal published grammar and to declare this validation at the beginning of the HTML document. For example, a document type declaration (DOCTYPE) on the first line of the document can declare validation against the public HTML 3.2 Final grammar.
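A minimal sketch of such a declaration follows. The first line uses the public identifier that the W3C published for HTML 3.2, and the rest of the document is placeholder content:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
  <head>
    <title>Example document</title>
  </head>
  <body>
    <p>Document content goes here.</p>
  </body>
</html>
```

The declaration itself does not validate anything. It tells a validator which grammar to check the document against.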
Assign Meaningful Names and ID Values
Use a consistent scheme for assigning the value of name and ID properties. They should be as short as reasonably possible, but without giving up descriptive power. Also, use mixed-case property values to help readability (see Listing 2). In this code snippet, the check box names express not only what the purpose of the element is, but also information about the element's type. The code also illustrates the use of mixed case to help readability.
Listing 2: Example of Good Element Names
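As a sketch of this kind of naming, the check boxes below carry a "chk" prefix for the element type and a mixed-case description of their purpose. The names chkSendNewsletter and chkSendOffers, and the placeholder form action, are assumptions made for illustration:

```html
<form action="#" method="post">
  <p>
    <input type="checkbox" name="chkSendNewsletter" id="chkSendNewsletter" value="yes" />
    <label for="chkSendNewsletter">Send me the newsletter</label>
  </p>
  <p>
    <input type="checkbox" name="chkSendOffers" id="chkSendOffers" value="yes" />
    <label for="chkSendOffers">Send me special offers</label>
  </p>
</form>
```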
HTML primarily refers to elements by their name property, while DHTML and client-side scripts use the ID property. Although IDs must be unique within a document, there is in general no reason not to use the same value for an element's name and ID properties. Using the same value for both can reduce the confusion that might arise when mixing HTML and client-side scripting.
Use indentation consistently to enhance the readability of the code. When elements carry over more than one line of code, indent the contents of elements between the start tag and the end tag. This will make it easy to see where the element begins and ends. Also, use indentation to align code at attribute names (see Listing 3).
It is a good idea to use no more than two to four spaces for each level in indentation, so as not to use up all the available line length in indentation. If possible, set up the development tool to convert tabs to spaces so that the indentation will be the same when the source is viewed in different editors or as printed output.
Listing 3: Indent Code Consistently
To log into the system, enter your user
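A minimal sketch of consistent indentation, assuming a hypothetical login form (the prompt text, the field names txtUserId and txtPassword, and the placeholder form action are all made up), could look like this:

```html
<p>
  Enter your user ID and password.
</p>
<form action="#" method="post">
  <table border="0" cellpadding="4">
    <tr>
      <td>User ID:</td>
      <td>
        <input type="text"
               name="txtUserId"
               id="txtUserId" />
      </td>
    </tr>
    <tr>
      <td>Password:</td>
      <td>
        <input type="password"
               name="txtPassword"
               id="txtPassword" />
      </td>
    </tr>
  </table>
</form>
```

Each nesting level is indented two spaces, and attributes that continue onto later lines are aligned with the first attribute name.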
Limit Line Length
Break up lines when they run too long. It is much easier to read and understand code when you can see the entire line at once. When lines of code are so long that the reader must scroll right and left to read them, it requires much more cognitive effort to understand what the code is doing. Alternatively, in some applications, long lines might wrap to the next line at the nearest word break. In either case, source code is much easier to read and understand if the developer takes explicit control of line length.
HTML is not sensitive to line breaks, so the developer can break lines at will between keywords for readability. For example, Listing 4 illustrates a code snippet in which two elements have word-wrapped to the next line because they were too long for the editor window.
Listing 4: HTML Source Code with Uncontrolled Line Breaks
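As a sketch of the problem, the snippet below places two hypothetical input elements (txtSearch and btnSearch are made-up names) on a single source line, leaving it to the editor to wrap or scroll:

```html
<p><input type="text" name="txtSearch" id="txtSearch" size="30" maxlength="80" title="Enter one or more keywords" /> <input type="submit" name="btnSearch" id="btnSearch" value="Search" title="Search the catalog for the keywords" /></p>
```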
Compare this with Listing 5, where the developer took explicit control of line length. Here the code is much easier to read because the developer used line breaks and indenting to visually organize the source code.
Listing 5: HTML Source Code with Explicit Line Breaks
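A sketch of explicitly controlled line breaks, using two hypothetical input elements (txtSearch and btnSearch are made-up names), might look like this:

```html
<p>
  <input type="text"
         name="txtSearch"
         id="txtSearch"
         size="30"
         maxlength="80"
         title="Enter one or more keywords" />
  <input type="submit"
         name="btnSearch"
         id="btnSearch"
         value="Search"
         title="Search the catalog for the keywords" />
</p>
```

With one attribute per line, no line comes close to the edge of the editor window, and the start and end of each element are easy to pick out.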
Keep the limitations of printed output in mind as well. Lines longer than 80 characters will often wrap in printed output without consideration for word breaks, making source code very difficult to read.
Standardize Character Case
Source code is easier to read if the developer has applied a consistent set of rules for the use of character case—for example, the use of lower case exclusively for HTML tags. When scanning source code, the reader can unconsciously apply a visual filter, focusing attention on the HTML keywords.
The approach taken in the code that appears in this article is to use all lowercase letters for HTML tags and the names of their attributes, while using mixed case and a modified form of Hungarian Notation for some attribute values (see the sidebar entitled "Hungarian Notation").
Hungarian Notation is a convention for naming identifiers that adds a prefix to the name to provide information about the type and scope of the identifier. Dr. Charles Simonyi, a Microsoft Chief Architect at the time, introduced Hungarian Notation in the early 1980s. Long an internal Microsoft standard, variants of the convention have been widely adopted outside of Microsoft as well.
As an example of a simplified Hungarian Notation scheme, variables that contain a string could be prefixed with the character s, and a variable with global scope could be indicated with a g prefix. In this case, then, the variables sTemp and gsName in source code would be immediately identifiable as string variables with local and global scope, respectively.
In general, HTML is not a typed language, and Hungarian Notation plays a more important role in other types of Web development. However, in some cases it can add to readability. For example, the names or IDs of form elements are likely candidates for a modified form of Hungarian Notation. The prefix "btn" or "cmd" might be used for an input button. Text boxes might be prefixed with "txt," and check boxes might be prefixed with "chk" or "cb."
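The sketch below shows how such prefixes can carry over into client-side script. All of the names (frmSignup, txtEmail, chkRemember, btnCheck, showSummary) are hypothetical, and the "b" prefix for a Boolean variable is an assumption that extends the "s" prefix described in the sidebar:

```html
<form action="#" method="post" id="frmSignup">
  <p>
    <input type="text" name="txtEmail" id="txtEmail" />
    <input type="checkbox" name="chkRemember" id="chkRemember" />
    <input type="button" name="btnCheck" id="btnCheck" value="Check"
           onclick="showSummary();" />
  </p>
</form>
<script type="text/javascript">
  // The element prefixes (txt, chk) show what kind of control each ID refers to;
  // the variable prefixes (s, b) show the type of value read from it.
  function showSummary() {
    var sEmail = document.getElementById("txtEmail").value;
    var bRemember = document.getElementById("chkRemember").checked;
    alert(sEmail + " / remember: " + bRemember);
  }
</script>
```

A reader scanning the script can tell from the names alone that txtEmail is a text box and chkRemember is a check box, without looking back at the form.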
Use Comments Judiciously
Good comments can be invaluable for understanding and maintaining code. However, the unique nature of HTML introduces a trade-off between the value of thorough comments and the efficiency of the Web application.
The Web server reads in the HTML code and sends it as a stream of text over the network to the browser. Only after arriving at the client does the browser parse and interpret the HTML code, displaying the visible elements and ignoring the comments. The obvious implication is that the comments add nothing to the document as the browser displays it, yet they add to the processing overhead on both the server and client computers, and they increase the amount of data transferred. With almost 50 percent comments, Listing 6 illustrates what is probably excessively commented code.
Listing 6: Heavily Commented HTML Code
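As a sketch of what over-commenting can look like, the hypothetical login form below explains every line, while the element names themselves (t1 and b1, made up for illustration) say nothing:

```html
<!-- Begin the login form -->
<form action="#" method="post">
  <!-- Paragraph that holds the user ID field -->
  <p>
    <!-- Text box for the user ID -->
    <input type="text" name="t1" id="t1" />
  </p>
  <!-- Paragraph that holds the submit button -->
  <p>
    <!-- Button that submits the form -->
    <input type="submit" name="b1" id="b1" value="Log In" />
  </p>
<!-- End the login form -->
</form>
```

Every comment here travels over the network to the browser, yet most of them only restate what the markup already shows.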
The trick is to find an appropriate level of commenting that balances these two issues. It is a good idea to comment the major logical flow and document sections to help readers quickly gain an overview of the code. Also comment dependencies and assumptions. Consistently following the other design and coding guidelines as suggested in this article—especially the ones related to naming and metadata—will help create self-documenting code.
Listing 7 illustrates how fewer comment lines and more descriptive element names can combine to provide effective documentation with a lot less overhead.
Listing 7: Lightly Commented HTML Code
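A sketch of the lighter approach, using a hypothetical login form with made-up names (frmLogin, txtUserId, btnLogIn), keeps a single comment for the one thing the markup cannot express:

```html
<!-- Login form: the server-side handler is assumed to read txtUserId -->
<form action="#" method="post" id="frmLogin">
  <p>
    <input type="text" name="txtUserId" id="txtUserId" />
  </p>
  <p>
    <input type="submit" name="btnLogIn" id="btnLogIn" value="Log In" />
  </p>
</form>
```

The descriptive names document the purpose of each element, and the remaining comment records a dependency, which is the kind of information a maintainer cannot recover from the code alone.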
Use Well-formed HTML
Avoid style attributes in HTML.
All nonempty elements must have corresponding closing tags.
Use lowercase names.
All nested elements must be properly nested.
Attribute values, even numeric ones, should be quoted.
Pick Good Names and ID Values
Use a consistent scheme for assigning the value of name and ID properties.
IDs must be unique within the document.
Use indentation consistently to enhance the readability of the code.
Standardize Character Case
Hungarian Notation is a convention for naming identifiers that adds a prefix to the name to provide information about the type and scope of the identifier, for example, txt for a text box.
Use Comments Judiciously