TidyWhite Program

Standardize White Space Layout

For HTML Files

Version 2019.11.26

Description Revised 2019-11-26

Introduction

I write HTML code using Notepad. (I am now using CryptPad, the Encrypting Editor Notepad Replacement.) I love Notepad (and CryptPad) because they are small, fast, and simple. Writing HTML in a plain text editor is practical because I keep the HTML code simple, following the principles of minimal design.

I follow some very simple rules for the layout of the white space in the HTML file, but because I am human and not a machine, I am not consistent.

The primary purpose of the TidyWhite program is to automatically format the white space in the HTML file to my standard layout. The primary value of this action is to make it easier to perform file comparisons. It also makes it easier to read, maintain, and update the HTML code.

I was struggling with one issue involving layout. In a file comparison, it is important that line length be limited to make it easy to spot changes. In Notepad, however, it is important that each paragraph be a single long line, because I depend on Notepad's Word Wrap to automatically break the lines at the width of the Notepad window. This keeps the text easy to read when I change the width of the window to suit the screen size and other factors. With TidyWhite, I can now select long lines for a file I am editing and short lines for a file I am comparing.

Installing the Program

I recommend installing the program in a TidyWhite directory under the Documents directory of an individual user. The Program Files directory can be used instead for an administrator install for all users. The program is completely stand-alone, so a simple copy is all that is needed to install it. You can carry it on a USB stick and use it anywhere.

I do not provide an install program because I want you to know for absolute certain exactly what is required to use the program. Uninstall is very simple. Erase the program! Setup programs mysteriously make unknown changes to your computer. I won't even change the registry for you. I provide the complete source so you can know for certain exactly what the program does. Compile it for yourself, if you are able. Examining the code is not easy but it is available, if it is a concern to you. By these methods, you can trust the software. There is so much out there that is not trustworthy. You don't have to trust me. Full disclosure is the best trust there is.

I provide an example .REG file which adds a new context menu entry for the TidyWhite program. This makes it easy to use. These settings have been used and tested on Windows 7 and Windows 10.

The TidyWhite program is compiled with the .NET framework version 3.5, which is included automatically with the Windows 7 OS. The program has only been well used and thoroughly tested with the Windows 7 and Windows 10 OS.

The program has only been used a little with the Windows 8 OS (Yuck!).

The program has not been used or tested with Windows Vista or XP. Unfortunately, the .NET framework version 3.5 is not included with Windows Vista or XP, so you will need to install it before using the program on either system. There are likely to be issues with the program. I don't use XP anymore so I won't be able to help you.

Using the Code

I am providing the complete Visual Studio 2015 Express for the Desktop C# project zip file with the executable and the complete source code for TidyWhite, for those who enjoy software code. You might be interested in adding your own special functionality.

TidyWhite.zip

I hope you find the source code instructive for picking up some ideas. I have learned a lot by examining the code of others. I hope examining my code will be useful to you.

Release Notes

A history of changes to the TidyWhite program.
  • Version 2019.11.26 unreleased
    • Preserve STYLE sections for those who use them.
    • Abandoned the idea of version numbers and just made the version the release date.
    • Converted the project to Visual Studio 2015 Express for the Desktop.
    • Added registry entries for HTML files to handle the Windows 10 changes, using a tricky OpenWithProgIds insert of IE.AssocFile.HTM instead of htmlfile.
    • Added a special quirk for </A> to make some differences easier to resolve in file compare programs. If there is no whitespace after the tag, the newline is placed at the next whitespace.
  • Version 1.1 released 2014-09-12:
    • Preserve ANSI characters for those who still use them.
    • Removed double space after <HR...> because it is not rendered with doublespace. I am torn with making it more visible or accurate. I also found that the rendering can be inconsistent. Sometimes it has no space before or after.
  • Version 1.0 released 2013-10-13:
    • Added double space for <HR...>.
  • Version 0.4 released 2013-05-05:
    • Removed newline for space after </A> which can cause dangling link underline.
  • Version 0.3 released 2013-04-30:
    • Fixed blank line at the beginning and end.
  • Version 0.2 released 2013-04-22:
    • Added SCRIPT formatting bypass.
    • Added additional HTML block and special tags.
    • Added an option to change 72 character line limit.
  • Version 0.1 released 2013-04-21:
    • Initial Release.

Using the Program

    The TidyWhite program reads the file supplied on the command line, reformats the white space, and writes the revised file over the top of the old file. This is safe to do because nothing is changed except the white space. If you are concerned about changing the original file it is a simple matter to only change a copy. This is exactly what I do while testing. I then use a file comparison program to quickly spot the differences and see whether I like what was done.

    A second argument on the command line changes the default 72 character line limit to a user-specified value. A non-numeric value becomes 0. A value of 0 bypasses the line limit.
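
    The handling of the second argument described above can be sketched as follows. This is a minimal Python illustration, not the program's C# source. The names parse_line_limit and DEFAULT_LIMIT are hypothetical, and treating a negative number as 0 is my assumption, since the text does not say what happens in that case.

```python
DEFAULT_LIMIT = 72  # default line limit stated in the text


def parse_line_limit(args):
    """Return the line limit from a list of command-line arguments.

    args[0] is the file name; args[1], if present, is the requested limit.
    """
    if len(args) < 2:
        return DEFAULT_LIMIT
    try:
        limit = int(args[1])
    except ValueError:
        return 0  # a non-numeric value becomes 0
    return max(limit, 0)  # assumption: negative values also mean "no limit"


print(parse_line_limit(["page.htm"]))          # 72
print(parse_line_limit(["page.htm", "100"]))   # 100
print(parse_line_limit(["page.htm", "wide"]))  # 0
```

    A returned value of 0 would then tell the line-breaking step to leave long lines alone.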

    These are the standard rules I use for tags and white space. I check only, and primarily use, HTML version 3 tags.

    1. Content within PRE, SCRIPT, and STYLE tags is never changed at all. Arguments to a tag are never changed.
    2. All whitespace is changed to a single space.
    3. All tags are uppercase.
    4. There are no spaces at the beginning and end of a line.
    5. There are no double spaces and no triple spaced lines.
    6. Non-text HTML tag elements appear on a line by themselves.
    7. Block elements which are usually rendered preceded or followed by empty line space are preceded or followed by a blank line. These are the only double spaced lines!
    8. Lines longer than the limit are broken at a space before the limit. If there is no space before the limit the line will be broken at the next space.
    9. There are no blank lines at the beginning or end of the file.

    I am tempted to place each sentence in a paragraph on a separate line but it would be too tricky to accurately identify sentences.
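
    Rules 2, 4, and 8 above can be sketched in a few lines. This is a hedged Python illustration under my own assumptions, not the program's C# source. The function names collapse_whitespace and break_line are hypothetical, the sketch ignores the PRE, SCRIPT, and STYLE exemption of rule 1, and it assumes that a line of exactly the limit length is acceptable.

```python
import re


def collapse_whitespace(text):
    """Rules 2 and 4: turn each whitespace run into one space, trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def break_line(line, limit=72):
    """Rule 8: break at the last space before the limit, else at the next space.

    Expects text that has already been through collapse_whitespace.
    A limit of 0 bypasses line breaking.
    """
    pieces = []
    while limit > 0 and len(line) > limit:
        cut = line.rfind(" ", 0, limit + 1)  # last space at or before the limit
        if cut <= 0:
            cut = line.find(" ", limit)      # none found: use the next space
        if cut == -1:
            break                            # no space left: keep the long rest
        pieces.append(line[:cut])
        line = line[cut + 1:]                # drop the space used as the break
    pieces.append(line)
    return pieces


text = collapse_whitespace("  aaa \t bbb\n ccc  ")
print(break_line(text, 7))  # ['aaa bbb', 'ccc']
```

    Breaking at the next space when no earlier one exists means that a long unbroken string, such as a URL, simply stays on one overlong line.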

    The rationale for some of these choices is to make the text easy to copy and paste entire lines without being required to copy HTML tags. The double spacing is for the lines to look similar to the way they are rendered, if you ignore the HTML tags.

    These are the HTML tags I process to place on a separate line with a note about HTML tags using double spaced lines or other considerations. I do not process optional tags. In my HTML code I always leave them out. In alphabetical order:

    • <!...> Newline after the tag.
    • </A> Newline after the tag at the next whitespace. Newline before the tag only if there is whitespace before the tag. Therefore, this option is controlled by existing formatting. An improper newline would cause an embarrassing dangling link underline unless intended.
    • <BLOCKQUOTE...> Newline doublespace before the tag.
    • </BLOCKQUOTE> Newline doublespace after the tag.
    • <BODY...> Newline after the tag.
    • <BR> Newline after the tag only, with any space before the tag removed.
    • <DL...> Newline doublespace before the tag.
    • </DL> Newline doublespace after the tag.
    • <DT...> Newline after the tag.
    • <DD...> Newline after the tag.
    • <FORM...> Newline doublespace after the tag.
    • </FORM> Newline doublespace after the tag.
    • <H1...> Newline doublespace before the tag.
    • <H2...> Newline doublespace before the tag.
    • <H3...> Newline doublespace before the tag.
    • <H4...> Newline doublespace before the tag.
    • <H5...> Newline doublespace before the tag.
    • <H6...> Newline doublespace before the tag.
    • </H1> Newline doublespace after the tag.
    • </H2> Newline doublespace after the tag.
    • </H3> Newline doublespace after the tag.
    • </H4> Newline doublespace after the tag.
    • </H5> Newline doublespace after the tag.
    • </H6> Newline doublespace after the tag.
    • <HR...> Newline doublespace before the tag.
    • <IMG...> Newline after the tag.
    • <INPUT...> Newline after the tag.
    • <LI...> Newline after the tag.
    • <META...> Newline after the tag.
    • <OL...> Newline doublespace before the tag.
    • </OL> Newline doublespace after the tag.
    • <P...> Newline doublespace before the tag.
    • <SELECT...> Newline after the tag.
    • <SCRIPT...> Newline doublespace after the tag.
    • <TABLE...> Newline after the tag.
    • </TABLE> Newline after the tag.
    • <TEXTAREA...> Newline after the tag.
    • <TITLE...> Newline doublespace before the tag.
    • </TITLE> Newline doublespace after the tag.
    • <TD...> Newline after the tag.
    • <TH...> Newline after the tag.
    • <TR...> Newline after the tag.
    • <UL...> Newline doublespace before the tag.
    • </UL> Newline doublespace after the tag.
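
    A list like the one above lends itself to a table-driven approach. The following is a minimal Python sketch of that idea, not the program's C# source. The names RULES, TAG, and layout are hypothetical, only a handful of entries from the list are included, and the sketch applies just the literal "before" and "after" entries. It assumes the input has already had its whitespace collapsed, it does not handle the special cases for </A> and <BR>, and its simple tag pattern would be confused by a ">" inside a quoted argument.

```python
import re

# (position, newlines): 1 = plain newline, 2 = "doublespace" (one blank line).
RULES = {
    "BODY": ("after", 1),
    "H1": ("before", 2),
    "/H1": ("after", 2),
    "LI": ("after", 1),
    "P": ("before", 2),
    "UL": ("before", 2),
    "/UL": ("after", 2),
}

TAG = re.compile(r"<(/?[A-Za-z][A-Za-z0-9]*)[^>]*>")


def layout(html):
    """Insert line breaks around tags according to RULES."""
    out = []     # pieces of output, joined at the end
    pending = 0  # newlines owed before the next piece of output

    def emit(piece):
        nonlocal pending
        if pending or not out:
            piece = piece.lstrip(" ")      # no spaces at the start of a line
        if not piece:
            return
        if pending and out:
            out[-1] = out[-1].rstrip(" ")  # no spaces at the end of a line
            out.append("\n" * pending)
        pending = 0                        # also drops breaks at the file start
        out.append(piece)

    pos = 0
    for match in TAG.finditer(html):
        emit(html[pos:match.start()])      # text before the tag
        where, count = RULES.get(match.group(1).upper(), (None, 0))
        if where == "before":
            pending = max(pending, count)
        emit(match.group(0))               # the tag itself, arguments untouched
        if where == "after":
            pending = max(pending, count)
        pos = match.end()
    emit(html[pos:])                       # trailing text, if any
    return "".join(out)


print(layout("<H1>Title</H1><P>One two<UL><LI>item</UL>"))
```

    Taking the larger of two adjacent requests, rather than adding them, keeps a closing tag followed directly by an opening block tag from producing triple spaced lines.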