File sizes

Ross Woods, 2011

Larger files take longer to download from the Internet, and the same information can produce files of very different sizes, depending on the file format.

To test this, I saved two separate .doc files in each of the following ways. The first file was much shorter than the second.

1 .doc (Word) file from MS Word 97.
2 html file saved directly from MS Word 97.
3 html file from MS Word 97, then slimmed down with reduced font tags, etc.
4 html file from MS Word 97, then slimmed down with reduced font tags, etc., as well as a minimal head and no end-of-line returns. (The whole file is all one line.)
5 pdf file made from 1.
6 pdf file made from 2.
7 pdf file made from 3.
8 File 1 opened in Word 2000 and saved again.
9 File 8 saved as Word 2000 html.
10 File 9 saved as pdf.
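The article does not say how the slimming in steps 3 and 4 was done, and it may well have been by hand. As a rough illustration only, the Python sketch below strips font tags and squeezes line breaks and runs of white space into single spaces. The function name slim_html and the sample markup are made up for this example, and a regular expression is a crude tool here: it would also flatten any preformatted text.

```python
import re

def slim_html(html: str) -> str:
    """Crudely slim Word-style html: drop font tags, put everything on one line."""
    # Remove opening and closing font tags, keeping the text between them.
    html = re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Collapse line breaks and runs of white space into a single space.
    html = re.sub(r"\s+", " ", html)
    return html.strip()

# Hypothetical sample in the style of a Word html export.
sample = (
    '<P><FONT FACE="Arial" SIZE=2>File sizes</FONT></P>\r\n'
    '<P><FONT FACE="Arial" SIZE=2>Larger files\r\n'
    '   are slower.</FONT></P>'
)
slimmed = slim_html(sample)
print(len(sample), "->", len(slimmed), "characters")
print(slimmed)
```

Most of the saving in a sample like this comes from dropping the font tags; the line breaks account for only a few characters.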

The results are as follows, in kilobytes (KB):

                  File 1   File 2
 1  Word 97           21      388
 2  html               3      189
 3  html               2      145
 4  html               2      142
 5  pdf               95      412
 6  pdf               64      359
 7  pdf               64      354
 8  Word 2000         20      424
 9  html              11    1,041
10  pdf              108      925
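To make the figures easier to compare, each size can be expressed as a percentage of the original Word 97 file. The short Python sketch below does that arithmetic on the numbers from the table; the names results and relative_sizes are only illustrative.

```python
# Sizes in KB copied from the results table: (label, file 1, file 2).
results = [
    ("1 Word 97", 21, 388),
    ("2 html", 3, 189),
    ("3 html", 2, 145),
    ("4 html", 2, 142),
    ("5 pdf", 95, 412),
    ("6 pdf", 64, 359),
    ("7 pdf", 64, 354),
    ("8 Word 2000", 20, 424),
    ("9 html", 11, 1041),
    ("10 pdf", 108, 925),
]

def relative_sizes(rows):
    """Return each row's sizes as a percentage of the first row (the Word 97 .doc)."""
    _, base1, base2 = rows[0]
    return [
        (label, round(100 * size1 / base1), round(100 * size2 / base2))
        for label, size1, size2 in rows
    ]

for label, pct1, pct2 in relative_sizes(results):
    print(f"{label:<14}{pct1:>6}%{pct2:>6}%")
```

On these figures, the html saved straight from Word 97 is roughly a seventh of the .doc size for the short file and about half for the long one.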

Lessons to learn

  1. The size of a pdf file is fairly unpredictable, and it can be bigger than the .doc file it was made from.
  2. .doc and pdf files are very inefficient compared to basic html code (no surprise).
  3. Slimming the html file down further still makes some difference, but not that much.
  4. Shrinking a skinny html file even further to "all one line" makes hardly any difference.
  5. Pdf files made from skinny html files are somewhat smaller than those made from .doc files, but the saving varies from file to file.
  6. Word 2000 is clunky: its html files came out several times larger than those from Word 97.
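Anyone who wants to repeat the comparison on their own files could measure the sizes with a few lines of Python. This is a minimal sketch, not the method used above: the function names and the file names in the final comment are hypothetical, and it assumes 1 KB means 1024 bytes, rounded up.

```python
import os

def size_in_kb(path: str) -> int:
    """Return the file's size in whole kilobytes (1 KB = 1024 bytes), rounded up."""
    size_bytes = os.path.getsize(path)
    return -(-size_bytes // 1024)  # ceiling division

def compare(paths):
    """Print each file's size in KB, smallest first."""
    for kb, path in sorted((size_in_kb(p), p) for p in paths):
        print(f"{kb:>6} KB  {path}")

# Example call with hypothetical file names:
# compare(["report.doc", "report.html", "report.pdf"])
```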