Commentary By Ron Beasley
I'm 66 and still remember pre-WYSIWYG word processors. The first one I used was called Manuscript. To do italics or bold you pressed Alt plus something or other, and you couldn't see whether it had worked until you printed the page. The purpose of the word processor was to make a printed document. At Salon, Tom Scossa points out what all of us who produce cyber documents already knew: word processors suck.
For most people now, though, publishing means putting things on the Web. Desktop publishing has given way to laptop or smartphone publishing. And Microsoft Word is an atrocious tool for Web writing. Its document-formatting mission means that every piece of text it creates is thickly wrapped in metadata, layer on layer of invisible, unnecessary instructions about how the words should look on paper. I just went into Word and created a file that read, to the naked eye, as follows:
the Word
Then I copy-pasted that text into a website that revealed the hidden code my document was carrying. Here's a snippet:
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves/>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
And it goes on:
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
And on:
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
The whole sprawling thing runs to 16,224 characters. When I dumped it back into Word, it was an eight-page document.
When you copy and paste a Word document into a blog post, all of this HTML-like code raises hell. As site administrator of Newshoggers I cleaned up several blog posts before I finally succeeded in convincing my fellow bloggers not to do that anymore. I use OpenOffice, but it's not much better than Word. If I'm doing a long post I will write it in OpenOffice, but when I'm finished I copy it into a text editor to eliminate the formatting code, then copy that into my blog editor and do the HTML formatting there.
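For anyone who would rather script that text-editor step, here is a minimal Python sketch of the same idea. The names `strip_word_markup` and `_TextOnly` are made up for illustration and are not part of any blogging tool. It assumes the paste carries Word's `<!--[if ...]> ... <![endif]-->` blocks like the snippet quoted above, and, like the text-editor trick, it throws away all formatting, wanted or not.

```python
import re
from html.parser import HTMLParser

# Word wraps its settings in conditional-comment blocks such as
# <!--[if gte mso 9]> ... <![endif]-->; drop each block whole.
CONDITIONAL_COMMENT = re.compile(
    r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE
)


class _TextOnly(HTMLParser):
    """Collects text nodes, skipping anything inside <style>, <script>, or <xml>."""

    SKIPPED = {"style", "script", "xml"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_word_markup(pasted: str) -> str:
    """Return only the visible text of a pasted fragment, whitespace collapsed."""
    without_blocks = CONDITIONAL_COMMENT.sub("", pasted)
    parser = _TextOnly()
    parser.feed(without_blocks)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = (
        "<!--[if gte mso 9]><xml>\n<w:WordDocument>\n"
        "<w:View>Normal</w:View>\n</w:WordDocument>\n</xml><![endif]-->\n"
        '<p class="MsoNormal"><span style="font-family:Calibri">the Word</span></p>'
    )
    print(strip_word_markup(sample))
```

Because the result is bare text, paragraph breaks are lost along with everything else, so bold, italics, and links still have to be redone in the blog editor.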
Not unlike newspapers, word processors like Word have not kept up with the real world.
Or you could try to save the document as HTML...
I save as .txt to scrape off the formatting. I haven't done a lot of checking, but haven't had any problems.
Brucie
The HTML created by Word is still incompatible with blogging platforms.