File Format Wars

My awareness of this issue started in May of 2005. An organization called OASIS (the Organization for the Advancement of Structured Information Standards) announced after two years of work that it had created and adopted a new office file format, called OpenDocument Format (ODF). Within that umbrella, they set up a word processing format, a spreadsheet format, a presentations format, a graphics format, a chart format, and various others. The specifications for the formats were made freely available so that any software program would be free to use them. The intention was to create a format that wasn't tied to any single program, much like the JPEG file format for pictures (which can be opened by almost any program that can open pictures).

An ODF file is essentially a zipped collection of XML files. Because XML is plain text, it is easy to view the information, even if the file gets corrupted. The fact that it is zipped means it uses up less storage space and is faster to transfer. (OK, you probably could have figured that out on your own.)
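If you want to see the "zipped collection of XML files" idea for yourself, here is a minimal Python sketch. It does not produce a real ODF document: the package it builds is a stand-in, the XML is heavily simplified, and the function names are made up for this illustration. The member names content.xml and meta.xml are borrowed from names ODF packages commonly use, but nothing here should be read as the actual ODF layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_package(paragraphs: list[str]) -> bytes:
    """Build a tiny zip archive that mimics a zipped-XML document.

    Assumption: this is an illustrative stand-in, NOT a valid ODF file.
    """
    body = "".join(f"<p>{text}</p>" for text in paragraphs)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", f"<document>{body}</document>")
        archive.writestr("meta.xml", "<meta><title>Sample</title></meta>")
    return buffer.getvalue()


def list_xml_parts(package: bytes) -> list[str]:
    """Return the names of the XML files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(".xml"))


def read_paragraphs(package: bytes, part: str = "content.xml") -> list[str]:
    """Unzip one XML part and pull out the text of each <p> element."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        root = ET.fromstring(archive.read(part))
    return [p.text or "" for p in root.iter("p")]


def raw_xml_size(package: bytes) -> int:
    """Total uncompressed size, in bytes, of everything in the package."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return sum(info.file_size for info in archive.infolist())


if __name__ == "__main__":
    package = build_sample_package(["Hello, world"] * 200)
    print(list_xml_parts(package))
    print(read_paragraphs(package)[0])
    print(len(package), "bytes zipped vs.", raw_xml_size(package), "bytes raw")
```

The same approach works on a real file: open it with zipfile.ZipFile and call namelist() to see what is inside. Repetitive XML like the sample above compresses very well, which is where the storage savings come from.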

When I first heard about the format, I also heard about OASIS's intention to send the format to the International Organization for Standardization (ISO) for official adoption as an ISO standard.

About a month after the creation of the ODF was announced, Microsoft announced that they were working on a new file format of their own. This file format, called OpenXML, or sometimes Office OpenXML or OOXML, is also a collection of XML files compressed into one file. It was also announced at that time that OpenXML would be the file format used in Microsoft Office 2007. The new file format would be indicated by adding an “x” at the end of the familiar file extensions (.docx, .xlsx, and so on). You may have already seen these formats around.

The ODF and the OpenXML sound similar, but they are not compatible.

Later that year (I think it was around October; I've slept since 2005, and I'm working mostly off memory), OASIS submitted the ODF to ISO. It went through the full ISO review process, and around May of last year, ISO announced that ODF was officially approved as ISO/IEC 26300 (that I had to look up), with the publication of the standard occurring in November.

In the meantime, Microsoft completed work on the OpenXML and submitted it to Ecma International, a standards organization in Europe, in December 2005. Ecma gave their approval in December of last year, and Microsoft turned around and submitted it to ISO for their approval. That hasn't been decided yet.

However, the standard for OpenXML isn't quite as open as the standard for ODF. While anyone is free to use ODF because of how it is licensed, there are some legal questions about whether “just anyone” can use OpenXML because of its license. In an attempt to ease the issue, Microsoft has pledged that they will not sue anyone who uses the OpenXML standard.

I should add that Microsoft Office 2007 cannot open ODF files, and to my knowledge, the only program right now that can open OpenXML files is Microsoft Office 2007. However, some plug-ins are being developed on both sides that will help with exchanging files between the two.

In the meantime, there are several programs that are using ODF. OpenOffice.org and StarOffice were first, but AbiWord, KOffice, Google Docs and Spreadsheets, and several other programs have followed.

So, why am I boring you with all this (other than the fact that a change is coming, whether you like it or not)? Simple: all of that just sets the battle lines. There is more at stake here.

In 2005, Peter Quinn, the Chief Information Officer in Massachusetts, was looking at all of the various government agencies in the state. Most of them were using Microsoft Office, although quite a few were using WordPerfect Office. Consequently, transferring files from one department to another caused some headaches. In addition, some older files were getting harder to open properly, since the file formats frequently changed. He became increasingly concerned about the long-term readability of the state's files 10, 20, or 50 years in the future.

Then he heard about ODF, and he set the ball rolling for a change. Starting Jan. 1, 2007, all state agencies (about 50,000 computers) must make all documents available in at least one of two formats: the first is PDF (I shouldn't have to describe the benefits of that), and the second is OpenDocument Format. For various reasons, not only were the old WordPerfect and Microsoft Office file formats rejected, but so, too, was OpenXML (which, at the time, hadn't been released yet). Microsoft tried to get OpenXML added to the approved list, but they were unsuccessful.

To be honest, I haven't heard how that is going. There was a lot of controversy in the state because of the announcement, and Quinn even resigned over the issue (although that had more to do with having the international spotlight shining on him than it did with the actual transition). But as of about October last year, the switch-over was still on track. Surprisingly, I haven't heard anything about it since then.

But notice, this isn't about what software to use, only the format. The 50,000 government computers are free to use whatever software they want, provided they save the files in at least one of those two formats. But this is an issue because whoever gets one of those files from the state has to be able to read it, and that affects a lot more than just 50,000 computers.

In the meantime, the initiative is spreading. Texas, Minnesota, and California have all introduced legislation that would force similar changes for state document formats, although none of it has been approved yet. I have also heard rumors about other states, but I haven't seen anything definite yet. So far, Indiana is not one of the states I have heard about. (However, after a couple of other initiatives I have heard about brewing within the state government, it wouldn't surprise me at all.)

On the other side of the coin, the United States Department of Transportation and the Federal Aviation Administration have both announced that they are banning Microsoft Office 2007, as well as Microsoft Windows Vista, from their departments. No official word yet on what they will move to when the time to upgrade does come (unofficially, the "L" word was tossed out as a possible alternative, but I'm not going to go there because I don't want to extend this any longer than it has to be). The National Institute of Standards and Technology (NIST) has also banned Vista, but I haven't heard about Office 2007.

The “war” is just getting started and has a long way to go, and I have no idea what the result will be. But one way or another, whether Microsoft succeeds in promoting OpenXML or OpenDocument Format wins out, the old .doc, .xls, and .ppt file formats are going to be a thing of the past before too much longer. Consequently, it is worth keeping an eye on.