PDF/A in Amsterdam

Over the last few days I took part in the first PDF/A International Conference in Amsterdam, trying to get a better understanding of the facts around the topic.

Simply put, PDF/A is PDF 1.4 with some additional rules. It is also an ISO standard (ISO 19005-1).
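To make "PDF 1.4 with some additional rules" a bit more concrete: one of those rules is that a PDF/A file announces itself through `pdfaid:part` and `pdfaid:conformance` entries in its XMP metadata. Below is a minimal Python sketch that reads that claim. The function name `claimed_pdfa_level` is my own, and the sketch assumes the XMP packet is stored as plain UTF-8 text in the file. It only reports what the file claims; it does not validate compliance.

```python
import re
from typing import Optional

# XMP can serialize the values either as elements (<pdfaid:part>1</pdfaid:part>)
# or as attributes (pdfaid:part="1"); these patterns accept both forms.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def claimed_pdfa_level(data: bytes) -> Optional[str]:
    """Return the claimed PDF/A level (e.g. '1b'), or None if there is no claim.

    Heuristic only: assumes the XMP packet is uncompressed UTF-8 text.
    """
    if not data.startswith(b"%PDF-"):
        return None  # not a PDF at all
    part = _PART.search(data)
    if part is None:
        return None  # an ordinary PDF without a PDF/A marker
    conf = _CONF.search(data)
    level = conf.group(1).decode("ascii").lower() if conf else ""
    return part.group(1).decode("ascii") + level
```

A real compliance check needs a dedicated validator; a file can carry the marker and still break the other rules.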

For those of you wondering why we need yet another file format (one that looks like a branch of good old PDF): PDF/A aims to be the format in which documents are stored for long-term archiving.

The idea is excellent for various reasons, and the PDF/A originators (not necessarily Adobe) are not the only ones who thought of it. Microsoft is also trying to jump on the bandwagon with XPS, which was not designed as an archiving format, but they seem to think it is useful for this as well.

The need is there, as organizations are tired of having to deal with old file formats every time they dig deep into their electronic archives. And we need to keep in mind that electronic archives are not very old yet. As a fun fact, in the opening keynote Thomas Zellman showed the audience a 5 inch floppy disk. I think that was an excellent way of reminding everyone that many of the things (think content here) we create today will need to be usable a long time from now. And 5 inch floppies are not even that old. Think 8 inch floppies and punch cards.

Therefore, archivists all over the globe are trying to work out how to reinvent their job of storing and managing paper so that it brings electronic content along (yes, "revelation": paper will not disappear). If you have worked with archivists you will know that this job is highly conservatory (couldn’t help the wording 😉 ). It’s in their nature not to change things, and most of them would rather not tackle anything but paper at all.

How do you address this? Make it a standard! “It’s ISO so it’s good”. At least that is easier for the archive world to swallow. Second, by deriving it from the ubiquitous PDF you get a file format that a lot of software can read and that other software can generate easily.

Of course, there are rules to follow if you want to be compliant. Read all about them on the www.pdfa.org website; I’m not going into details here.

How is this relevant to the Content Management area?

First of all, it’s relevant to my PhD thesis, since the objective of PDF/A is to be self-contained (content and metadata), which is how I store the objects in my great repository (wink).

Idea coming through: how about defining a storage area inside an ECM system so that everything you put there is transparently stored or converted by the CM system as PDF/A, including all its metadata?
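Here is a rough Python sketch of what I have in mind. Everything in it is hypothetical: `ArchiveArea`, `put`, `get` and the `Converter` signature are names I made up for illustration, not the API of any real ECM product, and the actual conversion to PDF/A is left to whatever tool you plug in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical converter: takes the source bytes plus the object's metadata
# and returns PDF/A bytes with that metadata embedded.
Converter = Callable[[bytes, Dict[str, str]], bytes]


@dataclass
class ArchiveArea:
    """A storage area that only ever holds the converted form of an object."""

    convert: Converter
    _store: Dict[str, bytes] = field(default_factory=dict)

    def put(self, object_id: str, content: bytes, metadata: Dict[str, str]) -> None:
        # The caller hands over the original; conversion happens behind the scenes.
        self._store[object_id] = self.convert(content, metadata)

    def get(self, object_id: str) -> bytes:
        # What comes back is the self-contained archived object, never the original.
        return self._store[object_id]
```

The point of the design is that the caller never asks for a conversion; putting an object into the area is the request.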

Of course, there are some issues to ponder, but I think this sounds good. The file format needs to evolve a bit to allow more content types to be included (think 3D, multimedia) and to offer more than a primitive implementation of digital signatures and metadata. But the scene is set.

Speaking of evolution: sadly (?) enough, any new version of PDF/A has to go through the ISO process, so I guess we can expect version 2.0 around 2010 (some speakers at the conference felt the same way).

I’ll stop for now. A lot of interesting things were discussed at the conference, and there were plenty of case studies and very interesting people to meet or catch up with over a beer.

I cannot help but add one more thought: Is IT Fashion? Rory Staunton thinks so.

