Speaking of Privacy, Consider Paradata

Over the past two years I have become a one-woman band advocating for the need to distinguish between the terms metadata and paradata. Granted, most of this advocacy has taken the form of headdesking and ranting on my Twitter account, but a woman’s gotta do what a woman’s gotta do. And that’s writing a post.

If you go here and here, you’ll see I’ve written about this before, but in a very unstructured way. So now I’d like to make the argument for the need to distinguish between metadata and paradata, and explain why I find it so very important. Some of it is based on the many blog posts and semi-professional and professional publications I’ve read; some of it is simply the way I’ve learned to think about it.

I would like to suggest that metadata is bibliographic information about an item, and that paradata is information that describes the relationship of a person or persons to an item. For example, a book’s title and date of publication are metadata; information about when a specific individual borrowed that book, and from where, is paradata. You can think of it as user information. As with the distinction between primary and secondary sources, though not identically, paradata can at times be metadata and metadata can be paradata: for example, a list of books borrowed by a specific person can be seen as metadata when it’s important enough, say, when that person is a historical figure, the subject of a study, or a suspect in a crime. The question is who decides when paradata is important enough to become metadata, and why. I think a good place to start is by thinking about what that data is to begin with (a book? a bus ticket? a phone conversation? code? makeup?), who the individual in question is (a member of a minority group? a celebrity? a politician? a cop, or a soldier? a minor? a deceased person?), who will be able to see and access the information in the short and long term, and what purpose is served by exposing this information when the person in question does not wish for it to become public or be seen by others.

That’s where it becomes an ethical question, and that is why I think it is important to distinguish between the two. Because it is not merely a technical or philosophical debate for librarians and related service providers. It’s an ethical issue, whether in the form of the Patriot Act or a picture of a person that was posted online and tagged without their consent. It’s a question of who gets to gather, produce, use, and expose information about people, and under what circumstances. At times, it can be a question of life and death. At almost all times, it is a question of quality of life, of how we choose to see other human beings, and of the reality and society we wish to live in.

At this point some of you may wonder why I don’t use the word privacy to describe metadata/paradata, especially after this post. After all, it seems that paradata is a sort of “private” metadata, if all metadata is potentially paradata and vice versa. The reason is that privacy is a subjective term that varies among different persons, locations, statuses, nationalities, times, and spaces. What is private for some is public for others. Metadata and paradata, however, draw a clearer distinction: information about a person’s actions and interactions on the one hand, and their list of publications on the other.

To put it plainly, we are not in Kansas anymore: the dichotomy of private and public is less relevant when I can be online and public while inside my locked house, and when I can text an intimate and private message on a crowded bus on my way to work. Public and private use of items and expression of opinions is a false dichotomy because human beings can rarely be reduced to dichotomies and stereotypes, despite the very best efforts of mainstream media.* In my humble opinion, this debate is not really about the private and the public: it is about just how much we trust our governments and our own judgement when it comes to other human beings. Are we/they all suspects, data to be gathered?

So when it comes to technology and privacy, I strongly suggest that we start using the word “paradata” more often. Ethical and political discussion aside, since I hear this “internet of things” thing is on its way, it might be useful.

*Another argument against the word “privacy” in relation to metadata is that the concept of privacy, as still seen by too many people, is drawn from the Victorian era, when “private” began and ended at the physical boundaries of the house. Among many other dangerous dichotomies, if home was “safe”, outside was “dangerous”. And home is not always safe, especially not for marginalized or powerless people.

