As We May Automate

In the outside world, all forms of intelligence whether of sound or sight, have been reduced to the form of varying currents in an electric circuit in order that they may be transmitted. Inside the human frame exactly the same sort of process occurs. Must we always transform to mechanical movements in order to proceed from one electrical phenomenon to another? It is a suggestive thought, but it hardly warrants prediction without losing touch with reality and immediateness. (Vannevar Bush, “As We May Think.”)

Things throw themselves together but it’s not because of the sameness of elements, or the presence of a convincing totality. It’s because a composition encompasses not only what has been actualized but also the possibilities of plentitude and the threat of depletion. (Kathleen Stewart, “Weak Theory in an Unfinished World.”)

I’m a well-trained information retriever. I was raised in a family in which complete and coherent sentences and direct communication were a challenge, and I have been an information professional and a librarian for ten years (sans an MLIS, which was left incomplete due to illness). I know and understand how information retrieval works from various perspectives: the system/interface, the title/item in question, the human being, and the industry. I know how to get to that piece of information you are convinced you are looking for. Or I can make suggestions about the way you are looking for items or topics, explaining how the interface works and why you are not getting the search hits that you were hoping to get. I can tell you why you are required to pay for access to an article if you are not logged in with your library’s user account, and I will explain the academic publishing industry and open access, and then we will probably talk a bit about just how broken academia is. I know that you got used to Google, or grew up with it, and expect all information retrieval systems to behave in a similar manner – even if they are not search engines. I will explain to you, and to the next person, and to the person after the next person, the difference between search engines and databases. I will explain Boolean search and command line logic, if applicable. I know that after a while, you will look at me as a human-formed information retrieval system, and not as a human being whose job is to connect people with information in various forms and ways. I’m all prepped for it. Most times, I take it as a compliment. In this day and age, it is a bit like being a magician.

But being a data-context magician comes with a heavy cost in this day and age: seeing humanity and the understanding of complexity decrease. For there is a thing that is much harder to explain in an age where people expect to get a soundbite answer within seconds of starting their search: that putting together context and having the time to listen to other people’s contexts is what makes us humane, and that yes, it’s worth your time. That there are facts, and there are opinions, and there are world views; and that the same answer I give to two different people on the same question will most likely be interpreted in different manners. It is more than OK; it is the thing that makes our culture richer, more diverse, and stronger, and dare I say it – better. It is the kind of thing I advocate, wholeheartedly: reminding people that information retrieval is just the first step in creating and sustaining our beautiful and rich republic of ideas; that what you do with the search results later matters more than the way you acquired that piece of information.

And this is also where I am struggling within the information professional and hacktivist community when it comes to using the term “metadata” as “data about people’s behaviors when they’re interacting with ‘technology’, whatever that might be”. To me, this is a blunt reduction of humanity, of human behavior, and of the understanding of human-data behaviors, and whenever we cooperate with this kind of narrative, informed discussion about human beings, human lives, technology, and data just dies. Here is where I’d like to remind you, again, that acquiring data is just the first and possibly the least meaningful step in data interpretation. It is the mindset and the context behind the query that matter, as well as the question of what you do with the acquired information later, whether it is correcting cataloging errors, adding a new entry, planning a new UX, or deciding whether to launch a missile at a house based on prior intelligence. I hope that you will agree with me that one of these things is not like the others (hi there, Citizenfour).

Phone taps, names, locations, ISBNs and ISSNs (unique identifiers given to books, journals, and conference proceedings), and the activity of every person who is using a smartphone or a tablet are often described as “metadata”. This term is used simultaneously for discussing government surveillance, publishers’ catalogs, and the information that human beings share with one another through various “technological” means and on various occasions. As information professionals of various fields, we all know that naming has purpose and power, and cataloging has consequences. And specifically, as a metadata librarian within the professional community, I would like to ask: what is our understanding, and the public’s, of the term metadata? My concern is that when we refer to all data about anything as metadata, we are single-handedly diminishing the discussion about the meanings and purposes of interaction between data and human beings; for labeling every large-scale interaction between human beings and data with the term “metadata” (or “big data”) is an act of cataloging – in this case, a faulty one.

When catalogers assign a book the wrong entry, or enter a typo in the title or the ISBN, the immediate consequence is that the book will be harder to locate later in a simple browse or search. The same is true when librarians and knowledge professionals refer to all human-data interaction as “metadata”. A prime example of this lack of distinction, and of why it is dangerous professionally, is magnificently presented in Simon Spero’s “LCSH is to Thesaurus as Doorbell is to Mammal”.

When we label any tracking of data with the term “metadata”, I fear that we are making a graver error – a moral one. So here I would like to sharpen the differences between the different interactions of data and human beings in the professional mindset, and to offer an old-new term for the “metadata” that relates to human activity and interactivity with still objects: paradata. The aim is to differentiate the life cycles of information about human beings from those of information about still objects, whether in government surveillance and social network systems or in an inter-library loan system.

Measuring and displaying the life cycle of people and the life cycle of items can seem similar at times, descriptively. But normatively, data about people is different from data about objects. From the descriptive point of view of a sysadmin or a UX designer, both can move from one place to another, “meet” or “interact” with any number of people and other items, and get different titles, names, and editions over time. Yet, from an ethical and moral point of view, people and items are very different from one another. I would like to suggest that this means the information about persons, and the way that information is handled, should also be different. While research about an object and its purpose may change over time, and there might be disagreement about the value and purpose of said object, still objects cannot change or express different opinions over time. Still objects cannot exercise choice. They do not have will or agency. This is why human beings and still objects should have different life cycles, which should be recognized by assigning different meanings – and hopefully, a vocabulary, architecture, and philosophy – to the workflows and their life cycles within the technological terrain.

For the past three years I have been worried that subjecting both human beings and still objects to the same language, system design, or research method might have serious repercussions for the way we see and enable humanity and human interactions, growth, and choices in this technological world of ours. When we classify people with still frames of their actions, we do not allow them to change. We objectify them, de-humanize them, and turn them into metadata, into objects. If we treat human beings as representations of data related to said human beings, as nouns and not as verbs,[1] we are depriving human beings of their own humanity and agency. And we contribute to this misclassification with one word, one mindset. As a metadata librarian and a human, I strongly object to this process, language, and mindset.

Culturally speaking, a more serious implication of misrecognizing and misclassifying paradata is not merely failing to obtain the requested “item”, but misidentifying and generalizing an existing phenomenon of dehumanization in our post-information-revolution capitalist society. When said misidentification is popular or seen as conventional, the moral error described above can lead to grave consequences. From privacy concerns to state-sponsored executions, our professional community cannot continue to treat information about people the same way we treat information about objects.

How many of us hand-waved or didn’t pay attention to the ethical and moral aspects of treating paradata as metadata when creating systems, choosing ontologies, or simply browsing for information in a “quicker” way? How many of us were drawn to a beautiful and “accurate” design or workflow, forgetting the reality and the human beings who are behind the schema, who will be using the system, and who might be targeted by the system? And how many of us “chose” the cheaper solution and didn’t fight for the fine details of the patron’s privacy?

As Karen Coyle wrote, metadata is constructed, for a purpose, to facilitate an activity. So today I would like to ask you to stop and think about the activity that we, as a responsible professional community, are creating by misidentifying a phenomenon, and to start using the term paradata. For a freer, more accurate, and hopefully better future, it is vital that human beings and their industries and governments be able to tell the difference between these two terms. Until they do, none of us is truly free of de-humanization as a technology user, and any arrest or investigation based on what is publicly known as metadata can only be seen as random and arbitrary – and for now, that is mostly what it is.

[1] The metaphor of verbs and nouns was originally suggested by Dena Shunra on Twitter. In the future, another term might help to differentiate between human beings’ interactions with still objects and human beings’ interactions with other human beings using technological devices.
