A science writer will often post about an interesting or important scientific discovery and include a sentence like “The data is clear.” That’s a sure invitation for a pedant to show up and correct the grammar, insisting that “data” is a plural noun, so the phrase should be “The data are clear.” Normal people don’t talk like that, and even in print it just seems weird to the reader.
The controversy stems from whether data should be considered a countable or an uncountable noun. As an uncountable noun, it takes a singular verb, but historically it has been treated as the plural of the countable noun “datum,” which is Latin for a “thing given” (e.g., “There are 69 datums”).
When I spoke with Peter Sokolowski, a lexicographer for the Merriam-Webster Dictionary, he told me that data’s transition from its historical roots to its contemporary use is related to a lexical phenomenon called “semantic bleaching,” where a word’s original meaning is lost or diminished over time. One example of semantic bleaching is the contemporary use of the word “literally,” whose Latin root, littera, means “letter.” In the case of “data,” the meaning has shifted from “things given” to something like “a collection of information in aggregate” in everyday speech.
For those who write about science, when there is a choice between being literally correct and communicating ideas, communication is the priority. Some readers will be distracted by awkward word usage, while others will be distracted by the urge to post an immediate correction. Did those readers completely miss the point of the science post? After all, when you can’t see the forest through the trees, there’s no use rearranging deck chairs on a sinking ship. Read about the data controversy at Motherboard. -via Digg