The data and the metrics of civilization
The UK just had its last conventional census. From now on it will mine existing data to get its answers instead of relying on the 210-plus-year-old tradition of knocking on doors every ten years. Don’t expect the U.S. to follow suit anytime soon- we officially ditched the insane British measuring system back in 1975, but when was the last time you bought gas by the liter?
Someone told me last night that they were listed in the phone book, and I almost had to laugh. We had been looking for their number because their old neighbors were coming back to town, and the thought of looking in the white pages never entered anyone’s mind.
The first rule in computing is garbage in, garbage out. Yet so many of our metrics are based on census data, as are many of our government’s decisions. The data that we need is often there- but not readily available, or it’s overwhelming and the tools don’t exist to analyze it easily. The Internet has given us a treasure trove of data, but it’s not always in the right formats. The problem is very similar to the one “Esperanto” tried to conquer with the world’s many languages (which have much smaller data sets and should be easier to standardize)- and if you’ve heard of the “Semantic Web” and understand it at all- you know how complex this all is.
IBM, the company whose founder is said to have once thought there might be a market for a total of 5 computers worldwide, before it went on to give us the PCjr and legitimize Microsoft’s MS-DOS, has been making major moves into the “smarter city” as they call it (I wrote about it last October). Smart political campaigns run on databases, which slice and dice voters for targeted Get Out The Vote efforts and personalized issue tailoring. There is a whole industry of site-selection analytics that counts rooftops vs. retail- and still depends on data that’s up to 10 years old.
The number of databases that every one of us shows up in is mind-boggling. Between our utility and phone bills, our health records, DMV, tax records, employment records, banking, and then what Google collects, you have ceased to be a human being- you are just the sum of your data points.
Credit rating agencies have been instrumental in the financial collapse of the global economy, yet their business is based on analyzing this imperfect data too. It’s amazing anyone survived before the invention of the binary counting device that now rules over us from birth to death and beyond. Look at what’s in front of you- you aren’t reading this on paper, delivered by a human to your doorstep.
One of the people I’ve met via this site is a semantic data consultant who shared an interesting Open Source project in development that could give us all a better way to analyze our data and develop new strategies of opportunity. We’ve already got some prime players in this field in Dayton: LexisNexis is one of the pioneers in data-gathering and mining, Teradata in storage and retrieval, and even the much smaller Woolpert is a leader in GIS. Citizen Dan is in the pre-beta stage of pulling data from multiple sources and making it available to us.
From their site:
Citizen Dan is a free, open source system available to any community and its citizens to measure and track indicators of local well being (see further about). It is available for use now, but is also undergoing active development with support from a number of innovative cities.
Citizen Dan is an exemplar instance of Structured Dynamics’ open semantic framework (OSF), a generalized framework for deploying semantic platforms for any domain. By changing its guiding ontologies and source content and data, what appears for Citizen Dan can be adopted for virtually any subject area.
As configured, the Citizen Dan OSF instance is a:
- Appliance for filtering and analyzing data specific to local community indicators
- Means to visualize local data over time or by neighborhood
- Meeting place for the public to upload and share local data and information
- Web data portal that can be individually tailored by any local community
- Potential node in a global network of communities across which to compare indicators of community well-being.
via Citizen Dan | A Community Instance of the Open Semantic Framework.
And while the Dayton Development Coalition talks about being a leader in “Sensor Technology”– sensors just gather more data that fills databases which will later need data dissection.
With the latest census winding down, and the political redistricting battle seemingly in stasis, maybe it’s time to re-assess all we think we know from our data- and decide what we really gain from being omniscient. The more data we gather, the more we know, but for all its technological wow-factor- is it really improving our quality of life?
When the founding fathers talked about “life, liberty and the pursuit of happiness”- did they have metrics in mind to quantify what would make us happy? Was it enough to have representation when it came to taxation (the root issue of our first revolution)- have we better representation now with our arcane finance reporting laws and auctions that get passed off as elections? What is the “standard of living” and who should it apply to? Are all animals equal, but some more equal than others thanks to our imperfect data used as a foundation of governing? Or are we using the numbers for the right reasons?
The Luddite movement in England was a reaction to the mechanization of the textile industry. In calling someone a Luddite you are writing them off as an obsolete cog in society, yet maybe, as has proved true of many great thinkers, they were just ahead of their time? Maybe the metric we’ll discover when we have the grand unification database is that happiness is the direct opposite of data overload- ignorance truly is bliss, and for all of man’s desire to measure and manage our humanity- the true indicators of a successful civilization are much more basic: food, shelter, safety, health- all without an abundance of government trying to solve our problems.
when was the last time you bought gas by the liter?
In February. I paid with colones.
Thought you might find this relevant to this post: Ushahidi, a Kenyan crowd-sourced way to gather information about a crisis.