Every Citizen needs a Data Dossier

Schrier's Data Dossier


Governments collect a lot of data on citizens. Private companies like Google, Amazon and even Safeway collect even more. In fact, a thriving new business of data brokers has emerged. These are companies like Datalogix, which index, mash, cross-correlate, buy and sell our personal information.

On May 27 the Federal Trade Commission released its report “Data Brokers: a Call for Transparency and Accountability”. The report demonstrated the pervasiveness of the data brokering business. The brokers use billions of data points to build profiles, or dossiers, on every American. The data comes from both online and offline sources. Online sources include searches you make using Google or Bing, as well as things you buy from Amazon and other e-retailers. Offline sources include purchases you make with loyalty cards from companies such as grocery chains.

The “billions of data points” include a wide variety of information such as age, religion, interest in gambling and much more. Here is a list of 200 such fields. From this data the brokers make inferences and sort people into categories such as “bible lifestyle” or “rural everlasting” (older people with low net worth).

Americans are rightly concerned about the amount of data our governments collect on us. Government data collection is widely reported in the press. But private companies collect similarly vast amounts of information, a fact that is not widely reported. Examples:

  • License Plate Recognition. City police departments and other police forces collect large quantities of license plate scans, which include location and time-of-day information. For example, Seattle Police deployed 12 police units and collected about 7 million license-plate records in one year, identifying 426 stolen cars and 3,768 parking scofflaws. But most of those records capture ordinary citizens parking their cars in front of their houses. Private companies such as Digital Recognition Company, meanwhile, collect 70 million scans a month and have a database of 1.5 billion such scans. Such data is used to repossess vehicles when the owner defaults on a loan. At least police departments report to elected officials who can oversee and manage how the information is used. But who oversees the private scanners?
  • Facial images. The National Security Agency (NSA) collects millions of images each day, including about 55,000 of high enough quality for facial recognition. But Facebook alone has 1.23 billion monthly active users who post 300 million photos a day (2012 statistic). Facebook users willingly “tag” the photos, adding names to the faces. This has created one of the largest facial databases in the world. Such data could be used to automatically recognize people when they enter a restaurant or bar, or to display advertisements tailored to them as they walk down the street.
  • Drones. There is great weeping and gnashing of teeth over the potential use of unpiloted aerial vehicles by government agencies. The Seattle Police Department was so roundly criticized about potential drone use that the Mayor ordered the program ended. Seattle’s drones were given (“gifted”) to the City of Los Angeles, igniting a debate there. Obviously people are concerned about the video and other data such drones might collect. In the meantime, however, commercial uses of such technology are exploding, ranging from real estate to news media to farming and private photography.
  • Sensors. The Internet of Things is upon us. Sensors are being added to almost every conceivable device. Sensors on cars will be used to tax drivers for the number of miles they drive, partially replacing gas taxes. Sensors on cars are also already being used to track drivers who break laws or otherwise have poor driving habits, and those drivers may see their insurance costs increase. Fitness sensors track our activity. Refrigerators, furnaces, homes, even coffeemakers (“your coffee machine is watching you”) are getting sensors.

Who is collecting all this information?  What are they using it for?   What are we to do?

Perhaps we need to follow the example of the Fair Credit Reporting Act, which not only requires the credit reporting companies to provide reports to individual citizens, but also allows those citizens to challenge information found in the reports.

Perhaps we need a “Citizen Data Dossier” law and portal: a secure online site or vault where everyone could find the information collected about them by each data broker and each government agency. In addition, individuals could challenge the information, ask for it to be corrected or removed, and “opt out” of how their information is collected and used by the broker.

Biker-Hells-Angel-Type


Governments, of course, represent a somewhat different issue. Clearly convicted sex predators should not be allowed to “opt out” of government collection of their conviction data or have it removed from government records. But certainly those whose records contain false conviction data or other errors (e.g., an incorrect notice of a suspended driver’s license) should be allowed to correct that information.

One thing is for certain: once such dossiers are open to us, we will discover how much of our information is out there, and what private companies infer about us using it (“this guy is a Biker/Hell’s Angels type”). And I suspect we will be scared and upset.


Filed under big data, government, open data
