More on Omnipresent Anonymity

You may remember I wrote this essay on a sci-fi setting. Since then I haven't been able to get it out of my head, so I'm going to write more on it in a less sci-fi sense.

We publish loads and loads of information to the internet. Common practice is to shake your finger at anyone who posts personal information to the internet.

But it's just impossible to not post personal information to the internet and still participate. There is a fragile layer of anonymity between your anonymous accounts and you. This gets even more fragile if your anonymous accounts are used to communicate with real-life friends, such as a Facebook user who calls himself a random string of characters. An observer doesn't even have to read the communications: knowing that they happen is enough.

Most geeks promote privacy awareness. Go into the options, set everything to private. Hide hide hide!

There are three problems with this. 1) It only takes a few "open" nodes to de-anonymize ("dean") an entire network, and given the lack of incentive, people are likely to remain open. 2) Going "quiet" and hiding away your info cripples the functionality of a social network. 3) It's becoming ever easier to mine even tiny leaks of data. Whether a friend accidentally mentions your name, or someone backtracks your IP address, your anonymity is not guaranteed no matter how "hidden" you try to be.
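To make the first point concrete, here is a toy sketch in Python. It assumes an observer can see which open accounts a pseudonymous account talks to, and that those open accounts list their friends publicly. The function name, the friend lists, and the people in them are all invented for illustration; real de-anonymization uses far messier signals than a clean set intersection.

```python
def candidates_for(pseudonym_contacts, public_friend_lists):
    """Return the real names that appear in the public friend list of
    every open contact of a pseudonymous account (hypothetical helper)."""
    lists = [
        set(public_friend_lists[contact])
        for contact in pseudonym_contacts
        if contact in public_friend_lists  # only "open" nodes publish a list
    ]
    if not lists:
        return set()
    # Whoever is a friend of all the open contacts is a likely owner.
    # The contacts themselves are excluded as candidates.
    return set.intersection(*lists) - set(pseudonym_contacts)


# Invented data: three open accounts with public friend lists.
public_friend_lists = {
    "alice": {"bob", "carol", "jenny", "dave"},
    "bob": {"alice", "jenny", "erin"},
    "carol": {"alice", "jenny", "dave"},
}

# The observer only knows WHO the pseudonym talks to, not what is said.
contacts_of_pseudonym = ["alice", "bob", "carol"]

print(candidates_for(contacts_of_pseudonym, public_friend_lists))  # {'jenny'}
```

Three open friends are enough to narrow the pseudonym down to one person here, and nobody had to read a single message.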

As our social use of the internet grows ever denser, posting photos and video of yourself will become as unavoidable a part of life as email is today. It's theoretically possible that some of that might be private, but if even a fraction of it is public, a huge portion of those connected can be deaned. It's going to get worse and worse as our real social networks become increasingly reflected in our software social networks.

In my opinion, privacy advocates are really grasping at the last fragments of How Things Were. Right now, it's basically possible to keep hiding. I don't know for sure that all of my friends have various anonymous accounts, but they almost certainly do. I know I do. Right now, that's fine: there are very few programs automatically trawling the internet trying to make these connections. There aren't very many things actively trying to dean the internet.

But it's getting easier and easier, both in terms of available data to analyze and in terms of algorithms to do the analysis. It's important that privacy advocates realize that preserving anonymity from the user side is going to be basically impossible.

So, the problem needs to be turned on its head. The problem isn't "how do you keep users from posting private data", the problem is "how do you keep private data from being abused".

An analogy I liked is the one about public meatspaces. If you and a friend are chatting in public, you might talk about his medical condition. If a stranger then rushed up to try to sell him homeopathic lies targeted at his medical condition, nobody would think your friend was the weird one: the eavesdropping stranger would be. If the stranger did it a lot, the police would take him in for being a public nuisance.

Yet when I'm talking to you on the internet and we mention your medical condition, Google offers us homeopathic lies instantly and automatically, and probably logs that targeted ad for reuse later. Your medical insurer two years from now, if they find that conversation somehow, might deny you coverage based on the fact that you had a "pre-existing condition". There are already rumors of insurers canceling treatment based on social media, such as the depressed person whose treatment was discontinued because she was seen smiling in a picture on one of her friends' social network pages.

Even if it were possible, it's inhuman and dangerous to make us responsible for staying absolutely silent, absolutely hidden. Humans are social beings; the internet is a social medium. Being abused by corporations and repressed by governments is nothing new. We don't generally say, "well, you shouldn't have said that you disagreed with the government!" We generally say, "hey, the government shouldn't be persecuting you just because you said something!"

This is the same situation. Obviously, being force-fed stupefyingly bad ads is not the same thing as being rounded up and blacklisted in McCarthy's Big Red Book, but both stem from the same problem of people outside a conversation listening in and taking action based on what they hear, in conversations that were never intended to include them and certainly were never intended to be definitive, permanent, public statements.

So, there are two basic responses to this.

One is to revamp the underlying architecture to make everything incredibly privacy-aware by default. I think this is a good idea, but I also think it's pretty much impossible, since a social network is inclusive by nature. As social networking becomes the primary use of the internet, it seems unlikely that we're going to create a new network that specifically damages that function. Still, we could do better than we're doing right now.

The other option is to do what we always do: legislate it.

It can be legislated from two sides. One side, probably the easier side, is to restrict contact between entities and use of data gathered on-line. You are allowed to gather data. You can dean the network, find that XRSOXGAL8727 is Jenny Smith. But you aren't allowed to use that information for anything. No targeted ads. No redirecting links. Only when a user explicitly allows this sort of thing is it allowed. Steep fines if you break that rule.

There are a few problems with this. One is that spam is basically impossible to track down. Every day I get spammed with fairly well-targeted spam: my interests are pretty transparent, so spammers target me pretty specifically. A lot of this spam comes from anonymous sources, people I could never track down, people the government would have to spend a fair amount of time tracking down. Still, that's okay, because I'm not trying to stop anonymous spammers. I'm trying to stop Google.

Another problem is that it might be difficult to draw the line between someone who is actually trying to socially contact you and a spammer. For example, if you post about squids and I send you a link to an interesting squid article, I'm not trying to spam you, I honestly think you'll find it interesting. I think that line can be drawn fairly easily - number of entreaties per day per organization or somesuch - but it is an issue that has to be kept in mind.
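As a rough illustration of the "entreaties per day per organization" line, here is a minimal Python sketch. The class name, the threshold of 3, and the organization names are all assumptions made up for the example; where the real line should sit is exactly the open question.

```python
from collections import Counter
from datetime import date


class EntreatyLimiter:
    """Counts unsolicited contacts per organization per day (toy model)."""

    def __init__(self, max_per_day=3):  # the threshold is an arbitrary assumption
        self.max_per_day = max_per_day
        self._counts = Counter()

    def allow(self, organization, day):
        """Return True and record the contact if the organization is still
        under its daily limit; return False once the limit is reached."""
        key = (organization, day)
        if self._counts[key] >= self.max_per_day:
            return False
        self._counts[key] += 1
        return True


limiter = EntreatyLimiter(max_per_day=2)
today = date(2024, 1, 1)

print(limiter.allow("squid-fan", today))      # True: one friendly link
print(limiter.allow("MegaAdsCorp", today))    # True
print(limiter.allow("MegaAdsCorp", today))    # True
print(limiter.allow("MegaAdsCorp", today))    # False: over the daily limit
```

A person sending one squid article never gets near the limit, while an organization blasting messages hits it almost immediately. The count resets each day because the day is part of the key.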

Also, there is the problem of UNTARGETED spam. When I visit a science site, there are sciency ads. This is not because they know who I am, it's because it's a science site. However, I can foresee a lot of cases going to court that stumble over this distinction. Personally, I would prefer to make it illegal for them to spam me without my permission, targeted or not. But I understand that's an extreme view.

Anyhow, we also have another kind of legislative option: to make it illegal to dean data. To make it illegal to trawl the internet gathering data, tying it together, figuring out who is where, doing what. To make it illegal to watch you as you travel.

Now, this second option is the one I chose for my sci-fi setting, so I sat down to think. In my sci-fi setting, you can tell someone has deaned data because data has a structure, and high-context (non-anonymous) data has a very specific structure, even if it has been reduced to nothing but lists of foreign keys. Because the backbone has largely been replaced by a mesh network of freely available personal wireless routers, a lot of private individuals scan passing traffic for this pattern and report it to the cops if found. This can net them a tidy profit.

If a corporation is suspected of using illegally deaned data, they can also be investigated using a similar pattern-finding tool. This leaves their data private: the feds never see the actual data. This has led to a (so-far fruitless) search for a way to store high-context data without the signature pattern.

That future is a bit dystopian under the covers, and I don't think that method would work in reality. But then I got to thinking:

The problem exists because there is so much easily searchable information. However, searching that information itself generates more easily searchable information.

While the corporations are busily deaning your data, using your private messages to sell you useless crap, you could be busily deaning their data.

This isn't very possible on the current brand of internet. But what if the "next, more privacy-conscious" iteration of things actually tracks patterns of requests instead of simply tracking user connectivity? What if, by fundamental design, it was The Completely Transparent Thing? Instead of privacy, it has anti-privacy. Well, within any given persona: you could still have your anonymous account.

Now, if your anonymous account and your real account are both pinged, you'll know. You can track down the ping-er. Which may also be an anonymous account. It will be immediately obvious if it was chance, or if it was an auto-deaner, or if it was your would-be boss trying to find out if you have a juicy personal life. By attempting to dean you, they effectively dean themselves.
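Here is a small Python sketch of that check, assuming the hypothetical transparent network hands you a log of who requested which persona. The log format, the function name, and the account names are invented for illustration; no current network exposes anything like this.

```python
from collections import defaultdict


def cross_persona_visitors(request_log, my_personas):
    """Return the visitors who requested two or more of my personas,
    mapped to the set of personas they touched (hypothetical helper)."""
    seen = defaultdict(set)
    for visitor, target in request_log:
        if target in my_personas:
            seen[visitor].add(target)
    # A visitor who hit only one persona has not linked anything.
    return {visitor: hit for visitor, hit in seen.items() if len(hit) >= 2}


# Invented request log: (visitor, persona requested).
request_log = [
    ("old-friend", "jenny_smith"),
    ("auto-deaner-7", "jenny_smith"),
    ("auto-deaner-7", "XRSOXGAL8727"),
    ("random-reader", "XRSOXGAL8727"),
]
my_personas = {"jenny_smith", "XRSOXGAL8727"}

print(cross_persona_visitors(request_log, my_personas))
```

Only "auto-deaner-7" touched both the real and the anonymous persona, so it is the only visitor returned. Telling chance from intent would need more than this, such as how many other people's persona pairs the same visitor hit.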

This is not really a good solution, because that level of monitoring reveals way more than you want to have revealed. The time you visited a link blind and found yourself staring at something horrible. The fact that you checked so-and-so's page 80 times in one hour. And, of course, the muddy propagation problems where the SocialReader9 bot visits your page, reads the details, and then passes them on to others without you knowing.

I don't think there is a solution, really. Not with current technologies. But I was tickled by the "balance of power" option, where someone attempting to gather information on you incidentally exposes more information about themselves.

I wonder if it could be made to work somehow...

Anyhow, it's a problem that can't be addressed by simply telling people not to put their personal information on-line.

