Wednesday, February 22, 2012

Big Data = Big Abuses

I've had a rough time defending myself on G+: I think that targeted ads are bad, except when they are explicitly asked for. I also think that corporations need to be shackled, legally, from tracking their customers.

These are pretty severe opinions, and I caught a lot of flak from people who disagree. Which is fine. The arguments can get heated, but it's not like a political argument: there are points being made and disagreed with, rather than opinions simply being stuck to. However, since my opinion is apparently pretty far from the norm, it needs a fairly large amount of defense.

Let me start by establishing that big data != small data.

The first line of defense against my opinions is that companies have had all this data forever, they're just using it more effectively now, and that's fine.

That doesn't hold water for me. Having something you can't use and having something you can use are simply different categories. If you have an old musket your great-great-grandaddy used in the Revolutionary War and your neighbor has an AK-47 they use to shoot crows, are you going to argue those are the same situation?

Whether you are for or against gun control, you need to acknowledge that those are different categories of gun ownership and use.

It's the same with data: having a giant stack of receipts you use to file taxes as opposed to having a giant database you use to predict exactly what each individual customer, by name, will buy this week... those are fundamentally different. Whether you are for or against the right to do such a thing, you still have to acknowledge that they are fundamentally different categories of data possession and use.

Now let me argue that big data = big abuses.

Our current economic woes come from big data.

Let's start with the sub-prime mortgage crisis. This was a situation only made possible because of big data. The massive amount of storage, networking, and communication required to analyze, package, resell, and distribute mortgages is fundamentally big data.

If they had to shuffle actual papers around, it would have been impossible to bundle and resell these mortgages in quantity. Because of this, it would have been impossible to shift and seemingly diminish risk, and fewer such mortgages would have been approved. The crisis may have still happened to some extent, but it wouldn't have been a bubble popping, just a pinprick on ordinary economic skin.

It's probably still a tender topic, so I imagine some readers may take offense to what I just wrote. Instead of spending a day on that one topic, I'll just move on to other topics with the same fundamental situation: big data causing big abuses.

In my opinion, the largest big data abuse is algorithmic trading, specifically high-frequency microsecond transactions. Algorithmic trading has played a role in, and arguably been the cause of, several of the more recent stock market disasters.

The trading of stocks to take advantage of moment-to-moment price fluctuations is the way to make money on the market, as you'll see at those Wikipedia links. There is no law against it, and people have become great successes because of it. "Quants" are mathematicians turned billionaires.

So, why would you hate that? Why would you hate success? Are you some kind of horrible success-hating freak, you freak?

My reason is simple: it is success at the price of the market. "Zero sum" success. Their success does not come from making the world or the market better. It comes at the price of making it worse. Abusing it. On the best of days, it's a tax on people who are trading for real reasons related to the actual performance of the company in question. On the worst of days, it crashes the market entirely.

The argument can be made that this is effectively stealing. If a bully takes a geek's lunch money, we don't call that success. We call it stealing, even if all we see is the bully making a gesture and the long-suffering geek timidly handing over cash. Wow, sure looks voluntary, why are you complaining about it?

We can argue that what the bully does is illegal and what the quants do is legal, but let's leave aside the question of legal and talk about the question of benefit. The money HFT nets you... where does it come from? It comes from people who are participating in the market in the way the market was intended to work. It comes out of their pocket, without their permission. (It may also come from other HFT, but for the sake of argument, we'll consider that a closed pool of "HFT cash".)

How does that benefit the market?

It doesn't. Actual participants are made poorer for the sake of some people who have found a nice loophole. The loophole of big data.

With enough data, they can figure out what trades will happen when, at what prices, and hop on top of that. They can always be on top, skimming the cream off the milk everyone else is buying.

I hope that my point is clear: I'm considering everything from the perspective of a healthy market and economy. Rather than just hailing anyone with cash as a "success", I only hail those that actually improve things as a success. You can make as much money as you want, as long as you do so by offering better products or services. Otherwise, you're a burden on humanity and I want you gone.

I've used financial examples for big data, because that's the industry that's had big data the longest. However, there are other examples, such as the RIAA's persecution of its users based on data obtained from sifting through endless thousands of torrents and data transmission records. Once again, big data allowing for a big abuse, although the big abuse in this case was less devastating than crashing the world economy.

You can argue that not all big data automatically results in a big abuse. That's true: big data can have a positive result, especially in medical data, where it can lead to fairly serious breakthroughs as we discover the links between similar and dissimilar cases.

But the point is that you can't simply ignore big data. If there is data, it can be mined. For example, here is Target mining data. The data they are mining? Strictly the things you've purchased from them recently. Surely you can't fault them for that! It's barely even "big data"!

That's right. With what I would consider the smallest, sparest kind of big data, Target has figured out how to tell when your baby is due - without you even buying pregnancy-related goods from them, or telling them you're pregnant. This isn't proof that Target is evil, or even an example of evil big data. It is an example of the sorts of things you can tell from even a relatively small amount of data.

We can't treat data as if it is unimportant. We need to consider any data to be a danger, just like any gun is a danger, just like any moving vehicle is a danger, any poison is a danger. In any given case, the data may be secured and well-behaved, just as in most cases a gun, a car, and a jar of rat poison are secure and well-behaved. But, on the other hand, we consider it illegal to wave a gun around while asking for more money.

Now, big data cannot be opted out of.

A lot of people want you to "vote with your feet". If you don't like big data being used against you, stick to suppliers that don't use big data against you!

There are a lot of problems with that. The first is that it relies on you to know which places are going to use big data against you. Many of the big data applications are subtle enough that you won't notice.

The second is that opting out usually doesn't accomplish anything. For example, I don't use Facebook, but they still track me. Moreover, the fact that everyone I know uses Facebook extends their reach to me even if I don't use it - games released for Facebook only, posts only available to Facebook members, and so on. It is the same with Google, Steam, Microsoft, Apple - opting out of their services does nothing to actually stop them from affecting you.

The third problem with opting out is that you really can't. Big data is everywhere. What cell phone provider will you use that doesn't abuse their data? What internet provider? There are no innocent parties. The corporate culture doesn't allow for it.

You can argue that when abuses get bad enough, enough people will vote with their feet. I would say that's unlikely. Classically, corporations have been extremely good about maintaining a grip on their abused customers. Humans can get used to anything - for example, we've gotten used to high-frequency trading.

"Then what's the problem, if people aren't upset?"

... well, go back and read the essay again. The problem is that big data causes big abuses. What you don't fear can still hurt you. We weren't afraid of HFT, or sub-prime mortgages, or music distributors. They still caused us a lot of problems - even if we weren't participating!


Oh, right, targeted ads.

I think that any targeted ads you didn't ask for are just like high frequency trading. It's a situation where the company is data mining to figure out the sort of things you might be about to do, and then pressuring you to do them in its favor. You were about to do them anyway, the market was about to function fine, this is just them stepping in to take their cut because they know you.

I won't argue that it's evil, but I will argue that it is anticompetitive. It is very difficult for a newcomer to compete with that. It'll get more difficult as established megacorporations band together to share your data, forming pan-industry teams to lock down their dominance.

There are some situations where it could theoretically benefit you - for example, if they are having a sale on something and you happen to want that something, you could get it at a lower price. However, if they are truly someplace you want to shop, you should be giving them explicit permission to send you notifications on that sort of sale.

Otherwise, it's them using their market share to pressure you to spend more and to wall out their competition. I consider that bad for the market and bad for the consumer. I consider it almost identical to HFT: taking advantage of someone participating in a useful (but predictable) manner.

It's not at HFT scale yet, but it's the same fundamental kind of thing, to me.


Soyweiser said...

You know what is sad, there was a Shadowrun RPG supplement that basically predicted all these dirty tricks. The only thing they were wrong about was the time frame. It took 10 years for it to become business practice, not 60 years.

Craig Perko said...

Hey, was that the Shadowrun supplement that got all their computers seized by the FBI?

Fun times, fun times.