Wednesday, February 01, 2006

Copyright madness: a googol of crazy pills

Reuters reports that "Newspapers take aim at Google in copyright dispute":
The Paris-based World Association of Newspapers, whose members include dozens of national newspaper trade bodies, said it is exploring ways to "challenge the exploitation of content by search engines without fair compensation to copyright owners."
"The news aggregators are taking headlines, photos, sometimes the first three lines of an article -- it's for the courts to decide whether that's a copyright violation or not."

The campaign comes as a pending U.S. court case pits Agence France Presse against Google. AFP sued the company last year, alleging that Google News carries its photos, news headlines and stories without permission.
The World Association of Newspapers said it would seek a meeting with European Commission officials and look into whether the news aggregators are infringing on their copyrights or brands.

It's hard to know how to process a story like this. It's the kind of story so ludicrous that when I explain it to my non-techie friends, they often don't believe me.

See, if a news website, or any website, doesn't want a search engine, Google News or whoever, to copy or link to their content, there's a way to stop them, and it's a lot easier than going to court. Here's the code you need to keep Google News (and any other crawler that honors it) from ever, ever using your content:

# va a l'enfer, Google
User-agent: *
Disallow: /
That code goes in a text file called "robots.txt" which goes in the top level of your web folder, and (not right away, but after a few days' delay) Google will never post your stories again. It's the same way folks have been controlling what search engines see since the web was started. And it's enforced by US law, with the 1999 precedent Kelly v. Arriba, where a search engine that ignored it was forced to pay $350k to a photographer whose images they linked to in full size (even though the photographer didn't use robots.txt anyway).
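If you'd like to see what those three lines mean to a well-behaved crawler, here is a minimal sketch using Python's standard urllib.robotparser module. It is only an illustration of how a crawler that honors the file would read it, not a description of what Google actually runs. The helper name is_allowed, the crawler name ExampleBot, and the news.example.com URL are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The same three lines shown above, as a crawler would download them.
ROBOTS_TXT = """\
# va a l'enfer, Google
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch url.

    is_allowed is a hypothetical helper for this example only.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines; comment lines starting with '#' are skipped.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical story URL; "ExampleBot" stands in for any other crawler.
    story = "https://news.example.com/2006/02/01/story.html"
    for agent in ("Googlebot", "ExampleBot"):
        print(agent, "may fetch:", is_allowed(ROBOTS_TXT, agent, story))
```

Because the rule is written for "User-agent: *", it applies to every crawler that respects the file, not just Google's. A site with no rules at all (an empty robots.txt) is treated as open to everyone.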

To win a lawsuit you are supposed to demonstrate that you reasonably tried to stop the offending activity; failure to do so can be used to call into question the sincerity of your damages claim. What mystifies me is that Agence France Presse & co. never decided to merely stop the activity. Did their lawyers confuse them? Or did they, like some other news providers, find their number of visitors increasing thanks to Google News?

There are two reasons I can think of why the robots.txt solution might not be good enough for AFP:

  1. Not every search engine properly respects the robots.txt standard.
  2. AFP might want to be included in Google's regular search results, but not in Google News (because people use the first to go to sites, and the second to not have to go to sites).
But Google now does process robots.txt correctly (it had some early hiccups). And it's ridiculous to demand that a company apply varying standards to you according to your whim. Google may be the biggest search engine, but it's not a public utility and it's not a monopoly. And if you can sue Google for profiting by their links to you, which are essentially a collaborative effort because you can prevent them at will, why can't they sue you for profiting from the same links?