Amazon Finally Steps on Page-Scraped eBooks

Kudos to Amazon (though it’s far, far too late in coming) for finally pointing out the obvious:

eBooks that repackage freely accessible (but not public domain) web content are scams, and aren’t okay to sell.

There’s a specific kind of eBook fraud: “books quickly created from automatically gathered content crawled from the Web”. It’s taken a long time, but Amazon finally sent out an e-mail last night clarifying the content guidelines policy.

“…just because you find content on the web does not mean it is in the public domain. […] We can’t accept content that closely matches content that is freely available on the web, for which you do not hold the sole publishing rights, or that which is not in the public domain.”

Which is a good step. But hey, I wouldn’t be me if I didn’t point out the completely strange (and license-ignoring) justification for enforcing this policy:

“content from Wikipedia and content with private label rights are not allowed since it disappoints our customers to pay for content that is freely available on the web.” [emphasis mine]

It’s strange because they’re focusing on a really ambiguous standard. Not “is it legal” or “does it break the ToS/ToU for the website” or “does it violate copyright”… but whether the reader is pissed off because what they bought was cheaper elsewhere.

This is an important distinction – especially since they’re name-checking Wikipedia.

You can’t just copy something from, say, Clarkesworld, because it’s under copyright. (There’s a potential issue about “sole publishing rights” – for example, I retain anthology rights to the stories in The Crimson Pact, even when the exclusive period ends and the rest of the rights revert to the authors.)

Peter Watts’s backlist is under a Creative Commons license. Specifically, an Attribution-NonCommercial-ShareAlike license. So folks can take the text and convert it to whatever format they like, but are not allowed to sell it (the NonCommercial bit).

Wikipedia, however, is under (mostly) an Attribution-ShareAlike license. You can reuse the content in another work and sell it, as long as you credit the source and allow others to do the same with the file you created. But Amazon’s policy rules out exactly what Wikipedia’s own license permits, since nobody reselling that content holds the sole publishing rights.

I hope Amazon is using the “pissed off customer” standard simply because Wikipedia-scraped eBooks just… well, feel scammy.

Even if they’re technically legal.

3 thoughts on “Amazon Finally Steps on Page-Scraped eBooks”

SirNolen says:

May 24, 2012 at 13:28

I remember from when I was a Grad Assistant (6 years ago) that there are a number of websites and companies that can look for plagiarism by cross-checking student papers with content on the web. Quick and easy. I have to wonder why Amazon doesn't utilize one of these services, or create their own. I can only assume it's due to the expense, or possibly not wanting to slow down the e-publishing process.

Steven Saus says:

May 24, 2012 at 14:14

Yeah, like TurnItIn. But that would turn up a LOT of matches on magazines that have free web content but sell eBook editions (or projects like my eBook guide or Kris Rusch's "Freelancer Survival Guide"). Those false positives would cause a real problem.

Additionally, it would suddenly make Amazon responsible for content – a choice they

Steven Saus says:

May 24, 2012 at 14:16

Stupid phone. I was trying to write that Amazon has avoided ANY responsibility for content (remember the "how to be a pedophile" book a while back?). This is a pretty significant policy departure, and worth paying attention to.
