Shakespeare.txt.jpg

Warning! This is old. It was last updated in June 2013 and may be obsolete, outdated, unsafe or just embarrassing. Treat with caution.

A JPEG compression experiment

JPEG image compression is lossy. Every time you edit and save a picture, some of the original content is lost. But it's difficult to see that with the naked eye, so I compressed Shakespeare instead.

A book with text that starts '!O Romep+ Rpldo  wiepffnre arr!riov Romep@'.

“O Romep+ Rpldo wiepffnre arr!riov Romep@
Dgoy thz gatggr `me tefusf sgx n`me!”

That's the balcony scene from Romeo and Juliet, compressed at “maximum” quality in Photoshop: I loaded the text as a RAW, then outputted the compressed file back to plain text.

Even on ‘maximum’ quality, almost all the characters are replaced by their neighbours in the alphabet. On an image, that would be a minuscule change in colour, undetectable to the eye: but rearranged into a different form, even ‘maximum’ quality is enough to render the text a significant challenge to decipher.

So I tried it at various qualities, all the way down to Photoshop's ‘minimum’. Then, for the heck of it, I got them all bound as books.

Six books piled up. The spine of the top reads 'The Tragedy of Romeo and Juliet'; the rest are jumbled strings of characters.

At higher qualities, the text still maintains the character of a play, but the words grow increasingly incomprehensible.

As the quality degrades, many characters were converted into ASCII control codes: in this case, for publishing, I rendered them as spaces (save for vertical tab and carriage return, which were converted to new lines). Worse, a lot of new lines become corrupted into regular characters, reducing the play to a string of nonsense.

A jumble of text.

But the strange thing is this: on the front of each book is the JPEG image it was derived from. And, for all but the lowest quality, they appear utterly identical to the naked eye.

Book covers.

We're sensitive to data loss in text form: we can only consume a few dozens of bytes per second, and so any error is obvious. Conversely, we're almost blind to it in pictures and images: and so losing quality doesn't bother us all that much.

Should it?

Update

Hello unexpected influx of readers! A few notes for you:

  • Yes, I'm aware this is pretty meaningless (see my original tweet) but it seems to have sparked some discussion. I didn't expect it to really go very far.
  • No, I'm not selling the books. I try not to create too many useless physical objects. If you're hell bent on it, or want to see the full text, I've uploaded all the source files. I ask that you don't sell anything you make with it, although given I'm ripping off Shakespeare and the JPEG algorithm that's probably a bit rich.
  • If anyone wants to commission a professionally bound hardcover set for some net art exhibit, let me know. I'll happily pretend to be a serious artist.
  • Also, I give pretty good talks about the future and the present, and you should totally book me for your conference.

Contact

Tom on YouTube Tom on Twitter Tom on Instagram Tom on Facebook

A production of Pad 26 LimitedPrivacy