Hacking Amazon E-Books with Spy Style

by bartitsu59

Greetings from France.  This article aims at giving you the opportunity to use your Kindle content as you like, but is not a way to encourage sharing your books all over the Net.  I value creativity in all its forms and hope you will find this little hack a bit creative too.

It's possibly not the easiest way to do the task at hand, but it was really fun to set up and it does not involve any suspicious program or website.

As an avid reader, I was immediately seduced by the possibility of saving some space and having all of my books fit in a neat, small e-book reader.  This is true also for the 2600 issues I bought, especially for the digest volumes, since I discovered 2600 quite late and it allowed me to enjoy previous articles that I did not have the chance to read until now.

I now have nearly 200 books, which I read on two different models of Amazon's famous readers.

But recently, I've been more and more concerned about the bond that is slowly forming between such a big corporation and my favorite leisure.

What will happen if one day Amazon decides that my books should be upgraded to their new fancy format or be lost forever?  What if they decide that this upgrade will not be free?  And should I lose all these books I've spent quite a lot of money on if I decide to give another e-book reader a chance, such as a Kobo reader?

Last but not least, I wanted to find a solution that is close to the UNIX philosophy.

A friend of mine advised me to have a look at online converters such as Zamzar, but I'm a bit paranoid - I don't know for sure what kind of metadata is hidden in the AZW format... maybe my reader's serial number, my client number, or anything else that clearly identifies the device or the customer the book was bought for.  And in that case, I would not be confident enough to leave that kind of information on a website.

Of course, there are offline tools such as Calibre, but this would infringe tenet eight of UNIX philosophy: avoid captive user interfaces.

So I decided I would try to capture the content of my books with offline tools, and then convert each book into an open format.  I found Markdown to be a valid option (mainly because it can quickly be converted to HTML, which can be handled by any device I have at home).

The irony of this is that this hack will be done using a tool designed by another one of the GAFA members (Google, Apple, Facebook, Amazon), even if the principle that will be described in the coming few lines is not bound to any tool in particular.

The Tools

I'm in my 40s, and I recall seeing some action movies where a spy would use a micro-camera to capture information from confidential papers.  This is more or less what I'm proposing to do here.

So I am using Apple tools, as well as open-source tools.

The first tool that I wanted to use is the snap-shot function that is triggered whenever you do a "Command+Shift+4" on Mac OS X.

Fortunately, there is a corresponding command-line utility, screencapture, which is a good place to start: called from an AppleScript snippet, it will be the heart of this hack.

So you can do a:
:screencapture shot.png
And you will have your screen captured into a PNG file. If you now submit this image to an OCR tool, like the free and powerful "Tesseract," then your image will be converted into a text file, so let's try it with:
:/usr/local/bin/tesseract shot.png text -l eng
This is how we will capture the text from our book, but I now need someone to turn the pages while I'm taking the pictures, right?

Fortunately, AppleScript can be really helpful here, but of course feel free to adapt this technique to any scripting tool that suits your needs.

This first step complies with the sixth precept of the UNIX philosophy: use software leverage to your advantage.

Preparation

The first thing to do is to ensure full readability is given to Tesseract.  This is quite easy - you just need to open the Kindle app before running your script, and maximize it.  You can also enter the "View Options" menu to choose a bigger font.

Then I suggest you deactivate all readings on screen that are not part of the book itself.  In particular, please disable the popular highlights in the Settings tab.

Finally, you can hide the toolbar by right-clicking on it and choosing the relevant option.  Nothing but the text of the book should now be displayed on-screen.  But wait, we still have the progress data: number of pages, percentage read, and that other metric I never understood (location).

So we will need to tell the screen capture tool to limit the capture to a restricted portion of the screen.  To do this, I suggest you use the screen capture shortcut ("Command+Shift+4").  Your mouse pointer will then change to a crosshair with its coordinates displayed next to it.

Use this to determine the useful area of text that will be analyzed by the OCR tool (in my case "150,0,1300,850" was fine) and note it somewhere.
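
As an aid for this step, here is a minimal sketch of a shell helper that turns two crosshair readings (the top-left and bottom-right corners of the text area) into the string given to the "-R" option.  It assumes the four numbers mean x, y, width, and height, which is how the screencapture usage text describes that option.  The function name rect_from_corners is invented for illustration and is not part of any Apple tool.

```shell
# Hypothetical helper, not part of any Apple tool.
# Usage: rect_from_corners LEFT TOP RIGHT BOTTOM
rect_from_corners() {
  local w h
  # Width and height are the distances between the two corners.
  w=$(( $3 - $1 ))
  h=$(( $4 - $2 ))
  if [ "$w" -le 0 ] || [ "$h" -le 0 ]; then
    echo "second corner must be below and to the right of the first" >&2
    return 1
  fi
  # Print x,y,width,height, ready to paste after -R.
  echo "$1,$2,$w,$h"
}
```

For example, corners read at 150,0 and 1450,850 would give "150,0,1300,850", the area used in this article.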

Scripting

It's now time to open the script editor and choose a meaningful name for our script.  I would suggest: screwDRM.scpt

The first hurdle to overcome is to tell our script to activate the Kindle application while the latter is maximized and try to send it a "Right Arrow" keyboard event, to see if we are able to flip pages automatically.

After a while of Googling, you will find that:
:tell application "Kindle" to activate
tell application "System Events"
	key code 124
end tell
does exactly what we want. This is really the key feature of AppleScript that makes this trick possible. I will let you find an equivalent feature for your OS of choice, but Microsoft gives you a hint if you want to do the same with PowerShell: technet.microsoft.com/en-us/library/ff730976.aspx

Wrap this into a "repeat loop" and the pages will be flipped for you.

The next step is dead simple - we just need to call in sequence screencapture and Tesseract to capture the text on the fly:
:set shellCommand to "screencapture -R 150,0,1300,850 -T1 -m /Users/Jerome/ebooks/" & i & ".png"
do shell script shellCommand
delay 1
set shellCommand to "/usr/local/bin/tesseract /Users/Jerome/ebooks/" & i & ".png /Users/Jerome/ebooks/" & i & " -l eng"
do shell script shellCommand
delay 1
You will probably notice the -T1 that tells screencapture to take the picture after a delay of one second.  Also, you will notice the explicit delay 1 instructions after the screen capture and after the OCR.

I've put this in to allow time for my Mac to do each step.  Since this involves some computation and quite intense IO operations, it makes sense in my opinion (I guess it could be shortened with a faster CPU and an SSD drive).

Of course, I also passed screencapture the dimensions of the screen area to be captured (with the "-R" option), which I determined during the preparation.

Even if it could rely on more open tools (I'm counting on clever Linux users to fix that), this is a nice way to comply with the seventh principle: use shell scripts to increase leverage.
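
The two commands issued for each page can also be wrapped in a single shell function, which makes them easy to try from a terminal before putting them in the script.  The sketch below is only one possible arrangement: SCREENCAPTURE, TESSERACT, and RECT are hypothetical environment overrides added for illustration, and their defaults are simply the commands and the capture area shown above.

```shell
# Sketch only. SCREENCAPTURE, TESSERACT and RECT are hypothetical
# environment overrides added for illustration; their defaults are the
# commands and the capture area used in this article.
# Usage: capture_and_ocr PAGE_NUMBER DIRECTORY
capture_and_ocr() {
  local page dir rect
  page=$1
  dir=$2
  rect=${RECT:-150,0,1300,850}
  # Capture the text area of the main monitor after a one-second delay.
  "${SCREENCAPTURE:-screencapture}" -R "$rect" -T1 -m "$dir/$page.png" || return 1
  # Tesseract adds the .txt extension to the output name by itself.
  "${TESSERACT:-/usr/local/bin/tesseract}" "$dir/$page.png" "$dir/$page" -l eng
}
```

Calling capture_and_ocr 12 /Users/Jerome/ebooks would leave 12.png and 12.txt in that directory.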

Ending Our Script and Cleaning Up

The last difficulty I had to overcome was detecting the end of the book.  At first, I started with an estimate of the number of pages, which I used for my "repeat loop."

For example, I would count the number of pages I would flip until I got to 10 percent read - say 23 - and would then estimate the number of pages to be captured to be 250, and would write:
:repeat with i from 1 to 250
	tell application "System Events"
	...
	end tell
end repeat
I admit it was not very clever, but it worked until I could find a more acceptable solution.
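
The extrapolation behind that estimate is plain arithmetic, and can be sketched as a tiny shell function.  The name estimate_pages is invented for illustration; it divides the number of flips by the fraction of the book read and rounds up.

```shell
# Hypothetical helper: extrapolate the page count from a sample.
# Usage: estimate_pages FLIPS_COUNTED PERCENT_READ
estimate_pages() {
  if [ "$2" -le 0 ]; then
    echo "percentage must be greater than zero" >&2
    return 1
  fi
  # Integer division, rounded up.
  echo $(( ($1 * 100 + $2 - 1) / $2 ))
}
```

With 23 flips for 10 percent, estimate_pages 23 10 prints 230; using 250 for the loop, as above, presumably just leaves some margin.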

I wanted to stick to pure scripting techniques, in the tradition of UNIX scripts.  As we are producing pure text files (fifth principle: store data in flat text files), we basically need to compare the current text file being processed and the last one produced just before.  If the two files are identical, it will then mean that we are at the end of the book with no more pages to flip.  You can easily do that with the UNIX command diff that tells the differences between two files.

So, all we need is to "diff" the last two files and find a way to capture the result, so that two files reported as identical would break the processing loop.  Fortunately, diff returns an exit value depending on the result of the comparison.

In AppleScript, an exit value of zero means that there is no error, so all we need to do now is use a try statement to break the loop if no error happens.

Wait... no error?  Yes indeed, since an error will be triggered as long as the files compared differ, we want to break the loop only if the files are identical, i.e., if no error happens (exit value "0", interpreted as "no error" by AppleScript).
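
The same logic can be checked directly in a terminal.  The sketch below wraps diff in a function (pages_identical, a name invented for illustration) that succeeds only when two files have the same content.  It relies on diff exiting with 0 for identical files, 1 for different files, and 2 when something goes wrong, such as a missing file.

```shell
# Sketch: succeed (exit status 0) only when both files have the same content.
# diff -q exits with 0 for identical files, 1 for different files,
# and 2 on trouble such as a missing file.
# Usage: pages_identical FILE_A FILE_B
pages_identical() {
  diff -q "$1" "$2" >/dev/null 2>&1
}

# Hypothetical use inside a capture loop, with $i as the page counter:
#   if pages_identical "$i.txt" "$(( i - 1 )).txt"; then break; fi
```

Note that a missing file counts as "not identical" here, so the very first page, which has no predecessor, never stops the loop.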

This leads to the final version of our script:
:tell application "Kindle" to activate
repeat with i from 1 to 999
	tell application "System Events"
		key code 124
		set shellCommand to "screencapture -R 150,0,1300,850 -T1 -m /Users/Jerome/ebooks/" & i & ".png"
		do shell script shellCommand
		delay 1
		set shellCommand to "/usr/local/bin/tesseract /Users/Jerome/ebooks/" & i & ".png /Users/Jerome/ebooks/" & i & " -l eng"
		do shell script shellCommand
		delay 1
		try
			do shell script "diff -q /Users/Jerome/ebooks/" & i & ".txt /Users/Jerome/ebooks/" & (i - 1) & ".txt"
			exit repeat
		on error
			# last two pages are different so continue
		end try
	end tell
end repeat
At the end of this, you might add a clean-up phase, consolidating all of the TXT files into a single one and deleting the PNG files, but that requires adding "with administrator privileges" to the end of each do shell script statement.

Furthermore, we cannot clean the files at each iteration, since we rely on the result of the previous iteration to detect the end of the book.
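
Once the script has finished, the clean-up can be done from the shell.  As a hedged sketch, the function below (consolidate_book, a name invented for illustration) joins the numbered text files in page order, skips numbers that do not exist, and removes the screenshots.  It counts from 0 to 999 instead of using a wildcard so that 10.txt comes after 2.txt.

```shell
# Hypothetical clean-up helper: join the numbered text files of one book
# in page order, then delete the screenshots.
# Usage: consolidate_book DIRECTORY OUTPUT_FILE
consolidate_book() {
  local dir out i
  dir=$1
  out=$2
  i=0
  # Start from an empty output file.
  : > "$out"
  while [ "$i" -le 999 ]; do
    # Append the page only if it was actually produced.
    if [ -f "$dir/$i.txt" ]; then
      cat "$dir/$i.txt" >> "$out"
    fi
    # -f keeps rm quiet when the screenshot does not exist.
    rm -f "$dir/$i.png"
    i=$(( $i + 1 ))
  done
}
```

Because the capture loop stops when a page repeats, the last page will appear twice in the result; this sketch does not remove that duplicate.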

I prefer to execute the following statements in a regular terminal window:
:for i in {0..999}; do rm "$i.png"; done
for i in {0..999}; do cat "$i.txt" >> book.txt; done
Conclusion

Of course, as OCR is never perfect, you need to do a bit of proofreading after that, and to replicate the original layout (cover, titles, formatting, etc.) in Markdown (or whatever format you prefer).

But all-in-all, the possibility of reading a book even on an old 300 MHz FreeBSD laptop is a nice addition (with a homemade program in Scheme that converts the book from Markdown to HTML).

Feel free to use this hack for useful tasks, but I would be equally satisfied if it inspired new hacks with a similar approach.

This is what I like about hacking: the ability to find alternative ways to do things, with a supplement of fun or creativity.

Code: screwDRM.scpt