Reading, converting and editing EPUB ebooks (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Wed, 30 Mar 2011

Reading, converting and editing EPUB ebooks

Since switching to the Archos 5 Android tablet for my daily feed reading, I've also been using it to read books in EPUB format.

There are tons of places to get EPUB ebooks -- I won't try to list them all, but Project Gutenberg is a good place to start. The next question was how to read them.

Reading EPUB books: Aldiko or FBReader

I've already mentioned Aldiko in my post on Android as an RSS reader. It's not so good for reading short RSS feeds, but it's excellent for ebooks.

But Aldiko has one fatal flaw: it insists on keeping its books in one place, and you can't change it. When I tried to add a big technical book, Aldiko spun for several minutes with no feedback, then finally declared it was out of space on the device. Frustrating, since I have a nearly empty 8-gigabyte micro-SD card and there's no way to get Aldiko to use it. Fiddling with symlinks didn't help.

A reader gave me a tip a while back that I should check out FBReader. I'd been avoiding it because of a bad experience with the early FBReader on the Nokia 770 -- but it's come a long way since then, and FBReaderJ, the Android port, works very nicely. It's as good a reader as Aldiko (except I wish the line spacing were more configurable). It has better navigation: I can see how far along in the book I am or jump to an arbitrary point, tasks Aldiko makes quite difficult. Most important, it lets me keep my books anywhere I want them. Plus it's open source.

Creating EPUB books: Calibre and ebook-convert

I hadn't had the tablet for long before I encountered an article that was only available as PDF. Wouldn't it be nice to read it on my tablet?

Of course, Android has lots of PDF readers. But most of them aren't smart about things like rewrapping lines or changing fonts and colors, so it's an unpleasant experience to try to read PDF on a five-inch screen. Could I convert the PDF to an EPUB?

Sadly, there aren't very many open-source options for handling EPUB. For converting from other formats, you have one choice: Calibre. It's a big complex GUI program for organizing your ebook library and a whole bunch of other things I would never want to do, and it has a ton of prerequisites, like Qt4. But the important thing is that it comes with a small Python script called ebook-convert.

ebook-convert has no built-in help -- it takes lots of options, but to find out what they are, you have to go to the ebook-convert page on Calibre's site. But here's all you typically need:

ebook-convert --authors "Mark Twain" --title "Huckleberry Finn" infile.pdf huckfinn.epub

Update: They've changed the syntax as of Calibre v. 0.7.44, and now it insists on having the input and output filenames first:

ebook-convert infile.pdf huckfinn.epub --authors "Mark Twain" --title "Huckleberry Finn"

Pretty easy; the only hard part is remembering that it's --authors and not --author.
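
If you convert more than the occasional file, a tiny wrapper saves retyping the options. What follows is only a sketch, assuming Calibre is installed and ebook-convert is on your PATH; the function names build_convert_command and convert are made up for this example, and it uses only the --authors and --title options shown above, with the filenames first as newer Calibre versions require.

```python
import shutil
import subprocess


def build_convert_command(infile, outfile, authors=None, title=None):
    """Return the argument list for ebook-convert, filenames first.

    Only --authors and --title are handled here; both are optional.
    """
    cmd = ["ebook-convert", infile, outfile]
    if authors:
        # Note the plural: the option is --authors, not --author.
        cmd += ["--authors", authors]
    if title:
        cmd += ["--title", title]
    return cmd


def convert(infile, outfile, authors=None, title=None):
    """Run ebook-convert; raises CalledProcessError if the conversion fails."""
    # Assumption: ebook-convert is reachable through PATH.
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(
        build_convert_command(infile, outfile, authors, title),
        check=True,
    )
```

Calling convert("infile.pdf", "huckfinn.epub", authors="Mark Twain", title="Huckleberry Finn") would then run the same command as the updated example above. Passing the arguments as a list rather than a single string means titles with spaces or quotes need no shell escaping.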

Calibre (and ebook-convert) can take lots of different input formats, not just PDF. If you're converting ebooks, you need it. I wish ebook-convert were available by itself, so I could run it on a server. I took a quick stab at separating it, but even after I split out the Qt parts, it still required Python libraries not yet available on Debian stable. I may try again some day, but for now, I'll stick to running it on desktop systems.

Editing EPUB books: Sigil

But we're not quite done yet. Calibre and ebook-convert do a fairly good job, but they're not perfect. When I tried converting my GIMP book from a PDF, the chapter headings were a mess and there was no table of contents. And of course I wanted the cover page to be right, instead of the default Calibre image. I needed a way to edit it.

EPUB is a zip archive of XHTML and XML files, so in theory I could have unzipped it and fixed this with a text editor, but I wanted to avoid that if possible.
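
For anyone curious about the by-hand route, peeking inside a book takes only Python's standard library, since the .epub file itself is an ordinary zip archive. This is a rough sketch rather than a full EPUB parser: it lists the files in the archive and reads META-INF/container.xml, which the EPUB container format uses to point at the package (.opf) file holding the metadata and reading order. The function names are invented for this example.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by META-INF/container.xml in EPUB files.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def list_epub_entries(epub_path):
    """Return the names of all files stored inside the EPUB archive."""
    with zipfile.ZipFile(epub_path) as zf:
        return zf.namelist()


def find_package_document(epub_path):
    """Return the archive path of the package (.opf) file.

    container.xml lists it in the full-path attribute of its first
    <rootfile> element.
    """
    with zipfile.ZipFile(epub_path) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("no rootfile entry in META-INF/container.xml")
    return rootfile.get("full-path")
```

From there, the chapter files named in the package document are plain XHTML that any text editor can open, though the book has to be zipped back up afterward.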

And I found Sigil. Wikipedia claims it's the only application that can edit EPUB books.

There's no sigil package in Ubuntu (though Arch has one), but it was very easy to install from the Sigil website.

And it worked beautifully. I cleaned up the font problems at the beginnings of chapters, added chapter breaks where they were missing, and deleted headings that didn't belong. Then I had Sigil auto-generate a table of contents from headers in the document. I was also able to fix the title and put the real book cover on the title page.
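
The idea behind generating a table of contents from headings is simple enough to sketch. This is not Sigil's actual code, just an illustration of the general technique under simple assumptions: walk a chapter's XHTML, collect every h1 through h6 element, and indent each entry by its heading level. The names HeadingCollector, build_toc and format_toc are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, title) pairs for every h1..h6 element."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.entries = []
        self._level = None   # level of the heading being read, if any
        self._text = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Keep text only while inside a heading (including nested tags).
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace and skip empty headings.
            title = " ".join("".join(self._text).split())
            if title:
                self.entries.append((self._level, title))
            self._level = None


def build_toc(xhtml):
    """Return a list of (level, title) tuples in document order."""
    parser = HeadingCollector()
    parser.feed(xhtml)
    parser.close()
    return parser.entries


def format_toc(entries):
    """Render the entries as text, indenting two spaces per level."""
    return "\n".join("  " * (level - 1) + title for level, title in entries)
```

This also shows why cleaning up stray headings first matters: anything marked up as a heading ends up in the table of contents, whether it belongs there or not.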

It all worked flawlessly, and the ebook I generated with Sigil looks very nice and has working navigation when I view it in FBReaderJ (it's still too big for Aldiko to handle). Very impressive. If you've ever wanted to generate your own ebook, or edit one you already have, you should definitely check out Sigil.

[ 11:17 Mar 30, 2011 ]
