September 30, 2009

What is EPUB?

Just as the music industry went through a major change with the rise of digital music and MP3s, the publishing industry is undergoing a paradigm shift away from traditional print models.

In 2009, economic conditions have made it more expensive than ever to produce printed content. Rising costs of raw materials for printing, along with the costs of distribution and delivery, have all been major factors in the decline of print. Couple these production factors with a sharp decline in advertising sales and a steady rise in sales of electronic reading devices, and you can understand why publishers are looking for new digital publishing business models.

The argument used to be that people would never read a book, newspaper or magazine on a screen, but that argument is fading fast. Just look at these statistics:

  • Electronic book sales totaled more than 25 million dollars in the first quarter of 2009 (IDPF).
  • eBook sales are up 154.8 percent for the year, while traditional book sales are down 4.1 percent (Association of American Publishers).
  • As of July 2009, members of the Association of Magazine Media had 171 new digital initiatives underway, including iPhone applications, social network integrations and other eReader applications (MPA).

A majority of people spend eight hours or more a day at work looking at a computer screen. Younger generations and children have been using computers and the Internet their whole lives. Reading on screen has become more comfortable and convenient.

What is EPUB?

The EPUB format is based on the original Open eBook (OEB) format, which was used from 1997 to 2007. In late 2007 the International Digital Publishing Forum (IDPF) announced that the EPUB format would replace the OEB format.

EPUB is not a pure XML format; rather, it is XHTML content wrapped in a zip package (with the .epub extension) that also contains XML files describing the content and metadata (IDPF).

The EPUB (.EPUB) zipped package consists of three parts:

  • Open Publication Structure (OPS)
  • Open Packaging Format (OPF)
  • OEBPS (Open eBook Publication Structure) Container Format (OCF)

The image below shows what an EPUB file contains when unzipped.

The Open Packaging Format (OPF) file acts as the package manifest and lists all of the files included in the EPUB zip file. The OPF file is an XML file with three main elements:

<metadata> Bibliographic and rights information about the entire book.

<manifest> Lists the pages and resources that make up the book; the <spine> element refers to these entries by id.

<spine> Specifies the order for the book pages.
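
To make the relationship between the three elements concrete, here is a minimal Python sketch that reads an OPF document and resolves the spine against the manifest. The sample OPF, its file names and the read_opf helper are hypothetical and exist only for illustration; the two namespace URIs are the ones used by OPF 2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OPF 2.0 package files and Dublin Core metadata.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample package file; titles, ids and hrefs are made up.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="css" href="style.css" media-type="text/css"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""


def read_opf(opf_xml):
    """Return (title, manifest, reading_order) for an OPF document."""
    root = ET.fromstring(opf_xml)
    title = root.findtext("opf:metadata/dc:title", namespaces=NS)
    # The manifest maps each item id to the file it points at.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    # The spine lists item ids in reading order; look each one up.
    reading_order = [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", NS)
    ]
    return title, manifest, reading_order


title, manifest, reading_order = read_opf(SAMPLE_OPF)
print(title)
print(reading_order)
```

Note that the stylesheet appears in the manifest but not in the spine: the manifest covers every resource, while the spine only orders the readable pages.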

The OEBPS Container Format (OCF) packages the entire set of EPUB files together. The container.xml file, stored in the META-INF folder, specifies the location of the OPF XML file, which typically sits in an OEBPS folder alongside the content files.
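
As a rough illustration of how a reading application might use container.xml, the Python sketch below opens an EPUB as a zip archive and returns the path of the OPF file. The find_opf_path helper, the in-memory demo archive and the OEBPS/content.opf path are assumptions made for the example; only the META-INF/container.xml location and its namespace come from the container format itself.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the container.xml vocabulary.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_opf_path(epub):
    """Return the OPF path declared in META-INF/container.xml.

    `epub` may be a file name or a file-like object.
    """
    with zipfile.ZipFile(epub) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml does not declare a rootfile")
    return rootfile.get("full-path")


# Build a tiny in-memory archive so the sketch runs without a real book.
# The OPF location below is a common convention, not a requirement.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr(
        "META-INF/container.xml",
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">\n'
        "  <rootfiles>\n"
        '    <rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/>\n'
        "  </rootfiles>\n"
        "</container>\n",
    )

opf_path = find_opf_path(demo)
print(opf_path)
```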

Creating an EPUB file

As EPUB content does not need to follow a defined XML structure, creating an EPUB is relatively easy.
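
For readers who want to see how little is involved, the following Python sketch assembles a bare-bones EPUB package by hand. It is an illustration under stated assumptions, not a production tool: the write_epub and build_opf helpers, the file names and the placeholder identifier are all hypothetical, and the NCX table of contents that EPUB 2 reading systems expect is left out for brevity, so the result may not pass validation. The one detail taken from the container format is that the mimetype file is written first and stored uncompressed.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_opf(title, chapter_names):
    """Build a minimal OPF document listing the chapters in order."""
    items = "\n".join(
        f'    <item id="ch{i}" href="{name}" media-type="application/xhtml+xml"/>'
        for i, name in enumerate(chapter_names, 1)
    )
    refs = "\n".join(
        f'    <itemref idref="ch{i}"/>' for i in range(1, len(chapter_names) + 1)
    )
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{escape(title)}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
{items}
  </manifest>
  <spine>
{refs}
  </spine>
</package>
"""


def write_epub(target, title, chapters):
    """Write an EPUB-style zip; `chapters` maps file name to XHTML text."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry comes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", build_opf(title, list(chapters)))
        for name, xhtml in chapters.items():
            zf.writestr("OEBPS/" + name, xhtml)


CHAPTER = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>Hello, EPUB.</p></body>
</html>
"""

# Write to memory here; pass a path such as "sample.epub" to write a file.
buffer = io.BytesIO()
write_epub(buffer, "Sample Book", {"chapter1.xhtml": CHAPTER})
names = zipfile.ZipFile(buffer).namelist()
print(names)
```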

Adobe created a desktop reader application called Adobe Digital Editions that provides a nice interface for reading and bookmarking EPUB files. Adobe's support for EPUB is also found in Adobe InDesign. Adobe introduced an EPUB export function in InDesign CS3 and expanded it in InDesign CS4, so EPUB files can be created directly from InDesign.

Unlike other XML exports, no special tagging is needed when creating an EPUB file from InDesign.

Adobe recently published a whitepaper detailing the creation of EPUB files (Adobe Digital Editions) from InDesign, as well as tips and tricks for working with the format.

Adoption of the EPUB format

Possibly because of the simplicity involved in creating EPUB files, the format has taken off and become popular among prominent publishers. In May of 2008 the Association of American Publishers officially announced its support of the EPUB format as the standard eBook format (IDPF).

Publishers that currently produce titles in this format include:

  • Random House
  • HarperCollins
  • Penguin
  • Simon & Schuster
  • Pan Macmillan
  • Oxford University Press
  • Hachette Book Group
  • CQ Press
  • Workman Publishing

EPUB is also the most widely supported format for reading on digital eBook devices.

Primary examples of EPUB reading devices and applications:

  • Sony Reader eBook device
  • Plastic Logic eBook device
  • Stanza and Classics - iPhone applications
  • O'Reilly's Bookworm - web application
  • Adobe Digital Editions - desktop application

Barnes & Noble recently announced that it would begin offering EPUB titles to compete with the Amazon Kindle. Barnes & Noble plans to launch its eBook service with 700,000 titles, growing to over one million within the first year. That figure is more than double the 325,000 titles currently available for the Amazon Kindle (Dignan).

Some topics to further explore:

  • EPUB vs. PDF: Why create another format?
  • Is it a problem that EPUB is not as structured as XML publishing formats such as DITA or DocBook?

Posted at 03:22 pm by Ivan Mironchuk
