From Print to DAISY: Striking a Balance
In principle, a DAISY Digital Talking Book (DTB) should reflect the structure of the original print publication. Extensive markup, however, requires significant resources, and alternative format producers must strike a balance between resources available, production requirements, the needs of readers who are visually impaired or have a print disability and - in a library - broader collection development goals. This document aims to help producers strike that balance.
Three Basic Types of DAISY DTBs
- Audio with NCX: DTB with structure. The NCX (Navigation Control file for XML applications; its DAISY 2.02 counterpart, the NCC, is the Navigation Control Center) is a file containing all points in the book to which the user may navigate. The XML textual content file, if present, contains the structure of the book and may contain links to features such as narrated footnotes. Some DTBs of this type may also contain additional textual components, for example an index or glossary, to support keyword searching.
- Audio and full text: DTB with structure, complete text and audio. This type of DAISY DTB is the most complete and provides the richest multimedia reading experience, and the greatest level of access. The XML textual content file contains the structure and the full text of the book. The audio and the text are synchronized.
- Text and no audio: DTB without audio. The XML textual content file contains the structure and full text of the book. There are no audio files. This type of DAISY DTB may, for example, be rendered with synthetic speech or with a refreshable braille display.
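The three types above differ only in whether the book carries audio, full text, or both. A minimal sketch of that distinction follows; the `DtbContents` class and `classify_dtb` function are hypothetical names invented for illustration and are not part of any DAISY specification or tool.

```python
from dataclasses import dataclass


@dataclass
class DtbContents:
    """Hypothetical summary of what a DTB package contains."""
    has_audio: bool
    has_full_text: bool


def classify_dtb(contents: DtbContents) -> str:
    """Map the presence of audio and full text to one of the three basic types."""
    if contents.has_audio and contents.has_full_text:
        return "Audio and full text"
    if contents.has_audio:
        # Structure is carried by the navigation file; text is absent or partial.
        return "Audio with NCX"
    if contents.has_full_text:
        return "Text and no audio"
    raise ValueError("a DTB needs audio, full text, or both")


print(classify_dtb(DtbContents(has_audio=True, has_full_text=False)))
```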
The Richest Reading Experience
DAISY DTBs with audio provide the richest reading experience and are currently the most expensive to produce if human voice narration is used (rather than synthetic speech). A full text/audio DAISY book gives the end user the choice as to how the book will be read. It will meet the needs of someone who wants to read a novel from end to end with a DAISY hardware player, without using any of the navigation features or other advanced functionality. At the same time, that book can be used by a university student reading the book with a DAISY software player - going to specific pages to read the selections for the next tutorial, spelling out proper names for accuracy in essay writing, word searching throughout the book, bookmarking passages for later reference, etc. It's the same DAISY DTB, but it's the end user who determines what functionality to use and how to use it.
There are numerous DAISY production tools available that support both the creation of the book structure and the recording of the human narration which is linked to the text content during the production process (or in some cases, in the post production process). Some also support 'importing' of a text source file for DAISY production with human narration, while others support the production of a DTB with synthetic speech (from a source file).
Some DAISY books require the creation of an XML or XHTML file prior to recording (or generation with synthetic speech). This extra step in the production process requires additional time as well as staff resources with additional skill sets.
DTBs produced with synthetic speech rather than human voice are faster and less expensive to create; however, not all end users accept audio rendered with synthetic speech. This is changing rapidly, and more and more people are becoming willing to accept DAISY DTBs with synthetic speech, particularly for non-fiction reference and study materials.
How Much Structure is Enough?
Several factors bear on how much structure to include. For example, if a book that is to be added to a library's general reading collection contains 5 levels of headings and subheadings, should the DAISY structure be complete and include all 5 levels if it would require many hours of work to create the structure? Would the first 3 levels (headings with levels 1 through 3) be sufficient to provide 98% of all end users with the navigation they would need to use the book efficiently and effectively?
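The trade-off in the example above can be made concrete by listing a book's headings and counting how many remain navigable at a given depth. This is only a sketch: the `Heading` type, the `limit_depth` function and the sample outline are hypothetical and invented for illustration; a real production tool would read the levels from the book's own markup.

```python
from typing import NamedTuple


class Heading(NamedTuple):
    level: int  # 1 = top-level heading, 5 = deepest subheading
    title: str


def limit_depth(headings: list[Heading], max_level: int) -> list[Heading]:
    """Keep only headings at max_level or higher in the hierarchy."""
    return [h for h in headings if h.level <= max_level]


# Hypothetical outline, invented for illustration.
outline = [
    Heading(1, "Part One"),
    Heading(2, "Chapter 1"),
    Heading(3, "Section 1.1"),
    Heading(4, "Subsection 1.1.1"),
    Heading(5, "Note on terminology"),
    Heading(2, "Chapter 2"),
]

navigable = limit_depth(outline, max_level=3)
print(f"{len(navigable)} of {len(outline)} headings kept")
```

A count like this shows how many navigation points would be dropped; whether the remaining ones serve readers well is still a judgment for the producing organization.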
DAISY DTB Structure = Print Structure
Different organizations establish different practices for structuring, and for DAISY production generally. It is recommended, however, that the levels of structure in the print book be replicated in the DAISY DTB if the resources can be made available or if more efficient ways of creating the structure can be implemented.
Many books, particularly fiction, have limited structure, often a single level only. Resources required to create the structure are minimal as compared to those that may be required to create the structure of a multi-level work of non-fiction.
These are decisions that need to be made within each producing organization. Identification and allocation of resources, existing skill sets (and the potential for improved or expanded skill sets within existing staff resources) and output goals have to be considered. If an organization has set a production goal that cannot be reached with existing resources without limiting the degree of structure in the books it creates, then there may be no alternative. It is important to remember, however, that it is even more time consuming, and thus more expensive, to revisit and update a collection at a later date.
At a minimum, original DAISY DTB productions should include both headings and pages. A novel produced with page access can be used effectively for both pleasure reading and study purposes (again, the needs of the end user should be kept in mind). It does not require a significantly greater amount of resources to include pages, but there is significant benefit. Converted books (produced from analogue tapes) on the other hand, if not tone indexed in the analogue tape, require significant resources to include pages. Many organizations therefore may determine that this is not an effective use of resources, and produce converted DAISY DTBs with headings only.
Synthetic speech, noted earlier, is an option that some organizations may wish to consider for publications such as magazines, television guides and newspapers. Materials that readers consult mainly to glean information, rather than read from end to end, are well suited to it.
Other applications to consider for synthetic speech rendering are indices, bibliographies and notes which are often present in works of non-fiction. It can be much faster and far less expensive to produce a source file of an index and render synthetic speech from it, than to produce it with human narration.
Note that a source file is required for the above applications. Other factors, such as the quality of the speech synthesis program or system, the complexity of the vocabulary and the presence of multiple languages in the source document, should be considered in determining whether this is a viable option.
Note also that DAISY DTBs which are text only use significantly less storage space than those which contain audio.
Summary of Factors
Summary of factors to consider when choosing synthetic speech rather than human voice narration:
- User acceptance of synthetic speech
- Frequency of publication (e.g., production of a daily newspaper would be impossible to sustain with human voice narration)
- How quickly the materials are needed
- Quality of speech synthesizing program/system
- Complexity of vocabulary and/or multiple languages in the source document
- Importance of completely accurate pronunciation (e.g., readers might rely on a glossary to learn how to pronounce new vocabulary)
As noted earlier, a DAISY DTB should reflect, as closely as possible, the original print publication. This cannot be overstated. The type of book in and of itself will in part determine the amount of structure and what type of DAISY DTB is most appropriate. For example, a mass market novel rarely contains much structure and can generally be enjoyed by an end user with the limited structure present in the print publication. The more complex a print publication is, the more complex the DAISY DTB is likely to be (if it indeed reflects the print). A print publication intended to be used as a reference book will need to be produced so that the DAISY rendering affords the end user parallel or better access to the information contained within it.
Structure and the Print Book
The following list progresses from categories of books where, generally speaking, minimal structure will be required (and similarly, will likely not be present in the print book) to categories where extensive structure is more critical:
- Children's picture books
- Mass market fiction
- Literary fiction
- General non-fiction
- Instructional publications
- Reference materials
- Scholarly works
If the alternative format producer is a library, it must also consider broader collection development goals. These goals may include compatibility with the minimum standard for sharing resources among libraries. They may also factor in annual production quotas for total number of books added to the collection as well as the breadth and depth of subjects covered.
Text is available under the terms of the DAISY Consortium Intellectual Property Policy, Licensing, and Working Group Process.