DAISY Pipeline: The Character Set Switcher

Original Author(s): Olaf Mittelstaedt

The Character Set Switcher is needed for Sigtuna DAR 3, a widely used but somewhat dated authoring tool. Whereas all newer authoring software represents text in utf-8, Sigtuna only ‘understands’ windows-1252 (or another encoding from the Windows character set family, for example those for Arabic, Russian, Chinese or Japanese).
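To see why the switch matters, the following minimal Python sketch (an illustration only, not part of the Pipeline) encodes the same short string in both encodings and shows what a windows-1252-only tool would display if it were handed utf-8 bytes. The sample string is an assumption chosen for the example.

```python
# Illustration only: the same text differs at the byte level.
sample = "café"

utf8_bytes = sample.encode("utf-8")      # 'é' becomes two bytes: 0xC3 0xA9
cp1252_bytes = sample.encode("cp1252")   # 'é' becomes one byte: 0xE9

# What a windows-1252-only tool would show when given the utf-8 bytes.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes, cp1252_bytes, misread)
```

Any character outside plain ASCII is affected in the same way, which is why the whole file set has to be converted rather than just relabelled.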

Input required: a valid(!) DAISY 2.02 file set


Press Ctrl + N (or click New Job Wizard).

Choose: Modify and Improve => Multi-Format => Character Set Switcher

Image of Pipeline wizard with highlighted path to Character Set Switcher

Hit ‘Next’, browse to your *.ncc Input File, and select an output path for the converted file set.

NOTE: The ‘Browse’ dialog might be set to show only *.xml files. Change the filter to all files (*.*) so that the *.ncc file becomes visible!

Image of Pipeline configuration window with highlighted Parameters

There are three Optional Parameters:

  • Output encoding: In the example above, windows-1252 is entered. If this is not set, the output encoding defaults to utf-8.

  • Linebreaks: This is normally left as System default, but can be changed.

  • XML Validation Report: As a last item, you can browse for the path where the report should be saved.
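The effect of the first two parameters can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the Pipeline's actual implementation: the function name `switch_charset` and its arguments are hypothetical, the input is assumed to be utf-8, and only an XML declaration is updated (any HTML meta charset declaration is left alone).

```python
import re

def switch_charset(data, source="utf-8", target="windows-1252", linebreak=None):
    """Hypothetical helper: re-encode one file's bytes from source to target."""
    text = data.decode(source)

    # Point the XML declaration, if present, at the new encoding.
    text = re.sub(
        r'(<\?xml[^>]*encoding=["\'])[^"\']+',
        lambda m: m.group(1) + target,
        text,
        count=1,
    )

    # Optionally normalise line breaks; None leaves them untouched.
    if linebreak is not None:
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        text = text.replace("\n", linebreak)

    # Raises UnicodeEncodeError if a character has no place in the target.
    return text.encode(target)


source_bytes = '<?xml version="1.0" encoding="utf-8"?>\n<p>café</p>\n'.encode("utf-8")
converted = switch_charset(source_bytes, linebreak="\r\n")
print(converted)
```

Note that a character with no equivalent in the target encoding makes the conversion fail in this sketch, which is one reason the result should be validated again afterwards.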

Hit ‘Finish’ and run the job with Ctrl + F1.

Please re-validate the result (see Transformation example one).

This page was last edited by PVerma on Saturday, August 28, 2010 22:49
Text is available under the terms of the DAISY Consortium Intellectual Property Policy, Licensing, and Working Group Process.