Hidden Characters


One essential tip for eBook creators:

In every program that supports it, display hidden characters.



One thing eBook creators need to do is remove unwanted paragraph breaks, so it is useful to see where these are in the text you are working with. In Microsoft Word, you can show hidden characters by clicking the pilcrow (¶) button in the main ribbon.

You may be amazed to see how many extra (empty) paragraphs you have been adding to your document in order to space the paragraphs out. This is not acceptable in ePub books. As a matter of fact, it isn't really acceptable in most contexts where the content structure should be kept separate from the presentation (HTML, XML, etc.).

If you need space between the paragraphs, this should be achieved through a style rule for the paragraph (a margin on the bottom or top), not through empty paragraphs.
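As a rough illustration of clearing out those empty paragraphs once the text has been exported, here is a minimal Python sketch. It assumes the content is already HTML/XHTML in which the empty paragraphs appear as p elements containing nothing but whitespace, a non-breaking space entity, or a line break. The function name is hypothetical, and a regular expression is only a quick approach for simple, well-formed markup, not a full HTML parser.

```python
import re

# Assumption: empty paragraphs look like <p></p>, <p> </p>, <p>&nbsp;</p>,
# <p>&#160;</p> or <p><br/></p>, optionally with attributes on the tag.
EMPTY_PARAGRAPH = re.compile(
    r"<p\b[^>]*>(?:\s|&nbsp;|&#160;|<br\s*/?>)*</p>\s*",
    flags=re.IGNORECASE,
)

def remove_empty_paragraphs(html: str) -> str:
    """Delete paragraphs that hold no visible content (hypothetical helper)."""
    return EMPTY_PARAGRAPH.sub("", html)

if __name__ == "__main__":
    sample = "<p>One</p>\n<p></p>\n<p>&nbsp;</p>\n<p>Two</p>"
    print(remove_empty_paragraphs(sample))
```

The spacing those empty paragraphs were providing then has to come from the paragraph's style rule instead.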

InDesign is probably our preferred tool for building eBooks. There, you need to switch on 'Show Hidden Characters', found at the very bottom of the 'Type' menu.

Line Breaks

If you download a public domain text from Project Gutenberg, you need to have a look at the line endings, which are probably paragraph breaks. This is because you are looking at a facsimile of the original book, scanned and converted to text. These forced line breaks need to go (unless this is poetry*; see a future post on this special case), since you will want to allow the text to reflow when it goes into the ePub / eReader. You can find a screencast here that shows you the technique that I use for this paragraph break removal.
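For readers who prefer to clean the text before it reaches a layout program, the same idea can be sketched in Python. This is not the technique from the screencast, just a minimal sketch under one assumption: in the downloaded plain-text file, real paragraphs are separated by at least one blank line, and every other line ending is a forced break. The function name is hypothetical.

```python
import re

def unwrap_paragraphs(text: str) -> list:
    """Join hard-wrapped lines into one string per paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A blank line (possibly containing stray spaces) ends a paragraph.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Drop the forced breaks inside the paragraph and rejoin with spaces.
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return paragraphs

if __name__ == "__main__":
    sample = (
        "It was the best of times,\r\n"
        "it was the worst of times.\r\n"
        "\r\n"
        "Second paragraph\r\n"
        "continues here.\r\n"
    )
    for paragraph in unwrap_paragraphs(sample):
        print(paragraph)
```

Note that this would also flatten verse, so it should not be run over poetry, where the line breaks are part of the content.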

Soft Line Breaks

These are created by pressing SHIFT-ENTER at the end of a line, which creates a line break within the same paragraph. There are some situations where you might want this, and if you are producing HTML markup, the equivalent is a line break (<br>). This, however, is not a good idea for an eBook, because when the text reflows to fit the format, there may be some unexpected results.

Unwanted Spaces

Often, when you download a text from a scanned public domain source, you may also encounter extra spaces at the beginning of the first line of a paragraph. Of course, you may want an indent on the first line, but putting space characters in the text won't work; you need to style the paragraph with a first-line indent instead. The extra spaces can be removed using a technique similar to the one described in the screencast referenced above. If the indent has been achieved with 2 or more spaces, you can easily use the search and replace tools at your disposal. However, if you find only one space at the beginning of a line, you cannot use this technique, because there is a danger that you will replace all the spaces between the words! So, use GREP in InDesign to rid yourself of those pesky spaces:

Here is the pattern match: 


But then, you need to style the paragraphs with a first-line indent (this targets only the first line), which results in the equivalent CSS property, text-indent.
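The reason a pattern match works where plain search and replace does not is that a pattern can be anchored to the start of a line, so spaces between words are never touched. The following is a minimal Python sketch of that idea, not the InDesign GREP expression itself; the function name is hypothetical, and it assumes each paragraph starts on its own line.

```python
import re

# "^" with re.MULTILINE matches at the start of every line, so only
# leading spaces and tabs are removed, even when there is just one.
LEADING_SPACES = re.compile(r"^[ \t]+", flags=re.MULTILINE)

def strip_leading_spaces(text: str) -> str:
    """Remove spaces and tabs at the start of each line (hypothetical helper)."""
    return LEADING_SPACES.sub("", text)

if __name__ == "__main__":
    sample = (
        "  First paragraph here.\n"
        " Second one with a single space.\n"
        "Third is clean."
    )
    print(strip_leading_spaces(sample))
```

Because the match is tied to the line start, a single leading space is handled just as safely as a run of several.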

Wanted Spaces

I came across a situation recently where there was a gap (more than 2 characters wide) inside a line of a poem. Clearly, you cannot just add space characters, since runs of spaces collapse in HTML, nor can you use non-breaking spaces. The answer was to create a character style in InDesign and apply it to a single space. Then, in the ePub, edit the CSS for that style to provide a left margin of (say) 4em...

*See an upcoming article on the special case of Poetry.



Posted on 06 Mar around 3pm

