Description of the translation system
=====================================
$Id$
This is a description of the translation method used in the Norwegian
svnbook since it started way too long ago. The system is designed to
make translating, and keeping the files in sync with the English files,
as easy as possible. If you are new to the translation, browse through
this file; hopefully you won’t find the system too painful to follow.
The translated files are copies of the English files, which all the
translations are based upon. To ensure proper updating and make
syntactic checks against the English files possible, nothing is
removed from the English sections in the files, only added. The English
text is kept in the files, but is commented out with special markers:
    Denne utgaven av boken er blitt oppdatert for å dekke nye
    funksjoner og forandringer i oppførselen for Subversion 1.1.
    Her er en rask oversikt over pekere til store forandringer i
    1.1.
It may look like a tedious task to add those markers, but this is taken
care of by a macro defined in src/nb/tools/svnbook.vim . If you also set
"foldmethod=marker" in Vim, the translated English text is automatically
folded away. (The only editor that has been used for the translation is
Vim, so the description here revolves around that one.)
Editmode and commitmode
=======================
Because the English text is still present and merely commented out,
double dashes ("--") in it cause problems. Double dashes inside XML
comments are forbidden, so they have to be escaped in some way to not
cripple the XML. This is done by the src/nb/bin/clean_files script,
which can switch the files between two modes: “commit mode” and “edit
mode”. The difference between the two modes is:
* Commit mode
A unique character (U+FCE2, also known as ﳢ) is inserted
between the two dashes of every double dash in everything that is
commented out, i.e. the English text. This makes the XML valid, so the
files can be checked into the repository or be built.
* Edit mode
All the ﳢs are removed, and the files are now ready for
diffing against the English files or something else. You probably
won’t need to run this command unless you have to resolve conflicts
or check that nothing has changed in the English sections. It can
also be practical to use if you have to copy+paste examples or text
with lots of double dashes. Just exit your editor, run "make
editmode" and continue editing. This mode is only for editing
purposes, and THE FILES MUST NOT BE CHECKED IN WHILE IN THIS MODE.
Run a “make commitmode” to get all files in shape first.
Normally you don’t have to think about running these make commands,
except when you’ve translated text that contains double hyphens. If
there are some double dashes which need escaping, the validation
process (you always run “make valid” before checkin, right?) will return
errors like
book/ch-basic-usage.xml:327: parser error : Comment must not contain
'--' (double-hyphen)
and you will need to run “make commitmode” to heal the files.
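
As a rough illustration of the two modes, the following Python sketch
escapes and unescapes double dashes inside XML comments. It is not the
clean_files script itself, whose implementation is not shown here; the
function names and the regular expressions are assumptions made for the
example, and only the U+FCE2 character is taken from the description
above.

```python
import re

MARK = "\ufce2"  # the separator character named in the text (U+FCE2)

# Matches one whole XML comment and captures the text between the
# opening "<!--" and the closing "-->".
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def _escape(body):
    # Put MARK after every dash that is directly followed by another
    # dash, so that no "--" is left inside the comment body.
    return re.sub(r"-(?=-)", "-" + MARK, body)


def to_commit_mode(xml):
    """Escape double dashes inside comments; leave other text alone."""
    return COMMENT.sub(lambda m: "<!--" + _escape(m.group(1)) + "-->", xml)


def to_edit_mode(xml):
    """Remove every separator character again."""
    return xml.replace(MARK, "")


if __name__ == "__main__":
    sample = "<para><!-- Run svn --version -->Kjør svn --version</para>"
    committed = to_commit_mode(sample)
    print(committed)
    print(to_edit_mode(committed) == sample)
```

Only the commented-out text is touched, so double dashes in the
translated text stay as they are. Running the conversion twice gives
the same result as running it once.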
The actual translation
======================
To translate a paragraph, the English text has to be commented out and
the translation follows immediately after the commented-out block. This
is done with the predefined macro in src/nb/tools/svnbook.vim which
filters the block through the src/nb/bin/dings_it script. Weird name,
but it works.
Mark the paragraph with Shift-V in Vim and press the key that the macro
is mapped to in svnbook.vim. The macro expects the dings_it script to
exist at ~/bin/dings_it. If it does, the block is automatically
commented out, all the elements with their attributes intact are copied
after the block, and the cursor is placed in the right position, ready
for writing. Repeat this process through the file.
Some conventions exist, mostly for lack of protests. To keep things
consistent, here are some formatting rules:
* Width of text lines is 72 characters.
* As usual, use no TABs, only spaces.
* When marking and commenting out English paragraphs, don’t break the
element structure, for example by including an end tag without
also including the start tag. In essence, try to only include the
actual text when commenting out the original text, not block-level
elements.
* To make reformatting cleaner and to avoid cluttering the diffs with
large sections of wordwrapping, the whole translation uses “flowed
text” (wrapped lines are terminated by a space) and every sentence
is written on a new line. This limits wordwrapping to the current
sentence only and doesn’t mess up the whole paragraph.
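
The first two rules are easy to check mechanically. The sketch below is
one possible way to do it in Python; the function name is hypothetical,
and it assumes that the trailing space of a flowed line does not count
towards the 72 characters.

```python
MAX_WIDTH = 72  # line width given in the rules above


def check_lines(text):
    """Return one message per rule broken, prefixed by the line number."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            problems.append(f"{number}: contains a TAB")
        # Assumption: the space that ends a flowed line is not counted.
        if len(line.rstrip(" ")) > MAX_WIDTH:
            problems.append(f"{number}: longer than {MAX_WIDTH} characters")
    return problems


if __name__ == "__main__":
    for message in check_lines("fine \n\tindented with a TAB\n"):
        print(message)
```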
Fuzzy translations
==================
At some point or another, you will come across sentences or words that
can be hard to translate. Instead of fiddling around with the sentence
for many minutes, mark it with a commented-out ¤ (U+00A4, currency sign)
just before the dubious word, sentence or translation. As long as it is
properly marked, it can be searched for and fixed later.
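
Finding the marked places again is a plain text search. A minimal
Python sketch, assuming the marker is written inside an ordinary XML
comment (the exact comment form and the function name are assumptions,
only the ¤ character comes from the description above):

```python
FUZZY = "\u00a4"  # currency sign used as the "fuzzy" marker


def find_fuzzy(text):
    """Return (line number, line) for every line containing the marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if FUZZY in line
    ]


if __name__ == "__main__":
    sample = "En grei setning.\n<!-- \u00a4 -->En tvilsom setning.\n"
    for number, line in find_fuzzy(sample):
        print(number, line)
```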
The TRANSLATION-STATUS file
===========================
The Subversion book isn’t exactly small, so to keep things tidy and
manageable, the current status of the translation is stored in the
TRANSLATION-STATUS file. This file is generated by "make status" and
lists the translation status of every file.
To be able to generate this file, special markers are used in the
DocBook files to mark untranslated blocks and parts which need
proofreading. The format of the marker is:
"" — Start
"" — End
"type" can be "TR" for untranslated parts, or "CHK" for parts that need
checking/proofreading.
This text is not translated yet.
This text is translated, but needs checking.
The markers should be placed at the beginning of a separate line.
Keeping the translation updated
===============================
Subversion develops rapidly, so a translation that is not kept in sync
with the English version soon becomes old and incomplete. To make the
process of synchronising the translated version with the English
version as safe and easy as possible, Subversion itself is used to
merge new changes from the src/en/ directory structure into the
src/nb/ area. This is taken care of by "make sync", which does the
following things:
1. sets all the files into edit mode (removes the ﳢs) to
minimise the chance of conflicts,
2. merges the changes since the last synchronisation into the
translation from the English version, and
3. sets all files back to commit mode.
You can then copy the exact changes into the translated sections. Using
a visual diff tool in these circumstances is a great help.
To avoid unnecessary conflicts, it is very important that the English
text remain unchanged. Don’t remove or add whitespace, for example. New
content is _added_ to the file; it does not _replace_ anything. Use
those @ENGLISH markers. If this is done properly, synchronising against
the English version is quite easy and seldom produces conflicts. An
example of a sync session goes as follows:
cd src/nb/
make sync
# New updates from the book are now merged into the working copy. By
# using tools like svndiff or vimdiff (part of the Vim package) it is
# fairly easy to copy the changes from the English sections to the
# translated sections.
#
# After you’ve finished editing the files, ensure that all files are
# ready for commit:
make commitmode
# Then, check that nothing in the English parts has changed, i.e. the
# XML file corresponds to the English original:
make bookdiff
# Check that the XML doesn’t contain errors:
make valid
# If xmllint(1) is not available, make a test build:
make book-html
# Update the TRANSLATION-STATUS file:
make status
# Commit:
svn commit
That’s mostly it. Doing this on a regular basis or when big things
happen in the English section eases the sync pain.
But if the last sync was done a long time ago, or there have been
horrible revolutions in the src/en/ directory so that things get really
messy after the merge, you have the option to do the merge in smaller
steps. Let’s say that a file rename happens in r1000 and there are lots
of file changes before and after this rename. What you can do is first
merge in all the changes from when the last synchronisation was done
(stored in the LAST_SYNC file) up to r999, and then take the following
merges in decent steps:
make sync HEAD=999
This will only merge the changes up to r999, hopefully making life
easier. The step involving “make sync” can be repeated several times
even if the files in the working copy are modified.
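
If several such steps are needed, the list of commands can be written
down before starting. The Python sketch below only builds the command
lines and does not run anything; the function name and the chosen
revision numbers are made up for the example, and it assumes that a
plain "make sync" merges up to the newest revision.

```python
def sync_commands(stops):
    """Build one "make sync" command per intermediate revision, followed
    by a final plain "make sync" for the remaining changes."""
    commands = [f"make sync HEAD={rev}" for rev in sorted(stops)]
    commands.append("make sync")
    return commands


if __name__ == "__main__":
    # Stop just before and at the hypothetical rename in r1000.
    for command in sync_commands([999, 1000]):
        print(command)
```

Conflicts are resolved between the steps, before the next command is
run.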
Some make commands
==================
Almost every action on the files can be done from various make commands.
In addition to the make targets available in the English version, these
can also be used:
* make sync
Merge all new changes from the English XML files to the translated
version.
* make bookdiff
Check that the English parts of the translated files are identical
to the original English data. Creates two files, bd.tmp/eng.txt and
bd.tmp/norw.txt, which can be studied if there are inconsistencies
between the versions.
* make status
Update the TRANSLATION-STATUS file with the current statistics.
* make engdir
Create a book/eng/ directory which is nice to have around if it’s
necessary to diff against the master files.
* make commitmode
Prepare the files for commit.
* make editmode
Remove the ﳢ entities to make it easier to compare the files
against the original version, for example the files in
src/nb/book/eng/ which are created by make engdir.
Scripts
=======
* src/nb/bin/clean_files
Converts between editmode and commitmode.
* src/nb/bin/dings_it
Used during the translation. Takes care of commenting out the
English text and copies all the XML elements. Called from the
macro in svnbook.vim and is expected to exist as ~/bin/dings_it .
* src/nb/bin/genstat
Generates translation statistics for use in the TRANSLATION-STATUS
file.
* src/nb/bin/h2u
* src/nb/bin/u2h
Two scripts which are not used anymore. They convert between numerical
entities and UTF-8.
* src/nb/bin/strip_english
Removes the @ENGLISH markers with the English text from the XML
files, useful if the XML source is going to be distributed and
duplication of the English text is not a good thing.
vim: set fo=tcq2w tw=72 fenc=utf8 ts=2 sw=2 sts=2 et :
vim: set com=b\:#,fb\:-,fb\:*,n\:> :