Converter hdoc_to_pdf
-----------------------

This converter produces a PDF file from an hdoc document.


License GPL3.0
--------------

http://www.gnu.org/licenses/gpl-3.0.txt


Credits
-------

*   2016
    - Raphaël Debray
    - Baptiste Perraud


Dependencies
------------


This project can be used on its own if you only want to convert an hdoc file into a PDF file.


User documentation
------------------

There are two ways to use the hdoc_to_pdf converter: by running the run.bat/run.sh script, or from the command line in a terminal, which lets you specify some parameters.
The samples folder contains an hdoc file that can be used for testing.

#### Running the script run.bat/run.sh:

Use this method if you do not want to use a terminal.

1. Download hdoc_converter.zip and unzip it.
2. Put your source file in the input folder. It must be a .hdoc file.
3. Make sure it is the _only file_ in that folder.
4. On Linux or Mac, run the script run.sh. On Windows, run the script run.bat.
5. Once the script finishes, the converted file is in the output folder.

#### Terminal:

Using the terminal lets you pass parameters to the conversion. At the moment, the only parameter is the source file.

1. Download hdoc_converter.zip and unzip it.
2. Open your terminal and go into the folder hdoc_to_pdf.
3. Run the following command:

    "ant -buildfile hdoc_to_pdf.ant"

    To specify the source file, add the -DInputFile parameter.
    Example:

    "ant -buildfile hdoc_to_pdf.ant -DInputFile=sample.hdoc"


This parameter is optional. Once the command finishes, the converted file is in the output folder.
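
The two forms of the command can be wrapped in a small shell helper. This is only a sketch and not part of the converter: the function name `build_ant_command` is hypothetical, and it prints the command instead of running it, so the result can be checked first. It assumes the file name contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with hdoc_to_pdf): prints the ant
# command line for the converter. The source file argument is optional;
# without it, -DInputFile is simply left out.
build_ant_command() {
    if [ -n "$1" ]; then
        printf 'ant -buildfile hdoc_to_pdf.ant -DInputFile=%s\n' "$1"
    else
        printf 'ant -buildfile hdoc_to_pdf.ant\n'
    fi
}

# Print the command for the arguments given to this script.
build_ant_command "$@"
```

To actually run the conversion, execute the printed command from the hdoc_to_pdf folder, with ant available on the PATH.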


Known bugs
----------

* A ul nested in an ol is sometimes converted to an ol.
* FS does not seem to support max-width on img tags, which makes proper scaling harder.
* ToC line rendering is sometimes ugly when the title label is too long: the dotted leader or even the page number may wrap onto the following line, and the two sometimes overlap.
* Inline elements such as em cause bad paragraph justification.
* Sometimes there are unwanted page breaks after a heading.

Generic Todo
------------

* Generate a clean PDF file (using the LaTeX formatting example)
    - Create a default CSS file with basic spine rules
    - Get the right free font (equivalent to the one LaTeX uses)
* Generate the ToC from the hdoc headings converted by XSL
* Handle widows and orphans as fully as possible, trying to match Prince's layout and implementing the suitable CSS rules (which shall not be interpreted by FS)
* Allow the user to override some specific CSS rules, according to the main layout logical rules
* Validate container.xml and content.xml using jing, as the jing task can't be handled through the opale_to_pdf.ant call.
* Bonus: find an HTML editor for manually adding line breaks to an hdoc file, in order to resolve widow and orphan problems after the PDF file has been generated

Specific Todo list
------------------

* Handle the spacing bug that appears after a link.
* Add the binding ("bound") parameter to the ant script
* Integrate the CSS styles for the "bound" parameter into an xsl
* Add the double-sided (recto-verso) parameter to the ant script
* Integrate the CSS styles for the double-sided parameter into an xsl
* Add margin support for bound oneside documents
* Add margin support for bound twoside documents
* Identify the main CSS rules for handling tables
* Handle widow/orphan spacing for paragraphs
* Handle widow/orphan spacing for lists
* Handle widow/orphan spacing for tables
* Handle widow/orphan spacing for images
* Object support: add an instruction to the README to convert any graphic object (odg, etc.) to an image before running the converter
* Object support: add xsl rules transforming <object> into <img>
* Allow the user to override the CSS rules according to the logical rules of the default layout



Technical notes
---------------

* At the moment, this converter works with _only one_ hdoc file in the input folder; please clean the folder before adding the hdoc you want to convert to PDF. Once the hdoc_to_pdf converter supports multiple files, the opale_to_pdf converter should work as well, because it already implements the opale_to_hdoc multi-file handling (copying all the hdoc results into the input directory of the hdoc_to_pdf converter).
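
The single-file constraint can be checked before launching a conversion. The following is a minimal sketch, assuming the input folder is named input as described in the user documentation; the function `check_single_hdoc` is hypothetical and not part of the converter.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not shipped with hdoc_to_pdf):
# succeeds only if the given folder (default: input) contains
# exactly one regular file ending in .hdoc.
check_single_hdoc() {
    dir="${1:-input}"
    count=0
    for f in "$dir"/*.hdoc; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one .hdoc file in $dir, found $count" >&2
        return 1
    fi
}
```

A possible use from the hdoc_to_pdf folder is `check_single_hdoc input && ant -buildfile hdoc_to_pdf.ant`, so that ant only starts when the folder is in the expected state. Note that the check only counts .hdoc files; other files left in the folder are not detected.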

User Story
----------

* Case of a single hdoc file to convert:
  * The user has an hdoc file as input and wants a paginated PDF file as output.
  * They go to the hdoc_to_pdf converter (dedicated folder).
  * They place the hdoc file in the input folder.
  * They run the run.bat/run.sh script or directly execute the ant script hdoc_to_pdf.ant.
  * They retrieve the PDF file from the output folder.


Capitalisation
--------------