# Converter framapad_to_hdoc

## License
License GPL3.0
http://www.gnu.org/licenses/gpl-3.0.txt

## Credits
- 2016
    - Etienne Chognard
    - Fabien Boucaud
- 2015
    - Jean-Côme Douteau
    - Gabrielle Rit
    - Jean Vintache
- 2014
    - Fecherolle Cécile

## Presentation
This module converts one or more [framapad](https://framapad.org/) documents, exported as HTML files, to the hdoc format.

## User documentation

## User Story

You are a framapad user and you create a pad for a project. After working on your pad, you want to convert the document to another format that you can use in a new working context. To do so, export the framapad as HTML with the "Import/Export" button. Then get the files needed for the framapad to hdoc transformation from the git repository of the hdoc project (see http://hdoc.crzt.fr/). All that remains is to place the exported HTML file in the `input` folder of the framapad_to_hdoc folder and to run `run.bat` if you are on Windows or `run.sh` if you are on Linux/Mac. This produces a .hdoc archive, whose purpose is to serve as a pivot format when converting from one format to another, for a wide variety of formats. You then choose the new format you want for your hdoc and use the appropriate converter, if one exists.

## Running framapad_to_hdoc.ant
1. Create a framapad document and export it as an HTML file.
2. Place your HTML files in the `/input` folder.
3. Run the `run.[bat|sh]` script that matches your OS.
4. Retrieve the hdoc outputs in the `/output` folder.
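
The steps above can be scripted. The following is a minimal Python sketch, not part of the project: the helper names `launcher_name` and `stage_inputs` are hypothetical, and it only assumes the `input` folder and the `run.bat` / `run.sh` scripts mentioned above.

```python
import platform
import shutil
from pathlib import Path


def launcher_name(system=None):
    """Return the launcher script for the OS: run.bat on Windows, run.sh otherwise."""
    system = system or platform.system()
    return "run.bat" if system == "Windows" else "run.sh"


def stage_inputs(html_files, project_dir):
    """Copy exported pads into <project_dir>/input and return the copied paths."""
    input_dir = Path(project_dir) / "input"
    input_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the file it created in the target folder
    return [Path(shutil.copy(html_file, input_dir)) for html_file in html_files]
```

After staging the files, the script returned by `launcher_name()` is run from the framapad_to_hdoc folder, and the results appear in `output`.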

## Product Backlog

Currently (January 2017) available at: https://framemo.org/framapad_to_opale

See also https://bimestriel.framapad.org/p/nf29_framapad_to_opale for the full documentation of our working process.

## TODO
- Code tags
- Markdown
- Tags for typing the structure


## Technical notes
### Description of framapad_to_hdoc.ant

#### Prelude
- Import the necessary libraries (antlib, htmlcleaner, jing)
- Create the directory tree

#### Transformations
- Use htmlcleaner to transform the input file from HTML to XHTML. For more info, see http://htmlcleaner.sourceforge.net/index.php.
- Apply html2xhtml.xsl: this xsl extracts the content of the `<body>` tags
- Apply html2xhtmlv1.xsl: this xsl is used as a fix and adds a br tag at the end of lists (ul and ol)
- Apply html2xhtmlv2.xsl: this xsl wraps each line of text in p tags and transforms non-hdoc tags (such as s, u and strong) into hdoc tags
- Apply html2xhtml3.xsl: this xsl is used as a fix; it removes p tags whose child is a ul or ol
- Apply html2hdocstruct1 to 6: these xsl files build the hdoc structure from the titles h1 to h6
- Apply html2hdocstructdivsection: this xsl completes the sections created by the previous xsl files by wrapping the actual content of each level in a `<div>`
- Apply xhtml2hdoc.xsl: this xsl transforms the content into the hdoc structure and changes the namespace
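
The chain above is a sequence of passes in which each pass consumes the output of the previous one. The sketch below illustrates that idea in Python with the standard library; it is not the project's XSLT code. The function names are hypothetical, and only two of the passes are imitated (extracting the body, and removing a p that wraps a list).

```python
import xml.etree.ElementTree as ET
from functools import reduce


def extract_body(root):
    """Keep only the <body> element, similar in spirit to the html2xhtml.xsl pass."""
    body = root if root.tag == "body" else root.find(".//body")
    if body is None:
        raise ValueError("no <body> element found")
    return body


def unwrap_lists(root):
    """Replace a <p> whose single child is a <ul> or <ol> by that list."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            wraps_list = (
                child.tag == "p"
                and len(child) == 1
                and child[0].tag in ("ul", "ol")
                and not (child.text or "").strip()
            )
            if wraps_list:
                inner = child[0]
                inner.tail = child.tail  # keep any text that followed the <p>
                parent[index] = inner
    return root


def run_pipeline(root, steps):
    """Apply each pass in order, feeding it the result of the previous pass."""
    return reduce(lambda tree, step: step(tree), steps, root)
```

For example, `run_pipeline(ET.fromstring(xhtml), [extract_body, unwrap_lists])` returns the body element with list wrappers removed. Keeping each fix in its own small pass mirrors the way the stylesheets are split.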

#### Post-transformation actions
- Build the hdoc structure
- Jing validates the output file against the appropriate RNG schema
- Zip the directory into an hdoc archive
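
The last step, zipping a directory into an archive, can be sketched as follows. This is an illustration in Python rather than the Ant task the project uses; the function name `zip_to_hdoc` is hypothetical, and the sketch makes no assumption about which files an hdoc archive must contain.

```python
import zipfile
from pathlib import Path


def zip_to_hdoc(source_dir, archive_path):
    """Zip every file under source_dir into archive_path, keeping relative paths."""
    source = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # store paths relative to the source folder, with forward slashes
                archive.write(path, path.relative_to(source).as_posix())
    return Path(archive_path)
```

The archive should be written outside `source_dir`, otherwise it would try to include itself.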

### Supported tags
- html tags -> hdoc tags
- u, s, em, strong, color -> em
- sub -> sub
- sup -> sup
- li -> li
- ol -> ol
- br -> p
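
The mapping above can be read as a lookup table. The sketch below applies the simple renames with Python's standard library; it is only an illustration, not the project's stylesheet. `TAG_MAP` and `rename_tags` are hypothetical names, and `color` and `br -> p` are left out because they are not plain one-to-one renames of an element.

```python
import xml.etree.ElementTree as ET

# html tag -> hdoc tag, taken from the list above (color and br are not handled here)
TAG_MAP = {
    "u": "em",
    "s": "em",
    "em": "em",
    "strong": "em",
    "sub": "sub",
    "sup": "sup",
    "li": "li",
    "ol": "ol",
}


def rename_tags(root):
    """Rename every element found in TAG_MAP; other elements keep their tag."""
    for element in root.iter():
        element.tag = TAG_MAP.get(element.tag, element.tag)
    return root
```

For instance, `rename_tags(ET.fromstring("<p><strong>a</strong></p>"))` yields a tree that serializes as `<p><em>a</em></p>`.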

### Titles
- If it's h1 -> new section
	- If following title is not h1 -> new division in the section
		- If following title is at the same level -> new division at the same level
		- If following title is at a lower level -> new division in the previous division
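
The nesting rules above amount to a stack walk over the titles in document order. The following Python sketch shows one way to express them; it assumes the titles have already been extracted as `(level, title)` pairs, and `nest_headings` is a hypothetical helper, not code from the project.

```python
def nest_headings(headings):
    """Turn a flat list of (level, title) pairs into nested sections.

    Level 1 titles become top-level sections; a deeper title becomes a
    division inside the closest preceding title with a smaller level.
    """
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        # close every open section that is at the same level or deeper
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]
```

With the input `[(1, "A"), (2, "A.1"), (2, "A.2"), (3, "A.2.a"), (1, "B")]`, the result has two sections, `A` and `B`; `A` holds the divisions `A.1` and `A.2`, and `A.2` holds `A.2.a`.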

## Lessons learned
Using regular expressions with XSL is a good way to parse a non-XML file.

Regarding the creation of the hdoc structure from the titles h1 to h6, XSLT is probably not the best tool for this specific transformation. Something like SAX, which reads the XML file sequentially, would probably be more efficient and easier to use for this kind of transformation.
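
To illustrate the remark about SAX, here is a minimal sketch using Python's standard `xml.sax` module. It has not been tried against the project's files; the class `HeadingOutline` and the function `outline_of` are hypothetical names. It reads the document once, in order, and records the nesting depth of each title as it goes.

```python
import xml.sax


class HeadingOutline(xml.sax.ContentHandler):
    """Collect (depth, title) pairs for h1..h6 while reading the file sequentially."""

    def __init__(self):
        super().__init__()
        self.outline = []       # (depth, title) in document order
        self._open_levels = []  # levels of the sections currently open
        self._current = None    # level of the heading being read, if any
        self._text = []

    def startElement(self, name, attrs):
        if len(name) == 2 and name[0] == "h" and name[1] in "123456":
            self._current = int(name[1])
            self._text = []

    def characters(self, content):
        if self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if self._current is not None and name == f"h{self._current}":
            # close the sections that are at the same level or deeper
            while self._open_levels and self._open_levels[-1] >= self._current:
                self._open_levels.pop()
            self._open_levels.append(self._current)
            self.outline.append((len(self._open_levels), "".join(self._text).strip()))
            self._current = None


def outline_of(xhtml):
    """Parse an XHTML string and return its heading outline."""
    handler = HeadingOutline()
    xml.sax.parseString(xhtml.encode("utf-8"), handler)
    return handler.outline
```

Because the handler only keeps a small stack of open levels, the structure is built in a single pass, without one stylesheet per title level.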