Commit 920f556b authored by Raphaël

Merge branch 'master' of https://gitlab.utc.fr/crozatst/hdoc

parents 43f9dd2c 8b0efaae
......@@ -54,7 +54,8 @@ Step by step :
* searchDocByAuthor.xqm
* searchDocByTitle.xqm
* searchSectionByTitle.xqm
P.S. The symbols "(: :)" delimit comments in XQuery files.
TODO List
------------------
......
......@@ -12,16 +12,29 @@ Kapilraj Thangeswaran
This module is able to extract data from a file in Hdoc format and insert them into MongoDB.
## Dependencies
In order to work properly this module needs
- In order to make this module work you have to download and install Node.js from the [Node.js download page](https://nodejs.org/en/).
- If needed, download and install MongoDB from the [MongoDB download page](https://www.mongodb.com/download-center#community).
To work properly, this module requires:
- Node.js, downloaded and installed
For Windows:
- from the [Node.js download page](https://nodejs.org/en/)
For Linux (instructions for Debian 8; they may vary depending on your distribution), execute the following commands:
- `su`
- `apt install nodejs`
- `apt install node`
- `apt install npm`
- MongoDB, downloaded and installed
For Windows:
- from the [MongoDB download page](https://www.mongodb.com/download-center#community)
For Linux:
- `su -c 'apt install mongodb'`
## Instructions
1. Install dependencies
2. Add all your hdoc documents in an "input" folder
3. Add or edit "config.xml" file in "input" folder (for more details, please check "Input configuration")
4. Edit "config.json" file from "mongo" folder (for more details, please check "Mongo configuration")
5. Execute run.bat or run.sh
5. Make sure that MongoDB is running (`mongod.exe --rest --jsonp` command from "MongoDB/Server/3.2/bin" folder)
6. Execute run.bat, or run `chmod +x run.sh && ./run.sh`
## Web
This module provides a Web application to access MongoDB and execute simple requests.
......
......@@ -58,9 +58,27 @@
<delete dir="${tmpdir}" />
</target>
<target name="mongoDB" depends="main">
<exec executable="node" dir="mongo">
<arg line="main.js"/>
</exec>
</target>
<condition property="isWindows">
<os family="windows" />
</condition>
<condition property="isUnix">
<os family="unix" />
</condition>
<target name="windowsMongoDB" if="isWindows" depends="main">
<exec executable="node" dir="mongo">
<arg line="main.js"/>
</exec>
</target>
<target name="linuxMongoDB" if="isUnix" depends="main">
<exec executable="nodejs" dir="mongo">
<arg line="main.js"/>
</exec>
</target>
<target name="mongoDB" depends="windowsMongoDB, linuxMongoDB">
<echo>End</echo>
</target>
</project>
\ No newline at end of file
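The OS-dependent choice made by the `windowsMongoDB` and `linuxMongoDB` targets above can be sketched in Node.js as follows. This is only an illustration: `nodeExecutableFor` and `runMain` are hypothetical names that do not exist in the repository, and the sketch assumes, like the Ant file, that the binary is called `node` on Windows and `nodejs` elsewhere.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: mirrors the isWindows / isUnix conditions of the Ant build.
function nodeExecutableFor(platform) {
  // Node.js reports every Windows system as "win32".
  return platform === 'win32' ? 'node' : 'nodejs';
}

// Hypothetical helper: runs main.js from the "mongo" folder, like the <exec> tasks.
function runMain(platform) {
  return spawnSync(nodeExecutableFor(platform || process.platform), ['main.js'], {
    cwd: 'mongo',
    stdio: 'inherit'
  });
}
```

Calling `runMain()` from the module root would then start `main.js` with whichever executable name matches the current platform.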
......@@ -42,8 +42,8 @@ var removeDocument = function(db, collection, json) {
MongoClient.connect(url, function(err, db) {
assert.equal(null, err);
fs.readdir(outputFolder, (err, files) => {
files.forEach(file => {
fs.readdir(outputFolder, function(err, files) {
files.forEach(function(file) {
var json = JSON.parse(fs.readFileSync(outputFolder + "/" + file));
if(config.request === 'insert') {
insertDocument(db, config.collection, json);
......
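The loop above parses every entry of the output folder without checking `err` or the file type. A more defensive variant could look like the sketch below. It is not the module's code: `readJsonDocuments` is a hypothetical helper, and the sketch assumes that the generated files carry a `.json` extension.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper, not part of main.js: returns the parsed content of every
// .json file in `folder`, skipping entries that are not valid JSON.
function readJsonDocuments(folder) {
  const documents = [];
  fs.readdirSync(folder).forEach(function (file) {
    if (path.extname(file) !== '.json') {
      return; // ignore .gitkeep and other non-JSON entries
    }
    try {
      documents.push(JSON.parse(fs.readFileSync(path.join(folder, file), 'utf8')));
    } catch (err) {
      // Report the unreadable or malformed file and continue with the others.
      console.error('Skipping ' + file + ': ' + err.message);
    }
  });
  return documents;
}
```

Each returned object could then be passed to `insertDocument` or `removeDocument`, depending on `config.request`.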
......@@ -29,7 +29,7 @@
}
</xsl:template>
<xsl:template match="div[@data-hdoc-type='question']" >
<xsl:template match="div[@data-hdoc-type='question' and position() != last()]" >
{
<xsl:apply-templates select="div[@data-hdoc-type='description']"/>
<xsl:apply-templates select="div[@data-hdoc-type='solution']"/>
......
!tmp/.gitkeep
!input/.gitkeep
!output/.gitkeep
......@@ -43,7 +43,7 @@ Use this method if you do not want to use a terminal.
#### Terminal:
By using the terminal you can specify some parameters to the conversion at the moment: the source file.
From the terminal you can currently specify one conversion parameter: the source file.
1. Download hdoc_converter.zip and unzip it.
2. Open your terminal and go into the folder hdoc_to_pdf.
......@@ -55,37 +55,41 @@ By using the terminal you can specify some parameters to the conversion at the m
Use -DInputFile to specify the source file.
Example:
"ant -buildfile hdoc_to_optim.ant -DInputFile=sample.hdoc"
"ant -buildfile hdoc_to_pdf.ant -DInputFile=sample.hdoc"
This parameter is optional. Your file has been converted, the result is in the output folder.
Flying Saucer limitations
-------------------------
* Nested ul in ol are sometimes converted to ol [only noticed once, to be verified].
* It seems that FS doesn't support max-width or max-height for img tags, which makes proper scaling harder. For now, as a temporary solution, we scale all images to a width of 80mm.
* ToC line rendering is sometimes ugly if the title label is too long: the dotted leader or even the page number may appear on the following line, sometimes colliding with each other.
* Inline elements like em cause bad paragraph justification if they are rendered at the beginning of a new line.
* FS doesn't support the CSS widows/orphans properties, which makes their handling harder.
Known bugs
----------
* Nested ul in ol are sometimes converted to ol.
* It seems that FS doesn't support max-width for img tags, which makes proper scaling harder.
* ToC line rendering is sometimes ugly if the title label is too long: the dotted leader or even the page number may appear on the following line, sometimes colliding with each other.
* Inline elements like em cause bad paragraph justification.
* Sometimes, there are unwanted page breaks after a heading.
* Sometimes, there are still unwanted page breaks before a heading + list (e.g. h4 then ol).
* A schema validation is run by jing during the hdoc_to_pdf conversion. Normally, a failed validation should abort the process, because the input is then not a valid hdoc file. For now, the script only warns the user and continues, because the schemas and the opale_to_hdoc converter are not yet synchronized (this needs to be corrected).
Generic Todo
------------
* Generate a clean PDF file (using the LaTeX formatting example)
- Create a default CSS file with basic spine rules
- Get the right free font (equivalent to the LaTeX one)
* Generate the ToC according to the converted (by XSL) headings of the hdoc
* Handle widows and orphans as fully as possible, trying to match Prince's layout and implementing the suitable CSS rules (which shall not be interpreted by FS)
* Allow the user to override some specific CSS rules, according to the main layout logical rules
* Manage container.xml and content.xml validations using jing, as the jing task can't be handled with the opale_to_pdf.ant call.
* Bonus: find an HTML editor to manually add line breaks to a hdoc file in order to resolve widow and orphan problems after the PDF file's generation
* Rework the hdoc_to_pdf.ant and find_content.xsl scripts to allow multifiles handling.
* Handle widows and orphans as fully as possible, trying to match Prince's layout and implementing the suitable CSS rules (which shall not be interpreted by FS).
* Allow the user to override some specific CSS rules, according to the main layout logical rules.
* Provide the user with a full set of options/parameters to customise the output: bound/unbound, odd/even margins, report/article LaTeX format (first page formatting), etc.
* Bonus: find an HTML editor to manually add line breaks to a hdoc file in order to resolve widow and orphan problems after the PDF file's generation.
Specific Todo list
------------------
* Fix the spacing bug that appears after a link.
* Add the binding parameter ("bound") to the ant script
* Integrate the CSS styles that depend on the "bound" parameter into an XSL file
* Add the double-sided (recto-verso) parameter to the ant script
......@@ -107,6 +111,7 @@ Technical notes
---------------
* This converter works with _only one_ hdoc file in the input folder at the moment, please ensure to clean the folder before proceeding with the hdoc you want to convert to PDF. When the multifiles ability is set within the hdoc_to_pdf converter, the opale_to_pdf one shall naturally work because it already implements the opale_to_hdoc multifiles handling (the copy of all the hdoc results into the input directory of the hdoc_to_pdf converter).
* The java classes we use for the project are located in the "lib/MyPDFGenerator Sources" folder, please modify these if needed before compiling and adding the new jar file to the lib folder. In Eclipse, when the class is modified and ready to be exported, please choose the "Runnable jar file" export option.
User Story
----------
......@@ -121,3 +126,7 @@ User Story
Capitalisation
--------------
* A16: during this semester, we have built a hdoc_to_pdf converter from scratch, which aims to be integrated into the global hdoc project. We use the Java library Flying Saucer (FS) for this purpose, but this tool has some limitations; the ones we have already noticed are listed above.
At the moment, the converter is functional and deals with the main PDF layout properties: title and authors, page numbering, heading ranks, ToC generation, basic inline formatting (+ fonts) and nested lists, for instance. Some elements still need to be worked on, especially the widows/orphans behaviour for lists. Other elements still need to be handled, like tables or specific objects (e.g. odg resources).
The main objective has been to keep, whenever possible, the right formatting and typographic rules (often in comparison to the LaTeX ones), and thus to deliver a readable printed document at the end.
......@@ -54,4 +54,6 @@ Step by step :
Available library modules :
* searchDocByAuthor.xqm
* searchDocByTitle.xqm
* searchSectionByTitle.xqm
\ No newline at end of file
* searchSectionByTitle.xqm
P.S. The symbols "(: :)" delimit comments in XQuery files.
\ No newline at end of file
......@@ -35,11 +35,14 @@ Full use of opale_to_elasticSearch requires the use of the
- Download Logstash: https://www.elastic.co/fr/downloads/logstash and extract the archive
- Download Kibana: https://www.elastic.co/fr/downloads/kibana and extract the archive
- Copy the esconf.conf file located in opale_to_elasticSearch/logstash/conf/ into %{dossier_installation_logstash}/ (your Logstash installation folder)
- Edit the esconf.conf file: on line 11, replace "path => ["/opale_to_elasticSearch/logstash/input/*.json"]" with "path => ["%{votreCheminAbsolu}/opale_to_elasticSearch/logstash/input/*.json"]", where %{votreCheminAbsolu} is your absolute path
- Save the changes.
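The manual edit of esconf.conf described above could also be scripted. The sketch below is only an illustration under assumptions: `rewriteInputPath` is a hypothetical helper that is not part of the repository, and it assumes that the `path` setting appears in the file exactly as quoted in the instructions.

```javascript
// Hypothetical helper: prefixes the Logstash input path with an absolute root folder.
function rewriteInputPath(confText, absoluteRoot) {
  const root = absoluteRoot.replace(/\/+$/, ''); // drop trailing slashes
  const original = 'path => ["/opale_to_elasticSearch/logstash/input/*.json"]';
  // A replacement function avoids special handling of "$" sequences in the root.
  return confText.replace(original, function () {
    return 'path => ["' + root + '/opale_to_elasticSearch/logstash/input/*.json"]';
  });
}
```

The function works on the file content as a string, so reading and writing esconf.conf (for example with `fs.readFileSync` and `fs.writeFileSync`) is left to the caller.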
Steps:
- go to your Elasticsearch installation folder and run bin/elasticsearch
- go to your Kibana installation folder and run bin/kibana
- go to your Logstash installation folder and run bin/logstash -f esconf.conf
- Wait for the Logstash messages indicating that it started without problems.
- run the opale_to_elasticsearch transformation, after first placing the *.scar files in opale_to_elasticsearch/input
- The output folder is not the usual opale_to_elasticsearch/output but opale_to_elasticsearch/logstash/input, so that the output feeds directly into Logstash
- Normally the Logstash logs report the insertion of the transformation outputs; for now, this sometimes happens only when Logstash stops, in which case stop it.
......
......@@ -12,18 +12,31 @@ Kapilraj Thangeswaran
This module is able to extract data from a file in Opale format and insert them into MongoDB.
## Dependencies
In order to work properly this module needs
- In order to make this module work you have to download and install Node.js from the [Node.js download page](https://nodejs.org/en/).
- If needed, download and install MongoDB from the [MongoDB download page](https://www.mongodb.com/download-center#community).
To work properly, this module requires:
- [`opale_to_hdoc`](https://gitlab.utc.fr/crozatst/hdoc/tree/master/opale_to_hdoc) (Opale to Hdoc conversion)
- [`hdoc_to_mongo`](https://gitlab.utc.fr/crozatst/hdoc/tree/master/hdoc_to_mongo) (Hdoc to Mongo conversion)
- Node.js, downloaded and installed
For Windows:
- from the [Node.js download page](https://nodejs.org/en/)
For Linux (instructions for Debian 8; they may vary depending on your distribution), execute the following commands:
- `su`
- `apt install nodejs`
- `apt install node`
- `apt install npm`
- MongoDB, downloaded and installed
For Windows:
- from the [MongoDB download page](https://www.mongodb.com/download-center#community)
For Linux:
- `su -c 'apt install mongodb'`
## Instructions
1. Install dependencies
2. Add all your hdoc documents in an "input" folder
3. Add or edit "config.xml" file in "input" folder (for more details, please check "Input configuration")
4. Edit "config.json" file from "mongo" folder in "hdoc_to_mongo" module (for more details, please check "Mongo configuration")
5. Execute run.bat or run.sh
5. Make sure that MongoDB is running (`mongod.exe --rest --jsonp` command from "MongoDB/Server/3.2/bin" folder)
6. Execute run.bat, or run `chmod +x run.sh && ./run.sh`
## Input configuration
You can add or edit "config.xml" in "input" folder to provide more information about your documents.
......
......@@ -44,6 +44,8 @@ Follow the steps above to get a pdf file from a scar one:
4. On Linux or Mac, run the script run.sh. On Windows, run the script run.bat.
5. Your file has been converted, the result is in the output folder.
**Warning: on Windows systems, the script may fail because some folders lack write permissions. Try re-running it once or twice and it should work (until this bug is solved).**
Unsupported
-----------
......@@ -53,6 +55,8 @@ Refer to the unsupported elements in Opale to Hdoc and in Hdoc to Pdf.
Known bugs
----------
- On Windows, the run.bat script sometimes fails (problems when it tries to remove some folders).
Refer to the known bugs in Opale to Hdoc and in Hdoc to Pdf.
......