The « xhtml » function

The xhtml function is one of the functions available by default in any POD context. Used via a from clause (see the previous section), it converts chunks of XHTML code (generally passed as strings) into chunks of ODF code within the resulting ODT document. This is useful when using POD with, for example, the "web framework" part of Appy, which stores some data in XHTML format, such as content encoded via ckeditor fields.

Suppose you want to render the following chunk of XHTML code somewhere in your POD result.

XHTML code XHTML rendering (raw)
<p>Te<b>s</b>t1 : <b>bold</b>, i<i>tal</i>ics, exponent<sup>34</sup>, sub<sub>45</sub>.</p>
<p>An <a href="http://www.google.com">hyperlink</a> to Google.</p>
<ol><li>Number list, item 1</li>
<ol><li>Sub-item 1</li><li>Sub-Item 2</li>
<ol><li>Sub-sub-item A</li><li>Sub-sub-item B <i>italic</i>.</li></ol>
</ol>
</ol>
<ul><li>A bullet</li>
<ul><li>A sub-bullet</li>
<ul><li>A sub-sub-bullet</li></ul>
<ol><li>A sub-sub number</li><li>Another.<br /></li></ol>
</ul>
</ul>
<h2>Heading</h2>
<p>Text</p>
<h3>SubHeading</h3>
<p>Subheading text</p>

Define the following POD template. Variable xhtmlChunk contains the XHTML code shown above, as a string.

The rendering produces this document.

The OpenDocument rendering differs slightly from the XHTML rendering shown above. This is because POD uses the styles found in the POD template and tries to match style information in the XHTML chunk with styles present in the POD template. By default, when POD encounters an XHTML tag:

  • it checks if a "class" attribute is defined on this tag. If yes, and if a style with the same "display name" is found in the OpenDocument template, this style will be used. The "display name" of an OpenDocument style is the name of the style as it appears in LibreOffice;
  • if no "class" attribute is present, and if the XHTML tag is a heading (h1 to h6), POD tries to find an OpenDocument style which has the same outline level. For example, tag h1 may be mapped to style Heading 1. This is what happened in the example above;
  • else, no style at all is applied.
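
The three rules above can be sketched in plain Python. This is only an illustration, not POD's actual code: the function name resolve_default_style and its two lookup arguments (a set of style display names, and a dictionary of heading styles keyed by outline level) are hypothetical stand-ins for what POD reads from the template.

```python
import re

# Matches XHTML heading tags h1 to h6 and captures the level
HEADING = re.compile(r'^h([1-6])$')

def resolve_default_style(tag, css_class, style_names, outline_styles):
    """Return the display name of the style to apply, or None.

    Hypothetical helper: 'style_names' is the set of style display names
    stored in the template; 'outline_styles' maps an outline level (int)
    to the display name of the corresponding heading style.
    """
    # Rule 1: a "class" attribute is defined. Following the wording above
    # literally, the outline-level rule is not tried in this case.
    if css_class is not None:
        return css_class if css_class in style_names else None
    # Rule 2: no class, and the tag is a heading: match on outline level
    match = HEADING.match(tag)
    if match:
        return outline_styles.get(int(match.group(1)))
    # Rule 3: no style at all
    return None

# Made-up template content, for illustration only
style_names = {'Heading 1', 'Heading 2', 'Highlight'}
outline_styles = {1: 'Heading 1', 2: 'Heading 2'}

print(resolve_default_style('p', 'Highlight', style_names, outline_styles))
print(resolve_default_style('h2', None, style_names, outline_styles))
print(resolve_default_style('p', None, style_names, outline_styles))
```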

You can customize this behaviour by defining styles mappings.

Style mappings

You can define styles mappings at two different levels. The first, global level consists in defining a styles mapping when creating a Renderer instance, via parameter stylesMapping. A styles mapping is, basically, a Python dictionary whose keys are either CSS class names or XHTML element names, and whose values are display names of OpenDocument styles that must be present in the POD template. Every time you invoke the xhtml function in a POD template, the global styles mapping comes into play.

Note that in an OpenDocument document, LibreOffice stores only the styles that are used in the document. The style names ("Heading 1", "Standard"...) that appear when opening your template with LibreOffice are thus a superset of the styles that are actually stored in your document. You may consult the list of available styles in your POD template programmatically by calling your POD Renderer's getStyles method.
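
As a minimal sketch, a global styles mapping is a plain dictionary, and it can be checked against the style names stored in the template before rendering. The style and class names below are made up, the helper missing_styles is hypothetical, and the set of available names is assumed to have been collected by you beforehand (the return format of getStyles is not described here). The Renderer call is shown in comments only, since it requires Appy and a real template.

```python
# Keys are CSS class names or XHTML element names; values are display
# names of styles expected in the POD template (all names are made up).
styles_mapping = {
    'p': 'Text body',       # Applied to XHTML "p" tags
    'smallText': 'Small',   # Applied to tags having class="smallText"
}

def missing_styles(mapping, available_names):
    """Return, sorted, the mapped style names absent from the template."""
    return sorted({name for name in mapping.values()
                   if isinstance(name, str) and name not in available_names})

# Assumption: display names of the styles stored in the template
available = {'Text body', 'Heading 1'}
print(missing_styles(styles_mapping, available))

# Assumed usage with Appy installed (file names are placeholders):
# from appy.pod.renderer import Renderer
# renderer = Renderer('template.odt', {'xhtmlChunk': chunk}, 'result.odt',
#                     stylesMapping=styles_mapping)
# renderer.run()
```

A style reported as missing can be made available with the technique described in the next section.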

Ensure styles are stored in your POD templates

If the style you want to use is not stored in the POD template, a solution is to define a section with a series of paragraphs onto which you apply the styles you want to be included in the template. On this section, you set this POD statement:

do section if False

That way, the section and its content will never be rendered in any POD result; but the styles applied to its inner paragraphs will be stored in your POD template.

Here is an example.

The first section will not appear at all in the result, but thanks to the paragraphs defined in it, the following styles will be stored in the POD template: Highlight, ArtSectionBold and ArtSectionItalic.

In the rendered part of the template (in the example, a single paragraph, the last one), any XHTML paragraph tag whose class attribute refers to one of these names will produce, in the POD result, a paragraph onto which the style of the same name is applied.

Key « h* »

In a styles mapping, you can also define a special key, h*, whose value is a positive or negative integer. When POD tries to establish a style correspondence based on outline level, it will add this number to the heading level. For example, if you specify the following styles mapping,

{'h*':-1}

then, when POD encounters tag h2 (provided it does not define a "class" attribute), it looks for an OpenDocument style with an outline level of 2-1=1 (i.e., "Heading 1") and, if found, uses it.
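
The arithmetic behind key h* can be illustrated as follows. The function outline_level_for is a hypothetical helper written for this example, not part of POD.

```python
def outline_level_for(tag, styles_mapping):
    """Return the outline level of the style to look up for a heading tag.

    'tag' is one of 'h1' to 'h6'; the optional 'h*' entry of the mapping
    is added to the heading level (it may be negative).
    """
    level = int(tag[1:])  # 'h2' -> 2
    return level + styles_mapping.get('h*', 0)

print(outline_level_for('h2', {'h*': -1}))  # 2-1
print(outline_level_for('h1', {'h*': 1}))   # 1+1
print(outline_level_for('h3', {}))          # No shift
```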

Local styles mappings

Beyond the single, Renderer-wide global styles mapping, each time you invoke the xhtml function in a POD template, you may specify a local styles mapping in the parameter named stylesMapping, as shown below.

Local styles mappings override what you have (potentially) defined in the global styles mapping.
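
The override rule can be pictured as a dictionary merge in which local keys win. This is a sketch of the rule only, with made-up style names; effective_mapping is a hypothetical helper and POD's internal representation is richer than a flat dictionary.

```python
def effective_mapping(global_mapping, local_mapping):
    """Return the mapping that applies to one call of the xhtml function."""
    merged = dict(global_mapping)  # Start from the Renderer-wide mapping
    merged.update(local_mapping)   # Any local key overrides the global one
    return merged

global_mapping = {'p': 'Text body', 'h*': -1}
local_mapping = {'p': 'Compact text'}

print(effective_mapping(global_mapping, local_mapping))
```

Keys that are only defined globally (here, h*) remain in effect for the call.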

Current restrictions

At present, the XHTML "a" tag cannot be style-mapped (it may not appear in styles mappings) because POD uses its own automatically generated OpenDocument style for it.

Styles mapping – Reference

There are a lot more possibilities with styles mappings. The complete reference is found in code comments, on method checkStylesMapping defined on class StylesManager from file appy/pod/styles_manager.py. This method's signature and documentation are shown below.

    def checkStylesMapping(self, stylesMapping):
        '''Checks that the given stylesMapping is correct, and returns the
           internal representation of it.'''

        # stylesMapping is a dict.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # Every key can be:
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (1) | the name of a XHTML 'paragraph-like' tag (p, h1, h2...) or the
        #     | meta-name "para", representing "p", "div", "blockquote" or
        #     | "address";
        # (2) | the name of a XHTML 'text-like' tag (span, b, i, em...);
        # (3) | the name of a CSS class;
        # (4) | string 'h*';
        # (5) | 'table';
        # (6) | 'ol' or 'ul'.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # Every value must be:
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (a) | if the key is (1), (2) or (3), value must be the display name of
        #     | an ODT style;
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (b) | if the key is (4), value must be an integer indicating how to
        #     | map the outline level of outlined styles (ie, for mapping XHTML
        #     | tag "h1" to the OD style with outline-level=2, value must be
        #     | integer "1". In that case, h2 will be mapped to the ODT style
        #     | with outline-level=3, etc.). Note that this value can also be
        #     | negative;
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (c) | if key is "table", the value must be an instance of
        #     | appy.pod.styles_manager.TableProperties;
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (d) | if key is "ol", the value must be an instance of
        #     | appy.pod.styles_manager.NumberedProperties;
        #     | if key is "ul", the value must be an instance of
        #     | appy.pod.styles_manager.BulletedProperties.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # Some details now about keys. If key is (1) or (2), parameters
        # can be given between square brackets. Every such parameter represents
        # a CSS attribute and its value, with 2 possible operators: = and !=.
        # For example, a key can be:
        #
        #                     p[text-align=center,color=blue]
        #
        # This feature allows to map XHTML tags having different CSS attributes
        # to different ODT styles. Note that special attribute "parent" can be
        # used and does not represent a CSS attribute but the parent tag. For
        # example, if you want to apply a specific style to a "p" tag, except
        # those within "td" tags, specify this key:
        #
        #                           p[parent!=td]
        #
        # For a parent, you can use meta-name "cell" that denotes any tag being
        # "th", "td" or "li". For example: p[parent=cell]
        # 
        # Special attribute "class" can also be used to check whether some CSS
        # class is defined in attribute "class", ie:
        #
        #                         div[class=MyClass]
        #
        # For this "class" attribute, you can use value "None", representing no
        # class at all. For example, the following key matches any "p" tag
        # having any defined CSS class:
        #
        #                           p[class!=None]
        #
        # The method returns a dict which is the internal representation of the
        # styles mapping.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # Every key can be:
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (I)   | the name of a XHTML tag, corresponding to (1), (2), (5) or (6)
        #       | whose potential parameters have been removed;
        # (II)  | the name of a CSS class (=(3));
        # (III) | string 'h*' (=(4)).
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # Every value can be:
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (i)   | a Style instance that was found from the specified ODT style
        #       | display name in stylesMapping, if key is (I) and if only one
        #       | non-parameterized XHTML tag was defined in stylesMapping;
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (ii)  | a list of the form [ (params, Style), (params, Style),...] if
        #       | key is (I) and if one or more parameterized (or not) XHTML
        #       | tags representing the same tag were found in stylesMapping.
        #       | "params", which can be None, is a dict whose pairs are of the
        #       | form (cssAttribute, cssValue);
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (iii) | an integer value (=(b));
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # (iv)  | [x]Properties instance if cases (5) or (6).
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # ! Cheat ! Mapping a td class to its inner p
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        # ODF requires a paragraph to be present within a table cell, which is
        # not the case in XHTML: a "td" tag can directly contain text, without
        # adding an intermediate "p" or "div" tag. Consequently, when Appy must
        # convert an XHTML code like "<td>Text</td>", it adds an inner
        # paragraph, in order to produce valid ODF code, just as if the input
        # code was <td><p>Text</p></td>. Imagine now the input code defines a
        # CSS class on the "td" tag:
        #
        #                 <td class="smallText">Text</td>
        #
        # Because named, cell-level styles can't be defined from the LibreOffice
        # UI, this class is useless for Appy. This is why Appy, in that case,
        # copies the class to the added inner paragraph, as if it was initially
        # coded as:
        #
        #       <td class="smallText"><p class="smallText">Text</p></td>
        #
        # Note that this cheat occurs only if no paragraph is defined within the
        # "td" tag in the input code. For example,
        #
        #                <td class="smalltext"><p>Text</p></td>
        #
        # will be converted as if it was
        #
        #                          <td><p>Text</p></td>
        #
        # Another example:
        #
        #         <td class="smalltext"><p class="para">Text</p></td>
        #
        # will be converted as if it was
        #
        #                     <td><p class="para">Text</p></td>
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

As you may have noticed, for tables and (bulleted and numbered) lists, the mapping can be so complex that it is defined via a specific class.
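
To make the syntax of parameterized keys more concrete, here is a small sketch that splits such a key into its tag name and its list of conditions. It is not the parsing code used by checkStylesMapping; the function parse_key and its output format are assumptions made for this illustration.

```python
def parse_key(key):
    """Split a key like 'p[text-align=center,color=blue]' into
    (tag, [(attribute, operator, value), ...])."""
    if not key.endswith(']') or '[' not in key:
        return key, []  # Non-parameterized key
    tag, raw = key[:-1].split('[', 1)
    params = []
    for part in raw.split(','):
        # Test '!=' first, because it also contains '='
        operator = '!=' if '!=' in part else '='
        name, value = part.split(operator, 1)
        params.append((name.strip(), operator, value.strip()))
    return tag, params

print(parse_key('p[text-align=center,color=blue]'))
print(parse_key('p[parent!=td]'))
print(parse_key('h1'))
```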

Styles mappings for tables – class TableProperties

Within a styles mapping, at key 'table', the specified value must be an instance of class TableProperties. This class, to be found in file appy/pod/styles_manager.py, has the following constructor.

    def __init__(self, pageWidth=None, px2cm=css.px2cm, cellPx2cm=10.0,
              wideAbove=495, minColumnWidth=0.07, columnModifier=None,
              minCellPadding=0.0, cellContentStyle='podCellContent',
              headerContentStyle='podHeaderCellContent', margins=defaultMargins,
              unbreakable=False, unbreakableRows=False, border=None,
              prevails=False):
        # pod computes, in cm, the width of the master page for a pod template.
        # Table widths expressed as percentages will be based on it. But if your
        # XHTML table(s) lie(s) within a section that has a specific page style
        # with another width, specify it here (as a float value, in cm).
        self.pageWidth = pageWidth
        # Table widths expressed as pixels will use a "pixels to cm" ratio as
        # defined in css.px2cm. If this is wrong for you, specify another ratio
        # here. The width in cm will be computed as:
        #
        #                 (table width in pixels) / px2cm
        #
        self.px2cm = px2cm
        # Table cell paddings may use another px / cm ratio. Indeed,
        # cellspacing="1" is converted to 0.02cm with the standard ratio, which
        # is low.
        self.cellPx2cm = cellPx2cm
        # Every table with no specified width will be "wide" (=100% width).
        # If a table width is specified in px and is above the value defined
        # here, it will be forced to 100%.
        self.wideAbove = wideAbove
        # pod ensures that every column will at least get a minimum width
        # (expressed as a percentage: a float value between 0.0 and 1.0). You
        # can change this minimum here.
        self.minColumnWidth = minColumnWidth
        # If a column modifier is specified, any parameter related to table and
        # column widths is ignored: LibreOffice (LO) will itself compute the
        # table and column widths via its algorithm
        # "SetOptimalColumnWidths" if columnModifier is "optimize" or
        # "DistributeColumns"      if columnModifier is "distribute".
        # This requires LO to run in server mode and the
        # appy.pod.renderer.Renderer being launched with parameters
        #                   optimalColumnWidths="OCW_.*"
        # and                distributeColumns="DC_.*"
        self.columnModifier = columnModifier
        # When cell padding is defined (CSS table property "border-spacing" or
        # HTML table attribute "cellspacing"), a minimum value can be defined
        # here, as a float value (cm). If no padding is defined, the default one
        # from pod default style "podCell" is used and is 0.1cm.
        self.minCellPadding = minCellPadding
        # The styles to use for cell and cell header content. The default values
        # correspond to styles defined in styles.xmlt.
        self.cellContentStyle = cellContentStyle
        self.headerContentStyle = headerContentStyle
        # The table margins, as a tuple of 4 float values (cm):
        # top, right, bottom and left margins.
        self.margins = margins
        # May the table be spread on several pages ?
        self.unbreakable = unbreakable
        # May a single table row be spread on several pages ?
        self.unbreakableRows = unbreakableRows
        # Table-wide border properties can be defined, for example:
        #                    '0.018cm solid #000000'
        # If defined, it will override the potential CSS value defined on tables
        self.border = border
        # If CSS attributes and corresponding TableProperties attributes are
        # both encountered, who prevails ? If prevails is True,
        # TableProperties attributes prevail.
        self.prevails = prevails

Static attribute defaultMargins, used as the default value for attribute margins, is the following.

defaultMargins = (0.3, 0.0, 0.3, 0.0)

If you do not define any 'table' key in any styles mapping, a default TableProperties instance will be used, with the default values defined in the constructor above.
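
The interplay between attributes px2cm and wideAbove, as described in the constructor's comments, can be sketched like this. The function table_width is hypothetical and simplified: the real computation lives in POD, and the px2cm ratio is passed explicitly here because its default value (css.px2cm) is not given in this page.

```python
def table_width(width_px, px2cm, wide_above=495):
    """Return '100%' or a width in cm for an XHTML table width in pixels.

    'width_px' is None when the table defines no width.
    """
    # A table with no width, or wider than the threshold, is "wide"
    if width_px is None or width_px > wide_above:
        return '100%'
    # Otherwise: (table width in pixels) / px2cm
    return round(width_px / px2cm, 2)

# 50.0 is an arbitrary ratio chosen for the example
print(table_width(None, 50.0))
print(table_width(600, 50.0))
print(table_width(250, 50.0))
```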

Styles mappings for lists – classes BulletedProperties and NumberedProperties

Within a styles mapping, at keys 'ul' and 'ol', you may define, respectively, instances of classes BulletedProperties and NumberedProperties. These two classes inherit from class ListProperties having the following constructor.

    def __init__(self, levels, formats, delta, firstDelta, space, paraStyle):
        # The number of indentation levels supported
        self.levels = levels
        # The list of format characters for bullets or numbers
        self.formats = formats
        # The number of inches to increment at each level (as a float)
        self.delta = delta
        # The first delta can be different than any other
        self.firstDelta = firstDelta
        # The space, in inches (as a float), between the bullet/number and the
        # text.
        self.space = space
        # A specific style to apply to the inner paragraphs
        self.paraStyle = paraStyle
        # The number of levels can be > or < to the number of formats. In those
        # cases, formats will be applied partially or cyclically to levels.

Class BulletedProperties defines default values for this base constructor, as shown below.

    def __init__(self, levels=4, formats=defaultFormats,
                 delta=0.32, firstDelta=0.08, space=0.16, paraStyle=None):

Static attribute defaultFormats lists the UTF-8 characters used to render bullets at various levels. The default list of such chars is

defaultFormats = ('•', '◦', '▪')

These formats are reused cyclically at deeper levels. For example, at the 4th bullet level, the first character in this list is used again.
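
This cyclic reuse amounts to a modulo on the nesting level, as in the following sketch (bullet_for is a hypothetical helper, not a POD function).

```python
def bullet_for(level, formats=('•', '◦', '▪')):
    """Return the bullet character for a 1-based nesting level."""
    return formats[(level - 1) % len(formats)]

for level in range(1, 5):
    print(level, bullet_for(level))
```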

Similarly, class NumberedProperties also defines default values for all attributes of the base constructor.

    def __init__(self, levels=4, formats=defaultFormats,
                 suffixes=defaultSuffixes, delta=0.32, firstDelta=0.08,
                 space=0.16, paraStyle=None):

For this class, static attribute defaultFormats is simply:

defaultFormats = ('1',)

It means that this numeric format is used at every level. Moreover, compared to bulleted lists, an additional concept is in use here: after each format character, a suffix may be defined. The constructor has an additional parameter named suffixes, whose default value is:

defaultSuffixes = ('.',)

With these default settings, a numbered list with 2 levels will be rendered this way:

1. First level item A
  1.1. Sub 1
  1.2. Sub 2
2. First level, item B

Other numbering schemes can be used. The following table describes the most commonly used schemes, but more exist (see LibreOffice documentation).

Format character | Numbering scheme
1                | Arabic numerals: 1, 2, 3...
A                | Uppercase letters: A, B, C...
a                | Lowercase letters: a, b, c...
I                | Uppercase Roman numerals: I, II, III, IV...
i                | Lowercase Roman numerals: i, ii, iii, iv...
1st              | Special rendering: 1st, 2nd...

For example, if you want to produce this list:

a) first case
b) second case

Define the following NumberedProperties instance:

NumberedProperties(formats=('a',), suffixes=(')',))

If you do not define any 'ul' / 'ol' key in any styles mapping, a default BulletedProperties / NumberedProperties instance will be used, with the default values defined in the constructors above.
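
The two list examples of this section (the default numbering and the "a)" scheme) can be reproduced with the following sketch. It only handles format characters 1, a and A, and the way labels are composed at nested levels is inferred from the default example above; item_label and render_number are hypothetical helpers, and the actual rendering is performed by LibreOffice.

```python
def render_number(number, fmt):
    """Render a 1-based counter with format character '1', 'a' or 'A'."""
    if fmt == '1':
        return str(number)
    if fmt in ('a', 'A'):
        # Only counters 1 to 26 are handled in this sketch
        return chr(ord(fmt) + number - 1)
    raise ValueError('Format %r is not handled by this sketch' % fmt)

def item_label(counters, formats=('1',), suffixes=('.',)):
    """Build an item label from its counters, one per nesting level."""
    parts = []
    for depth, number in enumerate(counters):
        # Formats and suffixes are reused cyclically across levels
        fmt = formats[depth % len(formats)]
        suffix = suffixes[depth % len(suffixes)]
        parts.append(render_number(number, fmt) + suffix)
    return ''.join(parts)

print(item_label((1,)), item_label((1, 1)), item_label((1, 2)), item_label((2,)))
print(item_label((1,), ('a',), (')',)), item_label((2,), ('a',), (')',)))
```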

Do not use the xhtml function in POD expressions

You might be tempted to use the xhtml function in a POD expression. This will not raise an error, but it will not produce the expected result either: the XHTML code will not be interpreted (it will be escaped) and will appear as is in the ODT result.

Reference

The xhtml function as available in the default POD context corresponds to method renderXhtml defined on class Renderer.

    def renderXhtml(self, s, stylesMapping={}, keepWithNext=0,
                    keepImagesRatio=False, imagesMaxWidth='page',
                    imagesMaxHeight='page', html=None, inject=False,
                    unwrap=False):
        '''Method that can be used (under the name "xhtml") into a POD template
           for converting a chunk of XHTML content (s) into a chunk of ODF
           content.'''

        # For this conversion, beyond the global styles mapping defined at the
        # renderer level, a specific stylesMapping can be passed: any key in
        # it overrides its homonym in the global mapping.

        # Parameter keepWithNext is used to prevent the last part of a
        # document to be left alone on top of the last page. Imagine your POD
        # template is an official document ending with some director's scanned
        # signature. Just before this signature, the document body is inserted
        # using renderXhtml. Depending on s's length, in some cases, the
        # scanned signature may end up alone on the last page. When using
        # keepWithNext, POD applies a specific style to s's last paragraph,
        # such that it will get standard LibreOffice property "keep-with-next"
        # and will thus be "pushed" on the last page, together with the scanned
        # signature, even if there is still space available on the previous one.

        # keepWithNext may hold the following values.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        #  0  | (the default) Keep-with-next functionality is disabled.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        #  1  | Keep-with-next is enabled as described hereabove: s's last
        #     | paragraph receives LibreOffice property "keep-with-next". It
        #     | also works if the last paragraph is a bulleted or numbered item.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        #  >1 | If keepWithNext is higher than 1, it represents a number of
        #     | characters to "keep-with-next". Indeed, in some cases, keeping
        #     | only the last paragraph may not be sufficient: a bit more text
        #     | could produce a better result. Based on this number of
        #     | characters, POD will determine how many paragraphs will get
        #     | property "keep-with-next" and will apply it to all of them. For
        #     | example, suppose keepWithNext is defined to 60. The last 3
        #     | paragraphs contain, respectively, 20, 30 and 35 characters. POD
        #     | will apply property "keep-with-next" to the 2 last paragraphs.
        #     | The algorithm is the following: POD walks s's paragraphs
        #     | backwards, starting from the last one, counting and adding the
        #     | number of characters for every walked paragraph. POD continues
        #     | the walk until it reaches (or exceeds) the number of characters
        #     | to keep. When it is the case, it stops. All the paragraphs
        #     | walked so far receive property "keep-with-next".
        #     |
        #     | POD goes even one step further by applying "keep-with-next"
        #     | properties to tables as well. In the previous example, if, when
        #     | counting 60 characters backwards, we end up in the middle of a
        #     | table, POD will apply property "keep-with-next" to the whole
        #     | table. However, with tables spanning more than one page, there
        #     | is a problem: if property "keep-with-next" is applied to such a
        #     | table, LibreOffice will insert a page break at the beginning of
        #     | the table. This can be annoying. While this may be considered a
        #     | bug (maybe because it represents a constraint being particularly
        #     | hard to implement), it is the behaviour implemented in
        #     | LibreOffice, at least >= 3 and <= 6.4. Consequently, at the POD
        #     | level, an (expensive) workaround has been found: when
        #     | keepWithNext characters lead us inside a table, POD will split
        #     | it into 2 tables: a first table containing all the rows that
        #     | were not walked by the keep-with-next algorithm, and a second
        #     | containing the remaining, walked rows. On this second table
        #     | only, property "keep-with-next" is applied. Because splitting
        #     | tables that way requires LibreOffice running in server mode, as
        #     | soon as you specify keepWithNext > 1, POD will assume
        #     | LibreOffice runs in server mode.
        #- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

        # If keepImagesRatio is True, while importing images from "img" tags
        # within s, their width/height ratio will be kept. Note that in most
        # cases, it is useless to specify it, because POD computes the image's
        # real width and height.

        # Parameters imagesMaxWidth and imagesMaxHeight correspond,
        # respectively, to parameters maxWidth and maxHeight from POD
        # function document. Being "page" by default, it prevents the images
        # defined in s to exceed the ODT result's page dimensions.
        
        # If html is not None, it will override renderer's homonym parameter.
        # Set html to True if s is not valid XHTML but standard HTML, where
        # invalid-XML tags like <br> may appear.

        # If, in s, you have inserted special "a" tags allowing to inject
        # specific content, like parts of external Python source code files or
        # PX, set parameter inject to True. In that case, every time such a
        # link is encountered, it will be replaced with the appropriate
        # content. In the case of an external Python source code file, POD will
        # retrieve it via a HTTP request and incorporate the part (class,
        # method..) specified in the link's "title" attribute, into the
        # resulting chunk of ODF code. If a PX is specified, it will be called
        # with the current PX context, and the result will be injected where the
        # link has been defined. For more information about this functionality,
        # check Appy's Rich field in class appy.model.fields.rich.Rich.

        # If you set unwrap to True, if the resulting ODF chunk is
        # completely included in a single paragraph (<text:p>), this paragraph's
        # start and end tags will be removed from the result (keeping only its
        # content). Avoid using this. unwrap should only be used internally by
        # POD itself.
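
The backward walk described for keepWithNext can be sketched as follows. This is a simplified model that only counts paragraphs from their lengths: it ignores tables and list items, and paragraphs_to_keep is a hypothetical helper, not a POD function.

```python
def paragraphs_to_keep(lengths, keep_with_next):
    """Return how many trailing paragraphs get property "keep-with-next".

    'lengths' lists the number of characters of each paragraph, in
    document order.
    """
    if keep_with_next <= 0 or not lengths:
        return 0  # Functionality disabled, or nothing to walk
    if keep_with_next == 1:
        return 1  # Only the last paragraph
    walked = total = 0
    # Walk the paragraphs backwards, starting from the last one
    for length in reversed(lengths):
        walked += 1
        total += length
        # Stop as soon as the number of characters to keep is reached
        if total >= keep_with_next:
            break
    return walked

# The example from the comments: the last 3 paragraphs contain 20, 30
# and 35 characters, and keepWithNext is 60.
print(paragraphs_to_keep([20, 30, 35], 60))
```

With these values, the walk covers 35 characters, then 35+30=65, which exceeds 60: the last 2 paragraphs are selected, as stated in the comments.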