Panflute API

Base elements

class Element(*args, **kwargs)[source]

Base class of all Pandoc elements

parent

Element that contains the current one.

Note: the .parent and related attributes are not implemented for metadata elements.

Return type:

Element | None

location

None unless the element is in a non–standard location of its parent, such as the .caption or .header attributes of a table.

In those cases, .location will be equal to a string.

Return type:

str | None

walk(action, doc=None, stop_if=None)[source]

Walk through the element and all its children (sub-elements), applying the provided function action.

A trivial example would be:

from panflute import *

def no_action(elem, doc):
    pass

doc = Doc(Para(Str('a')))
altered = doc.walk(no_action)
Parameters:
  • action (function) – function that takes (element, doc) as arguments.

  • doc (Doc) – root document; used to access metadata, the output format (in .format), other elements, and other variables. Only use this variable if for some reason you don’t want to use the current document of an element.

  • stop_if (function, optional) – function that takes (element) as argument.

Return type:

Element | [] | None

content

Sequence of Element objects (usually either Block or Inline) that are “children” of the current element.

Only available for elements that accept *args.

Note: some elements have children in attributes other than content (such as Table that has children in the header and caption attributes).

index

Position of the element inside its parent.

ancestor(n)[source]

Return the n-th ancestor. Note that elem.ancestor(1) == elem.parent

Return type:

Element | None

offset(n)[source]

Return a sibling element offset by n

Return type:

Element | None

prev

Return the previous sibling. Note that elem.offset(-1) == elem.prev

Return type:

Element | None

next

Return the next sibling. Note that elem.offset(1) == elem.next

Return type:

Element | None

replace_keyword(keyword, replacement[, count])

Walk through the element and its children and look for Str() objects that contain exactly the keyword. Then, replace them.

Usually applied to an entire document (a Doc element)

Note: If the replacement is a block, it cannot be put in place of a Str element. As a solution, the closest ancestor (e.g. the parent) will be replaced instead, but only if possible (if the parent only has one child).

Example:

>>> from panflute import *
>>> p1 = Para(Str('Spam'), Space, Emph(Str('and'), Space, Str('eggs')))
>>> p2 = Para(Str('eggs'))
>>> p3 = Plain(Emph(Str('eggs')))
>>> doc = Doc(p1, p2, p3)
>>> doc.content
ListContainer(Para(Str(Spam) Space Emph(Str(and) Space Str(eggs))) Para(Str(eggs)) Plain(Emph(Str(eggs))))
>>> doc.replace_keyword('eggs', Str('ham'))
>>> doc.content
ListContainer(Para(Str(Spam) Space Emph(Str(and) Space Str(ham))) Para(Str(ham)) Plain(Emph(Str(ham))))
>>> doc.replace_keyword(keyword='ham', replacement=Para(Str('spam')))
>>> doc.content
ListContainer(Para(Str(Spam) Space Emph(Str(and) Space Str(ham))) Para(Str(spam)) Para(Str(spam)))
Parameters:
  • keyword (str) – string that will be searched (cannot have spaces!)

  • replacement (Element) – element that will be put in place of the Str element that contains the keyword.

  • count (int) – number of occurrences that will be replaced. If count is not given or is set to zero, all occurrences will be replaced.

container

Rarely used attribute that returns the ListContainer or DictContainer that contains the element (or returns None if no such container exists)

Return type:

ListContainer | DictContainer | None


The following elements inherit from Element:

Base classes and methods of all Pandoc elements

class Block(*args, **kwargs)[source]

Base class of all block elements

class Inline(*args, **kwargs)[source]

Base class of all inline elements

class MetaValue(*args, **kwargs)[source]

Base class of all metadata elements

Low-level classes

(Skip unless you want to understand the internals)

These containers keep track of the identity of the parent object, and the attribute of the parent object that they correspond to.

class DictContainer(*args, oktypes=<class 'object'>, parent=None, **kwargs)[source]

Wrapper around a dict, to track the elements’ parents. This class shouldn’t be instantiated directly by users, but by the elements that contain it.

Parameters:
  • args – elements contained in the dict–like object

  • oktypes (type | tuple) – type or tuple of types that are allowed as items

  • parent (Element) – the parent element

class ListContainer(*args, oktypes=<class 'object'>, parent=None)[source]

Wrapper around a list, to track the elements’ parents. This class shouldn’t be instantiated directly by users, but by the elements that contain it.

Parameters:
  • args – elements contained in the list–like object

  • oktypes (type | tuple) – type or tuple of types that are allowed as items

  • parent (Element) – the parent element

  • container (str | None) – None, unless the element is not part of its .parent.content (this is the case for table headers for instance, which are not retrieved with table.content but with table.header)

insert(i, v)[source]

S.insert(index, value) – insert value before index

Note

To keep track of every element’s parent we do some class magic. Namely, Element.content is not a list attribute but a property accessed via getter and setters. Why?

>>> e = Para(Str('Hello'), Space, Str('World!'))

This creates a Para element, which stores the three inline elements (Str, Space and Str) inside a .content attribute. If we add .parent attributes to these elements, there are two ways they can be made obsolete:

  1. By replacing specific elements: e.content[0] = Str('Bye')

  2. By replacing the entire list: e.content = other_items

We deal with the first problem by wrapping the list of items in a ListContainer class of type collections.abc.MutableSequence. This class updates the .parent attribute of elements returned through __getitem__ calls.

For the second problem, we use setters and getters which update the .parent attribute.
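
The two mechanisms can be illustrated with a stripped-down model in plain Python. This is only a sketch of the idea, not panflute's actual code: the TrackedList and Node classes are hypothetical stand-ins for ListContainer and Element, and only integer indexes are handled.

```python
from collections.abc import MutableSequence


class TrackedList(MutableSequence):
    """Hypothetical, simplified stand-in for ListContainer."""

    def __init__(self, items, parent):
        self.parent = parent
        self._items = list(items)

    def __getitem__(self, i):
        # Integer indexes only, to keep the sketch short
        item = self._items[i]
        item.parent = self.parent  # refreshed on every access
        return item

    def __setitem__(self, i, value):
        self._items[i] = value

    def __delitem__(self, i):
        del self._items[i]

    def __len__(self):
        return len(self._items)

    def insert(self, i, value):
        self._items.insert(i, value)


class Node:
    """Hypothetical element with a .content property."""

    def __init__(self, name, *children):
        self.name = name
        self.parent = None
        self.content = children  # goes through the setter below

    @property
    def content(self):
        return self._content

    @content.setter
    def content(self, items):
        # Replacing the whole list rebuilds the wrapper
        self._content = TrackedList(items, parent=self)


e = Node('para', Node('Hello'), Node('World!'))

e.content[0] = Node('Bye')      # first problem: replace one item
replaced = e.content[0]

e.content = [Node('other')]     # second problem: replace the entire list
fresh = e.content[0]
```

In both cases the element read back from e.content reports e as its parent, even though the new nodes were created without one.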

Standard elements

These are the standard Pandoc elements, as described here. Consult the repo for the latest updates.

Note

The attributes of every element object will be i) the parameters listed below, plus ii) the attributes of Element. Example:

>>> h = Str(text='something')
>>> h.text
'something'
>>> hasattr(h, 'parent')
True

Exception: the .content attribute only exists in elements that take *args (so we can do Para().content but not Str().content).

Classes corresponding to Pandoc elements

Notation: “ica” is shorthand for “identifier, classes, attributes”

class Doc(*args, **kwargs)[source]

Pandoc document container.

Besides the document, it includes the frontpage metadata and the desired output format. Filter functions can also add properties to it as means of global variables that can later be read by different calls.

Parameters:
  • args (Block sequence) – top-level blocks contained in the document

  • metadata (dict) – the frontpage metadata

  • format (str) – output format, such as ‘markdown’, ‘latex’ and ‘html’

  • api_version (tuple) – A tuple of three ints of the form (1, 18, 0)

Returns:

Document with base class Element

Base:

Element

Example:
>>> meta = {'author':'John Doe'}
>>> content = [Header(Str('Title')), Para(Str('Hello!'))]
>>> doc = Doc(*content, metadata=meta, format='pdf')
>>> doc.figure_count = 0 #  You can add attributes freely
get_metadata([key, default, builtin])

Retrieve metadata with nested keys separated by dots.

This is useful to avoid repeatedly checking if a dict exists, as the frontmatter might not have the keys that we expect.

With builtin=True (the default), it will convert the results to built-in Python types instead of MetaValue elements. For example, instead of returning a MetaBool it will return True or False.

Parameters:
  • key (str) – string with the keys separated by a dot (key1.key2). Default is an empty string (which returns the entire metadata dict)

  • default – return value in case the key is not found (default is None)

  • builtin – If True, return built-in Python types (default is True)

Example:
>>> doc.metadata['format']['show-frame'] = True
>>> # ...
>>> # afterwards:
>>> show_frame = doc.get_metadata('format.show-frame', False)
>>> stata_path = doc.get_metadata('media.path.figures', '.')
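
The dotted-key lookup can be pictured with plain dictionaries. The helper below is a hypothetical sketch of the idea only (the name lookup is made up, and it works on ordinary dicts rather than on MetaValue elements); it is not panflute's implementation.

```python
def lookup(meta, key='', default=None):
    """Follow dot-separated keys through nested dicts (hypothetical helper)."""
    node = meta
    if key:
        for part in key.split('.'):
            if isinstance(node, dict) and part in node:
                node = node[part]
            else:
                return default  # any missing level yields the default
    return node  # an empty key returns the whole mapping


meta = {'format': {'show-frame': True}}

show_frame = lookup(meta, 'format.show-frame', False)
figures_path = lookup(meta, 'media.path.figures', '.')
everything = lookup(meta)
```

The second call falls back to '.' because the 'media' key does not exist, without any explicit check for intermediate dicts.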

class BlockQuote(*args, **kwargs)[source]

Block quote

Parameters:

args (Block) – sequence of blocks

Base:

Block

class BulletList(*args, **kwargs)[source]

Bullet list (unordered list)

Parameters:

args (ListItem | list) – List item

Base:

Block

class Citation(*args, **kwargs)[source]

A single citation to a single work

Parameters:
  • id (str) – citation key (e.g. the BibTeX keyword)

  • mode (str) – how the citation will appear (‘NormalCitation’ for the default style, ‘AuthorInText’ to exclude parentheses, ‘SuppressAuthor’ to exclude the author’s name)

  • prefix ([Inline]) – Text before the citation reference

  • suffix ([Inline]) – Text after the citation reference

  • note_num (int) – (Not sure…)

  • hash (int) – (Not sure…)

Base:

Element

class Cite(*args, **kwargs)[source]

Cite: set of citations with related text

Parameters:
  • args (Inline) – contents of the cite (the raw text)

  • citations ([Citation]) – sequence of citations

Base:

Inline

class Code(*args, **kwargs)[source]

Inline code (literal)

Parameters:
  • text (str) – literal text (preformatted text, code, etc.)

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Inline

class CodeBlock(*args, **kwargs)[source]

Code block (literal text) with optional attributes

Parameters:
  • text (str) – literal text (preformatted text, code, etc.)

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

class Definition(*args, **kwargs)[source]

The definition (description); used in a definition list. It can include code and all other block elements.

Parameters:

args (Block) – elements

Base:

Element

class DefinitionItem(*args, **kwargs)[source]

Contains pairs of Term and Definitions (plural!)

Each list item represents a pair of i) a term (a list of inlines) and ii) one or more definitions

Parameters:
  • term ([Inline]) – Term of the definition (an inline holder)

  • definitions – List of definitions or descriptions (each a block holder)

Base:

Element

class DefinitionList(*args, **kwargs)[source]

Definition list: list of definition items; basically (term, definition) tuples.

Each list item represents a pair of i) a term (a list of inlines) and ii) one or more definitions (each a list of blocks)

Example:

>>> term1 = [Str('Spam')]
>>> def1 = Definition(Para(Str('...emails')))
>>> def2 = Definition(Para(Str('...meat')))
>>> spam = DefinitionItem(term1, [def1, def2])
>>>
>>> term2 = [Str('Spanish'), Space, Str('Inquisition')]
>>> def3 = Definition(Para(Str('church'), Space, Str('court')))
>>> inquisition = DefinitionItem(term=term2, definitions=[def3])
>>> definition_list = DefinitionList(spam, inquisition)
Parameters:

args (DefinitionItem) – Definition items (a term with definitions)

Base:

Block

class Div(*args, **kwargs)[source]

Generic block container with attributes

Parameters:
  • args (Block) – contents of the div

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

class Emph(*args, **kwargs)[source]

Emphasized text

Parameters:

args (Inline) – elements that will be emphasized

Base:

Inline

class Figure(*args, **kwargs)[source]

Standalone figure, with attributes, caption, and arbitrary block content

Parameters:
  • args (Block) – contents of the figure block

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

Example:
>>> image = Image(Str("Description"), title='The Title',
...               url='example.png', attributes={'height':'256px'})
>>> caption = Caption(Plain(Str('The'), Space, Str('Caption')))
>>> figure = Figure(Plain(image), caption=caption, identifier='figure1')
class Header(*args, **kwargs)[source]
Parameters:
  • args (Inline) – contents of the header

  • level (int) – level of the header (1 is the largest and 6 the smallest)

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

Example:
>>> title = [Str('Monty'), Space, Str('Python')]
>>> header = Header(*title, level=2, identifier='toc')
>>> header.level += 1
class HorizontalRule(*args, **kwargs)[source]

Horizontal rule

Base:

Block

class Image(*args, **kwargs)[source]
Parameters:
  • args (Inline) – text with the image description

  • url (str) – URL or path of the image

  • title (str) – Alt. title

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Inline

class LineBlock(*args, **kwargs)[source]

Line block (sequence of lines)

Parameters:

args (LineItem | list) – Line item

Base:

Block

class LineBreak(*args, **kwargs)[source]

Hard line break

Base:

Inline

class LineItem(*args, **kwargs)[source]

Line item (contained in line blocks)

Parameters:

args (Inline) – Line item

Base:

Element

class Link(*args, **kwargs)[source]

Hyperlink

Parameters:
  • args (Inline) – text with the link description

  • url (str) – URL or path of the link

  • title (str) – Alt. title

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Inline

class ListItem(*args, **kwargs)[source]

List item (contained in bullet lists and ordered lists)

Parameters:

args (Block) – List item

Base:

Element

class Math(*args, **kwargs)[source]

TeX math (literal)

Parameters:
  • text (str) – a string of raw text representing TeX math

  • format (str) – How the math will be typeset (‘DisplayMath’ or ‘InlineMath’)

Base:

Inline

class MetaBlocks(*args, **kwargs)[source]

MetaBlocks: list of arbitrary blocks within the metadata

Parameters:

args (Block) – sequence of block elements

Base:

MetaValue

class MetaBool(*args, **kwargs)[source]

Container for True/False metadata values

Parameters:

boolean (bool) – True/False value

Base:

MetaValue

class MetaInlines(*args, **kwargs)[source]

MetaInlines: list of arbitrary inlines within the metadata

Parameters:

args (Inline) – sequence of inline elements

Base:

MetaValue

class MetaList(*args, **kwargs)[source]

Metadata list container

Parameters:

args (MetaValue) – contents of a metadata list

Base:

MetaValue

class MetaMap(*args, **kwargs)[source]

Metadata container for ordered dicts

Parameters:
  • args (MetaValue) – (key, value) tuples

  • kwargs (MetaValue) – named arguments

Base:

MetaValue

property content

Map of MetaValue objects.

class MetaString(*args, **kwargs)[source]

Text (a string)

Parameters:

text (str) – a string of unformatted text

Base:

MetaValue

class Note(*args, **kwargs)[source]

Footnote or endnote

Parameters:

args (Block) – elements that are part of the note

Base:

Inline

class Null(*args, **kwargs)[source]

Nothing

Base:

Block

class OrderedList(*args, **kwargs)[source]

Ordered list (attributes and a list of items, each a list of blocks)

Parameters:
  • args (ListItem | list) – List item

  • start (int) – Starting value of the list

  • style (str) – Numbering style of the list (‘DefaultStyle’, ‘Example’, ‘Decimal’, ‘LowerRoman’, ‘UpperRoman’, ‘LowerAlpha’, ‘UpperAlpha’)

  • delimiter (str) – List number delimiter (‘DefaultDelim’, ‘Period’, ‘OneParen’, ‘TwoParens’)

Base:

Block

class Para(*args, **kwargs)[source]

Paragraph

Parameters:

args (Inline) – contents of the paragraph

Base:

Block

Example:
>>> content = [Str('Some'), Space, Emph(Str('words.'))]
>>> para1 = Para(*content)
>>> para2 = Para(Str('More'), Space, Str('words.'))
class Plain(*args, **kwargs)[source]

Plain text, not a paragraph

Parameters:

args (Inline) – contents of the plain block of text

Base:

Block

class Quoted(*args, **kwargs)[source]

Quoted text

Parameters:
  • args (Inline) – contents of the quote

  • quote_type (str) – either ‘SingleQuote’ or ‘DoubleQuote’

Base:

Inline

class RawBlock(*args, **kwargs)[source]

Raw block

Parameters:
  • text (str) – a string of raw text with another underlying format

  • format (str) – Format of the raw text (‘html’, ‘tex’, ‘latex’, ‘context’, etc.)

Base:

Block

class RawInline(*args, **kwargs)[source]

Raw inline text

Parameters:
  • text (str) – a string of raw text with another underlying format

  • format (str) – Format of the raw text (‘html’, ‘tex’, ‘latex’, ‘context’, etc.)

Base:

Inline

class SmallCaps(*args, **kwargs)[source]

Small caps text (list of inlines)

Parameters:

args (Inline) – elements that will be set with small caps

Base:

Inline

class SoftBreak(*args, **kwargs)[source]

Soft line break

Base:

Inline

class Space(*args, **kwargs)[source]

Inter-word space

Base:

Inline

class Span(*args, **kwargs)[source]

Generic inline container with attributes

Parameters:
  • args (Inline) – contents of the span

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Inline

class Str(*args, **kwargs)[source]

Text (a string)

Parameters:

text (str) – a string of unformatted text

Base:

Inline

class Strikeout(*args, **kwargs)[source]

Strikeout text

Parameters:

args (Inline) – elements that will be struck out

Base:

Inline

class Strong(*args, **kwargs)[source]

Strongly emphasized text

Parameters:

args (Inline) – elements that will be emphasized

Base:

Inline

class Subscript(*args, **kwargs)[source]

Subscripted text (list of inlines)

Parameters:

args (Inline) – elements that will be set subscript

Base:

Inline

class Superscript(*args, **kwargs)[source]

Superscripted text (list of inlines)

Parameters:

args (Inline) – elements that will be set superscript

Base:

Inline

class Underline(*args, **kwargs)[source]

Underlined text

Parameters:

args (Inline) – elements that will be underlined

Base:

Inline

Table-specific elements

Classes corresponding to Pandoc Table elements

class Caption(*args, **kwargs)[source]

Table caption with optional short caption

Parameters:
  • args (Block) – caption

  • short_caption (list of Inline) – Short caption

  • identifier – element identifier (usually unique)

Base:

Element

class Table(*args, **kwargs)[source]

Table, composed of a table head, one or more table bodies, and a table foot. You can also specify captions, short captions, column alignments, and column widths.

Example:

>>> x = [Para(Str('Something')), Para(Space, Str('else'))]
>>> c1 = TableCell(*x)
>>> c2 = TableCell(Header(Str('Title')))
>>> row = TableRow(c1, c2)
>>>
>>> body = TableBody(row)
>>> head = TableHead(row)
>>> caption = Caption(Para(Str('Title')))
>>> table = Table(body, head=head, caption=caption)

Parameters:
  • args (TableBody) – Table bodies

  • head (TableHead) – Table head

  • foot (TableFoot) – Table foot

  • caption (Caption) – The caption of the table (with optional short caption)

  • colspec (list of (Alignment, ColWidth)) – list of (alignment, colwidth) tuples; one for each column

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

  • alignment ([str]) – List of row alignments (either ‘AlignLeft’, ‘AlignRight’, ‘AlignCenter’ or ‘AlignDefault’).

  • colwidth ([float | “ColWidthDefault”]) – Fractional column widths

Base:

Block

class TableBody(*args, **kwargs)[source]

Body of a table, containing a list of intermediate head rows, a list of table body rows, row_head_columns, plus optional attributes

Parameters:
  • args (TableRow) – body rows

  • head (list of TableRow) – Intermediate head (list of table rows)

  • row_head_columns (int) – number of columns on the left that are considered column headers (default: 0)

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

class TableCell(*args, **kwargs)[source]

Table Cell

Parameters:
  • args (Block) – elements

  • alignment (str) – row alignment (either ‘AlignLeft’, ‘AlignRight’, ‘AlignCenter’ or ‘AlignDefault’).

  • rowspan (int) – number of rows occupied by a cell (height of a cell)

  • colspan (int) – number of columns occupied by a cell (width of a cell)

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Element

class TableFoot(*args, **kwargs)[source]

The foot of a table, containing one or more foot rows, plus optional attributes

Parameters:
  • args (TableRow) – foot rows

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

class TableHead(*args, **kwargs)[source]

The head of a table, containing one or more head rows, plus optional attributes

Parameters:
  • args (TableRow) – head rows

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Block

class TableRow(*args, **kwargs)[source]

Table Row

Parameters:
  • args (TableCell) – cells

  • identifier (str) – element identifier (usually unique)

  • classes (list of str) – class names of the element

  • attributes (dict) – additional attributes

Base:

Element

Standard functions

run_filters(actions[, prepare, finalize, ...])

Receive a Pandoc document from the input stream (default is stdin), walk through it applying the functions in actions to each element, and write it back to the output stream (default is stdout).

run_filter(action, *args, **kwargs)

Wrapper for run_filters()

toJSONFilter(*args, **kwargs)

Wrapper for run_filter(), which calls run_filters()

toJSONFilters(*args, **kwargs)

Wrapper for run_filters()

load([input_stream])

Load JSON-encoded document and return a Doc element.

dump(doc[, output_stream])

Dump a Doc object into a JSON-encoded text string.

See also

The walk() function has been replaced by the Element.walk() method of each element. To walk through the entire document, do altered = doc.walk(action).

I/O related functions

dump(doc, output_stream=None)[source]

Dump a Doc object into a JSON-encoded text string.

The output will be sent to sys.stdout unless an alternative text stream is given.

To dump to sys.stdout just do:

>>> import panflute as pf
>>> doc = pf.Doc(pf.Para(pf.Str('a')))  # Create sample document
>>> pf.dump(doc)

To dump to file:

>>> with open('some-document.json', 'w', encoding='utf-8') as f:
...     pf.dump(doc, f)

To dump to a string:

>>> import io
>>> with io.StringIO() as f:
...     pf.dump(doc, f)
...     contents = f.getvalue()
Parameters:
  • doc (Doc) – document, usually created with load()

  • output_stream – text stream used as output (default is sys.stdout)

load(input_stream=None)[source]

Load JSON-encoded document and return a Doc element.

The JSON input will be read from sys.stdin unless an alternative text stream is given (a file handle).

To load from a file, you can do:

>>> import panflute as pf
>>> with open('some-document.json', encoding='utf-8') as f:
...     doc = pf.load(f)

To load from a string, you can do:

>>> import io
>>> raw = '[{"unMeta":{}}, [{"t":"Para","c":[{"t":"Str","c":"Hello!"}]}]]'
>>> f = io.StringIO(raw)
>>> doc = pf.load(f)
Parameters:

input_stream – text stream used as input (default is sys.stdin)

Return type:

Doc

run_filter(action, *args, **kwargs)[source]

Wrapper for run_filters()

Receive a Pandoc document from stdin, apply the action function to each element, and write it back to stdout.

See run_filters()

run_filters(actions, prepare=None, finalize=None, input_stream=None, output_stream=None, doc=None, stop_if=None, **kwargs)[source]

Receive a Pandoc document from the input stream (default is stdin), walk through it applying the functions in actions to each element, and write it back to the output stream (default is stdout).

Notes:

  • It receives and writes the Pandoc documents as JSON–encoded strings; this is done through the load() and dump() functions.

  • It walks through the document once for every function in actions, so the actions are applied sequentially.

  • By default, it will read from stdin and write to stdout, but these can be modified.

  • It can also apply functions to the entire document at the beginning and end; this allows for global operations on the document.

  • If doc is a Doc instead of None, run_filters will return the document instead of writing it to the output stream.

Parameters:
  • actions ([function]) – sequence of functions; each function takes (element, doc) as argument, so a valid header would be def action(elem, doc):

  • prepare (function) – function executed at the beginning; right after the document is received and parsed

  • finalize (function) – function executed at the end; right before the document is converted back to JSON and written to stdout.

  • input_stream – text stream used as input (default is sys.stdin)

  • output_stream – text stream used as output (default is sys.stdout)

  • doc (None | Doc) – None unless running panflute as a filter, in which case this will be a Doc element

  • stop_if (function, optional) – function that takes (element) as argument.

  • **kwargs – keyword arguments will be passed through to the action functions (so they can actually receive more than just two arguments, element and doc)

toJSONFilter(*args, **kwargs)[source]

Wrapper for run_filter(), which calls run_filters()

toJSONFilter(action, prepare=None, finalize=None, input_stream=None, output_stream=None, **kwargs)

Receive a Pandoc document from stdin, apply the action function to each element, and write it back to stdout.

See also toJSONFilters()

toJSONFilters(*args, **kwargs)[source]

Wrapper for run_filters()

Note

The action functions have a few rules:

  • They are called as action(element, doc) so they must accept at least two arguments.

  • Additional arguments can be passed through the **kwargs of toJSONFilter and toJSONFilters.

  • They can return either an element, a list, or None.

  • If they return None, the document will keep the same element as before (although it might have been modified).

  • If they return another element, it will take the place of the received element.

  • If they return [] (an empty list), the received element will be deleted from the document. Note that you can delete a row from a table or an item from a list, but you cannot delete the caption from a table (you can make it empty though).

  • If the received element is a block or inline element, they may return a list of elements of the same base class, which will take the place of the received element.

“Batteries included” functions

These are functions commonly used when writing more complex filters

stringify(element[, newlines])

Return the raw text version of an element (and its children elements).

convert_text(text[, input_format, ...])

Convert formatted text (usually markdown) by calling Pandoc internally

yaml_filter(element, doc[, tag, function, ...])

Convenience function for parsing code blocks with YAML options

debug(*args, **kwargs)

Same as print, but prints to stderr (which is not intercepted by Pandoc).

shell(args[, wait, msg])

Execute the external command and get its exitcode, stdout and stderr.

See also Doc.get_metadata and Element.replace_keyword

Useful (but not essential) functions for writing panflute filters

class PandocVersion[source]

Get runtime Pandoc version

Use PandocVersion().version to compare versions.

convert_text(text, input_format='markdown', output_format='panflute', standalone=False, extra_args=None, pandoc_path=None)[source]

Convert formatted text (usually markdown) by calling Pandoc internally

The default output format (‘panflute’) will return a tree of Pandoc elements. When combined with ‘standalone=True’, the tree root will be a ‘Doc’ element.

Example:

>>> from panflute import *
>>> md = 'Some *markdown* **text** ~xyz~'
>>> tex = r'Some $x^y$ or $x_n = \sqrt{a + b}$ \textit{a}'
>>> convert_text(md)
[Para(Str(Some) Space Emph(Str(markdown)) Space Strong(Str(text)) Space Subscript(Str(xyz)))]
>>> convert_text(tex)
[Para(Str(Some) Space Math(x^y; format='InlineMath') Space Str(or) Space Math(x_n = \sqrt{a + b}; format='InlineMath') Space RawInline(\textit{a}; format='tex'))]
Parameters:
  • text (str | Element | list of Element) – text that will be converted

  • input_format – format of the text (default ‘markdown’). Any Pandoc input format is valid, plus ‘panflute’ (a tree of Pandoc elements)

  • output_format – format of the output (default is ‘panflute’ which creates the tree of Pandoc elements). Non-binary Pandoc formats are allowed (e.g. markdown and latex are allowed, but docx and pdf are not).

  • standalone (bool) – whether the results will be a standalone document or not.

  • extra_args (list) – extra arguments passed to Pandoc

  • pandoc_path (str) – If specified, use the Pandoc at this path. If None, default to that from PATH.

Return type:

list | Doc | str

Note: for a more general solution, see pyandoc by Kenneth Reitz.

get_option(options=None, local_tag=None, doc=None, doc_tag=None, default=None, error_on_none=True)[source]

Fetch an option variable from either a local (element) level option/attribute tag, a document level metadata tag, or a default.

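Parameters:
  • options (dict) – local (element-level) options, such as the attributes of an element

  • local_tag (str) – key of the option within options

  • doc (Doc) – document whose metadata is searched

  • doc_tag (str) – key of the option within the document metadata (nested keys are separated by dots, as in 'style-div.name')

  • default (any) – value used when neither the local nor the document level provides one

  • error_on_none (bool) – if True, raise a ValueError when the final value is None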

The order of preference is local > document > default, although if a local or document tag returns None, then the next level down is used. Also, if error_on_none=True and the final value is None, then a ValueError will be raised.

In this manner you can set global variables, which can be optionally overridden at a local level. For example, the two files below show how to apply different styles to docx text:

main.md:

---
style-div:
    name: MyStyle
---

:::style
some text
:::

::: {.style name=MyOtherStyle}
some more text
:::

style_filter.py:

import panflute as pf

def action(elem, doc):
    # Only Div elements carry the style option
    if isinstance(elem, pf.Div):
        # A local "name" attribute wins over the "style-div.name" metadata key
        style = pf.get_option(elem.attributes, "name", doc, "style-div.name")
        elem.attributes["custom-style"] = style

def main(doc=None):
    return pf.run_filter(action, doc=doc)

if __name__ == "__main__":
    main()
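
The local > document > default order that get_option follows can be sketched in plain Python. The helper below is hypothetical (pick_option is not part of panflute) and takes the already-retrieved values instead of tags; it only illustrates the fallback logic described above.

```python
def pick_option(local_value=None, doc_value=None, default=None,
                error_on_none=True):
    """Hypothetical helper mirroring the local > document > default order."""
    for candidate in (local_value, doc_value, default):
        if candidate is not None:
            return candidate  # first level that actually provides a value
    if error_on_none:
        raise ValueError('no option value found at any level')
    return None


# Second Div of main.md: the local attribute overrides the metadata
local_wins = pick_option('MyOtherStyle', 'MyStyle')

# First Div of main.md: no local attribute, so the metadata is used
doc_fallback = pick_option(None, 'MyStyle')
```

A None at one level simply defers to the next one, which is what lets document metadata act as a global setting with per-element overrides.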
run_pandoc(text='', args=None, pandoc_path=None)[source]

Low level function that calls Pandoc with (optionally) some input text and/or arguments

Parameters:

pandoc_path (str) – If specified, use the Pandoc at this path. If None, default to that from PATH.

shell(args, wait=True, msg=None)[source]

Execute the external command and get its exitcode, stdout and stderr.

stringify(element, newlines=True)[source]

Return the raw text version of an element (and its children elements).

Example:

>>> from panflute import *
>>> e1 = Emph(Str('Hello'), Space, Str('world!'))
>>> e2 = Strong(Str('Bye!'))
>>> para = Para(e1, Space, e2)
>>> stringify(para)
'Hello world! Bye!'

Parameters:

newlines (bool) – add a new line after a paragraph (default True)

Return type:

str

yaml_filter(element, doc, tag=None, function=None, tags=None, strict_yaml=False)[source]

Convenience function for parsing code blocks with YAML options

This function is useful to create a filter that applies to code blocks that have specific classes.

It is used as an argument of run_filter, with two additional options: tag and function.

Using this is equivalent to having filter functions that:

  1. Check if the element is a code block

  2. Check if the element belongs to a specific class

  3. Split the YAML options (at the beginning of the block) from the data that follows, by looking for ... or --- strings on a separate line

  4. Parse the YAML

  5. Use the YAML options and (optionally) the data that follows the YAML to return a new or modified element

Instead, you just need to:

  1. Call run_filter with yaml_filter as the action function, and with the additional arguments tag and function

  2. Construct a fenced_action function that takes four arguments: (options, data, element, doc). Note that options is a dict and data is a raw string. Notice that this is similar to the action functions of standard filters, but with options and data as the new ones.

Note: if you want to apply multiple functions to separate classes, you can use the tags argument, which receives a dict of tag: function pairs.

Note: use the strict_yaml=True option in order to allow for more verbose but flexible YAML metadata: more than one YAML block is allowed, but they all must start with --- (even at the beginning) and end with --- or .... Also, YAML is not the default content when no delimiters are set.

Example:

"""
Replace code blocks of class 'foo' with `count` horizontal rules
"""

import panflute as pf

def fenced_action(options, data, element, doc):
    count = options.get('count', 1)
    div = pf.Div(attributes={'count': str(count)})
    div.content.extend([pf.HorizontalRule] * count)
    return div

if __name__ == '__main__':
    pf.run_filter(pf.yaml_filter, tag='foo', function=fenced_action)
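
For reference, the step that yaml_filter takes care of when it separates the YAML options from the data (step 3 of the manual approach listed earlier) might look roughly like this. It is a standard-library sketch under stated assumptions, not panflute's code: split_options is a hypothetical name, the delimiter is taken to be exactly ... or --- on its own line, and the YAML part is returned unparsed.

```python
import re

# A line made only of "..." or "---" separates the options from the data
DELIMITER = re.compile(r'^(?:\.{3}|-{3})$', flags=re.MULTILINE)


def split_options(text):
    """Split a code block into (raw YAML, data); hypothetical helper."""
    parts = DELIMITER.split(text, maxsplit=1)
    raw_yaml = parts[0]
    # Without a delimiter the whole block is treated as options
    data = parts[1].lstrip('\n') if len(parts) > 1 else ''
    return raw_yaml, data


raw_yaml, data = split_options('count: 3\n...\nsome data\n')
only_yaml, no_data = split_options('count: 3')
```

The raw YAML string would then be handed to a YAML parser to obtain the options dict that fenced_action receives.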