% Pandoc User's Guide
% John MacFarlane
% March 26, 2017
Synopsis
========
`pandoc` [*options*] [*input-file*]...
Description
===========
Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[Markdown], [CommonMark], [PHP Markdown Extra], [GitHub-Flavored
Markdown], [MultiMarkdown], and (subsets of) [Textile],
[reStructuredText], [HTML], [LaTeX], [MediaWiki markup], [TWiki markup],
[TikiWiki markup], [Haddock markup], [OPML], [Emacs Org mode], [DocBook],
[Muse], [txt2tags], [Vimwiki], [EPUB], [ODT], and [Word docx]; and it can
write plain text, [Markdown], [CommonMark], [PHP Markdown
Extra], [GitHub-Flavored Markdown], [MultiMarkdown],
[reStructuredText], [XHTML], [HTML5], [LaTeX] \(including
[`beamer`] slide shows\), [ConTeXt], [RTF], [OPML], [DocBook],
[OpenDocument], [ODT], [Word docx], [GNU Texinfo], [MediaWiki
markup], [DokuWiki markup], [ZimWiki markup], [Haddock markup],
[EPUB] \(v2 or v3\), [FictionBook2], [Textile], [groff man],
[groff ms], [Emacs Org mode], [AsciiDoc], [InDesign ICML], [TEI
Simple], [Muse] and [Slidy], [Slideous], [DZSlides], [reveal.js]
or [S5] HTML slide shows. It can also produce [PDF] output on
systems where LaTeX, ConTeXt, `pdfroff`, or `wkhtmltopdf` is
installed.
Pandoc's enhanced version of Markdown includes syntax for [footnotes],
[tables], flexible [ordered lists], [definition lists], [fenced code
blocks], [superscripts and subscripts], [strikeout], [metadata blocks],
automatic tables of contents, embedded LaTeX [math], [citations], and
[Markdown inside HTML block elements][Extension:
`markdown_in_html_blocks`]. (These enhancements, described further under
[Pandoc's Markdown], can be disabled using the `markdown_strict` input
or output format.)
In contrast to most existing tools for converting Markdown to HTML, which
use regex substitutions, pandoc has a modular design: it consists of a
set of readers, which parse text in a given format and produce a native
representation of the document, and a set of writers, which convert
this native representation into a target format. Thus, adding an input
or output format requires only adding a reader or writer.
Because pandoc's intermediate representation of a document is less
expressive than many of the formats it converts between, one should
not expect perfect conversions between every format and every other.
Pandoc attempts to preserve the structural elements of a document, but
not formatting details such as margin size. And some document elements,
such as complex tables, may not fit into pandoc's simple document
model. While conversions from pandoc's Markdown to all formats aspire
to be perfect, conversions from formats more expressive than pandoc's
Markdown can be expected to be lossy.
[Markdown]: http://daringfireball.net/projects/markdown/
[CommonMark]: http://commonmark.org
[PHP Markdown Extra]: https://michelf.ca/projects/php-markdown/extra/
[GitHub-Flavored Markdown]: https://help.github.com/articles/github-flavored-markdown/
[MultiMarkdown]: http://fletcherpenney.net/multimarkdown/
[reStructuredText]: http://docutils.sourceforge.net/docs/ref/rst/introduction.html
[S5]: http://meyerweb.com/eric/tools/s5/
[Slidy]: http://www.w3.org/Talks/Tools/Slidy/
[Slideous]: http://goessner.net/articles/slideous/
[HTML]: http://www.w3.org/html/
[HTML5]: http://www.w3.org/TR/html5/
[polyglot markup]: https://www.w3.org/TR/html-polyglot/
[XHTML]: http://www.w3.org/TR/xhtml1/
[LaTeX]: http://latex-project.org
[`beamer`]: https://ctan.org/pkg/beamer
[Beamer User's Guide]: http://ctan.math.utah.edu/ctan/tex-archive/macros/latex/contrib/beamer/doc/beameruserguide.pdf
[ConTeXt]: http://www.contextgarden.net/
[RTF]: http://en.wikipedia.org/wiki/Rich_Text_Format
[DocBook]: http://docbook.org
[txt2tags]: http://txt2tags.org
[EPUB]: http://idpf.org/epub
[OPML]: http://dev.opml.org/spec2.html
[OpenDocument]: http://opendocument.xml.org
[ODT]: http://en.wikipedia.org/wiki/OpenDocument
[Textile]: http://redcloth.org/textile
[MediaWiki markup]: https://www.mediawiki.org/wiki/Help:Formatting
[DokuWiki markup]: https://www.dokuwiki.org/dokuwiki
[ZimWiki markup]: http://zim-wiki.org/manual/Help/Wiki_Syntax.html
[TWiki markup]: http://twiki.org/cgi-bin/view/TWiki/TextFormattingRules
[TikiWiki markup]: https://doc.tiki.org/Wiki-Syntax-Text#The_Markup_Language_Wiki-Syntax
[Haddock markup]: https://www.haskell.org/haddock/doc/html/ch03s08.html
[groff man]: http://man7.org/linux/man-pages/man7/groff_man.7.html
[groff ms]: http://man7.org/linux/man-pages/man7/groff_ms.7.html
[Haskell]: https://www.haskell.org
[GNU Texinfo]: http://www.gnu.org/software/texinfo/
[Emacs Org mode]: http://orgmode.org
[AsciiDoc]: http://www.methods.co.nz/asciidoc/
[DZSlides]: http://paulrouget.com/dzslides/
[Word docx]: https://en.wikipedia.org/wiki/Office_Open_XML
[PDF]: https://www.adobe.com/pdf/
[reveal.js]: http://lab.hakim.se/reveal-js/
[FictionBook2]: http://www.fictionbook.org/index.php/Eng:XML_Schema_Fictionbook_2.1
[InDesign ICML]: https://www.adobe.com/content/dam/Adobe/en/devnet/indesign/cs55-docs/IDML/idml-specification.pdf
[TEI Simple]: https://github.com/TEIC/TEI-Simple
[Muse]: https://amusewiki.org/library/manual
[Vimwiki]: https://vimwiki.github.io
Using `pandoc`
--------------
If no *input-file* is specified, input is read from *stdin*.
Otherwise, the *input-files* are concatenated (with a blank
line between each) and used as input. Output goes to *stdout* by
default (though output to *stdout* is disabled for the `odt`, `docx`,
`epub2`, and `epub3` output formats). For output to a file, use the
`-o` option:
pandoc -o output.html input.txt
By default, pandoc produces a document fragment, not a standalone
document with a proper header and footer. To produce a standalone
document, use the `-s` or `--standalone` flag:
pandoc -s -o output.html input.txt
For more information on how standalone documents are produced, see
[Templates], below.
Instead of a file, an absolute URI may be given. In this case
pandoc will fetch the content using HTTP:
pandoc -f html -t markdown http://www.fsf.org
It is possible to supply a custom User-Agent string when requesting a
document from a URL, by setting an environment variable:
USER_AGENT="Mozilla/5.0" pandoc -f html -t markdown http://www.fsf.org
If multiple input files are given, `pandoc` will concatenate them all (with
blank lines between them) before parsing. This feature is disabled for
binary input formats such as `EPUB`, `odt`, and `docx`.
The format of the input and output can be specified explicitly using
command-line options. The input format can be specified using the
`-r/--read` or `-f/--from` options, the output format using the
`-w/--write` or `-t/--to` options. Thus, to convert `hello.txt` from
Markdown to LaTeX, you could type:
pandoc -f markdown -t latex hello.txt
To convert `hello.html` from HTML to Markdown:
pandoc -f html -t markdown hello.html
Supported output formats are listed below under the `-t/--to` option.
Supported input formats are listed below under the `-f/--from` option. Note
that the `rst`, `textile`, `latex`, and `html` readers are not complete;
there are some constructs that they do not parse.
If the input or output format is not specified explicitly, `pandoc`
will attempt to guess it from the extensions of
the input and output filenames. Thus, for example,
pandoc -o hello.tex hello.txt
will convert `hello.txt` from Markdown to LaTeX. If no output file
is specified (so that output goes to *stdout*), or if the output file's
extension is unknown, the output format will default to HTML.
If no input file is specified (so that input comes from *stdin*), or
if the input files' extensions are unknown, the input format will
be assumed to be Markdown unless explicitly specified.
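The fallback rules above can be summarised in a short Python sketch. This is not pandoc's implementation: the function names and the two small extension tables are hypothetical and deliberately incomplete, and only the `.tex` mapping and the HTML/Markdown defaults come from the text above.

```python
import os

# Illustrative, incomplete extension tables (assumptions, not pandoc's
# real tables). Only ".tex" -> LaTeX is confirmed by the manual text.
OUTPUT_BY_EXTENSION = {".tex": "latex", ".html": "html"}
INPUT_BY_EXTENSION = {".tex": "latex", ".html": "html"}


def guess_output_format(output_file=None):
    """Guess the output format from the output filename.

    Falls back to HTML when there is no output file (stdout) or when
    the extension is not recognised.
    """
    if output_file is None:
        return "html"
    extension = os.path.splitext(output_file)[1].lower()
    return OUTPUT_BY_EXTENSION.get(extension, "html")


def guess_input_format(input_files=()):
    """Guess the input format from the input filenames.

    Falls back to Markdown when there are no input files (stdin) or
    when no extension is recognised. Assumption: the first recognised
    extension wins.
    """
    for name in input_files:
        extension = os.path.splitext(name)[1].lower()
        if extension in INPUT_BY_EXTENSION:
            return INPUT_BY_EXTENSION[extension]
    return "markdown"
```

With these tables, `guess_output_format("hello.tex")` gives `"latex"` and `guess_input_format(["hello.txt"])` gives `"markdown"`, matching the `pandoc -o hello.tex hello.txt` example.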
Pandoc uses the UTF-8 character encoding for both input and output.
If your local character encoding is not UTF-8, you
should pipe input and output through [`iconv`]:
iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
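For readers who prefer to do the transcoding in a script, the two `iconv` steps amount to the following Python sketch. The function names are hypothetical, and the local encoding (`latin-1` in the usage note) is only an example.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the local encoding and re-encode them as UTF-8.

    Roughly what the first iconv step does before the text reaches pandoc.
    """
    return raw.decode(source_encoding).encode("utf-8")


def from_utf8(raw: bytes, target_encoding: str) -> bytes:
    """Decode UTF-8 bytes and re-encode them in the local encoding.

    Roughly what the second iconv step does to pandoc's output.
    """
    return raw.decode("utf-8").encode(target_encoding)
```

For example, `to_utf8(b"caf\xe9", "latin-1")` returns the UTF-8 bytes `b"caf\xc3\xa9"`, and `from_utf8` reverses the conversion.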
Note that in some output formats (such as HTML, LaTeX, ConTeXt,
RTF, OPML, DocBook, and Texinfo), information about
the character encoding is included in the document header, which
will only be included if you use the `-s/--standalone` option.
[`iconv`]: http://www.gnu.org/software/libiconv/
Creating a PDF
--------------
To produce a PDF, specify an output file with a `.pdf` extension.
By default, pandoc will use LaTeX to create the PDF:
pandoc test.txt -o test.pdf
Production of a PDF requires that a LaTeX engine be installed (see
`--latex-engine`, below), and assumes that the following LaTeX packages
are available: [`amsfonts`], [`amsmath`], [`lm`], [`unicode-math`],
[`ifxetex`], [`ifluatex`], [`eurosym`], [`listings`] (if the
`--listings` option is used), [`fancyvrb`], [`longtable`],
[`booktabs`], [`graphicx`] and [`grffile`] (if the document
contains images), [`hyperref`], [`ulem`], [`geometry`] (with the
`geometry` variable set), [`setspace`] (with `linestretch`), and
[`babel`] (with `lang`). The use of `xelatex` or `lualatex` as
the LaTeX engine requires [`fontspec`]. `xelatex` uses
[`polyglossia`] (with `lang`), [`xecjk`], and [`bidi`] (with the
`dir` variable set). If the `mathspec` variable is set,
`xelatex` will use [`mathspec`] instead of [`unicode-math`].
The [`upquote`] and [`microtype`] packages are used if
available, and [`csquotes`] will be used for [smart punctuation]
if added to the template or included in any header file. The
[`natbib`], [`biblatex`], [`bibtex`], and [`biber`] packages can
optionally be used for [citation rendering]. These are included
with all recent versions of [TeX Live].
Alternatively, pandoc can use ConTeXt, `wkhtmltopdf`, or
`pdfroff` to create a PDF. To do this, specify an output file
with a `.pdf` extension, as before, but add `-t context`, `-t
html5`, or `-t ms` to the command line.
PDF output can be controlled using [variables for LaTeX] (if
LaTeX is used) and [variables for ConTeXt] (if ConTeXt is used).
If `wkhtmltopdf` is used, then the variables `margin-left`,
`margin-right`, `margin-top`, `margin-bottom`, and `papersize`
will affect the output, as will `--css`.
[`amsfonts`]: https://ctan.org/pkg/amsfonts
[`amsmath`]: https://ctan.org/pkg/amsmath
[`lm`]: https://ctan.org/pkg/lm
[`ifxetex`]: https://ctan.org/pkg/ifxetex
[`ifluatex`]: https://ctan.org/pkg/ifluatex
[`eurosym`]: https://ctan.org/pkg/eurosym
[`listings`]: https://ctan.org/pkg/listings
[`fancyvrb`]: https://ctan.org/pkg/fancyvrb
[`longtable`]: https://ctan.org/pkg/longtable
[`booktabs`]: https://ctan.org/pkg/booktabs
[`graphicx`]: https://ctan.org/pkg/graphicx
[`grffile`]: https://ctan.org/pkg/grffile
[`geometry`]: https://ctan.org/pkg/geometry
[`setspace`]: https://ctan.org/pkg/setspace
[`xecjk`]: https://ctan.org/pkg/xecjk
[`hyperref`]: https://ctan.org/pkg/hyperref
[`ulem`]: https://ctan.org/pkg/ulem
[`babel`]: https://ctan.org/pkg/babel
[`bidi`]: https://ctan.org/pkg/bidi
[`mathspec`]: https://ctan.org/pkg/mathspec
[`unicode-math`]: https://ctan.org/pkg/unicode-math
[`polyglossia`]: https://ctan.org/pkg/polyglossia
[`fontspec`]: https://ctan.org/pkg/fontspec
[`upquote`]: https://ctan.org/pkg/upquote
[`microtype`]: https://ctan.org/pkg/microtype
[`csquotes`]: https://ctan.org/pkg/csquotes
[`natbib`]: https://ctan.org/pkg/natbib
[`biblatex`]: https://ctan.org/pkg/biblatex
[`bibtex`]: https://ctan.org/pkg/bibtex
[`biber`]: https://ctan.org/pkg/biber
[TeX Live]: http://www.tug.org/texlive/
Options
=======
General options
---------------
`-f` *FORMAT*, `-r` *FORMAT*, `--from=`*FORMAT*, `--read=`*FORMAT*
: Specify input format. *FORMAT* can be `native` (native Haskell),
`json` (JSON version of native AST), `markdown` (pandoc's
extended Markdown), `markdown_strict` (original unextended
Markdown), `markdown_phpextra` (PHP Markdown Extra), `markdown_github`
(GitHub-Flavored Markdown), `markdown_mmd` (MultiMarkdown),
`commonmark` (CommonMark Markdown), `textile` (Textile), `rst`
(reStructuredText), `html` (HTML), `docbook` (DocBook), `t2t`
(txt2tags), `docx` (docx), `odt` (ODT), `epub` (EPUB), `opml` (OPML),
`org` (Emacs Org mode), `mediawiki` (MediaWiki markup), `twiki` (TWiki
markup), `tikiwiki` (TikiWiki markup), `haddock` (Haddock markup), or
`latex` (LaTeX). If `+lhs` is appended to `markdown`, `rst`, `latex`, or
`html`, the input will be treated as literate Haskell source: see
[Literate Haskell support], below. Markdown
syntax extensions can be individually enabled or disabled by
appending `+EXTENSION` or `-EXTENSION` to the format name. So, for
example, `markdown_strict+footnotes+definition_lists` is strict
Markdown with footnotes and definition lists enabled, and
`markdown-pipe_tables+hard_line_breaks` is pandoc's Markdown
without pipe tables and with hard line breaks. See [Pandoc's
Markdown], below, for a list of extensions and
their names. See `--list-input-formats` and `--list-extensions`,
below.
`-t` *FORMAT*, `-w` *FORMAT*, `--to=`*FORMAT*, `--write=`*FORMAT*
: Specify output format. *FORMAT* can be `native` (native Haskell),
`json` (JSON version of native AST), `plain` (plain text),
`markdown` (pandoc's extended Markdown), `markdown_strict`
(original unextended Markdown), `markdown_phpextra` (PHP Markdown
Extra), `markdown_github` (GitHub-Flavored Markdown), `markdown_mmd`
(MultiMarkdown), `commonmark` (CommonMark Markdown), `rst`
(reStructuredText), `html4` (XHTML 1.0 Transitional), `html`
or `html5` (HTML5/XHTML [polyglot markup]), `latex`
(LaTeX), `beamer` (LaTeX beamer slide show), `context` (ConTeXt),
`man` (groff man), `mediawiki` (MediaWiki markup),
`dokuwiki` (DokuWiki markup), `zimwiki` (ZimWiki markup),
`textile` (Textile), `org` (Emacs Org mode), `texinfo` (GNU Texinfo),
`opml` (OPML), `docbook` or `docbook4` (DocBook 4), `docbook5`
(DocBook 5), `jats` (JATS XML), `opendocument` (OpenDocument),
`odt` (OpenOffice text document), `docx` (Word docx), `haddock`
(Haddock markup), `rtf` (rich text format), `epub2` (EPUB v2 book),
`epub` or `epub3` (EPUB v3), `fb2` (FictionBook2 e-book),
`asciidoc` (AsciiDoc), `icml` (InDesign ICML), `tei` (TEI
Simple), `slidy` (Slidy HTML and JavaScript slide show),
`slideous` (Slideous HTML and JavaScript slide show),
`dzslides` (DZSlides HTML5 + JavaScript slide show),
`revealjs` (reveal.js HTML5 + JavaScript slide show), `s5`
(S5 HTML and JavaScript slide show), or the path of a custom
lua writer (see [Custom writers], below). Note that `odt`, `docx`,
`epub`, and `epub3` output will not be directed to *stdout*;
an output filename must be specified using the `-o/--output`
option. If `+lhs` is appended to `markdown`, `rst`, `latex`,
`beamer`, `html4`, or `html5`, the output will be rendered as
literate Haskell source: see [Literate Haskell support],
below. Markdown syntax extensions can be individually
enabled or disabled by appending `+EXTENSION` or
`-EXTENSION` to the format name, as described above under `-f`.
See `--list-output-formats` and `--list-extensions`, below.
`-o` *FILE*, `--output=`*FILE*
: Write output to *FILE* instead of *stdout*. If *FILE* is
`-`, output will go to *stdout*. (Exception: if the output
format is `odt`, `docx`, `epub`, or `epub3`, output to stdout is disabled.)
`--data-dir=`*DIRECTORY*
: Specify the user data directory to search for pandoc data files.
If this option is not specified, the default user data directory
will be used. This is, in Unix:
$HOME/.pandoc
in Windows XP:
C:\Documents And Settings\USERNAME\Application Data\pandoc
and in Windows Vista or later:
C:\Users\USERNAME\AppData\Roaming\pandoc
You can find the default user data directory on your system by
looking at the output of `pandoc --version`.
A `reference.odt`, `reference.docx`, `epub.css`, `templates`,
`slidy`, `slideous`, or `s5` directory
placed in this directory will override pandoc's normal defaults.
`--bash-completion`
: Generate a bash completion script. To enable bash completion
with pandoc, add this to your `.bashrc`:
eval "$(pandoc --bash-completion)"
`--verbose`
: Give verbose debugging output. Currently this only has an effect
with PDF output.
`--quiet`
: Suppress warning messages.
`--fail-if-warnings`
: Exit with error status if there are any warnings.
`--log=`*FILE*
: Write log messages in machine-readable JSON format to
*FILE*. All messages above DEBUG level will be written,
regardless of verbosity settings (`--verbose`, `--quiet`).
`--list-input-formats`
: List supported input formats, one per line.
`--list-output-formats`
: List supported output formats, one per line.
`--list-extensions`
: List supported Markdown extensions, one per line, followed
by a `+` or `-` indicating whether it is enabled by default
in pandoc's Markdown.
`--list-highlight-languages`
: List supported languages for syntax highlighting, one per
line.
`--list-highlight-styles`
: List supported styles for syntax highlighting, one per line.
See `--highlight-style`.
`-v`, `--version`
: Print version.
`-h`, `--help`
: Show usage message.
Reader options
--------------
`--base-header-level=`*NUMBER*
: Specify the base level for headers (defaults to 1).
`--indented-code-classes=`*CLASSES*
: Specify classes to use for indented code blocks--for example,
`perl,numberLines` or `haskell`. Multiple classes may be separated
by spaces or commas.
`--default-image-extension=`*EXTENSION*
: Specify a default extension to use when image paths/URLs have no
extension. This allows you to use the same source for formats that
require different kinds of images. Currently this option only affects
the Markdown and LaTeX readers.
`--file-scope`
: Parse each file individually before combining for multifile
documents. This will allow footnotes in different files with the
same identifiers to work as expected. If this option is set,
footnotes and links will not work across files. Reading binary
files (docx, odt, epub) implies `--file-scope`.
`--filter=`*PROGRAM*
: Specify an executable to be used as a filter transforming the
pandoc AST after the input is parsed and before the output is
written. The executable should read JSON from stdin and write
JSON to stdout. The JSON must be formatted like pandoc's own
JSON input and output. The name of the output format will be
passed to the filter as the first argument. Hence,
pandoc --filter ./caps.py -t latex
is equivalent to
pandoc -t json | ./caps.py latex | pandoc -f json -t latex
The latter form may be useful for debugging filters.
Filters may be written in any language. `Text.Pandoc.JSON`
exports `toJSONFilter` to facilitate writing filters in Haskell.
Those who would prefer to write filters in python can use the
module [`pandocfilters`], installable from PyPI. There are also
pandoc filter libraries in [PHP], [perl], and
[javascript/node.js].
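As an illustration of the JSON-in, JSON-out contract, here is a minimal sketch of a `caps.py`-style filter that uses only the Python standard library. It assumes that AST elements are serialised as objects with a `t` (type) key and an optional `c` (contents) key. The exact JSON layout depends on the pandoc version, so treat this as a sketch rather than a reference implementation. The helper names `walk` and `caps` are chosen for this example.

```python
import json
import sys


def walk(node, action):
    """Rebuild a JSON tree, applying action to every object with a "t" key.

    Children are processed before their parent (depth first).
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, action) for key, value in node.items()}
        if "t" in rebuilt:
            return action(rebuilt)
        return rebuilt
    return node


def caps(element):
    """Upper-case the text of Str elements; leave everything else alone."""
    if element.get("t") == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def main():
    """Read a document as JSON from stdin and write the result to stdout.

    sys.argv[1], when present, is the output format name that pandoc
    passes to the filter; this sketch does not need it.
    """
    document = json.load(sys.stdin)
    json.dump(walk(document, caps), sys.stdout)
```

To use the sketch as a filter, add a call to `main()` at the end of the script and make the file executable.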
In order of preference, pandoc will look for filters in
1. a specified full or relative path (executable or
non-executable)
2. `$DATADIR/filters` (executable or non-executable)
3. `$PATH` (executable only)
`--lua-filter=`*SCRIPT*
: Transform the document in a similar fashion to JSON filters (see
`--filter`), but use pandoc's built-in lua filtering system. The given
lua script is expected to return a list of lua filters which will be
applied in order. Each lua filter must contain element-transforming
functions indexed by the name of the AST element on which the filter
function should be applied.
The `pandoc` lua module provides helper functions for element
creation. It is always loaded into the script's lua environment.
The following is an example lua script for macro-expansion:
    function expand_hello_world(inline)
      if inline.c == '{{helloworld}}' then
        return pandoc.Emph{ pandoc.Str "Hello, World" }
      else
        return inline
      end
    end

    return {{Str = expand_hello_world}}
`-M` *KEY*[`=`*VAL*], `--metadata=`*KEY*[`:`*VAL*]
: Set the metadata field *KEY* to the value *VAL*. A value specified
on the command line overrides a value specified in the document.
Values will be parsed as YAML boolean or string values. If no value is
specified, the value will be treated as Boolean true. Like
`--variable`, `--metadata` causes template variables to be set.
But unlike `--variable`, `--metadata` affects the metadata of the
underlying document (which is accessible from filters and may be
printed in some output formats).
`-p`, `--preserve-tabs`
: Preserve tabs instead of converting them to spaces. (By default,
tabs are converted to spaces.)
Note that this will only affect tabs in literal code spans and code
blocks; tabs in regular text will be treated as spaces.
`--tab-stop=`*NUMBER*
: Specify the number of spaces per tab (default is 4).
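To make the notion of a tab stop concrete, the following Python sketch expands each tab to the next multiple of the tab stop. It illustrates the usual tab-stop convention and is not taken from pandoc's source. The function name is hypothetical.

```python
def expand_tabs(line: str, tab_stop: int = 4) -> str:
    """Replace each tab with spaces up to the next multiple of tab_stop."""
    pieces = []
    column = 0
    for character in line:
        if character == "\t":
            # Pad to the next tab stop; a tab always adds at least one space.
            padding = tab_stop - (column % tab_stop)
            pieces.append(" " * padding)
            column += padding
        else:
            pieces.append(character)
            column += 1
    return "".join(pieces)
```

With the default of 4, a tab after a single character becomes three spaces, while a tab at the start of a line becomes four.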
`--track-changes=accept`|`reject`|`all`
: Specifies what to do with insertions, deletions, and comments
produced by the MS Word "Track Changes" feature. `accept` (the
default) inserts all insertions and ignores all
deletions. `reject` inserts all deletions and ignores
insertions. Both `accept` and `reject` ignore comments. `all` puts
in insertions, deletions, and comments, wrapped in spans with
`insertion`, `deletion`, `comment-start`, and `comment-end`
classes, respectively. The author and time of change are
included. `all` is useful for scripting: only accepting changes
from a certain reviewer, say, or before a certain date. This
option only affects the docx reader.
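The three modes can be pictured with a small Python sketch over a simplified, hypothetical model of a tracked document: a list of `(kind, text)` runs. The span markup produced for `all` is an assumption made for illustration. In particular, it omits the author and date attributes, and the real reader works on the docx structure rather than on strings.

```python
def apply_track_changes(runs, mode="accept"):
    """Resolve tracked changes in a list of (kind, text) runs.

    kind is one of "text", "insertion", "deletion", or "comment".
    Returns a list of output strings.
    """
    if mode not in ("accept", "reject", "all"):
        raise ValueError("mode must be 'accept', 'reject', or 'all'")
    result = []
    for kind, text in runs:
        if kind == "text":
            # Untracked text is always kept.
            result.append(text)
        elif mode == "accept":
            # Keep insertions; drop deletions and comments.
            if kind == "insertion":
                result.append(text)
        elif mode == "reject":
            # Keep deletions; drop insertions and comments.
            if kind == "deletion":
                result.append(text)
        elif kind == "comment":
            # mode == "all": mark where the comment starts and ends.
            result.append('<span class="comment-start">%s</span>' % text)
            result.append('<span class="comment-end"></span>')
        else:
            # mode == "all": wrap the change in a span named after its kind.
            result.append('<span class="%s">%s</span>' % (kind, text))
    return result
```

Given the runs `The `, deleted `quick `, inserted `slow `, `fox`, `accept` yields "The slow fox" and `reject` yields "The quick fox". `all` keeps both changes, each wrapped in a span that a later script can inspect.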
`--extract-media=`*DIR*
: Extract images and other media contained in or linked from
the source document to the path *DIR*, creating it if
necessary, and adjust the image references in the document
so they point to the extracted files. If the source format is
a binary container (docx, epub, or odt), the media is
extracted from the container and the original
filenames are used. Otherwise the media is read from the
file system or downloaded, and new filenames are constructed
based on SHA1 hashes of the contents.
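A content-based filename of this kind can be sketched in Python as follows. The manual only says that names are based on SHA1 hashes of the contents, so the exact scheme here (hex digest plus the original extension) and the function name are assumptions.

```python
import hashlib
import os


def hashed_media_name(content: bytes, original_name: str) -> str:
    """Build a filename from the SHA1 hex digest of the content.

    The original extension is kept so the file type stays recognisable
    (an assumption of this sketch).
    """
    extension = os.path.splitext(original_name)[1]
    return hashlib.sha1(content).hexdigest() + extension
```

Because the name depends only on the bytes, two references to identical content map to the same extracted file.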
`--abbreviations=`*FILE*
: Specifies a custom abbreviations file, with abbreviations
one to a line. If this option is not specified, pandoc will
read the data file `abbreviations` from the user data
directory or fall back on a system default. To see the
system default, use
`pandoc --print-default-data-file=abbreviations`. The only
use pandoc makes of this list is in the Markdown reader.
Strings ending in a period that are found in this list will
be followed by a nonbreaking space, so that the period will
not produce sentence-ending space in formats like LaTeX.
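The effect on the text can be illustrated with a small Python sketch that replaces the space after a listed abbreviation with a nonbreaking space (U+00A0). The function name and the abbreviation set used in the usage note are hypothetical, and pandoc's reader does this while parsing rather than on a finished string.

```python
NBSP = "\u00a0"  # nonbreaking space


def protect_abbreviations(text, abbreviations):
    """Join words with NBSP after any word found in abbreviations.

    All other word boundaries keep an ordinary space.
    """
    words = text.split(" ")
    pieces = []
    for index, word in enumerate(words):
        pieces.append(word)
        if index < len(words) - 1:
            pieces.append(NBSP if word in abbreviations else " ")
    return "".join(pieces)
```

For example, with the set `{"Mr.", "Dr."}`, the space in "Mr. Smith" becomes nonbreaking, while the period that ends a sentence is left alone.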
[`pandocfilters`]: https://github.com/jgm/pandocfilters
[PHP]: https://github.com/vinai/pandocfilters-php
[perl]: https://metacpan.org/pod/Pandoc::Filter
[javascript/node.js]: https://github.com/mvhenderson/pandoc-filter-node
General writer options
----------------------
`-s`, `--standalone`
: Produce output with an appropriate header and footer (e.g. a
standalone HTML, LaTeX, TEI, or RTF file, not a fragment). This option
is set automatically for `pdf`, `epub`, `epub3`, `fb2`, `docx`, and `odt`
output.
`--template=`*FILE*
: Use *FILE* as a custom template for the generated document. Implies
`--standalone`. See [Templates], below, for a description
of template syntax. If no extension is specified, an extension
corresponding to the writer will be added, so that `--template=special`
looks for `special.html` for HTML output. If the template is not
found, pandoc will search for it in the `templates` subdirectory of
the user data directory (see `--data-dir`). If this option is not used,
a default template appropriate for the output format will be used (see
`-D/--print-default-template`).
`-V` *KEY*[`=`*VAL*], `--variable=`*KEY*[`:`*VAL*]
: Set the template variable *KEY* to the value *VAL* when rendering the
document in standalone mode. This is generally only useful when the
`--template` option is used to specify a custom template, since
pandoc automatically sets the variables used in the default
templates. If no *VAL* is specified, the key will be given the
value `true`.
`-D` *FORMAT*, `--print-default-template=`*FORMAT*
: Print the system default template for an output *FORMAT*. (See `-t`
for a list of possible *FORMAT*s.) Templates in the user data
directory are ignored.
`--print-default-data-file=`*FILE*
: Print a system default data file. Files in the user data directory
are ignored.
`--eol=crlf`|`lf`|`native`
: Manually specify line endings: `crlf` (Windows), `lf`
(MacOS/linux/unix), or `native` (line endings appropriate
to the OS on which pandoc is being run). The default is
`native`.
`--dpi`=*NUMBER*
: Specify the dpi (dots per inch) value for conversion from pixels
to inches/centimeters and vice versa. The default is 96dpi.
Technically, the correct term would be ppi (pixels per inch).
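The arithmetic behind this option is simple enough to sketch in Python. The function names are hypothetical, and the constant 2.54 is the number of centimeters per inch.

```python
CM_PER_INCH = 2.54


def pixels_to_inches(pixels, dpi=96):
    """Convert a pixel length to inches at the given dpi."""
    return pixels / dpi


def pixels_to_cm(pixels, dpi=96):
    """Convert a pixel length to centimeters at the given dpi."""
    return pixels_to_inches(pixels, dpi) * CM_PER_INCH


def inches_to_pixels(inches, dpi=96):
    """Convert a length in inches to pixels at the given dpi."""
    return inches * dpi
```

At the default of 96, an image 96 pixels wide is treated as one inch (2.54 cm) wide. At `--dpi=300` the same physical width corresponds to 300 pixels.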
`--wrap=auto`|`none`|`preserve`
: Determine how text is wrapped in the output (the source
code, not the rendered version). With `auto` (the default),
pandoc will attempt to wrap lines to the column width specified by
`--columns` (default 72). With `none`, pandoc will not wrap
lines at all. With `preserve`, pandoc will attempt to
preserve the wrapping from the source document (that is,
where there are nonsemantic newlines in the source, there
will be nonsemantic newlines in the output as well).
Automatic wrapping does not currently work in HTML output.
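The difference between the three modes can be shown on a single paragraph with Python's `textwrap` module. This sketch only mimics the visible effect. Pandoc's own wrapping is format-aware and is not implemented this way, and the function name is hypothetical.

```python
import textwrap


def wrap_source(paragraph, mode="auto", columns=72):
    """Re-wrap one paragraph of source text according to mode."""
    if mode == "auto":
        # Discard the original line breaks and wrap to the column width.
        return textwrap.fill(" ".join(paragraph.split()), width=columns)
    if mode == "none":
        # Discard the original line breaks and do not wrap at all.
        return " ".join(paragraph.split())
    if mode == "preserve":
        # Keep the line breaks exactly where the source had them.
        return paragraph
    raise ValueError("mode must be 'auto', 'none', or 'preserve'")
```

With `auto`, no output line is longer than `columns` as long as every word fits within that width. With `none`, the paragraph becomes one long line.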
`--columns=`*NUMBER*
: Specify length of lines in characters. This affects text wrapping
in the generated source code (see `--wrap`). It also affects
calculation of column widths for plain text tables (see [Tables] below).
`--toc`, `--table-of-contents`
: Include an automatically generated table of contents (or, in
the case of `latex`, `context`, `docx`, `odt`,
`opendocument`, `rst`, or `ms`, an instruction to create
one) in the output document. This option has no effect on
`man`, `docbook4`, `docbook5`, or `jats` output.
`--toc-depth=`*NUMBER*
: Specify the number of section levels to include in the table
of contents. The default is 3 (which means that level 1, 2, and 3
headers will be listed in the contents).
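In other words, the depth acts as a cutoff on header level. A minimal Python sketch, using a hypothetical list of `(level, title)` pairs as the document outline:

```python
def toc_entries(headers, depth=3):
    """Keep the headers whose level is at most depth, in document order.

    headers is a list of (level, title) pairs, with level 1 for
    top-level sections.
    """
    return [(level, title) for level, title in headers if level <= depth]
```

With the default depth of 3, a level-4 header is left out of the contents. With `--toc-depth=1`, only top-level sections remain.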
`--no-highlight`
: Disables syntax highlighting for code blocks and inlines, even when
a language attribute is given.
`--highlight-style=`*STYLE*|*FILE*
: Specifies the coloring style to be used in highlighted source code.
Options are `pygments` (the default), `kate`, `monochrome`,
`breezeDark`, `espresso`, `zenburn`, `haddock`, and `tango`.
For more information on syntax highlighting in pandoc, see
[Syntax highlighting], below. See also
`--list-highlight-styles`.
Instead of a *STYLE* name, a JSON file with extension
`.theme` may be supplied. This will be parsed as a KDE
syntax highlighting theme and (if valid) used as the
highlighting style. To see a sample theme that can be
modified, run `pandoc --print-default-data-file default.theme`.
`--syntax-definition=`*FILE*
: Instructs pandoc to load a KDE XML syntax definition file,
which will be used for syntax highlighting of appropriately
marked code blocks. This can be used to add support for
new languages or to use altered syntax definitions for
existing languages.
`-H` *FILE*, `--include-in-header=`*FILE*
: Include contents of *FILE*, verbatim, at the end of the header.
This can be used, for example, to include special
CSS or JavaScript in HTML documents. This option can be used
repeatedly to include multiple files in the header. They will be
included in the order specified. Implies `--standalone`.
`-B` *FILE*, `--include-before-body=`*FILE*
: Include contents of *FILE*, verbatim, at the beginning of the
document body (e.g. after the `<body>` tag in HTML, or the
`\begin{document}` command in LaTeX). This can be used to include
navigation bars or banners in HTML documents. This option can be
used repeatedly to include multiple files. They will be included in
the order specified. Implies `--standalone`.
`-A` *FILE*, `--include-after-body=`*FILE*
: Include contents of *FILE*, verbatim, at the end of the document
body (before the `</body>` tag in HTML, or the
`\end{document}` command in LaTeX). This option can be used
repeatedly to include multiple files. They will be included in the
order specified. Implies `--standalone`.
`--resource-path=`*SEARCHPATH*
: List of paths to search for images and other resources.
The paths should be separated by `:` on Linux, UNIX, and
macOS systems, and by `;` on Windows. If `--resource-path`
is not specified, the default resource path is the working
directory. Note that, if `--resource-path` is specified,
the working directory must be explicitly listed or it
will not be searched. For example:
`--resource-path=.:test` will search the working directory
and the `test` subdirectory, in that order.
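The search order described above can be illustrated with a minimal Python sketch. It is not pandoc's implementation; the function name `find_resource` and its parameters are hypothetical. It assumes the platform separator (`os.pathsep`, which is `:` on Unix-like systems and `;` on Windows) and falls back to the working directory only when no search path is given.

```python
import os

def find_resource(name, search_path=None, sep=os.pathsep):
    """Return the first existing file called `name` on the search path.

    Hypothetical helper: with no search path, only the working
    directory is searched; otherwise only the listed directories
    are searched, in the order given.
    """
    dirs = search_path.split(sep) if search_path else ["."]
    for directory in dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None  # not found in any listed directory
```

With a search path of `.:test`, a file present in both directories is taken from the working directory, because it is listed first.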
Options affecting specific writers
----------------------------------
`--self-contained`
: Produce a standalone HTML file with no external dependencies, using
`data:` URIs to incorporate the contents of linked scripts, stylesheets,
images, and videos. The resulting file should be "self-contained,"
in the sense that it needs no external files and no net access to be
displayed properly by a browser. This option works only with HTML output
formats, including `html4`, `html5`, `html+lhs`, `html5+lhs`, `s5`,
`slidy`, `slideous`, `dzslides`, and `revealjs`. Scripts, images, and
stylesheets at absolute URLs will be downloaded; those at relative URLs
will be sought relative to the working directory (if the first source
file is local) or relative to the base URL (if the first source
file is remote). Elements with the attribute
`data-external="1"` will be left alone; the documents they
link to will not be incorporated in the document.
Limitation: resources that are loaded dynamically through
JavaScript cannot be incorporated; as a result,
`--self-contained` does not work with `--mathjax`, and some
advanced features (e.g. zoom or speaker notes) may not work
in an offline "self-contained" `reveal.js` slide show.
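As a rough illustration of what incorporating a resource through a `data:` URI means, the following Python sketch encodes a resource's bytes in base64 and prefixes the media type. The function name `to_data_uri` is hypothetical, and the sketch makes no claim about how pandoc itself fetches or encodes resources.

```python
import base64

def to_data_uri(content, mime_type):
    """Build a base64 data: URI from raw bytes and a media type."""
    encoded = base64.b64encode(content).decode("ascii")
    return "data:" + mime_type + ";base64," + encoded
```

An `<img>` element whose `src` is such a URI carries the image inside the HTML file itself, so no separate file or network request is needed to display it.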
`--html-q-tags`
: Use `<q>` tags for quotes in HTML.
`--ascii`
: Use only ASCII characters in output. Currently supported only for
HTML and DocBook output (which uses numerical entities instead of
UTF-8 when this option is selected).
`--reference-links`
: Use reference-style links, rather than inline links, in writing Markdown
or reStructuredText. By default inline links are used. The
placement of link references is affected by the
`--reference-location` option.
`--reference-location=block`|`section`|`document`
: Specify whether footnotes (and references, if `--reference-links` is
set) are placed at the end of the current (top-level) block, the
current section, or the document. The default is
`document`. Currently only affects the Markdown writer.
`--atx-headers`
: Use ATX-style headers in Markdown and AsciiDoc output. The default is
to use setext-style headers for levels 1-2, and then ATX headers.
`--top-level-division=default`|`section`|`chapter`|`part`
: Treat top-level headers as the given division type in LaTeX, ConTeXt,
DocBook, and TEI output. The hierarchy order is part, chapter, then section;
all headers are shifted such that the top-level header becomes the specified
type. The default behavior is to determine the best division type via
heuristics: unless other conditions apply, `section` is chosen. When the
LaTeX document class is set to `report`, `book`, or `memoir` (unless the
`article` option is specified), `chapter` is implied as the setting for this
option. If `beamer` is the output format, specifying either `chapter` or
`part` will cause top-level headers to become `\part{..}`, while
second-level headers remain as their default type.
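The shifting of headers along the part, chapter, section hierarchy can be sketched in Python as follows. This is only an illustration of the rule stated above for LaTeX output. The names `DIVISIONS` and `latex_command` are hypothetical, the list of divisions below `section` is an assumption based on the standard LaTeX sectioning commands, and neither the `default` heuristics nor the `beamer` special case is modeled.

```python
# Standard LaTeX sectioning commands, from highest to lowest.
DIVISIONS = ["part", "chapter", "section", "subsection",
             "subsubsection", "paragraph", "subparagraph"]

def latex_command(level, top_level="section"):
    """Map a header level (1 = top level) to a LaTeX command name.

    `top_level` is the division assigned to level-1 headers; deeper
    headers are shifted down the hierarchy by the same amount.
    Returns None when the hierarchy is exhausted (a choice made for
    this sketch only).
    """
    index = DIVISIONS.index(top_level) + level - 1
    if index >= len(DIVISIONS):
        return None
    return "\\" + DIVISIONS[index]
```

For example, with `top_level="chapter"` a level-1 header maps to `\chapter` and a level-2 header to `\section`.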
`-N`, `--number-sections`
: Number section headings in LaTeX, ConTeXt, HTML, or EPUB output.
By default, sections are not numbered. Sections with class
`unnumbered` will never be numbered, even if `--number-sections`
is specified.
`--number-offset=`*NUMBER*[`,`*NUMBER*`,`*...*]
: Offset for section headings in HTML output (ignored in other
output formats). The first number is added to the section number for
top-level headers, the second for second-level headers, and so on.
So, for example, if you want the first top-level header in your
document to be numbered "6", specify `--number-offset=5`.
If your document starts with a level-2 header which you want to
be numbered "1.5", specify `--number-offset=1,4`.
Offsets are 0 by default. Implies `--number-sections`.
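The arithmetic behind the two examples above can be checked with a small Python sketch. It is an illustration of the rule "each offset is added to the section number at its level", not pandoc's code; the function name `number_sections` is hypothetical, and a skipped level is assumed to count as 0 before the offset is added.

```python
def number_sections(levels, offsets=()):
    """Return dotted section numbers for a sequence of header levels."""
    counters = []
    numbers = []
    for level in levels:
        # Drop counters deeper than this level; pad skipped levels with 0.
        counters = counters[:level] + [0] * (level - len(counters))
        counters[level - 1] += 1
        # Add the offset for each level (0 where none was given).
        shifted = [n + (offsets[i] if i < len(offsets) else 0)
                   for i, n in enumerate(counters)]
        numbers.append(".".join(str(n) for n in shifted))
    return numbers
```

Here `number_sections([1], (5,))` gives `["6"]`, and `number_sections([2], (1, 4))` gives `["1.5"]`, matching the two cases described above.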
`--listings`
: Use the [`listings`] package for LaTeX code blocks.
`-i`, `--incremental`
: Make list items in slide shows display incrementally (one by one).
The default is for lists to be displayed all at once.
`--slide-level=`*NUMBER*
: Specifies that headers with the specified level create
slides (for `beamer`, `s5`, `slidy`, `slideous`, `dzslides`). Headers
above this level in the hierarchy are used to divide the
slide show into sections; headers below this level create
subheads within a slide. Note that content that is
not contained under slide-level headers will not appear in
the slide show. The default is to set the slide level based
on the contents of the document; see [Structuring the slide
show].
`--section-divs`
: Wrap sections in `<div>` tags (or `<section>` tags in HTML5),
and attach identifiers to the enclosing `<div>` (or `<section>`)
rather than the header itself. See
[Header identifiers], below.
`--email-obfuscation=none`|`javascript`|`references`
: Specify a method for obfuscating `mailto:` links in HTML documents.
`none` leaves `mailto:` links as they are. `javascript` obfuscates
them using JavaScript. `references` obfuscates them by printing their
letters as decimal or hexadecimal character references. The default
is `none`.
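To make the `references` method concrete, here is a minimal Python sketch that rewrites each character of an address as a decimal or hexadecimal character reference. The function name `obfuscate` is hypothetical, and the rule used to choose between decimal and hexadecimal (even versus odd code points) is an assumption made for illustration, not a documented guarantee.

```python
def obfuscate(text):
    """Encode every character as a numeric character reference."""
    parts = []
    for char in text:
        code = ord(char)
        if code % 2 == 0:
            parts.append("&#%d;" % code)   # decimal reference
        else:
            parts.append("&#x%x;" % code)  # hexadecimal reference
    return "".join(parts)
```

A browser decodes the references back to the original address, so the link still works for readers while the literal address no longer appears in the HTML source.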
`--id-prefix=`*STRING*
: Specify a prefix to be added to all identifiers and internal links
in HTML and DocBook output, and to footnote numbers in Markdown
and Haddock output. This is useful for preventing duplicate
identifiers when generating fragments to be included in other pages.
`-T` *STRING*, `--title-prefix=`*STRING*
: Specify *STRING* as a prefix at the beginning of the title
that appears in the HTML header (but not in the title as it
appears at the beginning of the HTML body). Implies
`--standalone`.
`-c` *URL*, `--css=`*URL*
: Link to a CSS style sheet. This option can be used repeatedly to
include multiple files. They will be included in the order specified.
A stylesheet is required for generating EPUB. If none is
provided using this option (or the `stylesheet` metadata
field), pandoc will look for a file `epub.css` in the
user data directory (see `--data-dir`). If it is not
found there, sensible defaults will be used.
`--reference-doc=`*FILE*
: Use the specified file as a style reference in producing a
docx or ODT file.
Docx
: For best results, the reference docx should be a modified
version of a docx file produced using pandoc. The contents
of the reference docx are ignored, but its stylesheets and
document properties (including margins, page size, header,
and footer) are used in the new docx. If no reference docx
is specified on the command line, pandoc will look for a
file `reference.docx` in the user data directory (see
`--data-dir`). If this is not found either, sensible
defaults will be used.
To produce a custom `reference.docx`, first get a copy of
the default `reference.docx`: `pandoc
--print-default-data-file reference.docx >
custom-reference.docx`. Then open `custom-reference.docx`
in Word, modify the styles as you wish, and save the file.
For best results, do not make changes to this file other
than modifying the styles used by pandoc: [paragraph]
Normal, Body Text, First Paragraph, Compact, Title,
Subtitle, Author, Date, Abstract, Bibliography, Heading 1,
Heading 2, Heading 3, Heading 4, Heading 5, Heading 6,
Heading 7, Heading 8, Heading 9, Block Text, Footnote Text,
Definition Term, Definition, Caption, Table Caption,
Image Caption, Figure, Captioned Figure, TOC Heading;
[character] Default Paragraph Font, Body Text Char,
Verbatim Char, Footnote Reference, Hyperlink; [table]
Table.
ODT
: For best results, the reference ODT should be a modified
version of an ODT produced using pandoc. The contents of
the reference ODT are ignored, but its stylesheets are used
in the new ODT. If no reference ODT is specified on the
command line, pandoc will look for a file `reference.odt` in
the user data directory (see `--data-dir`). If this is not
found either, sensible defaults will be used.
To produce a custom `reference.odt`, first get a copy of
the default `reference.odt`: `pandoc
--print-default-data-file reference.odt >
custom-reference.odt`. Then open `custom-reference.odt` in
LibreOffice, modify the styles as you wish, and save the
file.
`--epub-cover-image=`*FILE*
: Use the specified image as the EPUB cover. It is recommended
that the image be less than 1000px in width and height. Note that
in a Markdown source document you can also specify `cover-image`
in a YAML metadata block (see [EPUB Metadata], below).
`--epub-metadata=`*FILE*
: Look in the specified XML file for metadata for the EPUB.
The file should contain a series of [Dublin Core elements].
For example:
<dc:rights>Creative Commons</dc:rights>
<dc:language>es-AR</dc:language>
By default, pandoc will include the following metadata elements:
`<dc:title>` (from the document title), `<dc:creator>` (from the
document authors), `<dc:date>` (from the document date, which should
be in [ISO 8601 format]), `<dc:language>` (from the `lang`
variable, or, if it is not set, the locale), and `<dc:identifier
id="BookId">` (a randomly generated UUID). Any of these may be
overridden by elements in the metadata file.
Note: if the source document is Markdown, a YAML metadata block
in the document can be used instead. See below under
[EPUB Metadata].
`--epub-embed-font=`*FILE*
: Embed the specified font in the EPUB. This option can be repeated
to embed multiple fonts. Wildcards can also be used: for example,
`DejaVuSans-*.ttf`. However, if you use wildcards on the command
line, be sure to escape them or put the whole filename in single quotes,
to prevent them from being interpreted by the shell. To use the
embedded fonts, you will need to add declarations like the following
to your CSS (see `--css`):
@font-face {
font-family: DejaVuSans;
font-style: normal;
font-weight: normal;
src:url("DejaVuSans-Regular.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: normal;
font-weight: bold;
src:url("DejaVuSans-Bold.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: italic;
font-weight: normal;
src:url("DejaVuSans-Oblique.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: italic;
font-weight: bold;
src:url("DejaVuSans-BoldOblique.ttf");
}
body { font-family: "DejaVuSans"; }
`--epub-chapter-level=`*NUMBER*
: Specify the header level at which to split the EPUB into separate
"chapter" files. The default is to split into chapters at level 1
headers. This option only affects the internal composition of the
EPUB, not the way chapters and sections are displayed to users. Some
readers may be slow if the chapter files are too large, so for large
documents with few level 1 headers, one might want to use a chapter
level of 2 or 3.
`--epub-subdirectory=`*DIRNAME*
: Specify the subdirectory in the OCF container that is to hold
the EPUB-specific contents. The default is `EPUB`. To put
the EPUB contents in the top level, use an empty string.
`--latex-engine=pdflatex`|`lualatex`|`xelatex`
: Use the specified LaTeX engine when producing PDF output.
The default is `pdflatex`. If the engine is not in your PATH,
the full path of the engine may be specified here.
`--latex-engine-opt=`*STRING*
: Use the given string as a command-line argument to the `latex-engine`.
If used multiple times, the arguments are provided with spaces between
them. Note that no check for duplicate options is done.
[Dublin Core elements]: http://dublincore.org/documents/dces/
[ISO 8601 format]: http://www.w3.org/TR/NOTE-datetime