Autogenerate a Quarto publications listing from Excel (xlsx → YAML → custom listing)

tool

quarto

Author

Gang He

Published

April 10, 2026

What problem does this solve?

Editing a long publications list as raw YAML is error-prone and awkward for researchers. Instead, import publications from a BibTeX file or maintain publication rows in Excel (publications.xlsx), run xlsx_to_yml.py to produce publications.yml, and let Quarto’s custom listing plus your own pub-listing.ejs and pub-listing.css render the page with categories, filters, links, and badges as your template defines.

In this repository the script lives at files/scripts/xlsx_to_yml.py, and _quarto.yml runs it under pre-render before each build when needed (see below).

Prerequisites

Python 3 and:
```
pip install openpyxl
```
Optional: install pyyaml; the script will try to validate the written YAML with PyYAML if available.
Place publications.xlsx in the project root (next to _quarto.yml). By default the script reads that file and writes publications.yml; you can pass different paths on the command line.
Copy pub-listing.ejs and pub-listing.css from the template repo (or author your own), and keep them where your listing .qmd (e.g. publications.qmd) expects—often alongside that page.

Step 1: Prepare the spreadsheet

The first row must be headers, and header text must match the table below exactly (case and spaces matter). Columns not in the mapping become lowercase hyphenated keys; sticking to the table avoids stray fields. For first time users, especially if you have a long list of publications, you can use bib file to create the spreadsheet using online tools, such as this one.

Excel column	YAML key	Notes
Section	`section`	Drives Quarto `include: section: "..."` blocks, e.g. `Selected Work` vs `Peer-reviewed Journal Paper`
Authors	`authors`	Markdown allowed (bold, `*` for corresponding author, etc.)
Year	`year`	Prefer a number; the script coerces to int when possible
Date	`date`	Optional; useful for sort or display
Title	`title`	Paper title
Paper Link	`path`	Primary URL (site post or external)
Journal	`journal`	Journal name
Volume	`volume`	Volume
Issue	`issue`	Issue
Pages	`pages`	Pages
DOI	`doi`	DOI
PDF	`pdf`	PDF URL
Preprint	`preprint`	Preprint URL
SharedIt	`sharedit`	Share / readcube-style link
Supplemental Information	`supplemental`	Supplementary materials
GitHub	`github`	Repository
Code	`code`	Code archive (e.g. Zenodo)
Data	`data`	Data URL
Highly Cited	`highlycited`	Flag; rendering depends on EJS/CSS
Hot Paper	`hotpaper`	Flag
Awards	`awards`	Awards text
Media Coverage	`mediacoverage`	Press links (e.g. `\|` inside the cell; parsed by template)
Invited Presentation	`invitedpresentation`	Invited talk, etc.
Categories	`categories`	List: separate with comma, semicolon, or pipe, e.g. `2025, selected, peer-reviewed`

Empty rows are skipped; rows that yield no fields after parsing are omitted from the YAML.

Step 2: Run the converter

From the project root (default publications.xlsx → publications.yml):

python files/scripts/xlsx_to_yml.py

Or pass explicit paths:

python files/scripts/xlsx_to_yml.py path/to/in.xlsx path/to/out.yml

Incremental runs: if the output file already exists, the script compares modification times of publications.xlsx and publications.yml and only regenerates when Excel is newer, avoiding redundant writes. To force a rebuild:

python files/scripts/xlsx_to_yml.py --force

Step 3: Hook into Quarto (`pre-render`)

Under project in _quarto.yml, add pre-render so every quarto render / quarto preview runs the converter first (still subject to the “only if Excel changed” logic):

project:
  type: website
  pre-render:
    - python files/scripts/xlsx_to_yml.py

If your script sits at the repo root instead of files/scripts/, use python xlsx_to_yml.py.

Step 4: Wire the publications page (YAML + custom template)

In publications.qmd (filename up to you), set listing with contents: publications.yml, type: custom, template: pub-listing.ejs, and use include to split listings by section. A minimal pattern matching this site:

title: "Publications"
listing:
  - id: selected-work
    contents: publications.yml
    type: custom
    page-size: 100
    categories: numbered
    template: pub-listing.ejs
    include:
      section: "Selected Work"
  - id: peer-reviewed
    contents: publications.yml
    type: custom
    page-size: 100
    categories: numbered
    template: pub-listing.ejs
    include:
      section: "Peer-reviewed Journal Paper"
css: pub-listing.css

In the body, use fenced divs whose id matches each listing id:

## Selected Work

::: {#selected-work}
:::

## Other Peer-reviewed Journal Papers

::: {#peer-reviewed}
:::

This site also uses a Lua filter when raw HTML from EJS interacts badly with Quarto’s listing wrapper; if you see stray ::: in output, add filters: [files/scripts/remove-stray-divfence.lua] in publications.qmd.

Examples and where to get the templates

Live demo (academic template): pub-listing example
Template docs and files: quarto-academic-website-template README
This site: publications.html and publications.qmd in the repo

After copying pub-listing.ejs and pub-listing.css, adjust CSS for your brand.

Troubleshooting

New papers missing after build
Confirm Excel is saved and that publications.yml is not newer than publications.xlsx unless you intend to skip conversion; run once with --force, then quarto render.

YAML errors or empty listing
Verify headers match the table exactly. For tricky titles or authors, the script quotes YAML scalars; if it still fails, simplify the cell temporarily to isolate the issue.

Local test without editing _quarto.yml
Run python files/scripts/xlsx_to_yml.py, then quarto render publications.qmd (or a full site render).

Cheat sheet

Step	Action
1	Maintain `publications.xlsx` with the correct column headers
2	Run `python files/scripts/xlsx_to_yml.py` (or rely on `pre-render`)
3	Point `publications.qmd` at `publications.yml` + `pub-listing.ejs` + `pub-listing.css`
4	`quarto render` and publish

Day-to-day you only edit the spreadsheet; the site listing stays in sync with that data.