Thymeleaf + OpenHtmlToPdf: The Java Stack for PDF Generation
Generating PDFs on the server side is a problem nearly every enterprise application faces: invoices, reports, contracts, shipping labels. While many solutions exist, the combination of Thymeleaf for HTML templating and OpenHtmlToPdf for rendering produces a stack that is fast, predictable, and entirely JVM-native. No external processes, no browser dependencies, no native libraries.
This article walks through the complete rendering pipeline, the CSS constraints you need to understand, font embedding strategies, and performance characteristics drawn from our production experience at Doxnex.
The Rendering Pipeline
The pipeline consists of three stages: template resolution, HTML processing, and PDF rendering. Understanding each stage is critical for debugging issues and optimizing performance.
Stage 1: Thymeleaf Template Processing
Thymeleaf takes an HTML template with th:* attributes and a data model (a Java Map or object), and produces a fully resolved HTML string. This stage is purely text processing and is extremely fast, typically 5-15ms for a complex template.
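// Imports assume Thymeleaf 3.1 with Spring 6; the Spring 5 integration uses org.thymeleaf.spring5 instead.
import org.thymeleaf.context.Context;
import org.thymeleaf.spring6.SpringTemplateEngine;
import org.thymeleaf.templatemode.TemplateMode;
import org.thymeleaf.templateresolver.ClassLoaderTemplateResolver;

// Resolve templates from the classpath, e.g. templates/invoice-template.html
SpringTemplateEngine engine = new SpringTemplateEngine();
ClassLoaderTemplateResolver resolver = new ClassLoaderTemplateResolver();
resolver.setPrefix("templates/");
resolver.setSuffix(".html");
resolver.setTemplateMode(TemplateMode.HTML);
engine.setTemplateResolver(resolver);

// Populate the data model and render the template to an HTML string
Context context = new Context();
context.setVariable("invoice", invoiceData);
context.setVariable("lineItems", items);
String html = engine.process("invoice-template", context);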
Thymeleaf's natural templating approach means your templates are valid HTML that designers can preview in a browser. The th:text, th:each, and th:if attributes are processed server-side and stripped from the output.
Stage 2: HTML Parsing and Box Model Construction
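OpenHtmlToPdf parses the resolved markup into a DOM, builds a CSS box model from the stylesheet, and constructs a layout tree. The string passed to withHtmlContent must be well-formed XML (XHTML); the library does not run an HTML5 parser itself. If your templates produce loose HTML5, parse the output with jsoup first and hand the resulting W3C document to the builder via withW3cDocument. This is where CSS 2.1 constraints come into play. The library builds a page-aware layout, splitting content across pages according to your @page rules.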
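Because the renderer works from a parsed document, markup that is not well-formed XML (an unclosed <br>, an undeclared entity such as &nbsp;) is a common source of failures at this stage. Below is a minimal sketch of a pre-flight check that uses only the JDK's XML parser. MarkupPreflight and findXmlError are hypothetical names, not part of Thymeleaf or OpenHtmlToPdf, and the check says nothing about CSS support.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: checks that template output is well-formed XML before rendering. */
final class MarkupPreflight {

    /** Returns null if the markup parses as XML, otherwise the parser's error message. */
    static String findXmlError(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return null;
        } catch (SAXException e) {
            // Includes SAXParseException for mismatched tags, undeclared entities, etc.
            return String.valueOf(e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }
}
```

Calling MarkupPreflight.findXmlError(html) in a template unit test, or just before rendering in development, tends to give a clearer message than a failure deep inside the layout step.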
Stage 3: PDF Rendering with PDFBox
The layout tree is rendered into a PDF document using Apache PDFBox. Text is laid out using the embedded fonts, images are compressed and embedded, and the final binary PDF is written to an output stream.
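// Requires java.io.ByteArrayOutputStream, java.io.File,
// and com.openhtmltopdf.pdfboxout.PdfRendererBuilder
try (ByteArrayOutputStream os = new ByteArrayOutputStream()) {
    PdfRendererBuilder builder = new PdfRendererBuilder();
    builder.useFastMode();
    // baseUrl is used to resolve relative links to stylesheets, images, and fonts
    builder.withHtmlContent(html, baseUrl);
    builder.useFont(new File("fonts/Roboto-Regular.ttf"), "Roboto");
    builder.toStream(os);
    builder.run(); // performs layout and writes the PDF to the stream
    byte[] pdfBytes = os.toByteArray();
}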
CSS 2.1 Constraints: What Works and What Does Not
This is the area that causes the most frustration for developers coming from modern web development. OpenHtmlToPdf does not render through a browser engine. It implements a subset of CSS focused on print layout.
Supported and Reliable
- Block layout: display: block, display: inline, display: inline-block, display: table
- Floats: float: left and float: right for column layouts
- Positioning: position: relative and position: absolute (within a positioned ancestor)
- Paged media: @page, page-break-before, page-break-after, page-break-inside: avoid
- Borders, margins, padding: Full box model support
- CSS counters: For page numbers in headers and footers
Not Supported
- Flexbox and Grid: Use tables or floats instead
- CSS variables: Preprocess with a tool or use Thymeleaf expressions
- Media queries: Not applicable in PDF context
- JavaScript: No script execution at all
- Advanced selectors: Some CSS3 pseudo-classes may not work
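For the CSS variables case, one way to "preprocess with a tool" is a small substitution pass over the stylesheet before it reaches the renderer. The sketch below is an assumption about how that might look, not something OpenHtmlToPdf provides: CssVariableInliner is a hypothetical class, the variable values come from a Java map rather than from :root declarations, and nested var() calls are not handled.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical preprocessor: replaces var(--name) and var(--name, fallback) with literal values. */
final class CssVariableInliner {

    // Group 1 is the variable name, group 2 the optional fallback value.
    private static final Pattern VAR_CALL =
            Pattern.compile("var\\(\\s*(--[A-Za-z0-9_-]+)\\s*(?:,\\s*([^)]*?)\\s*)?\\)");

    static String inline(String css, Map<String, String> variables) {
        Matcher matcher = VAR_CALL.matcher(css);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = variables.get(matcher.group(1));
            if (value == null) {
                value = matcher.group(2); // fall back to the inline default, if any
            }
            if (value == null) {
                throw new IllegalArgumentException("Undefined CSS variable: " + matcher.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Failing loudly on an undefined variable is a deliberate choice here: a missing colour is easier to catch as an exception than as a silently unstyled PDF.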
Page Headers and Footers
OpenHtmlToPdf supports running headers and footers through CSS paged media. You can include page numbers, document titles, and dates.
@page {
    size: A4;
    margin: 2cm;
    @bottom-center {
        content: "Page " counter(page) " of " counter(pages);
        font-size: 9pt;
        color: #999;
    }
}
Font Embedding: Consistent Rendering Everywhere
PDF documents embed their fonts so they render identically on every device. With OpenHtmlToPdf, you register TrueType or OpenType fonts and reference them in your CSS.
@font-face {
    font-family: 'Roboto';
    src: url('fonts/Roboto-Regular.ttf');
    font-weight: 400;
    font-style: normal;
}
@font-face {
    font-family: 'Roboto';
    src: url('fonts/Roboto-Bold.ttf');
    font-weight: 700;
    font-style: normal;
}
body {
    font-family: 'Roboto', sans-serif;
    font-size: 11pt;
}
Register each font file with the builder. The font subsetting feature in OpenHtmlToPdf includes only the glyphs used in the document, keeping file sizes small. A typical invoice PDF with embedded fonts weighs 50-80 KB.
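Registering each font file on every render can mean re-reading the same bytes from disk each time. The following is a minimal sketch of reading each file once and handing out fresh in-memory streams. FontCache and supplierFor are hypothetical names; the class uses only the JDK and does not call OpenHtmlToPdf itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical helper: keeps font bytes in memory so renders do not touch the disk. */
final class FontCache {

    private final Map<Path, byte[]> bytesByPath = new ConcurrentHashMap<>();

    /** Returns a supplier that opens a new stream over the cached bytes on every call. */
    Supplier<InputStream> supplierFor(Path fontFile) {
        Path key = fontFile.toAbsolutePath().normalize();
        byte[] bytes = bytesByPath.computeIfAbsent(key, FontCache::readAll);
        return () -> new ByteArrayInputStream(bytes);
    }

    private static byte[] readAll(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read font file " + path, e);
        }
    }
}
```

The builder also has a useFont overload that takes a stream supplier (FSSupplier<InputStream>) rather than a File, so adapting this should be a one-line lambda such as builder.useFont(() -> cache.supplierFor(path).get(), "Roboto"). Check the exact signature against the library version you use.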
Performance Characteristics
We benchmarked the stack on an 8-core server with 16 GB RAM, generating a typical two-page invoice with a logo, table of line items, and three embedded fonts:
- Template processing: 8ms average
- PDF rendering: 120ms average
- Total end-to-end: 140ms average
- Memory per render: 25-40 MB
- Throughput: 180 documents/second (with connection pooling and thread pool)
Compare this to headless Chrome, which typically takes 1-3 seconds per document and consumes 200-500 MB per render instance. The JVM-native approach offers an order-of-magnitude improvement in both speed and resource efficiency.
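The memory-per-render figure above can also inform how large the render thread pool should be. This is a rough sketch under two assumptions: rendering is CPU-bound, so more threads than cores gains little, and the upper end of the per-render range is a usable peak. RenderPoolSizing and its parameter names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper: caps concurrent renders by both core count and a heap budget. */
final class RenderPoolSizing {

    /** Returns min(cores, heapBudgetMb / peakPerRenderMb), but never less than 1. */
    static int maxConcurrentRenders(int cores, long heapBudgetMb, long peakPerRenderMb) {
        if (cores < 1 || heapBudgetMb < 1 || peakPerRenderMb < 1) {
            throw new IllegalArgumentException("All arguments must be positive");
        }
        long byMemory = heapBudgetMb / peakPerRenderMb;
        return (int) Math.max(1, Math.min(cores, byMemory));
    }

    /** Builds a fixed pool sized for the current machine. */
    static ExecutorService newRenderPool(long heapBudgetMb, long peakPerRenderMb) {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(maxConcurrentRenders(cores, heapBudgetMb, peakPerRenderMb));
    }
}
```

With 8 cores, a 1024 MB budget, and 40 MB per render, the core count is the limit (8 threads); with only 200 MB set aside for rendering, memory becomes the limit (5 threads).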
Production Tips
- Use useFastMode() on the builder unless you need advanced CSS features. It skips some layout calculations and improves speed by 20-30%.
- Pre-compile templates by calling engine.process() once at startup. Thymeleaf caches compiled templates, but the first call incurs parsing overhead.
- Pool font resources. Loading fonts from disk on every render is wasteful. Load them once and pass the File references to each builder.
- Set a render timeout. Malformed HTML or extremely large tables can cause the renderer to hang. Wrap the render call in a thread with a timeout.
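The render timeout tip can be sketched with the JDK's ExecutorService and Future. TimedRender and renderWithTimeout are hypothetical names, and the Callable stands in for whatever code runs the Thymeleaf and OpenHtmlToPdf steps. One caveat: cancelling only interrupts the worker thread, so a render that never checks for interruption may keep occupying that thread.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a render task on a pool and gives up after a deadline. */
final class TimedRender {

    static byte[] renderWithTimeout(ExecutorService pool, Callable<byte[]> renderTask, Duration timeout) {
        Future<byte[]> future = pool.submit(renderTask);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // requests interruption of the worker thread
            throw new IllegalStateException("PDF render exceeded " + timeout, e);
        } catch (ExecutionException e) {
            // Unwrap so callers see the renderer's own exception as the cause.
            throw new IllegalStateException("PDF render failed", e.getCause());
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            throw new IllegalStateException("Interrupted while waiting for PDF render", e);
        }
    }
}
```

A call might look like TimedRender.renderWithTimeout(pool, () -> renderInvoice(data), Duration.ofSeconds(10)), where renderInvoice is a placeholder for your own render method.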
The Thymeleaf + OpenHtmlToPdf stack powers the core rendering engine at Doxnex, generating thousands of documents per hour with consistent quality and sub-200ms latency.
Frequently Asked Questions
Can OpenHtmlToPdf handle CSS3 features like Flexbox or Grid?
No. OpenHtmlToPdf supports CSS 2.1 with some CSS3 extensions for paged media. Flexbox and Grid are not supported. Use traditional layout techniques like floats, tables, and inline-block for positioning. The CSS paged media module (@page, page-break-before, etc.) is well-supported and is the primary advantage of this approach.
How do I embed custom fonts in the generated PDF?
Place TTF or OTF font files in your classpath, declare them with @font-face in your CSS, and register each font file with the OpenHtmlToPdf builder using the useFont method, which takes a single font file and a family name. The fonts are then embedded directly into the PDF, ensuring consistent rendering on any device.
What is the performance of Thymeleaf + OpenHtmlToPdf compared to other solutions?
For a typical one-page document, Thymeleaf template processing takes 5-15ms and OpenHtmlToPdf rendering takes 50-200ms, resulting in sub-250ms total generation time. This is significantly faster than headless browser approaches which typically take 1-3 seconds. The JVM-native approach also uses far less memory, typically 20-50MB per render versus 200-500MB for Chrome headless.
Can I generate PDFs with dynamic images and charts?
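Yes. OpenHtmlToPdf supports images via standard img tags. For dynamic charts, generate SVG on the server side and embed it inline in the HTML. SVG rendering works well, but it is not part of the core artifact: add the openhtmltopdf-svg-support module and register its Batik-based drawer on the builder with useSVGDrawer. Avoid JavaScript-dependent chart libraries, as OpenHtmlToPdf does not execute JavaScript.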
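As an illustration of server-side chart generation, here is a minimal sketch that builds a bar chart as an SVG string with no charting library. SvgBarChart is a hypothetical class, and the colour, spacing, and lack of axes or labels are arbitrary simplifications.

```java
import java.util.Locale;

/** Hypothetical helper: renders non-negative values as a bare-bones SVG bar chart. */
final class SvgBarChart {

    static String render(double[] values, int width, int height) {
        if (values.length == 0) {
            throw new IllegalArgumentException("At least one value is required");
        }
        double max = 0;
        for (double v : values) {
            if (v < 0) {
                throw new IllegalArgumentException("Values must be non-negative");
            }
            max = Math.max(max, v);
        }
        double slot = (double) width / values.length; // horizontal space per bar
        StringBuilder svg = new StringBuilder();
        // Locale.ROOT keeps decimal points as '.' regardless of the server locale.
        svg.append(String.format(Locale.ROOT,
                "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"%d\" height=\"%d\" viewBox=\"0 0 %d %d\">",
                width, height, width, height));
        for (int i = 0; i < values.length; i++) {
            double barHeight = max == 0 ? 0 : values[i] / max * height;
            // Each bar uses 80% of its slot, leaving a 10% gap on either side.
            svg.append(String.format(Locale.ROOT,
                    "<rect x=\"%.1f\" y=\"%.1f\" width=\"%.1f\" height=\"%.1f\" fill=\"#4a6fa5\"/>",
                    i * slot + 0.1 * slot, height - barHeight, 0.8 * slot, barHeight));
        }
        svg.append("</svg>");
        return svg.toString();
    }
}
```

The resulting string can be placed in the model and emitted unescaped with th:utext, so the SVG element ends up inline in the HTML handed to the renderer.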