Apache PDFBox - Get Info
Operation Name
Apache PDFBox - Get Info
extractInfo
Description
Extracts metadata and structural details from a PDF document. This includes properties like author, title, number of pages, creation/modification dates, and file size.
Inputs
PDF File [Binary]: InputStream (Binary). The PDF document for which to extract information.
Output
Attributes: PdfBoxFileAttributes
A custom object containing metadata and structural details:

| Field | Type | Description |
| --- | --- | --- |
| numberOfPages | int | Total number of pages in the PDF |
| pdfSize | long | Size in bytes |
| title | String | Document title |
| author | String | Author metadata |
| subject | String | Subject metadata |
| keywords | String | Keywords metadata |
| creator | String | Tool or system used to create the PDF |
| producer | String | PDF producer metadata |
| creationDate | String | Date created (ISO-8601) |
| modificationDate | String | Date modified (ISO-8601) |
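
To make the shape of this output concrete, here is a minimal Java sketch of an object carrying a subset of the same fields. The class name `PdfInfoSketch`, its constructor, and the `toIso8601` helper are hypothetical and are not the connector's actual `PdfBoxFileAttributes` implementation. PDFBox's `PDDocumentInformation` exposes the creation and modification dates as `java.util.Calendar`; the conversion to an ISO-8601 string shown here (normalized to UTC) is an assumption about how such a value could be produced, not a statement of what the connector does.

```java
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical stand-in for PdfBoxFileAttributes. Field names and types follow
// the documented output; the constructor and helper are assumptions.
// subject, keywords, creator and producer are omitted for brevity.
final class PdfInfoSketch {
    final int numberOfPages;
    final long pdfSize;            // size in bytes
    final String title;
    final String author;
    final String creationDate;     // ISO-8601 string, or null when absent
    final String modificationDate; // ISO-8601 string, or null when absent

    PdfInfoSketch(int numberOfPages, long pdfSize, String title, String author,
                  Calendar created, Calendar modified) {
        this.numberOfPages = numberOfPages;
        this.pdfSize = pdfSize;
        this.title = title;
        this.author = author;
        this.creationDate = toIso8601(created);
        this.modificationDate = toIso8601(modified);
    }

    // Converts a Calendar to an ISO-8601 string in UTC; returns null for a missing date.
    static String toIso8601(Calendar date) {
        if (date == null) {
            return null;
        }
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME
                .format(date.toInstant().atOffset(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        // Build a fixed UTC timestamp; clear() zeroes the millisecond field.
        Calendar created = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        created.clear();
        created.set(2024, Calendar.JANUARY, 15, 10, 30, 0);

        PdfInfoSketch info =
                new PdfInfoSketch(12, 48_213L, "Sample", "Jane Doe", created, null);
        System.out.println(info.numberOfPages + " pages, created " + info.creationDate);
    }
}
```

Metadata fields in a PDF are optional, so a consumer of this output should be prepared for missing string values; the sketch represents them as `null`, which is also an assumption.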
MuleSoft Flow Example
Here's how to call this operation in a MuleSoft flow:
Notes
The operation does not modify the PDF; it only reads metadata.
Ideal for auditing, indexing, or validating PDFs before further processing.
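
As an illustration of the validation use case, the sketch below checks the extracted values before a document is passed on. The class `PdfPreflight`, the method `problems`, and the limits are hypothetical and would be chosen per application; only the field names `numberOfPages`, `pdfSize`, and `title` come from the output described on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-processing check built on the extracted attributes.
final class PdfPreflight {

    // Returns human-readable problems; an empty list means the PDF passes.
    static List<String> problems(int numberOfPages, long pdfSize, String title,
                                 int maxPages, long maxBytes) {
        List<String> found = new ArrayList<>();
        if (numberOfPages <= 0) {
            found.add("document has no pages");
        }
        if (numberOfPages > maxPages) {
            found.add("too many pages: " + numberOfPages);
        }
        if (pdfSize > maxBytes) {
            found.add("file too large: " + pdfSize + " bytes");
        }
        if (title == null || title.trim().isEmpty()) {
            found.add("missing title metadata");
        }
        return found;
    }

    public static void main(String[] args) {
        // Illustrative limits only: 500 pages, 10 MiB.
        List<String> issues = problems(12, 48_213L, "Sample", 500, 10L * 1024 * 1024);
        System.out.println(issues.isEmpty() ? "OK" : String.join("; ", issues));
    }
}
```

Returning a list of problems rather than a single boolean keeps the result useful for auditing, since every failed check is reported at once.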
Underlying Application Interface: