Apache PDFBox - Get Info
Operation Name
Apache PDFBox - Get Info
extractInfo
Description
Extracts metadata and structural details from a PDF document. This includes properties like author, title, number of pages, creation/modification dates, and file size.
Inputs
PDF File [Binary]: InputStream (Binary). The PDF document for which to extract information.
Output
Attributes:
PdfBoxFileAttributes: a custom object containing metadata and structural details.

Field               Type     Description
numberOfPages       int      Total number of pages in the PDF
pdfSize             long     Size in bytes
title               String   Document title
author              String   Author metadata
subject             String   Subject metadata
keywords            String   Keywords metadata
creator             String   Tool or system used to create the PDF
producer            String   PDF producer metadata
creationDate        String   Date created (ISO-8601)
modificationDate    String   Date modified (ISO-8601)
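
To make the shape of the output concrete, here is a minimal Java sketch of an object carrying the fields listed above. The record name `PdfAttributesSketch` and the helper `toIso8601` are hypothetical and are not the connector's actual `PdfBoxFileAttributes` class. The sketch assumes Java 16 or later (for record syntax) and assumes the date strings are produced from `java.util.Calendar` values, which is the type PDFBox's `PDDocumentInformation` returns for creation and modification dates. How the connector really formats them is not stated in this page.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical stand-in for the operation's output object. Field names and
// types follow the documented fields; the real class may be shaped differently.
record PdfAttributesSketch(
        int numberOfPages, long pdfSize, String title, String author,
        String subject, String keywords, String creator, String producer,
        String creationDate, String modificationDate) {

    // Formats a Calendar as an ISO-8601 timestamp with a UTC offset.
    // Returns null for null input, since PDF metadata entries are optional.
    static String toIso8601(Calendar cal) {
        if (cal == null) {
            return null;
        }
        return OffsetDateTime
                .ofInstant(cal.toInstant(), cal.getTimeZone().toZoneId())
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        // Illustrative values only; not taken from a real document.
        Calendar created = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        created.clear();
        created.set(2024, Calendar.MARCH, 15, 10, 30, 0);

        PdfAttributesSketch info = new PdfAttributesSketch(
                12, 204_800L, "Quarterly Report", "Jane Doe", null, null,
                "Writer", "Example Producer", toIso8601(created), null);

        System.out.println(info.creationDate()); // prints 2024-03-15T10:30:00Z
    }
}
```

Fields such as `subject` or `modificationDate` are left as `null` in the sample because a PDF is not required to carry every metadata entry, so downstream logic should probably tolerate missing values.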
MuleSoft Flow Example
Here's how to call this operation in a MuleSoft flow:

Notes
The operation does not modify the PDF; it only reads metadata.
Ideal for auditing, indexing, or validating PDFs before further processing.
Underlying Application Interface:
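
As an illustration of validating a PDF before further processing, the following Java sketch checks a few of the extracted attributes against limits. Everything here is an assumption for demonstration: the class `PdfGateSketch`, the method `check`, the rules, and the thresholds are hypothetical and are not part of the connector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-processing gate built on values the operation reports
// (numberOfPages, pdfSize, title). The rules and limits are illustrative.
final class PdfGateSketch {

    private PdfGateSketch() {
    }

    // Returns a list of human-readable problems; an empty list means the
    // document passes every check.
    static List<String> check(int numberOfPages, long pdfSize, String title,
                              int maxPages, long maxBytes) {
        List<String> problems = new ArrayList<>();
        if (numberOfPages <= 0) {
            problems.add("PDF has no pages");
        }
        if (numberOfPages > maxPages) {
            problems.add("too many pages: " + numberOfPages);
        }
        if (pdfSize > maxBytes) {
            problems.add("file too large: " + pdfSize + " bytes");
        }
        if (title == null || title.trim().isEmpty()) {
            problems.add("missing title");
        }
        return problems;
    }

    public static void main(String[] args) {
        // A 250-page document with no title, checked against a 100-page limit.
        List<String> problems = check(250, 5_000_000L, null, 100, 10_000_000L);
        problems.forEach(System.out::println);
    }
}
```

Returning a list of problems rather than failing on the first one lets an auditing or indexing step log every issue for a document in a single pass.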