PDFtoolkit VCL
Edit, enhance, secure, merge, split, view, print PDF and AcroForms documents
Compatibility: Delphi, C++Builder

How To Extract XMP Metadata of a PDF Document

Extract metadata that others might miss.
By V. Subhash

Metadata is data about data. In a PDF, document properties such as the title, subject, and keywords are metadata. Applications may also embed additional metadata that follows Adobe's XMP (Extensible Metadata Platform) specification.

PDFtoolkit provides the TgtPDFDocument.GetXMLMetadata() method to retrieve this metadata. The XMP specification requires that the metadata be stored as XML (eXtensible Markup Language), so the method returns it as an XML string.

Here is how you can extract the metadata.

{
 This program shows how to retrieve the XML metadata
 of a PDF document and save it to a file.
}
program TgtCustomPDFDocument_GetXMLMetadata;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ShellApi,
  gtPDFDoc, gtCstPDFDoc;
var
  gtPDFDocument1: TgtPDFDocument;
  strXMLData: String;
  strXMLDataList: TStringList;
begin

  // Create a document object
  gtPDFDocument1 := TgtPDFDocument.Create(Nil);

  try
    // Load a document
    gtPDFDocument1.LoadFromFile('sample_doc.pdf');

    // Check if the document was loaded successfully
    if gtPDFDocument1.IsLoaded then
      begin
        // Obtain XML metadata
        strXMLData := gtPDFDocument1.GetXMLMetadata;

        // Write XML metadata to a text file; the try..finally
        // block frees the list even if saving fails
        strXMLDataList := TStringList.Create;
        try
          strXMLDataList.Add(strXMLData);
          strXMLDataList.SaveToFile('sample_doc_pdf.xml');
        finally
          strXMLDataList.Free;
        end;

        // Open the XML file in its associated application
        // (the last argument, 1, is SW_SHOWNORMAL)
        ShellExecute(0, 'open',
                     'sample_doc_pdf.xml', nil, nil, 1);
      end
    else
      Writeln('Sorry, I could not load sample_doc.pdf.');
  except
    on Err:Exception do
      begin
        Writeln('Sorry, an exception was raised. ');
        Writeln(Err.Classname + ': ' + Err.Message);
      end;
  end;

  // Free resources
  gtPDFDocument1.Reset;
  // Destroy PDF document object
  FreeAndNil(gtPDFDocument1);

end.

[Screenshot: XML metadata extracted from a PDF document]
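
Because the extracted metadata should be ordinary XML, individual XMP properties can be read from it once you have the string. The following is a minimal sketch and not part of PDFtoolkit: ExtractElementText is a hypothetical helper, and the sample packet is hand-written for illustration rather than taken from a real document. It relies on plain string search, so it only handles properties stored as simple elements. XMP also allows properties to be written as attributes or nested structures, and for those a real XML parser would be a better fit.

```pascal
program XmpElementSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not a PDFtoolkit function): returns the text
// between <TagName> and </TagName>, or an empty string if the element
// is not found.
function ExtractElementText(const Xml, TagName: string): string;
var
  OpenTag, CloseTag: string;
  StartPos, EndPos: Integer;
begin
  Result := '';
  OpenTag := '<' + TagName + '>';
  CloseTag := '</' + TagName + '>';

  StartPos := Pos(OpenTag, Xml);
  if StartPos = 0 then
    Exit;
  // Move past the opening tag to the first character of the content
  StartPos := StartPos + Length(OpenTag);

  EndPos := Pos(CloseTag, Xml);
  if EndPos < StartPos then
    Exit;

  Result := Trim(Copy(Xml, StartPos, EndPos - StartPos));
end;

const
  // Hand-written sample packet, for illustration only. In a real
  // program this string would come from GetXMLMetadata.
  SampleXmp =
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">' +
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">' +
    '<rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">' +
    '<pdf:Producer>Example Producer 1.0</pdf:Producer>' +
    '</rdf:Description>' +
    '</rdf:RDF>' +
    '</x:xmpmeta>';

begin
  // Prints: Example Producer 1.0
  Writeln(ExtractElementText(SampleXmp, 'pdf:Producer'));
end.
```

In the extraction program above, you could pass strXMLData to such a helper instead of the sample constant. An empty result then means either that the property is absent or that it is stored in a form this simple search does not recognize.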

© 2002-2013 Gnostice Information Technologies Private Limited. All rights reserved.