PDFtoolkit VCL
Edit, enhance, secure, merge, split, view, print PDF and AcroForms documents
Compatibility: Delphi, C++Builder

How To Extract XMP Metadata of a PDF Document

Extract metadata that others might miss.
By V. Subhash

Metadata is data about data. In a PDF document, properties such as the title, subject, and keywords are metadata. Applications may also embed additional metadata in the document, following the Adobe XMP (Extensible Metadata Platform) specification.

PDFtoolkit provides the method TgtPDFDocument.GetXMLMetadata() to retrieve this metadata. The XMP specification requires that the metadata be stored as XML (eXtensible Markup Language), so the method returns it as an XML string.

Here is how you can extract the metadata.

{
 This program shows how to retrieve the XML metadata
 of a PDF document.
}
program TgtCustomPDFDocument_GetXMLMetadata;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ShellApi,
  gtPDFDoc, gtCstPDFDoc;
var
  gtPDFDocument1: TgtPDFDocument;
  strXMLData: String;
  strXMLDataList: TStringList;
begin

  // Create a document object
  gtPDFDocument1 := TgtPDFDocument.Create(Nil);

  try
    // Load a document
    gtPDFDocument1.LoadFromFile('sample_doc.pdf');

    // Check if the document was loaded successfully
    if gtPDFDocument1.IsLoaded then
      begin
        // Obtain the XML metadata as a string
        strXMLData := gtPDFDocument1.GetXMLMetadata;

        // Write the XML metadata to a text file
        strXMLDataList := TStringList.Create;
        try
          strXMLDataList.Add(strXMLData);
          strXMLDataList.SaveToFile('sample_doc_pdf.xml');
        finally
          // Free the list even if saving the file fails
          strXMLDataList.Free;
        end;

        // Launch the XML text file in its default viewer (1 = SW_SHOWNORMAL)
        ShellExecute(0, 'open',
                     'sample_doc_pdf.xml', nil, nil, 1);
      end
    else
      Writeln('Sorry, I could not load sample_doc.pdf.');
  except
    on Err:Exception do
      begin
        Writeln('Sorry, an exception was raised. ');
        Writeln(Err.Classname + ': ' + Err.Message);
      end;
  end;

  // Free resources
  gtPDFDocument1.Reset;
  // Destroy PDF document object
  FreeAndNil(gtPDFDocument1);

end.

[Screenshot: XML metadata extracted from a PDF document]
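
Once the XML metadata is in a string, you may want to read individual values from it. The following is a minimal sketch of one way to do that, and it is not part of PDFtoolkit. ExtractElementText is a hypothetical helper written for this example, and the sample XMP fragment is made up for illustration. It is assumed here that the property appears as a simple element such as <pdf:Producer>...</pdf:Producer>. XMP also allows simple properties to be written as attributes, and values such as dc:title are nested inside rdf:Alt and rdf:li elements, so a real XML parser is the more robust choice for anything beyond a quick look.

```pascal
program ReadXmpElement;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not part of PDFtoolkit): returns the text between
// <TagName> and </TagName>, or an empty string if the element is absent.
// This is a plain substring search, not an XML parser.
function ExtractElementText(const XmlText, TagName: string): string;
var
  OpenTag, CloseTag: string;
  StartPos, EndPos: Integer;
begin
  Result := '';
  OpenTag := '<' + TagName + '>';
  CloseTag := '</' + TagName + '>';

  StartPos := Pos(OpenTag, XmlText);
  if StartPos = 0 then
    Exit;
  // Move past the opening tag to the first character of the content
  StartPos := StartPos + Length(OpenTag);

  // Look for the closing tag only in the text after the opening tag
  EndPos := Pos(CloseTag, Copy(XmlText, StartPos, MaxInt));
  if EndPos = 0 then
    Exit;

  Result := Trim(Copy(XmlText, StartPos, EndPos - 1));
end;

const
  // Illustrative XMP fragment, made up for this example
  SampleXmp =
    '<rdf:Description rdf:about="">' +
    '<xmp:CreatorTool>Example Writer</xmp:CreatorTool>' +
    '<pdf:Producer>Example PDF Library</pdf:Producer>' +
    '</rdf:Description>';

begin
  Writeln('Creator tool: ', ExtractElementText(SampleXmp, 'xmp:CreatorTool'));
  Writeln('Producer: ', ExtractElementText(SampleXmp, 'pdf:Producer'));
end.
```

In the program above, you could pass strXMLData to the helper instead of the sample constant. An empty result means only that no element with that exact tag was found. The value may still be present in attribute form.
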
© 2002-2010 Gnostice Information Technologies Private Limited. All rights reserved.