Newsletter, December 2008

How To Extract XMP Metadata of a PDF Document

Extract metadata that others might miss.

By V. Subhash

Metadata is data about data. In a PDF document, properties such as the title, subject, and keywords are metadata. In addition, applications may embed other metadata that follows the Adobe XMP (Extensible Metadata Platform) specification.

PDFtoolkit provides the method TgtPDFDocument.GetXMLMetadata() to retrieve this metadata. The XMP specification requires that the metadata be stored as XML (Extensible Markup Language), so what you get back is XML text that can be saved to a file or passed to an XML parser.
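
For orientation, an XMP packet embedded in a PDF generally looks something like the fragment below. This is a hand-written illustration of the usual XMP layout (an xpacket wrapper, an x:xmpmeta element, and RDF descriptions), not actual output from PDFtoolkit. The property values are made up, and real packets typically carry more properties and trailing padding.

```xml
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
        xmlns:xmp="http://ns.adobe.com/xap/1.0/">
      <!-- Hypothetical values, for illustration only -->
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Sample Document</rdf:li>
        </rdf:Alt>
      </dc:title>
      <pdf:Keywords>sample, metadata</pdf:Keywords>
      <xmp:CreatorTool>Example Authoring Tool</xmp:CreatorTool>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
```

Properties in namespaces such as dc, pdf, and xmp often mirror the standard document properties, while application-specific namespaces carry the extra metadata that a properties dialog may not show.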

The following console program shows how to extract the metadata and save it to an XML file.

{
 This program shows how to retrieve XML metadata
 of a PDF document.
}
program TgtCustomPDFDocument_GetXMLMetadata;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ShellApi,
  gtPDFDoc, gtCstPDFDoc;
var
  gtPDFDocument1: TgtPDFDocument;
  strXMLData: String;
  strXMLDataList: TStringList;
begin

  // Create a document object
  gtPDFDocument1 := TgtPDFDocument.Create(Nil);

  try
    // Load a document
    gtPDFDocument1.LoadFromFile('sample_doc.pdf');

    // Check if the document was loaded successfully
    if gtPDFDocument1.IsLoaded then
      begin
        // Obtain XML metadata
        strXMLData := gtPDFDocument1.GetXMLMetadata;

        // Write XML metadata to a text file
        strXMLDataList := TStringList.Create;
        try
          strXMLDataList.Add(strXMLData);
          strXMLDataList.SaveToFile('sample_doc_pdf.xml');
        finally
          // Release the string list even if saving fails
          strXMLDataList.Free;
        end;

        // Open the XML file in its associated application
        // (1 = SW_SHOWNORMAL)
        ShellExecute(0, 'open',
                     'sample_doc_pdf.xml', nil, nil, 1);
      end
    else
      Writeln('Sorry, I could not load sample_doc.pdf.');
  except
    on Err:Exception do
      begin
        Writeln('Sorry, an exception was raised. ');
        Writeln(Err.Classname + ': ' + Err.Message);
      end;
  end;

  // Free resources
  gtPDFDocument1.Reset;
  // Destroy PDF document object
  FreeAndNil(gtPDFDocument1);

end.
