I'm trying to read the appearance stream of a PDF annotation, using iTextSharp, and get the content text from the stream.
I'm using the following code:
    public String ExtractAnnotationText(PdfStream xObject)
    {
        PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
        ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
        PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
        byte[] contentByteArray = ContentByteUtils.GetContentBytesFromContentObject(xObject);
        processor.ProcessContent(contentByteArray, resources);
        return strategy.GetResultantText();
    }

and I call it like this:

    PRStream value = (PRStream)appearancesDictionary.GetAsStream(key);
    String text = ExtractAnnotationText(value);

[/Matrix, [1, 0, 0, 1, -28.7103, -643.893]]
[/BBox, [28.7103, 643.893, 597.85, 751.068]]
On this, the PDF specification states:
A resource dictionary shall be associated with a content stream in one of the following ways:
For a content stream that is the value of a page’s Contents entry (or is an element of an array that is the value of that entry), the resource dictionary shall be designated by the page dictionary’s Resources or is inherited, as described under 7.7.3.4, "Inheritance of Page Attributes," from some ancestor node of the page object.
For other content streams, a conforming writer shall include a Resources entry in the stream's dictionary specifying the resource dictionary which contains all the resources used by that content stream. This shall apply to content streams that define form XObjects, patterns, Type 3 fonts, and annotation.
PDF files written obeying earlier versions of PDF may have omitted the Resources entry in all form XObjects and Type 3 fonts used on a page. All resources that are referenced from those forms and fonts shall be inherited from the resource dictionary of the page on which they are used. This construct is obsolete and should not be used by conforming writers.
(section 7.8.3 - Resource Dictionaries - of ISO 32000-1)
Thus, your example either is a case of that third option (the resources are inherited from the page on which the annotation is used), or its content simply needs no resources at all, or the file is simply broken.
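
If it is the third option, one way to cope is to fall back to the page's resources whenever the appearance stream has no Resources entry of its own. The following is only a sketch under that assumption, not a tested solution for your file: the class AppearanceTextSketch, the helper ResolveResources and the pageResources parameter are names made up for illustration, and the empty-dictionary fallback only helps if the stream really references no named resources.

```csharp
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class AppearanceTextSketch
{
    // Hypothetical helper: chooses the resource dictionary for a
    // non-page content stream such as an annotation appearance.
    public static PdfDictionary ResolveResources(PdfStream xObject, PdfDictionary pageResources)
    {
        // Regular case: the stream carries its own /Resources entry.
        PdfDictionary own = xObject.GetAsDict(PdfName.RESOURCES);
        if (own != null)
            return own;

        // Obsolete construct: inherit from the page the stream is used on.
        if (pageResources != null)
            return pageResources;

        // Nothing available: pass an empty dictionary. This is only enough
        // when the content references no fonts or other named resources.
        return new PdfDictionary();
    }

    // Same extraction as in the question, but with the fallback applied.
    public static string ExtractAnnotationText(PdfStream xObject, PdfDictionary pageResources)
    {
        PdfDictionary resources = ResolveResources(xObject, pageResources);
        ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
        PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
        byte[] contentBytes = ContentByteUtils.GetContentBytesFromContentObject(xObject);
        processor.ProcessContent(contentBytes, resources);
        return strategy.GetResultantText();
    }
}
```

The page resources could be obtained with something like reader.GetPageN(pageNumber).GetAsDict(PdfName.RESOURCES), where reader is the PdfReader for the file and pageNumber is the page that holds the annotation.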