Darren Darren - 27 days ago 24
C# Question

Flattening PDF without /AcroForm fields with /Annot Widget

I'm struggling with flattening using iTextSharp 5.5.10. This code represent a variety of efforts to flatten this PDF following code found on SO and Google searches.

void Process()
{
PdfReader reader = new PdfReader(mInFileName);
// 1 type of fields
AcroFields formFields = reader.AcroFields;
int countAcroFields = formFields.Fields.Count;
// 2 another type of fields
PRAcroForm form = reader.AcroForm;
List<PRAcroForm.FieldInformation> fieldList = form.Fields;
// 3 i think this is the same as 2
PdfDictionary acroForm = reader.Catalog.GetAsDict(PdfName.ACROFORM);
PdfArray acroFields = (PdfArray)PdfReader.GetPdfObject(acroForm.Get(PdfName.FIELDS), acroForm);
// 4 another type of fields
XfaForm xfaForm = formFields.Xfa;
bool flatten = false;

if (countAcroFields > 0)
{
//No fields found
}
else if (fieldList.Count > 0)
{
//No fields found
}
else if (xfaForm.XfaPresent == true)
{
//No xfaForms found
ReadXfa(reader);
}
else
{
// Yes, there are annotations and this code extracts them but does NOT flatten them
PdfDictionary page = reader.GetPageN(1);
PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);
if ( annotsArray == null)
return;
else
{
List<string> namedFieldToFlatten = new List<string>();
foreach (var item in annotsArray)
{
var annot = (PdfDictionary)PdfReader.GetPdfObject(item);

var type = PdfReader.GetPdfObject(annot.Get(PdfName.TYPE)).ToString(); //Expecting be /Annot
var subtype = PdfReader.GetPdfObject(annot.Get(PdfName.SUBTYPE)).ToString(); //Expecting be /Widget
var fieldType = PdfReader.GetPdfObject(annot.Get(PdfName.FT)).ToString(); //Expecting be /Tx

if (annot.Get(PdfName.TYPE).Equals(PdfName.ANNOT) &&
annot.Get(PdfName.SUBTYPE).Equals(PdfName.WIDGET))
{
if (annot.Get(PdfName.FT).Equals(PdfName.TX))
{
flatten = true;
var textLabel = PdfReader.GetPdfObject(annot.Get(PdfName.T)).ToString(); //Name of textbox field
namedFieldToFlatten.Add(textLabel);
var fieldValue = PdfReader.GetPdfObject(annot.Get(PdfName.V)).ToString(); //Value of textbox
Console.WriteLine($"Found Label={textLabel} Value={fieldValue}");
}
}
}

if (flatten == true)
{
// Flatten the PDF [11/9/2016 15:10:06]
string foldername = Path.GetDirectoryName(mInFileName);
string basename = Path.GetFileNameWithoutExtension(mInFileName);
string outName = $"{foldername}\\{basename}_flat.pdf";
using (var fStream = new FileStream(outName, FileMode.Create))
{
//This totally removes the fields instead of flattening them
var stamper = new PdfStamper(reader, fStream) { FormFlattening = true, FreeTextFlattening = true, AnnotationFlattening = true };
var stamperForm = stamper.AcroFields;
stamperForm.GenerateAppearances = true;
foreach (var item in namedFieldToFlatten)
{
stamper.PartialFormFlattening(item);
}
stamper.Close();
reader.Close();
}
}
}
}
}


Any tips on how to turn the "fields" into text as flatten does in Acrobat?

This is the PDF I'm trying to flatten.

This is what the output of this code.

Answer

The problem is that your form is broken. It indeed has /Annots, and those annotations are widget annotations that also have entries that are meant to be entries in a field dictionary, but there is no form in the PDF. Or rather: there is a form, but the /Fields array is empty.

As the /Fields array is empty, there are no fields to flatten. The widget annotations are removed and you don't see anything. This is not an iText bug: you need to fix the form before filling it out.

In Java, you'd fix the form like this:

PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.getCatalog();
PdfDictionary form = root.getAsDict(PdfName.ACROFORM);
PdfArray fields = form.getAsArray(PdfName.FIELDS);

PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    page = reader.getPageN(i);
    annots = page.getAsArray(PdfName.ANNOTS);
    for (int j = 0; j < annots.size(); j++) {
        fields.add(annots.getAsIndirectObject(j));
    }
}
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();

It should be fairly easy to port this to C#. I'm not a C# developer, but this is an (untested) attempt at making a port:

PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.Catalog;
PdfDictionary form = root.GetAsDict(PdfName.ACROFORM);
PdfArray fields = form.GetAsArray(PdfName.FIELDS);
PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.NumberOfPages; i++) {
    page = reader.GetPageN(i);
    annots = page.GetAsArray(PdfName.ANNOTS);
    for (int j = 0; j < annots.Size; j++) {
        fields.Add(annots.GetAsIndirectObject(j));
    }
}
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create));
stamper.Close();
reader.Close();

Once you've fixed the form like this, you'll be able to flatten it correctly.