sonicbhoc - 3 months ago
VB.NET Question

If I enable Smart Mode, a NullReferenceException is thrown

I'm trying to merge a group of PDFs together and ensure that they are compressed well, and that there are no duplicate resources. However, in my code, if I call SetSmartMode(true) on my writer, the first write to it will always result in a NullReferenceException.

Here is my (VB.NET) code:

Private Function CombinePdfBatch(pdfMetaData As IEnumerable(Of QuestDataSet.MetaDataRow), batchNumber As Integer,
                                 fileNamePrefix As String, outputDir As String) As String

    Dim outputFileName As String = Path.Combine(outputDir, fileNamePrefix & "_" & batchNumber & ".pdf")

    Using combinedPdf As New PdfDocument(New PdfWriter(outputFileName).SetSmartMode(True))

        'Make sure we close the underlying stream when we're done with the combination
        combinedPdf.SetCloseWriter(True)
        combinedPdf.SetCloseReader(False)
        combinedPdf.SetFlushUnusedObjects(False)
        combinedPdf.GetWriter().SetCompressionLevel(CompressionConstants.BEST_COMPRESSION)
        combinedPdf.GetWriter().SetCloseStream(True)
        combinedPdf.SetDefaultPageSize(New Geom.PageSize(630, 810))

        Dim merger As New PdfMerger(combinedPdf)

        For Each currentMD As QuestDataSet.MetaDataRow In pdfMetaData
            Using currentPDF As New PdfDocument(New PdfReader(Path.Combine(programPaths.Input, currentMD.ReceivedFilesRowByInputFileRelation.FileName)))
                currentPDF.SetCloseReader(True)
                currentPDF.SetCloseWriter(False)
                currentPDF.GetReader().SetCloseStream(True)

                currentMD.CombinedFileName = outputFileName
                currentMD.StartPage = combinedPdf.GetNumberOfPages() + 1
                merger.Merge(currentPDF, 1, currentPDF.GetNumberOfPages())
                currentMD.EndPage = combinedPdf.GetNumberOfPages()
            End Using
        Next
        merger.Close()
    End Using

    Return outputFileName
End Function


As soon as merger.Merge is called, a NullReferenceException is thrown. I've replaced that call with many other functions, but whenever anything is added to the PDF while the writer is in Smart Mode, it crashes.

If I disable Smart Mode, the PDFs are merged. But I need to reduce the size of these PDFs as much as possible without sacrificing too much quality. Since I know they all use the same font and share some stock images, I figured I'd combine them all in order to do so.

EDIT: Here's a stack trace since I love you guys:

System.NullReferenceException occurred
HResult=-2147467261
Message=Object reference not set to an instance of an object.
Source=itext.kernel
StackTrace:
at iText.Kernel.Pdf.PdfWriter.ByteStore.SerDic(PdfDictionary dic, Int32 level, ByteBufferOutputStream bb, IntHashtable serialized)
at iText.Kernel.Pdf.PdfWriter.ByteStore.SerObject(PdfObject obj, Int32 level, ByteBufferOutputStream bb, IntHashtable serialized)
at iText.Kernel.Pdf.PdfWriter.ByteStore..ctor(PdfStream str, IntHashtable serialized)
at iText.Kernel.Pdf.PdfWriter.SmartCopyObject(PdfObject obj)
at iText.Kernel.Pdf.PdfWriter.CopyObject(PdfObject obj, PdfDocument document, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfObject.ProcessCopying(PdfDocument documentTo, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfArray.CopyContent(PdfObject from, PdfDocument document)
at iText.Kernel.Pdf.PdfWriter.CopyObject(PdfObject obj, PdfDocument document, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfObject.ProcessCopying(PdfDocument documentTo, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfDictionary.CopyContent(PdfObject from, PdfDocument document)
at iText.Kernel.Pdf.PdfWriter.CopyObject(PdfObject obj, PdfDocument document, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfObject.ProcessCopying(PdfDocument documentTo, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfObject.CopyTo(PdfDocument document, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfDictionary.CopyTo(PdfDocument document, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfDictionary.CopyTo(PdfDocument document, IList`1 excludeKeys, Boolean allowDuplicating)
at iText.Kernel.Pdf.PdfPage.CopyTo(PdfDocument toDocument, IPdfPageExtraCopier copier)
at iText.Kernel.Pdf.PdfDocument.CopyPagesTo(IList`1 pagesToCopy, PdfDocument toDocument, Int32 insertBeforePage, IPdfPageExtraCopier copier)
at iText.Kernel.Pdf.PdfDocument.CopyPagesTo(IList`1 pagesToCopy, PdfDocument toDocument, IPdfPageExtraCopier copier)
at iText.Kernel.Pdf.PdfDocument.CopyPagesTo(IList`1 pagesToCopy, PdfDocument toDocument)
at iText.Kernel.Utils.PdfMerger.Merge(PdfDocument from, IList`1 pages)
at iText.Kernel.Utils.PdfMerger.Merge(PdfDocument from, Int32 fromPage, Int32 toPage)
at QuestMonolithic.Process.CombinePdfBatch(IEnumerable`1 pdfMetaData, Int32 batchNumber, String fileNamePrefix, String outputDir) in C:\Users\cchrist\Documents\Visual Studio 2012\Projects\Quest_Monolithic\trunk\source\Process.vb:line 594
InnerException:

Answer

It's a known bug in the iText 7 .NET code, and a fix will be deployed soon. The SerDic() method, which is only called when copying in smart mode, retrieves the dictionary keys incorrectly in .NET: the result of ToArray() is discarded, so the keys variable never receives the actual keys, and using its entries results in null references.

If you want to fix it yourself, replace line 592 in itext.kernel.PdfWriter:

dic.KeySet().ToArray(keys);

with

keys = dic.KeySet().ToArray(keys);
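
To see why that one-line change matters, here is a minimal sketch in VB.NET that does not use iText at all. PortedToArray is a hypothetical stand-in for the ToArray helper, written under the assumption (implied by the fix above, not taken from the iText source) that the helper returns a new array and leaves the array passed to it untouched. The module name and the sample keys are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module ToArrayPitfall

    ' Hypothetical stand-in for a ported Java-style toArray(T[]) helper.
    ' Assumption: it returns a new array and does NOT fill the target array.
    Function PortedToArray(Of T)(source As ICollection(Of T), target As T()) As T()
        Dim result(source.Count - 1) As T
        source.CopyTo(result, 0)
        Return result
    End Function

    Sub Main()
        ' Made-up dictionary keys, standing in for dic.KeySet().
        Dim keySet As ICollection(Of String) = New List(Of String) From {"Type", "Length", "Filter"}

        ' Buggy pattern: the returned array is discarded,
        ' so every entry of keys is still Nothing.
        Dim keys(keySet.Count - 1) As String
        PortedToArray(keySet, keys)
        Console.WriteLine(keys(0) Is Nothing)   ' prints True

        ' Fixed pattern: keep the array that the helper returns.
        keys = PortedToArray(keySet, keys)
        Console.WriteLine(keys(0))              ' prints Type
    End Sub

End Module
```

In Java, Collection.toArray(T[]) fills the array it is given when that array is large enough, so ignoring the return value is harmless there. A helper that behaves like the sketch above would turn the same line into a source of null entries, which would match the NullReferenceException raised inside SerDic().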