Franko Franko - 19 days ago 4
HTML Question

iTextSharp HTML to PDF conversion - unable to change font

I'm creating some PDF documents with iTextSharp (5.5.7.0) from HTML in ASP.NET MVC5 application, but I'm unable to change the font. I've tried almost everything that I was able to find on SO or from some other resources.

Code for PDF generation is as follows:

public Byte[] GetRecordsPdf(RecordsViewModel model)
{
var viewPath = "~/Template/RecordTemplate.cshtml";
var renderedReport = RenderViewToString(viewPath, model);

FontFactory.RegisterDirectory(Environment.GetFolderPath(Environment.SpecialFolder.Fonts));

using (var ms = new MemoryStream())
{
using (var doc = new Document())
{
doc.SetPageSize(PageSize.A4.Rotate());

using (var writer = PdfWriter.GetInstance(doc, ms))
{
doc.Open();

using (var html = new MemoryStream(Encoding.Default.GetBytes(renderedReport)))
{
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, html, Encoding.Default);
}

doc.Close();
}
}

var bytes = ms.ToArray();
return bytes;
}
}


Actual HTML is contained in renderedReport string variable (I have strongly typed .cshtml file which I render using MVC Razor engine and then return HTML in string).

I've tried to register some specific fonts, but that didn't help. I've also tried to register all fonts on my machine (as shown in example above), but that also didn't help. The fonts were loaded I've checked that in debug mode.

CSS is embedded in HTML file (in heading, style tag) like this:

body {
font-size: 7px;
font-family: Comic Sans MS;
}


(for test, I've decided to use Comic Sans, because I can recognize it with ease, I'm more interested in Arial Unicode MS actually).

And I'm actually able to change the font with that font-family attribute from CSS, but only from fonts that are preloaded by iTextSharp by default - Times New Roman, Arial, Courier, and some other (Helvetica i think). When I change it to - Comic Sans, or some other that is not preloaded iTextSharp renders with default font (Arial I would say).

The reason why I need to change the font is because I have some Croatian characters in my rendered HTML (ČĆŠĐŽčćšđž) which are missing from PDF, and currently I think the main reason is - font.

What am I missing?

Answer

A couple of things to make this work.

First, XMLWorkerHelper doesn't use FontFactory by default, you need to use one of the overloads to ParseXHtml() that takes an IFontProvider. Both of those overloads require that you specify a Stream for a CSS file but you can just pass null if your CSS lives inside your HTML file. Luckily FontFactory has a static property that implements this that you can use called FontFactory.FontImp

//                                                                                 **This guy**
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, null, Encoding.UTF8, FontFactory.FontImp);

Second, I know that you said that you tried registering your entire font directory out of desperation but that can be a rather expensive call. If you can, always try to just register the fonts you need. Although optional, I also strongly recommend that you explicitly define the font's alias because fonts can have several names and they're not always what we think.

FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");

Third, and this might not affect you, but any tags not present in the HTML, even if they're logically implied, won't get styling applied to them from CSS. That sounds weird so to say it differently, if your HTML is just <p>Hello</p> and your CSS is body{font-size: 7px;}, the font size won't get applied because your HTML is missing the <body> tag.

Fourth, and this is optional, but usually its easier to specify your HTML and CSS separately from each other which I'll do in the example below.

Your code was 95% there so with just a couple of tweaks it should work. Instead of a view I'm just parsing raw HTML and CSS but you can modify as needed. Please do remember (and I think you know this) that iTextSharp cannot process ASP.Net, only HTML, so you need to make sure that your ASP.Net to HTML conversion process is sane.

//Sample HTML and CSS
var html = @"<body><p>Sva ljudska bića rađaju se slobodna i jednaka u dostojanstvu i pravima. Ona su obdarena razumom i sviješću i trebaju jedna prema drugima postupati u duhu bratstva.</p></body>";
var css = "body{font-size: 7px; font-family: Comic Sans MS;}";

//Register a single font
FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");

//Placeholder variable for later
Byte[] bytes;

using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        doc.SetPageSize(PageSize.A4.Rotate());

        using (var writer = PdfWriter.GetInstance(doc, ms)) {
            doc.Open();

            //Get a stream of our HTML
            using (var msHTML = new MemoryStream(Encoding.UTF8.GetBytes(html))) {

                //Get a stream of our CSS
                using (var msCSS = new MemoryStream(Encoding.UTF8.GetBytes(css))) {

                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, msCSS, Encoding.UTF8, FontFactory.FontImp);
                }
            }

            doc.Close();
        }
    }

    bytes = ms.ToArray();
}
Comments