Cris - 18 days ago
Perl Question

Printing large files using perl CGI

I need to print text in the browser from my Perl CGI application. The file I need to print can be large: 30 - 100 MB or more.
I have the following code:

my $lns = $_[0];
open(my $fh, '<:encoding(UTF-8)', $lns) or die "Could not open file '$lns' $!";
while (my $l = <$fh>) {
    chomp $l;
    print "<br>$l";
}


This works, but it is very slow when displaying a large file: a 30 MB file takes up to 15 minutes.
Is there a way to speed it up?
To clarify, the file starts to be displayed immediately, but the browser keeps loading and displaying new lines for 10 - 15 minutes for a 30 - 40 MB file.
The file is static and is not being modified.

Answer

You mentioned in the comments:

The input file is fixed text file, it won't change and won't be modified

Therefore, you should do the conversion from the input to the output format once and redirect visitors to the location of the already generated output file. Let the server software handle sending it.
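A minimal sketch of that approach, assuming the input is UTF-8 text. The sub names, the file paths, and the choice of a single <pre> element (instead of one <br> per line) are illustrative assumptions, not part of the original code:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Escape the characters that are special in HTML text content.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# One-time conversion: copy the text into a single <pre> element,
# so the browser does not have to create one <br> per line.
sub convert_to_html {
    my ($in_path, $out_path) = @_;
    open my $in,  '<:encoding(UTF-8)', $in_path
        or die "Cannot open '$in_path': $!";
    open my $out, '>:encoding(UTF-8)', $out_path
        or die "Cannot open '$out_path': $!";
    print {$out} qq{<!DOCTYPE html>\n<html><head><meta charset="utf-8">},
                 qq{<title>Output</title></head><body><pre>\n};
    while (my $line = <$in>) {
        print {$out} escape_html($line);
    }
    print {$out} "</pre></body></html>\n";
    close $out or die "Cannot close '$out_path': $!";
    close $in;
    return;
}

# The CGI response then consists only of a redirect to the generated file.
sub redirect_response {
    my ($url) = @_;
    return "Location: $url\r\n\r\n";
}

# Hypothetical usage (paths are placeholders):
#   run once, offline:
#     convert_to_html('/data/input.txt', '/var/www/html/static/output.html');
#   in the CGI script:
#     print redirect_response('/static/output.html');
```

With this split, the CGI script does no per-request file processing, and the web server serves the generated file like any other static document.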

Also, check how well the same browser deals with displaying the same content when it is loaded from the local file system.

Keep in mind that you seem not to be sending text/plain content. For some reason, you are sending text/html and inserting line breaks manually. Let's say each line is 512 bytes. For a 40 MB file, that's more than 80,000 nodes in the DOM. If each line is 80 bytes, we are talking about almost 525,000 nodes in the DOM. That may cause issues with the browser.
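As a quick check on those line counts, here is a small sketch of the arithmetic. The helper name line_count is made up for illustration, and 40 MB is taken as 40 * 1024 * 1024 bytes:

```perl
use strict;
use warnings;

# Number of whole lines that fit in a file of the given size.
sub line_count {
    my ($total_bytes, $bytes_per_line) = @_;
    return int($total_bytes / $bytes_per_line);
}

my $file_bytes = 40 * 1024 * 1024;    # 40 MB

for my $bytes_per_line (512, 80) {
    printf "%d bytes/line: %d lines\n",
        $bytes_per_line, line_count($file_bytes, $bytes_per_line);
}
# prints:
#   512 bytes/line: 81920 lines
#   80 bytes/line: 524288 lines
```

Each of those lines becomes at least one <br> element in the page, which is where the node counts come from.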

To test this, I created a file using:

#!/usr/bin/env perl

use strict;
use warnings;

my $CHARS_PER_LINE = 72;
my $text = 'x' x $CHARS_PER_LINE;

for my $i (1 .. 40 * 1024 * 1024 / $CHARS_PER_LINE) {
    print "<br>$text\n";
}

This gave me the following file:

$ ls -lh ytt.html
-rw-r--r--  1 abc abc    43M Nov 18 13:31 ytt.html

Then I loaded this document in Firefox from the SSD on my MacBook Pro with 16 GB of memory. CPU usage spiked to 100% for almost 30 seconds, Firefox became unresponsive, and it allocated an extra 2.5 GB of memory. If the computer you are using does not have a lot of spare memory, it will have to swap to disk. In that case, I can easily envision scenarios that cause serious usability issues.

Opening the same file as plain text was less painful but not great either.
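If the content does have to go through the CGI script, one way to send it as plain text is sketched below. It assumes the file is already UTF-8 on disk, so the bytes can be copied unchanged in large chunks instead of line by line. The sub name, the chunk size, and the path in the usage comment are illustrative assumptions:

```perl
use strict;
use warnings;

# Send a file as text/plain by copying raw bytes in fixed-size chunks.
sub send_plain_text {
    my ($path, $out) = @_;
    open my $in, '<:raw', $path or die "Cannot open '$path': $!";
    binmode $out, ':raw';
    print {$out} "Content-Type: text/plain; charset=UTF-8\r\n\r\n";
    my $buf;
    while (1) {
        my $n = read($in, $buf, 64 * 1024);
        die "Read error on '$path': $!" unless defined $n;
        last if $n == 0;    # end of file
        print {$out} $buf;
    }
    close $in;
    return;
}

# Hypothetical usage inside the CGI script:
#   send_plain_text('/data/input.txt', \*STDOUT);
```

This avoids per-line decoding and the <br> markup entirely, although the browser still has to render the whole file.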

A question you have to answer is whether there is any good reason to display this document in the browser, or whether it should simply be available as a download.

In any case, you should also ensure that your web server software applies compression to text files so you can serve this file using about 10% of the bandwidth that would be required otherwise.
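To get a rough idea of how well your particular file compresses, you could gzip a copy and compare sizes. This is only a measurement sketch using the core IO::Compress::Gzip module. It is not the server-side compression itself, which is configured in the web server, and the sub name and paths are placeholders:

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Compress $in_path into $out_path and return both sizes in bytes.
sub gzip_file {
    my ($in_path, $out_path) = @_;
    gzip($in_path => $out_path)
        or die "gzip failed: $GzipError";
    return (-s $in_path, -s $out_path);
}

# Hypothetical usage:
#   my ($orig, $gz) = gzip_file('/data/input.txt', '/tmp/input.txt.gz');
#   printf "%d -> %d bytes (%.1f%% of original)\n",
#       $orig, $gz, 100 * $gz / $orig;
```

The actual ratio depends on how repetitive the text is, so it is worth measuring on the real file.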