GoldenNewby GoldenNewby - 1 month ago 11x
Perl Question

Tips for keeping Perl memory usage low

What are some good tips for keeping memory usage low in a Perl script? I am interested in learning how to keep my memory footprint as low as possible for systems depending on Perl programs. I know Perl isn't great when it comes to memory usage, but I'd like to know if there are any tips for improving it.

So, what can you do to keep a Perl script using less memory. I'm interested in any suggestions, whether they are actual tips for writing code, or tips for how to compile Perl differently.

Edit for Bounty:
I have a perl program that serves as a server for a network application. Each client that connects to it gets it's own child process currently. I've used threads instead of forks as well, but I haven't been able to determine if using threads instead of forks is actually more memory efficient.

I'd like to try using threads instead of forks again. I believe in theory it should save on memory usage. I have a few questions in that regard:

  1. Do threads created in Perl prevent copying Perl module libraries
    into memory for each thread?

  2. Is threads (use threads) the most efficient way (or the only)
    way to create threads in Perl?

  3. In threads, I can specify a stack_size paramater, what specifically
    should I consider when specifying this value, and how does it impact
    memory usage?

With threads in Perl/Linux, what is the most reliable method to determine the actual memory usage on a per-thread basis?


What sort of problem are you running into, and what does "large" mean to you? I have friends you need to load 200 Gb files into memory, so their idea of good tips is a lot different than the budget shopper for minimal VM slices suffering with 250 Mb of RAM (really? My phone has more than that).

In general, Perl holds on to any memory you use, even if it's not using it. Realize that optimizing in one direction, e.g. memory, might negatively impact another, such as speed.

This is not a comprehensive list (and there's more in Programming Perl):

☹ Use Perl memory profiling tools to help you find problem areas. See Profiling heap memory usage on perl programs and How to find the amount of physical memory occupied by a hash in Perl?

☹ Use lexical variables with the smallest scope possible to allow Perl to re-use that memory when you don't need it.

☹ Avoid creating big temporary structures. For instance, reading a file with a foreach reads all the input at once. If you only need it line-by-line, use while.

 foreach ( <FILE> ) { ... } # list context, all at once 
 while( <FILE> ) { ... } # scalar context, line by line

☹ You might not even need to have the file in memory. Memory-map files instead of slurping them

☹ If you need to create big data structures, consider something like DBM::Deep or other storage engines to keep most of it out of RAM and on disk until you need it.

☹ Don't let people use your program. Whenever I've done that, I've reduced the memory footprint by about 100%. It also cuts down on support requests.

☹ Pass large chunks of text and large aggregates by reference so you don't make a copy, thus storing the same information twice. If you have to copy it because you want to change something, you might be stuck. This goes both ways as subroutine arguments and subroutine return values:

 call_some_sub( \$big_text, \@long_array );
 sub call_some_sub {
      my( $text_ref, $array_ref ) = @_;
      return \%hash;

☹ Track down memory leaks in modules. I had big problems with an application until I realized that a module wasn't releasing memory. I found a patch in the module's RT queue, applied it, and solved the problem.

☹ If you need to handle a big chunk of data once but don't want the persistent memory footprint, offload the work to a child process. The child process only has the memory footprint while it's working. When you get the answer, the child process shuts down and releases it memory. Similarly, work distribution systems, such as Gearman, can spread work out among machines.

☹ Turn recursive solutions into iterative ones. Perl doesn't have tail recursion optimization, so every new call adds to the call stack. You can optimize the tail problem yourself with tricks with goto or a module, but that's a lot of work to hang onto a technique that you probably don't need.

☹ Did he use 6 Gb or only five? Well, to tell you the truth, in all this excitement I kind of lost track myself. But being as this is Perl, the most powerful language in the world, and would blow your memory clean off, you've got to ask yourself one question: Do I feel lucky? Well, do ya, punk?

There are many more, but it's too early in the morning to figure out what those are. I cover some in Mastering Perl and Effective Perl Programming.