tulians - 3 months ago
Perl Question

Multiple CGI Perl scripts

This is kind of a theoretical question. I'm trying to develop a Perl application based on the producer-consumer paradigm. One of the scripts creates a file with data, while the other reads the data and has to present it in a HTML. There's also a third file, a HTML form, that starts the producer perl file.

What I don't know is how to run both the producer and the consumer at the same time using CGI, and I couldn't find information about it online (at least not how I searched for it).

I would like to know if you could tell me where to find this kind of information so I could test the app in the Apache server.

Thanks in advance.

Answer

Disclaimer: I think what this question boils down to is how to have two different components of a program interact with each other to create one application that is accessible from the web. If that is not what you want, just treat this as food for thought.

Common Gateway Interface

You are talking about CGI scripts in your question:

I'm trying to develop a Perl application based on the producer-consumer paradigm. One of the scripts creates a file with data, while the other reads the data and has to present it in a HTML.

In general, CGI works like this: a request arrives at a web server, which passes it on to an application. That application might be written in Perl. If it is a Perl script, the web server starts a process in which the perl interpreter runs that script. The script can access the request information through the CGI, which is mostly environment variables. The process writes its response to STDOUT, and the web server takes that output and sends it back to the browser.
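As a rough sketch of that flow, using nothing but core Perl: REQUEST_METHOD and QUERY_STRING are environment variables defined by the CGI specification, while the helper name build_response is made up for this example. Taking the environment as an argument is only there so you can try the logic without a web server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds the complete CGI response from a hash of
# environment variables.
sub build_response {
    my ($env) = @_;

    my $method = $env->{REQUEST_METHOD} // 'GET';
    my $query  = $env->{QUERY_STRING}   // '';

    # A CGI response is a header block, an empty line, then the body.
    return "Content-Type: text/plain\r\n\r\n"
         . "method: $method\n"
         . "query: $query\n";
}

# Under a web server, %ENV holds what the server passed to the process.
# Whatever is printed to STDOUT becomes the response.
print build_response(\%ENV);
```

In a real script you would normally let the CGI module handle the header and the parameters, as the scripts further down do.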

+-----------+        +-------------+                     +----------------+
|           | +----> |             | +-----Request-----> |                |
|  Browser  |        | Web server  |                     |  perl foo.cgi  |
|           | <----+ |             | <-----Response----+ |                |
+-----------+        +-------------+                     +----------------+

Now because the web server starts exactly one process for a request, you cannot have two scripts answer it. The server runs one program and relays that program's output; it does not coordinate two programs for the same request. That's not how CGI works.

A combined approach

Instead, you need to wrap your two scripts into a single point of entry and turn them into some kind of components. Then you can have them talk to each other internally, while on the outside the web server is only waiting for one program to finish.

+-----------+        +-------------+                     +-----------------+
|           | +----> |             | +-----Request-----> |                 |
|  Browser  |        | Web server  |                     |  perl foo.cgi   |
|           | <----+ |             | <-----Response----+ |                 |
+-----------+        +-------------+                     | +-------------+ |
                                                         | |  Producer   | |
                                                         | +-----+-------+ |
                                                         |       |         |
                                                         |       |         |
                                                         |       V         |
                                                         | +-------------+ |
                                                         | | Consumer    | |
                                                         | +-------------+ |
                                                         |                 |
                                                         +-----------------+

To translate this into Perl, let's first determine some terminology.

  • script: a Perl program that is in a .pl file and that does not have its own package
  • module: a Perl module that is in a .pm file and that has a package with a namespace that matches the file name

Let's assume you have these two Perl scripts that we call producer.pl and consumer.pl. They are heavily simplified and do not take any arguments into account.

producer.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use CGI;

open my $fh, '>', 'product.data' or die $!;
print $fh "lots of data\n";
close $fh;

consumer.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use CGI;

my $q = CGI->new;
print $q->header('text/plain');

open my $fh, '<', 'product.data' or die $!;
while (my $line = <$fh>) {
    print $line;
}
close $fh;

exit;

This is as simplified as it gets. There is one script that creates data and one that consumes it. Now we need to make these two interact without running them as separate programs.

Let's jump ahead and assume that we have already refactored both of these scripts and turned them into modules. We'll see how that works a bit later. We can now use those modules in our new foo.pl script. It will process the request, ask the producer for the data and let the consumer turn the data into the format the reader wants.

foo.pl

#!/usr/bin/perl
use strict;
use warnings 'all';
use Producer; # this was producer.pl
use Consumer; # this was consumer.pl
use CGI;

my $q = CGI->new;

my $params; # those would come from $q and are the parameters for the producer

my $product = Producer::produce($params);
my $output = Consumer::consume($product);

print $q->header;
print $output;

exit;

This is very straightforward. We read the parameters from the CGI, pass them to the producer, and pass the product to the consumer. That gives us output, which we print so it goes back to the server, which sends it as the response.

Let's take a look at how we turned the two scripts into simple modules. Those do not need to be object oriented, though that might be preferred. Note that the spelling of the file names is now different. Module names conventionally start with capital letters.

Producer.pm

package Producer;
use strict;
use warnings 'all';

sub produce {
    my @args = @_;

    return "lots of data\n";
}

1;

Consumer.pm

package Consumer;
use strict;
use warnings 'all';

sub consume {
    my ($data) = @_;

    return $data; # this is really simple
}

1;

Now we have two modules that do the same as the scripts if you call the right function. All I did was put a namespace (package) at the top and wrap the code in a sub. I also removed the CGI part.

In our example, it's not necessary for the producer to write to a file. It can just return the data structure. The consumer in turn doesn't need to read from a file. It just takes a variable with the data structure and turns it into whatever output should be presented.
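Here is a sketch of what passing a data structure could look like, assuming the producer returns an array reference of hash references and the consumer renders it as an HTML list. The field names (name, qty) and the escape_html helper are invented for this example, and both packages sit in one file only so the sketch runs on its own.

```perl
use strict;
use warnings;

package Producer;

# Returns the data directly instead of writing it to product.data.
sub produce {
    return [
        { name => 'apples',       qty => 3 },
        { name => 'nuts & bolts', qty => 12 },
    ];
}

package Consumer;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Turns the producer's records into an HTML fragment.
sub consume {
    my ($records) = @_;
    my $items = '';
    for my $record (@$records) {
        $items .= '<li>' . escape_html($record->{name})
                . ': ' . $record->{qty} . "</li>\n";
    }
    return "<ul>\n$items</ul>\n";
}

package main;

print Consumer::consume(Producer::produce());
```

Nothing touches the file system here, so two requests arriving at the same time cannot overwrite each other's product.data.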

If you stick to consistent function names (like produce and consume, just better), you can even write multiple producers or consumers. We have basically defined an interface here. That gives us the possibility to refactor the internals of the code without breaking compatibility, but also to stick in completely different producers or consumers. You can switch from the one-line-string producer to one that looks up stuff in a database in a heartbeat, as long as you stick to your interface.
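To make the interface idea a bit more concrete, here is a sketch with two interchangeable producers. The package names Producer::Static and Producer::Repeat, the count parameter, and the run_pipeline function are all made up for illustration. The only thing the controller relies on is that a producer has a produce function.

```perl
use strict;
use warnings;

package Producer::Static;   # hypothetical: the one-line-string producer
sub produce { return "lots of data\n" }

package Producer::Repeat;   # hypothetical: builds its data from parameters
sub produce {
    my ($params) = @_;
    my $count = $params->{count} // 1;
    return "line\n" x $count;
}

package Consumer;
sub consume {
    my ($data) = @_;
    return uc $data;   # stand-in for real presentation logic
}

package main;

# The controller only depends on the interface, not on a specific producer.
sub run_pipeline {
    my ($producer, $params) = @_;
    my $produce = $producer->can('produce')
        or die "$producer does not implement produce()\n";
    return Consumer::consume($produce->($params));
}

print run_pipeline('Producer::Static', {});
print run_pipeline('Producer::Repeat', { count => 2 });
```

Swapping the producer is then a matter of passing a different package name. A database-backed producer would slot in the same way.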

Essentially, what we just did can also be shown like this:

+--foo.pl---------------------------+
|                                   |
|  +------+        +-------------+  |
|  |      | +----> |             |  |
|  |      |        |  Producer   |  |
|  |      | <----+ |             |  |
|  | main |        +-------------+  |
|  | foo  |                         |
|  | body |        +-------------+  |
|  |      | +----> |             |  |
|  |      |        |  Consumer   |  |
|  |      | <----+ |             |  |
|  +------+        +-------------+  |
|                                   |
+-----------------------------------+

This might look slightly familiar. It is essentially the Model-View-Controller (MVC) pattern. In a web context, the model and the view often only talk to each other through the controller, but it's pretty much the same.

Our producer is a data model. The consumer turns the data into a website that the user can see, so it's the view. The main program inside foo.pl that glues both of them together controls the flow of data. It's the controller.

The initial website that triggers the whole thing could be either part of the program, and be shown if no parameters are passed, or could be a stand-alone .html file. That's up to you.
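If you go for the first option, the controller could look roughly like this. It is only a sketch: handle_request, the topic field, and the stand-in produce and consume functions are assumptions for this example, and the raw query string is passed along unparsed to keep it short.

```perl
use strict;
use warnings;

# Hypothetical controller: shows the form when the request carries no
# parameters, otherwise runs the producer and the consumer.
sub handle_request {
    my ($query_string) = @_;
    my $header = "Content-Type: text/html\r\n\r\n";

    if (!defined $query_string || $query_string eq '') {
        return $header
             . qq{<form method="get">\n}
             . qq{  <input type="text" name="topic">\n}
             . qq{  <input type="submit" value="Produce">\n}
             . qq{</form>\n};
    }

    my $product = produce($query_string);
    return $header . consume($product);
}

# Stand-ins for Producer::produce and Consumer::consume.
sub produce {
    my ($input) = @_;
    return "data for: $input";
}

sub consume {
    my ($data) = @_;
    # Escape HTML special characters before putting the data into the page.
    $data =~ s/&/&amp;/g;
    $data =~ s/</&lt;/g;
    $data =~ s/>/&gt;/g;
    return "<p>$data</p>\n";
}

print handle_request($ENV{QUERY_STRING});
```

A form without an action attribute submits to the URL it was served from, so the same script receives the parameters on the second request.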

All of this is possible with plain old CGI. You don't need to use any web frameworks for it. But as you grow your application, you'll see that a modern framework makes your life easier.


The diagrams were created with http://asciiflow.com/
