David Harper David Harper - 1 year ago 79
Linux Question

USB Serial port programming has "disastrous" results

I am currently working on a C program running on a Raspberry Pi 3 (Linux Ubuntu) that is intended to provide a web page interface for configuring networking on an embedded system.

The code is being developed using Code::Blocks with the GDB debugger. I'm using microhttpd for the web server and that, plus the various web pages, are all working great. I'm now working on the USB Serial link to the embedded system using information in "Serial Programming Guide for POSIX Operating Systems".

The code below is responsible for opening the USB Serial link to the target system and seems to work fine - once. If I close the program and restart it (either standalone on the command line or from within Code::Blocks) the second time microhttpd is hosed - browser windows will no longer connect. Further, from within Code::Blocks the debugger is also hosed - once the program is started it cannot be paused or stopped. The only way is to kill it by closing the project.

The problem is clearly within the function since I can comment out the call to it and everything works as it did previously. Unfortunately, once the problem happens the only solution seems to be to reboot the Pi.

I've done things like this before using a scripting language (Tcl) but this time around I'm looking for a performance boost from a non-interpreted language since the Pi will also be running a high bandwidth data logging program through a similar USB serial interface.

The code is shown below:

/* This function scans through the list of USB Serial ports and tries to */
/* establish communication with the target system. */

void tapCommInit(void) {
char line[128];
char port[15]; // this is always of the form "/dev/TTYACMn"
char *ptr;
FILE *ifd;
struct termios options;
uint8_t msgOut[3], msgIn[4];

msgOut[0] = REQ_ID; // now prepare the message to send
msgOut[1] = 0; // no data so length is zero
msgOut[2] = 0;

/* First, get the list of USB Serial ports. */

system("ls -l /dev/serial/by-path > usbSerial\n"); // get current port list
ifd = fopen("usbSerial", "r");
logIt(fprintf(lfd, "serial ports: \n"));

/* The main loop iterates through the file looking for lines containing */
/* "tty" which should be a valid USB Serial port. The port is configured */
/* in raw mode as 8N1 and an ID request command is sent, which has no */
/* data. If a response is received it's checked to see if the returned */
/* ID is a match. If not, the port is closed and we keep looking. If a */
/* match is found, tapState is set to "UP" and the function returns. If */
/* no match is found, tapState is left in the initial "DOWN" state. */

while(1) {
if (fgets(line, 127, ifd) == NULL) { // end of file?
break; // yes - break out and return
ptr = strstr(line, "tty"); // make sure the line contains a valid entry
if (ptr == NULL) {
continue; // nothing to process on this line
strcpy(port, "/dev/"); // create a correct pathname
strcat(port, ptr); // append the "ttyACMn" part of the line
port[strlen(port)-1] = 0; // the last character is a newline - remove it
logIt(fprintf(lfd," %s\n", port)); // we have a port to process now
cfd = open(port, O_RDWR | O_NOCTTY | O_NDELAY); // cfd is a global int
if (cfd == -1) {
logIt(fprintf(lfd, "Could not open port: %s\n", port));
continue; // keep going with the next one (if any)
fcntl(cfd, F_SETFL, 0); // blocking mode
tcgetattr(cfd, &options); // get the current port settings
options.c_cflag |= (CLOCAL | CREAD); // ena receiver, ignore modem lines
options.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG); // raw, no echo
options.c_oflag &= ~OPOST; // no special output processing
options.c_cc[VMIN] = 0; // minimum number of raw read characters
options.c_cc[VTIME] = 10; // timeout in deciseconds (1 second timeout)
tcsetattr(cfd, TCSANOW, &options); // set options right now
cfsetispeed(&options, B115200); // input baud rate
cfsetospeed(&options, B115200); // output baud rate
options.c_cflag &= ~(CSIZE | PARENB | // clear size bits, no parity
CSTOPB | CRTSCTS); // 1 stop bit, no hw flow control
options.c_cflag |= CS8; // now set size: 8-bit characters
options.c_cflag &= ~(IXON | IXOFF | IXANY); // no sw flow control

if (write(cfd, msgOut, 3) < 3) {
logIt(fprintf(lfd, "Sending of output message failed\n"));
if (read(cfd, msgIn, 4) != 4) {
logIt(fprintf(lfd, "Didn't get expected amount of return data\n"));
if (msgIn[3] != HOST_ID) {
logIt(fprintf(lfd, "Got the wrong HOST_ID response\n"));
logIt(fprintf(lfd, "Port found - communication established\n"));
tapState = UP;
break; // we're done - break out of the loop
fclose(ifd); // close and remove the file we created

Answer Source

from within Code::Blocks the debugger is also hosed - once the program is started it cannot be paused or stopped

It is far more likely that you do not understand your tools than that you have created an unkillable program.

It's easy enough to figure this out: divide and conquer. You've got a whole pile of unrelated components here. Start separating them and find out which pieces work fine in isolation and which continue to behave badly when disconnected from everything else. Then you'll have your culprit.

Specifically here, that means try running your program outside the IDE, then under command line gdb instead of GDB via the IDE.

Also, it should be possible to run your program without starting the web server piece, so that you can run the serial part of the app in isolation. This is not only good for debugging by minimizing confounding variables, it also encourages a loosely-coupled program design, which is a good thing in its own right.

In the end, you may find that the thing keeping your program from stopping is the web framework, Code::Blocks, or the way GDB operates on the Pi under Code::Blocks, rather than anything to do with the USB to serial adapter.

once the problem happens the only solution seems to be to reboot the Pi

If your program is still running in the background, then of course your next instance will fail if it tries to open the same USB port.

Don't guess, find out:

$ sudo lsof | grep ttyACM


$ lsof -p $(pidof myprogram)

(Substitute pgrep if your system doesn't have pidof.)

I've done things like this before using a scripting language (Tcl) but this time around I'm looking for a performance boost from a non-interpreted language

Your serial port is running at 115,200 bps. Divide that by 10 to account for the stop and start bits, then flip the fraction to get seconds per byte, and you come to 87 microseconds per byte. And you only achieve that when the serial port is running flat-out, sending or receiving 11,500 bytes per second. Wanna take a guess at how many lines of code Tcl can interpret in 87 microseconds? Tcl isn't super-fast, but 87 microseconds is an eternity even in Tcl land.

Then on the other side of the connection, you have HTTP and a [W]LAN, likely adding another hundred milliseconds or so of delay per transaction.

Your need for speed is an illusion.

Now come back and talk to me again when you need to talk to 100 of these asynchronously, and then maybe we can start to justify C over Tcl.

(And I say this as one whose day job involves maintaining a large C++ program that does a lot of serial and network I/O.)

Now lets get to the many problems with this code:

system("ls -l /dev/serial/by-path > usbSerial\n");  // get current port list
ifd = fopen("usbSerial", "r");

Don't use a temporary where a pipe will suffice; use popen() here instead.

while(1) {

This is simply wrong. Say while (!feof(ifd)) { here, else you will attempt to read past the end of the file.

This, plus the next error, is likely the key to your major symptoms.

if (fgets(line, 127, ifd) == NULL) { 

There are several problems here:

  1. You're assuming things about the meaning of the return value that do not follow from the documentation. The Linux fopen(3) man page isn't super clear on this; the BSD version is better:

    The fgets() and gets() functions do not distinguish between end-of-file and error, and callers must use feof(3) and ferror(3) to determine which occurred.

    Because fgets() is Standard C, and not Linux- or BSD-specific, it is generally safe to consult other systems' manual pages. Even better, consult a good generic C reference, such as Harbison & Steele. (I found that much more useful than K&R back when I was doing more pure C than C++.)

    Bottom line, simply checking for NULL doesn't tell you everything you need to know here.

  2. Secondarily, the hard-coded 127 constant is a code bomb waiting to go off, should you ever shrink the size of the line buffer. Say sizeof(line) here.

    (No, not sizeof(line) - 1: fgets() leaves space for the trailing null character when reading. Again, RTFM carefully.)

  3. The break is also a problem, but we'll have to get further down in the code to see why.

Moving on:

strcat(port, ptr);              // append the "ttyACMn" part of the line

Two problems here:

  1. You're blindly assuming that strlen(ptr) <= sizeof(port) - 6. Use strncat(3) instead.

    (The prior line's strcpy() (as opposed to strncpy()) is justifiable because you're copying a string literal, so you can see that you're not overrunning the buffer, but you should get into the habit of pretending that the old C string functions that don't check lengths don't even exist. Some compilers will actually issue warnings when you use them, if you crank the warning level up.)

    Or, better, give up on C strings, and start using std::string. I can see that you're trying to stick to C, but there really are things in C++ that are worth using, even if you mostly use C. C++'s automatic memory management facilities (not just string, but also auto_ptr/unique_ptr and more) fall into this category.

    Plus, C++ strings operate more like Tcl strings, so you'll probably be more comfortable with them.

  2. Factual assertions in comments must always be true, or they are likely mislead you later, potentially hazardously so. Your particular USB to serial adapter may use /dev/ttyACMx, but not all do. There's another common USB device class used by some serial-to-USB adapters that causes them to show up under Linux as ttyUSBx. More generally, a future change may change the device name in some other way; you might port to BSD, for example, and now your USB to serial device is called /dev/cu.usbserial, blowing your 15-byte port buffer. Don't assume.

    Even with the BSD case aside, your port buffer should not be smaller than your line buffer, since you are concatenating the latter onto the former. At minimum, sizeof(port) should be sizeof(line) + strlen("/dev/"), just in case. If that seems excessive, it is only because 128 bytes for the line buffer is unnecessarily large. (Not that I'm trying to twist your arm to change it. RAM is cheap; programmer debugging time is expensive.)


fcntl(cfd, F_SETFL, 0);                                 // blocking mode

File handles are blocking by default in Unix. You have to ask for a nonblocking file handle. Anyway, blasting all the flags is bad style; you don't know what other flags you're changing here. Proper style is to get, modify, then set, much like the way you're doing with tcsetattr():

int flags;
fcntl(cfd, F_GETFL, &flags);
flags &= ~O_NONBLOCK;
fcntl(cfd, F_SETFL, flags);

Well, you're kind of using tcsetattr() correctly:

tcsetattr(cfd, TCSANOW, &options); 

...followed by further modifications to options without a second call to tcsetattr(). Oops!

You weren't under the impression that modifications to the options structure affect the serial port immediately, were you?

if (write(cfd, msgOut, 3) < 3) {
    logIt(fprintf(lfd, "Sending of output message failed\n"));

Piles of wrong here:

  1. You're collapsing the short-write and error cases. Handle them separately:

    int bytes = write(cfd, msgOut, 3);
    if (bytes == 0) {
        // can't happen with USB, but you may later change to a
        // serial-to-Ethernet bridge (e.g. Digi One SP), and then
        // it *can* happen under TCP.
        // complain, close, etc.
    else if (bytes < 0) {
        // plain failure case; could collapse this with the == 0 case
        // close, etc
    else if (bytes < 3) {
        // short write case
    else {
        // success case
  2. You aren't logging errno or its string equivalent, so when (!) you get an error, you won't know which error:

    logIt(fprintf(lfd, "Sending of output message failed: %s (code %d)\n",
               strerror(errno), errno));

    Modify to taste. Just realize that write(2), like most other Unix system calls, has a whole bunch of possible error codes. You probably don't want to handle all of them the same way. (e.g. EINTR)

  3. After closing the FD, you're leaving it set to a valid FD value, so that on EOF after reading one line, you leave the function with a valid but closed FD value! (This is the problem with break above: it can implicitly return a closed FD to its caller.) Say cfd = -1 after every close(cfd) call.

Everything written above about write() also applies to the following read() call, but also:

if (read(cfd, msgIn, 4) != 4) {

There's nothing in POSIX that tells you that if the serial device sends 4 bytes that you will get all 4 bytes in a single read(), even with a blocking FD. You are especially unlikely to get more than one byte per read() with slow serial ports, simply because your program is lightning fast compared to the serial port. You need to call read() in a loop here, exiting only on error or completion.

And just in case it isn't obvious:


You don't need that if you switch to popen() above. Don't scatter temporary working files around the file system where a pipe will do.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download