rici rici - 7 months ago 10
Bash Question

Is a semicolon prohibited after NAME in `for NAME do ...`?

The bash manual lists the syntax for the `for` compound statement as

for name [ [ in [ word ... ] ] ; ] do list ; done


which implies that the semicolon before `do` is optional if the `in` clause is omitted [Note 2].

However, the Posix specification lists only the following three productions for `for_clause`:

for_clause : For name linebreak do_group
           | For name linebreak in sequential_sep do_group
           | For name linebreak in wordlist sequential_sep do_group
           ;


For reference, `linebreak` is a possibly-empty sequence of `NEWLINE`, while `sequential_sep` is either a semicolon or a `NEWLINE`, possibly followed by a sequence of `NEWLINE`:

newline_list   : NEWLINE
               | newline_list NEWLINE
               ;
linebreak      : newline_list
               | /* empty */
               ;
separator      : separator_op linebreak
               | newline_list
               ;
sequential_sep : ';' linebreak
               | newline_list
               ;


As far as I can see, that prohibits the syntax `for foo; do :; done`.
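To make the productions concrete, here is a minimal sketch of the spellings the grammar above does admit. The function name `demo_posix_forms` and the sample words are made up for illustration; nothing here comes from the standard beyond the three `for_clause` shapes.

```shell
# Hypothetical helper: exercises the three POSIX for_clause productions.
demo_posix_forms() {
    # Production 1 with an empty linebreak: "For name do_group".
    for arg do printf '1:%s\n' "$arg"; done

    # Production 1 with linebreak = one NEWLINE.
    for arg
    do printf '2:%s\n' "$arg"; done

    # Production 2: "in" with no words, then sequential_sep (runs zero times).
    for arg in; do printf '3:%s\n' "$arg"; done

    # Production 3: "in wordlist sequential_sep".
    for arg in x y; do printf '4:%s\n' "$arg"; done
}

demo_posix_forms a b
```

The forms without `in` loop over the function's positional parameters (`a` and `b` here), while the empty `in` list produces no output at all.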

In practice, all the shells I tried (bash, dash, ksh and zsh) accept both `for foo; do :; done` and `for foo do :; done` without complaint, regardless of Posix or their own documentation [Note 3].
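A rough way to repeat that experiment is sketched below. The helper name `check_for_syntax` is hypothetical, and the script assumes each shell supports the usual `-c` option; shells that are not installed are skipped, so the output depends on the machine.

```shell
# Hypothetical helper: reports whether the given shell parses and runs a snippet.
check_for_syntax() {
    if "$1" -c "$2" >/dev/null 2>&1; then
        echo accepted
    else
        echo rejected
    fi
}

for sh_name in bash dash ksh zsh; do
    # Skip shells that are not installed on this machine.
    command -v "$sh_name" >/dev/null 2>&1 || continue
    for snippet in 'for foo; do :; done' 'for foo do :; done'; do
        printf '%s: %s -> %s\n' "$sh_name" "$snippet" \
            "$(check_for_syntax "$sh_name" "$snippet")"
    done
done
```

Because the snippets are actually executed rather than only parsed, a shell counts as accepting a form only if it exits with status 0; with no positional parameters, both loops simply run zero times.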

Is this an accidental omission in the grammar in the Posix standard, or should the use of the semicolon in that syntax be considered a (commonly-implemented) extension to the standard?

Addendum



In the XCU description of the for loop, Posix seems to insist on newlines:


The format for the for loop is as follows:

for name [ in [word* ... ]]
do
compound-list
done



However, in the Rationale volume, it is made clear that the grammar is intended to be the last word:


The format is shown with generous usage of <newline> characters. See the grammar in XCU Shell Grammar for a precise description of where <newline> and <semicolon> characters can be interchanged.





Notes




  1. Apparently this is the first SO question which pairs and . There is no , which might have been more appropriate.

  2. The `bash` manual is not entirely explicit about newlines; what it says is:


    In most cases a list in a command's description may be separated from the rest of the command by one or more newlines, and may be followed by a newline in place of a semicolon.


    That makes it clear that the semicolon preceding `done` can be replaced by a newline, but does not seem to mention that the same transformation can be performed on the semicolon preceding `do`.

  3. The documentation for both `ksh` and `zsh` seems to insist that there be either a semicolon or a newline after the `name`, although the implementations don't insist on it.

    The `ksh` manpage lists the syntax as:


    for vname [ in word ... ] ;do list ;done



    (I believe that the semicolon in `;do` and `;done` represents "a semicolon or a newline". I can't find any definite statement to that effect but it is the only way to make sense of the syntax description.)

    The `zsh` manual shows:


    for name ... [ in word ... ] term do list done


        where term is at least one newline or ;.


Answer

Nicely spotted! I don't have a definite answer, but here is what the source code says about it:

It's indeed not valid in the original Bourne shell from AT&T UNIX v7:

(shell has just read `for name`):
       IF skipnl()==INSYM
       THEN chkword();
        t->forlst=item(0);
        IF wdval!=NL ANDF wdval!=';'
        THEN    synbad();
        FI
        chkpr(wdval); skipnl();
       FI
       chksym(DOSYM|BRSYM);

Given this snippet, it does not appear to be a conscious design decision. It's just a side effect of the semicolon being handled as part of the `in` group, which is skipped entirely when there is no `in`.

Dash agrees that it's not valid in Bourne, but adds it as an extension:

        /*
         * Newline or semicolon here is optional (but note
         * that the original Bourne shell only allowed NL).
         */

Ksh93 claims that it's valid, but says nothing of the context:

/* 'for i;do cmd' is valid syntax */
else if(tok==';')
    while((tok=sh_lex(lexp))==NL);

Bash has no comment, but explicitly adds support for this case:

for_command:    FOR WORD newline_list DO compound_list DONE
            {
              $$ = make_for_command ($2, add_string_to_list ("\"$@\"", (WORD_LIST *)NULL), $5, word_lineno[word_top]);
              if (word_top > 0) word_top--;
            }
...
    |   FOR WORD ';' newline_list DO compound_list DONE
            {
              $$ = make_for_command ($2, add_string_to_list ("\"$@\"", (WORD_LIST *)NULL), $6, word_lineno[word_top]);
              if (word_top > 0) word_top--;
            }
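Both grammar actions above pass `"$@"` as the word list, so the two spellings should behave exactly like an explicit `in "$@"`. A small sketch of that equivalence follows; the function name `show_args` is made up for illustration, and the semicolon spelling relies on the extension discussed here rather than on the POSIX grammar.

```shell
# Hypothetical helper: prints each positional parameter using all three spellings.
show_args() {
    # No separator between the name and "do".
    for arg do printf 'no-semicolon: %s\n' "$arg"; done

    # Semicolon after the name (the form in question).
    for arg; do printf 'semicolon: %s\n' "$arg"; done

    # Explicit equivalent, matching the "$@" word list in the grammar actions.
    for arg in "$@"; do printf 'explicit: %s\n' "$arg"; done
}

show_args 'one two' three
```

Each loop prints the same two items, and the argument containing a space stays intact, which is what quoting `"$@"` is meant to guarantee.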

In zsh, it's just a side effect of the parser:

while (tok == SEPER)
    zshlex();

where SEPER is `;` or a linefeed. Due to this, zsh happily accepts this loop:

for foo; ; 
;
; ; ; ; ;
; do echo cow; done

To me, this all points to an intentional omission in POSIX, one that shells widely and intentionally support as an extension.
