Comment by inkyoto

1 month ago

* was the usual infix multiplication operator in BCPL, and it was not used for pointer arithmetic.

The BCPL manual[0] explains the «monadic !» operator (section 2.11.3) as:

  2.11.3 MONADIC !

  The value or a monadic ! expression is the value of the storage cell whose address is the operand of the !. Thus @!E = !@E = E, (providing E is an expression of the class described in 2.11.2).

  Examples.

  !X := Y Stores the value of Y into the storage cell whose address is the value of X.

  P := !P Stores the value of the cell whose address is the value of P, as the new value of P.

The array indexing used the «V ! idx» syntax (section 2.13, «Vector application»).

So, the ! was a prefix operator for pointers, and it was an infix operator for array indexing.

In Richard's account of BCPL's evolution, he noted that on early hardware the exlamation mark was not easily available, and, therefore, he used a composite *( (i.e. a diagraph):

  «The star in *( was chosen because it was available … and it seemed appropriate for subscription since it was used as the indirection operator in the FAP assembly language on CTSS. Later, when the exclamation mark became available, *( was replaced by !( and exclamation mark became both a dyadic and monadic indirection operator».

So, in all likelihood, !X := Y became *(X := Y, eventually becoming *X = Y (in B and C) whilst retaining the exact and original semantics of the !.

[0] https://rabbit.eng.miami.edu/info/bcpl_reference_manual.pdf

The BCPL manual linked by you is not useful, as it describes a recent version of the language, which is irrelevant for the evolution of the B and C languages. A manual of BCPL from July 1967, predating B, can be found on the Web.

The use of the character "!" in BCPL is much later than the development of the B language from BCPL, in 1969.

The asterisk had 3 uses in BCPL, as the multiplication operator, as a marker for the opening bracket in array indexing, to compensate for the lack of different kinds of brackets for function evaluation and for array indexing, and as the escape character in character strings. For the last use the asterisk has been replaced by the backslash in C.

There was indeed a prefix indirection operator in BCPL, but it did not use any special character, because the available character set did not have any unused characters.

The BCPL parser was separate from the lexer, and it was possible for the end users to modify the lexer, in order to assign any locally available characters to the syntactic tokens.

So if a user had appropriate characters, they could have been assigned to indirection and address-of, but otherwise they were just written RV and LV, for right-hand-side value and left-hand-side value.

It is not known whether Ken Thompson had modified the BCPL lexer for his PDP computer, to use some special characters for operators like RV and LV.

In any case, he could not have used asterisk for indirection, because that would have conflicted with its other uses.

The use of asterisk for indirection in B became possible only after Ken Thompson has made many other changes and simplifications in comparison with BCPL, removing any parsing conflicts.

You are right that BCPL already had prefix operators for indirection and address-of, which was different from how this had been handled in CPL, but Martin Richards did not seem to have any reason for this choice and in BCPL this was a less obvious mistake, because it did not have structures.

On the other hand, Ken Thompson did want to have "*" as prefix, after introducing his increment and decrement operators, in order to need no parentheses for pre- and post-incrementation or decrementation of pointers, in the context where postfix operators were defined as having higher precedence than prefix.

Also in his case this was not yet an obvious mistake, because he had no structures and the programs written in B at that time did not use any complex data structures that would need correspondingly complex addressing expressions.

Only years later it became apparent that this was a bad choice, while the earlier choice of N. Wirth in Euler (January 1966; the first high-level language that handled pointers explicitly, with indirection and address-of operators) had been the right one. The high-level languages that had "references" before 1966 (the term "pointer" has been introduced in IBM PL/I, in July 1966), e.g. CPL and FORTRAN IV, handled them only implicitly.

Decades later, complex data structures became common while the manual optimization of incrementing/decrementing explicitly pointers for addressing arrays became a way of writing inefficient programs, which prevent the compiler from optimizing correctly the array accessing for the target CPU.

So the choice of Ken Thompson can be justified in its context from 1969, but in hindsight it has definitely been a very bad choice.

  • I take no issue with the acknowledgment of being on the losing side of a technical argument – provided evidence compels.

    However, to be entirely candid, I have submitted two references and a direct quotation throughout the discourse in support of the position – each of which has been summarily dismissed with an appeal to some ostensibly «older, truer origin», presented without citation, without substantiation, and, most tellingly, without the rigour such a claim demands.

    It is important to recall that during the formative years of programming language development, there were no formal standards, no governing design committees. Each compiled copy of a language – often passed around on a tape and locally altered, sometimes severely – became its own dialect, occasionally diverging to the point of incompatibility with its progenitor.

    Therefore, may I ask that you provide specific and credible sources – ones that not only support your historical assertion, but also clarify the particular lineage, or flavour, of the language in question? Intellectual honesty demands no less – and rhetorical flourish is no substitute for evidence.