Comment by anonnon

2 days ago

Specifically, it's non-literal strings that are mutable. Implementations of either may allow you to modify the literals, but it's likely to break something due to interning. The Common Lisp standard is clear that destructive literal modification is undefined behavior. I believe it's the same in Smalltalk, and I remember you were admonished to never use the direct modification messages (like at:put: or replaceFrom:to:with:) on strings without copying them first.

    'justastring' at: 6 put: $S; yourself

    'justaString' .

However, a symbol rejects the same message:

    #'justasymbol' at: 6 put: $S; yourself

    errorNoModification
        self error:  'symbols can not be modified.'

  • > 'justastring' at: 6 put: $S; yourself

    Evaluating this yields an error in both Squeak and Pharo. What Smalltalk are you using? I'm going to guess Cuis, in which case your example holds, but is misleading. Consider:

        a := 'justastring'.
        b := 'justastring'.
        a at: 6 put: $S.
        a, ' = ', b.

        'justaString = justaString' .
    

    Notice that modifying "a" also modified "b", because both refer to the same literal frame entry. This is why you were traditionally admonished to avoid directly modifying string literals (which wasn't an issue in practice, given the design of the string classes and the general poor manners of destructively modifying a string argument of unknown origin).
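
    A minimal sketch of the usual workaround, assuming a workspace in a Smalltalk where string literals are writable (Cuis, per the guess above): send copy before mutating, so the change lands on a fresh String instead of the shared literal.

        | a b |
        a := 'justastring' copy.   "a fresh String, not the literal frame entry"
        b := 'justastring'.        "still refers to the shared literal"
        a at: 6 put: $S.
        a, ' = ', b.

        'justaString = justastring' .

    Here "a" is a private copy, so "b" still sees the untouched literal.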

    • Cuis.

          | a b |
          a := 'justastring'.
          b := 'justastring'.
          a == b
      
          true .

I looked around, and it seems Smalltalk didn't make modifying literals UB the way Common Lisp does, but I'm not an expert. Still, in both languages, most implementations won't stop you from modifying string literals; you just have to deal with the consequences.

  • I'm too lazy to review the ANSI spec, but the Smalltalk Bluebook VM served as a reference implementation for Smalltalk VMs for years, and it used a method-scoped literal frame for string (and other) literals:

    > The selectors in parentheses may be replaced with other selectors by modifying the compiler and recompiling all methods in the system. The other selectors are built into the virtual machine.

    > Any objects referred to in a CompiledMethod's bytecodes that do not fall into one of the categories above must appear in its literal frame. The objects ordinarily contained in a literal frame are

    > shared variables (global, class, and pool)

    > most literal constants (numbers, characters, strings, arrays, and symbols)

    > most message selectors (those that are not special)

    > Objects of these three types may be intermixed in the literal frame. If an object in the literal frame is referenced twice in the same method, it need only appear in the literal frame once. The two bytecodes that refer to the object will refer to the same location in the literal frame.

    > Two types of object that were referred to above, temporary variables and shared variables, have not been used in the example methods. The following example method for Rectangle merge: uses both types. The merge: message is used to find a Rectangle that includes the areas in both the receiver and the argument.

    http://www.mirandabanda.org/bluebook/bluebook_chapter26.html