Comment by steveklabnik

3 days ago

> How does this work under the hood? Does Ruby keep a giant map of all strings in the application to check new strings against to see if it can dedupe?

1. Strings have a flag (FL_FREEZE) that is set when the string is frozen. It is checked whenever a string would be mutated, so the mutation can be rejected.

2. There is an interned string table for frozen strings.
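Both points can be observed from plain Ruby. This is a small sketch, assuming Ruby 2.5 or later, where `FrozenError` exists and `String#-@` returns a frozen, deduplicated string; the variable names are only illustrative:

```ruby
# 1. Mutating a frozen string raises FrozenError. This is the
#    user-visible effect of the frozen-flag check.
s = "hello".freeze
mutation_blocked =
  begin
    s << " world"
    false
  rescue FrozenError
    true
  end

# 2. String#-@ returns a frozen, deduplicated string.
#    The strings are built at runtime so that the two values start out
#    as separate objects rather than one shared literal.
a = -(%w[foo bar].join)
b = -(%w[foo bar].join)
same_object = a.equal?(b) # identity check, not just equal content
```

If the interned table works as described, `a` and `b` end up being the very same object even though they were built separately.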

> Does it keep a reference count to each unique string that requires a set lookup to update on each string instance’s deallocation?

This I am less sure about. I poked around in the implementation for a bit, but I am not confident in my answer. It appears to me that it just deletes the entry, but that cannot be right, so I suspect I'm missing something. I only dig around in Ruby internals once or twice a year :)

There's no need for ref counting, since Ruby has a mark & sweep GC.

The interned string table uses weak references. Any string added to the interned string table has the `FL_FSTR` flag set on it, and when a string is freed, if it has that flag, the GC knows to remove it from the interned string table.
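As a rough illustration of that bookkeeping, here is a toy model in plain Ruby. It is not how the VM is written (the real code is C inside the GC), and `ToyString`, `ToyFstringTable`, and `free` are hypothetical names invented for this sketch. The hash here also holds an ordinary strong reference, so the "sweep" is simulated by calling `free` by hand:

```ruby
# Toy model only: these classes are hypothetical, not Ruby VM APIs.
class ToyString
  attr_reader :value
  attr_accessor :fstr # stands in for the FL_FSTR flag

  def initialize(value)
    @value = value
    @fstr = false
  end
end

class ToyFstringTable
  def initialize
    @table = {} # content => ToyString
  end

  # Return the already-interned entry, or register this string
  # and mark it as living in the table.
  def intern(str)
    existing = @table[str.value]
    return existing if existing

    str.fstr = true
    @table[str.value] = str
  end

  # What the sweep step would do when it frees an object:
  # only flagged strings need their table entry removed.
  def free(str)
    @table.delete(str.value) if str.fstr
  end

  def size
    @table.size
  end
end

table = ToyFstringTable.new
first  = table.intern(ToyString.new("foo"))
second = table.intern(ToyString.new("foo"))
deduped = first.equal?(second)

# Freeing an unflagged string with the same content leaves the table alone.
table.free(ToyString.new("foo"))
size_after_unflagged_free = table.size

# Freeing the flagged string removes its entry.
table.free(first)
size_after_flagged_free = table.size
```

The flag is what makes the free path cheap: a string that was never interned skips the table lookup entirely.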

The keyword to search for in the VM is `fstring`, which is what interned strings are called internally:

- https://github.com/ruby/ruby/blob/b146eae3b5e9154d3fb692e8fe...

- https://github.com/ruby/ruby/blob/b146eae3b5e9154d3fb692e8fe...

  • Ah, the meaning of FL_FSTR was what I was missing. I had followed this code into rb_gc_free_fstring without realizing what FL_FSTR meant. Thank you!