Comment by snvzz

1 year ago

The fifth is also, once you consider non-ASCII names.

snvzz

Could someone show a legit reason to use 1000-character filenames? Seems to me that when filenames are that long, they are actually capturing several KEYS that can be easily searched with ls and regexes, e.g.:

2025-Jan-14-1258.93743_Experiment-2345_Gas-Flow-375.3_etc_etc.dat

But to me this stuff should be in metadata. It's just that we don't have great tools for grepping the metadata.
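To make the "several keys in one name" point concrete, here is a minimal sketch that pulls the keys out of the example filename with a regex. The field layout (date, time, experiment number, gas flow) and the names `PATTERN` and `parse_keys` are assumptions made for illustration; the real naming scheme may differ.

```python
import re

# Hypothetical pattern for the example name above; the field layout is assumed.
PATTERN = re.compile(
    r"(?P<date>\d{4}-[A-Za-z]{3}-\d{2})-(?P<time>\d{4}\.\d+)"
    r"_Experiment-(?P<experiment>\d+)"
    r"_Gas-Flow-(?P<gas_flow>\d+(?:\.\d+)?)"
)

def parse_keys(filename):
    """Return the keys embedded in a filename as a dict, or None if it does not match."""
    m = PATTERN.match(filename)
    return m.groupdict() if m else None

keys = parse_keys("2025-Jan-14-1258.93743_Experiment-2345_Gas-Flow-375.3_etc_etc.dat")
print(keys)
```

Every tool that wants these keys has to agree on a pattern like this, which is roughly the argument for keeping them in metadata instead.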

Heck, the original Macintosh File System (MFS) did not support true hierarchical subdirectories. Instead, the illusion of subdirectories was created by burying folder-like names in the filenames of a flat filesystem.

This was done by using colons (:) as separators in filenames. A file named Folder:Subfolder:File would appear to belong to a subfolder within a folder. This was entirely a user interface convention managed by the Finder. Internally, MFS stored all files in a flat namespace, with no actual directory hierarchy in the filesystem structure.

So, there is 'utility' in "overloading the filename space". But...

p_l 1 year ago
> Could someone show a legit reason to use 1000-character filenames?
1023-byte names can mean fewer than 250 characters because of Unicode and UTF-8. Add to that Unicode normalization, which might "expand" some characters into two or more combining characters, plus deliberate use of combining characters, emoji, and rare characters, and you might end up with many "characters" taking more than 4 bytes. A single "country flag" character will usually be 8 bytes, most emoji will be at least 4 bytes, skin tone modifiers will add 4 bytes, etc.
this ' ' takes 27 bytes in my terminal, '󠁧󠁢󠁳󠁣󠁴󠁿' takes 28, another combo I found is 35 bytes.
And that's on top of just getting a long title in, let's say, CJK or another less common script: an early manuscript of a somewhat successful Japanese novel has a non-normalized filename of 119 bytes, and that's nowhere close to the actually long titles that someone might reasonably have on disk. A random search on the internet easily turns up a book title that takes over 300 bytes in non-normalized UTF-8.
P.S. proper title of "Robinson Crusoe" if used as filename takes at least 395 bytes...
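The byte counts above are easy to check. Here is a minimal sketch, using only the Python standard library, that measures the UTF-8 length of a few names; the helper `utf8_len` and the sample strings are illustrative choices, not the exact strings from the original post.

```python
import unicodedata

def utf8_len(name):
    """Number of bytes the name occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))

# A country flag is two regional-indicator code points, 4 bytes each.
flag_jp = "\U0001F1EF\U0001F1F5"

# A subdivision flag is a black flag followed by tag characters and a cancel tag.
flag_scotland = (
    "\U0001F3F4\U000E0067\U000E0062\U000E0073"
    "\U000E0063\U000E0074\U000E007F"
)

# A base emoji plus a skin tone modifier.
thumbs_up_toned = "\U0001F44D\U0001F3FD"

# Normalization can expand a name: precomposed e-acute vs. "e" + combining acute.
e_nfc = "\u00e9"
e_nfd = unicodedata.normalize("NFD", e_nfc)

for label, s in [
    ("country flag", flag_jp),
    ("Scotland flag", flag_scotland),
    ("thumbs up + skin tone", thumbs_up_toned),
    ("e-acute, NFC", e_nfc),
    ("e-acute, NFD", e_nfd),
]:
    print(f"{label}: {len(s)} code points, {utf8_len(s)} bytes")
```

What a user perceives as one character can be several code points, and each code point can be up to 4 bytes, so a byte limit on names says little about how many visible characters fit.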
- p_l 1 year ago
  
  hah. Apparently HN eradicated the carefully pasted complex unicode emojis.
  The first was "man+woman kissing" with a skin tone modifier, then there were a few flags.