Comment by matchcc

1 day ago

I think there is some distillation relationship between Kimi K2 and Qwen Coder (or other related models), or they share training data. I tried most of the major LLMs, and only Kimi K2 gave the exact same ASCII. Kimi K2: Here’s a classic ASCII art of SpongeBob SquarePants for you:

           .--..--..--..--..--..--.
        .' \  (`._   (_)     _   \
      .'    |  '._)         (_)  |
      \ _.')\      .----..---.   /
      |(_.'  |    /    .-\-.  \  |
      \     0|    |   ( O| O) | o|
       |  _  |  .--.____.'._.-.  |
       \ (_) | o         -` .-`  |
        |    \   |`-._ _ _ _ _\ /
        \    |   |  `. |_||_|   |
        | o  |    \_      \     |     -.   .-.
        |.-.  \     `--..-'   O |     `.`-' .'
      _.'  .' |     `-.-'      /-.__   ' .-'
    .' `-.` '.|='=.='=.='=.='=|._/_ `-'.'
    `-._  `.  |________/\_____|    `-.'
       .'   ).| '=' '='\/ '=' |
       `._.`  '---------------'
               //___\   //___\
                 ||       ||
                 ||_.-.   ||_.-.
                (_.--__) (_.--__)

Enjoy your SpongeBob ASCII!

For ASCII art to look right and not get messed up, the generator has to know the width of the div in characters, e.g. 80, 240, etc., so it can make sure the lines don't wrap. So how does an LLM know anything about the UI it's serving? Is it just luck? What if you ask it to draw something that's 16:9 in aspect ratio... would it know to scale it down so the lines won't wrap? How much detail would be lost if it did? Also, is it as good with Unicode art? So many questions.
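
Just to make the wrapping constraint concrete, here's a minimal Python sketch of the check I mean; the art string is a placeholder, and shutil only knows about a local terminal, not a web div:

    import shutil

    art = """\
     .--..--..--..--..--..--.
    ( placeholder: paste the SpongeBob art here )
    """

    # Widest line of the art, in characters
    widest = max(len(line) for line in art.splitlines())

    # Width the environment actually gives us (falls back to 80 if unknown)
    columns = shutil.get_terminal_size(fallback=(80, 24)).columns

    print(f"widest line: {widest} chars, terminal: {columns} cols")
    print("will wrap" if widest > columns else "fits without wrapping")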

  • They don't see runs of spaces very well, so most of them are terrible at ASCII art. (They'll often regurgitate something from their training data rather than try themselves.)

    And unless their terminal details are included in the context, they'll just have to guess.

• Runs of spaces of many different lengths are each encoded as a single token. It's not actually inefficient.

In fact, everything from ' ' up to ' ' * 79 has a single token assigned to it in the OpenAI GPT-4 tokenizer. Sometimes ' ' * x + '\n' is also assigned a single token (you can check this with tiktoken; see the sketch below).

You might ask why they do this; it's to make the model work better on code by reducing token counts. All the leading whitespace before a line of code gets jammed into a single token, and entire empty lines also get turned into a single token.

There are actually lots of interesting hand-crafted token features like this that don't get discussed much.
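
      A minimal sketch with OpenAI's tiktoken library to see this for yourself (cl100k_base is the GPT-4 encoding; the exact counts are whatever the encoding actually assigns, I'm not asserting specific numbers):

        import tiktoken

        # GPT-4's encoding, which has dedicated tokens for runs of whitespace
        enc = tiktoken.get_encoding("cl100k_base")

        for n in (1, 4, 16, 64):
            ids = enc.encode(" " * n)
            print(f"{n:>2} spaces -> {len(ids)} token(s): {ids}")

        # Leading indentation on a line of code collapses into very few tokens
        print(enc.encode("        return x"))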