Comment by adefa

21 days ago

I ran a similar experiment last month and ported Qwen 3 Omni to llama cpp. I was able to get GGUF conversion, quantization, and all input and output modalities working in less than a week. I submitted the work as a PR to the codebase and understandably, it was rejected.

https://github.com/ggml-org/llama.cpp/pull/18404

https://huggingface.co/TrevorJS/Qwen3-Omni-30B-A3B-GGUF

11 comments

adefa

antirez 21 days ago

The refusal because often AI writes suboptimal GGML kernels looks very odd, to me. It means that who usually writes manually GGML kernels, could very easily steer the model into writing excellent kernels, and even a document for the agents can be compiled with the instructions on how to do a great work. If they continue in this way, soon a llama.cpp fork will emerge that will be developed much faster and potentially even better: it is unavoidable.

rjh29 21 days ago
The refusal is probably because OP said "100% written by AI" and didn't indicate an interest in actually reviewing or maintaining the code. In fact, a later PR comment suggests that the AI's approach was needlessly complicated.
- hirako2000 21 days ago
  
  Also because it's a large PR. Also because the maintainer has better things to do than taking longer and more energy to review than the author spent to write it, just to find that multiple optimisations will be requested, which the author may not be able to take on.
  the creator of llama.cc can hardly be suspected to be reluctant or biased towards GenAI.
  
  2 replies →
nickandbro 21 days ago

I wonder if some of the docs from https://app.wafer.ai/docs could be used to make the model be better at writing GGML kernels. Interesting use case.
nickpsecurity 21 days ago
[flagged]
- hirako2000 21 days ago
  
  Not sure we need to invoke Jesus to agree with the liability concerns.
  
  1 reply →
- user34283 21 days ago
  
  Complete non-issue in my experience.
  With usage on a daily basis since GPT-4 I have not once encountered a scenario where I was concerned about the output being complex enough and a verbatim copy to warrant such concerns.
  Generally it would seem statistically unlikely to reconstruct a copyrighted work, rather the output should be a probabilistic average. Snippets are typically too common and short to be protected by copyright. Copyright challenges are likely to fail on the "substantial similarity" test.
  I understand plaintiffs would need to show that code is virtually identical, not just similar, and that these parts represent a "substantial" portion of the original work's creative value.