Comment by bitbasher
10 hours ago
That doesn't make any sense. There's 10,000+ lines of code. There shouldn't be a single commit "Initial commit". I'm fine with squashing some commits and creating a clean history, but this isn't a clean history; it's obfuscated.
I do this all the time. I’ll spend weeks or months on a project, with thousands of wip commits and various fragmented branches. When ready, I’ll squash it all into a single initial commit for public consumption.
I also do this. Lots of weird commit messages because fuck that, I'm busy. Commits that are just there to put some stuff aside, things like that. I don't owe it to anyone to show how messy my kitchen is.
Does your makefile also do this https://github.com/xtellect/spaces/blob/422dbba85b5a7e9a209a...
This repo is full of so many strange and hilarious things. Look, I'm a lisper, and this is even too many parentheses for me https://github.com/xtellect/spaces/blob/master/spaces.c#L471...
On the other hand, others don’t have to adopt, use or like your stuff which would be the reasons to publish it.
One big commit definitely doesn’t help with creating confidence in this project.
> I don't owe it to anyone to show how messy my kitchen is.
There was once a time when sharing code had a social obligation.
This attitude you have isn't in the same spirit. GitHub (or any forge) was never meant to be a garbage dumping ground for whatever idea you cooked up at 3AM.
Never happened. My projects start with me goofing around and playing with things, accidentally committing my editor config or a logfile, etc. The first commit on my public release is a snapshot of the first working version, minus all the dumb typos and malcommits I made along the way.
I don’t owe it to anyone to show how the sausage was made. Once it’s out the door and public, things are different. But before then? No one has the moral right to see all my mistakes leading up to the first release.
It requires self-discipline to stay organized. A vcs is just a tool. I'm never organized, my brain just works that way. Whatever the tool, I'll create a mess with it. So as long as the project structure and its code are all good I can't care about anything else.
Explain why you think making a single commit is related to any source code sharing obligation? You completely failed to establish why making a single commit is indicative of it being garbage. Your statements are a series of non-sequiturs so far and thus I can't take you seriously.
that world never existed
I have done "Initial commit"s after having almost finished something, sometimes after >10k lines. This is totally unrelated to LLMs, as I was doing it years ago as well. I see why you would think what you do, but it does not logically follow.
It may have been released with a new repo created, losing all the previously-private history.
Yes and no.
Have you looked at the code? It was clearly generated in one form or another (see the other comments).
The author created a new GitHub account and this is their first repository. It looks to be generated from another code base as a sorta amalgamation (either through code generation, ai, or another means).
We're supposed to implicitly trust this person (new GitHub account, first repository, no commit history, 10k+ lines of complicated code).
Jia Tan worked way too hard, all they had to do was upload a few files and share on HN :)
> We're supposed to implicitly trust this person
That would be rather foolish even with a fully viewable history.
I don't understand why you're so worked up about this—nobody is forcing you to use the code.
> no commit history, 10k+ lines of complicated code
This kind of pattern is incredibly common when e.g. a sublibrary of a closed source project is extracted from a monorepository. Search for "_LICENSE" in the source code and you'll see leftover signs that this was indeed at one point limited to "single-process-package hardware" for rent extraction purposes.
Now, for me, my bread and butter monorepos are Perforce based, contain 100GB+ of binaries (gamedev - so high-resolution textures, meshes, animation data, voxely nonsense, etc.) which take an hour+ to check out the latest commit, and frequently have mishandled bulk file moves (copied and deleted, instead of explicitly moved through p4/p4v) which might mean terabytes of bandwidth would be used over days if trying to create a git equivalent of the full history... all to mostly throw it away and then give yourself the added task of scrubbing said history to ensure it contains no code signing keys, trade secrets, unprofessional easter eggs, or other such nonsense.
There are times when such attention to detail and extra work make sense, but I have no reason to suspect this is one of them. And I've seen monocommits of much worse - typically ingested from .zip or similar dumps of "golden master" copies, archived for the purposes of contract fulfillment, without full VCS history.
Even Linux, the titular git project, has some of these shenanigans going on. You need to resort to git grafts to go earlier than the Linux-2.6.12-rc2 dump, which is significantly girthier.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
https://github.com/torvalds/linux/commit/1da177e4c3f41524e88...
0 parents.
> It looks to be generated from another code base as a sorta amalgamation (either through code generation, ai, or another means).
I'm only skimming the code, but other posters point out some C macros may have been expanded. The repeated pattern of `(chunk)->...` reminds me of a C-ism where you defensively parenthesize macro args in case they're something complex like `a + b`, so it expands to `(a + b)->...` instead of `a + b->...`.
One explanation for that would be stripping "out of scope" macros that the sublibrary depends on but wishes to avoid including.
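To illustrate the parenthesization point above, here is a minimal C sketch. The `struct chunk` type, both macros, and `second_chunk_size` are hypothetical names made up for illustration; none of them come from spaces.c, and this is not a claim about how that code was actually produced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk type, for illustration only (not from spaces.c). */
struct chunk {
    size_t size;
};

/* Defensive form: the argument is wrapped in parentheses, so
 * CHUNK_SIZE_SAFE(base + 1) expands to ((base + 1)->size). */
#define CHUNK_SIZE_SAFE(c) ((c)->size)

/* Naive form: CHUNK_SIZE_UNSAFE(base + 1) would expand to
 * (base + 1->size), which does not even compile because `->`
 * binds tighter than `+`. It is only safe with a plain identifier. */
#define CHUNK_SIZE_UNSAFE(c) (c->size)

/* Returns the size of the element after `base` in an array of chunks. */
size_t second_chunk_size(struct chunk *base)
{
    return CHUNK_SIZE_SAFE(base + 1);
}
```

If a preprocessor pass expanded a macro like `CHUNK_SIZE_SAFE(chunk)` in place, the output would contain `((chunk)->size)`, which would be consistent with the repeated `(chunk)->...` pattern, though it doesn't prove that is what happened here.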
> We're supposed to implicitly trust this person
Not necessarily, but cleaner code, git history, and a more previously active account aren't necessarily meant to suggest trust either.