
Comment by muzani

4 days ago

I'm impressed. You guys cloned a whole suite of products in a short period of time that cost millions of dollars. Even the little bits of humor look costly.

On the other hand, it's way more information than I expected. I can see why someone would hesitate to release them - there's a lot to sift through and it's likely even the government couldn't sift through all of them to make sure their friends weren't mentioned somewhere.

Thanks! And it's a lot of info, yeah. ~90% of new data in yesterday's drop was photographs, which they redacted for us.

The House Oversight Committee's giant drop in November had tons of data we still didn't take advantage of even after doing the original Jmail, like flight logs.

For the Yahoo release, which is still ongoing, the folks at Drop Site News (see https://www.jmail.world/about) are handling the manual redaction which has been very time consuming, even with tons of AI to help in the background.

  • Would be nice to explain at some point how we did the structuring of the destructured data.

    For now we’re focusing on fixing the bugs because we’re already seeing an insane wave of traffic so most of us are focused on keeping the site alive.

  • I'm being snarky and this isn't such a serious comment and I don't really mean this for Gemini but can you imagine using something like Gemini ("Hi, please comb through this") and it just refuses on ethical grounds

  • But, whoever’s doing the redacting sees the original right? What prevents the redactor from saying, “here’s what the document really said.” Or “here’s who’s in the image, I saw it before I redacted it?”

    • Part of the law mandates that all redactions will be listed for Congress within 15 days.

    • That’s a good point. I would imagine they break it up into pieces - in a reCAPTCHA sorta way - and any given person sees a sentence or a piece of a sentence.

      An alternative would be to strip out all obvious known words and only leave unknowns (i.e., names) and then have those fragments reviewed (in a reCAPTCHA sorta way).

      Finally, for images, cover all faces and then, one by one, decide which should remain covered and which should not.

      LOTS of work but there are workflows to mitigate the ability for reviewers to connect more than they should.
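A minimal sketch of the "strip known words, review only the unknowns" idea above, under loud assumptions: `KNOWN_WORDS` is a tiny made-up vocabulary (a real pass would load a full dictionary), and `unknown_fragments` and `shard` are hypothetical helper names. Nothing here is claimed to be how the actual redaction was done.

```python
import re

# Hypothetical, deliberately tiny vocabulary; a real workflow would use a full word list.
KNOWN_WORDS = {"a", "and", "at", "dinner", "for", "in", "met", "on", "the", "to", "with"}


def unknown_fragments(text, known=KNOWN_WORDS):
    """Return the tokens that are NOT in the known-word list, in order of appearance."""
    tokens = re.findall(r"[A-Za-z']+", text)
    return [t for t in tokens if t.lower() not in known]


def shard(items, reviewers):
    """Deal items round-robin so consecutive fragments go to different reviewers."""
    return [items[i::reviewers] for i in range(reviewers)]


fragments = unknown_fragments("Met with Alice and Bob for dinner")
batches = shard(fragments, 2)
```

Each reviewer only ever sees an isolated token from their own batch, never the surrounding sentence, which is the property the comment is after.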

    • People who they think will do this don't get to be redactors. It's all about power and relationships, not technology.

    • Given how MTG went completely silent despite her high profile platform, I'm guessing the civil (or at this point, royal) servants don't want their families harmed.

    • I’d guess a first pass is done automatically? E.g., if a page mentions, say, Trump, just redact that whole page/paragraph/etc. So the people who have done the closer reading to redact further probably don’t actually know the scale of what was already redacted. Just a guess though.

> You guys cloned a whole suite of products in a short period of time that cost millions of dollars.

At the risk of stating the obvious, the functionality isn't actually cloned, only the UI. The actual code powering Gmail probably dates back to the late 80s or early 90s and has had several hundred thousand hours of work put into it. This is just a webpage that looks kind of similar.

I point this out only because I've seen people saying that software businesses don't have moats anymore because of this, which is taking away a completely false lesson.

  • Out of curiosity, would you explain what you mean by that? Google was founded in 1998 and writing a mail client isn't terribly complicated. Did they buy some code for Gmail from an older company? Is Gmail older than Google?

  • I mean, it is so obvious, and I find the use of the word "cloned" so weird, that I feel it needs to be said.

    The UI cloning doesn't feel exactly correct either; there are things that are slightly off.

    But I just find the "cloned" wrong, because obviously you cannot send an email from this account, you cannot log in to the service as Jeffrey Epstein, you cannot delete emails, create alerts based on searches, do actions on selected emails (create new tag, move under that tag)

    There are so many functionalities that are not cloned, because obviously they could not be cloned: they would make no sense for what this project is. So the praise for cloning so quickly makes me sort of mad.

    You could theoretically make something like this that allowed logging in, so you got a personalized Epstein mailbox and could do all that, and perhaps get more mails sent in as files get released, and perhaps create Google alerts on Epstein in the news etc. that would come in as mails, and maybe the code could put incoming news under the appropriate tags, etc.

    But until that time "cloned" is just very wrong.

    • For the holidays, they should at least implement a Shockingly Distasteful Jeffrey Epstein Christmas Card Meme Generator.

  • > The actual code powering Gmail probably dates back to the late 80s or early 90s and has had several hundred thousands of hours of work put into it.

    no. google did not exist until the late 90s.

    various forms of internet email sure did, but most popular mtas of the google era shared very little code with predecessors from the 80s and early 90s (maybe sendmail) and google almost certainly wrote their own from scratch.

    but your first point, that an archive browser that looks like gmail is not equivalent to a full tilt email service backend, is valid.

  • Why stop there, I'm sure you can trace Gmail all the way back to the Roman aqueducts.

    • I mean, I would really only include the code for things like:

      - Fetching email messages

      - Parsing email headers

      - MIME parsing

      - Converting the text of email bodies into UTF-8

      - Threading messages

      - Eliding reply text

      Given that the official story is that pb made the first version of Gmail in a day, does anyone actually believe that he wrote the code for any of those things in a day? If you honestly believe that I have a bridge to sell you.

      Wait till you learn that the source code in Chrome also predates the existence of Google.
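For readers unsure what a few of the listed steps involve, here is a minimal sketch using Python's standard `email` package. It covers only header parsing, MIME parsing, and converting a body from its declared charset; the `RAW` message and its addresses are invented for illustration, and this says nothing about how Gmail itself does any of it.

```python
from email import message_from_bytes
from email.policy import default

# Made-up message: RFC 2047 encoded subject, Latin-1 body sent as 8bit.
RAW = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: =?utf-8?b?Y2Fmw6k=?=\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)

# Header parsing: the default policy decodes encoded-word headers into str.
msg = message_from_bytes(RAW, policy=default)
subject = str(msg["Subject"])

# MIME parsing: pick the plain-text part (here the message itself).
part = msg.get_body(preferencelist=("plain",))

# Charset conversion: get_content() decodes the bytes using the declared charset.
body = part.get_content().strip()
body_utf8 = body.encode("utf-8")
```

Even this toy case leans on a lot of library code (encoded words, content types, transfer encodings, charsets), and it leaves out fetching, threading, and reply elision entirely.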

  • I don't know if I'm just misremembering but it feels like over the last three years or so the technical knowledge on HN has gone down the toilet.

  • [flagged]

    • >I decided to get the Max 20x plan, and prompting 4 projects with each 2 to 3 running 'conversations' , never hit the limit anymore.

      Can you expand on this please? Really cool btw.

> I'm impressed. You guys cloned a whole suite of products in a short period of time that cost millions of dollars. Even the little bits of humor look costly.

The cynic in me would assume that someone with a lot of money wants to hide some of the emails and the best way to do that (at this point) is to release them filtered with a great UI.

  • That’s not cynicism, it’s conspiracy theorism. That leads you to “the whole Epstein thing is a hoax designed to distract from what’s really going on”.

    • The thing I got from reading the majority of these emails is that the Epstein/Trump connection was not that strong in later years. I feel JE humored Trump to a degree and disliked him to an even larger degree. He may have had strong relations in the beginning, but he was NOT pleased that Trump was winning the presidency at all. He made multiple references to dirt on DT, and at one point there was even the question of whether Trump set him up. Not to say JE did no wrong, because the evidence is 100% there for that, but it's super interesting, having read the actual files, to see the various media spins on all sides. If anything, though, it's led me to believe there are much stronger ties between DT and Russia than I thought before (Palm Beach house, the casino, models coming from those areas, etc.).

Well there's only 2500 emails here. They definitely had time to sift through these to make sure friends weren't mentioned.

  • I read through 80% of them last night by myself. I mean, I didn't go to bed until 3am but spread across a handful of agents? yeah you could do it in an hour.

> it's likely even the government couldn't sift through all of them

How could you tell?

They also have a “promotions” tab listing all promo content. I wonder whether this is real or mock data.

> there's a lot to sift through

The total archive size is 300GB. AFAIK they have only released around 2GB. Curious what is in the rest of it assuming it does not get [redacted] out or deleted. I am also curious how they intend to release the rest of it in time to meet the requirements of the act. Discussion [1] Epstein Files bill sponsor Ro Khanna and Hassan, no dogs being zapped.

[1] - https://www.youtube.com/watch?v=KT2u0Fp3hQg [video][1hr12m]

  • > Curious what is in the rest of it

    Probably a lot of CSAM, if the Mossad blackmail op theory of Epstein is true.

"whole suite of products in a short period of time that cost millions of dollars."

But they just copied the UI, not the whole product.

> I can see why someone would hesitate to release them - there's a lot to sift through and it's likely even the government couldn't sift through all of them to make sure their friends weren't mentioned somewhere.

Jared Kushner, is that you?

> - there's a lot to sift through and it's likely even the government couldn't sift through all of them to make sure their friends weren't mentioned somewhere.

if only there were some kind of universal summary engine that never gets tired and is essentially free.