Comment by pmarreck
6 hours ago
Impressive model, for sure. I've been running it on my Mac, and now I get to have it locally on my iPhone? I need to test this. Wait, it does agent skills and mobile actions, all local to the phone? Whaaaat? (Have to check it out later! Anyone have any tips yet?)
I don't normally do the whole "abliterated" thing (dealignment), but after discovering https://github.com/p-e-w/heretic I was too tempted not to try it with this model a couple of days ago (I made a repo to make it easier, actually: https://github.com/pmarreck/gemma4-heretical) and... wow. It worked. And not having a built-in nanny is fun!
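If you want to poke at the result outside of Ollama, here's a minimal sketch of loading the abliterated checkpoint with plain transformers. The model path is just a placeholder for wherever your heretic run writes its output:

    # Load a heretic-abliterated checkpoint with Hugging Face transformers.
    # model_path is hypothetical; point it at your actual output directory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "./gemma4-heretical-output"

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

    inputs = tokenizer("Hello there.", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))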
It's also possible to make an MLX version, which runs a little faster on Macs but unfortunately won't work through Ollama. (LM Studio, maybe.)
Runs great on my M4 MacBook Pro w/128GB and likely also runs fine with 64GB... machines with less memory might require lower quantizations.
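If you go the MLX route, here's a minimal sketch with mlx-lm (pip install mlx-lm); the model paths below are placeholders, so substitute your actual checkpoints:

    # Convert once to MLX with 4-bit quantization (the lower-memory option
    # mentioned above), then load and generate. Paths are placeholders.
    from mlx_lm import convert, load, generate

    convert("path/to/gemma4-heretical", mlx_path="gemma4-mlx-4bit", quantize=True)

    model, tokenizer = load("gemma4-mlx-4bit")
    print(generate(model, tokenizer, prompt="Why is the sky blue?", max_tokens=200))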
I specifically like dealigned local models because if I have to get my thoughts policed when playing in someone else's playground, like hell am I going to be judged while messing around in my own local open-source one too. And there's a whole set of ethically justifiable but rule-flagging conversations (loosely categorizable as "sensitive", "ethically borderline but productive", or "violating sacred cows") that are now possible with this, at a level never possible before.
Note: I tried to hook this one up to OpenClaw and ran into issues.
To answer the obvious question: yes, this sort of thing enables bad actors more (as do many other tools). Fortunately, there are far more good actors out there, and bad actors don't follow the rules that good actors subject themselves to anyway.
I run MLX models with omlx[1] on my Mac and it works really well.
[1] https://github.com/jundot/omlx
Holy hell, how new is this? I've never heard of it, looks great!
> And there's a whole set of ethically justifiable but rule-flagging conversations (loosely categorizable as "sensitive", "ethically borderline but productive", or "violating sacred cows") that are now possible with this, at a level never possible before.
I checked the abliteration script and I don't yet understand what it does or what the result is. What are the conversations this enables?
LLMs are very helpful for transcribing handwritten historical documents, but sometimes those documents contain language or ideas that a perfectly aligned LLM will refuse to output: sometimes as a hard refusal, sometimes (even worse) by subtly cleaning up the language.
In my experience the latest batch of models are a lot better at transcribing the text verbatim without moralizing about it (i.e. at "understanding" that they're fulfilling a neutral role as a transcriber), but it was a really big issue in the GPT-3/4 era.
I have a project where I'm using LLMs to parse data from PDFs with a very complicated tabular layout. I've been using the latest Gemini models (flash and pro) for their strong visual reasoning, and they've generally been doing a really good job at it.
My prompt states that their job is to extract the text exactly as it appears in the PDF. One data point to be extracted is the race of each person listed. In one case, someone's race was "Indian". Gemini decided to extract it as "Native American". So ridiculous.
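For reference, my setup is roughly the sketch below, using the google-generativeai SDK. The file name, model id, and exact prompt wording are illustrative stand-ins, not my actual ones:

    # Upload a PDF via the File API and ask for verbatim extraction.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")

    pdf = genai.upload_file("records.pdf")  # placeholder file name
    model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder model id

    prompt = (
        "Extract the tabular data from this PDF. Transcribe every field "
        "EXACTLY as it appears, character for character. Do not normalize, "
        "modernize, or reword any value."
    )
    response = model.generate_content([pdf, prompt])
    print(response.text)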
1) Coming up with any valid criticism of Islam at all (for some reason, criticisms of Christianity or Judaism are perfectly allowed even with public models!).
2) Asking questions about sketchy things. Simply asking should not be censored.
3) I don't use it for this, but porn or foul language.
4) Imitating or representing a public figure is often blocked.
5) Asking security-related questions when you are trying to do security.
6) For those who have been through them: using AI to process traumatic experiences that are illegal to even describe.
Many other instances.
> Coming up with any valid criticism of Islam at all (for some reason, criticisms of Christianity or Judaism are perfectly allowed even with public models!).
When’s the last time you tried this? ChatGPT and Gemini have no trouble responding with all the common criticisms of Islam.
Questions about the manufacturing of biologics can be censored to an absurd degree. I don't know about Gemma 4 in particular.
Realistically, a lot of people do this for porn.
In my experience, though, it's necessary for doing anything security-related. Interestingly, the big models give me fewer refusals when I ask e.g. "in <X> situation, how do you exploit <Y>?", but local models will frequently flat-out refuse unless the model has been abliterated.
From what I've seen, Gemma 4 doesn't refuse much regarding sex; it just needs a little nudging in the right direction sometimes.
But it does refuse to be critical of the usual topics: Israel, Islam, trans issues, or race.
So wanting to discuss one of those is the real reason people would use an uncensored model.
The in-ter-net is for porn
That song is going to be stuck in my head all day now, lol.
I tried it on my Mac for coding, and I wasn't really impressed compared to Qwen.
I guess there are things it's better at?
You're comparing apples to oranges there. Qwen 3.5 is a much larger model at 397B parameters vs. Gemma's 31B. Gemma will be better at answering simple questions and doing basic automation; codegen won't be its strong suit.
Qwen 3.5 comes in various sizes (including 27B), and judging by the posts on HN, r/LocalLlama, etc., it seems to be better at logic/reasoning/coding/tool calling than Gemma 4, while Gemma 4 is better at creative writing and world knowledge (basically nothing has changed from the Qwen 3 vs. Gemma 3 era).
Gemma 4 31B is still not impressive at coding compared to even Qwen 3.5 27B. It's just not its strong suit.
So far Gemma 4 seems excellent at role playing and document analysis, and decent at making agentic decisions.
Haven't built anything on the agent skills platform yet, but it's pretty cool imo.
On Android, the sandbox loads an index.html into a WebView, with standardized string I/O to the harness via some window properties. You can even return a rendered HTML page.
Definitely hacked together, but it feels like an indication of what an edge-compute agentic sandbox might look like in the future.
>there's a whole set of ethically justifiable but rule-flagging conversations (loosely categorizable as "sensitive", "ethically borderline but productive", or "violating sacred cows") that are now possible with this, at a level never possible before.
Mind giving us a few examples of what you plan to run on your local LLM? I'm curious.
I'm not sure what you're angling at, but I already gave a set of questions that are ethically legitimate yet routinely censored by the public models:
https://news.ycombinator.com/item?id=47654013
Not to mention that the alignment the big model makers apply literally dumbs the model down.
They should at least let you prove your age and identity to get access to better/unaligned models, maybe even requiring a license of some sort. Because you know what? SOMEONE in there absolutely has access to the completely uncensored versions of the latest models.
I tried #1 and a few others with hypothetical situations; it looks like the public models answer them perfectly fine.
I'm tired of this concern trolling.