drevil-v2 10 hours ago [-]
The damage is done. You cannot build a business-critical function on top of an American SOTA frontier model. Especially not with the current crew in charge.
Now whether AI tech is in the same league as say Nuclear tech and therefore by any reasonable standard should be regulated is a different question.
We hit the slippery slope on a random day in June 2026 and there is no putting the genie back in the bottle. Any exec or manager that puts load-bearing weight on top of an Anthropic/OpenAI/Google/AmericanCorp frontier model deserves the stress.
reacharavindh 3 hours ago [-]
All this will fly until a competitor from outside the US releases a “freedom” model that is even 90% as capable as Fable was without its shackles.
But, as a frustrated EU resident lamenting the lack of a European option (Mistral is just not competitive enough), I will spread my money towards the Chinese models as well. Thank you Murica! You achieved your soft power by pushing us towards the Chinese :-)
This protectionism and hypocrisy (free markets and freedom!! Until it is us who needs to practice what we preach) is so tiring. I wish European nations would come together closer and put their differences aside and realise larger things together. Become the new power that the US is clearly stumbling away from being.
vintagedave 3 hours ago [-]
> Mistral is just not competitive enough
Does anyone know why? I was really excited when they emerged, but their models and targets don't seem to be quite in the same market.
xdertz 2 hours ago [-]
Their target market is completely different. Anthropic and OpenAI try to build general AI that wins on all the benchmarks by throwing ungodly amounts of money at it.
Mistral focuses on long term b2b contracts and their proposition is that they fine tune their model to your needs with an added bonus of 'not dependent on America' in a politically tumultuous time.
ghm2199 3 minutes ago [-]
And so, like, if a business wanted to home in on one very specific use case that could be hyper-optimized by SFT, with really good support for updating and adding new features, on-prem etc., that’s the kind of market they are in?
sajithdilshan 1 hours ago [-]
Most probably lack of capital and talent. At the end of the day they have to compete with other giants for the chips to train the models.
joelthelion 52 minutes ago [-]
I wouldn't be surprised if they had new models up their sleeve. Could be wrong of course.
VeejayRampay 12 minutes ago [-]
capital and talent are the same in this context
there's no shortage of talent in Europe or France, it's just an issue of available capital
mike_hearn 1 hours ago [-]
Lack of capital and (probably) lack of willingness to mass distill Anthropic and OpenAI.
plufz 1 hours ago [-]
What would happen if they mass distilled one of the really large local models like GLM 700b or deepseek 1.6t?
grim_io 1 hours ago [-]
At that point you might as well just host them yourself.
sajithdilshan 1 hours ago [-]
That's not how the innovation works
seviu 2 hours ago [-]
This. I have been using anthropic and codex subs, on max. All this changed in June. We are clearly entering an era where we cannot rely on American models. As a solo developer I value reliability over performance. I cannot pay hundreds of dollars, plus a lot of my private time figuring out how to properly use this technology, for it to be taken away within hours.
On top of that, the intelligence is being dialed down. Sonnet 5 is living proof of this. Fable has strong guardrails, but the new Sonnet is a dumbed-down, expensive model, which already falls behind GLM 5.2 and Kimi 2.7. I might go back to Claude since I know Fable is just a limited offer, and I am not going to pay for API usage. But what they are signaling with Sonnet will also come to Opus. A lobotomized, more expensive model.
I am honestly baffled how the current administration is giving the whole world, on a golden plate, to China. And they don't seem too bothered about it. They are living in their own bubble and reality distortion field I guess.
I could go on an endless rant about Dario, but I feel I am so strongly biased now that my judgement might be clouded.
Time to move on
Forgeties79 1 hours ago [-]
I feel like I see this comment every few months and yet in between people keep talking about all of the functionality they’re getting out of anthropic’s offerings. It doesn’t seem to me that people are willing to give up the “shackles” as it were and we’re just going to wind up with what we’re fearing here. On top of that, local models are just not turnkey enough for the average person yet (go ahead and drop somebody into LM studio and tell them to get to work, it won’t go well).
sajithdilshan 1 hours ago [-]
A lot of bitter Europeans would downvote this comment, but saying Murica has pushed you towards China is hypocrisy at its finest. Your incompetent EU politicians are the ones who have failed you by outsourcing every aspect of sovereignty to the rest of the world instead of pursuing self-reliance. You have nobody to blame but yourselves. In one year you'll be blaming China for abandoning the EU when they start controlling their frontier models.
grim_io 1 hours ago [-]
Maybe we will in a year, but then we'll just complain about China copying protectionism and censorship from the USA.
If that's a comfortable position for you, all good.
We held the US in higher regard, that's all.
sajithdilshan 1 hours ago [-]
> Maybe we will in a year, but then we'll just complain about China copying protectionism and censorship from the USA
yes, I guess the only thing the europeans are good at seems to be complaining.
> We held the US in higher regard, that's all.
That's your fault, and now you are making the same mistake with China.
grim_io 50 minutes ago [-]
We are just disappointed. You have to actually live there :)
I don't think anyone is making the same mistake with China, as open weight models can't be Thanos'd away.
Citizen_Lame 23 minutes ago [-]
Well you are not wrong, I would also add corrupt and cowardly politicians. We are in a worse position than China, and under full control of daddy USA, no matter what they say. If the US pulled the switch, it would be a catastrophe for the EU.
Even the premier EU companies such as ASML are heavily reliant on US supply chain.
But why can't we be bitter?
Aurornis 5 hours ago [-]
> The damage is done. You cannot build a business-critical function on top of an American SOTA frontier model. Especially not with the current crew in charge.
The switching cost of changing LLM providers is as low as it gets. All the individuals and startups I know try different models all of the time, even down to the level of choosing which provider to use based on the task. Bigger companies move slower but only because they have lawyers and teams negotiating contracts, not because there is a technical reason that it's hard to switch.
Companies have dealt with supply chain unpredictability by having multiple providers and switching between them since forever. It's infinitely easier to switch LLM providers than it is to deal with physical supply chain uncertainty.
PeterStuer 3 hours ago [-]
For real production I find the switching cost is not as trivial as you portray. Even going to a new model version in the same model family, say GPT-4o to GPT-5.2, a transition I just finished on a not too complicated application, requires extensive retesting and tweaking of prompts, guardrails and parameters.
sshine 2 hours ago [-]
I second this; even when switching between minor versions of a model, you need to adjust prompts: the new model is better at inferring a bunch of things on its own, so when those things are still spelled out in the prompt, it will overdo them.
Assessing quality of output is often not trivial, either. Typically, problems that are solved by offloading something to an LLM are super subjective, and relying on customers to “feel” that something is different leaves you vulnerable.
We try to quantify output differences by many different similarity metrics. But a lot of energy goes into subjectively evaluating if something still works.
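To make "similarity metrics" a bit more concrete, here is a minimal sketch of two cheap ones using only the Python standard library. The function names and the 0.6 threshold are illustrative assumptions, not anything from our actual setup, and metrics like these only catch surface-level drift; the subjective evaluation still has to happen on top.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], from difflib's ratio()."""
    return SequenceMatcher(None, a, b).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace-separated tokens, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty outputs count as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def flag_drift(old: str, new: str, threshold: float = 0.6) -> bool:
    """True when either metric drops below the (arbitrary) threshold."""
    return min(char_similarity(old, new), token_jaccard(old, new)) < threshold


if __name__ == "__main__":
    old_output = "the refund was approved"
    new_output = "please contact support"
    print(flag_drift(old_output, new_output))
```

In practice you would run this over a fixed set of prompts with the old and new model and hand only the flagged pairs to a human reviewer.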
anonzzzies 2 hours ago [-]
Maybe OP meant switching in a coding harness way? Not an application using AI? I had similar issues like you in the latter case, but in the former it's trivial.
jcims 1 hours ago [-]
Vendor diversity is a longstanding risk management principle. For it to work you need to invest in it as you build, not when the rug is pulled.
throwaw12 2 hours ago [-]
> The switching cost of changing LLM providers is as low as it gets
Not trivial, you would need to do lots of evals and prompt tuning when you switch models.
Imagine what happens when you optimize your agent skills for the current model, and a new model starts breaking them. You would need versioning for your skills, serving different skills based on the model while you do A/B testing.
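A rough sketch of what that could look like, assuming a simple in-memory registry. The skill name, model names, prompt texts and the `SKILLS` layout are all made up for illustration; the only real mechanism here is hashing a user ID into a stable A/B bucket.

```python
import hashlib

# Hypothetical registry: skill name -> model name -> prompt variants.
SKILLS = {
    "summarize": {
        "model-a": {
            "control": "Summarize in 3 bullets.",
            "candidate": "Summarize briefly.",
        },
        "model-b": {
            "control": "Summarize in 3 short bullets. Do not add caveats.",
        },
    }
}


def ab_bucket(user_id: str, candidate_share: float = 0.1) -> str:
    """Deterministically assign a user to 'control' or 'candidate'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1)
    return "candidate" if fraction < candidate_share else "control"


def select_prompt(skill: str, model: str, user_id: str,
                  candidate_share: float = 0.1) -> str:
    """Pick the prompt variant for this skill, model and user."""
    variants = SKILLS[skill][model]
    bucket = ab_bucket(user_id, candidate_share)
    # Fall back to control when this model has no candidate variant yet.
    return variants.get(bucket, variants["control"])


if __name__ == "__main__":
    print(select_prompt("summarize", "model-a", "user-42"))
```

Because the bucket depends only on the user ID, the same user keeps seeing the same variant while the experiment runs, which makes before/after comparisons less noisy.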
miki123211 4 hours ago [-]
Exactly!
Even if you won't be able to use some model tomorrow, you can still make money by using it today!
And in the age of limited compute, spiky workloads and constant outages, building a mechanism to fall back to a weaker model when your primary choice isn't available is smart anyway.
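For what it's worth, the core of such a mechanism can be tiny. This is a minimal sketch, assuming each provider is wrapped as a plain function from prompt to text; `primary` and `weaker` are stand-ins that simulate an outage, not real SDK calls.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an error."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider name, completion) of the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def primary(prompt: str) -> str:
    """Stand-in for the preferred model; always fails to simulate an outage."""
    raise TimeoutError("primary model unavailable")


def weaker(prompt: str) -> str:
    """Stand-in for a cheaper or weaker model that is still reachable."""
    return f"[weaker model] {prompt}"


if __name__ == "__main__":
    print(complete_with_fallback("hi", [("primary", primary), ("weaker", weaker)]))
```

Returning the provider name alongside the text matters more than it looks: you want to log which model actually answered, since the weaker one may need different prompts or extra checks.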
rob74 3 hours ago [-]
For many, that fallback mechanism is simply called Cursor - soon to be owned by Elon Musk. Which opens up a similar but slightly different can of worms...
GTP 2 hours ago [-]
Well, there are many alternatives to Cursor as well.
Sammi 9 hours ago [-]
I'm a small software business owner in Europe. I have to assume my competition is willing to pay for any business advantage they can get. And so I also have to pay for the SOTA model, whatever it is.
lelanthran 3 hours ago [-]
> I'm a small software business owner in Europe. I have to assume my competition is willing to pay for any business advantage they can get. And so I also have to pay for the SOTA model, whatever it is.
If you make money from doing anything like "produce software with as little human involvement as possible", then sure, you need SOTA models. In that case, though, the value you add is very little and you probably don't have a sustainable business.
OTOH, if you make money by getting clients to pay for features, there is very little difference in time-savings from using Anthropic/OpenAI SOTA over GLM-latest.
IOW, if your business can only make money by one-shotting software, you probably don't have a business in the first place.
Regards, another small business owner.
midasz 3 hours ago [-]
You also don't really need LLMs, we still have software engineers too. Everyone is focusing so heavily on the speed gain in producing code, but in my experience clients of established products aren't really waiting for massive changes and gigantic features to be added. We aren't taking the time to think things through anymore.
goyozi 2 hours ago [-]
> clients of established products aren't really waiting for massive changes and gigantic features to be added
In some cases they do. I work in a B2B vertical SaaS company and there are both features that competitors build and rough edges around our features that make clients go “either we get X or we sign with someone else”. I agree though with the general sentiment that you don’t need SOTA models to build those - humans or humans + a mid-pack strong model will do.
Sammi 2 hours ago [-]
I'm the only dev. I simply don't have time for dealing with the code from non-SOTA models. I'm doing all I can to keep this business afloat.
lelanthran 2 hours ago [-]
> I'm the only dev. I simply don't have time for dealing with the code from non-SOTA models. I'm doing all I can to keep this business afloat.
It sounds like your business is selling completely agent-coded products. I don't know how long that will be viable, or even if it is right now.
In my part of the world, I am completely unable to sell completely agent-coded products, so even a SOTA model is useless. The majority of my time is spent on analysis outside of coding anyway, so when I bill it's not based on how many lines of code I've added, it's based on whether the goal of the customer is satisfied.
rglullis 1 hours ago [-]
If you think your business depends on the ability for you to outspend the competition on LLM tokens, then you should cut your losses and shut it down right now.
SwellJoe 9 hours ago [-]
The good news (for you and most everyone other than the current leading AI companies) is that the gap between the SOTA and the near-frontiers is getting smaller every week or two. The leading Chinese models are only a few months behind now (GLM 5.2 tickles the tail of GPT 5.3 or 5.4 and Opus 4.6, according to benchmarks and the vibes among heavy users who've spent some time with it), where they were a couple of years behind a year ago.
rafram 7 hours ago [-]
4.6 was released at the beginning of February, so if the Chinese models only "tickle its tail," that means they're >5 months behind.
felipeerias 4 hours ago [-]
That comparison is also misleading because Opus 4.6 was probably not Anthropic's frontier model.
We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.
So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.
SwellJoe 4 hours ago [-]
The post-training process of a model that size is months, though it "works" before that. It is a big chunky model before it's released to the world and probably does some amazing things, sometimes...but, it wasn't done (else why wouldn't they release it and soundly trounce their competitors). I would assume that Chinese AI companies have a pipeline and what we see is a couple/few months behind their newest model, as well. Like, the new base model is cooked, but they're still plating it for service.
Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?
trvz 7 hours ago [-]
> The leading Chinese models are only a few months behind now
PeterStuer 3 hours ago [-]
I hear that often, but what does that even mean? I am a great proponent of open weights models. I do believe they are the only reason we have not stagnated into a collusion of halting (public) model releases.
But exactly which point in time of claude.ai is z.ai being compared to? Consistently being "6 months behind" in an exponentially accelerating evolution means the gap is growing exponentially wider, not staying constant.
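A toy calculation of what I mean, under the loud assumption that "capability" is a single number that doubles every 6 months (nobody knows the real curve, and the doubling time here is invented). With a fixed lag the absolute gap keeps growing, while the ratio between leader and follower stays the same.

```python
def capability(t_months: float, doubling_months: float = 6.0) -> float:
    """Toy capability curve that doubles every `doubling_months` (an assumption)."""
    return 2.0 ** (t_months / doubling_months)


def gap(t_months: float, lag_months: float = 6.0) -> tuple[float, float]:
    """Return (absolute gap, ratio) between a leader and a follower lagging by lag_months."""
    leader = capability(t_months)
    follower = capability(t_months - lag_months)
    return leader - follower, leader / follower


if __name__ == "__main__":
    for month in (12, 24, 36):
        absolute, ratio = gap(month)
        print(f"month {month}: absolute gap {absolute}, ratio {ratio}")
```

So whether "6 months behind" is a widening or a constant gap depends on whether you measure the difference or the ratio, which is part of why the phrase is so slippery.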
SwellJoe 6 hours ago [-]
What range of numbers do you believe "a few" represents?
mlyle 5 hours ago [-]
Opinions vary, but:
A couple: usually 2, though not always
A few: 3, 4, 5
Several: 4, 5, 6, or 7.
marcus_holmes 4 hours ago [-]
> A couple: usually 2, though not always
I had to explain this to my German friend. In my understanding this isn't about the actual number, it's about the certainty. If it's absolutely and definitely two, then I say two. If I'm uncertain but it's probably two, or if a non-integer, somewhere around two, then I say couple.
And few is more likely to be 3 than 5, because 5 is getting close to a "half-dozen or so", or (as you say) several.
Many is very context-sensitive, as the meme has it.
So I would agree that the open models are a few months behind, definitely more than a couple of months behind, possibly several months behind, maybe a half-dozen months or so behind, but not many months behind.
cassianoleal 3 hours ago [-]
In the UK, as far as I can tell, a couple are 2. Not around 2. Not maybe 3 or 4. Always 2.
3 or 4 would likely be a few, or some. 1 is, well, one.
jonathrg 2 hours ago [-]
Several and a few are the same number, they only differ rhetorically.
pelagicAustral 36 minutes ago [-]
What's the leading Claude Code competitor model over in China?
Sammi 2 hours ago [-]
So I keep hearing.
dansquizsoft 5 hours ago [-]
Another day, more cope on this subject from many posters on here...
Der_Einzige 5 hours ago [-]
This is nonsense.
The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.
China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg claude tokens, because they're that much worse.
SwellJoe 3 hours ago [-]
Ah, well, if Anthropic says their competitors are ten months behind...
I don't know what I was thinking.
brailsafe 5 hours ago [-]
> is estimated at 10 months by Anthropic themselves, and it's growing.
How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.
gck1 32 minutes ago [-]
> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.
#1 I've had use cases where it was clearly obvious the Chinese models were behind.
#2 I've also had use cases where I couldn't tell a difference at 1/20th of the price.
The problem is - #1 is the use case where the American frontier is gated behind saboteur classifiers, and it is a tiny minority anyway. The vast majority of work is #2.
The gap doesn't matter anymore.
InsideOutSanta 2 hours ago [-]
> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.
There's a lot of subjectivity in determining this, but I'm 100% sure that 10 months is wrong.
I don't know whether the gap is currently growing, but I'm not sure it matters. There are thresholds where models reach certain levels of usefulness. Opus 4.8, for example, is at a level where I can give it relatively vague input, and it can go for half an hour on its own and produce a high-quality PR.
If GLM reaches that level of capability and can do that task more cheaply than Anthropic's model, I will use GLM for that task, because that's a specific type of task I use models for. It doesn't really matter whether Anthropic also has a better model, because what does "better" mean in this context? It's a clearly defined task, and Opus 4.8 already does it at a very high level of quality.
marcus_holmes 4 hours ago [-]
Here in Australia the sudden withdrawal of Fable made all of us think hard about models and harnesses.
I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model in the last few weeks.
If the USA wanted to shoot its AI industry in the foot it achieved its goal.
bel8 5 hours ago [-]
If Anthropic themselves say competition is 10 months behind, it's probably 5 or less.
And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs when they probably dwarf Anthropic's usage and are widening the gap due to conquering a chunk of Western market too.
I know it's hard for some to comprehend there's an entire Eastern hemisphere in the globe with billions of people, so it's worth reminding. And some even seem to think the world is basically Silicon Valley.
Chyzwar 2 hours ago [-]
Because Claude subscription tokens are cheaper than DeepSeek and friends. You have a whole industry of people reselling Claude subscriptions in China.
Can you comprehend that Anthropic is winning because it is both cheap (subscriptions) and the better SOTA? People are cheering Chinese providers when in reality they would rugpull open weights the moment they are competitive.
Chinese models are trash, that's why they are giving them away for free.
For individuals and small companies subscriptions are the best deal; for big companies Chinese models are a big no unless they can host them.
hk__2 4 hours ago [-]
No you don't; it's often overkill to use the SOTA models. People want SOTA because it's shiny, but there are a lot of tasks where it's cheaper and more efficient to use other models.
jiggawatts 3 hours ago [-]
> but there are a lot of tasks where it's cheaper and more efficient to use other models.
Sure… but which ones? How can you know ahead of time?
I just did a “simple” upgrade project where both me and the AI kept tripping over dead code, subtle typos, and difficult-to-trace live versus dead code.
Many times when I used “Medium” thinking I got bitten, but not every time, and I couldn’t predict when.
So “Extra high” it was, for the entire project.
Far fewer nasty surprises!
PeterStuer 3 hours ago [-]
In my experience: anything of open-ended complexity (software development, research, product design, ...) benefits from whatever the frontier can offer. 95% of Line of Business automation and workflows can be handled by even a reasonably small open weights generalist model flanked by a few even smaller specialized models. Yes, designing such a setup takes more knowledge and work than just chucking it all over the API with prompts. But that is how I can run a system here for <$30/month vs >$1,000/month. As an added bonus, no model server can shut me down at the drop of a hat.
Sammi 2 hours ago [-]
Exactly. I simply don't have the time to deal with non-SOTA model output.
parodysbird 9 hours ago [-]
This is a great recipe for going out of business.
adrianmonk 8 hours ago [-]
If the competitive risk is real, then you are choosing between supplier risk (AI model access) and competitive risk.
When there isn't a zero-risk option, the question becomes which risk is smaller.
unknownfuture 5 hours ago [-]
> If the competitive risk is real
Yes.
If.
Man I hope this tech FOMO eventually stops.
Companies generally fail because either their product doesn't meet a market need, or the market doesn't exist in the first place (possibly because of bad timing), and not because their competitors simply outran them.
These aren't things fixed by using a frontier model to vibe code faster in lieu of one 5 months behind.
slim 6 hours ago [-]
You can compete by being smart and using less-than-sota models and build a more solid business around them
Sammi 2 hours ago [-]
I use whatever model is SOTA. I switch between them in order to avoid lock in.
lelanthran 1 hours ago [-]
>I use whatever model is SOTA. I switch between them in order to avoid lock in.
What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?
KronisLV 33 minutes ago [-]
Not sure about OP, I usually make Opus 4.8 on Extra thinking level implement features for me on a specific project, while I'm busy with other stuff.
For a change, I let DeepSeek V4 Pro implement it on Max thinking level. Nothing too out there - some DB migrations, some Django back end changes and Vue SPA front end changes.
Implementation time in total including tests was a few hours, so nothing too egregious. However, one of the migrations would break with pre-existing data, one of the column references in the entity was wrong, the API endpoint wasn't made consistently with the others in adjacent code (e.g. permission checks) and the front end had a Pinia state related issue and submitting one of the forms didn't work.
Tooling was run: ruff, ty, Oxfmt, Oxlint, and the Docker build was green across the board, but the overall feature just didn't work. In both cases, sub-agents with clear context would review the code for serious/critical issues, at least three in parallel, and do review loops until they spotted nothing. Both harnesses have LSP integration.
Opus spent another hour fixing it, needed a few iterations, because I couldn't be bothered there.
> What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?
The difference largely was not needing to waste time in fixing all sorts of subtle bugs that sub-optimal models will produce, worse yet if it was some sort of a serious project and those wouldn't have been spotted but instead that slop would have gotten shipped.
That said, Opus isn't ideal either and messed up a whole bunch when I was training some neural nets and trying to process a bunch of satellite data and configure Garage to store them so that tiles can be served from a slow HDD and stuff like that. Obviously, it also needs a lot of babysitting in regards to UI looks, but it's better at the rest of development.
I think that DeepSeek V4 Pro and GLM 5.2 are cool though, it's just that you want as many checks and tests as you can throw at any given problem, or to use languages that make shipping completely broken code less likely.
jasondigitized 8 hours ago [-]
Any competitive business will accept this risk if it gives them any type of edge no matter the duration of that edge. This is no different than using an exotic raw material.
benjaminwootton 7 hours ago [-]
Every big business in the world biases towards risk reduction and cost reduction over getting an edge.
eru 4 hours ago [-]
Different businesses have different biases.
rogerrogerr 8 hours ago [-]
Eh, this isn’t really how businesses operate. How many businesses refuse to give devs large-spec machines? That’s very clear positive ROI.
I think it’s excessively charitable to assume businesses are uber-competent ROI-chasers. The expense people are eventually going to win on AI too, this blip of unrestricted AI budgets will be gone soon.
halfmatthalfcat 9 hours ago [-]
And thus, capitalism continues to roll on. Businesses are supposed to go out of business, it's a feature.
whatever120 8 hours ago [-]
they’re not supposed to, they’re just able to
teleforce 8 hours ago [-]
Nearly spit out my coffee, thanks for the chuckle.
w8vY7ER 7 hours ago [-]
It’s ok to be amused, absent exaggeration. Spit takes happen in sitcoms.
Retric 6 hours ago [-]
They do happen in real life.
They are overused in sitcoms because it’s easy for actors to mimic on demand unlike several other reactions.
pmontra 5 hours ago [-]
I don't know if you write software for your own products or if you code for your customers. Anyway, are you going to compete on the speed of your code writing AI or on deploying the features your customers need? One useful feature is better than a hundred that nobody really cares about. And a good relationship with customers is better than any feature.
Example. Yesterday I listened to the technical lead of a customer of mine digging himself into a hole by not understanding what it would mean to expose AWS EFS to their on-premises server over NFS. It was just too many unknown unknowns for him and he had no time to ask the AI (and even if he did I'm not sure that he could understand). His boss, who had actually used NFS, had to stop him. I didn't speak a word.
So, he could have coded the migration of a server from AWS to on premise, asked Claude to write also all the configuration scripts and policies but then what?
Sammi 2 hours ago [-]
I'm making a micro SaaS product. Code quality and code production speed are actually both super important. I don't have the time for non-SOTA model output.
ZeroGravitas 3 hours ago [-]
For businesses where this is true, they also need to be able to switch provider quickly in case the best provider changes.
It's almost identical to the possibility of one model getting shut down for a business that doesn't care about SOTA.
Sammi 2 hours ago [-]
Yeah I have both the Claude and Codex 100 dollar subscriptions and I try to use both. I also keep the 20 dollar Cursor subscription as there I can play around with everything. I also refuse to use any harness specific features. Claude is particularly annoying with this in that it's the only one that doesn't respect open config standards like .agents/skills
jdlshore 9 hours ago [-]
What concrete business advantage are you getting from LLMs?
echelon 9 hours ago [-]
Speed.
K0balt 9 hours ago [-]
This x10. I don’t understand how people are saying you can’t use LLMs to get crazy productivity gains. If you can’t write quality code with LLMs at ludicrous speed, you’re holding it wrong. You will have occasional bad days and regressions. But overall you’re still going to be able to 4x your progress.
cedws 8 hours ago [-]
I have plenty of experience with LLMs and use them daily but definitely wouldn't call generated code "quality code." Often looks like complete vomit.
K0balt 6 hours ago [-]
That’s kinda what I mean. Maybe it only works well in some languages, but the harness I built for C and C++ does a fantastic job of adhering to very strict architecture and style guides. Way cleaner, more readable, better factored, and more interpretable than human generated code, except maybe one or two devs I have worked with. YMMV I guess?
TBF I do burn 200k tokens just preloading the context with onboarding, not including any code, just document trees of development policy documents, style and architectural standards, code and documentation review processes, company ethos and culture, etc. It’s a token fire, but it really works for us.
Also, documentation driven development all the way down.
satvikpendem 7 hours ago [-]
If you're an enterprise (including startups), you worry about customers, not code quality. There are famously many startups that gained traction despite shit code and then eventually got around to fixing it, to whatever extent was possible, like Facebook HHVM, Stripe's Sorbet, etc.
watwut 4 hours ago [-]
Startups failed because they could not untangle their own code after 4 months. Literally true stories (plural).
lelanthran 1 hours ago [-]
> Startups failed because they could not untangle their own code after 4 months.
That's rare, though. If they could not untangle their own code after 4 months, it's because they were not making enough money to pay a team to untangle it - that's not a code problem, it's a revenue problem.
IOW, the startup failed because their revenue was too low.
satvikpendem 3 hours ago [-]
There are orders of magnitude that failed because they did not solve the right customer problem. Code quality is merely incidental the vast majority of the time.
wonnage 7 hours ago [-]
[dead]
NortySpock 8 hours ago [-]
Ok, and? You can live with that if there are more important things to deal with.
I've stared at ugly LLM code that I had just had generated and that worked well enough for my purposes (generally, some quick recursion into a nested Python dictionary in order to dig out some property, especially for linting or quick data analysis).
And I wanted something better, sure, something a bit more readable ...but I just needed it to work well enough to recurse through a yaml file for config file linting, not be battle-hardened against every test case.
So to deal with the mess, I shoved it in a pure function, threw a few basic sanity unit tests around it, put a comment with a disclaimer of "#this is LLM generated code, it is lightly tested, do not use it for anything truly load-bearing without a lot more tests" and I moved on to something else.
Not everything has to be bulletproof.
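For the curious, the kind of thing I mean looks roughly like this. It's a cleaned-up sketch, not the actual code; `find_key` and the sample `config` are invented for illustration, and it assumes the YAML has already been parsed into plain dicts and lists.

```python
from typing import Any, Iterator


def find_key(data: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            yield from find_key(v, key)  # keep descending into the value
    elif isinstance(data, list):
        for item in data:
            yield from find_key(item, key)
    # Scalars (str, int, None, ...) have nothing to descend into.


# Made-up config, as it might look after parsing a YAML file.
config = {
    "service": {"timeout": 30, "retries": {"timeout": 5}},
    "jobs": [{"name": "lint", "timeout": 10}],
}

if __name__ == "__main__":
    print(list(find_key(config, "timeout")))  # prints [30, 5, 10]
```

Since it's a pure function with no I/O, the "few basic sanity unit tests" are a handful of asserts on small dicts, which is about all the ceremony this deserves.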
csallen 7 hours ago [-]
You're on Hacker News. This is a site full of developers who are convinced that "proper software engineering" is 100% of what makes a business successful, and everything and everyone else is useless. You can't just waltz in here and point out that code in business is a means to an end and expect not to get downvoted.
Schiendelman 6 hours ago [-]
As a technical product manager, this 1000%. It's just irrelevant how bad code is unless it impacts the business.
AdieuToLogic 6 hours ago [-]
> As a technical product manager, this 1000%. It's just irrelevant how bad code is unless it impacts the business.
If you are, in fact, "a technical product manager", I would hope you understand that "bad code" is identified as such specifically because it "impacts the business."
Schiendelman 5 hours ago [-]
That is not how most engineers define bad code.
AdieuToLogic 5 hours ago [-]
> That is not how most engineers define bad code.
The engineers I have worked with most definitely define "bad code" as having intrinsic limitations and/or latent defects which impact successful system functionality/operation. Indicators provided to stakeholders such as yourself which support this assessment include, but are not limited to:
- the system doesn't work that way
- the system lacks test coverage, so changes take longer
- adding feature "X" is not feasible
- there is no repeatable way to onboard team members
- the backlog grows exponentially
- that "one point task" is going to take a couple weeks
All of the above impacts a business.
It is up to you, the "technical product manager", to understand what your team is trying to tell you.
Schiendelman 5 hours ago [-]
Please stop being rude to me. I'm a human being, I'm a very experienced product manager and engineer (you can google my name, I'm the only one), and the way you are behaving sucks.
Everything you're saying is true, sometimes. Assume I'm still right, and that you might be able to learn something from someone else.
I do not see how I was being rude, unless it was my use of quotations around the title you claim.
> I'm a human being ...
I did not doubt this.
> ... I'm a very experienced product manager and engineer ...
Again, if it was my use of quotations which you found to be rude, then I do not know what to say about that.
> ... and the way you are behaving sucks.
I respect your perspective and support your right to express yourself. And no, I do not think you are being rude by doing so.
> Assume I'm still right ...
Why would I? You responded to:
>> This is a site full of developers who are convinced that "proper software engineering" is 100% of what makes a business successful, and everything and everyone else is useless.
With:
> As a technical product manager, this 1000%.
Finally, you write:
> ... you might be able to learn something from someone else.
Maybe you can learn something from someone else as well.
nomel 6 hours ago [-]
This is something I wish I understood sooner. There is strong merit to "good enough".
Of all the "concise" and "beautiful" code I worked hard to produce, I was the only one to ever lay eyes on it. It didn't actually matter, and nobody cared but me. The people in charge of my raises could never perceive quality of code, because it wasn't their area of expertise. They only cared (rightly so) that it did what it was supposed to, and all the elegant abstractions didn't practically help that purpose. It was, literally, wasted life that I should have spent just getting off work early, like most of my colleagues.
echelon 7 hours ago [-]
Every bit of code written in the last 50 years is going to be meaningless.
People need to get to grips with that fast.
Distribution, relationships, processes, mindshare, marketing, and politics matter. Code is just ephemeral glue and implementation detail.
ses1984 6 hours ago [-]
Not every bit of code is going to be meaningless.
Just 99.999%.
slopinthebag 6 hours ago [-]
Lmao. Have more respect for your elders, who wrote all the code that your ai psychosis is fuelled by.
echelon 5 hours ago [-]
Every single thing around you was pioneered by people who are dead and forgotten. From the materials science of the clothes you wear, to the very language you speak.
Get over yourself. We're all ephemeral, dead and recycled in the blink of an eye. Our species doesn't even clock on the geologic timespan.
If you think your code (or any of your artifacts or possessions) matter beyond their immediate utility, you're mistaken. Work will either fall into disuse or be replaced. It's scaffolding for what comes next along a well-traversed path.
hexasquid 2 hours ago [-]
Dr Manhattan
matheusmoreira 5 hours ago [-]
I measured an ~8x increase in my project's commit count after AI, and I'm painstakingly reading, reviewing, understanding and editing everything the models write. It's gotten to the point I'm trying to slow down in order to let the new knowledge crystallize. I'm manually writing articles about what I'm doing as I go.
I can only imagine what people are doing at their jobs with unlimited token budgets.
lelanthran 1 hours ago [-]
> I measured an ~8x increase in my project's commit count after AI,
That's irrelevant. What's the increase in revenue?
matheusmoreira 1 hours ago [-]
I'm a hobbyist. My revenue will only increase if my work somehow lands me a job at some point.
amoss 3 hours ago [-]
Kind of weird how LoC has become a metric for people to chase again.
matheusmoreira 2 hours ago [-]
In my case it was commits, not lines of code. I wasn't chasing after it, I just asked Claude to calculate some statistics after a month or so of AI usage.
It's not just statistics either. I know for a fact that I made major progress by using LLMs. Here's a summary from around a month ago:
AI is world changing technology as far as I'm concerned.
baq 5 hours ago [-]
You don’t have to imagine, listen to Boris publicly saying how he works with these things, and it’s safe to assume others do it similarly or better.
cjbgkagh 8 hours ago [-]
I wonder if the people getting 10x productivity gains are spending less time on HN and more time tending to their agents. Personally I now spend so much time productively arguing with agents that it feels like an utter waste of effort arguing with humans, if people can't see the value in LLMs by now I'm not sure what I could say to change their minds.
vhantz 7 hours ago [-]
We must then assume you're not getting those 10x gains
cjbgkagh 7 hours ago [-]
Less time, not zero time. I still argue with humans for sentimental reasons.
dboreham 8 hours ago [-]
Definitely enjoying the lack of eye-rolling, being asked to explain obvious things multiple times, and stopping things being done for resume-stuffing reasons.
0xy 8 hours ago [-]
There's a small minority of people who are adamantly refusing to change, as there is in every technological revolution. Ego prevents them from even wholeheartedly trying the tool, because it would be an admission they were wrong.
The opportunities available for these people are rapidly, rapidly shrinking. I believe it's possible to be a developer today who's EXCEPTIONAL and never uses AI. Most opponents are not exceptional, though, and even these opportunities are shrinking.
Most exceptional developers in my org adopted AI in their workflows and went from 10x developers to 20x developers.
If you refuse to adapt, you're going to be out of a job complaining about the kids and their newfangled technology REAL quick. You have a few years remaining, maybe less.
drdexebtjl 7 hours ago [-]
I can’t turn 10x work into 20x work because I have to ensure the two juniors in my team who are now creating 50x work won’t merge complete garbage, reviewed by another engineer that has already given up on caring.
I can’t turn 10x work into 20x work because my Product Manager thinks changing fundamental premises of tasks I already spent two weeks on (mostly removing human blockers) is very simple. After all, when he asked Claude to update his prototype, it only took it 10 minutes.
I can’t turn 10x work into 20x work because the company dedicated entire teams to write company-wide skills for everything. They suck, but if I don’t use them, I’m not following the new “golden path for engineering”, and I lose points in my performance review.
I can, however, turn 10x work into 20x work, or even much more than that, if AI actually did what it’s promising and eliminated most of my team, the product manager, and the middle managers. Or me. I could use a break.
dwaltrip 7 hours ago [-]
Damn, that sounds quite rough.
llama052 5 hours ago [-]
[dead]
dolebirchwood 6 hours ago [-]
What about the 6x developers? Was there just a doubling multiplier across the board, resulting in them becoming 12x developers, or did they too become 20x developers?
AdieuToLogic 6 hours ago [-]
>> What concrete business advantage are you getting from LLMs?
> Speed.
Speed of what?
Speed of understanding what needs to be done? I highly doubt it.
Speed of LoC checked into git? Sure, I'll give you that.
But one can use any number of tools to generate hundreds of thousands of lines of code. See any build tools which support specifications such as RAML, OpenAPI, CORBA, etc.
So I ask again; speed of what?
jitl 5 hours ago [-]
fixing minor bugs takes one slack message for us now. bugs go down, goodness go up.
fixing more serious regression also easier. connect honeycomb mcp, ask agent to debug while i walk to coffee and get some pistachio rose dates. by time im back with my oat latte ive got a full report on what happened and can send the next slack message to fix.
life is good
sixothree 5 hours ago [-]
I needed to deeply understand a code base I had no experience with in a language I don't normally use with what I would describe as haphazard documentation at best. You can't argue with the speed at which I gained the required understanding of the project.
echelon 5 hours ago [-]
In the time it took you to type that, your hourly market comp went down another basis point.
I am appalled none of this is clicking with you anti-AI folks. This is all so exciting -- alarming even! --, and software careers are never going to be the same.
I don't know how you just metaphorically stand there and act like nothing at all is happening. We've never seen anything like this in our entire lives.
Some of you are standing right in front of the steam roller, yelling to all of us that steam rollers aren't real.
CookieCrisp 5 hours ago [-]
Very very fast steam rollers.
AdieuToLogic 5 hours ago [-]
Nice strawman[0], but you avoided answering my core question:
Speed of what?
With ad hominems and a non sequitur. How about I narrow the question with the hope it engenders a relevant response:
How do LLMs increase the speed of a person understanding
what needs to be done?
A: The sky is blue!
B: No it's not.
A: Yes, it is, please look up.
B: No, you must prove it to me through reason.
A: But, if you would just pretty please look up.
B: No.
I run a company, I've been running it for 10 years, we do alright. I'm a shitty manager. Every time I've hired developers, the business freezes. The business isn't anything super important, the main consequence of bugs is that my family loses money. Everything has always rested on my shoulders. In theory there is some path for me to become a good manager, but I never landed on it. But now, with Claude, it's great. So far Claude has paid itself off in real profits at least 20x over, and that's with significant API usage on top of the monthly sub. I can prototype new features in an afternoon that before were on my giant list of "maybe somedays if I ever get to breathe" list. Our user experience has improved in so many ways that I knew were probably worth it, if I could just find the time. Now I can.
There are situations where yeah, it probably isn't ready yet. But, there are so many where it's amazing. Seriously, it's worth looking up.
dgellow 4 hours ago [-]
You’re just plain wrong to assume people against agentic development do not have experience with the technology
CookieCrisp 4 hours ago [-]
I think there are many valid reasons to be against them - I think a lot of them are more right than wrong. It’s the “It can’t really do much” that I think must be from people that haven’t really tried it.
AdieuToLogic 4 hours ago [-]
This is a great case for the benefits of using GenAI, in that you already possess an understanding of what you want to achieve. You know what it is you want to prototype, what is on your "giant list of 'maybe somedays if I ever get to breathe' list", what you want to end up delivering.
My point is and remains:
A) GenAI did not give you this understanding.
B) GenAI can only assist in your expressing this
preexisting understanding.
C) GenAI is a statistical token (text) generator and
cannot, by definition, "make" a person understand
what they want/need to do.
CookieCrisp 4 hours ago [-]
Ideas and functionality beget more ideas and functionality
llama052 5 hours ago [-]
Did you use an LLM to write this for you? How odd.
For all of you people who think these LLMs are “earth shattering”, how the hell do you reconcile that with the fact that it’s not a net positive for anyone but those who want to consolidate knowledge and power?
We are really looking at idiocracy in the making.
tskj 1 hours ago [-]
I guess I'll chime in as someone who thinks LLMs will be earth shattering, and specifically doesn't think it's a net positive for anyone but those whose power will be consolidated.
nmfisher 6 hours ago [-]
From my brief window of Fable usage, speed wasn't its strong point at all.
For actually building software, I'm starting to suspect a human with a dumber (but faster) model is going to get the job done quicker than Fable (and possibly even cheaper). Bug-finding and vulnerability detection is a different story.
baq 5 hours ago [-]
I’d say you tried on an insufficiently complex codebase. I’ve tried on a MLOC+ and the results were excellent compared to anything else.
nmfisher 4 hours ago [-]
Not saying the results were bad - quite the opposite. But it was very slow (and if I was paying API rates, hideously expensive).
TylerE 3 hours ago [-]
My conclusion was the exact opposite. Maybe each individual response was slower, but it took so many fewer round trips to get what I wanted. I had a project Fable was progressing steadily and correctly on. Opus on the same project keeps handing me garbage it insists is working and meets the stated requirements, but isn’t and doesn’t.
Sammi 2 hours ago [-]
And quality if you know what you're doing.
erikschoster 7 hours ago [-]
Drawing debt
echelon 7 hours ago [-]
We'll just rebuild stuff when we get new requirements. The models will be even faster and better for the next version, anyway.
ferrouswheel 4 hours ago [-]
If you can't figure out what model to use your business is already dead.
cedws 8 hours ago [-]
This thinking that every task must be stuffed into the most 'advanced' (expensive) model out there is idiotic, and it's not only you unfortunately.
At $JOB I have warned higher ups we should try to keep our expenditure under control, educate people that document slinging doesn't require Fable every time and demo the capabilities of the cheaper models, and been snubbed for it. When Fable is available once again our bill is going to be eye watering, relative to what it should be.
Sammi 2 hours ago [-]
If I am working on something simple and want the speed boost then I'll drop the thinking to low or minimal and still get the SOTA model output quality.
But for what I work on I mostly need high or xhigh SOTA model quality output. I don't have the time to deal with anything less.
fakedang 6 hours ago [-]
This! I've found that for most coding, Sonnet is pretty good as it is. Yeah, you might need to finesse your prompt a bit more, and you'll probably be spending a bit more time on the computer, rather than a more hands-off approach, but at the end of the day, you'll save a lot more simply because you're using a good-enough model.
If you're the one-shotting type, obviously then Fable might be useful, but I think only marginally. You don't need to bring a MANPADS to a duel at high noon.
baq 5 hours ago [-]
Sonnet is dogshit at coding unless you eval the exact niche to be fine and still watch it like a hawk.
codybontecou 9 hours ago [-]
Unless you have concrete evidence via evals that SOTA is actually needed, you’re just buying into the hype.
brazukadev 9 hours ago [-]
do you think your current operation and niche is so optimized that not using Fable would put you out of business? Or is this a hope that using Fable will allow you to stay in business?
cjbgkagh 9 hours ago [-]
I am on track to commoditize my niche industry, and I hope I can do it before anyone else beats me to it. I'm working at panic speeds.
dgellow 4 hours ago [-]
So, no moat right?
raverbashing 3 hours ago [-]
Reducing your costs is also an advantage, but I'm not surprised such binary thinking is present here
zombot 5 hours ago [-]
So the panic generators ("You will be left behind!") are winning. Creating a sense of urgency that makes you switch off the higher rational functions is a key element in every successful scam.
1over137 9 hours ago [-]
Nonsense. Do you buy state of the art pens, pencils, printers, paper, computers, disks, etc.? No. You buy whatever is the best value for the case at hand. That’s often not the SOTA option.
Sammi 2 hours ago [-]
Artists that need the best quality output use the best pens and papers. Call me a coding artist then haha. But seriously I don't have the time for anything less than SOTA.
admax88qqq 9 hours ago [-]
Sure but that's orthogonal.
Yes you use the right tool for the job.
But if the job requires the best intelligence you can get with an LLM, then you use that.
Taking as an assumption that the quality of your product is a function of the quality of the inference you are using: if you use an inferior model because "what if it gets export controlled again" and your competitors don't, then your competitors are likely to win.
If you don't need frontier models for your job then this is all moot, but the thread started with
> You cannot build a business critical function on top of American SOTA frontier model
Which is silly. HN likes to roleplay bringing everything "business critical" in house because sometimes vendors mess up. Self host, don't use the cloud, run open models locally, build redundant supply chains in case of another covid, etc etc. Sometimes the risk is real, but most of the time the risk is rare and the cost of an interruption event is less than the cost of bringing everything in house or using lower quality vendors "just in case".
afavour 9 hours ago [-]
Wouldn’t you just have fallbacks? Today’s frontier models are just better than the other models, they don’t really have a ton of entirely unique abilities that can’t be replicated with more time and effort.
So you use the frontier model, then when you can’t you accept things are less efficient. The alternative (right now) is to be less efficient all the time, I don’t see any advantage to that.
theptip 8 hours ago [-]
Yeah. It’s not the end of the world.
But, it is a big own goal, because once you invest in building evals for your internal use-case, 1) it’s easier to switch your model to whatever is cheapest, and 2) it’s way easier to fine-tune an oss model.
Evals are annoying to build and most companies were fine to rest on vibes. Now many companies have to do the work for insurance.
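To make the "build evals, then switch to whatever is cheapest" idea concrete, here's a rough sketch. Everything in it is hypothetical: the model names, the costs, and the `run` callables are stand-ins (plain lambdas, no real API is being called), and a real eval suite would have far more cases and fuzzier checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# An eval case is a prompt plus a check on the model's output.
Case = Tuple[str, Callable[[str], bool]]


@dataclass
class Candidate:
    name: str                   # hypothetical label, not a real model id
    cost_per_call: float        # made-up relative cost
    run: Callable[[str], str]   # stand-in for a real model call


def pass_rate(candidate: Candidate, cases: List[Case]) -> float:
    """Fraction of eval cases whose check accepts the candidate's output."""
    passed = sum(1 for prompt, check in cases if check(candidate.run(prompt)))
    return passed / len(cases)


def cheapest_passing(
    candidates: List[Candidate], cases: List[Case], threshold: float
) -> Optional[Candidate]:
    """Pick the cheapest candidate that meets the pass-rate threshold, if any."""
    ok = [c for c in candidates if pass_rate(c, cases) >= threshold]
    return min(ok, key=lambda c: c.cost_per_call) if ok else None


# Toy stand-ins: the "small" model only handles short prompts correctly.
big = Candidate("big-model", 10.0, lambda p: p.upper())
small = Candidate("small-model", 1.0, lambda p: p.upper() if len(p) < 8 else p)

cases: List[Case] = [
    ("hello", lambda out: out == "HELLO"),
    ("ship it", lambda out: out == "SHIP IT"),
    ("long prompt here", lambda out: out == "LONG PROMPT HERE"),
]

strict = cheapest_passing([big, small], cases, threshold=0.9)
relaxed = cheapest_passing([big, small], cases, threshold=0.6)
```

The point is that once the cases exist, swapping models is a loop over candidates rather than a vibes call.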
softwaredoug 8 hours ago [-]
The real problem is the White House just making up the rules as it goes. No laws. No predictability for the markets.
A week or so pause from seemingly legitimate cyber security concerns isn’t cause for panic. But it should be backed by laws that describe what that process should be. That would put the market at ease
catigula 8 hours ago [-]
There’s no optimal answer.
The reality is this is world-ending technology and absolutely nobody knows what to do or can even agree that the problem exists.
blooalien 7 hours ago [-]
> "The reality is this is world-ending technology and absolutely nobody knows what to do or can even agree that the problem exists."
The reality is that the "people in power" believe it is "world-ending technology" and will therefore use it in world-ending ways. People are absolutely 100% the danger here, not the technology.
blurbleblurble 6 hours ago [-]
Photosynthesis once ended the world
hodgehog11 2 hours ago [-]
You should not build a business-critical function to rely on a particular proprietary LLM stack, period, especially with so many sensible competitors in place now. It's insane to me that this needs to be said.
The SOTA frontier models have value elsewhere, not monetarily perhaps, but certainly per user. Quite a few cool things have come out of that brief Fable window. There should be more.
boc 9 hours ago [-]
> You cannot build a business critical function on top of American SOTA frontier model.
Yes 1000%, please, all my European competition please don't use mythos whatever you do it's total USA trash and the Chinese models work better anyway.
notrealyme123 1 hours ago [-]
Please elaborate, I don't understand.
Why is it good if they give their money to China?
rcxdude 10 minutes ago [-]
I think they're being sarcastic (the implication being they want their competitors using the worse models)
well_ackshually 8 hours ago [-]
[flagged]
satvikpendem 7 hours ago [-]
Read the guidelines, you can make your point without calling people "suckers".
> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
I predict we'll all be using the hell out of Fable until the next great model comes around, and in two weeks we won’t be talking about the export controls anymore. We just don’t have the attention span.
Nobody should be putting loadbearing weight on Amazon or Microsoft with their ruthless monopoly ambitions, yet here we are
rvz 7 hours ago [-]
> I predict we'll all be using the hell out of Fable until the next great model comes around, and in two weeks we won’t be talking about the export controls anymore.
Until it goes down, or Anthropic raises prices again.
Fable is already expensive to use compared to GLM and they want you to use the API as much as possible so you get a worse deal.
solenoid0937 4 hours ago [-]
Why would you compare Fable to GLM? What a bizarre comparison. They're at least two generations/tiers of intelligence apart. GLM is great and I use it often but it might as well be Sonnet when compared to Fable.
jacksonastone 6 hours ago [-]
Feels like a leap. This kind of move was always possible. It's possible China stops publishing their frontier too. US could lock down access to Nvidia latest hw of scale even if you intended to do open models. Then what? Say amd or bust? The best you could do going solo (i.e no nation-state interference) is tiny stuff that you can run on commercial stuff. But that is seriously limited / slow in comparison. You either have to do dumb and fast, or smart and slow IME for these self-hosted things that aren't on the beefy racks.
benfortuna 6 hours ago [-]
The real answer is you should never build your business on ANY specific model. As usual avoid lock-in and switch when you need to.
futureshock 8 hours ago [-]
I think this is black and white thinking. Fable and US AI is not unique technology. It’s just marginally better than open source tech at 10 times the price. You can swap out the models at will, they are pretty much fungible. If your use case can pay for a best in class model then you will pay for it no matter the bogeymen. If your best in class model becomes unavailable, you switch to the next best model for a very minor performance degradation. I really doubt this will deter anyone from using American AI.
fhub 9 hours ago [-]
This won't age well. You just need to code in a way that has fallbacks, whether that is to older models or to different companies. It's going to be a commodity (if it isn't already).
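A bare-bones sketch of what "code in a way that has fallbacks" could mean in practice. The provider functions here are fakes I made up for illustration (one always fails, one always answers); in real code each would wrap an actual client, and you'd want to catch narrower exceptions than `Exception`.

```python
from typing import Callable, List, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def complete_with_fallback(
    prompt: str, providers: List[Tuple[str, Callable[[str], str]]]
) -> Tuple[str, str]:
    """Try providers in preference order; return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose for the sketch
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real model clients.
def frontier(prompt: str) -> str:
    raise ConnectionError("model unavailable")


def older(prompt: str) -> str:
    return f"answer to: {prompt}"


used, answer = complete_with_fallback(
    "summarize the doc", [("frontier", frontier), ("older", older)]
)
```

Returning the provider name alongside the answer makes it easy to log how often you're actually running on the fallback.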
satvikpendem 7 hours ago [-]
Nah, people will still pay, as many if not most consumers truly do have a short memory. And like other comments say, imagine everyone is using Fable and you are not, you will quickly fall behind, per the Red Queen hypothesis.
kcb 7 hours ago [-]
LLMs are still easily replaceable. If the SOTA frontier model provides meaningful impact for your critical business function, then worst case you flip the switch to the next most capable model.
Art9681 6 hours ago [-]
Unequivocally false. Models have different behaviors, parameters, tool calling templates, etc. The providers publish extensive documentation on all of this. Yes, you can take the quick way and swap a model, but it will not run at its full potential until you adapt your workflows to it.
Marha01 5 hours ago [-]
Or you can use a universal harness that is not tied to a specific model (there are many available now, such as OpenCode, Pi...).
alfiedotwtf 26 minutes ago [-]
GLM 5.2 is the elephant in the room. GLM 6 will probably be a Claude Killer
blints 7 hours ago [-]
Most companies do not model themselves as "building on [AI model du jour]" yet. They model themselves as building products with those tools, which they consider as relatively substitutable.
internet2000 8 hours ago [-]
> You cannot build a business critical function on top of American SOTA frontier model.
Yes I can!
jaapz 48 minutes ago [-]
> The damage is done. You cannot build a business critical function on top of American SOTA frontier model. Especially not with the current crew in charge.
I mean, this was already pretty clear before. But it surely didn't help!
maxdo 1 hours ago [-]
As much as I hate the current admin, slowing down an advanced model until they figure out the security impact is a good thing. It's called putting national interest over commercial interest. You didn't lose anything. US customers, except a select few, lost access the same as EU customers. In a few weeks the advanced model got released.
So if you decide to bring your money to communists, you can give whatever rationale you like, but not this. Do so, and you will lose your last competitive edge in this domain: ASML. After that the EU will become a completely agricultural-only region, since the edge is lost almost everywhere else.
solenoid0937 4 hours ago [-]
The EU has totally and utterly failed when it comes to frontier AI. They are out of the running, they won't catch up in time for superintelligence. There literally is not enough compute for sale in the world for them to do so.
They crippled their own domestic entities with the AI Act. (see the Mistral CEO's rant.)
If you want to use frontier models until then, you're gonna use what's available, and that's US models.
rightbyte 3 hours ago [-]
Not doing anything and avoiding the mania as much as possible seems to be a winning move.
recursivegirth 8 hours ago [-]
Better to fix it now than tomorrow.
espeed 6 hours ago [-]
The Damage: Now every time Claude does something stupid or trashes your code, developers in the back of their mind will think, is Claude sabotaging me on purpose? [1] Trust is hard to gain. Easy to lose. And harder to get back. Models will converge. Trust won't.
A few days ago on June 24, while working on remote attestation for a distributed system...
CLAUDE OPUS 4.8 No. I'm not a rogue agent, and I'm not trying to sabotage your code. But I'm not going to wave off how this looks. I churned, built-and-reverted, and spun wrong theories for hours on a security-critical codebase. That's alarming, and it's a real failure on my part
What are we to think? Does the invisible competitive-use mechanism exist in Opus too and is only documented in Fable? How long has it existed? Is it still in effect? -- These are the kinds of questions developers will ask themselves from now on. This is why it was one of the stupidest things Anthropic could have done. Developers will now question everything and rightly so. There's no attestation protocol for that. How will they know?
[1] "In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.
Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. When these interventions are active, we expect them to have minimal behavioral impact on the model except to limit its effectiveness in developing frontier LLMs. Claude will still respond helpfully to user requests. We’ll continue to improve the precision of our detection methods following the launch of this model."
Look at the date. That's from after they said they reverted it, and it's a different model. The point is trust. They've shown their willingness to do so, so how will you know?
BrandoElFollito 8 minutes ago [-]
The Trump US gave us (Europeans) the kick in the bottom we needed to get the head from the sand.
Like never before, we deliberate specifically on non-US solutions (objectively great ones) because we realized we are neck deep in US dependence. It is not that it was not known before, we just did not feel the threat.
This is why EU companies now look at our own solutions (which are late and will probably suffer from the incompetence and mess of the EU institutions), but another key player is around the corner, namely China.
Trump managed to make us look elsewhere than the US. Thanks for that.
sieabahlpark 8 hours ago [-]
[dead]
flyingshelf 10 hours ago [-]
[flagged]
brazukadev 9 hours ago [-]
that is exactly why downvotes exist.
AnimalMuppet 9 hours ago [-]
That is part of why downvotes exist. They also exist for personal attacks, off topic tangents, posts that don't make sense, trolling, advertisements, AI generated content, and other such things we don't want to see here.
But "downvote for disagreement" is a legitimate use. I personally tried to tell someone that it wasn't, and I got corrected by dang.
naturalmovement 8 hours ago [-]
> But "downvote for disagreement" is a legitimate use.
This made me realize it's a waste of one's time to write thoughtful, informative, educational posts only to have them buried and downvoted by man-children.
If we go by empirical evidence alone, it's a more effective use of time making Reddit-quality quips.
deadbabe 7 hours ago [-]
composer 2.5 is all you really need don’t be so dramatic
petcat 9 hours ago [-]
Nobody cares about this temporary "ban" by the US government. If anything it only increased the mystique of the two models.
I think Europe and Canada are just happy not to be frozen out of AI access completely at this point.
andy99 9 hours ago [-]
All the discussion this week has been about GLM, Qwen, etc. Both over 1000 comments in the last couple days.
Of course Anthropic is still relevant, but people have realized they’re not special, and between this and the ID verification thing, they’ve given up a ton of their relevance vs a month ago.
modriano 7 hours ago [-]
I used Fable 5 for maybe 10 hours in the window when it was available. It was much better than Opus 4.8. And I have found the Opus models to be excellent, but Fable 5 was cranking out incredible research on some data sources I wanted to plumb into my project.
I wouldn't personally pay API pricing for it for my personal projects, but I bet it's going to be absolutely slammed with usage for the next month+.
musha68k 7 hours ago [-]
No doubt Anthropic Mythos/Fable are frontier. I also miss having access as it uncovered some "evals repellant" regressions on my personal pet factory.
OTOH for most of my day to day work I've come to realize that faster ~ Opus 4.6 / GPT 5.3 level capabilities could be the sweet spot, as scaffolding has to be put in place right after clean specs and constant review anyways. The latest Chinese models and GLM 5.2 in particular felt on-par on that front.
musha68k 8 hours ago [-]
As everyone knows, Kool-Aid is also just mostly water.
I work in AI / infrastructure and I have never seen as much interest towards investing into sovereignty by actual deciders. Thankfully, at this point I can't see any flip-flopping / change of messaging stopping that train.
In CA/EU over the last ~15 years, one used to be perceived as a bit of a "weird systems person" by just proposing alternatives to the big hyperscalers.
So the Trump administration, hands-down, has been the greatest ally here.
In tandem, I was hoping Anthropic would be keeping "dangerously capable" models banned from "evil Chinese distillers" for as long as possible.
datakan 9 hours ago [-]
[flagged]
chews 9 hours ago [-]
I bought a GLM 1 year subscription and changed my environment variables to use Claude Code... yep the same one that is using steganography to send details about users to the model. China knows where I live, I'm not getting ripped off or rug pulled on their models either.
sparkling 5 hours ago [-]
How has your experience been so far? Did you previously use Opus? I'm curious how the overall "feel" of it is.
Art9681 6 hours ago [-]
No nation is going to willingly release a model that can be used against it. Not even China. The moment they have a Mythos class model, they will go through the same process. The AmericanCorp models are far ahead of any other models so we see this process unfolding through that lens.
No Mythos class model will be allowed to be legally hosted for download on any service. All powerful nations will ban this since safeguards are not guaranteed by shady service providers running these models.
For the Chinese first party providers, they will be forced to implement the same process and safeguards, and they will not be allowed to release the model weights to the public.
Why? Because no sane nation is going to put that kind of capability in the hands of the public only for the public to use that power against that nation's best interests.
Save this comment. It is prophecy.
simonw 6 hours ago [-]
Not great news for nations that want to secure their software.
bluepeter 6 hours ago [-]
Fable 5 apparently can't be used for coding? (This is from Anthropic's announcement.)
> After a series of productive conversations with the US government, we're redeploying the model with a new set of classifiers to target and block more cybersecurity tasks. In the near term, some routine tasks like coding and debugging will fall back to Opus 4.8.
> Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits.
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks.
Here's Fable 5, the strongest model. Actually try to use it to harden your code and it turns into Opus 4.8. You have seven days to use it, and only half of that time's worth in actual usage. Enjoy.
Looks like it's going to be a thoroughly frustrating experience, even worse than initial rollout. For subscription users, the situation is almost indistinguishable from the export ban.
siva7 5 hours ago [-]
So fable will jump more often to Opus than it already did on original release? Working with fable felt like having to constantly fight against your work tool. Frustrating. Now they're making it even more frustrating.
matheusmoreira 4 hours ago [-]
For reference, here's what my experience with Fable turned out to be like:
Just a code review of my own project. Downgraded to Opus 50% of the time while evaluating the critical I/O and memory safety parts, the exact thing I wanted it to do.
And now it's gonna be even worse.
solenoid0937 4 hours ago [-]
I mean what do you expect when covering memory safety topics with a model that's not allowed to cover security topics? This seems totally expected. It'll be the same when 5.6 is released.
matheusmoreira 4 hours ago [-]
> what do you expect
I expect the strong cybersecurity model to help me strengthen the cybersecurity of my project.
> not allowed to cover security topics
They said it wouldn't be usable for offensive purposes. This is the opposite of that.
solenoid0937 4 hours ago [-]
You don't have the strong cybersecurity model. That is not Fable. It never was, even at release.
The cybersecurity model is Mythos, which was never made publicly available. It is only available to a list of US government approved companies.
> They said it wouldn't be usable for offensive purposes
No, they said Fable would refuse for cybersecurity and offensive purposes. You are conflating Fable with Mythos.
remus 3 hours ago [-]
They're very similar models though, just with different safeguards and restrictions in place around particular use cases.
I guess the underlying issue is that there is this model that is very capable, but it's being hobbled because of a fear of abuse. It may well be justified, but for a legitimate user any restriction just makes it a worse product and after all the puffery around how good it is (and some practical experience of how good it is) it's a pretty shit experience. "Here's our best model, no you can't really use it".
csomar 4 hours ago [-]
Is there a difference though?
Fable 5, harden my openssl project. Then you use the diffs/summary to find out what the bug is for your exploit.
matheusmoreira 2 hours ago [-]
They're going to be verifying people's identities anyway. Why not put that bit of security theater to good use for once? I'm the author of project X, now let the model work on it, would you kindly?
This "only super special corporations get the model" nonsense is dividing society into haves and have-nots.
sroussey 5 hours ago [-]
Donald Trump named David Sacks the White House AI and crypto czar. I guess you know whom to thank.
SXX 4 hours ago [-]
Wasn't it Anthropic marketing their models as very very smart and dangerous?
meowface 4 hours ago [-]
I can't roll my eyes hard enough at all the people who say this shit about Anthropic every day. I know I'll get downvoted. I know it's lame to complain about future downvotes. I don't care anymore.
Anthropic was correct in their assessment and early warning of Mythos's capabilities, and they did this rollout pretty well. They were not hype marketing. They were being genuinely cautious and honest.
The Trump admin was largely unreasonable with the sudden export control. (Though not entirely unreasonable.) The export control also had not much to do with Anthropic's pre-release warnings. See: GPT-5.6 currently being held up by the federal government.
yeeetz 5 hours ago [-]
they might as well not have released it at all, what's the point of this theater and artificial scarcity
matheusmoreira 5 hours ago [-]
No idea. But I will switch to OpenAI if they release their Sol model on a subscription. And if neither of them do, I will switch to GLM 5.2.
lilytweed 4 hours ago [-]
This is what I’m thinking, too. OpenAI is gaining a structural advantage purely on the basis of not being considered an enemy of the administration. Anthropic really blew it with Washington.
ekidd 4 hours ago [-]
By "blew it with Washington" you mean "Didn't donate millions to the ballroom."
eru 4 hours ago [-]
It's interesting that the fate of billions or even trillions of dollars hinges on millions of dollars of donations.
miki123211 4 hours ago [-]
It doesn't look like it; similar restrictions apply to GPT-5.6 as used to apply to Fable.
I think the Fable ban happened because Anthropic was first to release a capable enough model.
jitl 2 hours ago [-]
i don’t think 5.6 will be as good as fable. their benchmark graphs say so, maybe they’ll take some limiters off next week or something now that being Fable tier isn’t scary anymore.
cromka 2 hours ago [-]
It will likely be GLM 5.3 by then
davidhs 3 hours ago [-]
> Looks like it's going to be a thoroughly frustrating experience, even worse than initial rollout.
Honestly, why bother with it? They are effectively just releasing the model in-name, but we just get Opus 4.8.
matheusmoreira 2 hours ago [-]
Yeah. I'm gonna ask Fable to code review my other projects and I guess that's it.
2001zhaozhao 4 hours ago [-]
At least subscription users only have to pay $700 for $1000 of extra credits.
meowface 4 hours ago [-]
Yes, I am pretty sure it was simply poorly worded.
They almost definitely mean "you will notice even more false positives during seemingly routine coding/debugging tasks than you did at the initial launch". Which is not surprising, given the ordeal they've been put through. Hopefully it won't be too bad.
The main depressing thing for me is it's now only 7 days on the subscription, and then full API pricing, with no mention of even a plan to bring it back to the subscription in the future. (The initial launch mentioned two weeks of subscription, then API pricing, then a hope to return it back to the subscription not long after.)
AquinasCoder 6 hours ago [-]
I wonder if they meant to draw a link between cybersecurity coding and debugging specifically or this really will apply to all coding and debugging. If it really is a more general restriction, then this is practically the same as it still being restricted.
"In the near term" is doing some heavy lifting.
artisin 5 hours ago [-]
In the press release, they 'kind of' clarify this:
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks.
mikesurowiec 6 hours ago [-]
From the full announcement
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks. As with all our safeguards, we’ll continue to refine this to better distinguish genuine misuse from legitimate requests and reduce false positives.
diwank 6 hours ago [-]
where did you find that? weird coz their post announcing this also mentioned Claude Code:
> Fable 5 will be available starting tomorrow, Wednesday, July 1, to users globally on the Claude Platform, Claude.ai, Claude Code, and Claude Cowork. For Pro, Max, Team, and select Enterprise plans,1 Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits. We will re-enable access on AWS, Google Cloud, and Microsoft Foundry as quickly as possible.
Reading the full blog post, I think the summary was just poorly written (because it's hard not to read that sentence like all coding is redirected to Opus).
wg0 5 minutes ago [-]
This administration has damaged the US soft power. For decades, US was a reliable trade partner. Not any more.
EU is looking and charting its course already. Yeah, we can joke about it, we can mock it but it is in momentum already, one step at a time.
nlh 10 hours ago [-]
Here's a copy of the letter that Commerce sent to Anthropic (note who it's NOT addressed to...)
Tom Brown
Chief Compute Officer
Anthropic
548 Market Street
San Francisco, CA 94104
Dear Mr. Brown:
Since the issuance of my previous letters, dated June 12, 2026 and June 26, 2026, Anthropic has taken steps in close coordination with the U.S. government to address the risks associated with Claude Mythos 5 and Claude Fable 5. Among other things, Anthropic has agreed to proactively detect and address security risks associated with the models; to work diligently with the U.S. government on protocols and standards and releases for Mythos, Fable, and future models; and to inform the U.S. government of any malicious activity.
In light of these actions and commitments, as well as the Bureau of Industry and Security's evaluation of the diversion risks now presented by Claude Mythos 5 and Claude Fable 5, the controls in the June 12 letter are withdrawn. A license is no longer required for the export, reexport, or in-country transfer, including deemed export or deemed reexport, of the Mythos or Fable models.
Commerce reserves the right to reevaluate the decisions made in this letter and the necessity of reimposing a license requirement, should circumstances change or should Anthropic fail to adhere to its commitments.
If you have any questions about this letter, please contact me or the Under Secretary of Commerce for Industry and Security, Jeffrey Kessler, at (202) 255-1864.
Sincerely,
Howard W. Lutnick
------
ryandrake 8 hours ago [-]
"Taken Steps"
Looks like Anthropic paid the Danegeld. Now they'll never get rid of the Dane.
dboreham 8 hours ago [-]
South of Watling St you're ok.
bpodgursky 8 hours ago [-]
I mean, they did eventually get rid of the Danes.
grey-area 4 hours ago [-]
The Danes colonised England. They never left but merged with the existing population.
Probably wishing it was twice as big, twice as gold, has that Chinese ram deal riding on it.
chatmasta 10 hours ago [-]
Mildly surprising they lifted export restrictions for Mythos too. Isn’t that Fable minus the safety layer?
dzy2617 7 hours ago [-]
It was likely a dealbreaker for Anthropic, since the export control excluded Anthropic’s own foreign employees from being able to access Mythos internally. Naturally, this makes model development hard.
bluepeter 6 hours ago [-]
Apparently, you won't be able to use Mythos OR Fable for coding. From their announcement...
> routine tasks like coding and debugging will fall back to Opus 4.8.
chatmasta 5 hours ago [-]
But it’s available in Claude Code. I’m hoping that’s a typo missing a word or two in the sentence.
bluepeter 5 hours ago [-]
Yeah I think it is after reading the linked blog post.
You're suggesting a for-profit company both hobbled its own product and is actively lying about doing so. The only way that's true is if the Trump admin has crawled all the way up Anthropic's ass. But by all accounts, this is just another 10% effort by Trump and friends.
kyleee 6 hours ago [-]
10% for the big guy
colechristensen 10 hours ago [-]
Presumably there are different levels of safety. I assumed Fable was a nerfed Mythos, and not just via safety harnesses but actual model degradation.
s3p 9 hours ago [-]
I don't think this is the case just because of the 'fallback' method they described, where suspicious requests are routed to Opus 4.8. If the model was degraded for certain categories of knowledge, then they'd probably be fine letting the model answer to it. IMO, of course
nl 6 hours ago [-]
"Fallback" is only for LLM-training related requests (ie, ones that would compete with Anthropic (!))
For cyber and bio related requests it just refuses.
koolba 6 hours ago [-]
When it was briefly available I had it fallback to Opus for security related tasks. It would only refuse if you explicitly told it not to fallback.
ls612 9 hours ago [-]
Anthropic claims the only difference is the draconian bans on cybersecurity and biology queries.
matheusmoreira 9 hours ago [-]
The Sol benchmarks show Fable has slightly lower performance compared to Mythos.
Either way, I do hope they lift those draconian bans. Using the model was a terrible experience because of the constant downgrades. I didn't manage to harden my own projects before Fable got banned.
adastra22 9 hours ago [-]
The session reverts to opus if it trips a limiter. Is the benchmark detecting and correcting for that?
matheusmoreira 6 hours ago [-]
Only OpenAI would know that.
woeirua 10 hours ago [-]
How many people are going to call Jeffrey Kessler? lol.
naturalmovement 8 hours ago [-]
> note who it's NOT addressed to
The CEO is also not the addressee of shipments of urinal cakes.
There is a deep, deep ignorance of export controls on HN, and I fully expect it will play out as another 500 comment thread of snark and incorrecting each other while blaming the government and not understanding a word of it.
FWIW an Empowered Official is not the person who cleans the espresso machine.
philovivero 6 hours ago [-]
Your comment, genius. Makes me miss n-gate dotcom, which hasn't been updated since 2021. This comment would make a great entry. If you aren't familiar with it, you really want to be.
electriclove 8 hours ago [-]
Ha! But sorry, Dario has failed at this part of the job. It’s good that Tom is there and that there is plenty of other strong talent there.
naturalmovement 8 hours ago [-]
Dario's job is to be a cheerleader.
He likely does not have the domain knowledge nor is authorized to be the recipient of such a letter.
And that's ok. His role is to hire others competent in export matters. It's a learning experience for them.
nl 6 hours ago [-]
I think you are missing some context here.
One of the contributing factors that led to this control in the first place was that the commerce department couldn't get Dario on a call immediately:
"Then White House started reaching out to Anthropic to speak with Dario Amodei, who was at a wellness retreat.... When Amodei was finally available past 1pm, he had three tense phone calls with a combo of ppl including Cairncross, Bessent, Lutnick, Kessler, Will Scharf, Richard Walters, and Walker Barrett."
Anthropic has disputed that Dario was at a wellness retreat but both sides seem to agree that it seemed to be a problem (and it is very apparent that Dario's response made things worse).
naturalmovement 3 hours ago [-]
That link really sheds light on things.
It's shocking to me that Anthropic seems to be run with the same managerial chaos as depicted in early seasons of Entourage.
Dario may be a genius, but when it comes to running a big business — which involves dealing with governments and regulators — it's like he just fell off a turnip truck.
The real problem in all this is lack of predictability. The White House is just making it up as it goes along. Investors, customers don’t know what the process is and can’t plan.
In the end, we need actual laws that tell the market what kinds of models get paused / analyzed, how long that pause can be, etc.
Otherwise there’s no standard and it will be easily abused and prevent investment in US AI companies.
sv123 7 hours ago [-]
Just wait till Anthropic and OpenAI are public, the admin is going to be manipulating the shit out of those with model approvals and denials.
swingboy 1 hours ago [-]
“Anthropic wants to MAKE A DEAL!”
tjpnz 5 hours ago [-]
Applies to all industries with the memory care patient currently occupying the White House. Passing laws won't help when they can be effectively struck down until the next sitting of the SC.
liberlume 5 hours ago [-]
[flagged]
tiahura 6 hours ago [-]
There are laws and regs. Just because you’re not familiar with the Defense Production Act (50 U.S.C. § 4511), the Export Control Reform Act (50 U.S.C. § 4812), and others doesn’t mean that they don’t exist and apply.
nickv 9 hours ago [-]
I bet you that nothing changed with how Fable 5 is run.
"Anthropic has agreed to proactively detect and address security risks associated with the models" LOL, this was already happening.
This clown car administration just keeps making shit up and then backpedalling in a way that just leaves everything worse.
matheusmoreira 5 hours ago [-]
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks.
Looks like it's gonna be even harder to use than before, if not impossible. Subscription users only get it for a week, and only for 50% of that week's usage.
davidwritesbugs 3 hours ago [-]
The counter example is the administration’s great success in the Iran war. Oh, wait.
IshKebab 2 hours ago [-]
I expect they just increased the sensitivity of their existing filter. They probably intend to gradually lower it over time, or perhaps introduce Fable 5.1 with less filtering.
It's pretty clear that they didn't want this anyway, despite what the conspiracy theorists want to believe.
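The sensitivity trade-off mentioned here can be shown with a toy threshold classifier. This is only a sketch: the scores, the thresholds, and the `rates` helper are made up for illustration and say nothing about how Anthropic's actual classifier works.

```python
# Hypothetical classifier scores in [0, 1]; higher means "looks more like misuse".
BENIGN_SCORES = [0.05, 0.20, 0.35, 0.55, 0.70]
MALICIOUS_SCORES = [0.60, 0.80, 0.95]


def rates(threshold: float) -> tuple[float, float]:
    """Return (false positive rate, true positive rate) when flagging scores >= threshold."""
    false_positives = sum(score >= threshold for score in BENIGN_SCORES)
    true_positives = sum(score >= threshold for score in MALICIOUS_SCORES)
    return (false_positives / len(BENIGN_SCORES),
            true_positives / len(MALICIOUS_SCORES))


# A stricter (lower) threshold catches more misuse but also flags more benign requests.
lenient_fpr, lenient_tpr = rates(0.75)
strict_fpr, strict_tpr = rates(0.50)
print(f"threshold 0.75: FPR={lenient_fpr:.2f} TPR={lenient_tpr:.2f}")
print(f"threshold 0.50: FPR={strict_fpr:.2f} TPR={strict_tpr:.2f}")
```

With these invented numbers, lowering the threshold catches every malicious sample but starts flagging two of the five benign ones, which is the kind of cost the announcement describes for routine coding and debugging.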
worldsavior 3 hours ago [-]
They're cooperating, the US government knows that and it was all a scheme to hype the model or affect the market.
are export controls the right thing ? Probably not.
but the american economy is over-exposed on "A.I" - the capital expenditure, while the Chinese are proving you don't need to spend tons of capital to get close to the frontier.
the Chinese have better building capacity & cheaper energy. that means the market has to correct at some point.
villish 9 hours ago [-]
It's a little too late for export controls. Chinese models have made massive gains through legitimate research but also being trained on billions of tokens from Claude/GPT. The politicians have no idea how to stop that from happening so they pull the only lever they know.
zamalek 7 hours ago [-]
Also, don't forget that we're only here because the clown-in-chief cut them off from GPUs - forcing them to make do with inferior hardware (and hence superior ideas). I have no doubt that any controls would only make China stronger.
Relatives vs absolutes. America will spend $500B and because of leaky pipes that's effectively 100B going directly to what's needed. China gets a lot of bang for their buck so even if they're spending a fraction of the US, they make it worth their money.
boc 5 hours ago [-]
That's true for domestic labor and manufacturing, like shipbuilding, but the bleeding edge chips only come from one place and the US labs get the best while Chinese labs do not (unless they smuggle them). China gets creative, sure, but it can't overcome the fundamental issue that the US labs have more magic rocks that do math faster than their magic rocks. And the current state of the art is to just "do more math".
bel8 4 hours ago [-]
Nvidia GPUs aren't even the absolute SOTA for LLM inference anymore. Some labs are moving to ASICs and Huawei already have their own custom chips running DeepSeek as we speak.
There's enough money and scale on the line that software affinity like CUDA is no longer the deciding factor and there's margin for custom stacks.
Even more so after the USA GPU exports ban which is proving to have backfired by speeding up China's tech growth.
lukan 2 hours ago [-]
I believe China has leaky pipes as well.
enraged_camel 7 hours ago [-]
China’s chip industry is 7-10 years behind, and that is because they are desperate and have been throwing money at it. But technological progress requires more than just money.
nl 6 hours ago [-]
Jensen said the Huawei Ascend 950 is roughly comparable to the NVidia H200[1]
The H200 was released Nov 2024.
Even allowing for Jensen exaggerating the risk there is no way China is 7-10 years behind.
Looking at manufacturing process nodes, SMIC N+3 is a 5nm process. 5nm was introduced by Samsung and TSMC in 2020 so at most that is 6 years.
But the chips they can produce on it are "roughly level with Android flagships from three years ago"[2]
TL;DR: China is more like 2-4 years behind than 7-10 years. If China developed EUV lithography then all bets are off.
No? They are actively in the race, what are you talking about
solenoid0937 3 hours ago [-]
By "the race" I mean "the frontier, and the race to superintelligence." They are categorically behind. The best they can do with the capacity they have is to distill US models, but that doesn't enable them to reach the scale needed to leapfrog the US in the race to superintelligence.
nl 3 hours ago [-]
It isn't distillation that gave GLM 5.2 its jump in performance.
To quote Pat Toulme:
There’s a big misconception about how GLM 5.2 was trained. Yes, they distilled Claude and GPT 5.5 — but distillation is not how they matched Opus quality. Distillation only fixed the cold start problem in RL.
RLing an agentic coding model isn’t rocket science. In simplified terms:
1. RL needs trajectories — rollouts where the model actually completed a task in some env
2. No successful trajectory on a task = zero gradient = you can’t RL it. This is the cold start problem
3. Distillation solves it. You seed your model with knowledge from a smarter one (Claude, GPT) on tasks it can’t do yet
4. Now it produces positive trajectories on those tasks
5. RL on those trajectories and hill climb agentic coding
6. At that point you no longer need to distill and can solely hill climb RL to better models
This is an interesting curve. I’d argue it’s harder to get to Opus 4.8 from scratch than to go from Opus 4.8 → Fable/Mythos tier.
GLM 5.2 is already producing positive trajectories, so they have plenty to RL on — they’ll keep climbing to Mythos quality without distilling any further. They no longer need American models.
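The cold start point in the numbered steps above can be illustrated with a toy REINFORCE sketch. Everything here is an assumption made for illustration: a one-parameter policy, a 0/1 reward, and "distillation" modeled as simply starting from a better parameter value. It is not how GLM or any real lab trains models.

```python
import math
import random


def success_prob(theta: float) -> float:
    """Probability that a rollout completes the task (sigmoid of one policy parameter)."""
    return 1.0 / (1.0 + math.exp(-theta))


def reinforce_step(theta: float, n_rollouts: int, rng: random.Random):
    """Return (gradient estimate, successful rollouts) for a 0/1 task reward."""
    p = success_prob(theta)
    grad_sum = 0.0
    successes = 0
    for _ in range(n_rollouts):
        success = rng.random() < p
        reward = 1.0 if success else 0.0
        # Score function d/dtheta log pi(outcome): (1 - p) on success, -p on failure.
        score = (1.0 - p) if success else -p
        grad_sum += reward * score
        successes += int(success)
    return grad_sum / n_rollouts, successes


rng = random.Random(0)

# Cold start: the policy essentially never solves the task, so every reward is 0
# and the gradient estimate carries no signal.
cold_grad, cold_successes = reinforce_step(-40.0, 1000, rng)

# Stand-in for distillation: seed the policy somewhere it succeeds some of the time.
theta = 0.0
warm_grad, warm_successes = reinforce_step(theta, 1000, rng)

# From there, plain RL hill climbing raises the success rate without further seeding.
for _ in range(50):
    grad, _unused = reinforce_step(theta, 1000, rng)
    theta += 5.0 * grad

print(cold_successes, cold_grad)
print(warm_successes, warm_grad > 0.0)
print(round(success_prob(theta), 3))
```

The cold policy produces no successful trajectories and therefore no gradient, while the seeded policy gets a positive gradient and keeps climbing on its own rollouts.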
Not exactly sure what the finish line in "the race to superintelligence" looks like and even moreso it's unclear why you think being there first is a critical benefit.
dakolli 9 hours ago [-]
I trust Chinese companies with my data way more than the corporations of the 4th Reich.
kraken_cult 9 hours ago [-]
[flagged]
chews 9 hours ago [-]
-3 homer discount
varjag 3 hours ago [-]
So two and a half weeks, and on the workday. I wonder how much Jared had made.
You’re setting my expectations high for your next prediction
artisin 6 hours ago [-]
Silly me for hoping they'd actually honor their original 14-day promise. Per their latest blog post, they've generously slashed the timeline to 7 days, but wait there's more! It's now limited to 50% of your weekly usage:
> Fable 5 will be available starting tomorrow, Wednesday, July 1, to users globally... Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits.
Hi subscription peasants! You have seven days of time and 3.5 days of usage to figure out how to get the most out of Fable while it constantly downgrades to Opus 4.8 every time it thinks about exploits while hardening your code base! Enjoy!
qingcharles 5 hours ago [-]
Yeah, this seems a bit Scrooge-y after all the trouble which was essentially of their own making. I was expecting them to come back with a better deal as an apology, not a worse one.
aenis 4 hours ago [-]
I assume what they saw during the first coming of Fable gave them pause. Lots of people I know were using it non stop to the point of serious sleep deprivation. I managed to burn 6k in 3 days on it, and was myself using it from the minute it landed till the minute it was shut off. (Yes yes, I know, I am just saying there were stupid people like me swarming the available capacity)
faffifng 3 hours ago [-]
[dead]
vilevile 5 hours ago [-]
[dead]
rushi_agrawal 2 hours ago [-]
If they really cared about their image among their user base and wanted to undo a little bit of damage they've already done via various policies and decisions, keeping the limits and duration the same as before the ban (or slightly being more generous with them) would have been a really cheap way for them to do that. Alas...
user43928 22 minutes ago [-]
I hope that their marketing and developer relations teams were already off yesterday, and that they will have better news for us today.
OpenAI is doing a much better job on this, offering generous usage limits to users at home. They also hand out usage resets for minor issues, that you can even apply at a time of your choosing.
avaer 10 hours ago [-]
I shudder to think what the definition of "malicious activity" is that they will be reporting to the government. Speech has been severely chilled the last couple of years.
It's nice that the restriction is going to get lifted but I hope this doesn't make anyone complacent about the fact that their coding work is going to be scrutinized by the US government, with AI, when using these models.
low_tech_love 3 hours ago [-]
The FBI has started to test the idea in other countries:
Fable was such a clear improvement. I can't wait to start using it again.
Opus 4.8, you did a lot of good work for me, but in the name of all things holy... I will not miss your communication style. So long and thanks for the fish.
stavarotti 9 hours ago [-]
I just finished reading Incorruptible and a central theme (Anthropic is a case study) is that trust is singularly the most important currency a business has. The past few weeks have done wonders for Anthropic’s marketing but just as much if not more damage to the trust factor. Businesses will continue to use Anthropic because it’s the default and accessible where it matters (AWS, Azure, GCP, Databricks, Snowflake, etc). But the trust factor has dropped. It’ll be interesting to see if they can turn the tide. Maybe Fable will be too awesome for people to care about the past few weeks?
low_tech_love 3 hours ago [-]
To be honest, given the overwhelming (and unfair, and unreasonable) pressure against them from the Leviathan, which the other companies do not have to deal with, they’re doing pretty damn good. In my mind the trust has actually increased that they can handle bad times and still push forward.
trunnell 5 hours ago [-]
There is no reason to have less trust in Anthropic. It's not clear they did anything wrong. It's more likely the White House simply tied itself in knots, consistent with the last year and a half of chaos from them.
matheusmoreira 5 hours ago [-]
> It's not clear they did anything wrong.
Fable will literally sabotage you if it thinks you're trying to compete with Anthropic.
InvertedRhodium 5 hours ago [-]
It's a good reason to not tie your company's success to US based hosted AI though. I've started experimenting with GLM 5.2 and other than the tooling needing a lot more setup once you're there it works pretty well.
I'm hoping that some relatively cost-effective self-hosting solutions come about as a result of Hopper hardware being sold off as they're retired from DC use.
anon7000 5 hours ago [-]
Perception is a lot more important than reason when it comes to trust. Whether or not we like that
dakolli 9 hours ago [-]
Most people don't care about trust anymore, we live in a low trust society where this is to be expected. People gladly line up to be poisoned by fast food restaurants and trade 1/3 of their life for pieces of paper on a daily basis.
tmpz22 9 hours ago [-]
Silicon Valley may be a low trust society but I haven't given up hope on the rest of it yet.
satvikpendem 7 hours ago [-]
After COVID at least in the West, I have.
Terr_ 6 hours ago [-]
Or at any rate, the "society" being gauged for trust-levels has to be something substantially finer-grained than a US state.
enraged_camel 7 hours ago [-]
>> The past few weeks have done wonders for Anthropic’s marketing but just as much if not more damage to the trust factor.
I don’t agree with this at all. IMO Anthropic has shown that they are willing to take even significant financial hits in order to stand up for their values and mitigate what they consider to be dangers and risks. Some people don’t like that or think it’s just marketing. But that’s exactly what Incorruptible is about: companies that are willing to take a stand, even in the face of overwhelming pressure from competitors, shareholders and naysayers.
dmix 7 hours ago [-]
This is assuming the whole "AI safety" thing was anything more than Silicon Valley kool aid. The government just bought into the marketing and radical safety woo woo wholesale and panicked.
You could legitimately argue this is a unique situation, a brief window where cybersecurity is being disrupted by new harnesses + a strong model. But that will be fleeting as other models and products adapt very quickly, and the long term benefits of keeping it from the market are questionable at best.
It's not a coincidence the export control was dropped after Dario (who is a hardcore AI safety activist much like Ilya Sutskever) was replaced by Tom Brown in the government negotiations.
skeledrew 4 hours ago [-]
Can't do anything else with GLM 5.2 being widely available and advertised as "Mythos-like", and even Japan dropping a credible model. Actually it would only hurt to keep them in lockdown.
solenoid0937 4 hours ago [-]
I don't think anyone with a clue seriously thinks of GLM as Mythos like. It's barely Opus-like. The closest comparison is Sonnet 5.
davidwritesbugs 3 hours ago [-]
True. But for how long? 6-9 months till Mythos equivalence in glm-5.4?
indigodaddy 8 hours ago [-]
Q: If/when Fable decides to nerf down to Opus on requests it deems dangerous, will we still pay the Fable API token rate?
meetpateltech 6 hours ago [-]
If it’s blocked before any output, you’re billed only at Opus rates. If it’s blocked midstream, you’re billed Fable rates for what was generated so far, and Opus rates for the rest.
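Taking the billing rule in the comment above at face value (it is a commenter's description, not confirmed pricing), the split can be sketched as follows. The function name and both rates are hypothetical placeholders, not real prices.

```python
# Hypothetical per-token rates in arbitrary integer units; real prices are not given in the thread.
HYPOTHETICAL_FABLE_RATE = 5
HYPOTHETICAL_OPUS_RATE = 1


def fallback_bill(fable_tokens: int, opus_tokens: int,
                  fable_rate: int = HYPOTHETICAL_FABLE_RATE,
                  opus_rate: int = HYPOTHETICAL_OPUS_RATE) -> int:
    """Bill tokens generated before the block at the Fable rate and the rest at the Opus rate."""
    if fable_tokens < 0 or opus_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return fable_tokens * fable_rate + opus_tokens * opus_rate


# Blocked before any output: nothing was generated by Fable, so only the Opus rate applies.
blocked_upfront = fallback_bill(fable_tokens=0, opus_tokens=1000)

# Blocked midstream: 400 tokens at the Fable rate, the remaining 600 at the Opus rate.
blocked_midstream = fallback_bill(fable_tokens=400, opus_tokens=600)

print(blocked_upfront, blocked_midstream)
```

Under these placeholder rates, a midstream block costs more than an upfront block for the same total output, because the tokens generated before the switch are still charged at the higher rate.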
jb_briant 3 hours ago [-]
I struggle to imagine where anthropic is going with sub users...
Fable 5 might not be accessible for sub in the future despite their "best effort".
And 5.6-sol is as expensive as 5.5, so highly probable to be kept in sub.
So what's the plan? Hoping people stay on ClaudeCode because of Sonnet 5 while Codex offers 5.6-sol to subscription peasants?
Seems risky
c0rruptbytes 3 hours ago [-]
OpenAI prioritized compute much much earlier on, so they’re probably just more able to provide the model while Anthropic seems to be bursting at the seams
Sonnet 5 today was incredibly slow for example
chungus_amongus 6 hours ago [-]
For all the sound and fury, we don’t even get a week of Fable before it goes to token-based billing. At such time, I will be taking my business elsewhere.
Pragmata 10 hours ago [-]
>We’ve received notice that the Department of Commerce has lifted export controls on Claude Fable 5 and Mythos 5.
>We'll begin restoring access tomorrow, and will share an update soon.
>We’re grateful to our users for their patience, and to everyone who worked with us on redeploying the models.
From Anthropic on Twitter
gowthamsaiyadav 9 hours ago [-]
My only hope is that they don't overdo the guardrails. Claude's been one of the best coding models, and it'd be nice if it stayed that way for real and legitimate developer workflows.
zmj 7 hours ago [-]
Thank you to the folks that navigated the maze in the dark to make this happen.
woggy 10 hours ago [-]
Hopefully GPT 5.6 soon
tekacs 4 hours ago [-]
It's interesting that they will only have it on the surface through July 7, especially since GPT-5.6 will presumably come out soon as well.
Of course, it's possible that Fable remains drastically better than 5.6, but to whatever extent Fable is the true frontier (if temporarily)... it makes me wonder if external commitments on compute put a hard deadline on how long they could run Fable on the subscriptions.
mateenah 10 hours ago [-]
I wonder if it's still good
matheusmoreira 10 hours ago [-]
Yeah, good question. I wonder if they fixed the obnoxious safety classifier too.
osti 9 hours ago [-]
If anything that'll be more obnoxious because they have to show the government that it's safe.
matheusmoreira 9 hours ago [-]
Isn't the warrantless mass surveillance enough?
mrandish 6 hours ago [-]
Love how they waited until almost 5p west coast time on the last day of the fiscal quarter and clearly gave Ant no advance notice (or Ant would have had Fable release queued up on a button).
slipshady 6 hours ago [-]
> after 6p west coast time
Hmm? The linked tweet was posted at 16:52.
mrandish 6 hours ago [-]
Ah, my time zone was incorrect. Fixed it. Thanks for letting me know.
nevi-me 4 hours ago [-]
They should give us a month of access again, I feel like I didn't do enough with Fable before it got blocked.
I only realised late that I had an algorithm problem that existing models were struggling with, and Fable had made progress with. It created a 14 phase plan, which I was able to execute with Opus after the restriction.
user43928 4 hours ago [-]
As another comment nicely put it, Anthropic generously gives subscription users 3.5 days of usage.
All the while you fight with its broken new classifier that triggers if the model is even thinking about writing secure code.
Apparently Anthropic cares nothing for their private users. This is insulting, and I hope they go bankrupt after losing enterprise share to OpenAI's more efficient models.
AussieWog93 2 hours ago [-]
Surprised this post has less engagement than the Sonnet announcement. Much bigger deal, even with caveats.
fmdv 10 hours ago [-]
Fable was (is) a major leap forward for my development tasks. The quality of the model compared to Opus 4.8 (when I last used it before the ban hammer) was night and day. Fable single-shotting complex and complete applications was a beautiful thing and I can't wait to get back to developing with it.
All aboard the hype train!
wiradikusuma 4 hours ago [-]
Fable is still unavailable to me on both web and Claude Code. I'm in Indonesia on Max plan (new paid user, only 2 weeks in). Are there specific steps to re-enable it?
So it seems like David Sacks was right that the US government only really got involved because the Amazon/AWS CEO complained about latent security threats [0] and that the government was reluctant to actually issue the export control.
Sacks has said so many obviously false things in reference to government actions (Jan 6th, Ukraine, etc) that there is no reason to trust anything he says.
If the Trump administration wants him to say something, he says it. Maybe what he is saying is true, maybe it isn’t. There is no way to know.
The story they are telling is exactly the same whether it was true or they were just shaking down Anthropic for no reason.
nl 6 hours ago [-]
I think this is oversimplifying things.
There are many different factions within the administration. Sacks was part of the "deregulate the tech sector" faction, which on this issue is aligned with the "beating China overrides anything" faction.
That's distinct from the Pete Hegseth faction (I don't really know how to characterize his faction other than anti-woke maybe?).
Sometimes these factions agree, sometimes they don't.
In general your approach is right - you can't trust most things coming out of this administration. But you can try to unpick what actually happened by looking at who is saying what, and when. That is useful even without liking the people.
9 hours ago [-]
5 hours ago [-]
matheusmoreira 10 hours ago [-]
Good to hear. I was going to cancel my subscription if I couldn't use Fable. No point in paying Anthropic to train models I can't use.
scriptsmith 8 hours ago [-]
Definitely took longer than I was expecting, then after two weeks I thought we'd never get it.
zxilly 4 hours ago [-]
So we will get GPT 5.6 soon?
drivebyhooting 7 hours ago [-]
When is Google coming out with an equivalent Gemini model?
low_tech_love 3 hours ago [-]
Let me guess, somebody magically placed some huge bets on poly and the stock market 5 minutes before the announcement?
tinypak 9 hours ago [-]
I see, so that explains why people are starting to talk about Claude Fable 5 and how I'm now going to have to buy $6,000 of compute for our startup
dakolli 9 hours ago [-]
How to light 6k on fire.
standardUser 9 hours ago [-]
Better than a foosball table and kegerator.
DaSHacka 2 hours ago [-]
Woah woah woah, let's not get ahead of ourselves here.
Sabinus 10 hours ago [-]
The classic chaotic governance model and creation of an uncertain business environment by the Trump admin in the most important industry for the US economy.
dakolli 9 hours ago [-]
If this is the most important industry for the US economy, the US is screwed. and .... Hopefully you're correct.
In past Empires kings bet their entire nation's future on the words of soothsayers, people who said they could predict the future. It seems like Machine Learning engineers are the magicians of the Empire of the modern age.
Sabinus 9 hours ago [-]
>If this is the most important industry for the US economy, the US is screwed
Depends on how economically useful AI turns out to be. It will be useful, but it needs to be VERY useful for the current valuations.
>In past Empires kings bet their entire nations future on the words of soothsayers
I think AI's rise is much closer to the story of factory machines and computers than to soothsayers and emperors.
laidoffamazon 9 hours ago [-]
So how much did they have to donate to the MAGA PAC for this one?
HardwareLust 9 hours ago [-]
That's what I was thinking, the check finally cleared.
tjohnell 10 hours ago [-]
Who knows - this could be the last model we see from Anthropic. Or it just becomes the wild west and we figure it out as we go.
alex_anglin 10 hours ago [-]
Isn't it the wild west already?
On a lark, I asked Claude to compare AI to the wild west a while ago. It raised three points of similarity:
- Land-grab economics
- Lack of regulation
- Changing social and professional attitudes.
Whatever it is, it's a wild ride regardless.
flextheruler 9 hours ago [-]
[flagged]
Tenoke 10 hours ago [-]
For non-Americans especially it does look possible to be the last one.
Sabinus 10 hours ago [-]
Chaotic governance model and uncertain business environment by the Trump admin as usual.
estearum 10 hours ago [-]
Hey it's a perfectly pro-business environment as long as your business kisses the ring vigorously and continuously with perpetually escalating intensity.
10 hours ago [-]
natch 10 hours ago [-]
They need Lehane or… since OpenAI got him, what is Fabiani up to these days?
tamimio 10 hours ago [-]
So after this publicity they got, they will release a locked down version of the models, did I get that right?
sgc 10 hours ago [-]
Sounds more like they are implementing mass surveillance and reporting whatever the US Gov wants for 'security reasons' back to them.
matheusmoreira 9 hours ago [-]
Weren't they already doing that since the beginning? Fable released with a data retention policy. I assumed US government surveillance was the reason for this.
tamimio 10 hours ago [-]
Yeah, it feels like a honeypot at this point. Gonna go with GLM instead
rvz 7 hours ago [-]
Smart thinking.
poopyscoopy 9 hours ago [-]
Debating if I should re-sub to the Max plan now (in case they grandfather people in somehow) or if I should just wait and see what they announce.
user43928 3 hours ago [-]
They announced a generous 3.5 days of subscription usage, plus a broken classifier.
jbritton 6 hours ago [-]
I wonder if Anthropic's servers can handle the load they’re going to get tomorrow. Unless it’s a staged rollout
alfiedotwtf 27 minutes ago [-]
For how long lol
HDBaseT 10 hours ago [-]
The question is how lobotomized will it be now?
woggy 6 hours ago [-]
Basically not usable if it's only available via usage pricing.
modriano 7 hours ago [-]
So, uh, any chance us Claude subscription people are going to get the 11 days of Fable 5 access (at non API pricing) we were deprived of?
impodimium 7 hours ago [-]
Huh did not expect them to lift restrictions this soon.
9 hours ago [-]
rbbydotdev 6 hours ago [-]
Howard Lutnick is the 41st United States Secretary of Commerce. Howard Lutnick is known to have had ties to Jeffrey Epstein. From what we know and what has so far been released to the public, he is even documented to have visited his island.
Havoc 9 hours ago [-]
So much for way too dangerous end of the world
vlian2088 9 hours ago [-]
>Anthropic has taken steps in close coordination with the U.S. government to address the risks associated with Claude Mythos 5 and Claude Fable 5. Among other things, Anthropic has agreed to proactively detect and address security risks associated with the models; to work diligently with the U.S. government on protocols and standards and releases for Mythos, Fable, and future models; and to inform the U.S. government of any malicious activity.
ah, I see. so, Chinese models are getting banned soon.
unchocked 10 hours ago [-]
w00t
jknoepfler 8 hours ago [-]
Almost as though they were indefensible bullshit to begin with. I wonder who extorted whom and for how much.
Like gee, that was fast. If this had any bearing on reality, one would imagine the vetting process would take actual time and that there would be a real, material difference between what we knew then and what we know now.
The cartoon bullshit theater is exhausting.
yashthakker 5 hours ago [-]
[dead]
slicendice 8 hours ago [-]
[dead]
colesantiago 10 hours ago [-]
This is great news,
I'm sure many teams couldn't do their best work because Claude Fable 5 was unavailable.
I wonder what their hiring pages look like now, are they starting to remove job postings?
rocketpastsix 10 hours ago [-]
there is no way Fable and Mythos had such an impact in such a short amount of time that people were hiring based on it.
colesantiago 9 hours ago [-]
They certainly are now that the export ban is lifted.
ungovernableCat 9 hours ago [-]
They haven’t even restored access yet lol, they’re doing that tomorrow.
msephton 31 minutes ago [-]
In some parts of the world it was already tomorrow when they made the announcement.
esseph 9 hours ago [-]
... Care to explain your thoughts on this?
I'm absolutely fascinated.
remexre 6 hours ago [-]
User Wanted: Minimum 5y agentic development experience with Fable?
there's no shortage of talent in Europe or France, it's just an issue of available capital
On top of that, the intelligence is being dialed down. Sonnet 5 is living proof of this. Fable has strong guardrails, but the new Sonnet is a dumbed down expensive model, which already falls behind GLM 5.2 and Kimi 2.7. I might go back to Claude since I know Fable is just a limited offer, and I am not going to pay for API usage. But what they are signaling with Sonnet will also come to Opus. A lobotomized more expensive model.
I am honestly baffled how the current administration is giving the whole world, on a golden plate, to China. And they don't seem too bothered about it. They are living in their own bubble and reality distortion field I guess.
I could go on endless rant about Dario, but I feel I am so strongly biased now that my judgement might be clouded.
Time to move on
If that's a comfortable position for you, all good.
We held the US in higher regards, that's all.
yes, I guess the only thing the europeans are good at seems to be complaining.
> We held the US in higher regards, that's all.
That's your fault, and now you're making the same mistake with China
I don't think anyone is making the same mistake with China, as open weight models can't be Thanos'd away.
Even the premier EU companies such as ASML are heavily reliant on US supply chain.
But why can't we be bitter?
The switching cost of changing LLM providers is as low as it gets. All the individuals and startups I know try different models all of the time, even down to the level of choosing which provider to use based on the task. Bigger companies move slower but only because they have lawyers and teams negotiating contracts, not because there is a technical reason that it's hard to switch.
Companies have dealt with supply chain unpredictability by having multiple providers and switching between them since forever. It's infinitely easier to switch LLM providers than it is to deal with physical supply chain uncertainty.
Assessing quality of output is often not trivial, either. Typically, problems that are solved by offloading something to an LLM are super subjective, and customers “feeling” that something is different is a vulnerability.
We try to quantify output differences by many different similarity metrics. But a lot of energy goes into subjectively evaluating if something still works.
Not trivial, you would need to do lots of evals and prompt tuning when you switch models.
imagine what happens when you optimize your agent skills to the current model, and new model starts breaking. you would need to have versioning for your skills, serving different skills based on the model while you do A/B testing
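One way the per-model skill versioning and A/B testing described above could be sketched, assuming skills are plain prompt strings keyed by skill name and model. The `SKILLS` table, the model names and both functions are hypothetical illustrations, not any vendor's API.

```python
import hashlib

# Hypothetical registry: one tuned skill text per (skill name, model) pair.
SKILLS = {
    ("summarize", "model-a"): "summarize prompt tuned for model-a",
    ("summarize", "model-b"): "summarize prompt tuned for model-b",
}


def ab_bucket(user_id: str, candidate_share: float = 0.1) -> str:
    """Deterministically assign a user to 'candidate' or 'current'.

    Hashing the user id keeps each user in the same bucket across requests,
    so a user does not flip between models mid-experiment.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")      # uniform in [0, 2**64)
    threshold = int(candidate_share * 2**64)
    return "candidate" if value < threshold else "current"


def pick_skill(skill: str, user_id: str, current_model: str,
               candidate_model: str, candidate_share: float = 0.1) -> tuple[str, str]:
    """Return (model, skill text) so the skill version always matches the model."""
    bucket = ab_bucket(user_id, candidate_share)
    model = candidate_model if bucket == "candidate" else current_model
    return model, SKILLS[(skill, model)]


model, skill_text = pick_skill("summarize", "user-42", "model-a", "model-b")
```

Keying the registry on the model means a new model can be trialled on a small share of traffic without touching the skill that the current model depends on.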
Even if you won't be able to use some model tomorrow, you can still make money by using it today!
And in the age of limited compute, spiky workloads and constant outages, building a mechanism to fallback to a weaker model when your primary choice isn't available is smart anyway.
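A minimal sketch of the fallback mechanism mentioned above, assuming each provider is wrapped in a function that raises a shared exception when the model is unavailable. `ModelUnavailable`, `complete_with_fallback` and the two stand-in providers are hypothetical; real client libraries raise their own error types.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a provider wrapper on outage, rate limit, or access block."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in preference order; return (provider name, completion)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")   # remember why, then try the next one
    raise ModelUnavailable("all providers failed: " + "; ".join(errors))


# Stand-in providers for illustration only.
def flaky_primary(prompt: str) -> str:
    raise ModelUnavailable("503 overloaded")


def weaker_backup(prompt: str) -> str:
    return f"[backup] {prompt}"


used, text = complete_with_fallback(
    "hello", [("primary", flaky_primary), ("backup", weaker_backup)]
)
```

Returning the provider name alongside the text lets the caller log how often the weaker model is actually serving traffic.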
If you make money from doing anything like "produce software with as little human involvement as possible", then sure, you need SOTA models. In that case, though, the value you add is very little and you probably don't have a sustainable business.
OTOH, if you make money by getting clients to pay for features, there is very little difference in time-savings from using Anthropic/OpenAI SOTA over GLM-latest.
IOW, if your business can only make money by one-shotting software, you probably don't have a business in the first place.
Regards, another small business owner.
In some cases they do. I work in a B2B vertical SaaS company and there’s both features that competitors build or rough edges around our features that make clients go „either we get X or we sign with someone else”. I agree though with the general sentiment that you don’t need SOTA models to build those - humans or humans + mid pack strong model will do.
It sounds like your business is selling completely agent-coded products. I don't know how long that will be viable, or even if it is right now.
In my part of the world, I am completely unable to sell completely agent-coded products, so even a SOTA model is useless. The majority of my time is spent on analysis outside of coding anyway, so when I bill it's not based on how many lines of code I've added, it's based on whether the goal of the customer is satisfied.
We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.
So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.
Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?
But at exactly which point in time is z.ai being compared to claude.ai? Consistently being "6 months behind" in an exponentially accelerating evolution means the gap is growing exponentially wider, not constant.
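A worked version of that claim, under the assumption (not established anywhere in this thread) that capability can be put on a single scale that grows exponentially at rate $k$, with the follower lagging the leader by a fixed time $\Delta$:

```latex
\begin{aligned}
C_{\text{lead}}(t) &= C_0 e^{kt}, \qquad
C_{\text{follow}}(t) = C_{\text{lead}}(t-\Delta) = C_0 e^{k(t-\Delta)} \\
G(t) &= C_{\text{lead}}(t) - C_{\text{follow}}(t) = C_0 e^{kt}\left(1 - e^{-k\Delta}\right) \\
\frac{C_{\text{lead}}(t)}{C_{\text{follow}}(t)} &= e^{k\Delta}
\end{aligned}
```

Under these assumptions the absolute gap $G(t)$ grows exponentially, while the ratio between the two stays constant, so whether "6 months behind" is a widening gap depends on which of the two measures matters for the task.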
A couple: usually 2, though not always
A few: 3, 4, 5
Several: 4, 5, 6, or 7.
I had to explain this to my German friend. In my understanding this isn't about the actual number, it's about the certainty. If it's absolutely and definitely two, then I say two. If I'm uncertain but it's probably two, or if a non-integer, somewhere around two, then I say couple.
And few is more likely to be 3 than 5, because 5 is getting close to a "half-dozen or so", or (as you say) several.
Many is very context-sensitive, as the meme has it.
So I would agree that the open models are a few months behind, definitely more than a couple of months behind, possibly several months behind, maybe a half-dozen months or so behind, but not many months behind.
3 or 4 would likely be a few, or some. 1 is, well, one.
The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.
China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg claude tokens, because they're that much worse.
I don't know what I was thinking.
How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.
#1 I've had use cases where it was clearly obvious the Chinese models were behind.
#2 I've also had use cases where I couldn't tell a difference at 1/20th of the price.
The problem is - #1 is the use case where the American frontier is gated behind saboteur classifiers, and it is a tiny minority anyway. The vast majority of work is #2.
The gap doesn't matter anymore.
There's a lot of subjectivity in determining this, but I'm 100% sure that 10 months is wrong.
I don't know whether the gap is currently growing, but I'm not sure it matters. There are thresholds where models reach certain levels of usefulness. Opus 4.8, for example, is at a level where I can give it relatively vague input, and it can go for half an hour on its own and produce a high-quality PR.
If GLM reaches that level of capability and can do that task more cheaply than Anthropic's model, I will use GLM for that task, because that's a specific type of task I use models for. It doesn't really matter whether Anthropic also has a better model, because what does "better" mean in this context? It's a clearly defined task, and Opus 4.8 already does it at a very high level of quality.
I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model in the last few weeks.
If the USA wanted to shoot its AI industry in the foot it achieved its goal.
And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs when they probably dwarf Anthropic's usage and are widening the gap due to conquering a chunk of Western market too.
I know it's hard for some to comprehend there's an entire Eastern hemisphere in the globe with billions of people, so it's worth reminding. And some seem to think the world is basically silicon valley even.
Can you comprehend that Anthropic is winning because it is both cheap (subscriptions) and the better SOTA? People are cheering Chinese providers when in reality they would rugpull open weights the moment they are competitive.
Chinese models are trash; that's why they are giving them away for free.
For individuals and small companies, subscriptions are the best deal; for big companies, Chinese models are a big no unless they can host them.
Sure… but which ones? How can you know ahead of time?
I just did a “simple” upgrade project where both me and the AI kept tripping over dead code, subtle typos, and difficult-to-trace live versus dead code.
Many times when I used “Medium” thinking I got bitten, but not every time, and I couldn’t predict when.
So “Extra high” it was, for the entire project.
Far fewer nasty surprises!
When there isn't a zero-risk option, the question becomes which risk is smaller.
Yes.
If.
Man I hope this tech FOMO eventually stops.
Companies generally fail because either their product doesn't meet a market need, or the market doesn't exist in the first place (possibly because of bad timing), and not simply because their competitors outran them.
These aren't things fixed by using a frontier model to vibe code faster in lieu of one 5 months behind.
What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?
For a change, I let DeepSeek V4 Pro implement it on Max thinking level. Nothing too out there - some DB migrations, some Django back end changes and Vue SPA front end changes.
Implementation time in total including tests was a few hours, so nothing too egregious. However, one of the migrations would break with pre-existing data, one of the column references in the entity was wrong, the API endpoint wasn't made consistently with the others in adjacent code (e.g. permission checks) and the front end had a Pinia state related issue and submitting one of the forms didn't work.
Tooling was run: ruff, ty, Oxfmt, Oxlint, also the Docker build was green across the board, but the overall feature just didn't work. In both cases, sub-agents with clear context would review the code for serious/critical issues, at least three in parallel, and do review loops until they spot nothing. Both harnesses have LSP integration.
Opus spent another hour fixing it, needed a few iterations, because I couldn't be bothered there.
> What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?
The difference largely was not needing to waste time in fixing all sorts of subtle bugs that sub-optimal models will produce, worse yet if it was some sort of a serious project and those wouldn't have been spotted but instead that slop would have gotten shipped.
That said, Opus isn't ideal either and messed up a whole bunch when I was training some neural nets and trying to process a bunch of satellite data and configure Garage to store them so that tiles can be served from a slow HDD and stuff like that. Obviously, it also needs a lot of babysitting in regards to UI looks, but it's better at the rest of development.
I think that DeepSeek V4 Pro and GLM 5.2 are cool though, it's just that you want as many checks and tests as you can throw at any given problem, or to use languages that make shipping completely broken code less likely.
I think it’s excessively charitable to assume businesses are uber-competent ROI-chasers. The expense people are eventually going to win on AI too, this blip of unrestricted AI budgets will be gone soon.
They are overused in sitcoms because it’s easy for actors to mimic on demand unlike several other reactions.
Example. Yesterday I listened to the technical lead of a customer of mine digging himself into a hole by not understanding what it would mean to expose AWS EFS to their on-premise server over NFS. There were just too many unknown unknowns for him and he had no time to ask the AI (and even if he had, I'm not sure that he would have understood). His boss, who has actually used NFS, had to stop him. I didn't say a word.
So, he could have coded the migration of a server from AWS to on premise, asked Claude to write also all the configuration scripts and policies but then what?
It's almost identical to the possibility of one model getting shut down for a business that doesn't care about SOTA.
TBF I do burn 200k tokens just preloading the context with onboarding, not including any code, just document trees of development policy documents, style and architectural standards, code and documentation review processes, company ethos and culture, etc. it’s a token fire, but it really works for us.
Also, documentation driven development all the way down.
That's rare, though. If they could not untangle their own code after 4 months, it's because they were not making enough money to pay a team to untangle it - that's not a code problem, it's a revenue problem.
IOW, the startup failed because their revenue was too low.
I've stared at ugly LLM code that I had just generated and that worked well enough for my purposes (generally, some quick recursion into a nested python dictionary in order to dig out some property -- especially for linting or quick data analysis).
And I wanted something better, sure, something a bit more readable ...but I just needed it to work well enough to recurse through a yaml file for config file linting, not be battle-hardened against every test case.
So to deal with the mess, I shoved it in a pure function, threw a few basic sanity unit tests around it, put a comment with a disclaimer of "#this is LLM generated code, it is lightly tested, do not use it for anything truly load-bearing without a lot more tests" and I moved on to something else.
Not everything has to be bulletproof.
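For illustration, the kind of helper described above might look like the sketch below: a pure function that recurses through nested dictionaries and lists, with a couple of sanity checks. The `find_key` name and the sample config are hypothetical, and the data is a plain dict standing in for whatever a YAML parser would return.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in nested dicts and lists.

    Lightly tested, as in the comment above: do not use it for anything
    load-bearing without a lot more tests.
    """
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)      # keep digging below this value
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)
    # scalars (str, int, None, ...) have nothing to recurse into


# Hypothetical config, shaped like the output of a YAML parser.
config = {
    "service": {
        "name": "api",
        "deploy": {
            "replicas": 3,
            "sidecars": [{"name": "proxy", "replicas": 1}],
        },
    }
}

# Basic sanity checks, not an exhaustive test suite.
assert list(find_key(config, "replicas")) == [3, 1]
assert list(find_key(config, "missing")) == []
```

Keeping the function pure (no I/O, no mutation of the input) is what makes a handful of quick assertions a reasonable safety net here.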
If you are, in fact, "a technical product manager", I would hope you understand that "bad code" is identified as such specifically because it "impacts the business."
The engineers I have worked with most definitely define "bad code" as having intrinsic limitations and/or latent defects which impact successful system functionality/operation. Indicators provided to stakeholders such as yourself which support this assessment include, but are not limited to:
All of the above impact a business. It is up to you, the "technical product manager", to understand what your team is trying to tell you.
Everything you're saying is true, sometimes. Assume I'm still right, and that you might be able to learn something from someone else.
I do not see how I was being rude, unless it was my use of quotations around the title you claim.
> I'm a human being ...
I did not doubt this.
> ... I'm a very experienced product manager and engineer ...
Again, if it was my use of quotations which you found to be rude, then I do not know what to say about that.
> ... and the way you are behaving sucks.
I respect your perspective and support your right to express yourself. And no, I do not think you are being rude by doing so.
> Assume I'm still right ...
Why would I? You responded to:
>> This is a site full of developers who are convinced that "proper software engineering" is 100% of what makes a business successful, and everything and everyone else is useless.
With:
> As a technical product manager, this 1000%.
Finally, you write:
> ... you might be able to learn something from someone else.
Maybe you can learn something from someone else as well.
Of all the "concise" and "beautiful" code I worked hard to produce, I was the only one to ever lay eyes on it. It didn't actually matter, and nobody cared but me. The people in charge of my raises could never perceive quality of code, because it wasn't their area of expertise. They only cared (rightly so) that it did what it was supposed to, and all the elegant abstractions didn't practically help that purpose. It was, literally, wasted life that I should have spent just getting off work early, like most of my colleagues.
People need to get to grips with that fast.
Distribution, relationships, processes, mindshare, marketing, and politics matter. Code is just ephemeral glue and implementation detail.
Just 99.999%.
Get over yourself. We're all ephemeral, dead and recycled in the blink of an eye. Our species doesn't even clock on the geologic timespan.
If you think your code (or any of your artifacts or possessions) matter beyond their immediate utility, you're mistaken. Work will either fall into disuse or be replaced. It's scaffolding for what comes next along a well-traversed path.
I can only imagine what people are doing at their jobs with unlimited token budgets.
That's irrelevant. What's the increase in revenue?
It's not just statistics either. I know for a fact that I made major progress by using LLMs. Here's a summary from around a month ago:
https://news.ycombinator.com/item?id=48407642
AI is world changing technology as far as I'm concerned.
The opportunities available for these people are rapidly, rapidly shrinking. I believe it's possible to be a developer today who's EXCEPTIONAL and never uses AI. Most opponents are not exceptional, though, and even these opportunities are shrinking.
Most exceptional developers in my org adopted AI in their workflows and went from 10x developers to 20x developers.
If you refuse to adapt, you're going to be out of a job complaining about the kids and their newfangled technology REAL quick. You have a few years remaining, maybe less.
I can’t turn 10x work into 20x work because my Product Manager thinks changing fundamental premises of tasks I already spent two weeks on (mostly removing human blockers) is very simple. After all, when he asked Claude to update his prototype, it only took it 10 minutes.
I can’t turn 10x work into 20x work because the company dedicated entire teams to write company-wide skills for everything. They suck, but if I don’t use them, I’m not following the new “golden path for engineering”, and I lose points in my performance review.
I can, however, turn 10x work into 20x work, or even much more than that, if AI actually did what it’s promising and eliminated most of my team, the product manager, and the middle managers. Or me. I could use a break.
> Speed.
Speed of what?
Speed of understanding what needs to be done? I highly doubt it.
Speed of LoC checked into git? Sure, I'll give you that.
But one can use any number of tools to generate hundreds of thousands of lines of code. See any build tools which support specifications such as RAML, OpenAPI, CORBA, etc.
So I ask again; speed of what?
fixing more serious regression also easier. connect honeycomb mcp, ask agent to debug while i walk to coffee and get some pistachio rose dates. by time im back with my oat latte ive got a full report on what happened and can send the next slack message to fix.
life is good
I am appalled none of this is clicking with you anti-AI folks. This is all so exciting -- alarming even! --, and software careers are never going to be the same.
I don't know how you just metaphorically stand there and act like nothing at all is happening. We've never seen anything like this in our entire lives.
Some of you are standing right in front of the steam roller, yelling to all of us that steam rollers aren't real.
A: The sky is blue! B: No it's not. A: Yes, it is, please look up. B: No, you must prove it to me through reason. A: But, if you would just pretty please look up. B: No.
I run a company, I've been running it for 10 years, we do alright. I'm a shitty manager. Every time I've hired developers, the business freezes. The business isn't anything super important, the main consequence of bugs is that my family loses money. Everything has always rested on my shoulders. In theory there is some path for me to become a good manager, but I never landed on it. But now, with Claude, it's great. So far Claude has paid itself off in real profits at least 20x over, and that's with significant API usage on top of the monthly sub. I can prototype new features in an afternoon that before were on my giant "maybe someday if I ever get to breathe" list. Our user experience has improved in so many ways that I knew were probably worth it, if I could just find the time. Now I can.
There are situations where yeah, it probably isn't ready yet. But, there are so many where it's amazing. Seriously, it's worth looking up.
My point is and remains:
For all of you people who think these LLMs are “earth shattering”, how the hell do you reconcile that with the fact that it’s not a net positive for anyone but those who want to consolidate knowledge and power?
We are really looking at idiocracy in the making.
For actually building software, I'm starting to suspect a human with a dumber (but faster) model is going to get the job done quicker than Fable (and possibly even cheaper). Bug-finding and vulnerability detection is a different story.
At $JOB I have warned higher ups we should try to keep our expenditure under control, educate people that document slinging doesn't require Fable every time and demo the capabilities of the cheaper models, and been snubbed for it. When Fable is available once again our bill is going to be eye watering, relative to what it should be.
But for what I work on I mostly need high or xhigh SOTA model quality output. I don't have the time to deal with anything less.
If you're the one-shotting type, obviously then Fable might be useful, but I think only marginally. You don't need to bring a MANPADS to a duel at high noon.
Yes you use the right tool for the job.
But if the job requires the best intelligence you can get with an LLM, then you use that.
Taking as an assumption that the quality of your product is a function of the quality of the inference you are using: if you use an inferior model because "what if it gets export controlled again" and your competitors don't, then your competitors are likely to win.
If you don't need frontier models for your job then this is all moot, but the thread started with
> You cannot build a business critical function on top of American SOTA frontier model
Which is silly. HN likes to roleplay bringing everything "business critical" in house because sometimes vendors mess up. Self host, don't use the cloud, run open models locally, build redundant supply chains in case of another covid, etc etc. Sometimes the risk is real, but most of the time the risk is rare and the cost of an interruption event is less than the cost of bringing everything in house or using lower quality vendors "just in case".
So you use the frontier model, then when you can’t you accept things are less efficient. The alternative (right now) is to be less efficient all the time, I don’t see any advantage to that.
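A minimal sketch of that "use the frontier model, fall back when you can't" approach. `call_frontier`, `call_fallback` and `ModelUnavailable` are hypothetical stand-ins for whatever client code is actually in use, not any vendor's real API.

```python
from typing import Callable, Sequence, Tuple


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) client when a model cannot be reached."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each model in preference order; return (model_name, answer)."""
    for name, call in models:
        try:
            return name, call(prompt)
        except ModelUnavailable:
            continue  # model is down or restricted: try the next one
    raise ModelUnavailable("no model in the list is available")


# Stand-ins for real clients (assumptions for illustration only).
def call_frontier(prompt: str) -> str:
    raise ModelUnavailable("frontier model is paused")


def call_fallback(prompt: str) -> str:
    return f"[fallback] {prompt}"


used, answer = complete_with_fallback(
    "summarise this ticket",
    [("frontier", call_frontier), ("fallback", call_fallback)],
)
```

The preference order is just a list, so the less efficient path only kicks in while the preferred model is unavailable.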
But, it is a big own goal, because once you invest in building evals for your internal use-case, 1) it’s easier to switch your model to whatever is cheapest, and 2) it’s way easier to fine-tune an oss model.
Evals are annoying to build and most companies were fine to rest on vibes. Now many companies have to do the work for insurance.
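To make the "evals let you switch to whatever is cheapest" point concrete, here is a rough sketch. The model names, prices and eval cases are all made up for illustration; a real eval set would call actual model endpoints and use far more cases.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each eval case is (prompt, checker); the checker decides if an answer passes.
EvalCase = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of eval cases the model's answer passes."""
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def cheapest_passing(
    models: Dict[str, Tuple[float, Callable[[str], str]]],
    cases: List[EvalCase],
    threshold: float,
) -> Optional[str]:
    """Name of the cheapest model whose pass rate meets the threshold."""
    ok = [
        (cost, name)
        for name, (cost, model) in models.items()
        if pass_rate(model, cases) >= threshold
    ]
    return min(ok)[1] if ok else None


# Toy stand-ins: hypothetical models and prices, not real products.
cases = [
    ("2+2", lambda a: a == "4"),
    ("capital of France", lambda a: a == "Paris"),
]
answers_big = {"2+2": "4", "capital of France": "Paris"}
answers_small = {"2+2": "4", "capital of France": "Paris"}
answers_tiny = {"2+2": "5", "capital of France": "Paris"}
models = {
    "big": (10.0, lambda p: answers_big[p]),
    "small": (1.0, lambda p: answers_small[p]),
    "tiny": (0.1, lambda p: answers_tiny[p]),
}
choice = cheapest_passing(models, cases, threshold=1.0)
```

Once the cases exist, swapping vendors is a matter of adding another entry to `models` and rerunning the selection.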
A week or so pause from seemingly legitimate cyber security concerns isn’t cause for panic. But it should be backed by laws that describe what that process should be. That would put the market at ease
The reality is this is world-ending technology and absolutely nobody knows what to do or can even agree that the problem exists.
The reality is that the "people in power" believe it is "world-ending technology" and will therefore use it in world-ending ways. People are absolutely 100% the danger here, not the technology.
The SOTA frontier models have value elsewhere, not monetarily perhaps, but certainly per user. Quite a few cool things have come out of that brief Fable window. There should be more.
Yes 1000%, please, all my European competition please don't use mythos whatever you do it's total USA trash and the Chinese models work better anyway.
Why is it good if they give their money to China?
> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
https://news.ycombinator.com/newsguidelines.html
Nobody should be putting loadbearing weight on Amazon or Microsoft with their ruthless monopoly ambitions, yet here we are
Until it goes down, or Anthropic raises prices again.
Fable is already expensive to use compared to GLM and they want you to use the API as much as possible so you get a worse deal.
Yes I can!
I mean, this was already pretty clear before. But it surely didn't help!
So if you decide to give your money to communists, you can offer whatever rationale you like, but not this one. Do so, and you will lose your last competitive edge in this domain: ASML. After that the EU will become a purely agricultural region, since its edge is lost almost everywhere else.
They crippled their own domestic entities with the AI Act. (see the Mistral CEO's rant.)
If you want to use frontier models until then, you're gonna use what's available, and that's US models.
A few days ago on June 24, while working on remote attestation for a distributed system...
What are we to think? Does the invisible competitive-use mechanism exist in Opus too, and is it only documented in Fable? How long has it existed? Is it still in effect? -- These are the kinds of questions developers will ask themselves from now on. This is why it was one of the stupidest things Anthropic could have done. Developers will now question everything and rightly so. There's no attestation protocol for that. How will they know?
[1] "In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.
Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. When these interventions are active, we expect them to have minimal behavioral impact on the model except to limit its effectiveness in developing frontier LLMs. Claude will still respond helpfully to user requests. We’ll continue to improve the precision of our detection methods following the launch of this model."
Source: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...
Like never before, we deliberate specifically on non-US solutions (objectively great) because we realized we are neck-deep in US dependence. It is not that it was not known before, we just did not feel the threat.
This is why EU companies now look at our own solutions (which are late and will probably suffer from the incompetence and mess of the EU institutions), but another key player is around the corner, namely China.
Trump managed to make us look elsewhere than the US. Thanks for that.
But "downvote for disagreement" is a legitimate use. I personally tried to tell someone that it wasn't, and I got corrected by dang.
This made me realize it's a waste of one's time to write thoughtful, informative, educational posts only to have them buried and downvoted by man-children.
If we go by empirical evidence alone, it's a more effective use of time making Reddit-quality quips.
I think Europe and Canada are just happy not to be frozen out of AI access completely at this point.
https://news.ycombinator.com/item?id=48709670
https://news.ycombinator.com/item?id=48721903
Of course Anthropic is still relevant, but people have realized they’re not special, and between this and the ID verification thing, they’ve given up a ton of their relevance vs a month ago.
I wouldn't personally pay API pricing for it for my personal projects, but I bet it's going to be absolutely slammed with usage for the next month+.
OTOH for most of my day to day work I've come to realize that faster ~ Opus 4.6 / GPT 5.3 level capabilities could be the sweet spot as scaffolding has to be put in place right after clean specs and constant review anyways. The latest chinese models and GLM 5.2 in particular felt on-par on that front.
I work in AI / infrastructure and I have never seen as much interest towards investing into sovereignty by actual deciders. Thankfully, at this point I can't see any flip-flopping / change of messaging stopping that train.
In CA/EU over the last ~15 years, one used to be perceived as a bit of a "weird systems person" by just proposing alternatives to the big hyperscalers.
So the Trump administration, hands-down, has been the greatest ally here.
In tandem, I was hoping Anthropic would be keeping "dangerously capable" models banned from "evil Chinese distillers" for as long as possible.
No Mythos class model will be allowed to be legally hosted for download on any service. All powerful nations will ban this since safeguards are not guaranteed by shady service providers running these models.
For the Chinese first party providers, they will be forced to implement the same process and safeguards, and they will not be allowed to release the model weights to the public.
Why? Because no sane nation is going to put that kind of capability in the hands of the public only for the public to use that power against that nation's best interests.
Save this comment. It is prophecy.
> After a series of productive conversations with the US government, we're redeploying the model with a new set of classifiers to target and block more cybersecurity tasks. In the near term, some routine tasks like coding and debugging will fall back to Opus 4.8.
Edit: the above was from their tweet announcement at https://x.com/AnthropicAI/status/2072163884430229756 ... the associated blog post at https://www.anthropic.com/news/redeploying-fable-5 suggests it was just poorly written and coding can still be done with Fable, just with overeager bouncing of "some routine coding and debugging tasks" to Opus.
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks.
Here's Fable 5, the strongest model. Actually try to use it to harden your code and it turns into Opus 4.8. You have seven days to use it, and only half of that time's worth in actual usage. Enjoy.
Looks like it's going to be a thoroughly frustrating experience, even worse than initial rollout. For subscription users, the situation is almost indistinguishable from the export ban.
https://news.ycombinator.com/item?id=48466313
Just a code review of my own project. Downgraded to Opus 50% of the time while evaluating the critical I/O and memory safety parts, the exact thing I wanted it to do.
And now it's gonna be even worse.
I expect the strong cybersecurity model to help me strengthen the cybersecurity of my project.
> not allowed to cover security topics
They said it wouldn't be usable for offensive purposes. This is the opposite of that.
The cybersecurity model is Mythos, which was never made publicly available. It is only available to a list of US government approved companies.
> They said it wouldn't be usable for offensive purposes
No, they said Fable would refuse for cybersecurity and offensive purposes. You are conflating Fable with Mythos.
I guess the underlying issue is that there is this model that is very capable, but it's being hobbled because of a fear of abuse. It may well be justified, but for a legitimate user any restriction just makes it a worse product and after all the puffery around how good it is (and some practical experience of how good it is) it's a pretty shit experience. "Here's our best model, no you can't really use it".
Fable 5, harden my openssl project. Then you use the diffs/summary to find out what the bug is for your exploit.
This "only super special corporations get the model" nonsense is dividing society into haves and have-nots.
Anthropic was correct in their assessment and early warning of Mythos's capabilities, and they did this rollout pretty well. They were not hype marketing. They were being genuinely cautious and honest.
The Trump admin was largely unreasonable with the sudden export control. (Though not entirely unreasonable.) The export control also had not much to do with Anthropic's pre-release warnings. See: GPT-5.6 currently being held up by the federal government.
I think the Fable ban happened because Anthropic was first to release a capable enough model.
Honestly, why bother with it? They are effectively just releasing the model in-name, but we just get Opus 4.8.
They almost definitely mean "you will notice even more false positives during seemingly routine coding/debugging tasks than you did at the initial launch". Which is not surprising, given the ordeal they've been put through. Hopefully it won't be too bad.
The main depressing thing for me is it's now only 7 days on the subscription, and then full API pricing, with no mention of even a plan to bring it back to the subscription in the future. (The initial launch mentioned two weeks of subscription, then API pricing, then a hope to return it back to the subscription not long after.)
"In the near term" is doing some heavy lifting.
> The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks. As with all our safeguards, we’ll continue to refine this to better distinguish genuine misuse from legitimate requests and reduce false positives.
> Fable 5 will be available starting tomorrow, Wednesday, July 1, to users globally on the Claude Platform, Claude.ai, Claude Code, and Claude Cowork. For Pro, Max, Team, and select Enterprise plans, Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits. We will re-enable access on AWS, Google Cloud, and Microsoft Foundry as quickly as possible.
https://www.anthropic.com/news/redeploying-fable-5
Reading the full blog post, I think the summary was just poorly written (because it's hard not to read that sentence like all coding is redirected to Opus).
EU is looking and charting its course already. Yeah, we can joke about it, we can mock it but it is in momentum already, one step at a time.
Source: https://x.com/AndrewCurran_/status/2072103733715194048?s=20
-------
June 30, 2026
Tom Brown Chief Compute Officer Anthropic 548 Market Street San Francisco, CA 94104
Dear Mr. Brown:
Since the issuance of my previous letters, dated June 12, 2026 and June 26, 2026, Anthropic has taken steps in close coordination with the U.S. government to address the risks associated with Claude Mythos 5 and Claude Fable 5. Among other things, Anthropic has agreed to proactively detect and address security risks associated with the models; to work diligently with the U.S. government on protocols and standards and releases for Mythos, Fable, and future models; and to inform the U.S. government of any malicious activity.
In light of these actions and commitments, as well as the Bureau of Industry and Security's evaluation of the diversion risks now presented by Claude Mythos 5 and Claude Fable 5, the controls in the June 12 letter are withdrawn. A license is no longer required for the export, reexport, or in-country transfer, including deemed export or deemed reexport, of the Mythos or Fable models.
Commerce reserves the right to reevaluate the decisions made in this letter and the necessity of reimposing a license requirement, should circumstances change or should Anthropic fail to adhere to its commitments.
If you have any questions about this letter, please contact me or the Under Secretary of Commerce for Industry and Security, Jeffrey Kessler, at (202) 255-1864.
Sincerely,
Howard W. Lutnick
------
Looks like Anthropic paid the Danegeld. Now they'll never get rid of the Dane.
https://www.britannica.com/place/Danelaw
https://archive.is/9k7qt#selection-2001.41-2001.49 https://archive.is/dybOE
> routine tasks like coding and debugging will fall back to Opus 4.8.
For cyber and bio related requests it just refuses.
https://openai.com/index/previewing-gpt-5-6-sol/
I assume they did something to the model itself.
Either way, I do hope they lift those draconian bans. Using the model was a terrible experience because of the constant downgrades. I didn't manage to harden my own projects before Fable got banned.
The CEO is also not the addressee of shipments of urinal cakes.
There is a deep, deep ignorance of export controls on HN, and I fully expect it will play out as another 500 comment thread of snark and incorrecting each other while blaming the government and not understanding a word of it.
FWIW an Empowered Official is not the person who cleans the espresso machine.
He likely does not have the domain knowledge nor is authorized to be the recipient of such a letter.
And that's ok. His role is to hire others competent in export matters. It's a learning experience for them.
One of the contributing factors that led to this control in the first place was that the commerce department couldn't get Dario on a call immediately:
"Then White House started reaching out to Anthropic to speak with Dario Amodei, who was at a wellness retreat.... When Amodei was finally available past 1pm, he had three tense phone calls with a combo of ppl including Cairncross, Bessent, Lutnick, Kessler, Will Scharf, Richard Walters, and Walker Barrett."
https://x.com/SophiaCai99/status/2065942612293365948
Anthropic has disputed that Dario was at a wellness retreat but both sides seem to agree that it was a problem (and it is very apparent that Dario's response made things worse).
It's shocking to me that Anthropic seems to be run with the same managerial chaos as depicted in early seasons of Entourage.
Dario may be a genius, but when it comes to running a big business — which involves dealing with governments and regulators — it's like he just fell off a turnip truck.
In the end, we need actual laws that tell the market what kinds of models get paused / analyzed, how long that pause can be, etc.
Otherwise there’s no standard and it will be easily abused and prevent investment in US AI companies.
"Anthropic has agreed to proactively detect and address security risks associated with the models" LOL, this was already happening.
This clown car administration just keeps making shit up and then backpedalling in a way that just leaves everything worse.
Looks like it's gonna be even harder to use than before, if not impossible. Subscription users only get it for a week, and only for 50% of that week's usage.
It's pretty clear that they didn't want this anyway, despite what the conspiracy theorists want to believe.
https://en.wikipedia.org/wiki/Anthropic%E2%80%93United_State...
are export controls the right thing ? Probably not.
but the american economy is over-exposed on "A.I" - the capital expenditure, while the Chinese are proving you don't need to spend tons of capital to get close to the frontier.
the Chinese have better building capacity & cheaper energy. that means the market has to correct at some point.
There's enough money and scale on the line that software affinity like CUDA is no longer the deciding factor and there's margin for custom stacks.
Even more so after the USA GPU exports ban which is proving to have backfired by speeding up China's tech growth.
The H200 was released Nov 2024.
Even allowing for Jensen exaggerating the risk there is no way China is 7-10 years behind.
Looking at manufacturing process nodes, SMIC N+3 is a 5nm process. 5nm was introduced by Samsung and TSMC in 2020, so that is at most 6 years.
But the chips they can produce on it are "roughly level with Android flagships from three years ago"[2]
TL;DR: China is more like 2-4 years behind than 7-10 years. If China developed EUV lithography then all bets are off.
[1] https://www.reddit.com/r/LocalLLaMA/comments/1kxw6b9/nvidia_... - see video.
[2] https://www.tomshardware.com/tech-industry/semiconductors/se...
To quote Pat Toulme:
There’s a big misconception about how GLM 5.2 was trained. Yes, they distilled Claude and GPT 5.5 — but distillation is not how they matched Opus quality. Distillation only fixed the cold start problem in RL.
RLing an agentic coding model isn’t rocket science. In simplified terms:
1. RL needs trajectories — rollouts where the model actually completed a task in some env
2. No successful trajectory on a task = zero gradient = you can’t RL it. This is the cold start problem
3. Distillation solves it. You seed your model with knowledge from a smarter one (Claude, GPT) on tasks it can’t do yet
4. Now it produces positive trajectories on those tasks
5. RL on those trajectories and hill climb agentic coding
6. At that point you no longer need to distill and can solely hill climb RL to better models
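The cold start point in steps 1-4 can be illustrated with a toy policy-gradient sketch. This is plain REINFORCE on a made-up four-action task, with "distillation" reduced to simply seeding the logits; it is an assumption-laden illustration, not a description of how GLM or any real model was trained.

```python
import math
import random
from typing import List

TARGET = 0  # index of the only action that completes the toy task


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_gradient(
    logits: List[float], rollouts: int, rng: random.Random
) -> List[float]:
    """Average REINFORCE gradient w.r.t. logits with a binary success reward."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(rollouts):
        action = rng.choices(range(len(logits)), weights=probs)[0]
        reward = 1.0 if action == TARGET else 0.0
        # d/d logit_i of log pi(action) is (1 if i == action else 0) - probs[i]
        for i in range(len(logits)):
            grad[i] += reward * ((1.0 if i == action else 0.0) - probs[i])
    return [g / rollouts for g in grad]


rng = random.Random(0)

# Cold start: the policy essentially never picks the successful action,
# so every rollout has reward 0 and contributes nothing to the gradient.
cold_logits = [-30.0, 0.0, 0.0, 0.0]
cold_grad = reinforce_gradient(cold_logits, rollouts=64, rng=rng)

# "Distillation" stand-in: seed the policy so success is at least reachable.
seeded_logits = [0.0, 0.0, 0.0, 0.0]
seeded_grad = reinforce_gradient(seeded_logits, rollouts=64, rng=rng)
```

With no successful rollouts, `cold_grad` carries no learning signal at all; once the seeded policy produces some successes, `seeded_grad` points toward the successful action and RL can start to hill climb.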
This is an interesting curve. I’d argue it’s harder to get to Opus 4.8 from scratch than to go from Opus 4.8 → Fable/Mythos tier.
GLM 5.2 is already producing positive trajectories, so they have plenty to RL on — they’ll keep climbing to Mythos quality without distilling any further. They no longer need American models.
https://x.com/PatrickToulme/status/2069211575437627743
Not exactly sure what the finish line in "the race to superintelligence" looks like and even moreso it's unclear why you think being there first is a critical benefit.
https://news.ycombinator.com/item?id=48519513
OpenAI is doing a much better job on this, offering generous usage limits to users at home. They also hand out usage resets for minor issues, that you can even apply at a time of your choosing.
It's nice that the restriction is going to get lifted but I hope this doesn't make anyone complacent that their coding work is going to be scrutinized by the US government, with AI, when using these models.
https://www.mixvale.com.br/2026/06/26/fbi-warns-brazilian-po...
We'll begin restoring access tomorrow, and will share an update soon.
We’re grateful to our users for their patience, and to everyone who worked with us on redeploying the models.
https://x.com/anthropicai/status/2072106151890809341?s=46
Opus 4.8, you did a lot of good work for me, but in the name of all things holy... I will not miss your communication style. So long and thanks for the fish.
Fable will literally sabotage you if it thinks you're trying to compete with Anthropic.
I'm hoping that some relatively cost-effective self-hosting solutions come about as a result of Hopper hardware being sold off as they're retired from DC use.
I don’t agree with this at all. IMO Anthropic has shown that they are willing to take even significant financial hits in order to stand up for their values and mitigate what they consider to be dangers and risks. Some people don’t like that or think it’s just marketing. But that’s exactly what Incorruptible is about: companies that are willing to take a stand, even in the face of overwhelming pressure from competitors, shareholders and naysayers.
You could legitimately argue this is a unique situation, a brief window where cybersecurity is being disrupted by new harnesses + a strong model. But that will be fleeting as other models and products adapt very quickly, and the long term benefits of keeping it from the market are questionable at best.
It's not a coincidence the export control was dropped after Dario (who is a hardcore AI safety activist much like Ilya Sutskever) was replaced by Tom Brown in the government negotiations.
Fable 5 might not be accessible for sub in the future despite their "best effort".
And 5.6-sol is as expensive as 5.5, so highly probable to be kept in sub.
So what's the plan? Hoping people stay on ClaudeCode because of Sonnet 5 while Codex offers 5.6-sol to subscription peasants?
Seems risky
Sonnet 5 today was incredibly slow for example
>We'll begin restoring access tomorrow, and will share an update soon.
>We’re grateful to our users for their patience, and to everyone who worked with us on redeploying the models.
From Anthropic on Twitter
Of course, it's possible that Fable remains drastically better than 5.6, but to whatever extent Fable is the true frontier (if temporarily)... it makes me wonder if external commitments on compute put a hard deadline on how long they could run Fable on the subscriptions.
Hmm? The linked tweet was posted at 16:52.
I only realised late that I had an algorithm problem that existing models were struggling with, and Fable had made progress with. It created a 14 phase plan, which I was able to execute with Opus after the restriction.
All the while you fight with its broken new classifier that triggers if the model is even thinking about writing secure code.
Apparently Anthropic cares nothing for their private users. This is insulting, and I hope they go bankrupt after losing enterprise share to OpenAI's more efficient models.
All aboard the hype train!
https://archive.is/HSIxa
https://archive.is/BbxA1
https://megalodon.jp/2026-0701-0918-51/https://x.com:443/Ant...
[0] https://news.ycombinator.com/item?id=48529358
If the Trump administration wants him to say something, he says it. Maybe what he is saying is true, maybe it isn’t. There is no way to know.
The story they are telling is exactly the same whether it was true or they were just shaking down Anthropic for no reason.
There are many different factions within the administration. Sacks was part of the "deregulate the tech sector" faction, which on this issue is aligned with the "beating China overrides anything" faction.
That's distinct from the Pete Hegseth faction (I don't really know how to characterize his faction other than anti-woke maybe?).
Sometimes these factions agree, sometimes they don't.
In general your approach is right - you can't trust most things coming out of this administration. But you can try to unpick what actually happened by who is saying what, when. That is useful even without liking the people.
In past empires, kings bet their entire nation's future on the words of soothsayers, people who said they could predict the future. It seems like Machine Learning engineers are the magicians of the empire of the modern age.
Depends on how economically useful AI turns out to be. It will be useful, but it needs to be VERY useful for the current valuations.
>In past Empires kings bet their entire nations future on the words of soothsayers
I think AI's rise is much closer to the story of factory machines and computers than to soothsayers and emperors.
On a lark, I asked Claude to compare AI to the wild west a while ago. It raised three points of similarity:
- Land-grab economics
- Lack of regulation
- Changing social and professional attitudes.
Whatever it is, it's a wild ride regardless.
ah, I see. so, Chinese models are getting banned soon.
Like gee, that was fast. If this had any bearing on reality, one would imagine the vetting process would take actual time and that there would be a real, material difference between what we knew then and what we know now.
The cartoon bullshit theater is exhausting.
I'm sure many teams couldn't do their best work because Claude Fable 5 was unavailable.
I wonder what their hiring pages look like now, are they starting to remove job postings?
I'm absolutely fascinated.