DNSSEC disruption affecting .de domains – Resolved

14 hours ago (status.denic.de)

Looks like a DNSSEC issue, not a nameserver outage. Validating resolvers SERVFAIL on every .de name with EDE:

    RRSIG with malformed signature found for a0d5d1p51kijsevll74k523htmq406bk.de/nsec3 (keytag=33834)

dig +cd amazon.de @8.8.8.8 works, dig amazon.de @a.nic.de works. Zone data is intact, DENIC just published an RRSIG over an NSEC3 record that doesn't validate against ZSK 33834. Every validating resolver therefore refuses to answer.
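
For anyone wondering what `+cd` changes on the wire: it sets the "checking disabled" bit in the query header, which asks the resolver to hand back data without DNSSEC validation. A minimal sketch using only the standard library; the helper names and the fixed query ID are illustrative assumptions, and dig itself sets further flags and options that are left out here.

```python
import struct

# Header flag bits as defined in RFC 1035 and RFC 4035.
FLAG_RD = 0x0100  # recursion desired
FLAG_CD = 0x0010  # checking disabled: resolver skips DNSSEC validation


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1, checking_disabled: bool = False,
                query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (one question, class IN, type A by default)."""
    flags = FLAG_RD | (FLAG_CD if checking_disabled else 0)
    # Header: id, flags, qdcount=1, ancount=0, nscount=0, arcount=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    return header + question


def response_rcode(packet: bytes) -> int:
    """Return the RCODE of a response header (2 means SERVFAIL)."""
    (flags,) = struct.unpack("!H", packet[2:4])
    return flags & 0x000F
```

Sending `build_query("amazon.de", checking_disabled=True)` over UDP to a validating resolver should reproduce the "works with +cd" observation, while the same query without the bit would come back with RCODE 2 during the incident.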

Intermittency fits anycast: some [a-n].nic.de instances still serve the previous (good) signatures, so retries occasionally land on a healthy auth. Per DENIC's FAQ the .de ZSK rotates every 5 weeks via pre-publish, so this smells like a botched rollover.
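
To make "pre-publish" concrete, here is a rough model of the four stages described in RFC 6781 (section 4.1.1.1). It only tracks which key tags are published and which one signs, so it cannot express a signature that is present but cryptographically wrong, which is what the EDE text reports. Which of the two ZSK tags seen in this thread is the old one and which is the new one is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneState:
    published_zsks: frozenset  # key tags present in the DNSKEY RRset
    signing_zsk: int           # key tag that produces the zone's RRSIGs


def pre_publish_stages(old: int, new: int) -> list:
    """The four stages of a pre-publish ZSK rollover."""
    return [
        ZoneState(frozenset({old}), old),       # initial state
        ZoneState(frozenset({old, new}), old),  # new key published, not yet used
        ZoneState(frozenset({old, new}), new),  # signatures switched to new key
        ZoneState(frozenset({new}), new),       # old key withdrawn
    ]


def survives_stale_caches(stages: list) -> bool:
    """Check that signatures from each stage can be matched to a key in the
    DNSKEY set of the neighbouring stage, which is what a resolver holding a
    cached DNSKEY RRset or cached RRSIGs needs during the transition."""
    for before, after in zip(stages, stages[1:]):
        if after.signing_zsk not in before.published_zsks:
            return False  # fresh signature, stale key set
        if before.signing_zsk not in after.published_zsks:
            return False  # stale signature, fresh key set
    return True


# Assumed ordering: 32911 as the outgoing key, 33834 as the incoming one.
stages = pre_publish_stages(old=32911, new=33834)
abrupt_swap = [stages[0], stages[3]]  # skipping the two overlap stages
```

In this model `survives_stale_caches(stages)` holds and `survives_stale_caches(abrupt_swap)` does not, which is the whole point of publishing the new key ahead of time.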

  • So a single configuration mistake in a single place wiped out external reachability of a major economy. It happened in the evening local time and should be fixable, modulo cache TTLs, by morning. This will limit the blast radius somewhat.

    Still, at this level, brittle infrastructure is a political risk. The internet's famous "routing around damage" isn't quite working here. Should make for an interesting post mortem.

    • I am reminded of the warning that zonemaster gives about putting your domain name servers on a single AS, as is common practice for many larger providers. A lot of people do not want others to see this as a problem since a single AS is a convenient configuration for routing, but it has the downside of being a single point of failure.

      Building redundant infrastructure that can withstand BGP and DNS configuration mistakes is not that simple, but it can be done.

    • DNS is a centralization risk, yes. Somehow we've decided this is fine. DNSSEC isn't the only issue - your TLD's nameservers could also be offline, or censored in your country.

    • "The internet's famous "routing around damage" isn't quite working here."

      DNS is a look up service that runs on the internet.

      Internet routing of IP packets is what the internet does and that is working fine (for a given value of fine).

      You remind me of someone using the term "the internet is down" that really means: "I've forgotten my wifi password".

    • > So a single configuration mistake in a single place wiped out external reachability of a major economy.

      Real world beats sci-fi :) And isn't that why we love IT? And hate it too, because of the "people in charge"...

    • Fail-closed protocols have introduced some brittleness. An HTTP/1.0 server from 1999 can probably still serve visitors today. An HTTPS server limited to TLS 1.0 from the same year couldn't.

    • >So a single configuration mistake in a single place wiped out external reachability of a major economy.

      And fuck nothing at all happened as a result.

    • There is the KRITIS law (critical infrastructure law), which tries to enforce some standards to make things less brittle.

    • I have a bad feeling that the impact will be quite severe for some services, as monitoring, performance, and security services might get disrupted, and just cleaning up is a big mess. Worst case, some OT will experience outages and/or damage. But maybe I am just overestimating the severity of this.

    • It looks like a failed key replacement during a scheduled maintenance event. Normally this sort of thing is thoroughly tested and has multiple eyes on for detailed review and planning before changes get committed, but obviously something got missed.

    • > The internet's famous "routing around damage"

      ...is only for Pentagon networks and military stuff. It's not for us normal people. (We get Cloudflare and FAANG bullshit instead.)

  • I love how I've worked in IT for 20 years and don't understand a single acronym here other than DNSSEC.

    • I've been in IT 30+ years, been running DNS, web servers, etc. since at least 1994. I haven't bothered with DNSSEC due to perceived operational complexity. The penalty for a screw up, a total outage, just doesn't seem worth the security it provides.

    • To be fair, advanced real world knowledge of public/private key PKIs (x.509 or other), things like root CAs, are a fairly esoteric and very specialized field of study. There's people whose regular day jobs are nothing but doing stuff with PKI infrastructure and their depth of knowledge on many other non-PKI subjects is probably surface level only.

Apparently the DENIC team was at a party this evening! Party hard, but not too hard. https://bsky.app/profile/denic.de/post/3ml4r2lvcjg2h

  • Interesting "bus problem" to have in a scenario where everyone who is qualified, experienced and trusted enough to commit live changes (or perform a revert, undo the results of a botched maintenance, etc.) in an emergency situation is not completely sober.

    • Sobriety is just one factor to be weighed in an emergency situation. 30 years ago I was at a ski resort with about 50 friends having a drinking competition in the resort's main bar. Late that night two ski lodges collapsed, trapping people inside. Around midnight, soon after the winner was announced, the police entered and asked "who's able to drive a crane truck?" The winner of the competition put his hand up and informed them of how much he had had to drink. Don't care, they said, so he drove a crane big enough to lift a building up a single-lane 35km mountain road in nighttime ice conditions. (The crane made it, but sadly most of the people in the ski lodges didn't. https://en.wikipedia.org/wiki/1997_Thredbo_landslide )

Cloudflare has now disabled DNSSEC validation on their 1.1.1.1 resolver: https://www.cloudflarestatus.com/incidents/vjrk8c8w37lz

I must be early. There's not a single tptacek DNSSEC rant in this thread yet.

Yes, all .de domains down because of DNSSEC failure at Denic https://dnsviz.net/d/de/dnssec/

I have never used DNSSEC and never really bothered implementing it, but do I understand it correctly that we took the decentralized platform DNS was and added a single-point-of-failure certificate layer on top of it which now breaks because the central organisation managing this certificate has an outage taking basically all domains with them?

  • > which now breaks because the central organisation managing this certificate has an outage

    The ".de" TLD is inherently managed by a single organization, and things wouldn't be much better if its nameservers went down. Some of the records would be cached by downstream resolvers, but not all of them, and not for very long.

    > we took the decentralized platform DNS was and added a single-point-of-failure certificate layer on top of it

    DNSSEC actually makes DNS more decentralized: without DNSSEC, the only way to guarantee a trustworthy response is to directly ask the authoritative nameservers. But with DNSSEC, you can query third-party caching resolvers and still be able to trust the response because only a legitimate answer will have a valid signature.

    Similarly, without DNSSEC, a domain owner needs to absolutely trust its authoritative nameservers, since they can trivially forge trusted results. But with DNSSEC, you don't need to trust your authoritative nameservers nearly as much [0], meaning that you can safely host some of them with third-parties.

    [0]: https://news.ycombinator.com/item?id=47409728

    • > DNSSEC actually makes DNS more decentralized: without DNSSEC, the only way to guarantee a trustworthy response is to directly ask the authoritative nameservers. But with DNSSEC, you can query third-party caching resolvers and still be able to trust the response because only a legitimate answer will have a valid signature.

      but how would one verify the signature if the DNSKEY expired and you cannot fetch a fresh one because the organisation providing those keys is down? As far as I understood the TTL for those keys is different and for DENIC it seems to be 1h [0]. So if they are down for more than an hour and all RRSIG caches expire, DNS zones which have a higher TTL than 1h but use DNSSEC would also be down?

      [0] dig RRSIG de. @8.8.8.8

      de. 3600 IN RRSIG DNSKEY 8 1 3600 20260519214514 20260505201514 26755 de. [...]
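
The record above carries two different clocks: the TTL (3600 seconds) says how long a cache may keep the record, while the two timestamps give the window in which the signature itself is valid (expiration first, then inception, both UTC). A small sketch that pulls those fields out of the presentation format; the function names are illustrative, and the trailing signature data that the comment elided is simply not parsed.

```python
from datetime import datetime, timezone


def parse_rrsig_time(text: str) -> datetime:
    """Parse the YYYYMMDDHHmmSS timestamp form used in RRSIG records (UTC)."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def parse_rrsig(line: str) -> dict:
    """Extract the leading RRSIG fields from a presentation-format line."""
    fields = line.split()
    i = fields.index("RRSIG")
    return {
        "ttl": int(fields[1]),                 # cache lifetime in seconds
        "type_covered": fields[i + 1],
        "algorithm": int(fields[i + 2]),
        "labels": int(fields[i + 3]),
        "original_ttl": int(fields[i + 4]),
        "expiration": parse_rrsig_time(fields[i + 5]),
        "inception": parse_rrsig_time(fields[i + 6]),
        "key_tag": int(fields[i + 7]),
        "signer": fields[i + 8],
    }


def is_time_valid(sig: dict, now: datetime) -> bool:
    """True if 'now' falls inside the signature validity window."""
    return sig["inception"] <= now <= sig["expiration"]


sig = parse_rrsig(
    "de. 3600 IN RRSIG DNSKEY 8 1 3600 20260519214514 20260505201514 26755 de."
)
window = sig["expiration"] - sig["inception"]  # 14 days, 1:30:00
```

So a cached copy has to be refetched after an hour, but the signature it carries stays checkable for about two weeks, provided the resolver can still obtain a matching DNSKEY.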

  • DNSSEC doesn't change the degree to which DNS is decentralized. It's always been hierarchical. In the absence of caching, every DNS query starts with a request to the root DNS servers. For foo.com or foo.de, you first need to query the root servers to determine the nameservers responsible for .com and .de. Then you contact the .com or .de servers to ask for the foo.com and foo.de nameservers. All DNSSEC does is add signatures to these responses, and adds public keys so you can authenticate responses the next level down.

    A list of root nameserver IP addresses is included with every local recursive DNS resolver. The list changes, albeit slowly, over the years. With DNSSEC, the resolver also ships with the public key of the root zone (the trust anchor), which also rotates, slowly.
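
The walk from the root down can be written out as a short sketch. It assumes, for simplicity, that every label is its own zone, which is not true in general; the function names are made up for the example.

```python
def delegation_chain(name: str) -> list:
    """Return the zones visited from the root down to the queried name."""
    labels = [label for label in name.rstrip(".").lower().split(".") if label]
    chain = ["."]
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


def trust_links(name: str) -> list:
    """Pair each parent zone with the child it delegates to. With DNSSEC,
    the parent signs a DS record that vouches for the child's key."""
    chain = delegation_chain(name)
    return list(zip(chain, chain[1:]))


print(delegation_chain("foo.de"))  # ['.', 'de.', 'foo.de.']
print(trust_links("foo.de"))       # [('.', 'de.'), ('de.', 'foo.de.')]
```

Every name under .de shares the `('.', 'de.')` link and then depends on the de. zone for the next one, which is why a bad signature in that one zone is felt across the whole TLD.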

  • What you see here is decentralisation working. The issue is with the operator of the .de TLD, and as such only that TLD is affected. DNS is not decentralised in the sense that multiple organisations run the infrastructure of a single TLD; each TLD is always run by a single entity (.com and .net, for example, are operated by Verisign).

    So whatever issue the operator has, the impact is the same.

Crazy. I can't remember an incident like this ever happening before, and it's still not fixed? .de is probably the most important unrestricted domain after .com from an economic perspective. Millions of businesses are "down".

https://status.denic.de/ says "Partial Service Disruption" for DNS Nameservice now.

EDIT: it says "Service Disruption" now

I just spent the better part of an hour debugging unbound and the pihole because I thought it was a me problem...

Good news though, if you add domain-insecure: "de" to your unbound config everything works fine

.de TLD is online. DNS working fine

DNSSEC not working

If using an open resolver, i.e., a shared DNS cache, e.g., third party DNS service such as Google, Cloudflare, etc., then it might fail, or it might not. It depends on the third party DNS provider

https://datatracker.ietf.org/meeting/118/materials/slides-11...

I'd expect political escalation for something like this but given that this is Germany, who knows.

Shops open normally from 8am to 8pm in Germany. Today we decided to pilot opening hours for .de domains as well

Things seem to be on their way up now, and https://status.denic.de/ is working again, at least from here.

DENIC's status page currently says "Frankfurt am Main, 5 May 2026 – DENIC eG is currently experiencing a disruption in its DNS service for .de domains. As a result, all DNSSEC-signed .de domains are currently affected in their reachability. The root cause of the disruption has not yet been fully identified. DENIC’s technical teams are working intensively on analysis and on restoring stable operations as quickly as possible."

I've considered hard-coding some addresses into firmware as a fallback for a DNS outage (which, more likely than not, is just misconfigured local DNS). Events like this help justify this approach to the unconcerned.
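
A rough sketch of what such a fallback could look like. The hostname and address are placeholders (the address is from a documentation-only range), and the injectable `resolver` parameter exists only so the failure path can be exercised without a real outage.

```python
import socket

# Hypothetical fallback table; 192.0.2.0/24 is reserved for documentation.
FALLBACK_ADDRESSES = {"updates.example.de": "192.0.2.10"}


def resolve_with_fallback(host, port=443, resolver=socket.getaddrinfo,
                          fallback=FALLBACK_ADDRESSES):
    """Try normal resolution first; on failure use a hard-coded address.

    Returns (address, source) where source is "dns" or "fallback".
    """
    try:
        infos = resolver(host, port, proto=socket.IPPROTO_TCP)
        return infos[0][4][0], "dns"
    except OSError:  # socket.gaierror is a subclass of OSError
        if host in fallback:
            return fallback[host], "fallback"
        raise


def simulated_outage(host, port, proto=0):
    """Stand-in resolver that fails the way a broken lookup would."""
    raise socket.gaierror("simulated resolution failure")


print(resolve_with_fallback("updates.example.de", resolver=simulated_outage))
```

The obvious trade-off is that baked-in addresses go stale, so this is a last resort rather than a replacement for DNS.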

  • The irony is that DNS is a global and distributed system meant to be resilient. It’s the DNSSEC layer on top in this case causing problems.

    • The global and distributed system relies on the system actually returning valid responses. If the root servers are broken, whether it's a problem with RRSIG records or A records, the TLD is broken.

      If my domains' DNS servers start pointing at localhost, that doesn't mean DNS is a broken protocol.

So glad I found someone mention this. Amazon.de, SPIEGEL.de is down. Highly prominent sites unreachable. I wonder how long this will last and how big of a thing this ends up being once people talk about it :o Feels big to me

  • amazon.de, spiegel.de are down for me, too. heise.de works, but that might've been cached somewhere on my side.

    • dig manages to dig out IPs for heise.de and tagesschau.de, but not for spiegel.de, amazon.de and google.de. However, dig @8.8.8.8 still has amazon.de cached, unlike 1.1.1.1, so perhaps Google to the rescue?

      [Edit] After playing around with it, google seems to have at least some pages cached. After setting dns to 8.8.8.8 amazon.de and spiegel.de work again, my blog does not.

That postmortem should be a fun read, can't wait.

  • Given how amateurish German IT operations is, there is no guarantee whatsoever there will be a post-mortem nor whether it then will make it out under 3-6 months with all the necessary approvals.

    • Bla bla, always easy to rant...

      https://blog.denic.de/denic-informiert-uber-die-behebung-der...

      "Die Störung ist inzwischen behoben und alle Systeme laufen wieder stabil. Die genaue Ursache wird derzeit noch analysiert. Sobald belastbare Erkenntnisse vorliegen, wird DENIC diese transparent zur Verfügung stellen."

      translation:

      ‘The disruption has now been resolved and all systems are running smoothly again. The exact cause is currently being investigated. As soon as reliable findings are available, DENIC will make them publicly available.’

  • Ok children, sit down and listen, uncle Culonavirus will tell you a story:

    "It all began with the decommissioning of the last nuclear power plant, ..."

On a slightly unrelated note, I was setting nameservers for two .de domains a few weeks ago and thought my provider was being crazily strict because they kept getting rejected. Turns out you can't point to a nameserver until that nameserver has a zone for the domain, and you can't use nameservers from two providers unless those two providers are both in the NS records at both ends

  • Common pain point with DNSSEC. It’s brutal in the domain industry because when you buy a name with DNSSEC enabled, it oftentimes can’t be set up to resolve due to these sorts of issues. Typically the seller needs to deactivate it first.

This is the kind of system failure that we need really good and well-tested disaster recovery plans for. While not necessary this time, DENIC and any critical infrastructure provider should be able to rebuild their entire infrastructure from scratch in a tolerable amount of time (days rather than hours in the case of a full rebuild). Importantly, the disaster recovery plan has to work without relying on the failing system or on adjacent systems that might have hidden dependencies on it.

I'm really not too close to DENIC and know nothing about their internals, but I'm just close enough to have experienced, second hand, the stress of someone working for DENIC during the outage. From the very limited information I happened to gather, DENIC had some trouble addressing the issue because, surprise, infrastructure that they need to do so runs on .de domains. [1]

I'm convinced there are all kinds of extended cyclic dependencies between different centralization points in the net.

If some important backbone of the internet is down for an extended time, this will absolutely cause cascading failures. And these central points of failure are only getting worse. I love Let's Encrypt, but if something causes them to hard fail, things will go really bad once certificates start to expire.

We need concrete plans to cold start extended parts of the internet. If things go really bad once and communication lines start to fail, we're in for a bad time.

Maybe governments have redundant, ultra resistant, low tech communication lines, war rooms and a list of important people in the industry who they can find and put in these war rooms so they can coordinate the rebuild of infrastructure. But I doubt it.

[1] I don't know if there is some kind of disaster plan in the drawer at DENIC that would address this. I don't mean to allege anything against DENIC specifically, but broadly speaking about companies and infrastructure providers, I would not be surprised if there was absolutely no plan on what to do if things really go down and how to cold start cyclic dependencies or where they even are.

ok i picked a bad day to move from one registrar to another... i just spent the last hour frantically trying to figure out whether the new registrar screwed us or the old registrar was screwing us...

Should I do my usual rant about how the web PKI refuses to move to a consensus protocol?

Am I reading this correctly? All .de domains are down? Looking forward to reading the postmortem.

Fun fact: enabling DNSSEC NOW will fix your domain instantly if DNSSEC was disabled before

-> no idea if that also "heals" anyone who had dnssec on before.

-> no idea if maybe they need to roll back something and then rebreak the new dnssec i made a minute later lol...

Whole .de TLD seems to go offline right now due to dnssec or missing nic.de nameservers?

  • This works:

        $ unbound-host -t A www.denic.de
        www.denic.de has address 81.91.170.12
    

    This does not:

        $ unbound-host -D -t A www.denic.de
        www.denic.de has address 81.91.170.12
        validation failure <www.denic.de. A IN>: signature crypto failed from 194.246.96.1 for DS denic.de. while building chain of trust
    

    So it does seem DNSSEC-related.

    EDIT My explanation was wrong, this is not how keytags work. The published keytag data is consistent:

        de. 3600 IN DNSKEY 256 3 8 AwEAAfRLmzuIXVf7x5A0+U7hke0dS+GEJG0EdPhnOthCCLhy0t0WqLyoXJOhnfsTJ8vQX5fd9qOJc9gyr3SWJZkXAhPm3yPSC7FWWHF70WZTKKM9CekmKdqwMwq6ZCjMSUcecCuSF4Sbt1MRszV7rFmfGVklA1l5UzNbqwD+Dr5vfcLn ;{id = 33834 (zsk), size = 1024b}
        de. 3600 IN DNSKEY 257 3 8 AwEAAbWUSd/QN9Ae543xzdiacY6qbjwtZ21QfmdgxRdm4Z7bjjHWy249uqxCyjjjoS4LDoRDKmj7ElffMKvTWKE1qFKu0p8TUy4wyhX0M+m5FUjvQ3CiZMi+qY7GSHA5B+Zd73cidmnTeb3e8lso6jEsXg05/VZ2AyAqWF6FexEIFxIqiwwLk4UP0BwZ17Ur3q1qx9VSbPMyHgQ9d6nHUN1EEJsTDA2v0vKumsUyp74ZanRZ/bB/6IzpaaZyr5BLF5pSCNdbRNjVmkwYD0993vm79LueyOeibsoHRc16jhALrIJou1PFjdq7YQsYN0KtqRiJtaAfPprDBREpeamPuW/MnW0= ;{id = 26755 (ksk), size = 2048b}
        de. 3600 IN DNSKEY 256 3 8 AwEAAbTe1PJi8EgIudNGb+KRTxBL2aCu5rXkZ+aIe/TC88pwRdrXYeXODp1ihZWFop5CrbWRBLrk/YUPBE8aBc6oJP+58dSkdMLYkjSkmvdvYx+zXnRLWlF2bapxvZxshATJDfGjGbCiWxKEOoyRx3UhICtHC+cUSddsEvzfacUcBb6n ;{id = 32911 (zsk), size = 1024b}
        de. 3600 IN RRSIG DNSKEY 8 1 3600 20260519030655 20260505013655 26755 de. ke56T5GZt/X6zMBAF+ouyCTnAd7RY7MsnDcfa9jyyOwSouRXhvzim/V13JDTMBAnpAHxWQXoruXrAZ6A6re5N+8Pp2utVkAEKTWs0r4UOLNKoZ2+zMwNplKjNNnY5PJIbHfa5myyziLiIsi//qDIgQEACFk+pZcHXrRdqRoXPCL3UtfaXjk3+duDQdlPnYsJys5UshjVpkALSMChW7J0anzr0sG+f9ytstBneymMwFYOUC3NqbejbLPZsXGPZBQKPAoVJuV5q3znopbcqrDFfjI7bmX3QPYNvOaiT1ElBfi2piJVpDzMaMAmm2jCmvrf5VeTOBccMroh8sBtDPsaEg== ;{id = 26755}
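
For reference, the `id = ...` values in that listing are key tags: a 16-bit checksum over the DNSKEY RDATA as specified in RFC 4034, Appendix B, not a cryptographic fingerprint. Different keys can share a tag, and a matching tag says nothing about whether a signature verifies. A minimal sketch of the calculation (the helper names are illustrative, and the special case for the obsolete algorithm 1 is left out):

```python
import base64
import struct


def dnskey_rdata(flags: int, protocol: int, algorithm: int,
                 public_key_b64: str) -> bytes:
    """Assemble DNSKEY RDATA: flags (16 bit), protocol, algorithm, key bytes."""
    return struct.pack("!HBB", flags, protocol, algorithm) + base64.b64decode(
        public_key_b64
    )


def key_tag(rdata: bytes) -> int:
    """Key tag per RFC 4034 Appendix B (not valid for algorithm 1)."""
    acc = 0
    for i, byte in enumerate(rdata):
        # Bytes at even offsets form the high half of each 16-bit word.
        acc += (byte << 8) if i % 2 == 0 else byte
    acc += (acc >> 16) & 0xFFFF  # fold the carry back in
    return acc & 0xFFFF


# Tiny synthetic key, small enough to check by hand:
# bytes 01 00 03 08 01 02 -> 0x0100 + 0x0308 + 0x0102 = 1290
print(key_tag(dnskey_rdata(256, 3, 8, "AQI=")))  # 1290
```

Feeding it the flags, protocol, algorithm and base64 key of one of the DNSKEY lines above should give the tag printed in that line's comment, assuming the record was pasted intact.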
    

    The signature on the SOA record still does not verify:

        de. 86400 IN SOA f.nic.de. dns-operations.denic.de. 1778014672 7200 7200 3600000 7200
        de. 86400 IN RRSIG SOA 8 1 86400 20260519205754 20260505192754 33834 de. aZoiAJ+PaHUDVSHNXfV/R26ZK3GpFB7ek2Z46VnZdmPEDaTww+a7PkiQ98W83xohUunXYSvQCMeGYfUre5UT76eBKThdxW2a6ImX9/x/oEzQ9x/69Y/NSeTckOv9m3HCLBOug01op1koiHOIAVEvonOmXEHHqo1P4sR/fNbcVg4= ;{id = 33834}

The last time I remember .de having a major outage like this was 2010. I would cite some sources but... you know. That was a fun afternoon, though.

I am very happy that it doesn't happen more often.

even their own status page is not reachable: https://status.denic.de/

As fallback they should use their X account: https://x.com/denic_de

  • Seems to be up now?

    May 5, 2026 23:28 CEST

    May 5, 2026 21:28 UTC

    INVESTIGATING

    Frankfurt am Main, 5 May 2026 – DENIC eG is currently experiencing a disruption in its DNS service for .de domains. As a result, all DNSSEC-signed .de domains are currently affected in their reachability. The root cause of the disruption has not yet been fully identified. DENIC’s technical teams are working intensively on analysis and on restoring stable operations as quickly as possible. Based on current information, users and operators of .de domains may experience impairments in domain resolution. Further updates will be provided as soon as reliable findings on the cause and recovery are available. DENIC asks all affected parties for their understanding. For further enquiries, DENIC can be contacted via the usual channels.

Wow, I thought I was somehow unaffected but my resolver must just have cached the sites I'd tried.

From my analysis, DENIC re-signed the .de zone today (May 5, 2026, ~17:49 UTC). The DNSSEC signature (RRSIG) for the NSEC3 record covering the hash range of nearly all of the .de TLD is cryptographically broken (malformed).
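
Some background on "covering the hash range": NSEC3 owner names are base32hex-encoded SHA-1 hashes of domain names (RFC 5155), and one NSEC3 record vouches for every hash that sorts between its owner hash and its "next" hash. A sketch of both pieces follows; the salt and iteration count would come from the zone's NSEC3PARAM record, and the empty defaults here are placeholders rather than .de's actual parameters.

```python
import base64
import hashlib

# Translate the standard base32 alphabet to the base32hex alphabet.
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def wire_name(name: str) -> bytes:
    """Lowercased wire format: length-prefixed labels plus a zero byte."""
    out = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def nsec3_hash(name: str, salt_hex: str = "", iterations: int = 0) -> str:
    """NSEC3 hash (SHA-1, hash algorithm 1) as a lowercase base32hex string."""
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire_name(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    encoded = base64.b32encode(digest).translate(_B32_TO_B32HEX)
    return encoded.decode("ascii").lower()


def covers(owner_hash: str, next_hash: str, target_hash: str) -> bool:
    """True if target sorts strictly between owner and next, allowing for
    the last record in the chain wrapping around to the first."""
    if owner_hash < next_hash:
        return owner_hash < target_hash < next_hash
    return target_hash > owner_hash or target_hash < next_hash
```

If a resolver needs such a covering record to prove something about a name and the RRSIG over that record does not verify, it has no valid proof to work with and fails the lookup.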

On Monday there was a huge outage affecting several cities quite close to Frankfurt because someone cut a major fiber line; today DENIC is having a party and right when everyone is drunk this happens because some post-rotation task cannot be completed.

There are too many coincidences happening.

I work with a few people specialised in IT security, and some of them take their jobs too seriously and will "lock down" everything to the point that it becomes a very real risk that they lock out everyone including themselves.

Fundamentally, security is a solution to an availability problem: The desire of the users is for a system to remain available despite external attack.

Systems that become unavailable to everyone fail this requirement.

A door with its keyhole welded shut is not "secure", it's broken.

  • Security is not just a solution to availability. It is also to keep sensitive data (PII, or business secrets, or passwords, or cryptographic private keys, and so on) away from the hands of bad actors.

    If I’m unable to use Amazon for 24 hours it doesn’t really matter. If a photocopy of my passport is leaked, that means worries and potential trouble for years.

  • Security = Confidentiality + Integrity + Availability

    or alternatively,

    Security = (exclude unauth'd reads) + (exclude unauth'd writes) + (include auth'd reads and auth'd writes)

    Gotta satisfy all parts in order to have security.

    • If you squint at it, you can convert all three to just availability.

          Confidentiality = available to us, but nobody else.
      
          Integrity = available to us in a pristine condition.
      

      It's a bit reductive, I'll admit, but it can be a useful exercise in the same way that everything in an economy can be reduced to units of either: "human time", "money" or "energy". Roughly speaking they're interchangeable.

      E.g.: What's the benefit to you if your data is so confidential that you can't read it either? This is a real problem with some health information systems, where I can't access my own health records! Ditto with many government bureaucracies that keep my records safe and secure from me.
