DNS outage from March 21 to March 23, 2024

On Thursday, March 21, 2024, our system monitoring alerted us that many of our services were not reachable from outside. The cause turned out to be that the DNS names of the domains and subdomains of our services could not be resolved; the services themselves (i.e. the virtual machines and web services) continued to run normally in the background. The DNS entries for e.g. os4x.com, seon.de or c-works.de are stored with our hoster Server4you. The name servers responsible for these domains are registered in the WHOIS database, and these name servers (which are operated by Server4you) were not reachable. We tried to reach the hoster directly by phone, but without success: we got “no connection at this number” or a busy signal, so there was no way to reach support. Emails to the hoster could not be delivered because their mail server was also offline and their own domain server4you.de could not be resolved for mail delivery (the name server that should have provided the MX record was not available). Since the hoster’s homepage was also unavailable, the situation was very unclear. Through our own research we learned that the data center was in a critical situation; there was speculation about water ingress.
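The distinction described above, between a name that cannot be resolved and a service that is actually down, can be checked automatically. The following is a minimal sketch and not our actual monitoring: the function names `classify`, `default_resolve` and `default_connect`, the port 443 and the timeout are assumptions chosen for illustration. It uses only the Python standard library.

```python
import socket
from typing import Callable, List


def default_resolve(hostname: str) -> List[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def default_connect(address: str, port: int, timeout: float) -> None:
    """Open and immediately close a TCP connection; raises OSError on failure."""
    with socket.create_connection((address, port), timeout=timeout):
        pass


def classify(
    hostname: str,
    port: int = 443,  # assumed port, adjust per service
    timeout: float = 5.0,
    resolve: Callable[[str], List[str]] = default_resolve,
    connect: Callable[[str, int, float], None] = default_connect,
) -> str:
    """Return 'dns_failure', 'service_unreachable' or 'ok' for one hostname."""
    try:
        addresses = resolve(hostname)
    except socket.gaierror:
        # The name could not be resolved; the service itself may still be running.
        return "dns_failure"
    if not addresses:
        return "dns_failure"
    for address in addresses:
        try:
            connect(address, port, timeout)
            return "ok"
        except OSError:
            continue  # try the next address, if any
    return "service_unreachable"


if __name__ == "__main__":
    for name in ["os4x.com", "seon.de", "c-works.de"]:
        print(name, classify(name))
```

The resolver and the connection function are passed in as parameters so that the classification logic can be tested without network access. A check like this only helps if it runs from a location that does not depend on the same name servers.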

On the morning of March 22 at 7:47 a.m., Plusserver announced that there was a problem in a battery room. At 2:51 p.m., the data center operator announced that there had been a small fire in the battery system on Thursday morning, that the servers were undamaged, that the power supply was temporarily down, and that they were working on restoring it. This dragged on into the weekend.

On Saturday morning, March 23, Server4you could be reached by phone again, but no technical support was available and we were asked to be patient. We were told that there was no escalation management and no possibility of contacting anyone higher up. At our insistence, an attempt was made to have our domain entries reconfigured for at least the most important services. On Friday we had already chosen Cloudflare as a possible alternative for DNS services and had made preparations. Server4you implemented this suggestion by 8:37 p.m. on Saturday by changing the name server entries in the WHOIS database, so that we could manually set up all domains and subdomains on Cloudflare. On Sunday, March 24, this was completed by around 1 p.m., so that all services can be reached again via Cloudflare’s DNS servers and all customers can work.

We have learned from this incident that promised redundancy cannot be relied upon (regardless of the provider), and we will prepare appropriate backup measures for future critical situations. Because communication was not possible during this outage (neither ours nor that of Server4you), we have set up a status page for you that is hosted at a different server location: https://www.c-works-status.com.
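One possible backup measure is to check that the name servers of a domain do not all belong to a single provider. The following is a rough sketch under stated assumptions, not a description of what we have implemented: the name server names are hypothetical, the list is supplied by hand rather than queried from DNS, and the provider is guessed naively from the last two labels of the host name (which is wrong for suffixes such as co.uk).

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def provider_of(nameserver: str) -> str:
    """Naive guess: treat the last two labels of the name as the provider."""
    labels = nameserver.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def group_by_provider(nameservers: Iterable[str]) -> Dict[str, List[str]]:
    """Group name server host names by their guessed provider domain."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for ns in nameservers:
        groups[provider_of(ns)].append(ns)
    return dict(groups)


def has_provider_redundancy(nameservers: Iterable[str]) -> bool:
    """True if the name servers are spread over at least two providers."""
    return len(group_by_provider(nameservers)) >= 2


if __name__ == "__main__":
    # Hypothetical name server sets, for illustration only.
    single = ["ns1.provider-a.example.", "ns2.provider-a.example."]
    mixed = ["ns1.provider-a.example.", "ns1.provider-b.example."]
    print(has_provider_redundancy(single))
    print(has_provider_redundancy(mixed))
```

Different provider domains do not guarantee different data centers or power supplies, so a check like this is only a first indication and does not replace asking the providers where their name servers are located.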

Server4you is currently still working on restoring the power supply (a cable weighing 19 tons has to be delivered by truck, which is only possible today because of the Sunday driving ban; for details see https://www.server4you.de). Other companies such as Plusserver quickly left the location and moved their servers by truck to Cologne (see https://status.plusserver.com/incidents/s6lzkwsc3tbj).

We are happy to provide you with technical details and will try to explain everything to you. We hope that the outage has not caused you any serious damage. Measures such as the automatic hourly collection of files provided via OFTP2 brought at least some relief for the systems we host.