AI-Powered Server Automation | Simplify DevOps with LLMs

Launching products has become much easier now that LLMs can take over part of the internal DevOps routine. That doesn't mean you can skip oversight, though — you still need to monitor everything.

For example, I am currently setting up a new server for BitGN Sandbox almost from scratch. As an experiment, I sent the following request to a local instance of Codex:

> I installed a new OS with minimal configuration, but the server still doesn’t respond over HTTPS. Fix this. There’s no need to launch the service yet. It’s enough for Caddy to respond at /hello.

My production server runs NixOS, and the deployment documentation includes a section on DNS and configuration setup — that’s more than enough for OpenAI Codex to work with the server autonomously.

Within a few minutes, Codex fixed the configuration and paused, waiting for my permission to proceed. I gave the command to continue and asked it to finish the task.

After that, it made the system changes, checked the server, confirmed that Caddy had issued certificates, and verified that the endpoints actually responded at their addresses — everything worked as expected. It also added a test HTTP endpoint.
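For context, the end result is roughly equivalent to a minimal NixOS Caddy configuration like the sketch below. This is my reconstruction, not the config Codex actually wrote; the domain and response body are illustrative assumptions.

```nix
{
  # Minimal sketch: Caddy with automatic HTTPS and a test endpoint.
  # "eu.bitgn.com" and the response text are assumptions.
  services.caddy = {
    enable = true;
    virtualHosts."eu.bitgn.com".extraConfig = ''
      respond /hello "hello" 200
    '';
  };

  # Ports 80/443 must be open for Caddy to serve traffic and
  # complete the default ACME HTTP-01 challenge.
  networking.firewall.allowedTCPPorts = [ 80 443 ];
}
```

With this in place, `curl https://eu.bitgn.com/hello` should return the test response once DNS points at the server and the certificate is issued.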

Encouraged by this success, I decided to make the task harder: I asked it to set up wildcard domain support, so that virtual machines for agents could be launched at `*.eu.bitgn.com` addresses without issuing a new certificate each time. This was more challenging: it required enabling the Cloudflare DNS-01 challenge and making several small configuration adjustments.

Codex explored the source code of the Caddy plugin, quickly made sense of the configs, and set everything up in minutes. It would have taken me several hours — I know, because I tried to do the same thing yesterday, failed, and wasted the time. But this morning I realized how to organize a good Feedback Loop, so I could safely delegate the task to Codex and get the result I wanted.
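The wildcard setup ends up looking roughly like the sketch below. It is a hedged reconstruction under two assumptions: Caddy is built with the `caddy-dns/cloudflare` plugin (the stock package doesn't include it), and a Cloudflare API token with DNS-edit rights for the zone is exposed to the service.

```nix
# Sketch: wildcard vhost with certificates obtained via the
# DNS-01 challenge through Cloudflare. Assumes a Caddy build that
# bundles the caddy-dns/cloudflare plugin and an API token in the
# CLOUDFLARE_API_TOKEN environment variable (illustrative names).
services.caddy.virtualHosts."*.eu.bitgn.com".extraConfig = ''
  tls {
    dns cloudflare {env.CLOUDFLARE_API_TOKEN}
  }
  respond /hello "wildcard ok" 200
'';
```

The point of DNS-01 here is that Caddy proves domain ownership by publishing a TXT record through the Cloudflare API, so a single wildcard certificate covers every agent VM subdomain without any per-host issuance.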

Thanks to NixOS, Codex can be allowed to do anything to the system (even accidentally break it): there is always the option of rolling back to a previous generation and rebooting without issues.
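The rollback safety net is standard NixOS tooling; for example (illustrative commands, run on the server itself):

```shell
# List system generations to see what's available to roll back to.
sudo nix-env --list-generations --profile /nix/var/nix/profiles/system

# Switch back to the previous generation immediately...
sudo nixos-rebuild switch --rollback

# ...or simply reboot and pick an older generation from the boot menu.
```

Because every `nixos-rebuild` produces an immutable generation, an agent's bad change never overwrites the last known-good state.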

When we discussed building core business systems with LLMs, the main rule was: write tests and create a Feedback Loop so you can evaluate the system's quality.

That approach remains relevant when working with agents, both in development and in end products. Now, instead of tests alone, what matters is an Engineering Harness together with a Feedback Loop — that is what ensures stability and quality.
