Long start_compute times

What determines how long it takes for start_compute to finish? I’m seeing consistent delays of ~60 seconds. A few days ago it was consistently around 20 seconds, and less than a week ago it was around 15 seconds.

Hello, we are actively investigating. We recently got a lot of traffic with the launch of the Hasura integration. We will reply here as soon as we fix the issue. The start time should be under 5 seconds.

Thanks for the report! Something happened in the storage layer with that database last week that is causing the background compaction operation to fail, and that is affecting the database’s performance. We haven’t found the root cause yet, but we are investigating. You can follow the debugging efforts here if you’re interested: VM/FSM page is lost in image layer · Issue #2601 · neondatabase/neon · GitHub. I’ll let you know once it’s fixed.

In the meantime, if you need it urgently, you could create a new project, then dump the database and restore it there. I believe it would be fast there. Please don’t delete the old project, though: we want to finish the debugging and fix this properly even if a workaround is available.
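
For reference, here is a minimal sketch of that dump-and-restore workaround, assuming the standard PostgreSQL client tools `pg_dump` and `pg_restore` are installed locally. The helper names and the default dump path are hypothetical, and the connection strings are placeholders for the old and new projects' connection strings.

```python
import subprocess


def build_dump_cmd(source_url: str, dump_path: str) -> list[str]:
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",   # archive format that pg_restore can read
        "--no-owner",        # skip ownership commands; roles may differ in the new project
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def build_restore_cmd(target_url: str, dump_path: str) -> list[str]:
    """Build a pg_restore command that loads the archive into the target database."""
    return [
        "pg_restore",
        "--no-owner",
        f"--dbname={target_url}",
        dump_path,
    ]


def dump_and_restore(source_url: str, target_url: str, dump_path: str = "backup.dump") -> None:
    """Run the dump, then the restore; raises CalledProcessError if either step fails."""
    subprocess.run(build_dump_cmd(source_url, dump_path), check=True)
    subprocess.run(build_restore_cmd(target_url, dump_path), check=True)
```

Calling `dump_and_restore(old_url, new_url)` with the two connection strings would copy the data across and leave the old project untouched.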

Is there anything you can tell us about the application that might help us reproduce the issue? What kind of workload is it? What kinds of updates does it perform? You can also send me details by email at support@neon.tech if you don’t want to share them here in public.

Didn’t help, unfortunately.

The database in question has two cron jobs that update it regularly. One usually adds a couple of rows, and the other updates almost all of the rows. Both jobs run every ~6 hours, though not at the same time.
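
In case it helps with reproducing, here is a rough sketch of that workload shape. The real schema and job timings are not shown in this thread, so the table name `items`, its columns, and the 3-hour offset between the jobs are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical table name; the real schema was not shared in this thread.
TABLE = "items"


def insert_job_sql(n_rows: int = 2) -> str:
    """SQL for the small job: add a couple of rows."""
    return (
        f"INSERT INTO {TABLE} (payload, updated_at) "
        f"SELECT md5(random()::text), now() FROM generate_series(1, {n_rows});"
    )


def update_job_sql() -> str:
    """SQL for the large job. This rewrites every row, as a stand-in for 'almost all'."""
    return f"UPDATE {TABLE} SET payload = md5(random()::text), updated_at = now();"


def schedule(start: datetime, hours: int, offset_hours: int = 3, period_hours: int = 6):
    """Return (time, job_name) pairs for both jobs over a window of `hours` hours.

    Both jobs repeat every `period_hours`; the update job is shifted by
    `offset_hours` (an assumed value) so the two never run at the same time.
    """
    end = start + timedelta(hours=hours)
    events = []
    t = start
    while t < end:
        events.append((t, "insert"))
        u = t + timedelta(hours=offset_hours)
        if u < end:
            events.append((u, "update"))
        t += timedelta(hours=period_hours)
    return events
```

The point of the sketch is the pattern: a small append plus a near-full-table rewrite several times a day.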