Longer than 1 minute cold starts on main branch

Sooo… I run an app that we use at my workplace for scrum planning, and we ran into issues that forced me to add a huge timeout to keep things running smoothly. These cold starts are causing 504 errors because next-auth is unable to fetch the session. My bill of course spiked due to this. For now I migrated to PlanetScale to have something stable until I see this working well for an extended period of time.

I deleted that project and started a new one in the same region… and now the cold starts are basically instant??? I am incredibly confused. All of this is on us-west-2, btw.

Hey @atridad, sorry to hear that you’ve faced such long cold starts.

We are actively working on resolving this problem. For cold starts there are basically two separate streams of work:
a) reducing average/p90/p99 cold start times across all connection attempts
b) keeping the slowest cold start operations within a reasonable range

With a) we recently made big progress and reduced p90 from ~5s to <=1s, and in some regions to 200-300ms. But the slowest operations can still take far longer, as you’ve experienced. Now we are working on b) to bring the slowest starts into a sane range. I’ll post an update here with some graphs across the fleet once we fix b).
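To illustrate why a) and b) are separate problems, here is a minimal sketch with made-up numbers (not Neon measurements): 99 cold starts at 300 ms and a single one at 60 s. The `percentile` helper is hypothetical and uses the simple nearest-rank method. A single very slow start leaves p90 and p99 untouched while the maximum is still a full minute.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it. Expects a non-empty array.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical cold start times in milliseconds: 99 fast starts, one outlier.
const coldStartsMs: number[] = [...Array(99).fill(300), 60_000];

const p90 = percentile(coldStartsMs, 90); // 300
const p99 = percentile(coldStartsMs, 99); // 300
const slowest = Math.max(...coldStartsMs); // 60000
```

This is why percentile graphs can look healthy while an individual user still hits a timeout.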

These cold starts are causing 504 errors due to next auth being unable to fetch the session. My bill of course spiked due to this

Do you mean your Neon bill? The cold start period should not currently be included in the bill; please DM me if that was not the case for you and we’ll investigate.

It spiked because I had to bump the timeout.

Thanks for the quick reply though! I will give it a bit and probably run a second instance with Neon at the same time. Ideally I could reliably use Neon for this, since I am not limited to 3 free projects like on PlanetScale, and Neon Pro is cheap enough that I don’t mind paying to use 20 projects.

For context: I do use Prisma and next-auth here, and it was doing this when tabbing back to the window after being away for a while. It would try to check the session and just hang.

So I tried the same thing on a new project and get timeouts. Basically, from a UI perspective, what happens is someone starts a vote, tabs over to the story we are voting on, comes back after a minute or so, and then it just signs them out because they can’t get the session.
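One possible stopgap for the hang described above, sketched under assumptions: wrap the first database call after an idle period in a per-attempt timeout with a few retries and exponential backoff, so a slow cold start turns into a short delay instead of a failed session check. The names `backoffDelays`, `withTimeout`, and `withRetry` are hypothetical helpers, not part of Prisma, next-auth, or Neon, and the default values are arbitrary.

```typescript
// Delays (ms) between attempts: base, 2*base, 4*base, ... capped at capMs.
function backoffDelays(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < maxAttempts - 1; i++) {
    delays.push(Math.min(baseMs * Math.pow(2, i), capMs));
  }
  return delays;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Rejects if `p` has not settled within `ms`. The underlying operation is
// not cancelled; it keeps running in the background.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Runs `op` up to maxAttempts times, waiting between failed attempts.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 4,
  timeoutMs = 5000,
  baseMs = 500,
  capMs = 4000,
): Promise<T> {
  const delays = backoffDelays(maxAttempts, baseMs, capMs);
  let lastError: unknown = new Error("withRetry: no attempts were made");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await withTimeout(op(), timeoutMs);
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; no wait after the final one.
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

Usage would look something like `await withRetry(() => loadSession())`, where `loadSession` stands in for whatever query the session check performs. Retrying only hides the latency; it does not fix the cold start itself, and the total wait still has to fit inside the serverless function timeout.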