Hi.
I’m running a mass import job on my db.
I have several workers writting to my db, each of them doing 3 basic calls (may i mention that on some tables, there are several indexed columns).
I have an events queue that is feeding my workers to a pace of 75 https calls / sec.
I’m using the pooled connection with pgbouncer flag on prisma.
When I try to speed up a little bit the flow, I keep encountering this error: “Timed out fetching a new connection from the connection pool”.
Am I doing something wrong ? What are the best practices to deal with that kind of jobs ?
May I add that I configure auto-scaling but my db compute would never scale…
Hey, Prisma here
Can you expand a bit on your setup - do these workers all start their own Prisma Client instance? How long do these workers live? How many workers are active at the same time, trying to talk to the database?
What is your current Prisma configuration?
What is the connection_limit
? What is your pool_timeout
?
Is what you mentioned your full error message you are getting?
Do you have statistics for the connections used on the database side?
Hi Jan !
Actually those “workers” are 3 containers of an API.
This API exposes a route which, when called, is pushing something in the db.
The aws eventbridge API Destination pushes events to the API route at a rate of 75calls/sec.
My API is build with turbo repo and my db config is in a workspace in packages/database.
The prisma file looks like that:
generator client {
provider = "prisma-client-js"
}
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
}
The prisma client is exported like this:
import { PrismaClient } from "@prisma/client";
declare global {
var prisma: PrismaClient | undefined;
}
export const prisma = global.prisma || new PrismaClient();
if (process.env.NODE_ENV !== "production") global.prisma = prisma;
export * from "@prisma/client";
My env variable looks like: postgres://***-pooler.us-west-2.aws.neon.tech/noticit?pgbouncer=true&connect_timeout=20&connection_limit=40
The stacktrace:
error: unhandledRejection:
Invalid `prisma.dilicomNotification.create()` invocation:
Timed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 10, connection limit: 40)
PrismaClientKnownRequestError:
Invalid `prisma.dilicomNotification.create()` invocation:
Timed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 10, connection limit: 40)
at Cn.handleRequestError (/app/node_modules/@prisma/client/runtime/library.js:123:6817)
at Cn.handleAndLogRequestError (/app/node_modules/@prisma/client/runtime/library.js:123:6206)
at Cn.request (/app/node_modules/@prisma/client/runtime/library.js:123:5926)
at async l (/app/node_modules/@prisma/client/runtime/library.js:128:9968)
at async DilicomNoticeService._saveNotification (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:22:23)
at async DilicomNoticeService.pushNotification (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:114:9)
at async file:///app/apps/api-express/dist/routes/index.js:38:5 {"date":"Wed Nov 08 2023 09:19:44 GMT+0000 (Coordinated Universal Time)","error":{"clientVersion":"5.4.2","code":"P2024","meta":{"connection_limit":40,"timeout":10},"name":"PrismaClientKnownRequestError"},"exception":true,"os":{"loadavg":[8.58,6.9,5.23],"uptime":12024720.9},"process":{"argv":["/usr/local/bin/node","/app/apps/api-express/dist/index.js"],"cwd":"/app","execPath":"/usr/local/bin/node","gid":65533,"memoryUsage":{"arrayBuffers":134226,"external":4759807,"heapTotal":68726784,"heapUsed":41853328,"rss":198647808},"pid":1,"uid":1001,"version":"v20.9.0"},"stack":"PrismaClientKnownRequestError: \nInvalid `prisma.dilicomNotification.create()` invocation:\n\n\nTimed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 10, connection limit: 40)\n at Cn.handleRequestError (/app/node_modules/@prisma/client/runtime/library.js:123:6817)\n at Cn.handleAndLogRequestError (/app/node_modules/@prisma/client/runtime/library.js:123:6206)\n at Cn.request (/app/node_modules/@prisma/client/runtime/library.js:123:5926)\n at async l (/app/node_modules/@prisma/client/runtime/library.js:128:9968)\n at async DilicomNoticeService._saveNotification (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:22:23)\n at async DilicomNoticeService.pushNotification (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:114:9)\n at async file:///app/apps/api-express/dist/routes/index.js:38:5","trace":[{"column":6817,"file":"/app/node_modules/@prisma/client/runtime/library.js","function":"Cn.handleRequestError","line":123,"method":"handleRequestError","native":false},{"column":6206,"file":"/app/node_modules/@prisma/client/runtime/library.js","function":"Cn.handleAndLogRequestError","line":123,"method":"handleAndLogRequestError","native":false},{"column":5926,"file":"/app/node_modules/@prisma/client/runtime/library.js","function":"Cn.request","line":123,"method":"request","native":false},{"column":9968,"file":"/app/node_modules/@prisma/client/runtime/library.js","function":"async l","line":128,"method":null,"native":false},{"column":23,"file":"file:///app/apps/api-express/dist/services/DilicomNoticesService.js","function":"async DilicomNoticeService._saveNotification","line":22,"method":"_saveNotification","native":false},{"column":9,"file":"file:///app/apps/api-express/dist/services/DilicomNoticesService.js","function":"async DilicomNoticeService.pushNotification","line":114,"method":"pushNotification","native":false},{"column":5,"file":"async file:///app/apps/api-express/dist/routes/index.js","function":null,"line":38,"method":null,"native":false}]}
Unfortunately I’m not an expert in db performance so I don’t know what kind of statistics you could use regarding the db connections.
Thanks anyway or your help
Ok, so you basically have 3 “servers” running with your application code. An external source sends 75 request/second to these three servers combined.
(Is that correct? Are they load-balanced? How do these 75 requests get distributed across the 3 instances?).
You mentioned that you are running with a connection_limit
of 40, and a connect_timeout
of 20 - which means your pool_timeout
is the default of 10 (which the error message output confirms ).
Now we need to understand what the application does:
- How many queries does 1 request usually trigger?
- How long do these queries roughly take on average? Miliseconds? Seconds? 1 second? 20?
With that data, we will be able to do some calculations that will tell us if it is expected that your system runs into this error message because the capacity of the Prisma Client connection pool is exhausted.
Skipping a bit, to potentially unblock you:
- You can probably just increase the
pool_timeout
to a higher number than 10 and see what that does to your system. If it is at capacity, it will probably just lead to a few more queries succeeding before running into the same problem. - You can also further increase the number of connections you give Prisma via
connection_limit
. 3 servers x 40 connections only leads to a total of 120 connections open - and Neon can handle many more than that. Try to increase them slowly though (not directly to 400 - maybe start with 60) and see what the CPU and memory of your API hosts does. Keeping connections open and running queries on them is resource intensive - so sooner or later you will max out the resources and will get other, worse errors.
To understand what Prisma Client is doing on your servers, you can use Prisma Metrics. Put that into an endpoint that you can call, and you should be able to understand how many connections Prisma has open (should be 40 when enough requests come in). Optimally you can ask each of your three servers specifically so you can add up the numbers. If there is something weird, post the data here.
Finally, the number from the database/Neon side we are interested here is the simplest: Open connections to the database. I am not sure if and where you can see that in the Neon web UI (@Mahmoud?) - but if you connect to your database with another database GUI, you can just run SELECT count(*) FROM pg_stat_activity;
to get a rough number. Or you look at the full data via SELECT sa.* FROM pg_catalog.pg_stat_activity sa
and then can get an even better overview. The number should be roughly 120 if all the Prisma Client instances are fully utilizing their connection pool. If it is not, post here.
Hi,
On a daily basis I have less data to import, so I have scaled down to one replica.
I use AWS eventbridge target to make sure there are only N calls/sec sent to my replica.
I configured the event bridge to be 18 calls/sec. (it’s pretty low i think).
Though I’m still experiencing failures due to this error:
PrismaClientKnownRequestError:
Invalid `prisma.dilicomNotification.create()` invocation:
Timed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 20, connection limit: 100)
at Cn.handleRequestError (/app/apps/api-express/prisma/client/runtime/library.js:123:6817)
at Cn.handleAndLogRequestError (/app/apps/api-express/prisma/client/runtime/library.js:123:6206)
at Cn.request (/app/apps/api-express/prisma/client/runtime/library.js:123:5926)
at async l (/app/apps/api-express/prisma/client/runtime/library.js:128:9968)
at async DilicomNotificationModel.create (file:///app/apps/api-express/dist/models/DilicomNotificationModel.js:5:30)
at async DilicomNoticeService.pushNotificationFromDiliwhatPayload (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:126:34)
at async file:///app/apps/api-express/dist/routes/index.js:68:13 {
code: 'P2024',
clientVersion: '5.4.2',
meta: { connection_limit: 100, timeout: 20 }
I ran the command SELECT count(*) FROM pg_stat_activity;
on my db and I found 20
which seems a bit weird doesn’t it ?
There are also some occurences of this error, but only a few:
Can't reach database server at `ep-winter-scene-374387-pooler.us-west-2.aws.neon.tech`:`5432`
Please make sure your database server is running at `ep-winter-scene-374387-pooler.us-west-2.aws.neon.tech`:`5432`.
at Cn.handleRequestError (/app/apps/api-express/prisma/client/runtime/library.js:123:6817)
at Cn.handleAndLogRequestError (/app/apps/api-express/prisma/client/runtime/library.js:123:6206)
at Cn.request (/app/apps/api-express/prisma/client/runtime/library.js:123:5926)
at async l (/app/apps/api-express/prisma/client/runtime/library.js:128:9968)
at async DilicomNotificationModel.create (file:///app/apps/api-express/dist/models/DilicomNotificationModel.js:5:30)
at async DilicomNoticeService.pushNotificationFromDiliwhatPayload (file:///app/apps/api-express/dist/services/DilicomNoticesService.js:126:34)
at async file:///app/apps/api-express/dist/routes/index.js:68:13 {
code: 'P1001',
clientVersion: '5.4.2',
meta: {
database_host: 'ep-winter-scene-374387-pooler.us-west-2.aws.neon.tech',
database_port: 5432
}
}
When I launch the more precise request SELECT sa.* FROM pg_catalog.pg_stat_activity sa
, I only see a few lines working on noticit db (no more than 10).
Does that means that the pooler is not opening a sufficient number of connections ?