Snowflake IDs need access to Postgres instance ID

I want to use this Postgres function to generate Snowflake IDs:

Would there be a way to get an integer ID for whichever Postgres instance the function is running in? This is needed for the shard_id variable.

PS: I would appreciate early access to Neon!

“Postgres instance” is a fuzzy concept. If the Postgres server is restarted, is it still the same instance or should the ID change? If you make a clone of the database, with a branch or more traditionally by restoring a backup, is the new clone a different instance? If you call the function in a read-only replica of the primary instance, is it a different instance?

The idea is to avoid duplicate IDs, so I guess you would want to treat all of those cases as a different “shard” if possible. But then you have the problem of what ID to pick for each case. There is no global registry of “instances” that you could refer to. The shard ID is only 13 bits, so you cannot just assign a random shard ID without risking collisions.

The Instagram blog post on Snowflake IDs describes how they did it. They assign the shard ID when they create the schema for each shard. It’s a constant in the function itself. That works, but it’s up to you to assign the ID, and if necessary, change it if you create a branch or restore from a backup. A read-only replica would have the same ID as the primary, but maybe that’s OK if you never need to assign IDs in read-only nodes. It depends on how the application uses the ID, when exactly you need to change the ID.

1 Like