9 comments

  • eugercek 2 hours ago
    If you use xfs (+`file_copy_method=CLONE`) you can do this with Postgres 18.

    `CREATE DATABASE clankerdb TEMPLATE sourcedb STRATEGY=FILE_COPY;`

    But Ardent can be useful for many, because cloud providers use heavily restricted Postgres. And many people use Aurora, which doesn't even let you configure `log_line_prefix`.

    Though if cloud providers add file_copy_method=CLONE compatible managed pg ...

    ref: https://boringsql.com/posts/instant-database-clones/
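
    A minimal sketch of the statement above, written as a small Python helper that only builds the SQL text and does not connect to a server. The helper names `quote_ident` and `clone_database_sql` are hypothetical, and the sketch assumes `file_copy_method=CLONE` has already been configured on a filesystem that supports cloning, as the comment describes.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier: wrap in double quotes, double any embedded quotes."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier")
    return '"' + name.replace('"', '""') + '"'


def clone_database_sql(target: str, template: str) -> str:
    """Build a CREATE DATABASE statement that copies `template` with FILE_COPY.

    Whether the copy is an instant clone depends on the server's
    file_copy_method setting and filesystem, not on this statement.
    """
    return (
        f"CREATE DATABASE {quote_ident(target)} "
        f"TEMPLATE {quote_ident(template)} STRATEGY = FILE_COPY;"
    )


print(clone_database_sql("clankerdb", "sourcedb"))
```

    Note that Postgres refuses to copy a template database while other sessions are connected to it, so cloning a busy source this way needs some planning.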

  • znnajdla 3 hours ago
    “Never impacts production data” is impossible to guarantee. Playing with real-world data often has side effects outside of the database. For example, if you store OAuth tokens to external services in your DB (customer integrations), it’s easy to mess up your customers' data through a bad API call (been there, done that).

    There is still value in carefully testing on your prod DB, but for that you could just easily maintain a read replica. I don’t see the need for a SaaS here.

    • vc289 2 hours ago
      One of the main things people use us for is ease of testing writes on a per dev/agent basis which would be difficult on a read replica!

      On the real world data impact I absolutely agree. We added something called "branch hooks" which essentially let you define SQL to run against the branch before it's returned

      This lets you essentially anonymize and modify the branch to scrub unintended external side effects.

      It's something that we're still working on though and trying to design the right abstractions around because we want to get that part right.
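
      A minimal sketch of the "branch hooks" idea, assuming a hook is just a list of SQL statements run against a copy before anyone can use it. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the `make_branch` function, the `integrations` table, and the hook statements are all hypothetical and are not Ardent's actual API.

```python
import sqlite3


def make_branch(prod: sqlite3.Connection, hooks: list[str]) -> sqlite3.Connection:
    """Copy `prod` into a new in-memory database, then run scrub hooks on the copy."""
    branch = sqlite3.connect(":memory:")
    prod.backup(branch)  # full copy here; a real system would use copy-on-write
    for statement in hooks:
        branch.execute(statement)
    branch.commit()
    return branch


# Stand-in "production" database holding a secret that must not leak.
prod = sqlite3.connect(":memory:")
prod.execute(
    "CREATE TABLE integrations (id INTEGER PRIMARY KEY, email TEXT, oauth_token TEXT)"
)
prod.execute(
    "INSERT INTO integrations (email, oauth_token) "
    "VALUES ('alice@example.com', 'tok_live_abc')"
)
prod.commit()

# Hooks: drop tokens so the branch cannot call external services,
# and replace emails with synthetic addresses.
hooks = [
    "UPDATE integrations SET oauth_token = NULL",
    "UPDATE integrations SET email = 'user' || id || '@example.invalid'",
]

branch = make_branch(prod, hooks)
branch_rows = branch.execute("SELECT email, oauth_token FROM integrations").fetchall()
prod_rows = prod.execute("SELECT email, oauth_token FROM integrations").fetchall()
print(branch_rows)
print(prod_rows)
```

      Running the hooks before the branch is handed out is the important part: the token never exists in a database that a developer or agent can reach.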

    • 999900000999 1 hour ago
      If it’s production data I probably don’t trust a random startup with it.

      I’m very confused as to the target market here

  • jedberg 2 hours ago
    Looks interesting, curious what your moat here is. What prevents Supabase/Neon from doing this? Actually don't they already do this? How does this differ from the branching Neon and Supabase already offer?
    • vc289 2 hours ago
      We enable branching on any postgres DB through our architecture. So if you're on RDS, PlanetScale, etc. you can keep your DB where it is but also get the ability to branch with a full clone of the DB.

      Neon does support copy-on-write branching natively, along with autoscaling compute, but you make certain performance tradeoffs. A lot of the folks we've talked to who use RDS or PlanetScale rely on things like the query latencies that platform's specific architecture provides, but also want the ability to test on branches. We let you get the best of both worlds: branch, but leave your DB where it is and freely choose your production environment based on prod concerns.

      Supabase does have branching but they do not branch the data so you can't test any interactions that rely on the data. You can restore from backup as an option but this slows down based on data size since you're actually moving data as opposed to copy on write.

      Longer term we want to be the place you branch all your data infra. So expanding to S3, Snowflake, MySQL etc.

      For now though we're focusing on just postgres and getting it right!

  • nilirl 3 hours ago
    Hi, site looks beautiful!

    How does this compare to managing our own read-only replica with anonymized data?

    • vc289 2 hours ago
      A true read replica won't let you write! So if you need to test something like a backfill and see if anything goes wrong you wouldn't be able to quite as easily.

      We'd let you instantly clone prod + user defined auto-anonymization so you can test writes. The architecture also somewhat takes the place of an existing read replica if you want to use it like that to make it more cost efficient.

      Also since we're using copy on write for the clones they're incredibly storage efficient and the autoscaling compute helps minimize cost on clones by minimizing excess compute uptime
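
      A toy illustration of why copy-on-write clones are storage efficient, assuming a database can be modeled as numbered pages. The `Branch` class is hypothetical and says nothing about how Ardent's storage layer is actually built; it only shows that a branch stores the pages it has changed and shares everything else with the base.

```python
class Branch:
    """Toy copy-on-write view: reads fall through to the shared base, writes stay local."""

    def __init__(self, base: dict[int, bytes]) -> None:
        self._base = base                  # shared, never modified by the branch
        self._delta: dict[int, bytes] = {}  # pages this branch has overwritten

    def read(self, page_no: int) -> bytes:
        # Prefer the branch's private copy, otherwise use the shared page.
        return self._delta.get(page_no, self._base[page_no])

    def write(self, page_no: int, data: bytes) -> None:
        # Only the written page is stored privately.
        self._delta[page_no] = data

    def private_pages(self) -> int:
        return len(self._delta)


base = {n: b"prod" for n in range(1000)}
branch = Branch(base)
branch.write(7, b"test")

print(branch.read(7), branch.read(8), branch.private_pages())
```

      Creating the branch copies nothing, and after one write it holds a single private page while the base's 1000 pages are untouched.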

      • jagged-chisel 2 hours ago
        > A true read replica won't let you write!

        I mean, they said "read-only" ...

    • xnx 2 hours ago
      Ardent adds extra dependencies and cost.
  • fmajid 2 hours ago
    Doesn't look open-source. If you are interested in a Neon- or git-like branching experience for PostgreSQL, have a look at Xata, which is based on ZFS like Delphix was:

    https://github.com/xataio/xata

    • polskibus 44 minutes ago
      Would such an approach work for MS SQL?
  • cphoover 2 hours ago
    How many people are giving an LLM Agent full read access to their production data? That seems nuts to me.
    • evanvolgas 1 hour ago
      Evan here, from Ardent.

      It's not uncommon (hex.ai, etc all do this, as do developers, MCP tools, etc). One thing we do at Ardent is enable obfuscated read replicas. We can strip PII in the replicas, so your agents are operating on realistic (but not sensitive) data. Moreover, they can do so in a way that doesn't impact your production database and is fast enough to wire into your CI/CD processes.

      Jeremy is correct, though. The main risk is agents with write access. There have been two high-profile instances in the last year of agents dropping production databases (in one case, even after being given explicit instructions never to do such a thing). While read replicas of a primary DB solve the "agents can't destroy things" problem, they don't solve things like testing schema migrations (in particular) or updates to the data.

    • Normal_gaussian 1 hour ago
      Business side people install Claude, find it fantastic, read about postgres and BigQuery MCP, and immediately demand it.

      Small enough company without suitable MoC and they've got a real chance of getting it.

    • jedberg 2 hours ago
      I'm much more worried about people who give full write access to their agents! But at least this solves that problem.
      • cphoover 2 hours ago
        Jedberg... Wow an internet legend replied to me! ><

        > I'm much more worried about people who give full write access to their agents! But at least this solves that problem.

        Yeah it goes without saying that write access would be crazy... But, it seems like people don't really care about the fact that they are just giving their private data to companies like Anthropic, OpenAI and Google.

        > Branch anonymization Branches default to a full copy of your production data.

        <-- This doesn't seem a safe default to me...

        Perhaps a data policy should be required to be in place before a branch can be cloned... The default configuration giving the LLM full prod data access by default, is a bad standard to set, I think.

        • jedberg 2 hours ago
          > Jedberg... Wow an internet legend replied to me!

          Hey, I put on my pants the same way you do: by having my staff hold them up while I jump into them.

          > But, it seems like people don't really care about the fact that they are just giving their private data to companies like Anthropic, OpenAI and Google.

          This isn't quite as risky as it seems. All of them have a TOS that says if you pay them enough money they won't train on your data. But you're right that there are probably a lot of people who aren't on those plans sharing private data.

          > > Branch anonymization Branches default to a full copy of your production data.
          >
          > <-- This doesn't seem a safe default to me...

          Agreed, and I'm sure it will cause trouble if you don't also bring along with the copies the internal controls around access logging.

          But also, for smaller companies, this isn't an issue since they don't have SOC2 and the other compliance needs yet. So it's probably a sane starting place for Ardent at this time. Most small startups let everyone in the company access the full database anyway.

          > Perhaps a data policy should be required to be in place before a branch can be cloned... The default configuration giving the LLM full prod data access by default, is a bad standard to set, I think.

          Or at least an easy way to copy it from the database you're branching from.

          • vc289 56 minutes ago
            >> I'm sure it will cause trouble if you don't also bring along with the copies the internal controls around access logging

            Yep! Agreed. We've tried to combat this with the "branch_hooks" being team/org level policy objects so we can do enforcement of any kind on the branches before they're ever actually handed to users. This would be things like access control + defined anonymization rules. The broader hope with this class of objects/policies is they can serve as enforcement barriers and essentially allow scoped access at the org level across branches.

            The proxy we run in the middle also helps a lot here. Since the URL is minted by our control plane and is not the "real" DB url we can authenticate each user from the URL they're using and enforce RBAC controls.

            for example:

            User 1's API key is 1234

            The CLI can auto-construct urls like: postgresql://{APIKEY}:{ANYTHING}@{IDENTIFIER}--postgres.routing.tryardent.com:5432/DB_NAME?{params}

            Your API key is something that can be scoped per user

            This is an off-the-cuff example, but essentially we have a way of knowing who is calling the host, so we can enforce rules of the form "this API key can't access this DB" based on whatever policies are defined.

            Curious to understand what additional pieces would be helpful here because this is 100% very important to get right.
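
            A minimal sketch of the check described above, assuming the API key sits in the username position and the branch identifier is the part of the hostname before `--`, as in the example URL. The `ALLOWED` policy table, the `authorize` function, and the `feature-x` identifier are hypothetical; how the real proxy receives and validates these fields is not described in the thread.

```python
from urllib.parse import urlsplit

# Hypothetical policy table: API key -> branch identifiers that key may reach.
ALLOWED = {"1234": {"feature-x"}}


def authorize(url: str) -> bool:
    """Return True if the API key embedded in `url` may access the branch it names."""
    parts = urlsplit(url)
    api_key = parts.username or ""
    host = parts.hostname or ""
    # The identifier is everything before the first "--" in the hostname.
    identifier, separator, _ = host.partition("--")
    if not separator:
        return False  # hostname does not follow the expected pattern
    return identifier in ALLOWED.get(api_key, set())


ok = authorize("postgresql://1234:anything@feature-x--postgres.routing.tryardent.com:5432/app")
denied = authorize("postgresql://9999:anything@feature-x--postgres.routing.tryardent.com:5432/app")
print(ok, denied)
```

            Because the key is scoped per user, denying access becomes a lookup on the minted URL rather than a change to the underlying database's own roles.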
