Expose a database to an AI agent safely: a checklist
If you're giving an AI agent read access to MySQL, Postgres or SQL Server, run through this checklist first: roles, timeouts, predicate injection, audit log and what not to do.
If you are about to point an AI agent at your production database, stop and read this first.
The good news: read-only access from an AI agent is safe when set up correctly. The bad news: the defaults of most “connect your database” wizards are not.
This is the checklist we use ourselves and recommend to every customer.
1. Use a separate database role for the agent
Do not reuse the application user. Do not use the admin user. Create a fresh role:
-- PostgreSQLCREATE ROLE ai_agent LOGIN PASSWORD '...' NOINHERIT;REVOKE ALL ON ALL TABLES IN SCHEMA public FROM ai_agent;GRANT USAGE ON SCHEMA public TO ai_agent;GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_agent;ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO ai_agent;For MySQL it is similar: GRANT SELECT ON db.* TO 'ai_agent'@'%'. Do not grant any other verb. Specifically not INSERT, UPDATE, DELETE, CREATE, DROP, ALTER, INDEX, REFERENCES, FILE.
2. Lock down the system catalogs
On PostgreSQL, pg_catalog and information_schema are readable by anyone by default and can leak more than you think: table sizes, row counts, schema of tables you didn’t intend to share, function bodies.
REVOKE USAGE ON SCHEMA pg_catalog FROM ai_agent; -- careful, breaks toolingREVOKE SELECT ON ALL TABLES IN SCHEMA information_schema FROM ai_agent;If your tooling needs catalog access to discover schema, expose a curated view over information_schema.columns with only the tables you want visible.
3. Hard-cap statements with timeouts
-- PostgreSQL: set on the roleALTER ROLE ai_agent SET statement_timeout = '15s';ALTER ROLE ai_agent SET idle_in_transaction_session_timeout = '30s';ALTER ROLE ai_agent SET lock_timeout = '0';MySQL: use MAX_EXECUTION_TIME hint or a per-session setting from the agent driver.
A runaway query under a chat UI is worse than failing fast.
4. Use a read replica, not the primary
NL queries are bursty, unpredictable, and sometimes expensive. Replicas isolate the blast radius. Add a managed read replica (RDS / Cloud SQL / self-managed), and point the agent there.
Bonus: if a query goes wrong, the replica gets behind, not the primary.
5. Enforce tenancy outside the LLM
If your database has multi-tenant data, do not trust the LLM to add WHERE org_id = ?. Use:
- PostgreSQL row-level security, with the agent’s session setting
app.current_org_id. - Or a SQL post-processor that AST-rewrites every query to inject
WHERE org_id = ?on every base table.
We have seen both approaches in production; RLS is more robust because it works even if the post-processor has a bug.
6. Filter what the LLM “sees”
The agent should not see the raw users table with PII. Build a view:
CREATE VIEW v_users AS SELECT id, created_at, country, plan_tier, lifecycle_stage FROM users;GRANT SELECT ON v_users TO ai_agent;REVOKE SELECT ON users FROM ai_agent;The agent answers questions over v_users. Email addresses, phone numbers and hashed passwords are simply not in the schema it can query.
7. Wrap every query in a LIMIT
In the application layer that submits to the database, force-add LIMIT 10000 (or whatever your ceiling is) if not present. Aggregations should already be bounded, but defensive depth is cheap.
8. Audit log every query
The agent’s database session should write every executed query to an append-only log: (timestamp, user_id, question, sql, row_count, duration_ms, error?). Retain for at least 90 days.
When something looks off, you have a record of exactly what ran. Without this, debugging is impossible.
9. Network: egress only
If your DB is in a VPC, the agent should connect from the agent’s network to the DB endpoint with TLS. Do not open the DB port to the internet. Use:
- A private endpoint inside the same VPC.
- A bastion or session manager.
- Or an outbound-only sync runner running inside your VPC that opens a connection to the analytics SaaS, never the other direction.
The third option is the right one for most enterprise setups. AnalytAI’s Edge Sync Runner works this way.
10. Rotate credentials
Standard hygiene, often forgotten:
- Rotate the
ai_agentpassword every 90 days. - Use connection strings that get pulled from a secret manager, not env files.
- If you must use env files, do not commit them.
What to skip
A few things you will see in security checklists that don’t help here:
- Encrypting queries before they hit the LLM: not really a thing. The LLM needs to read the query to answer.
- Differential privacy: overkill for B2B analytics in 99% of cases.
- Trying to “scrub” PII from the schema dynamically: fragile. Use views instead.
A pre-flight test
Before connecting an agent to production, run this five-minute test:
- Try to insert a row. Confirm: permission denied.
- Try to select from
usersdirectly. Confirm: permission denied (because you moved the agent tov_users). - Try a query that estimates to 60 seconds. Confirm: statement timeout fires.
- Cross-tenant test: ask “show all customers” with row-level security enabled. Confirm only the right org_id rows return.
- Check the audit log shows the queries from steps 1–4.
If all five pass, you are ready.
Doing this with AnalytAI
AnalytAI ships read-only by design. The SQL connector takes a read-only credential and we encourage replica-side connections. The Edge Sync Runner option keeps the connection outbound from your network.
Book a 20-minute demo to talk through the security posture for your environment.
Related reads:
See AnalytAI on your data
Bring one Tally company or one SQL database. We will turn it into a live dashboard on a 20-minute call.
Book a demo