Connect a database

Connect a real database and Catalyst gives you two things at once: a SQL IDE for querying it yourself, and an AI that understands its schema, so you can ask “what were the top ten customers by revenue last quarter?” and let the model write and run the SQL for you. Every query, whether you type it or the AI runs it, is read-only and automatically capped: you (and the model) can look but never change anything, and never pull an entire large table.

A SQL IDE

Multiple query tabs, a searchable schema browser, a results grid, and one-click export to CSV / Excel / TSV. See Query a database.

An AI that knows your schema

The model discovers your connections, inspects the tables it needs, and drafts or runs read-only SQL — then hands the rows to Python for analysis or plots.

Supported databases

Catalyst connects to six database engines:

PostgreSQL
MySQL (and MariaDB)
SQLite
DuckDB
SQL Server
Oracle

A connection is private to whoever created it. The credentials you enter are encrypted before they’re stored, and you never see the password again — Catalyst can use it, but it’s write-only from your side.

Adding a connection

Open Data Sources from the sidebar and add a connection.

Give it a name — this is how you (and the AI) refer to it later, so make it recognizable: Production analytics, billing read replica.
Pick the dialect (Postgres, MySQL, SQLite, DuckDB, SQL Server, or Oracle).
Point it at your database. You can either enter the fields — host, port, database, username, password, and SSL mode — or paste a connection string (postgresql://user:pass@host:5432/dbname) and Catalyst splits out the parts for you.
- SQLite and DuckDB are file databases — instead of a host, you upload the database file (.sqlite, .db, or .duckdb). Catalyst stores it privately on the server and queries it read-only.
- Oracle connects by service name (preferred) or SID.
Use Test connection before you save. Catalyst connects, reads back a few table names, and reports how many tables it found — so you know the credentials and network path actually work.
Save. Catalyst connects once, introspects the schema — tables, views, columns, types, primary and foreign keys — and caches a compact summary so the assistant is ready to answer questions without re-scanning your database every time.
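
To picture what pasting a connection string does, here is a minimal sketch using Python's standard `urllib.parse`. It only illustrates how a URL-style string breaks down into the form fields listed above; the function name `split_connection_string` and the dictionary keys are hypothetical, and this is not Catalyst's actual parser.

```python
from urllib.parse import unquote, urlsplit


def split_connection_string(url: str) -> dict:
    """Split a URL-style connection string into form-like fields (illustrative only)."""
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when the string omits it
        "database": parts.path.lstrip("/") or None,
        # Credentials may be percent-encoded inside a URL, so decode them.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


fields = split_connection_string("postgresql://user:pass@host:5432/dbname")
print(fields["host"], fields["port"], fields["database"])  # host 5432 dbname
```

Any field missing from the string comes back as `None`, which corresponds to a form field you would still need to fill in by hand.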

If the schema changes on your side — a new table, a renamed column — use Verify & refresh schema on the connection to re-introspect and update the cached summary.

Scoping to one schema

A connection has an optional default schema (under Advanced options). It controls which schemas the source exposes:

Set it and the source lists only that schema. This is essential for a huge multi-schema database — an Oracle warehouse with thousands of daily-snapshot schemas, for instance — where listing everything would be overwhelming and slow.
Leave it unset and the source lists all non-system schemas — so a Postgres database still surfaces public, auth, and any others, while catalog/system schemas are filtered out automatically.
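
The two cases above amount to a small filtering rule, sketched here in Python. The function `visible_schemas` is hypothetical, and the `SYSTEM_SCHEMAS` set is an assumed example of PostgreSQL catalog schemas; the list Catalyst actually filters may differ.

```python
# Assumed examples of PostgreSQL catalog/system schemas (not Catalyst's real list).
SYSTEM_SCHEMAS = {"pg_catalog", "information_schema", "pg_toast"}


def visible_schemas(all_schemas, default_schema=None):
    """Return the schemas a source would list under the rule described above."""
    if default_schema is not None:
        # Default schema set: list only that schema.
        return [s for s in all_schemas if s == default_schema]
    # Default schema unset: list everything except system schemas.
    return [s for s in all_schemas if s not in SYSTEM_SCHEMAS]


schemas = ["public", "auth", "pg_catalog", "information_schema"]
print(visible_schemas(schemas))          # ['public', 'auth']
print(visible_schemas(schemas, "auth"))  # ['auth']
```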

Advanced options also holds the per-connection row limit (default 1,000), which is the cap applied to every query against this source.
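
One common way to enforce a cap like this is to wrap the query in an outer `LIMIT` and fetch one extra row to detect trimming. The sketch below does that against an in-memory SQLite database; `run_capped` is a hypothetical helper, and the wrapping technique is an assumption for illustration, not a description of how Catalyst implements its limit.

```python
import sqlite3


def run_capped(conn, sql, row_limit=1000):
    """Run a query and return at most row_limit rows plus a 'trimmed' flag."""
    # Wrapping the query caps it whether or not it carries its own LIMIT.
    # Asking for one extra row reveals whether anything was cut off.
    inner = sql.rstrip().rstrip(";")
    wrapped = f"SELECT * FROM ({inner}) AS capped LIMIT ?"
    rows = conn.execute(wrapped, (row_limit + 1,)).fetchall()
    return rows[:row_limit], len(rows) > row_limit


# Demo data only; this setup sits outside the read-only query path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(2500)])

rows, trimmed = run_capped(conn, "SELECT id FROM orders")
print(len(rows), trimmed)  # 1000 True

small_rows, small_trimmed = run_capped(conn, "SELECT id FROM orders LIMIT 5")
print(len(small_rows), small_trimmed)  # 5 False
```

The `trimmed` flag is what lets an interface report that a result was cut short instead of silently dropping rows.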

What “read-only” actually guarantees

The guarantees below aren’t settings you can turn off; they’re how the system is built. Every query, from the editor’s Run button to anything the AI runs, funnels through one path, and that path always:

Rejects anything that isn’t a read. Only SELECT and its read-only relatives (such as WITH … SELECT, EXPLAIN, SHOW, DESCRIBE) are allowed. INSERT, UPDATE, DELETE, DROP, CREATE, and ALTER, including any of them hidden inside a subquery or a CTE, are refused before the query ever reaches your database. Catalyst checks for allowed statement types rather than blocklisting dangerous keywords, so an unfamiliar construct fails closed.
Connects read-only. On top of the statement check, the connection itself runs queries inside a read-only transaction wherever the database supports it.
Caps every result automatically. A query with no LIMIT, or with a LIMIT above the connection’s row limit, is clamped to that limit, so you cannot pull an entire large table. The default cap is 1,000 rows (settable per connection), and the IDE tells you when a result was trimmed rather than silently dropping rows.
Times out. A query that runs too long is cancelled, and you get an actionable message (“narrow it with a WHERE clause or a smaller LIMIT”) instead of a hung session.

Catalyst keeps no copy of your data: it runs queries against your database and shows you the rows. The only things it stores are the connection (encrypted) and the compact schema summary.

Next: query the database — in the IDE or by chatting.