AT THE FRONTIER OF AGENTIC WORKFLOWS

We gave our team's agents direct database access. Now you get the same.

Most "AI for business" products bolt a chat box onto a dashboard and call it a day. We're somewhere else on the curve. ThoughtLeaders rebuilt itself around AI in 2026, and along the way we learned something about how AI agents actually create value: it's freedom - give them access to data and let them fly.

01 · We watched the work

Agents are good at SQL. They are bad at clicking.

We instrumented every internal AI workflow for six months. The pattern was overwhelming: the moment a question got non-trivial — a join, a window function, an unusual filter — the agent wanted to write SQL, not navigate an API. Every layer of abstraction between the agent and the data was a tax. Endpoints, pagination, helper commands — they all got in the way of the question the agent was actually trying to ask.

02 · We made the call

Almost-direct read access to the database.

So we gave it to them. The agent writes a SELECT; the server enforces the guardrails (SELECT-only, plan-scoped column access, LIMIT caps, AST depth limits). No structured endpoints to wrap, no SDK to keep in sync with the schema, no "you must request access to field X" friction. The agent reads the schema once at the start of the conversation and from then on it asks the database directly — in PostgreSQL, in Elasticsearch DSL, or in Firebolt SQL, depending on which moat it needs.

03 · We rolled it out internally

Every account manager. Every analyst. Every day.

The internal CLI has been the daily driver for TL's account managers, analysts, and leadership since Q1 2026. Emma's rebooking-opportunity agent — one account manager's homebrew workflow built on top of tl db pg and tl db es — surfaces renewal pitches no dashboard could have surfaced and drafts the outreach email beside them. Multiply that by the team. The bet paid off: questions that used to be "ask data eng next sprint" are now answered in the agent's next turn.

04 · We're opening it

The same three commands. The same data. Now external.

ThoughtLeaders.AI productizes that exact setup. Your agent gets read access to the same three databases TL's own agents use — seven years of matched sponsorship outcomes, 1.3M tracked channels, every transcript and brand mention indexed and joined. The schema below is the whole API surface. There isn't a second one hidden behind it.

THE THREE COMMANDS

All you need is `tl db`.

Three databases sit behind ThoughtLeaders, each tuned for a different shape of question. The CLI exposes them as three thin commands that take a query, run it, and stream the result. Same auth, same response envelope, same output flags (--json, --csv, --md, --toon) on all three.

Command	Engine	What it does	Why use it
tl db pg	PostgreSQL	Executes one SELECT against the source-of-truth Postgres. Deals, channels, brands, profiles, organizations. Joins, GROUP BY, window functions, percentiles — all accepted. SELECT-only, LIMIT ≤ 500.	Use it for any analytical question that needs structured joins — pipeline math, deal counts by brand, channel cohorts, who-sold-what-to-whom. This is where seven years of matched outcomes live.
tl db es	Elasticsearch	Executes one Elasticsearch DSL query against the content index. Videos, titles, descriptions, full transcripts, brand mentions (paid + organic), demographic and reach fields on channel docs. Query types and shapes are an allowlist.	Use it for content discovery and full-text work — "where was brand X mentioned in the last 6 months", "videos whose transcripts discuss Y", "find every channel running a sponsored read for category Z". This is the brand-extractor moat.
tl db fb	Firebolt	Executes one single-table SELECT against the historical time-series store. 8.5B rows across `article_metrics` (per-video daily snapshots) and `channel_metrics` (per-channel daily snapshots). The leading index column must appear in WHERE.	Use it when you need a value at a point in time that's no longer in the current snapshot — view curves, evergreen multipliers, channel growth trajectories, historical CPM context. Anything "what did this look like on date D".

See the commands in 18 worked use-cases → — or keep reading for the full schema reference.

01 · POSTGRESQL

The Postgres tables you'll actually query.

`thoughtleaders_adlink`

150K rows

The deals table. Every match, proposal, sold ad, and rejection is one row. Internally called "AdLink" — exposed as sponsorships in the CLI.

Column	Type	Notes
id	int	Primary key
publish_status	int	3 Sold · 4 Brand-rej · 5 Pub-rej · 7 Matched · 9 Agency-rej · 10 Open. An Open deal's progress is tracked by the three approval columns below, not by status.
price	numeric	Deal price (USD). Visible to media-buyer accounts.
cost	numeric	TL cost. Visible to media-seller accounts.
weighted_price	numeric	price × pipeline weight. Stored, not computed.
brand_approval_status	int	Brand sign-off on an Open deal: 1 Pending · 2 Approved · 3 Finished · NULL. Visible to brand-side accounts.
channel_approval_status	int	Channel sign-off on an Open deal: 1 Pending · 2 Approved · 3 Finished · NULL. Visible to publisher-side accounts.
agency_approval_status	int	TL/agency sign-off on an Open deal: 1 Pending · 2 Approved · 3 Finished · NULL. Superuser-only.
first_contacted_party	int	Which side opened the deal: 1 Brand · 2 Channel · NULL. Superuser-only.
ad_spot_id	int FK	→ thoughtleaders_adspot.id (your join hop to channel)
creator_profile_id	int FK	→ thoughtleaders_profile.id (your hop to brand)
owner_advertiser_id	int FK	→ auth_user (brand-side owner)
owner_publisher_id	int FK	→ auth_user (channel-side owner)
owner_sales_id	int FK	→ auth_user (TL rep)
created_at	timestamptz	When the row was created. Use for "all flow over time."
purchase_date	timestamptz	When the deal was sold. NULL if never sold. Use for revenue.
send_date	timestamptz	Scheduled video drop date.
publish_date	timestamptz	Actual video drop date. NULL until live.
outreach_date	timestamptz	When outreach went out (sparse).
rejection_reason	int	1–24 codes. 1–9 brand-side, 10–17 publisher-side, 18 demo-mismatch, 19/21/22/24 quality, 20 brand-history, 23 cadence.
payment_status	int	0 Unpaid · 1 Paid
performance_grade	int	TL-internal performance rating
article_id	varchar	<channel_id>:<youtube_id> — joins to ES _id and slices Firebolt
dashboard_campaign_id	int FK	Campaign grouping
tx_data	jsonb	Transaction metadata

`thoughtleaders_channel`

1.3M rows

Every YouTube channel TL has ever observed. The discovery table.

Column	Type	Notes
id	int	Primary key
channel_name	varchar	Display name (NOT `name`)
external_channel_id	varchar	YouTube channel ID (UCxxxxx)
url	varchar	Channel URL
reach	bigint	Subscriber count (NOT `subscribers`)
impression	numeric	Projected views per upload — used for forward-looking CPM
media_selling_network_join_date	date	MSN membership = NOT NULL. Scrubbed from the advertiser sandbox view.
is_tl_channel	boolean	TPP partner — the elite ~170 channels TL manages directly
is_active	boolean	Always include is_active=TRUE unless you specifically want archived rows
content_category	int	1–22 enum. Use the recommender for ranked discovery; this column is best-effort.
country	varchar	ISO 3166-1 alpha-2. ~10% NULL.
language	varchar	ISO 639-1 (mostly 2-letter), occasionally 3-letter long-tail. NOT BCP-47.
last_published	date	Channel's most recent upload date. Use for "is the channel still alive?"
sponsorship_score	double	TL-internal channel-quality score. Useful tiebreaker.
description	text	LLM-generated channel description (regex-able for thematic filtering)
evergreenness	float	Cached evergreen multiplier
demographic_usa_share	int 0–100	% of audience in the US
demographic_male_share	int 0–100	% male audience
demographic_age	jsonb	{"18-24": 30, "25-34": 45, ...}
demographic_device	jsonb	{"mobile": 60, "desktop": 28, "tv": 12}
demographic_geo	jsonb	{"US": 60, "GB": 15, ...}
demographic_device_primary	varchar	Materialized top-device convenience field
total_views	bigint	All-time channel views
demographics_updated_at	timestamptz	When demographic screenshots last processed

`thoughtleaders_brand`

61K rows

Every brand TL has ever observed advertising or being mentioned. The brand intelligence table.

Column	Type	Notes
id	int	Primary key
name	varchar	Brand name
description	text	Brand description
creator_id	int FK	User who created the row

`thoughtleaders_profile`

20K rows

User profile. The bridge between auth_user and brands. Each profile attaches one brand in practice.

Column	Type	Notes
id	int	Primary key
organization_id	int FK	→ thoughtleaders_organization.id
persona	int	User persona code (advertiser / publisher / agency / TL)
media_buying_network_join_date	date	MBN membership = NOT NULL

`thoughtleaders_adspot`

43K rows

Buyable ad placements. Channel × seller × integration type. List prices live here; actual deal prices live on the adlink.

Column	Type	Notes
id	int	Primary key
channel_id	int FK	→ thoughtleaders_channel.id
price	numeric	List/catalogue price
cost	numeric	List/catalogue cost
integration	int	1 = YouTube Mention (live read). One active mention adspot per channel.
is_active	boolean	Active flag
publisher_id	int FK	→ auth_user.id (NOT profile)

`thoughtleaders_organization`

19K rows

Top-level organization. Brands and publishers each belong to one. Permission scoping happens here.

Column	Type	Notes
id	int	Primary key
name	varchar	Org name
plan	varchar	Plan tier — gates intelligence access

`auth_user`

20K rows

Standard Django users. Owners of deals (advertiser, publisher, sales) all point here.

Column	Type	Notes
id	int	Primary key
first_name	varchar	First name
last_name	varchar	Last name
email	varchar	Email

`Junction tables`

— rows

Many-to-many bridges between the core entities.

Column	Type	Notes
thoughtleaders_profile_brands	(profile_id, brand_id)	Profile ↔ Brand. In practice 1:1.
thoughtleaders_brand_brands	(from_brand_id, to_brand_id)	Self-referential brand similarity

$ tl db pg "
  SELECT b.name, COUNT(*) AS deals
  FROM thoughtleaders_adlink a
  JOIN thoughtleaders_profile p  ON a.creator_profile_id = p.id
  JOIN thoughtleaders_profile_brands pb ON p.id = pb.profile_id
  JOIN thoughtleaders_brand b    ON pb.brand_id = b.id
  WHERE a.publish_status = 3
    AND a.purchase_date >= NOW() - INTERVAL '12 months'
  GROUP BY b.name
  ORDER BY deals DESC
  LIMIT 20 OFFSET 0"

Top brands by sold-deal count, last 12 months. Three joins. One round-trip.

02 · ELASTICSEARCH

Video content, transcripts, and brand mentions.

Index is fixed server-side. The only time-window knob is a publication_date range filter inside the body. Article docs and channel docs share the index via a doc_type join — most filtering happens against article docs.

Field	Type	Notes
id	keyword	Compound <channel_id>:<youtube_id> — matches PG adlink.article_id
title	text	Video title
description	text	Video description
transcript	text	Full transcript text
summary	text	AI-generated video summary
publication_date	date	When video was published — your only time-window knob
views / total_views / projected_views	long	Current view counts
likes	long	Like count
comments	integer	Comment count
duration / duration_longform / duration_shorts	integer	Seconds. <61 = Short.
content_type	keyword	longform / short / live
content_category	keyword	Category label
language / country / hashtags / format	keyword	Categorical filters
face_on_screen	boolean	Is the host on screen?
sponsored_brand_mentions	keyword[]	IDs of brands paid to be mentioned in this video
organic_brand_mentions	keyword[]	IDs of brands mentioned without payment
all_brand_mentions	keyword[]	Union of the above
brand_mentions	nested	Full mention objects with timestamp/sentiment
channel.id / channel.country / channel.language / channel.content_category / channel.format	keyword	Embedded channel SUBSET on article docs (only these fields)
reach / impression / impression_live / impression_shorts	long	Channel-doc only. NOT on the embedded channel.* on article docs.
is_tl_channel / is_active / has_outreach_email	boolean	Channel-doc fields
media_selling_network_join_date	date	Channel-doc field — MSN status

$ tl db es '{"size": 50,
  "track_total_hits": true,
  "query": {"bool": {"must": [
    {"term": {"sponsored_brand_mentions": "5612"}},
    {"range": {"publication_date": {"gte": "now-6M"}}}
  ]}},
  "_source": ["id","title","channel.id","publication_date","views"]}'

Find every video where Surfshark (brand id 5612) was a paid sponsor in the last 6 months.

Not accepted: query_string, regexp, wildcard, fuzzy, more_like_this, parent/child joins, anything containing script, multiple aggregations in one body, deep scroll/pit pagination. Use search_after + sort to walk past 10,000.

03 · FIREBOLT

Historical metrics. 8.5 billion rows. Two tables.

Use Firebolt only when you need a value at a point in time that's no longer in the current ES snapshot — view curves, evergreen multipliers, growth trajectories. For "current views/subs", use ES.

`article_metrics`

7.4B rows · 159 GiB

Video-level snapshots. Each row = one scrape of one video on one date.

Column	Type	Notes
id	TEXT	Bare YouTube video ID — NOT the compound form
channel_id	INT	TL channel ID. MUST appear in WHERE.
publication_date	DATE	When the video published
scrape_date	DATE	When this row was captured
age	INT	Days since publication = scrape_date − publication_date
view_count	INT	Cumulative views at time of scrape
like_count	INT	Cumulative likes
comment_count	INT	Cumulative comments
duration	INT	Video duration in seconds

PRIMARY (channel_id, id) — channel_id MUST be equality- or IN-filtered.

`channel_metrics`

1.1B rows · 6.9 GiB

Channel-level snapshots. Subscriber and total-view trajectory.

Column	Type	Notes
id	INT	TL channel ID. MUST appear in WHERE.
scrape_date	DATE	When this row was captured
reach	INT	Subscriber count at time of scrape
total_views	INT	Channel total views at time of scrape

PRIMARY (id) — id MUST be equality- or IN-filtered.

$ tl db fb "
  SELECT age, view_count, like_count
  FROM article_metrics
  WHERE channel_id = 12465 AND id = 'dQw4w9WgXcQ'
  ORDER BY age"

The view curve for one video. Composite key required (channel_id is the leading index).

Joining across the three sources.

You join in jq or duckdb, not in SQL — the three engines are physically separate. The keys you stitch on are stable:

Postgres	Elasticsearch	Firebolt
channel.id	channel.id (article docs)	article_metrics.channel_id · channel_metrics.id
adlink.article_id <channel_id>:<youtube_id>	_id (same compound form)	article_metrics.id (strip the channel_id: prefix)
brand.id	sponsored_brand_mentions[] · organic_brand_mentions[]	—

Server-side guardrails (so your agent can't blow itself up).

SELECT only. No DDL, DML, transactions, SET, COPY, MERGE.
LIMIT ≤ 500 on every PG query (or the server fills in 50).
OFFSET ≥ 10,000 rejected — paginate via the response's next_offset.
SQL ≤ 50K chars, AST depth ≤ 64, ≤ 5,000 nodes.
RLS-style scoping on the user's plan: media buyers see price not cost; sellers see cost not price.

ES allowlist of query types — see the not-accepted list above.
Firebolt single-table only, no JOIN/CTE/subquery; leading index column required in WHERE.
Aggregation queries bill on min(hits.total, 200), not on len(hits).
Stable response envelope: { results, total, usage, _breadcrumbs }.
Bearer-token auth via PKCE OAuth or scoped API key. Same on every endpoint.

The schema is the product. The CLI is just one client.

See 18 end-to-end workflows → Build your own client →

We gave our team's agents direct database access. Now you get the same.

Agents are good at SQL. They are bad at clicking.

Almost-direct read access to the database.

Every account manager. Every analyst. Every day.

The same three commands. The same data. Now external.

All you need is tl db.

The Postgres tables you'll actually query.

thoughtleaders_adlink

thoughtleaders_channel

thoughtleaders_brand

thoughtleaders_profile

thoughtleaders_adspot

thoughtleaders_organization

auth_user

Junction tables