Practical AI Augmented Data Engineering — Part 2
2026-04-25

30+ Industry Certifications
rapiddata.dev – Discover Cloud Based Services
Transforming businesses through AI and data solutions
databricks.news – Unofficially Essential
The must-read Databricks newsletter scanning hundreds of sources weekly
dailydatabricks.tips – Tips, Tricks & Hacks
Small actionable pieces of information. Document the Undocumented. D-R-Y IRL
myyearindata.com – Data Engineering & AI Insights
Practical perspectives on building intelligent data systems


Microsoft Certified Trainer Fast Track Power BI Solution Architect
onedaybi.com | selfservicebi.co.uk | hello@onedaybi.com
Eight moving parts. One pipeline to build.
Now watch what happens when we use all eight — at once — on one real pipeline.
📦 dbt-pipeline-toolkit 🚧 Work in Progress
/plugin marketplace add KavasiMihaly/AI-plugins
👤 Product Owner touchpoint 🔒 Quality gate ✨ Plugin bonus — beyond the SOP
Demo 3.1
Live: orchestrator agent kicks off the pipeline build. It never writes SQL itself — it hands every unit of work to a specialist sub-agent.
business-analyst, data-explorer, dbt-staging-builder, dbt-fact-builder…
The manager doesn’t write the code. The specialists do.
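
The delegation idea can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the specialist names come from the slide, while the `WorkUnit` type, the work-unit kinds, and the `route` function are hypothetical.

```python
from dataclasses import dataclass

# Specialist names are from the slide; the keys on the left are hypothetical.
SPECIALISTS = {
    "requirements": "business-analyst",
    "profiling": "data-explorer",
    "staging": "dbt-staging-builder",
    "fact": "dbt-fact-builder",
}


@dataclass(frozen=True)
class WorkUnit:
    kind: str    # what sort of work this is
    target: str  # what it applies to (a subject area or model name)


def route(unit: WorkUnit) -> str:
    """Return the specialist that should handle this unit of work.

    The orchestrator only routes. If no specialist fits, it raises
    instead of doing the work itself.
    """
    try:
        return SPECIALISTS[unit.kind]
    except KeyError:
        raise ValueError(f"no specialist registered for {unit.kind!r}")


plan = [WorkUnit("requirements", "sales"), WorkUnit("staging", "stg_orders")]
assignments = [(unit.target, route(unit)) for unit in plan]
```

The design choice worth noticing is the error path: an unknown kind of work stops the run rather than falling back to the manager.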

Demo 3.2
Live: two agents in flight at once. One waits on a human answer; the other crunches 1.2M rows and pings us when it’s done.
⚠️ Background agents must run without approval prompts. If a tool needs a human click and no one is watching, the agent sits blocked forever. Pre-allow every tool a background agent uses — or keep it in the foreground.
Block only for the work you actually need to see.
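
A rough analogy in plain Python `asyncio`, assuming nothing about how the plugin schedules agents: one task waits on a human answer while another finishes its work and reports back. The function names and the sample question are hypothetical.

```python
import asyncio


async def ask_human(question: str, answer: asyncio.Future) -> str:
    # Foreground work: this task cannot continue until a person replies.
    return await answer


async def profile_rows(n_rows: int) -> str:
    # Background work: stands in for a long-running profile, no prompts allowed.
    await asyncio.sleep(0)
    return f"profiled {n_rows} rows"


async def main() -> list:
    loop = asyncio.get_running_loop()
    answer = loop.create_future()
    events = []

    background = asyncio.create_task(profile_rows(1_200_000))
    foreground = asyncio.create_task(ask_human("What is the grain?", answer))

    # The background task completes while the foreground one is still waiting.
    events.append(await background)
    events.append(f"foreground done: {foreground.done()}")

    # Only now does the human answer arrive and unblock the foreground task.
    answer.set_result("one row per order line")
    events.append(await foreground)
    return events


events = asyncio.run(main())
```

If `profile_rows` itself needed a human click, it would wait forever in exactly the way the warning describes.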


Demo 3.3
Live: three builder agents run at once — dim_customer, dim_product, dim_date — each in its own git worktree. No lock contention. No stepping on each other. One merge at the end.
Parallelism scales with worktrees, not tabs.
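
A minimal sketch of the setup step, assuming one `git worktree add -b <branch> <path>` call per builder. The model names are from the slide; the repository path, the `wt-` directory layout, and the `build/` branch naming are hypothetical. The commands are only built and printed here, not executed.

```python
from pathlib import PurePosixPath

MODELS = ["dim_customer", "dim_product", "dim_date"]  # from the slide


def worktree_command(repo: str, model: str) -> list:
    """Build one atomic `git worktree add` call for a builder agent."""
    path = PurePosixPath(repo).parent / f"wt-{model}"  # hypothetical layout
    branch = f"build/{model}"                          # hypothetical naming
    return ["git", "-C", repo, "worktree", "add", "-b", branch, str(path)]


commands = [worktree_command("/work/pipeline", model) for model in MODELS]
for cmd in commands:
    print(" ".join(cmd))
```

Each agent gets its own directory and its own branch, so the three builds never touch the same working tree.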


Demo 3.4
Live: the plugin interrupts the human exactly twice — once to collect requirements (Step 4), once to validate the proposed model design (Step 6). Everything else runs unattended.
business-analyst runs a structured AskUserQuestion covering business goals, KPIs, grain, rules. Answers land in Section 1 of the design doc.
Two interruptions. One to understand the human. One to get their sign-off.


Demo 3.5
Live: every agent reads and writes to one markdown file — pipeline-design.md. Business analyst fills Section 1. Data explorer fills Section 3. The builder reads both and never re-asks the user.
Memory lives in the repo, not the chat.
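
One way to picture the shared file is as a set of numbered sections that agents fill and read. The sketch below assumes headings of the form `## Section N: Title`, which is a guess at the layout of pipeline-design.md, not its confirmed format. The helper names and sample text are hypothetical.

```python
import re


def _section_pattern(number: int) -> "re.Pattern":
    # Heading line, then everything up to the next section heading or end of file.
    return re.compile(
        rf"(^## Section {number}\b[^\n]*\n)(.*?)(?=^## Section \d+\b|\Z)",
        re.MULTILINE | re.DOTALL,
    )


def write_section(doc: str, number: int, body: str) -> str:
    """Return the document with the body of one section replaced."""
    pattern = _section_pattern(number)
    if not pattern.search(doc):
        raise KeyError(f"Section {number} not found")
    return pattern.sub(lambda m: m.group(1) + body.strip() + "\n\n", doc, count=1)


def read_section(doc: str, number: int) -> str:
    """Return the body of one section, stripped of surrounding whitespace."""
    match = _section_pattern(number).search(doc)
    if match is None:
        raise KeyError(f"Section {number} not found")
    return match.group(2).strip()


doc = "# Pipeline design\n\n## Section 1: Requirements\n\n## Section 3: Source profile\n\n"
doc = write_section(doc, 1, "Grain: one row per order line")  # business analyst
doc = write_section(doc, 3, "orders: 1.2M rows")              # data explorer

# The builder reads both sections instead of asking the user again.
requirements = read_section(doc, 1)
profile = read_section(doc, 3)
```

Because the state is a file in the repo, it survives a new chat session and shows up in code review like any other change.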

Demo 3.6
Live: the agent reaches out through an MCP server to SQL Server — reading schemas, running profiles, executing dbt commands. Every call is explicit, inspectable, and replayable.
sql-server-reader, sql-executor, dbt-runner. Not raw ODBC; not bash pipes.
Structured tool calls, not bash-and-pray.

Demo 3.7
Goal: zero approval prompts. How: every Bash call is atomic → matched by the PreToolUse allowlist → runs silently. Compound commands miss the allowlist → approval prompt → background agents hang.
Forbidden in one Bash call
&& || ; chains · | |& pipes · & background jobs · (…) subshells · `…` $(…) substitution · <<EOF heredocs · cd <path> && (use git -C <path> … instead)
If you need multi-step
▸ N separate Bash calls — read each result, decide next.
▸ A script in skills/<name>/scripts/ — logic in code, invoked atomically.
Atomic → pre-allowed → no prompts → agents run unattended.
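
The rule above can be approximated with a deliberately conservative check. This is a sketch, not the plugin's allowlist matcher: `is_atomic` is a hypothetical helper that rejects any command containing one of the forbidden constructs, even inside quotes, so it also rejects harmless cases such as `2>&1`.

```python
# Characters that signal chains, pipes, background jobs, subshells,
# or command substitution. "$(" is covered by "(".
FORBIDDEN_CHARS = set(";|&()`\n")
FORBIDDEN_SUBSTRINGS = ("<<",)  # heredocs


def is_atomic(command: str) -> bool:
    """Return True only if the command contains no compound-shell construct."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    if any(sub in command for sub in FORBIDDEN_SUBSTRINGS):
        return False
    return True


checks = {
    "git -C repo status": is_atomic("git -C repo status"),
    "cd repo && git status": is_atomic("cd repo && git status"),
    "dbt run | tee log": is_atomic("dbt run | tee log"),
    "echo $(date)": is_atomic("echo $(date)"),
}
```

Erring toward rejection fits the goal: a false rejection costs one rewrite, while a missed compound command leaves a background agent waiting on a prompt.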

Demo 3.8
Live: the agent writes a staging model. A PreToolUse hook auto-runs dbt parse. Syntax error → write rejected before it reaches disk. No human needed. No broken build.
PreToolUse: runs before a tool fires. Block bad writes at the edge.
PostToolUse: runs after. Auto-format, auto-test, auto-lint.
Policy as code. Not as conversation.
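
The decision logic of such a pre-write hook might look like the sketch below. Several things here are assumptions rather than confirmed details of the plugin or its host: that the hook receives a JSON payload with `tool_name` and `tool_input` (`file_path`, `content`) fields, and that exit code 2 means "block". The `toy_validator` is a stand-in for `dbt parse` and only checks that Jinja braces balance.

```python
import json
from typing import Callable, Optional, Tuple

ALLOW, BLOCK = 0, 2  # assumed exit-code convention for the host tool


def decide(payload_json: str, validate: Callable[[str], Optional[str]]) -> Tuple[int, str]:
    """Return (exit_code, message) for one pre-write hook invocation."""
    payload = json.loads(payload_json)
    tool_input = payload.get("tool_input", {})
    path = tool_input.get("file_path", "")
    if payload.get("tool_name") != "Write" or not path.endswith(".sql"):
        return ALLOW, ""  # not a model write: stay out of the way
    error = validate(tool_input.get("content", ""))
    if error:
        return BLOCK, f"{path}: {error}"
    return ALLOW, ""


def toy_validator(sql: str) -> Optional[str]:
    # Stand-in for `dbt parse`: only checks that Jinja braces are balanced.
    if sql.count("{{") != sql.count("}}"):
        return "unbalanced Jinja expression"
    return None


bad_write = json.dumps({
    "tool_name": "Write",
    "tool_input": {
        "file_path": "models/staging/stg_orders.sql",
        "content": "select * from {{ source('erp', 'orders')",
    },
})
result = decide(bad_write, toy_validator)
```

A real hook script would read the payload from standard input, print the message to standard error, and exit with the returned code.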

01 · Intent
02 · Context
03 · Instructions
04 · MCP
05 · Agents
06 · Skills
07 · Hooks
08 · Memory

3.1 · Multi-agent framework
3.2 · Foreground vs background
3.3 · Parallel orchestration
3.4 · User touch points
3.5 · Shared memory
3.6 · Tool calls (MCP)
3.7 · Atomic Bash → zero approvals
3.8 · Hooks for validation
Try the plugin today: /plugin marketplace add KavasiMihaly/AI-plugins
A hands-on day that takes everything from these 60 minutes and turns it into something you ship at work. You’ll leave with a working plugin pointed at your stack.
Five patterns moving from solo developer — to team — to platform.
🤝
Structured collaboration between specialist agents, not just delegation. Roles, hand-offs, and accountability — closer to an org chart than a function call.
Team
🌳
Isolated git worktrees per agent task. Parallel experimentation without stepping on each other. You saw the primitive in 3.3; this goes wider.
Solo → Team
🐝
Platform-level agents that coordinate across projects, not per-repo. Shared observability, shared hooks, shared memory — across the data estate.
Platform
📥
Agents bringing new sources into the platform SOP-first, not ad-hoc. The SOP becomes executable — each new dataset runs the same 14-step gate.
Platform
🕸
Flat MEMORY.md becomes a graph. Facts have relationships — this decision depended on that requirement, which came from that stakeholder. Retrieval over structure beats retrieval over blobs.
The hardest open problem
Things are moving so fast that this is probably outdated by the end of the day!
SQLBits 2026 · Building Context, Not Vibes · Part 2