Building Context, Not Vibes

Practical AI Augmented Data Engineering — Part 2

Scott Bell
Mihaly Kavasi

2026-04-25

About Scott

Grid of 30+ Azure, Databricks, and GitHub certification badges

Scott’s industry certifications

30+ Industry Certifications

  • Principal Data & AI Consultant at RapidData
  • Former Avanade Databricks SME & Altius Consultant
  • MSc Computer Science – Secure Machine Learning
  • Interests: Data Platforms, Intelligent Apps, AI Security, Architecture & Design Patterns
  • Passionate about Beer & Rugby League

My Projects

rapiddata.dev – Discover Cloud Based Services

Transforming businesses through AI and data solutions

databricks.news – Unofficially Essential

The must-read Databricks newsletter scanning hundreds of sources weekly

dailydatabricks.tips – Tips, Tricks & Hacks

Small actionable pieces of information. Document the Undocumented. D-R-Y IRL

myyearindata.com – Data Engineering & AI Insights

Practical perspectives on building intelligent data systems

About Mihaly

Mihaly Kavasi headshot

Microsoft Certified Trainer · Fast Track Recognized Solution Architect for Power BI

  • Founder of One Day BI – Microsoft analytics consultancy
  • Helps customers define optimal governance and implement the right mix of governed self-service BI
  • Advises on security, performance optimization and managing large-scale Power BI deployments
  • Nurtures the next generation of analysts with an emphasis on user needs and UX
  • Microsoft Certified Trainer since 2018
  • Fast Track Recognized Solution Architect for Power BI since 2021
  • Shares practical patterns for data transformation at selfservicebi.co.uk

Part 1 — Recap

Eight moving parts. One pipeline to build.

The Full Picture — one more time

01
Intent
Stop gaming the metric
02
Context
Assemble what it sees
03
Instructions
Teach the rules once
04
MCP
Access to the world
05
Agents
Specialist sub-workers
06
Skills
Consistent outputs
07
Hooks
Policy as enforcement
08
Memory
Experience carries over

Now watch what happens when we use all eight — at once — on one real pipeline.

The Process the Plugin Replicates

📦 dbt-pipeline-toolkit 🚧 Work in Progress

/plugin marketplace add KavasiMihaly/AI-plugins

A · Preparation (steps 1–3)
  1. Verify environment 🔒
  2. Source access 🔒
  3. Profile sources
B · Design & Approval (steps 4–6)
  4. Requirements 👤
  5. Draft design
  6. Design approval 👤 🔒
C · Build (steps 7–12)
  7. Workspace
  8. Load data
  9. Staging 🔒
  10. Dimensions 🔒
  11. Facts 🔒
  12. Tests 🔒
D · Validation & Handoff (steps 13–14)
  13. Validate end-to-end 🔒
  14. Handoff
+ · Bonus (beyond SOP)
  15. Power BI Project
  16. Data sources wired in

👤 Product Owner touchpoint 🔒 Quality gate ✨ Plugin bonus — beyond the SOP

14 SOP steps · 4 phases · 2 human gates — plus a ready-to-open Power BI Project.

The Demo

Demo 3.1 — Multi-Agent Framework

Demo 3.1

Live: orchestrator agent kicks off the pipeline build. It never writes SQL itself — it hands every unit of work to a specialist sub-agent.

Watch for

  • The orchestrator plans, delegates, reviews — it doesn’t code.
  • Each sub-agent has its own fresh context: business-analyst, data-explorer, dbt-staging-builder, dbt-fact-builder
  • Returns come back as summaries, not transcripts — the manager stays clean.

The manager doesn’t write the code. The specialists do.

Orchestrator delegation call in code

Demo 3.2 — Foreground vs Background

Demo 3.2

Live: two agents in flight at once. One waits on a human answer; the other crunches 1.2M rows and pings us when it’s done.

Watch for

  • Foreground — blocks the turn. Use for anything that needs you: Q&A, discovery, approvals.
  • Background — fires and forgets. Notification on completion. Use for long-running reads, profiling, test runs.
  • The orchestrator spawns in parallel when tasks are independent — not sequentially.

⚠️ Background agents must run without approval prompts. If a tool needs a human click and no one is watching, the agent sits blocked forever. Pre-allow every tool a background agent uses — or keep it in the foreground.

Block only for the work you actually need to see.

Background agent launched

Background agent completion notification

Demo 3.3 — Parallel Orchestration

Demo 3.3

Live: three builder agents run at once — dim_customer, dim_product, dim_date — each in its own git worktree. No lock contention. No stepping on each other. One merge at the end.

Watch for

  • One worktree per agent — isolated branch, isolated file tree, isolated context.
  • All three write simultaneously. Orchestrator collects, runs tests, merges.
  • Scales to whatever your machine can handle — not bounded by a single linear chat.
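The worktree-per-agent primitive above can be sketched in a few git commands. A minimal, hedged sketch; the repo path and branch naming (`build/<dim>`) are illustrative, not the plugin's actual layout:

```shell
# One isolated branch + file tree per builder agent, all from one repo.
set -e
repo=$(mktemp -d)
git -C "$repo" init -q -b main
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
  commit -q --allow-empty -m "init"

for dim in dim_customer dim_product dim_date; do
  # isolated branch, isolated directory, no lock contention
  git -C "$repo" worktree add -q -b "build/$dim" "$repo-$dim" main
done

git -C "$repo" worktree list   # three parallel workspaces, one repo
```

Each agent works in its own `$repo-<dim>` directory; the orchestrator merges the three branches back at the end.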

Parallelism scales with worktrees, not tabs.

Parallel agents launched into worktrees

Parallel agents converging on merge

Demo 3.4 — User Touch Points

Demo 3.4

Live: the plugin interrupts the human exactly twice — once to collect requirements (Step 4), once to validate the proposed model design (Step 6). Everything else runs unattended.

Watch for

  • Touch point 1 — Collect. business-analyst runs a structured AskUserQuestion covering business goals, KPIs, grain, rules. Answers land in Section 1 of the design doc.
  • Touch point 2 — Validate. Coordinator presents the semantic model first (“does this answer the right business questions?”), then the physical plan. Approve, revise, or abort.
  • Structured, not conversational. Pre-defined question sets + explicit approval events — no ad-hoc “what do you think?” chats mid-run.

Two interruptions. One to understand the human. One to get their sign-off.

Discovery Q&A with AskUserQuestion

Design approval event — semantic model summary

Demo 3.5 — Shared Agent Memory

Demo 3.5

Live: every agent reads and writes to one markdown file — pipeline-design.md. Business analyst fills Section 1. Data explorer fills Section 3. The builder reads both and never re-asks the user.

Watch for

  • One single source of truth — not 8 conversations each holding a fragment.
  • Structured sections — requirements, source profile, dimensional design, validation results.
  • Memory lives in the repo. Git-versioned. Diffable. Reviewable.
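Because the memory file is just a tracked markdown file, each agent's contribution shows up as an ordinary git diff. A hedged sketch; the section names follow the demo, while the repo and its contents are stand-ins:

```shell
# Shared agent memory as a git-versioned file: diffable, reviewable.
set -e
repo=$(mktemp -d)
git -C "$repo" init -q
printf '# Pipeline Design\n\n## 1. Requirements\n- grain: one row per order\n' \
  > "$repo/pipeline-design.md"
git -C "$repo" add pipeline-design.md
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
  commit -qm "business-analyst: fill section 1"

# data-explorer appends its section; the change is reviewable before merge
printf '\n## 3. Source Profile\n- 1.2M rows in raw_orders\n' \
  >> "$repo/pipeline-design.md"
git -C "$repo" diff --stat
```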

Memory lives in the repo, not the chat.

pipeline-design.md shared by agents

Demo 3.6 — Tool Calls (Live MCP)

Demo 3.6

Live: the agent reaches out through an MCP server to SQL Server — reading schemas, running profiles, executing dbt commands. Every call is explicit, inspectable, and replayable.

Watch for

  • Curated surface: sql-server-reader, sql-executor, dbt-runner. Not raw ODBC; not bash pipes.
  • Each call shows tool name + arguments — audit-ready by default.
  • Tool descriptions tell the model when to use each one. No “which call?” confusion.

Structured tool calls — not bash-and-pray.

Live MCP tool call with arguments

Demo 3.7 — Atomic Bash → Zero Approvals

Demo 3.7

Goal: zero approval prompts. How: every Bash call is atomic → matched by the PreToolUse allowlist → runs silently. Compounds fall through → prompt → background agents hang.

Forbidden in one Bash call

  • && || ; chains
  • | and |& pipes
  • & backgrounding
  • (…) subshells
  • `…` and $(…) command substitution
  • <<EOF heredocs
  • cd <path> && … (use git -C <path> … instead)

If you need multi-step

  • N separate Bash calls — read each result, decide next.
  • A script in skills/<name>/scripts/ — logic in code, invoked atomically.
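Here is the same two-step task written the compound way and the atomic way. A hedged sketch; the project path is illustrative:

```shell
# Compound form: falls through the allowlist, triggers an approval prompt.
#   cd /work/pipeline && git status --short | head -5
#
# Atomic form: one plain command per Bash call, each matchable by a
# simple allowlist pattern. `git -C` replaces the `cd … &&` chain.
mkdir -p /tmp/atomic-demo                 # call 1
git -C /tmp/atomic-demo init -q           # call 2
git -C /tmp/atomic-demo status --short    # call 3: read output, decide next
```

Three silent, auditable calls instead of one compound that blocks waiting for a human.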

Atomic → pre-allowed → no prompts → agents run unattended.

Atomic Bash call auto-approved by PreToolUse allowlist

Demo 3.8 — Hooks for Validation

Demo 3.8

Live: the agent writes a staging model. A PreToolUse hook auto-runs dbt parse. Syntax error → write rejected before it reaches disk. No human needed. No broken build.

Watch for

  • PreToolUse — runs before a tool fires. Block bad writes at the edge.
  • PostToolUse — runs after. Auto-format, auto-test, auto-lint.
  • Hooks are shell commands triggered by the harness — not instructions the model can forget.
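A PreToolUse hook is just a shell command the harness runs before the tool fires. A hedged sketch of the validation hook: the script name is hypothetical, and the `dbt parse` step is stubbed with a trivial check so the sketch runs without a dbt project; the real hook would run `dbt parse` and exit non-zero on failure (exit code 2 is Claude Code's convention for blocking the call):

```shell
# validate-model-write.sh: sketch of a PreToolUse validation hook.
# A non-zero exit rejects the pending write before it reaches disk.
model_sql="select customer_id, order_total from {{ ref('stg_orders') }}"

parse_ok() {
  # real hook body would be: dbt parse || return 1
  printf '%s' "$1" | grep -qi '^select'
}

if parse_ok "$model_sql"; then
  echo "parse ok: write allowed"
else
  echo "parse failed: write rejected" >&2
  exit 2
fi
```

The model never sees a rule it can forget; the harness enforces it on every write.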

Policy as code. Not as conversation.

PreToolUse hook rejecting a bad write

What you’ve just seen

Part 1 · Eight concepts

01 · Intent 02 · Context 03 · Instructions 04 · MCP 05 · Agents 06 · Skills 07 · Hooks 08 · Memory

Part 2 · Eight demo beats

3.1 · Multi-agent framework 3.2 · Foreground vs background 3.3 · Parallel orchestration 3.4 · User touch points 3.5 · Shared memory 3.6 · Tool calls (MCP) 3.7 · Atomic Bash → zero approvals 3.8 · Hooks for validation

Takeaway 1
AI amplifies good engineering.
Work with it the right way and a strong engineer ships faster, safer, and further than before.
Takeaway 2
The harness is the system.
Build around the model: context · tools · agents · hooks · memory. That’s what makes it reliable.
Takeaway 3
Own your agents.
Understand how they work and how they fail — deep understanding compounds with every build.

Try the plugin today /plugin marketplace add KavasiMihaly/AI-plugins

Want to build your own?

Full-day workshop
Deep-dive training
📅 Mon · 1 June 2026

Build your own agentic data pipeline

A hands-on day that takes everything from this 60-minute session and turns it into something you ship at work. You’ll leave with a working plugin pointed at your stack.

  • Design your own orchestrator + specialist sub-agents
  • Wire up MCP servers for your warehouse & BI tools
  • Author hooks, skills, and atomic-command policies
  • Ship a validated, tested pipeline by end of day

dataplatformnextstep.com/training-sessions

Where this is going

Five patterns moving from solo developer — to team — to platform.

Emerging Patterns

🤝

Agent Teams

Structured collaboration between specialist agents, not just delegation. Roles, hand-offs, and accountability — closer to an org chart than a function call.

Team

🌳

Work Trees

Isolated git worktrees per agent task. Parallel experimentation without stepping on each other. You saw the primitive in 3.3 — this goes wider.

Solo → Team

🐝

Agentic Platform Swarms

Platform-level agents that coordinate across projects, not per-repo. Shared observability, shared hooks, shared memory — across the data estate.

Platform

📥

Agentic Dataset Onboarding

Agents bringing new sources into the platform SOP-first, not ad-hoc. The SOP becomes executable — each new dataset runs the same 14-step gate.

Platform

🕸

Better Agentic Memory / Context Graphs

Flat MEMORY.md becomes a graph. Facts have relationships — this decision depended on that requirement, which came from that stakeholder. Retrieval over structure beats retrieval over blobs.

The hardest open problem

Things are moving so fast, this is probably outdated by the end of the day!