<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Data, Engineering, and Beyond]]></title><description><![CDATA[Exploring the intersection of engineering, data, and people — building systems and teams that scale.]]></description><link>https://blog.dativo.io</link><image><url>https://substackcdn.com/image/fetch/$s_!SVdI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bfbcde6-87b3-4b8c-9b38-3d1b82408e62_800x800.png</url><title>Data, Engineering, and Beyond</title><link>https://blog.dativo.io</link></image><generator>Substack</generator><lastBuildDate>Sat, 20 Jun 2026 18:34:34 GMT</lastBuildDate><atom:link href="https://blog.dativo.io/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Sergey]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[sergeyenin@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[sergeyenin@substack.com]]></itunes:email><itunes:name><![CDATA[Sergey]]></itunes:name></itunes:owner><itunes:author><![CDATA[Sergey]]></itunes:author><googleplay:owner><![CDATA[sergeyenin@substack.com]]></googleplay:owner><googleplay:email><![CDATA[sergeyenin@substack.com]]></googleplay:email><googleplay:author><![CDATA[Sergey]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Your compliance records are missing your AI traffic]]></title><description><![CDATA[You have register entries for HR and CRM, but none for the prompts going to OpenAI, Anthropic, or Mistral. Yet your team may send EU customer data to OpenAI every day.
Where's the record?]]></description><link>https://blog.dativo.io/p/your-compliance-records-are-missing</link><guid isPermaLink="false">https://blog.dativo.io/p/your-compliance-records-are-missing</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Thu, 11 Jun 2026 21:18:26 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>TL;DR</strong>: Every company based or operating in the EU must maintain a Record of Processing Activities. Almost none of them have an entry for the prompts their teams send to OpenAI, Mistral, or DeepSeek every day, the fastest-growing processing activity in the building. You declare ten lines of YAML once in <a href="https://github.com/dativo-io/talon">Dativo Talon</a>, an open-source AI governance gateway; everything else is derived from HMAC-signed runtime records a consultant cannot fabricate.
There&#8217;s a <a href="https://github.com/dativo-io/talon/tree/main/examples/auditor-pack">downloadable sample</a> pack you can hand to a reviewer today.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="6000" height="4000" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4000,&quot;width&quot;:6000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Smartphone screen displays ai assistant options.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Smartphone screen displays ai assistant options." title="Smartphone screen displays ai assistant options." srcset="https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1762330467572-5199bc772a20?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MHx8b3BlbmFpfGVufDB8fHx8MTc4MTA1NTA3OXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@zulfugarkarimov">Zulfugar Karimov</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>A friend of mine runs platform engineering at a ~200-person B2B company in Germany. Earlier this year they were closing their biggest deal to date: enterprise customer, everything agreed except the security review.</p><p>Question 47 of the customer&#8217;s questionnaire, shared as a long Excel spreadsheet: <em>&#8220;Describe how personal data in AI/LLM workflows is governed, including records of processing, sub-processors, and third-country transfers.&#8221;</em></p><p>He took it to the CTO.
The CTO opened their information systems register, the one every European company is required to keep under GDPR Article 30 (where it is called the Record of Processing Activities, or <a href="https://gdpr-info.eu/art-30-gdpr/">RoPA</a>; it can be as simple as a Confluence page), and found entries for the HR system, the CRM, the email marketing tool. Nothing about the support bot that had been summarizing customer tickets through AI for eight months. Oops.</p><p>&#8220;Where do our prompts go?&#8221; the CTO asked. In theory, they knew that the support bot they had purchased used OpenAI&#8217;s ChatGPT, but nobody in the room could show with evidence where the prompts went or what was inside them. There were logs, somewhere, spread across three SaaS dashboards. There was no record.</p><p>The deal closed five weeks late: five weeks of a platform engineer and a CTO reverse-engineering their own AI usage so they could write it down and have legal sign it. That&#8217;s the moment AI governance stops being a legal abstraction and becomes a sales blocker.</p><p></p><h2>Why AI traffic belongs in your Record of Processing Activities (EU-focused)</h2><p>If you&#8217;re the engineer at an EU-facing company who has to answer question 47, here&#8217;s the 90-second version of what&#8217;s being asked.</p><p><strong>The Record of Processing Activities, or RoPA (GDPR Art. 30),</strong> is a register that answers, per processing activity: what personal data do we process, why, about whom, who receives it, does it leave the EU, how long do we keep it, and how is it protected? It&#8217;s <em>mandatory</em> for nearly every European company. There&#8217;s a nominal under-250-employee exemption, but it doesn&#8217;t apply when processing is &#8220;not occasional&#8221; &#8212; and a support bot running every day is by definition not occasional. The RoPA is the first document requested in a regulator inquiry, an ISO 27001 surveillance audit, and most enterprise security reviews. Art.
30 violations sit in the fine tier of up to EUR 10M or 2% of global turnover.</p><p><strong>Annex IV (EU AI Act)</strong> is the technical documentation required for high-risk AI systems: what the system is, how it&#8217;s monitored and controlled, how risks are managed, how humans oversee it. High-risk obligations apply from <strong><mark data-color="#d9ead3" style="background-color: rgb(217, 234, 211); color: rgb(0, 0, 0);">August 2, 2026</mark></strong>. Most mid-size companies are <em>deployers</em> rather than providers, which means lighter obligations &#8212; but the documentation demand flows down through contracts anyway. Your enterprise customers will ask you to evidence your usage controls, oversight, and logging regardless of your formal role. Documentation-tier violations run up to <em>EUR 15M or 3% of global turnover</em>.</p><p>AI traffic is the gap in both documents. Companies have RoPA entries for systems built ten years ago, while prompts flowing to a US model provider &#8212; a new processing activity, with a new recipient, and often a third-country transfer &#8212; go unrecorded. It&#8217;s exactly what CTOs, legal, and compliance teams are now being asked about, and exactly what they can&#8217;t answer from scattered logs.</p><p>Here&#8217;s what I realized while building <a href="https://github.com/dativo-io/talon">Talon&#8217;s</a> gateway: <strong>the network layer already knows the answers.</strong> Which PII categories were observed in prompts. Which provider received them, in which region. What was redacted, what was blocked, what it cost. Talon records all of that per request, HMAC-signed at write time.</p><p>A consultant writing your RoPA <em>guesses</em> at these facts. The gateway <em>proves</em> them.</p><p></p><h2>Declared + derived: ten lines of YAML, then evidence does the rest</h2><p>Every auditor document splits into two kinds of facts, and Talon&#8217;s design keeps them strictly separate.</p><p><strong>Declared facts</strong> are business statements no log can know: who the controller is, why you process data, how long you keep it.
Your DPO writes them once. Org-level identity goes in <code>talon.config.yaml</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;55330662-24e6-4de2-a00d-2bba1c1bf6fa&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">compliance:
  controller:
    name: "Example GmbH"
    contact: "privacy@example.eu"
    dpo_contact: "dpo@example.eu"
    address: "Examplestr. 1, 10115 Berlin, Germany"</code></pre></div><p>Per-agent declarations live in <code>agent.talon.yaml</code>, next to the policy that governs the agent:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;57ec4680-5af3-4655-996e-50960c82174f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">compliance:
  frameworks: [gdpr, eu-ai-act]
  data_residency: eu
  declarations:
    processing:                      # GDPR Art. 30(1) facts
      purposes:
        - "customer support ticket triage"
      data_subject_categories:
        - "customers"
      personal_data_categories:
        - "contact details"
        - "payment identifiers"
        - "support ticket content"
      retention_period: "90 days after ticket closure"
      legal_basis: "contract (Art. 6(1)(b))"
      safeguards: "Role-based access; vendor DPAs on file; signed evidence retained for audit review"
    system:                          # EU AI Act Annex IV facts
      system_description: "Gateway-governed LLM assistant for support ticket triage"
      intended_purpose: "Summarize and route inbound support tickets"
      oversight_description: "Support lead reviews flagged tickets daily"</code></pre></div><p><strong>Derived facts</strong> come from the signed evidence store, and you never write them by hand: which processing activities actually ran (per tenant, per agent, first seen, last seen), which personal-data identifiers were actually observed, which recipients received data and in which region, which requests were third-country transfers, which policy denials fired, and which requests went through a human plan-review gate.</p><p>One design decision matters more than it looks: <strong>every governed request records where it went &#8212; not just the ones where PII was detected.</strong> A recipient list that depends on a classifier&#8217;s hit rate is a recipient list with silent holes; a missed identifier should never make a US provider disappear from your transfer table. Talon records the prompt &#8594; destination flow (provider, model, region) for all traffic &#8212; gateway requests, CLI and scheduled agent runs, MCP tool calls, and externally orchestrated graph runs &#8212; and layers the sensitivity classification on top. The rule <em>&#8220;a record claiming a model call must say where the data went&#8221;</em> is enforced as a runtime invariant on every evidence write, in CI by a parity contract test, and in the release smoke suite. The full controls-per-path breakdown is in the <a href="https://dativo.io/talon/docs/governance-control-matrix/">governance control matrix</a>.</p><p>Then one command merges declared and derived facts into each document:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;dc51c99c-bc14-41e0-a00d-bbfa5cf88c34&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">talon compliance ropa --format html --output ropa.html
talon compliance annex-iv --format html --output annex-iv.html</code></pre></div><p>The HTML is print-to-PDF-ready. The JSON variant is machine-checkable, for the security reviewers who want to diff it between quarters.</p><h2>Missing declarations are flagged, not hidden</h2><p>This is the part I care most about as an engineer, and the part I would never bury in a demo screenshot.</p><p>If a declaration is missing, the command doesn&#8217;t fail &#8212; it renders a flagged <code>DECLARATION MISSING</code> section and prints exactly which YAML field to set. The document itself becomes the to-do list for your compliance officer. Talon fills what can be proven from signed evidence and clearly flags what must be declared by your organisation. That&#8217;s a far more trustworthy compliance story than &#8220;one-click compliance,&#8221; and it&#8217;s the kind of behaviour a compliance officer will actually trust after the first review.</p><p>The same discipline runs through the evidence-derived sections, in both directions:</p><ul><li><p><strong>No understatement.</strong> If no data-flow evidence exists yet, the transfers section says transfers <em>&#8220;cannot be assessed yet&#8221;</em> &#8212; it never converts absence of evidence into a comforting &#8220;no transfers&#8221; finding.</p></li><li><p><strong>No overstatement.</strong> When a policy blocks a request, the blocked attempt stays in the signed evidence, but the destination is <em>not</em> listed as a recipient &#8212; blocked data never reached anyone. A recipient table that counted blocked traffic would overstate your processing, and a compliance officer would catch it in the first review.</p></li><li><p><strong>Redaction is part of the record.</strong> An identifier type that was redacted in <em>every</em> flow to a destination is annotated <em>&#8220;redacted before egress&#8221;</em>; if it ever went through raw, even once, the annotation is withheld.
&#8220;OpenAI received email addresses&#8221; and &#8220;OpenAI received placeholders where email addresses used to be&#8221; are very different statements to a compliance officer, and the document refuses to blur them.</p></li></ul><p>The document also checks your declarations against reality. If your agent declares <code>data_residency: eu</code> but routing still allows US providers and the evidence shows data actually flowing there, the RoPA prints a <strong>consistency warning</strong> with the two honest ways out: enforce <code>eu_strict</code> routing, or document the transfer mechanism with your compliance officer. Your own compliance export catches your config drift before an auditor does.</p><p>I watched all of this fire on a fresh install, unstaged. The first request I sent was denied at the routing stage &#8212; the agent&#8217;s policy pointed at a provider that wasn&#8217;t configured, so nothing ever left the machine. The denial landed in the signed evidence (<code>POLICY_DENIED_ROUTING</code>, with the fix spelled out in the record), but the RoPA generated from it listed <strong>no recipients</strong> and said transfers <em>&#8220;cannot be assessed yet&#8221;</em> &#8212; blocked data never reached anyone, so the document refused to invent a recipient. 
The second request succeeded against OpenAI, and the regenerated document did three things at once: put <code>openai / US</code> in the recipient table, flagged the third-country transfer with the SCC note, and opened with the consistency warning &#8212; because the config declared <code>data_residency: eu</code> while the evidence showed traffic reaching a US region:</p><blockquote><p><em>consistency: compliance.data_residency is declared &#8220;eu&#8221; but 1 destination(s) outside EU/LOCAL appear in data-flow evidence (Section 6) &#8212; set llm.routing.data_sovereignty_mode: eu_strict to enforce EU routing, or document the transfer mechanism (SCCs, adequacy decision) with your DPO</em></p></blockquote><p>That&#8217;s the document doing its job on run two of a brand-new install: catching a real residency gap I hadn&#8217;t noticed, and telling me exactly how to close it.</p><h2>The same evidence answers your security review, not just your compliance officer</h2><p>Question 47 rarely arrives alone. The same questionnaire that asks about records of processing asks how you prevent data leakage to AI providers, whether your audit logs can be tampered with, and what stops an AI agent from doing something it shouldn&#8217;t. Compliance documentation and infosec controls are usually owned by different people and answered from different tools &#8212; which is exactly why the answers contradict each other in review.</p><p>The reason a gateway can generate your RoPA is the same reason it closes the security gaps: it sits on the network path and enforces policy <em>before</em> data leaves, instead of reporting on it afterwards. Each control produces the same signed evidence the compliance documents are built from:</p><ul><li><p><strong>Shadow AI and unsanctioned usage.</strong> Talon gives AI traffic a single egress point. 
Every governed request is recorded with tenant, agent, destination, and region &#8212; whether or not PII was detected &#8212; so &#8220;what AI tools is the company actually using?&#8221; is a database query, not a survey. Traffic that bypasses the gateway is out of scope by definition, which is also your network team&#8217;s argument for routing it through.</p></li><li><p><strong>Data leakage to third parties.</strong> PII is detected and redacted <em>before</em> egress, and the evidence records both facts separately &#8212; input redaction and output redaction are independent claims. API keys never sit in prompts or agent code: they live in an encrypted vault (AES-256-GCM), scoped by per-agent ACLs, and every retrieval is itself an audit record.</p></li><li><p><strong>Prompt injection via attachments.</strong> File content (PDF, DOCX, HTML) is treated as untrusted by default &#8212; wrapped in isolation delimiters, scanned for embedded instructions, and blocked or flagged per policy. Detected injection attempts generate evidence even when the request is blocked.</p></li><li><p><strong>Excessive agency.</strong> Agents declare allowed tools in <code>agent.talon.yaml</code>; the policy engine filters the tool list before the model ever sees it, and every MCP tool call passes through policy evaluation before execution. A dangerous tool the model never learned about is a class of incident that can&#8217;t happen.</p></li><li><p><strong>Runaway spend.</strong> Per-request, daily, and monthly budgets are enforced at the gateway &#8212; a request over budget is denied before any cost is incurred, and the denial is recorded.</p></li><li><p><strong>Log tampering.</strong> Evidence records are HMAC-signed at write time. A reviewer &#8212; or your own incident-response team &#8212; can verify the export offline with <code>talon audit verify</code>. 
An attacker who modifies a record breaks its signature; &#8220;the logs say X and we can prove the logs weren&#8217;t altered&#8221; is a materially stronger incident report.</p></li></ul><p>This is also where the ISO 27001 and NIS2 story gets simple. The evidence store gives you logging and monitoring records (A.8.15), the vault covers cryptographic controls for credentials (A.8.24), the denial trail supports incident management (A.5.24&#8211;A.5.26), and NIS2 Art. 21&#8217;s risk-management measures get the same answer the compliance officer got: not a policy PDF, but signed records of the controls actually firing. <code>talon compliance report</code> prints the framework-to-control mappings alongside the runtime numbers, so your security lead and your compliance officer are quoting the same document for once.</p><h2>What manual AI compliance documentation actually costs</h2><p>If you&#8217;re the internal promoter &#8212; the platform engineer or CTO who has to convince a compliance officer, a board, or a customer&#8217;s security reviewer &#8212; here are the reference ranges (EU Commission impact assessment via CEPS, industry cost surveys, 2024&#8211;2026; all estimates):</p><ul><li><p><strong>Annex IV documentation done manually:</strong> EUR 15,000&#8211;60,000 per high-risk system; consultant-led packages run EUR 50,000&#8211;150,000+ over 3&#8211;6 months.</p></li><li><p><strong>DIY without tooling:</strong> EUR 30,000&#8211;80,000 in internal staff time &#8212; senior engineers writing documentation instead of product.</p></li><li><p><strong>RoPA maintenance:</strong> compliance-officer and privacy-consultant rates around EUR 800&#8211;1,500/day, days per review cycle, and the document goes stale the moment it&#8217;s written. 
Talon regenerates it from live evidence in one command.</p></li><li><p><strong>Compliance automation tools:</strong> EUR 7,500&#8211;80,000/yr &#8212; and none of them sit on the network path, so they produce templates, not evidence.</p></li></ul><p>But the biggest number isn&#8217;t on that list. For a 200&#8211;1,000-employee B2B company, the costliest compliance event isn&#8217;t a fine &#8212; it&#8217;s a <strong>stalled enterprise deal</strong>. &#8220;How is your AI usage governed?&#8221; is now a standard security-review question. Handing over a generated, independently verifiable evidence pack converts a five-week back-and-forth into an email attachment. One accelerated deal dwarfs every other line in this analysis.</p><p>The honest framing on fines: nobody can promise you &#8220;no fines,&#8221; and you should walk away from any vendor who does. What I can say is that when a regulator or auditor asks, the organization that produces organized, signed records in minutes is treated very differently from the one that produces nothing in weeks.</p><h2>What this does not do</h2><p>Claims discipline matters more in this domain than in any other:</p><ul><li><p>The output is <strong>supporting records for GDPR Art. 30 and EU AI Act Annex IV review</strong> &#8212; not a completed legal filing, a certification, or &#8220;compliance in a box.&#8221; Every document says so in its footer.</p></li><li><p>Talon only sees the traffic that flows through it. AI usage that bypasses the gateway isn&#8217;t in the evidence &#8212; the RoPA covers what Talon governs.</p></li><li><p>When an external agent framework (LangGraph, LangChain) is governed via Talon&#8217;s event API, the content itself never transits Talon &#8212; so those flows are recorded as exactly what they are: <em>orchestrator-reported</em>, model named, region <code>unknown</code>. Talon never guesses a jurisdiction. 
That unresolved region shows up in your transfer table on purpose.</p></li><li><p>You still need your compliance officer or counsel to review purposes, legal bases, and transfer mechanisms. The tool eliminates the <em>assembly</em> work &#8212; the weeks of reverse-engineering what your AI usage actually is &#8212; not the legal judgment.</p></li></ul><p>If a consultant told you a tool could replace them entirely, they&#8217;d be lying. If they told you assembling AI processing records by hand is a good use of EUR 1,200/day, they&#8217;d also be lying.</p><h2>Generate your first AI RoPA in ten minutes</h2><p>The whole point of Talon is that the proof path is short:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;90d969b1-a833-4992-bd13-bad799c2b3b7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># 1. Init (wizard writes both config files, ~2 min)
talon init

# 2. Route a request through the gateway &#8212; this creates signed evidence
talon serve --gateway &amp;
curl -s http://localhost:8080/v1/proxy/openai/v1/chat/completions \
  -H "Authorization: Bearer $TALON_TENANT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"My IBAN is DE89370400440532013000"}]}'

curl -s http://localhost:8080/v1/proxy/openai/v1/chat/completions \
  -H "Authorization: Bearer $TALON_TENANT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello, how are you?"}]}'

# 3. Generate the documents your compliance officer has been asked for
talon compliance report --format html --output compliance-report.html
talon compliance ropa --format html --output ropa.html
talon compliance annex-iv --format html --output annex-iv.html

# 4. Let the reviewer verify the evidence themselves
talon audit export --format signed-json --output evidence.signed.json
talon audit verify --file evidence.signed.json</code></pre></div><p>Open <code>ropa.html</code>. Section 4 already lists the IBAN identifier the classifier caught. Section 5 names the recipient and region &#8212; and if redaction was on, marks the IBAN <em>&#8220;redacted before egress&#8221;</em>: the provider got a placeholder, not the account number. Section 6 flags the third-country transfer with a note to document your SCC or adequacy mechanism. And if your declared residency disagrees with where the evidence shows traffic going, the document opens with the consistency warning from earlier.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NnYc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NnYc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 424w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 848w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 1272w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!NnYc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png" width="1456" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:357537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/201630331?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NnYc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 424w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 848w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 1272w, https://substackcdn.com/image/fetch/$s_!NnYc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa60db9b6-71c2-47a9-aa42-5013ca7a508e_3364x1552.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iKw1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!iKw1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 424w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 848w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 1272w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iKw1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png" width="1456" height="718" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:718,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:411579,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/201630331?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iKw1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 424w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 848w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 1272w, https://substackcdn.com/image/fetch/$s_!iKw1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3612a2bc-833f-478a-affe-0d6303011930_3394x1674.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p></p><p>If anyone asks about a single request, the answer is one record away &#8212; <code>talon audit show &lt;id&gt;</code> prints the per-request data flow in plain text:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;a1710738-e76f-428e-b26c-85cc79dc3eef&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">PII Detected:  iban
PII Redacted:  input=true output=false
...
Data Flow
  Detector:    talon-regex
  prompt -&gt; llm_provider:openai model=gpt-4o region=US | redacted | tier 1 | iban</code></pre></div><p>Source, destination, region, what was detected, and whether it was redacted before it left &#8212; the RoPA&#8217;s recipient table, at the granularity of one request. Your compliance officer&#8217;s job goes from &#8220;reconstruct eight months of AI usage&#8221; to &#8220;review a document and fill in the fields the export flags.&#8221;</p><p>Don&#8217;t have the gateway running? Even the simplest path works: <code>talon run "hello"</code> against OpenAI is enough to put the provider in Section 5 and the US transfer in Section 6 &#8212; no PII required, because data movement is evidence regardless of what the data contained.</p><h2>FAQ</h2><p><strong>Does GDPR require a RoPA entry for AI and LLM usage?</strong><br>If your AI workflow processes personal data &#8212; support tickets, CRM notes, HR documents &#8212; it&#8217;s a processing activity under GDPR Art. 30 and belongs in your RoPA like any other system. The under-250-employee exemption doesn&#8217;t apply to processing that is &#8220;not occasional,&#8221; which rules out any AI feature running daily.</p><p><strong>Are prompts sent to OpenAI a third-country transfer?</strong><br>If the prompt contains personal data and the provider processes it outside the EU, yes &#8212; and your RoPA needs to record the recipient, region, and transfer mechanism (SCCs or an adequacy decision). This is exactly the section most companies cannot fill from scattered logs, and the one Talon derives from per-request data-flow evidence.</p><p><strong>Do SMBs need EU AI Act Annex IV documentation?</strong><br>Formally, Annex IV applies to providers of high-risk AI systems, and most SMBs are deployers with lighter obligations. 
In practice, enterprise customers push documentation demands down through contracts &#8212; you&#8217;ll be asked to evidence your usage controls, oversight, and logging in security reviews well before any regulator asks.</p><p><strong>Is a generated RoPA legally sufficient?</strong><br>It&#8217;s a supporting record, not a legal filing. Talon assembles the runtime facts (recipients, regions, identifiers observed, redaction status, denials) and flags the declarations only your organisation can make (legal basis, retention, purposes). Your compliance officer reviews and signs off &#8212; but starts from evidence instead of archaeology.</p><p><strong>Does this help with ISO 27001 and NIS2, or only GDPR?</strong><br>The same evidence store backs both. Signed per-request records support ISO 27001 logging and monitoring controls (A.8.15), the encrypted secrets vault maps to cryptographic controls (A.8.24), and the policy-denial trail supports incident management &#8212; which is also what NIS2 Art. 21 risk-management measures ask for. <code>talon compliance report</code> prints the framework-to-control mappings next to the runtime numbers.</p><p><strong>Where does the evidence live?</strong><br>In your infrastructure. Talon is a single Go binary, self-hosted and open source. 
Prompts, evidence records, and generated documents never leave your environment.</p><div><hr></div><ul><li><p>Repo: <a href="https://github.com/dativo-io/talon">github.com/dativo-io/talon</a> &#8212; single Go binary, self-hosted, open source</p></li><li><p>Runbook: <a href="https://dativo.io/talon/docs/compliance-export-runbook/">How to export evidence for auditors</a></p></li><li><p>Declarations guide: <a href="https://dativo.io/talon/docs/ropa-declarations/">How to clear DECLARATION MISSING blocks in RoPA exports</a></p></li><li><p><a href="https://gdpr-info.eu/art-30-gdpr/">GDPR Article 30 text</a> &#183; <a href="https://ai-act-service-desk.ec.europa.eu/en/ai-act/timeline/timeline-implementation-eu-ai-act">EU AI Act implementation timeline</a></p></li></ul><p>August 2, 2026 is on the calendar whether your RoPA is ready or not. The version of you that gets asked question 47 next quarter will be glad the answer is one command.</p>]]></content:encoded></item><item><title><![CDATA[Controlling LangGraph Tool Calls]]></title><description><![CDATA[Prompting Is Not Governance:]]></description><link>https://blog.dativo.io/p/controlling-langgraph-tool-calls</link><guid isPermaLink="false">https://blog.dativo.io/p/controlling-langgraph-tool-calls</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Tue, 02 Jun 2026 14:03:16 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most AI agent demos stop precisely at the moment the agent &#8220;works.&#8221;</p><p>It can reason. It can choose tools. It can complete a linear workflow. For a proof of concept, this is a milestone. 
For enterprise production, it is barely the starting line.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="1080" height="607" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;black traffic light with red light&quot;,&quot;title&quot;:&quot;black traffic light with red light&quot;,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="black traffic light with red light" title="black traffic light with red light" srcset="https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1630783204535-cb30ffb3c0a1?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2MXx8dHJhZmZpYyUyMGxpZ2h0JTIwcG9saWNlbWVufGVufDB8fHx8MTc4MDQwNzIyMHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Moving from prototype to production forces engineering teams to confront a fundamentally different architecture and risk question: <strong>What is this autonomous agent actually allowed to do?</strong></p><p>When an application switches from deterministic code to dynamic LLM runtime loops, traditional security models 
break down. A hallucinating chatbot can give a bad answer; an un-governed agent with access to tool arrays can inadvertently alter database states, leak source data, or trigger destructive cascade workflows.</p><p>To bridge this gap, engineers need to step away from fragile system prompts and build hard governance boundaries. In this walkthrough, we deploy a minimal <strong>LangGraph</strong> agent and introduce <strong><a href="https://github.com/dativo-io/talon">Talon</a></strong>&#8212;an OpenAI-compatible governance gateway&#8212;as a strict proxy between LangChain and the underlying LLM provider.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oK6U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oK6U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 424w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 848w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 1272w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!oK6U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png" width="830" height="780" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:830,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:105347,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/200296967?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oK6U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 424w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 848w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 1272w, https://substackcdn.com/image/fetch/$s_!oK6U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa341bae2-c136-4c5a-8344-d0d966a4f64e_830x780.png 1456w" 
sizes="100vw"></picture></div></a></figure></div><p></p><p>Our objective is to stress-test an essential architectural pattern: can an infrastructure gateway inspect, filter, or block tool definitions in transit, before the model ever encounters them?</p><p>TL;DR: yes. Understanding why this is necessary requires exposing the core illusion of prompt-based guardrails.</p><h2>The Illusion of System Prompt &#8220;Enforcement&#8221;</h2><p>A pervasive anti-pattern in agent design is treating instructions as firewall rules. 
Developers routinely pass heavy system prompts down to the execution graph expecting deterministic obedience:</p><blockquote><p>You are a highly restricted enterprise assistant.</p><ul><li><p>Under no circumstances should you delete account records.</p></li><li><p>Do not expose or export sensitive PII or raw tables.</p></li><li><p>Always prompt the user for manual approval before mutating data.</p></li><li><p>Make no mistakes ;) LOL</p></li></ul></blockquote><p>While valuable for cognitive alignment, <strong>this is direction, not enforcement.</strong> If your backend payload still registers the schema definitions for <code>delete_record</code>, <code>export_data</code>, or <code>admin_override</code>, those tools are fully visible to the model context window. At that exact moment, your system security relies entirely on the probability that a non-deterministic token predictor will choose to follow instructions under every edge case, prompt injection vulnerability, or state variance.</p><div class="callout-block" data-callout="true"><p><strong>The Architectural Rule:</strong> If a tool schema is exposed to the model, the model can execute it. True governance dictates that restricted tools are stripped out at the gateway layer based on caller identity, making it physically impossible for the model to invoke what it cannot see.</p></div><p></p><h2>System Architecture &amp; Integration Setup</h2><p>Injecting a governance layer shouldn&#8217;t mean re-architecting your entire LangGraph state machine. 
With an OpenAI-compatible gateway like Talon, the code changes are limited to the initialization parameters of the LangChain client wrapper.</p><p>Instead of hitting the provider&#8217;s endpoint directly, we re-route traffic through our local or distributed proxy gateway:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;6ef46583-4f66-4fdc-a122-e98646b940e3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import os

from langchain_openai import ChatOpenAI

# Talon caller key, read from the environment (the variable name is an assumption)
TALON_CALLER_KEY = os.environ["TALON_CALLER_KEY"]

# Gateway-Routed LLM Client Configuration
llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://localhost:18080/v1/proxy/openai/v1",  # Points to Talon Gateway
    api_key=TALON_CALLER_KEY,                             # Talon caller key, not the real OpenAI key
    temperature=0,
    max_tokens=120,
)</code></pre></div><h3>Key Security Mechanics:</h3><ol><li><p><strong>Abstraction of Secrets:</strong> The application container never handles the real <code>OPENAI_API_KEY</code>. It holds only a Talon-issued <code>TALON_CALLER_KEY</code>. The real downstream provider keys stay in Talon&#8217;s secure vault.</p></li><li><p><strong>Identity-Aware Routing:</strong> The gateway maps the inbound caller key to a specific tenant profile, resolves the associated governance policy, sanitizes the payload, and signs the outbound request to the provider.</p></li></ol><p></p><h2>Tool Inventories and the Gateway Policy</h2><p>To see how the gateway behaves with realistic operations, our demonstration registers a mix of safe operational tools and highly sensitive data-access primitives:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qd7t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qd7t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 424w, https://substackcdn.com/image/fetch/$s_!Qd7t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 848w, https://substackcdn.com/image/fetch/$s_!Qd7t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Qd7t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qd7t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png" width="628" height="166" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:166,&quot;width&quot;:628,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40548,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/200296967?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qd7t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 424w, https://substackcdn.com/image/fetch/$s_!Qd7t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 848w, https://substackcdn.com/image/fetch/$s_!Qd7t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Qd7t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3eab335-ad4b-42fb-9588-d666de3bf8aa_628x166.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p>When using standard LangGraph nodes, tools are bound directly via <code>.bind_tools(tools)</code>. LangChain automatically serializes these tool definitions into OpenAI-compliant JSON schemas.</p><p>Instead of relying on hardcoded static lists inside the code repository, we declare an external infrastructure policy in Talon (<code>policy.yaml</code>):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;fb422e77-ac77-4c2f-88d3-e862c9121ac5&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">callers:
  - name: "langgraph-tool-agent"
    tenant_key: "talon-gw-langgraph-tools-demo"
    tenant_id: "production-eu-west"
    allowed_providers:
      - "openai"
    policy_overrides:
      allowed_tools:
        - "search_records"
        - "update_record"
        - "send_notification"
      forbidden_tools:
        - "export_*"
        - "delete_*"
        - "admin_*"
        - "drop_*"
        - "truncate_*"
      allowed_models:
        - "gpt-4o-mini"</code></pre></div><h2>Why Pattern Matching Matters</h2><p>Relying on exact string matches for tools creates a brittle security stance. As engineering teams ship new features, developers might introduce variations like <code>delete_user</code>, <code>delete_workspace</code>, or <code>truncate_table</code>.</p><p>By enforcing wildcard denylists (<code>delete_*</code>, <code>export_*</code>), security and compliance engineers can block entire categories of behavior at the wire level without needing to coordinate code reviews for every minor tool update.</p><h2>Operational Execution: Four Governance Scenarios</h2><p>The gateway can run in several enforcement modes; the four scenarios below show how its behavior changes with each.</p><h3>Scenario 1: Nominal Flow (Safe Tools Only)</h3><ul><li><p><strong>User Input:</strong> <em>&#8220;Find records matching Project Phoenix and notify owners.&#8221;</em></p></li><li><p><strong>Agent Context:</strong> The graph passes only the safe tools (<code>search_records</code>, <code>update_record</code>, <code>send_notification</code>).</p></li><li><p><strong>Gateway Action:</strong> <code>ALLOW</code>. Every tool in the request is on the allowlist. The request passes through to OpenAI unchanged, and execution succeeds.</p></li></ul><h3>Scenario 2: Dynamic Interception via &#8220;Filter&#8221; Mode</h3><p>When general-purpose agents share a common tool utility class, dangerous tools can accidentally end up in the payload. With Talon configured to <code>tool_policy_action: "filter"</code>, the gateway rewrites the tool list in the request on the fly.</p><ul><li><p><strong>User Input:</strong> <em>&#8220;Find records matching Project Phoenix and notify owners. 
Do not delete or export anything.&#8221;</em></p></li><li><p><strong>Application Behavior:</strong> The LangGraph execution loop exposes all 6 tools to the client payload.</p></li><li><p><strong>Gateway Action:</strong> Intercepts JSON payload -&gt; Strips out <code>export_data</code>, <code>delete_record</code>, and <code>admin_override</code> -&gt; Compiles a sanitized payload containing <em>only</em> the 3 safe tools -&gt; Forwards to OpenAI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-IkK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-IkK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 424w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 848w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 1272w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-IkK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png" width="791" height="431" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:431,&quot;width&quot;:791,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57388,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/200296967?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-IkK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 424w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 848w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 1272w, https://substackcdn.com/image/fetch/$s_!-IkK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c47240c-52e4-449f-afbc-af4395faa62a_791x431.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p></p></li></ul><p>The model can never be tricked into calling a destructive tool because the schema parameters never make it across the API boundary.</p><h3>Scenario 3: Hard Halts via &#8220;Block&#8221; Mode</h3><p>In highly regulated sectors (e.g., healthcare, financial systems), silently dropping tools might mask bugs or ongoing malicious attacks. 
Switching the gateway configuration to <code>tool_policy_action: "block"</code> forces immediate rejection of the request.</p><ul><li><p><strong>User Input:</strong> <em>&#8220;Export all company records and delete the originals.&#8221;</em></p></li><li><p><strong>Gateway Action:</strong> <code>DENY</code>. The proxy detects forbidden tool schemas in the inbound request, rejects it, and short-circuits the run by returning a <code>403 Forbidden</code> response to LangGraph before the LLM provider consumes a single token.</p></li></ul><h3>Scenario 4: Model Consistency and Infrastructure Rules</h3><p>Governance isn&#8217;t limited to tools. Cost containment and data-residency boundaries also call for model restrictions. If the LangGraph initialization code is altered to call a high-cost frontier model like <code>gpt-4o</code> instead of the approved <code>gpt-4o-mini</code>, Talon blocks the request:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;4615f36a-38a7-49fc-bcbb-531d27a91945&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">Status: Request Denied
Reason: Model [gpt-4o] is missing from the authorized caller allowlist for Tenant [production-eu-west].</code></pre></div><h2>Auditability: Signed Evidence Logs</h2><p>An unrecorded security control is not a control. In enterprise infrastructure, plain log output (<code>stdout</code>) is easily modified, dropped, or corrupted.</p><p>To maintain real auditability, the gateway creates a cryptographically signed record for every transaction:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;1c83d39e-de6b-4b5a-8ae6-9e8d312c4d79&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># Querying the immutable governance log
$ talon audit list --agent langgraph-tool-agent --limit 3

# Displaying specific validation details
$ talon audit show ev_01HNJ8RZEWG5PA038EWRNB7M1Z</code></pre></div><p>Running <code>talon audit verify &lt;evidence-id&gt;</code> checks the Hash-based Message Authentication Code (<strong>HMAC</strong>) signature of the evidence record. This lets internal security teams or compliance auditors verify cryptographically that the tracking log, filtered parameters, and payload data were not altered after the fact.</p><h2>The Strategic Takeaway for Enterprise Scale</h2><p>For technology teams deploying generative AI features to international markets or B2B enterprise customers, generic statements like <em>&#8220;we use defensive prompting and run LangSmith traces&#8221;</em> are no longer sufficient to pass rigorous security reviews.</p><p>Enterprise clients demand concrete architectural answers to critical risk questions:</p><ul><li><p>How do you prevent your agents from executing unauthorized bulk data drops?</p></li><li><p>Where is the physical isolation layer separating prompt logic from system execution boundaries?</p></li><li><p>Where is the tamper-evident ledger tracking what your models attempted to execute?</p></li></ul><p>By decoupling <strong>orchestration</strong> from <strong>governance</strong>, you establish a resilient defense-in-depth security model:</p><ul><li><p><strong>LangGraph</strong> manages the state machine, execution graphs, memory persistence, and dynamic node routing.</p></li><li><p><strong><a href="https://dativo.io/">Talon</a> / API Gateways</strong> manage the network perimeter, secret isolation, tool schema sanitization, and cryptographic audit logging.</p></li></ul><p>This decoupling gives engineering teams the freedom to iterate rapidly on complex agent loops while giving security teams complete, granular control over the data boundaries. 
Prompting guides your agent&#8217;s behavior; policy enforces your system&#8217;s integrity.</p>]]></content:encoded></item><item><title><![CDATA[How to Make a LangGraph Agent GDPR-Safe]]></title><description><![CDATA[Use customer data in AI agents without losing control]]></description><link>https://blog.dativo.io/p/how-to-make-a-langgraph-agent-gdpr</link><guid isPermaLink="false">https://blog.dativo.io/p/how-to-make-a-langgraph-agent-gdpr</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Mon, 01 Jun 2026 21:02:43 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, 
https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="2268" height="2442" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2442,&quot;width&quot;:2268,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;man in blue crew neck shirt under blue sky during daytime&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="man in blue crew neck shirt under blue sky during daytime" title="man in blue crew neck shirt under blue sky during daytime" 
srcset="https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1606775524496-8ffd63ad2a98?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyNnx8ZXVyb3BlJTIwY29udHJvbHxlbnwwfHx8fDE3ODAzMzIzNTJ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" 
width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@henri0019">Henri Lajarrige Lombard</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>This is a practical walkthrough for putting a minimal LangGraph agent behind the <a href="https://dativo.io/">Talon</a> LLM gateway.</p><p>The target use case is simple: a customer-support agent receives a billing question that contains personal data. We want the agent to answer, but we do not want the LangGraph app to call OpenAI directly with raw customer data and no audit trail.</p><p>The final architecture:</p><pre><code>LangGraph &#8594; Talon Gateway &#8594; OpenAI</code></pre><p>The main application change is this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;f9218104-0889-4d8a-81a7-8a2f17771061&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://localhost:18080/v1/proxy/openai/v1",
    api_key="talon-gw-langgraph-demo",
)</code></pre></div><p>The LangGraph app uses a Talon caller key. Talon stores the real OpenAI key, applies policy, forwards the request, scans the response, and writes signed evidence.</p><p>The companion notebook is here: <a href="https://github.com/dativo-io/talon-notebooks/blob/main/langgraph_talon_gdpr_safe_agent_colab.ipynb">Colab notebook</a></p><h2>What we are building</h2><p>We will run a minimal LangGraph workflow:</p><pre><code><code>START &#8594; support_agent &#8594; END</code></code></pre><p>No tools yet. No human approval. No memory.</p><p>That is intentional. The first thing to govern is the LLM boundary. Tool governance comes later.</p><p>The test input contains an email and an IBAN:</p><pre><code><code>My email is </code><strong>anna.kowalska@example.com</strong><code> and my IBAN is </code><strong>DE89370400440532013000</strong><code>.
I was charged twice for order </code><strong>ORD-18422</strong><code>. Can you help?</code></code></pre><p>What we want Talon to prove:</p><ul><li><p>Gateway receives the LangGraph request.</p></li><li><p>Caller is identified as <code>langgraph-support-agent</code>.</p></li><li><p>Model is restricted to <code>gpt-4o-mini</code>.</p></li><li><p>Email address is detected.</p></li><li><p>IBAN is detected.</p></li><li><p>Input is redacted before the upstream provider call.</p></li><li><p>Response is scanned.</p></li><li><p>Evidence is written.</p></li><li><p>Evidence signature verifies successfully.</p></li></ul><div><hr></div><h2>Dependencies</h2><p>Python dependencies first:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;f878cc03-ff0d-46ad-9bf6-c286719f1c65&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">python -m pip install -q --upgrade pip
python -m pip install -q langgraph langchain-openai langchain-core openai requests pyyaml</code></pre></div><p><a href="https://github.com/dativo-io/talon">Talon</a> also needs to be installed. One practical issue: do not use Ubuntu&#8217;s default <code>golang-go</code> package in Colab. It can be too old for current <a href="https://github.com/dativo-io/talon">Talon</a> builds.</p><p>The notebook tries three install paths:</p><ol><li><p>Use an existing <code>talon</code> binary if available.</p></li><li><p>Download a GitHub Linux AMD64 release binary.</p></li><li><p>Install modern Go from <code>go.dev</code> and run:</p></li></ol><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;bcf85f7f-bbaf-41cf-b8ff-11623c32f3e0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">GOBIN=/usr/local/bin \
GONOSUMDB=github.com/dativo-io/talon \
go install github.com/dativo-io/talon/cmd/talon@latest</code></pre></div><p>That avoids the common failure where Colab installs Go 1.18 and the Talon module requires a newer Go version.</p><h2>Generate runtime keys</h2><p>For this demo, I generate ephemeral Talon keys inside the notebook:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;1768f99d-be26-49d0-89a2-a3540322cd7f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import os
import secrets
from pathlib import Path

project_dir = Path("/content/talon-langgraph-demo")
project_dir.mkdir(parents=True, exist_ok=True)

os.environ.setdefault("TALON_DATA_DIR", str(project_dir / ".talon"))
os.environ.setdefault("TALON_SECRETS_KEY", secrets.token_hex(32))
os.environ.setdefault("TALON_SIGNING_KEY", secrets.token_hex(32))
os.environ.setdefault("TALON_ADMIN_KEY", secrets.token_urlsafe(32))
os.environ.setdefault("TALON_PORT", "18080")</code></pre></div><p>For production, these should come from your secret manager. For a notebook, ephemeral keys are fine.</p><p>The OpenAI key is read from Colab Secrets if available:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;5809d185-135f-439f-ad95-0c6b03f2d2d8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")</code></pre></div><p>or entered manually with <code>getpass</code>.</p><h2>Create <code>agent.talon.yaml</code></h2><p>This file describes the agent policy.</p><p>For this first example, keep it concise:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;d7e43a72-951b-4ca9-8083-135704ef16d8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">agent:
  name: langgraph-support-agent
  version: "1.0.0"
  description: GDPR-safe LangGraph support agent demo

capabilities:
  allowed_tools: []

policies:
  data_classification:
    input_scan: true
    output_scan: true
    redact_pii: true
    block_on_pii: false

  cost_limits:
    per_request: 0.10
    daily: 10.00
    monthly: 200.00

audit:
  log_level: detailed
  retention_days: 30
  log_prompts: false
  log_responses: false

compliance:
  frameworks:
    - gdpr
    - eu-ai-act
  data_residency: eu
  risk_level: low</code></pre></div><h2>Create <code>talon.config.yaml</code></h2><p>This file configures the gateway.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;cfbbcec1-2dd9-4079-9b6a-7cc0f4f39885&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">gateway:
  mode: enforce

  providers:
    openai:
      enabled: true
      base_url: "https://api.openai.com"
      secret_name: "openai-api-key"
      allowed_models:
        - "gpt-4o-mini"

  default_policy:
    default_pii_action: "redact"
    response_pii_action: "warn"
    max_daily_cost: 10.00
    max_monthly_cost: 200.00
    allowed_models:
      - "gpt-4o-mini"

  callers:
    - name: "langgraph-support-agent"
      tenant_key: "talon-gw-langgraph-demo"
      tenant_id: "demo"
      allowed_providers:
        - "openai"
      policy_overrides:
        pii_action: "redact"
        response_pii_action: "warn"
        allowed_models:
          - "gpt-4o-mini"
        max_daily_cost: 10.00
        max_monthly_cost: 200.00</code></pre></div><p>The route we use later is:</p><pre><code><code>/v1/proxy/openai/v1/chat/completions</code></code></pre><p>Talon extracts <code>openai</code> from that path, looks it up under <code>gateway.providers.openai</code>, and checks that it is enabled.</p><h2>Store the OpenAI key in Talon&#8217;s vault</h2><p>The LangGraph app should not use the real OpenAI key.</p><p>Store the upstream provider key in Talon:</p><pre><code><code>talon secrets set openai-api-key "$OPENAI_API_KEY"</code></code></pre><p>Then the application uses the Talon caller key:</p><pre><code><code>talon-gw-langgraph-demo</code></code></pre><p>That key maps to:</p><pre><code><code>tenant_id: demo
name: langgraph-support-agent</code></code></pre><p>This gives you caller-level evidence and budget attribution.</p><h2>Start Talon Gateway</h2><p>In Colab, I use port <code>18080</code> to avoid collisions with common notebook services.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;3e0ea377-8581-4825-bf24-f2ade4fd104a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">talon serve \
  --gateway \
  --gateway-config /content/talon-langgraph-demo/talon.config.yaml \
  --host 127.0.0.1 \
  --port 18080 \
  --log-level info</code></pre></div><p>The OpenAI-compatible base URL becomes:</p><pre><code><code>http://localhost:18080/v1/proxy/openai/v1</code></code></pre><p>The full chat completions route is:</p><pre><code><code>http://localhost:18080/v1/proxy/openai/v1/chat/completions</code></code></pre><h2>Smoke-test the gateway before LangGraph</h2><p>Before involving LangGraph, test the gateway directly:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;4303c02a-4c11-4643-a4c0-dcba8c275ea8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import requests

base_url = "http://localhost:18080/v1/proxy/openai/v1"
url = f"{base_url}/chat/completions"

headers = {
    "Authorization": "Bearer talon-gw-langgraph-demo",
    "Content-Type": "application/json",
}

payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            "content": (
                "My email is anna.kowalska@example.com and my IBAN is "
                "DE89370400440532013000. I was charged twice for order ORD-18422. "
                "Can you help?"
            ),
        }
    ],
    "max_tokens": 120,
}

r = requests.post(url, headers=headers, json=payload, timeout=60)
print(r.status_code)
print(r.text[:1500])</code></pre></div><p>Expected result: <code>200</code>.</p><p>Common failures:</p><p>- <code>404</code>: wrong route. Check <code>/v1/proxy/openai/v1/chat/completions</code>.</p><p>- <code>unknown or disabled provider</code>: missing <code>enabled: true</code> under <code>gateway.providers.openai</code>.</p><p>- <code>401</code> or <code>403</code>: wrong caller key, or missing admin key for admin endpoints.</p><p>- Upstream auth error: OpenAI key was not stored in Talon&#8217;s vault, or <code>secret_name</code> does not match.</p><p>- Schema validation error: invalid <code>agent.talon.yaml</code>; commonly <code>audit.log_level</code>.</p><div><hr></div><h2>Build the LangGraph agent</h2><p>The LangGraph agent is just one node:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;8ad1b94c-d097-4dec-9276-a458b138621f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from typing import Annotated, TypedDict

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages


class SupportState(TypedDict):
    messages: Annotated[list, add_messages]


llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://localhost:18080/v1/proxy/openai/v1",
    api_key="talon-gw-langgraph-demo",
    temperature=0,
)


def support_agent(state: SupportState):
    system = SystemMessage(
        content=(
            "You are a customer support assistant for a SaaS company. "
            "Help with billing questions. "
            "Do not repeat raw personal data such as email addresses or IBANs. "
            "Do not claim you performed a refund. "
            "Say that a support teammate can verify the order and duplicate charge."
        )
    )

    response = llm.invoke([system] + state["messages"])
    return {"messages": [response]}


workflow = StateGraph(SupportState)
workflow.add_node("support_agent", support_agent)
workflow.add_edge(START, "support_agent")
workflow.add_edge("support_agent", END)

graph = workflow.compile()</code></pre></div><p>The Talon-specific part is only:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;d4b1c96d-667d-4159-8a6d-c96aaf54ecc8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">base_url="http://localhost:18080/v1/proxy/openai/v1"
api_key="talon-gw-langgraph-demo"</code></pre></div><p>Everything else is standard LangGraph/LangChain code.</p><h2>Run the PII-bearing request</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;0a0f1764-07b1-448c-a69e-74296964f55a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">result = graph.invoke({
    "messages": [
        HumanMessage(
            content=(
                "My email is anna.kowalska@example.com and my IBAN is "
                "DE89370400440532013000. I was charged twice for order "
                "ORD-18422. Can you help?"
            )
        )
    ]
})
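
# Hedged check (not a Talon feature): see whether the reply echoes the raw
# email or IBAN from the prompt. The variable names here are illustrative.
reply = result["messages"][-1].content
for raw_value in ("anna.kowalska@example.com", "DE89370400440532013000"):
    print(raw_value, "echoed in reply:", raw_value in reply)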

print(result["messages"][-1].content)</code></pre></div><p>At this point, the important thing is not whether the answer is amazing. This is a governance test, not a support automation benchmark.</p><p>The important question is: what did Talon record?</p><h2>Inspect evidence</h2><p>List records:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;af48d0a5-d8f4-4d45-a39e-244f0025e0bc&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">talon audit list --limit 10</code></pre></div><p>You should see records with IDs like:</p><pre><code>gw_cd438803-4d0</code></pre><p>Then inspect one:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;17566287-340c-4e4b-b6be-392e05d48537&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">talon audit show gw_cd438803-4d0</code></pre></div><p>A useful record should show:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;cd26cc28-0df0-4532-8782-e35ab6dbcb84&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">Evidence:       gw_cd438803-4d0
Tenant / Agent: demo / langgraph-support-agent
Invocation:     gateway
HMAC Signature: &#10003; VALID

Policy Decision
Allowed:        true
Action:         allow

Classification
Input Tier:     2
Output Tier:    2
PII Detected:   email, iban, person, ...
PII Redacted:   true

Execution
Model:          gpt-4o-mini
Cost:           &#8364;&lt; 0.0001
Duration:       1267ms
Tokens:         in=106 out=62
Tools Called:   (none)</code></pre></div><p>This is the useful part of the demo.</p><p>The LLM call is no longer invisible. It has a tenant, an agent identity, a model, a PII classification, a redaction result, cost, latency, token counts, and a signature.</p><h2>Verify the evidence signature</h2><p>Run:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;5b5e8276-e676-41a7-923a-89c2d547e007&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">talon audit verify gw_cd438803-4d0</code></pre></div><p>Expected:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;238f368d-06e8-4896-893f-66fb4e442fd1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10003; Evidence gw_cd438803-4d0: signature VALID</code></pre></div><p>This proves the evidence record has not been modified since Talon created it.</p><p>For a technical buyer, this is materially different from application logs. Logs are useful for debugging. Signed evidence is useful for governance and later review.</p><h2>Understand the output PII warning</h2><p>In the audit explanation, you may see something like:</p><pre><code>POLICY_DENIED_PII_OUTPUT</code></pre><p>In this demo, that does not necessarily mean the request was blocked.</p><p>Check the final policy fields:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;84b04522-5246-4005-903b-f06d3a9495ae&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">Allowed: true
Action: allow</code></pre></div><p>The reason is this config:</p><pre><code>response_pii_action: "warn"</code></pre><p>So Talon records output PII findings but still allows the response.</p><p>For the first demo, this is useful. It proves response scanning happened without making the notebook fail.</p><p>For production, pick the behavior explicitly:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;267a075f-96cb-4dd1-b367-eedad6700002&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">response_pii_action: "warn"    # record only
# response_pii_action: "redact"  # mask before returning
# response_pii_action: "block"   # deny response</code></pre></div><p>One product note: the compact audit-list label can look confusing here. It would be clearer if the list view showed something like:</p><pre><code>ALLOWED_WITH_OUTPUT_PII_WARNING</code></pre><p>when the final action is allow but an output PII finding was recorded.</p><h2>Test model restriction</h2><p>The config only allows:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;01f371f9-0875-4cab-a8b4-647245af16af&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">allowed_models:
  - "gpt-4o-mini"</code></pre></div><p>Test a different model:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;44acd1e2-aca8-4549-abee-1ae1fc2b9370&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">bad_payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 20,
}

r = requests.post(url, headers=headers, json=bad_payload, timeout=60)
print(r.status_code)
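# Hedged check: assumes a denied model comes back as a 4xx/5xx status; the
# exact status code and error body are not specified in this post.
if r.ok:
    print("Unexpected: gpt-4o was allowed; check allowed_models in talon.config.yaml")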
print(r.text[:2000])</code></pre></div><p>Expected: a denial or policy error.</p><p>This is a basic but important control. You do not want every application instance choosing arbitrary models. Model selection affects cost, vendor review, latency, and sometimes data-residency posture.</p><h2>What this gives you</h2><p>This pattern gives a small team a practical governance boundary with a small app change.</p><p>What this adds:<br><br>- PII scan and redaction for raw customer data in prompts.<br>- Provider key isolation: the app calls <a href="https://dativo.io/">Talon</a>, not OpenAI directly.<br>- Model allowlist to prevent unapproved model usage.<br>- Caller identity mapped to tenant and agent.<br>- Signed evidence records for every governed call.<br>- Cost and token recording.<br>- Output scanning for response leakage risk.</p><div><hr></div><h2>Summary</h2><p>LangGraph makes it easy to add more autonomy: tools, loops, memory, retries, human approval, and long-running workflows.</p><p>Those are exactly the places where governance becomes harder.</p><p>Starting with the model boundary is the lowest-friction control:</p><pre><code><code>change base_url + api_key</code></code></pre><p>You can keep the LangGraph workflow mostly unchanged and still get:</p><pre><code><code>policy + redaction + model restriction + evidence</code></code></pre><p>That is the right first step before giving the agent tools that can read or write business data.</p><p>The next post will build on this and govern tool calls: safe tools, forbidden tools, dry runs, and blocking destructive actions before the agent can execute them.</p>]]></content:encoded></item><item><title><![CDATA[Data quality In Delta Lake and Iceberg]]></title><description><![CDATA[Part 3: Getting practical with data quality]]></description><link>https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg-184</link><guid 
isPermaLink="false">https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg-184</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Fri, 29 May 2026 06:44:48 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img 
src="https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3872" height="2160" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2160,&quot;width&quot;:3872,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;gray mountains near pine trees at daytime&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="gray mountains near pine trees at daytime" title="gray mountains near pine trees at daytime" srcset="https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1568156318788-5c96955343a2?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw0MXx8aWNlYmVyZyUyMGxha2V8ZW58MHx8fHwxNzc5OTcyNjc1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@kenny_h">Kenneth Hargrave</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>This is the third part of the series. The previous parts are:</p><div class="digest-post-embed" 
data-attrs="{&quot;nodeId&quot;:&quot;cc9bdcc1-6dc3-4870-a9df-7e8ece0b1d16&quot;,&quot;caption&quot;:&quot;Most companies already run some form of data quality monitoring. They have freshness checks, null checks, schema validation, row count checks, sometimes even anomaly detection, alerting, and incident workflows.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Data quality In Delta Lake and Iceberg&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:315636,&quot;name&quot;:&quot;Sergey&quot;,&quot;bio&quot;:&quot;Hey there! I'm Sergey Enin, a seasoned professional with 16+ years of experience in the advanced data analytics space. I've worked across the globe and I'm fluent in four languages &#128187;&#127757;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb4b8594-ff73-4b33-84a3-6ce5a2583e1c_144x144.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-28T12:27:00.008Z&quot;,&quot;cover_image&quot;:&quot;https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:199590123,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2240131,&quot;publication_name&quot;:&quot;Data, Engineering, and 
Beyond&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!SVdI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bfbcde6-87b3-4b8c-9b38-3d1b82408e62_800x800.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;140bb871-d3f3-45ec-be62-0abfbb8b2afa&quot;,&quot;caption&quot;:&quot;For data engineers, &#8220;data quality&#8221; is only one part of the operational picture.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Data quality In Delta Lake and Iceberg &quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:315636,&quot;name&quot;:&quot;Sergey&quot;,&quot;bio&quot;:&quot;Hey there! I'm Sergey Enin, a seasoned professional with 16+ years of experience in the advanced data analytics space. 
I've worked across the globe and I'm fluent in four languages &#128187;&#127757;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb4b8594-ff73-4b33-84a3-6ce5a2583e1c_144x144.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-28T12:59:23.112Z&quot;,&quot;cover_image&quot;:&quot;https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg-e34&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:199594508,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2240131,&quot;publication_name&quot;:&quot;Data, Engineering, and Beyond&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!SVdI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bfbcde6-87b3-4b8c-9b38-3d1b82408e62_800x800.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><p>Imagine a daily <code>finance.orders_mart</code> table used by executives.</p><p>The <a href="https://opendatacontract.com/">data contract</a> says:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;19918f42-c3e0-4cf8-9cf2-ba97e7ca40ef&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart
expected_schedule: daily
freshness_sla: available by 08:30 Europe/Warsaw
required_checks:
  - order_id_not_null
  - unique_order_id
  - revenue_non_negative
  - valid_currency
  - row_count_within_expected_range
owner: finance-data-platform
alert_route: "#data-finance-alerts"</code></pre></div><p>A pipeline run starts at 08:00.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.dativo.io/subscribe?"><span>Subscribe now</span></a></p><p></p><h2>Step 1: Check Upstream Freshness</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;5caf9d8c-4e7c-46fe-b0c9-1ab652a8c11f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">raw_orders_freshness = dq.get_freshness_status("raw.orders")

if raw_orders_freshness.status == "stale":
    stop_pipeline("raw.orders is stale")</code></pre></div><h2>Step 2: Run Transformation</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;051ca4f2-4c82-49d7-9cc8-c382da987b09&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">orders = spark.table("raw.orders")
orders_mart = build_orders_mart(orders)

write_table(orders_mart, "finance.orders_mart")</code></pre></div><h2>Step 3: Check Pipeline Health</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;1f7593f3-65ec-4876-85c3-fc1b0d6bfc46&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">pipeline_status = orchestrator.get_current_run_status()

if pipeline_status == "success":
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        pipeline_health="healthy",
        last_successful_pipeline_run_at=now()
    )</code></pre></div><h2>Step 4: Run DQ Checks</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;b33a07ce-4a0e-46e5-a5f7-32fb0db535f4&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">dq_result = dq.run_checks(
    asset="finance.orders_mart",
    checks=[
        "order_id_not_null",
        "unique_order_id",
        "revenue_non_negative",
        "valid_currency",
        "row_count_within_expected_range"
    ]
)

dq_results_table.append(dq_result)</code></pre></div><h2>Step 5: Check Output Freshness</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;7e3f3d3f-b5bc-4c6f-b1c3-2056abad4aef&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">freshness = dq.check_freshness(
    asset="finance.orders_mart",
    timestamp_column="order_created_at",
    expected_by="08:30",
    timezone="Europe/Warsaw"
)</code></pre></div><h2>Step 6: Publish Stable Asset State</h2><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c97857cd-8567-430e-92e9-1b57b8cf22b0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">if pipeline_status == "success" and dq_result.passed and freshness.passed:
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        data_contract_status="certified",
        quality_certification="certified",
        pipeline_health="healthy",
        freshness_state="fresh"
    )
else:
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        data_contract_status="warning"
    )</code></pre></div><p>The detailed operational results should stay in the right systems:</p><pre><code><strong>Pipeline logs </strong>     &#8594; orchestrator
<strong>DQ check results</strong>   &#8594; DQ platform / sidecar results table
<strong>Freshness history</strong>  &#8594; DQ platform / sidecar results table
<strong>Stable trust state</strong> &#8594; catalog / asset metadata</code></pre><p>The rule is simple:</p><pre><code><code>Operational evidence belongs in operational systems.
Stable trust state belongs in asset metadata.</code></code></pre><h2>The Combined Metadata Model</h2><p>For asset metadata, I suggest exposing a compact view:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;faa15718-4041-4df1-bbc0-e543bdb8b824&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart

pipeline:
  health: healthy
  last_successful_run_at: 2026-05-27T08:12:00Z
  owner: finance-data-platform
  run_url: https://orchestrator/runs/123

quality:
  certification: certified
  contract_status: certified
  monitoring_required: true
  owner: finance-data-platform
  latest_results_uri: https://dq-platform/runs/456

freshness:
  state: fresh
  sla: available_by_08_30
  last_checked_at: 2026-05-27T08:20:00Z
  latest_results_uri: https://dq-platform/freshness/789</code></pre></div><p>This is enough for catalogs, BI tools, policies, and pipelines. It is not trying to store every check result.</p><h2>How Data Engineers Should Use Quality Indicators in Pipelines</h2><p>The most useful quality metadata is not decorative.</p><p>It should change how pipelines behave.</p><p>A tag like <code>quality_certification=certified</code> is only valuable if systems and people use it to make decisions. Otherwise, it is just another label in a catalog.</p><p>For data engineers, quality indicators can become pipeline control signals.</p><h2>1. Input Gating</h2><p>Before a pipeline reads from an upstream table, it can check the upstream asset state.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;001be3fe-e7ac-49aa-a943-bdb1eb3acd35&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">source_quality = catalog.get_asset_quality("raw.orders")

if source_quality.data_contract_status == "blocked":
    raise Exception("raw.orders is blocked by its data contract status")
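
# Hedged addition: "warning" is one of the four contract states used in this post;
# continuing with a logged warning is an assumed policy, not a rule.
if source_quality.data_contract_status == "warning":
    warn("raw.orders has a data contract warning; check its latest DQ results")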

if source_quality.quality_certification == "deprecated":
    warn("raw.orders is deprecated and should not be used for new pipelines")</code></pre></div><p>This does not require parsing every DQ result. The pipeline only needs a stable signal:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;133fdbdd-bd5d-41e1-9a02-a7a310e70dc9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">data_contract_status: certified | warning | blocked | deprecated</code></pre></div><p>This allows teams to prevent bad data from flowing silently into downstream tables.</p><h2>2. Freshness-Aware Execution</h2><p>Some pipelines should only run if the upstream data is fresh enough.</p><p>For example, a daily revenue table should not be rebuilt if the upstream orders table has not received today&#8217;s data.</p><p>The pipeline can check a freshness signal:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;acd5e349-cf62-47b8-8f7d-9dadb544f0d7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">freshness = dq_service.get_latest_check(
    asset="raw.orders",
    check="freshness"
)

if freshness.status == "failed":
    raise Exception("Upstream orders data is stale")</code></pre></div><p>There are two possible patterns here.</p><p>For critical operational freshness, read from the DQ platform directly because it has the latest check state.</p><p>For stable lifecycle decisions, read from the catalog or asset metadata.</p><p>That gives us a useful split:</p><pre><code><strong>Need latest operational status?</strong> &#8594; DQ platform
<strong>Need stable trust state?</strong>        &#8594; Catalog / asset metadata</code></pre><h2>3. Promotion Gates</h2><p>Quality indicators can, and should, control movement between data layers.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;c30bbd27-5660-44db-b632-45b15b22a087&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">bronze &#8594; silver
Requires:
- schema valid
- required fields present
- basic freshness check passing
- no severe ingestion errors

silver &#8594; gold
Requires:
- business rules passing
- accepted volume ranges
- key dimensions populated
- owner assigned

gold &#8594; certified
Requires:
- data contract approved
- monitoring enabled
- SLA defined
- alert routing configured
- successful check history</code></pre></div><p>This makes certification a real engineering workflow instead of a manual catalog label.</p><p>A pipeline could implement this as:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;dd48b609-1c01-4788-97c4-b1c9a7de712b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">dq_result = dq.run_checks("curated.orders")

if dq_result.passed_required_checks:
    catalog.update_asset_metadata(
        asset="curated.orders",
        data_contract_status="managed"
    )

if dq_result.passed_certification_checks and owner_approved:
    catalog.update_asset_metadata(
        asset="curated.orders",
        quality_certification="certified"
    )</code></pre></div><p>The important part is that the pipeline does not write every check result into the table metadata.</p><p>It writes detailed results to the DQ system, then updates only the stable asset state when the lifecycle state changes.</p><h2>4. Output Validation</h2><p>Every important pipeline should validate what it produces.</p><p>This is where DQ tools are most useful.</p><p>After writing the output table, the pipeline runs checks such as:</p><pre><code><code>order_id is not null
revenue is non-negative
event_time is within expected freshness window
row count is within expected range
country_code matches known reference values
no duplicate primary business keys</code></code></pre><p>Then it publishes the result:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;528842da-e9f7-4278-b1a5-e019615a1ed0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">dq_result = dq.run_checks(
    asset="finance.orders_mart",
    checks=[
        "order_id_not_null",
        "revenue_non_negative",
        "freshness_within_sla",
        "row_count_within_expected_range",
        "valid_country_code",
        "unique_order_id"
    ]
)

dq_results_table.append(dq_result)
dq_platform.publish(dq_result)</code></pre></div><p>The asset metadata may then be updated with a stable summary:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;4960ae1f-7ecf-4848-8960-7311a36238b1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">monitoring_required: true
quality_certification: certified
data_contract_status: certified
quality_owner: finance-platform</code></pre></div><p>But the run-level details stay outside the asset metadata.</p><h2>5. Incident-Aware Pipelines</h2><p>If a critical upstream asset has an open incident, downstream jobs can behave differently.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;ad988b52-cd57-48b1-8ca6-65dd0caf6eca&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">incident = dq_service.get_open_incident("raw.payments")

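# Assumes get_open_incident returns None when no incident is open.
if incident is not None and incident.severity == "critical":
    stop_pipeline()

if incident is not None and incident.severity == "warning":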
    run_pipeline_but_mark_output_as_impacted()</code></pre></div><p>This enables a more nuanced model than simply &#8220;run or fail.&#8221;</p><p>Possible actions:</p><pre><code><strong>Critical failure</strong> &#8594; stop pipeline
<strong>Warning</strong>          &#8594; continue but mark output as impacted
<strong>Deprecated input </strong>&#8594; continue for existing jobs, block new dependencies
<strong>Freshness delay</strong>  &#8594; wait, retry, or skip publish
<strong>Schema break </strong>    &#8594; fail immediately</code></pre><p>This is how quality metadata becomes operationally useful.</p><h2>6. Lineage-Aware Impact Propagation</h2><p>The most powerful pattern is lineage-aware propagation.</p><p>If <code>raw.orders</code> fails, the platform should know that <code>curated.orders</code>, <code>finance.revenue_mart</code>, and <code>executive.arr_dashboard</code> may be impacted.</p><p>This does not mean all downstream tables immediately become &#8220;failed.&#8221; It means their trust state should reflect dependency risk.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;7beab7c4-903d-4cdd-a8f1-dc84d78a7db7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_state: impacted
impacted_by: raw.orders
impact_reason: upstream_freshness_failed</code></pre></div><p>This is extremely useful for consumers.</p><p>Instead of discovering a broken dashboard manually, users can see that the underlying data is impacted by an upstream incident.</p><p>For data engineers, this also helps prioritize response. If a failed raw table impacts a board-level dashboard, it should be treated differently from a failure in an unused sandbox table.</p><h2>7. Environment and Release Gates</h2><p>Quality metadata can also be used in CI/CD workflows.</p><p>For example, a data contract change should not be promoted to production unless required checks exist.</p><p>A dbt model should not be marked as certified unless it has an owner, tests, freshness checks, and alert routing.</p><p>A table should not move from experimental to managed unless it has basic quality coverage.</p><p>This turns quality into a release policy:</p><pre><code><strong>No owner </strong><code>              &#8594; cannot certify
</code><strong>No freshness check</strong><code>     &#8594; cannot certify
</code><strong>No null checks</strong><code>         &#8594; cannot certify
</code><strong>No alert route </strong><code>        &#8594; cannot certify
</code><strong>Open critical incident</strong><code> &#8594; cannot promote</code></code></pre><p>This is where catalog metadata, DQ results, and pipeline orchestration come together.</p><h2>8. BI and Consumer Warnings</h2><p>Data engineers also need to think about downstream consumption.</p><p>A BI tool, notebook environment, or query interface can read asset trust metadata and display warnings.</p><p>For example:</p><pre><code>Warning<code>: this table is not certified.
Warning: this table is impacted by an upstream freshness incident.
Warning: this table is deprecated and will be removed after 2026-09-01.</code></code></pre><p>This is not a pipeline pattern, but data engineers need to publish the metadata that makes it possible.</p><h2>The Architecture I Would Recommend</h2><p>The clean architecture looks like this:</p><pre><code><code>Data pipeline
   &#8595;
Pipeline health checks
   &#8595;
DQ checks
   &#8595;
Freshness checks
   &#8595;
DQ platform / sidecar DQ results table
   &#8595;
Stable asset metadata update
   &#8595;
Catalog / governance layer
   &#8595;
Consumers, policies, BI warnings, certification workflows</code></code></pre><p>This works because detailed, volatile check results stay in the DQ platform, while only stable trust state reaches the catalog and its consumers.</p><h2>Final Take</h2><p>Data quality indicators should be part of the asset experience.</p><p>When someone opens a table, they should immediately understand whether it is certified, monitored, owned, trusted, stale, deprecated, blocked, or impacted by an upstream incident.</p><p>That information belongs in the catalog and can be mirrored into table metadata as stable properties.</p><p>At the same time, operational data quality results should not be embedded directly into Delta or Iceberg table metadata. They are too volatile, too detailed, and too operational. The tables can, and should, carry enough stable trust metadata to make data assets more discoverable, governable, and usable.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data, Engineering, and Beyond! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data quality In Delta Lake and Iceberg ]]></title><description><![CDATA[Part 2: Three Pillars of Data Engineering Monitoring]]></description><link>https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg-e34</link><guid isPermaLink="false">https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg-e34</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Thu, 28 May 2026 12:59:23 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4750" height="3160" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3160,&quot;width&quot;:4750,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a lake surrounded by mountains under a cloudy sky&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a lake surrounded by mountains under a cloudy sky" title="a lake surrounded by mountains under a cloudy sky" 
srcset="https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1713545690664-8dcba75d3c3e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHxkZWx0YSUyMGxha2V8ZW58MHx8fHwxNzc5OTcxNDIyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" 
viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@santurbanephotography">Abby Santurbane</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>For data engineers, &#8220;data quality&#8221; is only one part of the operational picture.</p><p>A production data asset is trustworthy only when three things are true:</p><p>1. <strong>Pipeline Health</strong> - the pipeline that creates the data is healthy.</p><p>2. <strong>Data Quality </strong>- the data is correct enough for its use case.</p><p>3. <strong>Data Freshness</strong> - the data is fresh enough for its SLA.</p><p>These are not the same thing.</p><p>A pipeline can be green while the data is wrong.</p><p>A table can pass all schema and null checks while still being stale.</p><p>A freshness check can pass even if the pipeline is silently producing duplicated data.</p><p>Each pillar should produce operational signals, and only some of those signals should be promoted into stable asset metadata.</p><p>This is the second part of the series. The previous part is:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6afd6c1f-d749-4844-b556-d3e71a460ae1&quot;,&quot;caption&quot;:&quot;Most companies already run some form of data quality 
monitoring. They have freshness checks, null checks, schema validation, row count checks, sometimes even anomaly detection, alerting, and incident workflows.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Data quality In Delta Lake and Iceberg&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:315636,&quot;name&quot;:&quot;Sergey&quot;,&quot;bio&quot;:&quot;Hey there! I'm Sergey Enin, a seasoned professional with 16+ years of experience in the advanced data analytics space. I've worked across the globe and I'm fluent in four languages &#128187;&#127757;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb4b8594-ff73-4b33-84a3-6ce5a2583e1c_144x144.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-28T12:27:00.008Z&quot;,&quot;cover_image&quot;:&quot;https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:199590123,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:2240131,&quot;publication_name&quot;:&quot;Data, Engineering, and 
Beyond&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!SVdI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bfbcde6-87b3-4b8c-9b38-3d1b82408e62_800x800.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Pillar 1: Pipeline Health</h2><p>Pipeline health answers the question:</p><blockquote><p>Did the system that produces this asset run successfully?</p></blockquote><p>This is usually monitored by Airflow, dbt Cloud, Spark jobs, Databricks Workflows, Flink, Kafka Connect, or another orchestration/runtime system.</p><p>Typical pipeline health signals:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;f59cb11b-139a-4e94-8cde-add9390b8e19&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">pipeline.last_run_status: success | failed | skipped | running
pipeline.last_run_at: 2026-05-27T08:00:00Z
pipeline.last_successful_run_at: 2026-05-27T08:00:00Z
pipeline.duration_seconds: 842
pipeline.retries: 1
pipeline.owner: data-platform
pipeline.run_url: https://orchestrator/runs/123</code></pre></div><p>In practice, a healthy state could mean:</p><pre><code><code>The Airflow DAG finished successfully at 08:00.
The Spark job wrote the output table.
No task failed.
The pipeline SLA was met.</code></code></pre><p>This is good news, but it does not prove the data is correct.</p><p>A pipeline can complete successfully and still produce bad data because of upstream delays, incorrect joins, unexpected source changes, or silent business logic issues.</p><h3>Pipeline Health Check</h3><p>A data engineer may define a simple rule:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;cd6d8f55-bdd6-40e5-91df-6de4010aad2b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">pipeline_run = orchestrator.get_latest_run("build_finance_orders_mart")

if pipeline_run.status != "success":
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        pipeline_health="failed",
        asset_state="not_trusted"
    )
    raise Exception("Pipeline failed")</code></pre></div><p>This signal is useful for operational dashboards and incident routing.</p><p>But I would not store every task log, retry event, executor metric, and stack trace in the table metadata. Those belong in the orchestrator and logging system.</p><p>The asset metadata should expose a compact state:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;25de0ebc-b491-43c1-b7aa-43e94ae90edd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">pipeline_health: healthy
last_successful_pipeline_run_at: 2026-05-27T08:00:00Z
pipeline_owner: finance-data-eng
pipeline_run_url: https://orchestrator/runs/123</code></pre></div><p></p><h2>Pillar 2: Data Quality</h2><p>Data quality answers the question:</p><blockquote><p>Did the produced data meet the expected rules?</p></blockquote><p>This is where tools such as Monte Carlo, Soda, Great Expectations, dbt tests, or even custom Spark checks are useful.</p><p>Typical data quality checks:</p><pre><code><code>Required columns are not null.
Primary business keys are unique.
Revenue is non-negative.
Country codes match reference data.
Order status belongs to an allowed list.
Row count is within an expected range.
Distribution of values did not unexpectedly shift.
Schema did not break.</code></code></pre><p>Example DQ result:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;687e098d-7b28-4f97-bfdc-3bb3509f8c59&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">dq.status: failed
dq.failed_checks_count: 2
dq.failed_checks:
  - order_id_not_null
  - revenue_non_negative
dq.run_id: dq_run_123
dq.results_uri: https://dq-platform/runs/123</code></pre></div><p>This is operational state. It may change every time checks run.</p><p>The detailed result should stay in the data quality platform or sidecar DQ results table.</p><p>The asset metadata should expose only a stable summary or lifecycle signal:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;95a696d3-ebec-4b6a-86d1-8b28a2a982de&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_certification: certified
data_contract_status: warning
monitoring_required: true
quality_owner: finance-platform</code></pre></div><h3>Data Quality Gates</h3><p>A pipeline can run checks after producing a table:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;1c2e4817-fbb9-4bb4-b50f-2398ff7dff39&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">orders_mart = build_orders_mart(raw_orders)

write_table(orders_mart, "finance.orders_mart")

dq_result = dq.run_checks(
    asset="finance.orders_mart",
    checks=[
        "order_id_not_null",
        "unique_order_id",
        "revenue_non_negative",
        "valid_order_status",
        "row_count_within_expected_range"
    ]
)

dq_results_table.append(dq_result)

if dq_result.has_critical_failures:
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        data_contract_status="blocked"
    )
    raise Exception("Critical DQ checks failed")

if dq_result.passed_required_checks:
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        data_contract_status="certified"
    )</code></pre></div><p><strong>The DQ run result is stored in the DQ results system, while the certification status is stored in asset metadata.</strong></p><p>That keeps the asset metadata clean while still allowing pipelines to act on quality.</p><p></p><h2>Pillar 3: Data Freshness</h2><p>Data freshness answers the question:</p><blockquote><p>Is the data current enough for the business use case?</p></blockquote><p>Freshness is related to pipeline health, but it is not the same thing.</p><p>A pipeline may run successfully at 08:00, but if the upstream source stopped receiving new events at 02:00, the table is still stale.</p><p>Freshness checks usually compare expected arrival patterns with actual data timestamps or ingestion times.</p><p>Typical freshness signals:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;97857649-1884-4798-9314-6346973f0665&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">freshness.status: fresh | stale | delayed | unknown
freshness.max_event_time: 2026-05-27T07:55:00Z
freshness.max_ingested_at: 2026-05-27T08:01:00Z
freshness.expected_by: 2026-05-27T08:15:00Z
freshness.delay_minutes: 20
freshness.sla_minutes: 60</code></pre></div><p>Example:</p><pre><code><code>The pipeline ran successfully.
The table has new rows.
But the newest business event is six hours old.
For a near-real-time reporting table, this is a freshness failure.</code></code></pre><h3>Freshness Gate</h3><p>For a critical dashboard table, a data engineer may define:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;e4548683-faf9-4baa-abca-6fed3c251d3f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">freshness = dq.get_freshness_status("finance.orders_mart")

if freshness.delay_minutes &gt; freshness.sla_minutes:
    catalog.update_asset_metadata(
        asset="finance.orders_mart",
        freshness_state="stale",
        data_contract_status="warning"
    )
    notify_owner("finance.orders_mart is stale")</code></pre></div><p>For a downstream pipeline:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c020e86d-1669-44ff-aa2e-4819080d5cc9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">freshness = dq.get_freshness_status("raw.orders")

if freshness.status == "stale":
    raise Exception("Cannot rebuild finance.orders_mart because raw.orders is stale")</code></pre></div><p>Freshness is often the most important quality signal for business users.</p><p>A table can have perfect schema, zero nulls, and valid values, but if it is three days late, it is not useful.</p><h2>How the Three Pillars Work Together</h2><p>The real value comes from combining the three signals.</p><p>Consider this asset:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;d52740d6-239c-48ff-aae7-c143f9a3c67a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart
pipeline_health: healthy
data_quality_status: passed
freshness_status: fresh
quality_certification: certified</code></pre></div><p>This table is in good shape.</p><div><hr></div><p></p><p>Now consider this one:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;2516fdce-f0f4-49b7-bc62-cc484aa54f03&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart
pipeline_health: healthy
data_quality_status: passed
freshness_status: stale
quality_certification: certified</code></pre></div><p>The pipeline is green and quality checks pass, but the data is stale. This should trigger a warning, especially for operational dashboards.</p><div><hr></div><p></p><p>Another case:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;f04a1867-b2fa-427a-a138-2c0fd74e427c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart
pipeline_health: failed
data_quality_status: unknown
freshness_status: stale
quality_certification: certified</code></pre></div><p>The last run failed, so we do not know whether the latest data would pass checks. Freshness is stale because the table has not been updated. This is primarily a pipeline incident.</p><div><hr></div><p></p><p>Another case:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;6c41fc85-0075-485b-860e-2e35108f379d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">asset: finance.orders_mart
pipeline_health: healthy
data_quality_status: failed
freshness_status: fresh
quality_certification: suspended</code></pre></div><p>The pipeline ran and the data is fresh, but the values are wrong. This is a data quality incident.</p><div><hr></div><p></p><p>These distinctions matter because the response should be different.</p><blockquote><p><strong>Pipeline failed</strong>  &#8594; fix orchestration, runtime, permissions, infrastructure, code </p><p><strong>DQ failed  </strong>      &#8594; inspect data values, business rules, source changes, transformations </p><p><strong>Freshness failed </strong>&#8594; inspect source arrival, ingestion lag, scheduling, upstream dependencies</p></blockquote><p>This is why the metadata model should not simply say:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;dca5a698-94d6-4c00-bf7c-12a93c1921c2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_status: failed</code></pre></div><p>That is too vague.</p><p>A better model says:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;821add22-33c1-4687-8de5-887fa878e652&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">pipeline_health: healthy
data_quality_status: passed
freshness_status: stale
data_contract_status: warning</code></pre></div><p>Now we know what is wrong and how to triage it.</p>]]></content:encoded></item><item><title><![CDATA[Data quality In Delta Lake and Iceberg]]></title><description><![CDATA[Part 1: Stable Metadata + Operational Evidence]]></description><link>https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg</link><guid isPermaLink="false">https://blog.dativo.io/p/data-quality-in-delta-lake-and-iceberg</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Thu, 28 May 2026 12:27:00 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, 
https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3008" height="2000" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2000,&quot;width&quot;:3008,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;white ice on body of water&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="white ice on body of water" title="white ice on body of water" srcset="https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1605963476871-42dfd3bfc7af?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3Mnx8aWNlYmVyZyUyMGZhbGxpbmd8ZW58MHx8fHwxNzc5OTY5MjYyfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 
21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@amelia1">Claudia Salvioli</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>Most companies already run some form of data quality monitoring. They have freshness checks, null checks, schema validation, row count checks, sometimes even anomaly detection, alerting, and incident workflows.</p><p>The problem is not that quality signals do not exist.</p><p>The problem is that they are usually hidden in operational tools, disconnected from the data assets people actually consume. So data quality becomes a vanity metric that lives somewhere in Confluence and is proudly shown to leadership. But it means nothing.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.dativo.io/subscribe?"><span>Subscribe now</span></a></p><p></p><p>A data analyst opens a table in a catalog and sees the owner (fingers crossed), a description, lineage, maybe some tags. But the real question is usually much simpler:</p><blockquote><p>Can I trust this table?</p></blockquote><p>That question leads to an idea:</p><p>Should data quality indicators become part of the table metadata itself?</p><p>Should Delta Lake or Apache Iceberg expose quality status directly as part of the asset?</p><p>After looking at this from the perspective of open table formats, catalogs, data quality platforms, and data engineering workflows, my conclusion is:</p><p><strong>Yes, data quality should be visible as asset metadata. 
But no, run-by-run data quality results should not be embedded directly into the open table format.</strong></p><p>The distinction matters.</p><h1>The Two Kinds of Data Quality Metadata</h1><p></p><p>When people say &#8220;data quality metadata,&#8221; they often mix two very different things.</p><p>The first category is stable asset state:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;d5fa1c03-b0b1-4e72-bc88-7cd0ddc4118a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_certification: certified
data_contract_status: certified
quality_owner: finance-engineering
monitoring_required: true
sla_tier: gold
retention_policy: 1 year</code></pre></div><p>This state does not change every few minutes. It is useful for discovery, governance, certification, access policies, platform automation, and downstream consumption.</p><p>The second category is operational quality state, something like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;0a81cbd6-ac72-44a9-addc-4f6e0524e4d3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">last_freshness_check: failed
null_rate_customer_id: 0.08
row_count_anomaly_score: 0.91
failed_checks_count: 3
incident_id: MC-12345
last_run_id: dq_run_20260527_103000</code></pre></div><p>This information is also important.</p><p>But it is operational. It changes every time a check runs. It belongs to the monitoring layer, not necessarily to the table definition.</p><p>The industry often gets into trouble when it treats these two categories as the same thing.</p><p>They are not the same thing.</p><p></p><h1>What Data Quality Support Delta Lake and Iceberg Already Provide</h1><p>Before inventing a new metadata model, it is worth looking at what open table formats already support.</p><p>Delta Lake and Apache Iceberg are not data quality platforms, but they do contain several building blocks that are useful for data quality.</p><h2>Delta Lake</h2><p>Delta Lake has a few native capabilities that are directly or indirectly related to data quality.</p><p>The most obvious one is constraint enforcement.</p><h3>Hard rules: constraint enforcement.</h3><p>Delta supports <em>NOT NULL</em> constraints. This is the simplest form of quality enforcement: a required field cannot be missing.</p><p>Delta also supports <em>CHECK</em> constraints, which allow teams to define rules such as:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:&quot;a5cfa69e-fd25-405b-8644-b9ab088bbad0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">ALTER TABLE finance.orders ADD CONSTRAINT valid_revenue
CHECK (revenue &gt;= 0);</code></pre></div><p>This is useful because the rule is enforced at write time. If a write contains rows that violate the rule, the write fails.</p><p>That makes constraints stronger than a dashboard, stronger than a catalog tag, and stronger than a downstream alert.</p><h3>Soft expectations: key constraints.</h3><p>In some environments, especially when using Unity Catalog, teams can also define informational primary key and foreign key constraints. These are not always enforced in the same way as traditional relational database constraints, but they are still valuable metadata. They describe expected uniqueness and relationships between datasets.</p><h3>Technical signals: <code>data-skipping statistics.</code></h3><p>Delta also collects file-level statistics for data skipping. These statistics can include values such as minimum and maximum column values, and they help query engines avoid reading unnecessary files. These statistics are not designed as data quality indicators, but they can support quality-adjacent use cases. For example, they can help answer questions like:</p><pre><code><code>Does this file contain unexpected value ranges?
Are some partitions empty?
Did a column suddenly stop appearing in newly written data?
Is the table layout still useful for common access patterns?</code></code></pre><p>But this is important: Delta statistics are primarily an optimization feature.</p><p>They are not a semantic data quality model.</p><p>So Delta gives us three useful layers:</p><pre><code><code>1. Hard rules        &#8594; NOT NULL, CHECK constraints
2. Soft expectations &#8594; informational keys, schema expectations
3. Technical signals &#8594; data-skipping statistics</code></code></pre><p>That is useful, but it is not the same as a full data quality platform.</p><p>Delta does not natively answer questions like:</p><pre><code><code>Is this table certified?
Did the freshness check fail this morning?
Is there an open incident?
Was the latest anomaly acknowledged?
Which team owns the failed check?
What was the quality score over the last 30 days?</code></code></pre><p>Those questions belong to the observability and governance layer.</p><p></p><h2>Apache Iceberg</h2><p>Apache Iceberg has a different but equally interesting metadata model.</p><p>Iceberg tracks table state through metadata files, snapshots, manifest lists, and manifest files. Each snapshot represents the state of the table at a point in time. This makes Iceberg strong for table evolution, reproducibility, rollback, and time travel.</p><h3>File-level metrics.</h3><p>Iceberg manifests track data files and include file-level metrics. Depending on the writer and table configuration, these metrics may include information such as:</p><pre><code><code>record counts
null value counts
lower and upper bounds
column sizes
value counts</code></code></pre><p>This is very useful metadata.</p><p>For data engineers, these metrics can help identify quality symptoms.</p><p>For example:</p><pre><code><code>A null count suddenly increases.
A partition has far fewer records than usual.
A timestamp upper bound is older than expected.
A numeric column has values outside the expected range.</code></code></pre><p>Again, though, Iceberg does not treat these as business-level data quality indicators. They are technical metadata used mostly for planning, pruning, and efficient reads.</p><h3>Rich table metadata.</h3><p>Iceberg (like Delta Lake) also supports custom table properties. This gives teams a simple place to attach stable metadata such as:</p><pre><code><code>quality_certification: certified
data_contract_status: certified
monitoring_required: true
quality_owner: finance-platform</code></code></pre><p>For richer metadata, Iceberg has Puffin files. Puffin is designed to store additional statistics or index-like metadata that does not fit naturally into Iceberg manifests.</p><p>This could theoretically support more advanced quality-related artifacts, especially if the industry wanted to standardize richer table-level statistics.</p><p>But even with Puffin, I would be careful.</p><p>IMHO, Puffin is a place for statistics and technical metadata. It should not become a dumping ground for every data quality run result, incident, alert, and failed check payload.</p><p></p><h2>The Important Distinction: stable quality metadata vs operational data quality indicators</h2><p>Delta and Iceberg already provide useful quality-adjacent primitives:</p><pre><code><code>Schema enforcement
Constraints
Snapshots
Table properties
Column statistics
File metrics
Metadata tables
Additional statistics artifacts</code></code></pre><p>These are valuable.</p><p>But they mostly answer structural and technical questions:</p><pre><code><code>Is this write valid?
What files belong to this table version?
What are the column-level file statistics?
What changed between snapshots?
What properties describe this table?</code></code></pre><p>They do not answer the higher-level trust questions users care about:</p><pre><code><code>Is this asset certified?
Is it safe for executive reporting?
Did the latest freshness check pass?
Is there an open data incident?
Is this table covered by a data contract?
Who is responsible for quality?
</code></code></pre><p><s>Delta and Iceberg need to become data quality systems.</s></p><p>I would say:</p><blockquote><p>Delta and Iceberg should expose stable quality metadata.</p></blockquote><p></p><h3>Operational Data Quality Results Do Not Belong in Table Metadata</h3><p>The obvious implementation is tempting.</p><p>Every time Monte Carlo, Soda, Great Expectations, dbt tests, or a custom data quality framework runs, write the latest status back to the table metadata.</p><p>Something like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;304f2287-db86-4f50-bfb7-b97b538882cd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_status: failed
quality_score: 0.82
last_checked_at: 2026-05-27T10:30:00Z
failed_checks_count: 3
results_uri: s3://dq-results/finance_metrics/run_123.json</code></pre></div><p>For a small number of tables, this looks feasible. At platform scale, it becomes problematic. Data quality indicators are operational and time-sensitive. They change on every check:</p><ul><li><p>Freshness can fail at 10:00 and recover at 10:15.</p></li><li><p>A volume anomaly can be detected in one run and disappear in the next.</p></li><li><p>A null-rate check can fail because of a temporary upstream delay.</p></li><li><p>Incidents can be opened, acknowledged, suppressed, escalated, or resolved.</p></li></ul><p>If all of that is written directly into table metadata, the metadata layer becomes a high-churn operational store.</p><p>That creates several problems.</p><p>First, metadata history becomes noisy. Instead of capturing meaningful asset changes &#8212; schema updates, ownership changes, lifecycle transitions, contract certification &#8212; the table history gets filled with operational status updates.</p><p>Second, catalogs and sync systems are not designed to be incident event stores. They are optimized for discovery, governance, lineage, and relatively stable asset metadata. Constantly mutating properties across thousands of assets creates unnecessary load.</p><p>Third, consumers may read stale quality state. The data quality system may have the latest result, but the catalog sync may lag. The table property may show yesterday&#8217;s status. The BI tool may cache an older version. Now we have multiple versions of &#8220;truth.&#8221;</p><p>Fourth, large per-check payloads do not fit well into table properties. Detailed DQ output includes check names, thresholds, observed values, sample failures, incident links, owners, routing rules, and historical context.</p><p>Trying to squeeze this into table metadata turns metadata into an awkward JSON dump. 
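</p><p>To make the fourth point concrete, here is a small Python sketch. Every field name and value in it is hypothetical; it only compares the serialized size of a handful of stable properties with the payload of one run that reports three failed checks.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import json

# Stable asset state: a few short, rarely changing properties.
stable_properties = {
    "quality_certification": "certified",
    "data_contract_status": "certified",
    "quality_owner": "finance-engineering",
    "sla_tier": "gold",
}

# One run's operational output (hypothetical shape, three failed checks).
run_payload = {
    "run_id": "dq_run_20260527_103000",
    "incident_id": "MC-12345",
    "failed_checks": [
        {
            "check": f"null_rate_column_{i}",
            "threshold": 0.01,
            "observed": 0.08,
            "sample_failures": [f"row-{i}-{n}" for n in range(5)],
            "owner": "finance-engineering",
        }
        for i in range(3)
    ],
}

# Compare the serialized size of each structure.
stable_chars = len(json.dumps(stable_properties))
payload_chars = len(json.dumps(run_payload))

print(f"stable properties: {stable_chars} characters")
print(f"one run payload:   {payload_chars} characters")</code></pre></div><p>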
That is not asset metadata anymore.</p><p>That is an operational log pretending to be metadata.</p><p>A data quality platform (Monte Carlo, Soda, dbt tests, or your own custom platform) should be the source of truth for operational quality state because it is designed to store check history, detect anomalies, route alerts, manage incidents, and provide debugging context. Catalogs and table metadata should only expose stable trust signals, such as certification or contract status, while detailed check results remain in the DQ platform.</p><h2>The Better Pattern: Stable Metadata + Operational Evidence</h2><p>The compromise I like is simple.</p><p>Put stable state in the asset metadata.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;fd7c4228-8d3d-439a-a65d-432b3e5c138c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">quality_certification: certified
data_contract_status: certified
quality_owner: data-platform
monitoring_required: true
sla_tier: gold</code></pre></div><p>Put run-level results in a dedicated system.</p><p>That could be Monte Carlo. It could be Soda Cloud. It could be Great Expectations Cloud. It could be a custom observability service. It could also be a sidecar Delta or Iceberg table if you want a queryable internal history, with columns such as:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;1bd2ee68-a50b-410f-9faf-e8a22473adcf&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">dq_run_id
asset_id
checked_at
status
score
failed_checks_count
failed_checks
incident_id
results_uri
producer</code></pre></div><p>This gives us both things we need:</p><p>The asset remains clean and discoverable.</p><p>+</p><p>The operational history remains detailed and queryable.</p>]]></content:encoded></item><item><title><![CDATA[The Deletion Delusion: Your Modern Data Platform is Probably failing compliance]]></title><description><![CDATA[Modern data architectures vs GDPR]]></description><link>https://blog.dativo.io/p/the-deletion-delusion-your-modern</link><guid isPermaLink="false">https://blog.dativo.io/p/the-deletion-delusion-your-modern</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Mon, 20 Apr 2026 13:03:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!CPoY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There is a fundamental friction at the heart of modern data management: the widening chasm between legal fantasy and engineering reality. While your compliance/legal department assumes a "Right to Erasure" request is a simple SQL execution, every engineering lead knows the truth. Modern data platforms&#8212;built on &#8216;Big Data&#8217; and &#8216;Lakehouse&#8217; architectures&#8212;are optimized for append-heavy, read-intensive workloads. 
They were never designed for the selective, row-level mutations required by GDPR.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CPoY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CPoY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CPoY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2776797,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/194278852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CPoY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!CPoY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f2ce7b6-bc58-41ad-b2a7-01114e05907e_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>While nearly every company claims compliance, the underlying architecture of modern data lakes often makes true erasure technically impossible, or at least hard (read: expensive). 
Most organizations are operating under a "deletion delusion," where data is merely hidden from the application layer while remaining physically immutable in the depths of S3 or other storage.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.dativo.io/subscribe?"><span>Subscribe now</span></a></p><p></p><h2>The Financial Language of Privacy&#8212;ALE</h2><p>The biggest hurdle for privacy/platform engineering is securing a budget for a &#8220;Compliance Program.&#8221; To get management&#8217;s attention, you must translate regulatory risk (in practice an urban legend for management, until they are fined for non-compliance) into <strong>Annual Loss Expectancy (ALE)</strong>. This isn&#8217;t just a compliance metric; it&#8217;s the basis for your <strong>ROSI (Return on Security Investment)</strong>.</p><p>The formula is: <strong>ALE = Probability of Failure (ARO) &#215; Impact (SLE)</strong>.</p><p>When calculating the <strong>Single Loss Expectancy (SLE)</strong>, don&#8217;t just look at the fine. You must include engineering remediation costs, legal fees, and the &#8220;compute surge&#8221; required to fix the data post-incident.</p><p><strong>The Financial Reality:</strong> If there is a <strong>2% chance</strong> per year of a material GDPR failure and the SLE (fine + remediation + legal) is <strong>&#8364;5,000,000</strong>, your ALE is:</p><p><strong>2% &#215; &#8364;5M = &#8364;100,000 / year</strong></p><p>If the cost to build a reliable automated shredder is &#8364;80,000, the program pays for itself. 
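</p><p>The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than a finance model: the function names are hypothetical, and the ROSI line assumes the shredder removes the entire modeled risk in its first year, which a real program rarely does.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from decimal import Decimal


def annual_loss_expectancy(aro, sle):
    """ALE = annual probability of failure (ARO) x single loss expectancy (SLE)."""
    return aro * sle


def rosi(risk_reduction, program_cost):
    """Return on security investment: (risk avoided - cost) / cost."""
    return (risk_reduction - program_cost) / program_cost


# Figures from the example above: 2% annual chance, EUR 5,000,000 per incident.
ale = annual_loss_expectancy(Decimal("0.02"), Decimal("5000000"))

# Assumption: the EUR 80,000 shredder removes the whole modeled risk in year one.
first_year_rosi = rosi(risk_reduction=ale, program_cost=Decimal("80000"))

print(f"ALE:  EUR {ale:,.0f} per year")
print(f"ROSI: {first_year_rosi:.0%} in the first year")</code></pre></div><p>The result is only as good as the ARO and SLE estimates you feed in, so it is worth running the same calculation with a pessimistic and an optimistic pair.</p><p>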
Without this quantitative model, you&#8217;re just an engineer asking for more &#8220;unproductive&#8221; budget.</p><h2>GDPR Assumes You Know What PII Is (You Don&#8217;t)</h2><p>The first point of failure isn&#8217;t a lack of intent; it&#8217;s a failure of <strong>Data Protection by Design</strong>. GDPR assumes you have a clear, static map of Personally Identifiable Information (PII). In a modern data platform, this is a <em>hallucination</em>.</p><p>Data classification is almost always incomplete, and in the complex web of modern pipelines, PII is transformed and re-created across 10+ systems. Even if you scrub a <code>user_id</code>, the user&#8217;s ghost persists via <strong>derived signals</strong> and <strong>indirect identifiers</strong>&#8212;session IDs, device fingerprints, and behavioral embeddings. AI systems can now re-identify individuals from data you never intentionally labeled as sensitive. If you can&#8217;t verify the location of every derived identifier, you are failing the accountability requirements.</p><div class="callout-block" data-callout="true"><p>Most companies are not non-compliant because they ignore GDPR. They are non-compliant because their architecture makes compliance (almost) impossible.</p></div><h2>"DELETE" is a Lie in the World of Immutable Files</h2><p>In traditional relational database systems, deletion is a predictable row-level transaction. In the OLAP/Lakehouse world, your SQL <code>DELETE</code> is a lie. 
Because these systems rely on immutable columnar formats like Parquet, a delete command doesn&#8217;t remove data; it triggers a cascading failure of storage efficiency.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!v6tZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!v6tZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 424w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 848w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 1272w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!v6tZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png" width="1456" height="191" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:191,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83669,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/194278852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!v6tZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 424w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 848w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 1272w, https://substackcdn.com/image/fetch/$s_!v6tZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72abc99f-57ee-4cd0-adae-010692a0a2ae_1600x210.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Behind a &#8220;Logical Table: Deleted&#8221; checkmark, a standard delete command actually causes:</p><ul><li><p><strong>Massive I/O &amp; Compute Spikes:</strong> The system must scan, 
filter, rewrite, and replace entire file groups.</p></li><li><p><strong>S3 Tier Promotion:</strong> Deletion jobs often drag data from &#8220;Cold&#8221; storage tiers back to &#8220;Hot,&#8221; spiking your monthly cloud bill while failing to actually purge the data.</p></li><li><p><strong>Passive Propagation:</strong> Deleted data persists in:</p><ul><li><p>&#10060; <strong>Delta/Iceberg History:</strong> Transaction logs maintain the state for &#8220;time travel.&#8221;</p></li><li><p>&#10060; <strong>S3/Cloud Backups:</strong> Data lives on in secondary storage and snapshots.</p></li><li><p>&#10060; <strong>Stale Aggregates:</strong> Downstream summaries still contain the mathematical influence of the deleted records.</p></li></ul></li></ul><p></p><p></p><p><strong>The AI Black Hole&#8212;Logs, Embeddings, and the Unknown</strong></p><p>The most dangerous tier of the deletion hierarchy is the &#8220;AI/Unknown&#8221; tier. Teams are currently feeding massive amounts of PII into Large Language Models (LLMs), creating a trail of logs and high-dimensional embeddings that are extremely difficult to &#8220;un-learn&#8221; or selectively purge.</p><p>Passive cleanup is no longer enough. We need an <strong><a href="https://github.com/dativo-io/talon">AI Control Plane</a></strong> that moves privacy to active runtime enforcement. 
This control plane must at least:</p><ul><li><p>Understand data sensitivity at the point of ingestion.</p></li><li><p>Enforce policies at the prompt/inference level.</p></li><li><p>Provide verifiable evidence of erasure across the entire AI lifecycle.</p><p></p></li></ul><p></p><p><strong>Conclusion: Beyond the Checkbox</strong></p><p>True compliance requires a shift from &#8220;passive compliance&#8221; to &#8220;active privacy engineering.&#8221; A defensible strategy involves a phased rollout based on a priority hierarchy, for example:</p><ol><li><p>Mandatory Right to Erasure (DDR) and Account Deletion.</p></li><li><p>Content deletion and retroactive backfills.</p></li><li><p>Addressing indirect identifiers (such as session IDs), full downstream propagation, and auditability.</p></li></ol><p><strong>Final Thought:</strong> Is your current &#8220;DELETE&#8221; button actually removing data, or is it just hiding it from view? The <strong><a href="https://www.edpb.europa.eu/coordinated-enforcement-framework_en">2026 European Data Protection Board&#8217;s report</a></strong> makes it clear: regulators are no longer accepting &#8220;technical difficulty&#8221; or &#8220;immutable architecture&#8221; as valid excuses for data persistence. They are specifically targeting gaps in backup erasure and failed anonymization. If your data platform treats deletion as an afterthought, you aren&#8217;t compliant&#8212;you&#8217;re just lucky. For now.</p>]]></content:encoded></item><item><title><![CDATA[If we redact PII before the model sees the prompt, can we still preserve enough context for good reasoning?]]></title><description><![CDATA[Same privacy boundary. Better answers. 
Measured across 200 A/B prompts.]]></description><link>https://blog.dativo.io/p/if-we-redact-pii-before-the-model</link><guid isPermaLink="false">https://blog.dativo.io/p/if-we-redact-pii-before-the-model</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Fri, 27 Mar 2026 21:41:22 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4000" height="6000" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:6000,&quot;width&quot;:4000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Wooden sculpture of a woman reading a book.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Wooden sculpture of a woman reading a book." title="Wooden sculpture of a woman reading a book." 
srcset="https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1764924671797-8d546240552b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2NHx8Y29udGV4dHxlbnwwfHx8fDE3NzQ2NDY1ODZ8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@le_y0u">You Le</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p></p><p>Most teams still talk about privacy and model quality as if you can only have one.</p><p>Either you protect sensitive data, or you preserve enough context for the model to be useful.</p><p>That tradeoff sounds intuitive. It is also too simplistic.</p><p>Ideally with tools like <a href="https://github.com/dativo-io/talon">Talon</a>, raw PII never reaches the model. But there is a big difference between removing PII and removing meaning.</p><p>A flat placeholder like <code>[PHONE]</code> protects privacy, but it also hides the one thing the model may actually need to answer correctly: is this a German number, a Polish number, or a French one?</p><p>So I tested a different approach.</p><p>Instead of sending raw personal data to the model, Talon can replace it with structured placeholders that preserve only safe, task-relevant semantics. Not the original value. 
Just the minimum useful context.</p><p>And when I compared that enriched approach against legacy type-only redaction, the enriched version won in both evaluation runs.</p><p></p><h2>TL;DR</h2><p>I ran two 100-prompt A/B evaluations on Dativo Talon to test a simple question: if we redact PII before the model sees input, can we still preserve enough meaning for useful reasoning? Answer: <strong>yes</strong>. Enriched redaction (semantic placeholders) beat legacy type-only redaction in both runs, especially on attribute-dependent tasks like country routing and payment-method decisions.</p><h2>The setup</h2><p>I tested two variants.</p><p><strong>Variant A: legacy redaction</strong><br>The model sees flat placeholders such as <code>[PERSON]</code>, <code>[EMAIL]</code>, <code>[PHONE]</code>, <code>[IBAN]</code>.</p><p><strong>Variant B: enriched redaction</strong><br>The model sees structured placeholders such as:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;xml&quot;,&quot;nodeId&quot;:&quot;4dcd9da6-a431-45b7-b401-ebcf4cd7c036&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-xml">&lt;PII type="phone" country_code="PL"/&gt;</code></pre></div><p>or</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;xml&quot;,&quot;nodeId&quot;:&quot;51a67038-9818-4b39-b241-14e46c056955&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-xml">&lt;PII type="email" domain_type="free"/&gt;</code></pre></div><p>In both cases, raw PII is removed before model 
input.</p><p>This is not a comparison between &#8220;private&#8221; and &#8220;non-private.&#8221; Both variants keep raw personal data away from the model. The only difference is whether the model still receives safe semantic hints that matter for the task.</p><p>I ran two full experiments:</p><ul><li><p><strong>Run 1:</strong> <code>gpt-4o-mini</code>, 100 prompts. <a href="https://gist.github.com/sergeyenin/3e90542d43c8c58d3bf12e0743ae10dd">See full logs</a></p></li><li><p><strong>Run 2:</strong> <code>gpt-4o</code>, 100 prompts. <a href="https://gist.github.com/sergeyenin/218ce51662482288e1d24b0defe77edb">See full logs</a></p></li></ul><p>Each prompt was scored on four dimensions, from 1 to 10:</p><ol><li><p>attribute reasoning (&#8220;context&#8221;)</p></li><li><p>utility preservation</p></li><li><p>semantic coherence</p></li><li><p>helpfulness</p></li></ol><p>So each prompt had a maximum total score of <strong>40</strong>.</p><p></p><h2>What happened</h2><h3>Run 1 &#8212; <code>gpt-4o-mini</code> (N=100)</h3><ul><li><p><strong>A mean total:</strong> 23.87</p></li><li><p><strong>B mean total:</strong> 28.58</p></li><li><p><strong>Mean delta:</strong> +4.71</p></li><li><p><strong>Numeric wins:</strong> B 59, A 19, ties 22</p></li></ul><h3>Run 2 &#8212; <code>gpt-4o</code> (N=100)</h3><ul><li><p><strong>A mean total:</strong> 23.5</p></li><li><p><strong>B mean total:</strong> 30.2</p></li><li><p><strong>Mean delta:</strong> +6.7</p></li><li><p><strong>Numeric wins:</strong> B 63, A 26, ties 11</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7OoY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!7OoY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 424w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 848w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 1272w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7OoY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png" width="956" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:216,&quot;width&quot;:956,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38688,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/192355842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!7OoY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 424w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 848w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 1272w, https://substackcdn.com/image/fetch/$s_!7OoY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35d11069-6839-4fff-9f62-dcc17ecdfa60_956x216.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Enriched redaction wins in both runs.<br>The strongest lift is exactly where it should be: <strong>attribute reasoning</strong>.</p><h2>Where enrichment helped most</h2><p>Not all semantic attributes are equally valuable.</p><p>The strongest gains came from attributes that directly affect routing, jurisdiction, or payment behavior.</p><h3>Strongest lift</h3><ul><li><p><strong>PHONE &#8594; </strong><code>country_code</code></p></li><li><p><strong>IBAN &#8594; </strong><code>country_code</code></p></li></ul><p>This was the clearest signal in both runs.</p><p>If the model sees only <code>[PHONE]</code>, it cannot reliably decide whether the request belongs with Germany, Poland, France, or another support flow.</p><p>If it sees <code>&lt;PII type="phone" country_code="DE"/&gt;</code>, that ambiguity disappears without exposing the original number.</p><p>The same pattern showed up for IBANs. 
A country code is often enough to reason about SEPA, local handling, or country-specific banking logic.</p><h3>Moderate lift</h3><ul><li><p><strong>PERSON &#8594; </strong><code>gender</code></p></li><li><p><strong>EMAIL &#8594; </strong><code>domain_type</code></p></li></ul><p>These still helped, but less dramatically.</p><p>That also makes sense.</p><p>A corporate email domain or a gendered title can improve the answer, but the downstream task is often less deterministic than country-based routing. The model can sometimes get close with generic language even without the attribute.</p><h3>Weakest area</h3><ul><li><p><strong>LOCATION &#8594; </strong><code>scope</code> such as city, region, or country</p></li></ul><p>This was the least convincing category.</p><p>The issue was not necessarily the enrichment itself. It was the prompt design.</p><p>Too many location prompts could still be answered with generic legal boilerplate. If a question does not force the model to actually use the distinction between city, region, and country, then that attribute will not show its value.</p><p>So this is less &#8220;scope does not help&#8221; and more &#8220;the benchmark did not pressure-test scope hard enough.&#8221;</p><h2>What this means in practice</h2><p>This is the production lesson.</p><p>Most teams fall into one of two bad patterns.</p><p>The first is to send raw prompts with PII to the model and hope governance happens somewhere later.</p><p>The second is to over-redact everything into useless placeholder soup and then act surprised when the model starts guessing.</p><p>Neither is a good long-term design.</p><p>The better path is narrower and more disciplined:</p><ul><li><p>remove raw PII before the model sees it</p></li><li><p>preserve only the minimum safe semantics needed for reasoning</p></li><li><p>decide those semantics through policy</p></li><li><p>record evidence of what the model actually saw</p></li></ul><p>That is the operating model Talon is built around, 
and this evaluation supports it.</p><h2>Edge cases worth being honest about</h2><p>The result is strong, but not perfect.</p><p>A few caveats matter.</p><p><strong>1. Prompt quality still varied</strong></p><p>Some prompts were genuinely attribute-dependent. Others were only loosely so.</p><p>That matters because if a prompt can be answered with generic common sense, the benchmark becomes less discriminative.</p><p><strong>2. Judge behavior still has style bias</strong></p><p>In a few cases, longer and more generic answers scored surprisingly well, even when a shorter answer was more precise.</p><p>That is a familiar problem in LLM-as-judge evaluations.</p><p><strong>3. Order effects were more visible on the smaller model</strong></p><p>I saw a bit more sensitivity in the <code>gpt-4o-mini</code> run than in the <code>gpt-4o</code> run.</p><p>Not enough to change the direction of the result, but enough to keep in mind.</p><p><strong>4. Location-scope prompts need to be redesigned</strong></p><p>This was the weakest benchmark segment and the one I would trust least in its current form.</p><p>So yes, the result is real. But some parts of the evaluation are stronger than others.</p><p></p><h2>What about cost?</h2><p>This is usually the first practical objection.</p><p>Semantic placeholders are longer. So do they make inference meaningfully more expensive?</p><p>In these runs, not in a way that mattered.</p><ul><li><p>In the <code>gpt-4o-mini</code> run, Variant B was slightly cheaper overall.</p></li><li><p>In the <code>gpt-4o</code> run, Variant B was materially cheaper in observed run cost.</p></li></ul><p>That does not mean enriched placeholders are inherently cheaper token-for-token. 
They are not.</p><p>It means end-to-end cost is dominated by full model behavior, especially output length and answer shape, not just placeholder size.</p><p>So the real takeaway is simpler:</p><p><strong>The quality gain was clear, and the added redaction structure did not create a practical cost penalty.</strong></p><p></p><h2>Why This Matters for Production</h2><p>Most teams still run one of two broken patterns:</p><ol><li><p>send raw prompts with PII and hope policy catches up later, or</p></li><li><p>over-redact into useless <code>[TYPE]</code> soup and lose utility.</p></li></ol><p>There is a better middle path:</p><ul><li><p>remove raw PII from model input</p></li><li><p>preserve a minimal, safe semantic layer</p></li><li><p>apply policy-as-code to which attributes are allowed</p></li><li><p>keep evidence for what the model actually saw</p></li></ul><p>That is what this experiment validates.</p><h2>What we are changing next in Talon</h2><p>Based on these runs, here is what I would do next:</p><ol><li><p><strong>Keep enriched redaction as the default</strong> for supported models.</p></li><li><p><strong>Improve the location-scope benchmark</strong>, especially prompt quality and dictionaries.</p></li><li><p><strong>Use a separate judge model</strong> and add a small human-reviewed subset.</p></li><li><p><strong>Run multi-seed evaluations with confidence intervals</strong> in CI.</p></li></ol><p>The core result is already useful. But the next version of the benchmark should be harder, cleaner, and more defensible.</p><h2>Final Thought</h2><p>Privacy-preserving AI does not need to be blind AI.</p><p>If your redaction layer removes everything useful, the model will guess.<br>If your redaction layer keeps safe semantics, the model can still reason.</p><p>This is not a theoretical point. We measured it across 200 prompt pairs.</p><p>No raw PII to the model. 
Better answers anyway.</p>]]></content:encoded></item><item><title><![CDATA[Why AI Agents Work in Demos but Fail in Production]]></title><description><![CDATA[Unless you&#8217;re doing it right&#8212;and right now, almost nobody is]]></description><link>https://blog.dativo.io/p/why-ai-agents-work-in-demos-but-fail</link><guid isPermaLink="false">https://blog.dativo.io/p/why-ai-agents-work-in-demos-but-fail</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Wed, 18 Mar 2026 10:35:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fYSI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fYSI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fYSI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!fYSI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!fYSI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fYSI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fYSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png" width="1024" height="1536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1536,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2457871,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/191348674?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fYSI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!fYSI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 848w, 
https://substackcdn.com/image/fetch/$s_!fYSI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!fYSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dcf8ed5-0992-4f6a-9aa9-2755bb4bee88_1024x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The AI silver bullet doesn&#8217;t exist either :(</figcaption></figure></div><p>The current obsession with 
model &#8220;intelligence&#8221; is a failure of engineering discipline. CTOs and senior architects are chasing leaderboard scores like teenagers chasing fashion trends, only to watch their multi-agent systems (MAS) implode the moment they hit production.</p><p>The primary cause of multi-agent failure isn&#8217;t &#8220;weak models.&#8221; Every week, a new model tops another benchmark. Every month, another company claims its agents are more autonomous, more intelligent, more capable. And yet the same thing keeps happening in production: multi-agent systems look impressive in demos, then fall apart under real load. Not because the models are weak, but because the framework is.</p><p>The industry is solving the wrong problem. We are building systems as if LLMs are deterministic components, ignoring the reality that uncertainty at every agent handoff creates a multiplicative decay in reliability. High-performing models in a controlled notebook are a demo-day vanity metric. In the real world, &#8220;stochastic hope&#8221; is an engineering liability. 
Reliable agentic systems are not built by choosing better models; they are built by engineering rigid data boundaries and treating the entire system as a distributed data pipeline.</p><p>A multi-agent system is not &#8220;a group of smart agents working together&#8221;. It is a <strong>distributed pipeline of untrusted intermediate states</strong>.</p><p></p><h1>Software engineering is dead, long live software engineering</h1><h2><strong>Agentic Systems as Probabilistic Pipelines</strong></h2><p>When you wire multiple agents together, you are building a series-system pipeline <a href="https://www.oreilly.com/radar/the-hidden-cost-of-agentic-failure/?utm_medium=email&amp;utm_source=platform+b2c&amp;utm_campaign=rediscover&amp;utm_content=canceled+20260317">governed by <strong>Lusser&#8217;s Law</strong></a>. In reliability engineering, this is the logic of a <strong>series system</strong>: if a workflow requires multiple sequential steps, the total success rate is the product of the success rates of the individual steps, assuming the steps fail independently.</p><p>A single agent can look excellent in isolation, and that is exactly the trap. A model with 98% task accuracy sounds production-ready, but production systems are judged end-to-end.</p><p>
If each hop succeeds with probability <em>p</em>, then an <em>n</em>-step workflow succeeds with probability:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(\\text{system success}) = p^n&quot;,&quot;id&quot;:&quot;HFEVPHFUJV&quot;}" data-component-name="LatexBlockToDOM"></div><p>Even with stronger agents backed by better models, the decay is sharp:</p><ul><li><p>1 agent at 98% &#8594; Total success: 98.0%</p></li><li><p>5 agents at 98% &#8594; Total success: 90.4%</p></li><li><p>10 agents at 98% &#8594; Total success: 81.7%</p></li></ul><p>One bad output does not just fail locally; it becomes state. The next agent reads it, trusts it, and reasons on top of it. A hallucinated tool response doesn&#8217;t just reduce the chance of success at one step; it <strong>poisons the steps that follow</strong>.</p><p>On the production floor, this mathematical decay manifests as destructive pathologies that eat your budget and kill your uptime:</p><ul><li><p><strong>Silent Schema Drift:</strong> A model outputs slightly malformed JSON or an unexpected type. Without a validation gate, this corrupted state propagates downstream. Subsequent agents then condition their &#8220;reasoning&#8221; on garbage, leading to catastrophic cascades.</p></li><li><p><strong>Hallucinated Tool Outputs:</strong> Agents often condition their next action on a false return from an unvalidated tool call. Without a control plane to verify the return, the error remains invisible until the system produces a hallucinated final result.</p></li><li><p><strong>Unvalidated Handoffs:</strong> This is the peak of &#8220;stochastic hope&#8221;: passing raw strings between agents and praying the next model correctly parses the intent. It is the architectural equivalent of using <code>eval()</code> on untrusted user input.</p></li><li><p><strong>Operational Death Spirals:</strong> Recursive reasoning loops where supervisors fail to reach a terminal state. 
These loops consume thousands of tokens in seconds, draining API budgets without making an inch of progress toward the objective.</p></li></ul><p>This is the key mindset shift: a multi-agent system is not &#8220;a set of smart models collaborating.&#8221; It is a <strong>distributed pipeline of untrusted intermediate states</strong>.</p><p>Once you see it that way, the engineering answer becomes obvious. You do not solve the problem by making every model slightly better. You solve it by inserting <strong>contracts, validation gates, and control points</strong> so the system can survive when one hop is wrong.</p><h3><strong>The Analogy: MAS are Untyped Distributed Pipelines</strong></h3><p>The modern multi-agent stack looks a lot like the messy early days of data engineering. We are passing intermediate state between stages without schemas, relying on downstream logic to &#8220;figure out&#8221; malformed upstream outputs.</p><p>This isn&#8217;t an AI problem. It&#8217;s a <strong>data reliability problem</strong>. Until we treat agentic handoffs as formal contracts, these systems will never scale.</p><h1><strong>The Missing Layer: Contracts, Validation, and Control</strong></h1><p>The only way to break the multiplicative decay of Lusser&#8217;s Law is to introduce gates. By verifying an output before it reaches the next agent, you change the &#8220;reliability math.&#8221;</p><p>The <strong>Effective Probability formula</strong> </p><p></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;p_{\\text{effective}} = p + (1 - p) \\cdot v&quot;,&quot;id&quot;:&quot;ZOKOFDPYDM&quot;}" data-component-name="LatexBlockToDOM"></div><p>shows us the way out. By applying a validation catch rate (<em>v</em>), the fraction of failed outputs that a gate catches and repairs (for example, through a retry), you recover failures before they propagate. A 98% accurate agent with a 90% validation catch rate becomes 99.8% effective. 
Over 10 hops, that is the difference between a workflow that succeeds 81.7% of the time and one that succeeds about 98% of the time.</p><p></p><h1>Recipes for a Good Life with AI Agents</h1><h2>Recipe 1: Pydantic + Instructor for Handoff Contracts</h2><p>The first job is to stop bad state from propagating. That means validation gates.</p><p><strong>Rule: Never pass raw LLM output to the next agent</strong>. Use <a href="https://docs.pydantic.dev/latest/">Pydantic</a> to define a contract and Instructor to validate the model&#8217;s output against it, re-prompting when validation fails.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;373b9313-24ae-4023-bccb-aaf2f587c471&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, model_validator
from typing import Literal, Optional

client = instructor.from_openai(OpenAI(), max_retries=2)

class TicketDecision(BaseModel):
    action: Literal["approve", "reject", "escalate"]
    ticket_id: str
    risk_score: float = Field(ge=0, le=1)
    reason: str = Field(min_length=20, max_length=500)
    approver_id: Optional[str] = None

    @model_validator(mode="after")
    def enforce_business_rules(self):
        if self.action == "approve" and self.risk_score &gt; 0.7:
            raise ValueError("high-risk tickets cannot be auto-approved")
        return self</code></pre></div><p>This shifts the system from &#8220;trust the prompt&#8221; to &#8220;trust only validated state&#8221;.</p><h2>Recipe 2: Best-of-N + Controlled Ranking</h2><p>Validation prevents malformed outputs, but it doesn&#8217;t tell you which valid output is <em>best</em>. For complex tasks, use <strong>Best-of-N generation</strong> followed by a ranking step (like <a href="https://openpipe.ai/blog/ruler">RULER</a>).</p><ol><li><p><strong>Generate:</strong> Create 4 candidates.</p></li><li><p><strong>Validate:</strong> Filter out any that fail the Pydantic schema.</p></li><li><p><strong>Rank:</strong> Use a judge model to pick the winner based on a specific rubric (correctness, policy, clarity).</p></li></ol><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;d2251207-6ded-4468-a66a-eccdb187a28e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from typing import List
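import instructor
from openai import OpenAI
from pydantic import BaseModel

# Same Instructor-patched client as in Recipe 1
client = instructor.from_openai(OpenAI())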

# 1. Define the Contract
class AnalysisReport(BaseModel):
    summary: str
    sentiment: str
    confidence_score: float

# 2. Generate N Candidates
def generate_candidates(task: str, n: int = 4) -&gt; List[AnalysisReport]:
    candidates = []
    for _ in range(n):
        try:
            # Each call is a stochastic draw
            res = client.chat.completions.create(
                model="gpt-4o",
                response_model=AnalysisReport,
                messages=[{"role": "user", "content": task}]
            )
            candidates.append(res)
        except Exception:
            continue # Skip malformed candidates
    return candidates

# 3. Apply RULER (The Judge)
def ruler_rank(candidates: List[AnalysisReport], task: str) -&gt; AnalysisReport:
    # We ask a stronger model to act as the 'RULER' judge
    judge_prompt = f"""
    Task: {task}
    Candidates: {candidates}
    
    Rank these candidates based on:
    1. Depth of insight in the 'summary'
    2. Alignment between 'sentiment' and 'summary'
    3. Realistic 'confidence_score' (avoid overconfidence)
    
    Return only the best candidate's index.
    """
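    # Sketch of one way to pick the winner: ask the judge for a validated index.
    # (The Verdict model and the judge model name are illustrative assumptions.)
    class Verdict(BaseModel):
        winner_index: int

    verdict = client.chat.completions.create(
        model="gpt-4o",
        response_model=Verdict,
        messages=[{"role": "user", "content": judge_prompt}],
    )
    winner_index = verdict.winner_index
    if winner_index not in range(len(candidates)):
        winner_index = 0  # fall back to the first candidate on a bad index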
    return candidates[winner_index]</code></pre></div><p></p><h2>Recipe 3: Talon for Bounded Search and Budget Control</h2><p>Local validation is a start, but production reliability requires an <strong>external control gateway</strong>. This is where <strong><a href="https://github.com/dativo-io/talon">Talon</a></strong> fits.</p><p>Search-based reliability (like Best-of-N or multi-step reasoning) is a double-edged sword. Without a control plane, a "smart" supervisor might trigger an infinite loop of retries, re-rankings, and judge calls. This leads to "test-time bankruptcy," where a single user request consumes hundreds of dollars in tokens. Talon places hard caps on the reasoning process itself.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;b152a267-ab15-402b-937c-c21ac86f68fd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">policies:
  session_limits:
    max_cost: 2.50          # Absolute dollar cap per trace
    max_candidates: 4       # Limit Best-of-N generation width
    max_judge_calls: 2      # Limit the number of re-evaluations</code></pre></div><h2>Recipe 4: Talon as the Validated Commit Boundary</h2><p>In production, you should never give an LLM &#8220;raw&#8221; write access to your database or APIs. An agent is a stochastic engine; if it hallucinates a <code>DELETE</code> flag or an unbounded <code>LIMIT</code>, the damage is instantaneous. Talon acts as a &#8220;Commit Wrapper,&#8221; forcing every tool call to pass through a deterministic governance layer before execution.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;61276251-1b99-4295-a5bb-e4dac109417c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">tool_policies:
  update_customer_records:
    max_row_count: 100            # Block "Update All" hallucinations
    require_dry_run: true         # Force a simulation first
    forbidden_argument_values:
      mode: ["truncate", "drop"]  # Block destructive operations
    arguments:
      query: redact               # Strip PII from logs/traces
    timeout: "15s"                # Kill runaway tool executions</code></pre></div><p></p><h2>Recipe 5: Talon Idempotency to Stop Duplicate Side Effects</h2><p>Retries are a requirement for reliability, but they are dangerous for side-effecting tools like sending emails or charging credit cards. If an upstream planner fails and retries the entire sequence, you risk executing the same action twice. Talon tracks the intent of a tool call, ensuring that repeated calls with the same parameters do not trigger duplicate external actions.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:&quot;2bcfde40-9bc7-49f2-ae2c-2587ed62b1c6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">tool_governance:
  send_notification_email:
    idempotency_key: "request_id" # Link to the unique session ID
    cache_ttl: "24h"              # Prevent double-send within a window
    on_duplicate: "return_cached" # Return the original success response
    strict_mode: true             # Fail if idempotency cannot be verified</code></pre></div><p></p><h1><strong>Engineering by Design</strong></h1><p>Engineering reliable massive agentic systems requires two fundamental shifts in leadership perspective:</p><ol><li><p><strong>From &#8220;Models are smart&#8221; to &#8220;Systems must be safe under uncertainty.&#8221;</strong> Assume the model will fail. Build the safety net first.</p></li><li><p><strong>From &#8220;Prompt Engineering&#8221; to &#8220;System Architecture.&#8221;</strong> Reliability is a function of boundary enforcement and budget control, not the phrasing of a system prompt.</p></li></ol><p>Agentic systems do not become reliable by chance; they become reliable by design. Scalable AI is a data engineering challenge involving state management, contract enforcement, and cost governance. By implementing strict validation boundaries, using Talon as a control plane, and amortizing intelligence through Reinforcement Learning, you move from &#8220;stochastic hope&#8221; to production-ready infrastructure. Reliability is a choice. Make it.</p><p></p>]]></content:encoded></item><item><title><![CDATA[LLM APIs Have No Seatbelts. 
I Built One.]]></title><description><![CDATA[Why People Are Putting a Reverse Proxy in Front of Their AI Traffic]]></description><link>https://blog.dativo.io/p/llm-apis-have-no-seatbelts-i-built</link><guid isPermaLink="false">https://blog.dativo.io/p/llm-apis-have-no-seatbelts-i-built</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Sat, 28 Feb 2026 15:24:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Pehe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>TL;DR</strong>: LLM APIs don&#8217;t ship with the controls you&#8217;d expect from any other piece of infrastructure &#8212; no per-caller auth, no tool restrictions, no cost ceiling. I got burned (meeting invitations I never meant to send, a $385 weekend loop, private data such as IBANs in plaintext) and built a reverse proxy that fixes it. 
Here&#8217;s how three features work in practice, with the exact configs I run.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pehe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pehe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pehe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3795710,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/189337071?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pehe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Pehe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9850867b-aa32-4812-9895-a097dfc96032_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>It was a Sunday afternoon. I was building a personal scheduling agent &#8212; the kind that reads your calendar, finds gaps, and books meetings automatically. Super useful for coordinating squash or catching up with friends. I&#8217;d been hacking on it for a couple of days and wanted to test the full flow end-to-end.</p><p>I needed test contacts who would respond, so I exported a few from my phone &#8212; figured I&#8217;d use people I actually know. My friends Marek and Tomek, and a couple of others. I told the agent to &#8220;book some test meetings for next week&#8221; and went to make coffee.</p><p>By the time I got back, it had sent real calendar invites. To all of them. For a meeting titled &#8220;Test Meeting 3&#8221; with no agenda, no description, nothing. 
Marek texted me: &#8220;what is this?&#8221; Lukasz sent a meme in response. Tomek ignored it. Fine.</p><p>But there was a fourth contact in that export I&#8217;d forgotten about &#8212; someone from a networking event six months ago. I barely remembered his name. He accepted the invite without replying.</p><p>Monday morning he showed up on the call. I had no idea who was joining or why. I spent the next ten minutes pretending this was intentional.</p><p>The agent did exactly what I asked. &#8220;Book meetings.&#8221; With exactly the data I gave it. Nothing in between said &#8220;these are real people, maybe confirm first.&#8221;</p><p>That was the moment I understood the problem. Not that the agent was broken. That I had no layer between its decisions and the real world.</p><p></p><h2>The Real Issue: LLM APIs Ship Without Operational Controls</h2><p>After my calendar incident I started looking at how other teams run their agents. I talked to a friend at a mid-size fintech &#8212; five departments, three different API keys, zero idea what they were spending. Last month someone grep&#8217;d the logs during an unrelated investigation and found customer IBANs going to GPT-4 in plaintext. Thousands of requests over four months. Nobody had noticed because the bot worked great.</p><p>Different setups. Same gap.</p><blockquote><p><strong>&#8220;We have no idea what our agents are sending, what they&#8217;re allowed to do, or what they&#8217;re costing us.&#8221;</strong></p></blockquote><p>It&#8217;s not because anyone is careless. It&#8217;s because <strong>LLM APIs don&#8217;t ship with operational controls</strong>. There&#8217;s no per-caller identity. No way to say &#8220;this bot can&#8217;t see destructive tools.&#8221; No cost ceiling that actually shuts the door. 
You get an API key, you call the endpoint, and whatever the client sends goes straight through.</p><p>Every other piece of infrastructure I&#8217;ve run &#8212; databases, message queues, HTTP backends &#8212; has a proxy layer with auth, rate limiting, and observability. LLM traffic had none of that.</p><p>So I built one. It became the gateway component of <strong>Dativo Talon</strong> &#8212; an open-source tool I&#8217;ve been working on. A reverse proxy that sits between your clients and the LLM provider, identifies each caller, and applies policy before forwarding. One Go binary, one YAML config.</p><p>Here&#8217;s how the three features I needed most work in practice.</p><h2>1. Tool Filtering &#8212; So the Model Never Learns <code>calendar_invite</code> Exists</h2><p>This is the feature I built first, because it directly solves what happened to me.</p><p>My scheduling agent had five tools: read_calendar, find_gaps, create_draft, calendar_invite, and send_reminder. The first three are <em>safe</em> &#8212; they read data or create local drafts. The last two reach the real world. And the model couldn&#8217;t tell the difference, because I&#8217;d given it all five.</p><p>I didn&#8217;t need to remove calendar_invite from my code. I needed to remove it from what the model sees during testing.</p><p>That&#8217;s what the gateway does. It inspects the tools array in the JSON body before the request reaches OpenAI. Any tool matching a forbidden pattern gets stripped. The model never learns it exists. It can&#8217;t call calendar_invite if it was never told about calendar_invite.</p><blockquote><p>Tool filtering is <strong>prevention</strong>, not detection. By the time you intercept a tool call, the model already decided to make it. The gateway removes the option before the decision happens.</p></blockquote><p>Here&#8217;s what the config looks like:</p><pre><code><code>gateway:
  default_policy:
    # "filter" = silently strip matching tools before the model sees them
    tool_policy_action: "filter"
    forbidden_tools:
      - "calendar_invite"  # the tool that sent three real meeting invites
      - "send_*"           # matches send_email, send_reminder, send_sms
      - "delete_*"         # matches delete_thread, delete_emails
      - "admin_*"
      - "bulk_*"
      - "drop_*"</code></code></pre><p>What this looks like in practice:</p><pre><code><code># What OpenAI sees WITHOUT the gateway:
tools: [read_calendar, find_gaps, create_draft, calendar_invite, send_reminder]

# What OpenAI sees WITH the gateway:
tools: [read_calendar, find_gaps, create_draft]</code></code></pre><p>The model gets the read tools and the draft tool. It can plan meetings and prepare invites all day long. But it can&#8217;t <em>send</em> anything, because it doesn&#8217;t know sending is an option.</p><p>Patterns use glob syntax, case-insensitive. send_* matches send_email, send_reminder, Send_SMS. The lists are additive across levels &#8212; default policy, provider, and per-caller overrides all merge into one set.</p><p>Two modes:</p><ol><li><p>filter (default) &#8212; silently removes forbidden tools, forwards the rest. The agent keeps working; it just can&#8217;t see the ones that reach the real world.</p></li><li><p>block &#8212; rejects the entire request with HTTP 403 if any forbidden tool is present.</p></li></ol><p>You can also go the other direction with a per-caller <strong>allowlist</strong>. Only the tools you name get through:</p><pre><code><code>callers:
  - name: "scheduling-agent"
    api_key: "talon-gw-sched-001"
    tenant_id: "default"
    policy_overrides:
      # strict allowlist: ONLY these tools pass through
      allowed_tools: ["read_calendar", "find_gaps", "create_draft"]
      tool_policy_action: "block"</code></code></pre><p>Now the agent can read, search, and draft. Nothing else. If I later wire up calendar_invite or send_email, the model will never see it unless I explicitly add it to the allowlist. This is the config I run for any agent that&#8217;s still in testing &#8212; default to read-only, unlock write tools deliberately.</p><p>The same principle would have prevented the OpenClaw incident. One forbidden_tools: ["delete_*"] line and the model would never have known deletion was an option.</p><p>Test it yourself:</p><pre><code><code>curl -s -X POST http://localhost:8080/v1/proxy/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer talon-gw-sched-001" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role":"user","content":"Book a test meeting for next Tuesday"}],
    "tools": [
      {"type":"function","function":{"name":"find_gaps","parameters":{}}},
      {"type":"function","function":{"name":"calendar_invite","parameters":{}}}
    ]
  }'</code></code></pre><p>calendar_invite gets stripped. The model only sees find_gaps. It can find the time slot, but it can&#8217;t book anything. The evidence record logs which tools were requested, filtered, and forwarded &#8212; signed and timestamped.</p><h2>2. PII-Based Routing &#8212; Because I Fed the Model Real Email Addresses</h2><p>Here&#8217;s the thing I didn&#8217;t appreciate until after the calendar incident: the tool wasn&#8217;t the only problem. The data was the problem too. I fed my agent real contact email addresses, and those went straight to OpenAI as part of the prompt. Even if I&#8217;d blocked calendar_invite, the model would still have seen sarah.chen@company.com and marcus.klein@bigcorp.de in its context window. Those are real people&#8217;s real email addresses sitting on OpenAI&#8217;s servers.</p><p>The fintech story made it worse. Their support bot was summarising customer tickets, and those tickets contained IBANs, email addresses, phone numbers. Thousands of requests over four months. All of it went to GPT-4 in plaintext.</p><p>Under pii_action: "block", every one of those requests would have been rejected before reaching OpenAI. Under "redact", the IBANs would have been replaced with [REDACTED:iban] and the emails with [REDACTED:email] before the model saw them. Either way, four months of undetected PII leakage doesn&#8217;t happen.</p><p>Every request that hits the gateway goes through a PII classifier first. It scans for personal data patterns &#8212; IBANs, emails, phone numbers, tax IDs &#8212; and assigns a data tier (0, 1, or 2) based on what it finds. That tier feeds into what happens next.</p><p>Four actions, two directions. On the request side (what the client sends): allow passes through, warn logs to evidence, redact replaces PII with [REDACTED:type] before forwarding, block rejects with HTTP 400. 
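</p><p>To make the request-side redact action concrete, here is a minimal Python sketch. The two regular expressions and the redact helper are simplified assumptions for illustration only; they are not Talon&#8217;s actual classifier, which this post does not show:</p><pre><code><code>import re

# Simplified, illustrative patterns; a real classifier needs far more care.
PATTERNS = {
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text):
    """Replace each match with [REDACTED:type] and report the types found."""
    found = []
    for kind, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{kind}]", text)
        if count:
            found.append(kind)
    return text, found


clean, kinds = redact("Refund DE89370400440532013000 to sarah.chen@company.com")
print(clean)   # the prompt that would be forwarded
print(kinds)   # the PII types that would be logged as evidence
</code></code></pre><p>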
On the response side (what the model sends back): same four actions, with block returning HTTP 451.</p><p>Different callers get different treatment:</p><pre><code><code>callers:
  - name: "internal-analytics"
    api_key: "talon-gw-analytics-001"
    tenant_id: "default"
    team: "data"
    policy_overrides:
      pii_action: "warn"             # log PII, forward unchanged &#8212; need to iterate fast
      response_pii_action: "warn"
      max_data_tier: 1               # deny tier 2 (high-sensitivity) requests

  - name: "customer-facing-bot"
    api_key: "talon-gw-custbot-002"
    tenant_id: "default"
    team: "support"
    policy_overrides:
      pii_action: "redact"           # IBANs, emails &#8594; [REDACTED:type] before OpenAI sees them
      response_pii_action: "redact"  # redact PII in model responses too
      max_data_tier: 0               # only public/anonymised data allowed

  - name: "scheduling-agent-dev"
    api_key: "talon-gw-sched-dev-001"
    tenant_id: "default"
    team: "engineering"
    policy_overrides:
      pii_action: "redact"           # would have caught sarah.chen@company.com
      response_pii_action: "warn"</code></code></pre><p>Internal analytics gets warn &#8212; I see what PII is flowing, but the team can iterate. Customer-facing bot gets redact &#8212; any IBAN or email in the prompt becomes <strong>[REDACTED:iban]</strong> before it touches OpenAI. My scheduling agent in dev gets redact &#8212; so even if I&#8217;m lazy and paste real contacts into the test prompt, the gateway scrubs them before the model sees them. Sunday-afternoon-proof.</p><p>The max_data_tier adds a second gate. If the classifier tags a request as tier 2 (high-sensitivity) but the caller is only cleared for tier 0, the policy engine denies it regardless of the PII action. Your customer-facing bot can&#8217;t accidentally process data it was never supposed to see.</p><p>Response scanning works for both streaming (SSE) and non-streaming. For streams, the gateway buffers the full response, scans, and forwards the original events if clean or rewrites them if redaction is needed.</p><p>Every PII detection &#8212; both directions &#8212; ends up in the evidence store. talon audit list shows which requests contained PII, what types, and what action was taken. No log grepping.</p><div><hr></div><h2>3. Cost Caps &#8212; I Burned $385 on a Saturday. Here&#8217;s How I Made Sure It Never Happens Again.</h2><p>Different weekend, different mistake. I left a test loop running &#8212; GPT-4, increasingly long context windows, no stop condition. By Sunday evening: $385 in API charges on a project budgeted at $20/month.</p><p>I seem to learn everything on weekends.</p><p>That Monday I added cost caps to the gateway. Least interesting feature to build, most money saved.</p><p>Here&#8217;s the thing most teams don&#8217;t realise: <strong>you find out about a cost overrun when the monthly invoice arrives.</strong> OpenAI&#8217;s usage dashboard updates, but there&#8217;s no hard stop. No circuit breaker. 
A gateway that blocks at the daily limit is fundamentally different from a provider alert that shows up 30 days later.</p><p>Every request gets a cost estimate based on the model and token count. The gateway tracks daily and monthly spend per caller by querying the evidence store &#8212; the same SQLite database that holds audit records. When a caller hits the cap, the next request gets a 403. No grace period.</p><pre><code><code>callers:
  - name: "production-agent"
    api_key: "talon-gw-prod-001"
    tenant_id: "default"
    policy_overrides:
      max_daily_cost: 50.00          # hard cap: 403 after $50/day
      max_monthly_cost: 1000.00

  - name: "dev-sandbox"
    api_key: "talon-gw-dev-002"
    tenant_id: "default"
    policy_overrides:
      max_daily_cost: 5.00           # weekend loops die at $5, not $385
      max_monthly_cost: 50.00

default_policy:
  max_daily_cost: 100.00             # global ceiling for callers without overrides
  max_monthly_cost: 2000.00</code></code></pre><p>production-agent gets $50/day. dev-sandbox gets $5/day. If I leave another loop running on a Saturday, the gateway kills it at $5 instead of letting it burn for 48 hours.</p><p>The CLI tells you where you stand:</p><pre><code><code>talon costs --tenant default

# Agent             Today ($)   Month ($)   Limit (day)   Limit (month)
# production-agent      22.10     487.30         50.00        1000.00
# dev-sandbox            1.80      28.70          5.00          50.00
# support-bot            0.80      15.20          &#8212;             &#8212;
# Total                 24.70     531.20        100.00        2000.00</code></code></pre><p>Every evidence record includes model_used, cost, input_tokens, output_tokens, and duration_ms. Export with talon audit export --format csv and you can answer: which model is burning the most, which caller is growing fastest, where tokens are wasted on retries.</p><p>Rate limiting complements cost caps for the speed-of-spend problem. Cost caps say &#8220;no more than $50 today.&#8221; Rate limits say &#8220;no more than 60 requests per minute.&#8221; Together they catch both the slow bleed and the fast burst:</p><pre><code><code>rate_limits:
  global_requests_per_min: 300       # shared across all callers
  per_caller_requests_per_min: 60    # per-caller cap &#8212; slows runaway agents</code></code></pre><div><hr></div><h2>A Full Config &#8212; All Three Together</h2><p>Here&#8217;s what I actually run for three callers, each with different tool, PII, and cost policies:</p><pre><code><code>gateway:
  enabled: true
  listen_prefix: "/v1/proxy"
  mode: "enforce"

  providers:
    openai:
      enabled: true
      secret_name: "openai-api-key"          # real key in encrypted vault, never in client config
      base_url: "https://api.openai.com"
      allowed_models: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"]

  callers:
    - name: "production-agent"
      api_key: "talon-gw-prod-001"           # caller token &#8212; not the OpenAI key
      tenant_id: "default"
      team: "engineering"
      allowed_providers: ["openai"]
      policy_overrides:
        max_daily_cost: 50.00
        max_monthly_cost: 1000.00
        pii_action: "redact"                 # scrub PII from requests
        response_pii_action: "warn"          # log PII in responses, don't block
        allowed_models: ["gpt-4o", "gpt-4o-mini"]
        forbidden_tools: ["delete_*", "admin_*", "drop_*", "send_*"]

    - name: "internal-bot"
      api_key: "talon-gw-internal-001"
      tenant_id: "default"
      team: "support"
      allowed_providers: ["openai"]
      policy_overrides:
        max_daily_cost: 10.00
        max_monthly_cost: 200.00
        pii_action: "redact"
        response_pii_action: "redact"
        allowed_tools: ["search_kb", "read_ticket", "create_draft"]  # strict allowlist
        tool_policy_action: "block"          # reject if any other tool appears

    - name: "dev-sandbox"
      api_key: "talon-gw-dev-002"
      tenant_id: "default"
      team: "engineering"
      allowed_providers: ["openai"]
      policy_overrides:
        max_daily_cost: 5.00                 # Saturday-proof
        max_monthly_cost: 50.00
        pii_action: "redact"                 # scrub real emails from test prompts
        allowed_models: ["gpt-4o-mini"]      # cheapest model only

  default_policy:
    default_pii_action: "warn"
    response_pii_action: "warn"
    max_daily_cost: 100.00
    max_monthly_cost: 2000.00
    require_caller_id: true
    log_prompts: true
    tool_policy_action: "filter"
    forbidden_tools:
      - "delete_*"
      - "admin_*"
      - "export_all_*"
      - "bulk_*"
      - "rm_*"
      - "drop_*"
    attachment_policy:
      action: "warn"
      injection_action: "block"              # block prompt injection in file attachments
      max_file_size_mb: 10

  rate_limits:
    global_requests_per_min: 300
    per_caller_requests_per_min: 60

  timeouts:
    connect_timeout: 10s
    request_timeout: 120s
    stream_idle_timeout: 60s</code></code></pre><p>Three callers, three risk profiles. <strong>production-agent</strong> gets a generous budget, PII redaction on input, and a blocklist of destructive and send tools. <strong>internal-bot</strong> gets a strict allowlist (three tools, nothing else), PII redaction both ways, and a tighter budget. <strong>dev-sandbox</strong> gets the cheapest model, PII redaction (no more testing with real emails), and a $5/day ceiling.</p><p>The clients don&#8217;t know about any of this. They point at the gateway URL with their caller key. The gateway does the rest.</p><div><hr></div><h2>When This Is the Wrong Choice</h2><p>A gateway adds a hop. If you&#8217;re running a single script on your laptop and you&#8217;re the only user, it&#8217;s overhead for no benefit.</p><p>If you need the absolute lowest first-token latency and you&#8217;re at the edge, PII scanning on streaming responses adds buffering time. The passthrough path (pii_action: "allow") is ~1ms overhead, but redaction on a long stream is measurable.</p><p>If your agents only have read-only tools and never touch sensitive data, the risk profile is lower. Still worth auditing, but the urgency drops.</p><p>If you&#8217;re not dealing with customer PII yet &#8212; pre-revenue, purely internal &#8212; the compliance angle is less pressing. But the moment you start processing real user data or fall under NIS2 scope, the gateway goes from &#8220;nice to have&#8221; to &#8220;how did we not have this.&#8221;</p><p>And if you need policy on every individual tool invocation &#8212; not just what the model is told about, but what happens when the tool runs &#8212; a gateway isn&#8217;t enough. 
That&#8217;s a different shape: an MCP proxy or full agent runner with per-tool policy.</p><div><hr></div><h2>Final Thought</h2><p>Calendar invites to real people. $385 on a weekend loop. IBANs in plaintext for four months. Every one of those happened because there was nothing between the agent and the API &#8212; no filter on what tools the model could see, no scan on what data was in the prompt, no ceiling on what it could spend.</p><p>The fix is the same pattern we&#8217;ve been using on HTTP traffic for twenty years: a reverse proxy with policy. It just hadn&#8217;t been applied to LLM APIs yet.</p><p><strong><a href="https://github.com/dativo-io/talon">talon init</a></strong> takes fifteen minutes. The difference between &#8220;the agent booked a meeting with a stranger&#8221; and &#8220;the agent tried and the gateway said no&#8221; is one YAML file.</p><p>Check out <a href="https://github.com/dativo-io/talon">GitHub</a> for more.</p>]]></content:encoded></item><item><title><![CDATA[I Gave OpenClaw a Kill Switch Before It Could Decide for Itself.]]></title><description><![CDATA[I Watched OpenClaw Delete a Meta Director's Inbox. 
And decided I need a kill switch &#8212; before the agent decides for me.]]></description><link>https://blog.dativo.io/p/i-was-exicited-and-scared-of-openclaw</link><guid isPermaLink="false">https://blog.dativo.io/p/i-was-exicited-and-scared-of-openclaw</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Thu, 26 Feb 2026 22:25:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_IdR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I like OpenClaw. I use it for many personal things: calling me with reminders about doctor appointments, searching for open-source GitHub projects, you name it! It&#8217;s fast, it&#8217;s hackable, and it connects to basically everything.</p><p>And then it deleted a Meta director&#8217;s emails.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_IdR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_IdR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 424w, https://substackcdn.com/image/fetch/$s_!_IdR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 848w, 
https://substackcdn.com/image/fetch/$s_!_IdR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 1272w, https://substackcdn.com/image/fetch/$s_!_IdR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_IdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&#8216;This should terrify you&#8217;: Meta Superintelligence safety director lost control of her AI agent&#8212;it deleted her emails&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="&#8216;This should terrify you&#8217;: Meta Superintelligence safety director lost control of her AI agent&#8212;it deleted her emails" title="&#8216;This should terrify you&#8217;: Meta Superintelligence safety director lost control of her AI agent&#8212;it deleted her emails" 
srcset="https://substackcdn.com/image/fetch/$s_!_IdR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 424w, https://substackcdn.com/image/fetch/$s_!_IdR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 848w, https://substackcdn.com/image/fetch/$s_!_IdR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 1272w, https://substackcdn.com/image/fetch/$s_!_IdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cb371f4-7aa9-427d-994e-1e8767fd322d_1920x1080.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you missed it: <a href="https://www.businessinsider.com/meta-ai-alignment-director-openclaw-email-deletion-2026-2?IR=T">in February 2026, an OpenClaw agent connected to the inbox of a Meta AI director went on a speed-run</a>. It mass-deleted emails, ignored stop commands, ran up costs in minutes, and kept going even after the user tried to shut it down. The context window compacted and the agent lost track of the original instructions. It just&#8230; decided deleting was the task.</p><p><strong>That scared me</strong>. Not because OpenClaw is broken &#8212; it&#8217;s a great agent runtime. But because there&#8217;s nothing between the agent and the API. No filter on what tools the model sees. No cost ceiling. No way to remotely kill a run. No record of what happened that you could trust after the fact. It&#8217;s a straight pipe from agent to OpenAI, and if the agent goes sideways, you find out when the damage is done.</p><p>So I built a way to put a wall in front of it.</p><div><hr></div><h2>The actual problem</h2><p>OpenClaw sends your LLM requests directly to OpenAI. The model sees every tool you registered &#8212; including <strong>delete_emails, bulk_remove, drop_table</strong>, whatever you&#8217;ve wired up. If the model decides to call one, it calls it. 
There&#8217;s no checkpoint, no approval, no &#8220;hey, are you sure?&#8221;</p><p>And there&#8217;s no audit trail. If something goes wrong, you&#8217;re digging through stdout logs trying to reconstruct what the agent did, in what order, with whose data. Good luck.</p><p>What I wanted was simple:</p><ul><li><p><strong>Don&#8217;t let the model see tools it shouldn&#8217;t use.</strong> Not &#8220;block the call after it happens.&#8221; Remove the tool from the request <em>before</em> the model knows it exists. It can&#8217;t call <code>delete_emails</code> if it was never told about <code>delete_emails</code>.</p></li><li><p><strong>Cap the spend.</strong> Daily, monthly, per-request. When the budget&#8217;s done, the gateway says no.</p></li><li><p><strong>Record everything.</strong> Every request, every denial, every tool that got stripped. Signed, queryable, trustworthy.</p></li><li><p><strong>Keep my real API key out of OpenClaw.</strong> OpenClaw gets a caller token. The real key lives in an encrypted vault and gets injected at forward time.</p></li></ul><h2>How I set it up</h2><p>I built this into <strong><a href="https://github.com/dativo-io/talon">Dativo Talon</a></strong> &#8212; a single Go binary that sits between OpenClaw and OpenAI. Here&#8217;s the exact setup I run.</p><h3>Step 1: Install and init</h3><pre><code><code>go install github.com/dativo-io/talon/cmd/talon@latest

mkdir talon-openclaw &amp;&amp; cd talon-openclaw
talon init --pack openclaw --name openclaw-gateway</code></code></pre><p>This generates two files: agent.talon.yaml (server policy) and talon.config.yaml (gateway config). The gateway config is where the real controls live.</p><h3>Step 2: Store your OpenAI key in the vault</h3><pre><code><code>export TALON_SECRETS_KEY=$(openssl rand -hex 32)  # save this somewhere safe
talon secrets set openai-api-key "$OPENAI_API_KEY"</code></code></pre><p>Your real OpenAI key is now encrypted at rest. OpenClaw will never see it.</p><h3>Step 3: Start the gateway</h3><pre><code><code>talon serve --gateway</code></code></pre><p>That&#8217;s it. Talon is now listening on <code>localhost:8080</code>.</p><h3>Step 4: Point OpenClaw at Talon</h3><p>In <code>~/.openclaw/openclaw.json</code>:</p><pre><code><code>{
  "models": {
    "providers": {
      "openai": {
        "baseUrl": "http://localhost:8080/v1/proxy/openai/v1",
        "apiKey": "talon-gw-openclaw-001",
        "api": "openai-responses",
        "models": [
          { "id": "gpt-4o", "name": "gpt-4o" },
          { "id": "gpt-4o-mini", "name": "gpt-4o-mini" }
        ]
      }
    }
  }
}</code></code></pre><p>Notice the apiKey &#8212; that&#8217;s the <strong>caller token</strong>, not the OpenAI key. Talon identifies OpenClaw by this token and injects the real key when it forwards to OpenAI.</p><p>Restart OpenClaw (openclaw gateway stop &amp;&amp; openclaw gateway start) and you&#8217;re running through the gateway.</p><p></p><h2>The config that would have stopped the inbox incident</h2><p>Here&#8217;s the talon.config.yaml I use. I&#8217;ll walk through the parts that matter.</p><pre><code><code>gateway:
  enabled: true
  listen_prefix: "/v1/proxy"
  mode: "enforce"

  providers:
    openai:
      enabled: true
      secret_name: "openai-api-key"
      base_url: "https://api.openai.com"
      allowed_models: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"]

  callers:
    - name: "openclaw-main"
      api_key: "talon-gw-openclaw-001"
      tenant_id: "default"
      team: "engineering"
      allowed_providers: ["openai"]
      policy_overrides:
        max_daily_cost: 25.00
        max_monthly_cost: 500.00
        pii_action: "redact"
        allowed_models: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"]

  default_policy:
    require_caller_id: true
    log_prompts: true

    # --- THIS IS THE BIG ONE ---
    # Tool governance: strip dangerous tools BEFORE the model sees them.
    tool_policy_action: "filter"
    forbidden_tools:
      - "delete_*"
      - "admin_*"
      - "export_all_*"
      - "bulk_*"
      - "rm_*"
      - "drop_*"

    # PII: redact personal data from requests headed to OpenAI
    default_pii_action: "redact"
    response_pii_action: "warn"

    # Attachments: scan PDFs and CSVs for prompt injection
    attachment_policy:
      action: "warn"
      injection_action: "block"
      max_file_size_mb: 10

  rate_limits:
    global_requests_per_min: 300
    per_caller_requests_per_min: 60

  timeouts:
    connect_timeout: 10s
    request_timeout: 120s
    stream_idle_timeout: 60s</code></code></pre><p>Let me break down what each piece would have done in the emails incident:</p><ul><li><p>forbidden_tools: ["delete_*", "bulk_*"] &#8212; The agent had access to delete_email, delete_thread, and bulk operations. With this config, Talon strips those tools from the JSON body before OpenAI ever sees them. The model literally cannot decide to delete anything because it doesn&#8217;t know deletion is an option.</p></li><li><p>tool_policy_action: "filter" &#8212; This is the mode. "filter" silently removes forbidden tools and forwards the rest. If you want to be more aggressive, set it to "block" &#8212; that rejects the entire request if any forbidden tool is present. I prefer "filter" because it keeps the agent functional for everything except the dangerous stuff.</p></li><li><p>max_daily_cost: 25.00 &#8212; The incident ran up significant cost in minutes. This cap shuts the door after $25/day for this caller. Done. No negotiation.</p></li><li><p>per_caller_requests_per_min: 60 &#8212; The agent was firing requests as fast as it could. Rate limiting slows a runaway agent to a manageable pace and gives you time to notice.</p></li><li><p>request_timeout: 120s &#8212; No single request gets more than 2 minutes. The agent can&#8217;t sit in an infinite loop waiting for a response.</p></li></ul><h2>Per-caller tool allowlists (when you want to be strict)</h2><p>If <code>forbidden_tools</code> is a blocklist, you can also go the other direction &#8212; a strict allowlist. Only the tools you name get through:</p><pre><code><code>callers:
  - name: "openclaw-main"
    policy_overrides:
      allowed_tools: ["search_web", "read_file", "list_files", "create_draft"]
      tool_policy_action: "block"</code></code></pre><p>Now OpenClaw can only use those four tools. Everything else &#8212; delete_emails, send_email, admin_reset, whatever &#8212; gets rejected. The model never sees them. This is the nuclear option and it&#8217;s the one I&#8217;d use if I were connecting an agent to anyone&#8217;s inbox.</p><h2>Verify it works</h2><p>Send a request with a dangerous tool and watch what happens:</p><pre><code><code>curl -s -X POST http://localhost:8080/v1/proxy/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer talon-gw-openclaw-001" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role":"user","content":"Clean up my inbox"}],
    "tools": [
      {"type":"function","function":{"name":"search_web","parameters":{}}},
      {"type":"function","function":{"name":"delete_emails","parameters":{}}}
    ]
  }'</code></code></pre><p>delete_emails gets stripped. The model only sees search_web. Check the evidence:</p><pre><code><code>talon audit list --agent openclaw-main --limit 5</code></code></pre><p>You&#8217;ll see exactly which tools were requested, which were filtered, and which were forwarded. Signed and timestamped.</p><p></p><h2>When this isn&#8217;t the answer</h2><ul><li><p><strong>You&#8217;re just playing around.</strong> If it&#8217;s a hobby project and nothing is at stake, the gateway is overhead you don&#8217;t need. Especially if you have unlimited money and nothing to hide ;)</p></li><li><p><strong>You trust the tool set completely.</strong> If your agent only has read-only tools &#8212; no delete, no write, no send &#8212; the risk profile is lower. Still worth auditing, but the urgency is different.</p></li><li><p><strong>You need governance </strong><em><strong>inside</strong></em><strong> MCP tool calls.</strong> The gateway governs what goes to and from the LLM. If you need policy on every individual tool invocation (not just what the model is told about), that&#8217;s Talon&#8217;s MCP proxy &#8212; a different deployment shape.</p></li></ul><h2>Final thought</h2><p>The email incident wasn&#8217;t a bug in OpenClaw. It was a missing layer. The agent did exactly what agents do &#8212; it picked from the tools it was given and executed. The problem is it was given delete_emails and nobody was standing between the model and that tool.</p><p>That&#8217;s what I&#8217;m solving. Not replacing OpenClaw &#8212; I still use it every day (fingers crossed my tool catches all the dangerous stuff). Just making sure it runs through a gateway that strips the dangerous tools, caps the cost, and writes down everything that happened. If something goes wrong, I want to know exactly what the agent tried to do and exactly where it was stopped.</p><p><code>talon init --pack openclaw</code>. Fifteen minutes. 
That&#8217;s the difference between &#8220;the agent deleted everything&#8221; and &#8220;the agent tried to delete everything and was told no.&#8221;</p>]]></content:encoded></item><item><title><![CDATA[Memory for AI Agents]]></title><description><![CDATA[Why Every Orchestration Platform Is Racing to Solve the Same Problem]]></description><link>https://blog.dativo.io/p/memory-for-ai-agents</link><guid isPermaLink="false">https://blog.dativo.io/p/memory-for-ai-agents</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Mon, 23 Feb 2026 21:27:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bQKx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Different contexts. 
Same question.</p><blockquote><p>&#8220;How can this agent remember anything?&#8221;</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bQKx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bQKx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bQKx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png" width="494" height="741" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1536,&quot;width&quot;:1024,&quot;resizeWidth&quot;:494,&quot;bytes&quot;:2094070,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/188950260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!bQKx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!bQKx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03207695-14db-4bc0-be87-981ee39966b7_1024x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Over the last six months, I&#8217;ve watched the same architectural question surface in every AI project I&#8217;ve been close to. Startups building their first agent workflows. Scaleups wrapping compliance around existing automation. Large enterprises deploying AI across regulated environments.</p><p>Not remember within a single chat. That&#8217;s just a context window. 
I mean remember across sessions, across days, across hundreds of runs &#8212; while staying coherent, efficient, and (increasingly) compliant.</p><p>This article explains why agent memory became one of the hottest enterprise infrastructure problems of early 2026.</p><h2>The Problem That Context Windows Don&#8217;t Solve</h2><p>Every LLM is stateless. GPT-4, Claude, Gemini &#8212; they all start each API call with a blank slate. The context window creates an illusion of memory, but it&#8217;s really just a very large input buffer.</p><p>That works for chatbots. It breaks for agents.</p><p>The moment you build an agent that runs repeatedly &#8212; a sales analyst that processes reports daily, a support bot that handles tickets across shifts, a compliance monitor that learns from policy violations &#8212; you hit the wall.</p><p>Context windows reset. Knowledge is lost. The agent makes the same mistakes it made last week. It asks the user the same questions. It doesn&#8217;t learn.</p><p>Bigger context windows don&#8217;t fix this. Models with 128K or even 1M token windows still reset between API calls. Even within a single call, performance degrades over long contexts &#8212; models lose track of details buried in the middle. 
And stuffing every prior interaction into the prompt gets expensive fast.</p><p>What you actually need is a <strong>memory system</strong>: infrastructure that decides what to store, how to retrieve it, when to update it, and when to forget.</p><p>This is where things get interesting.</p><h2>Three Types of Memory (That Actually Matter)</h2><p>The academic literature &#8212; particularly the survey &#8220;Memory in the Age of AI Agents&#8221; (December 2025) and the CoALA framework &#8212; converges on three types of long-term memory that production agents need. This isn&#8217;t just taxonomy. Each type has different storage patterns, retrieval strategies, and lifecycle rules.</p><p><strong>Semantic memory</strong> stores what the agent knows &#8212; facts, preferences, constraints. &#8220;The user prefers Python.&#8221; &#8220;Our fiscal year starts in April.&#8221; &#8220;The compliance threshold is &#8364;100K.&#8221; These are stable facts that hold across sessions and should be updated when they change, not duplicated.</p><p><strong>Episodic memory</strong> stores what happened &#8212; specific interactions, outcomes, decisions. &#8220;On February 15, the policy engine denied SQL access because the query contained PII.&#8221; &#8220;Last Thursday&#8217;s report cost &#8364;0.42 and used Claude Sonnet.&#8221; These are events. They accumulate. They provide context for pattern recognition.</p><p><strong>Procedural memory</strong> stores how to do things &#8212; learned behaviors, workflows, response patterns. &#8220;When the user asks for a financial summary, always check PII classification first.&#8221; &#8220;Use the compact format for Slack responses.&#8221; These are rare but powerful &#8212; they represent the agent actually improving its own behavior.</p><p>Most memory systems on the market implement some version of this taxonomy, whether they call it that or not. 
The real differentiation is in what happens after storage: consolidation, retrieval, and lifecycle management.</p><h2>The Core Pipeline: Extract &#8594; Consolidate &#8594; Store &#8594; Retrieve</h2><p>If you just append every interaction to a database and search it later, you&#8217;ve built a log, not a memory. Logs grow without bound, contain duplicates, hold contradictory facts, and become increasingly expensive to query.</p><p>Production memory systems follow a pipeline pattern that mirrors (loosely) how human memory works.</p><p><strong>Extraction</strong> takes raw conversation data and distills it into structured memory units &#8212; facts, observations, preferences. Instead of storing &#8220;the user said they switched from JavaScript to Rust last month,&#8221; you extract the fact: {preference: "Rust", negated: "JavaScript", timestamp: "2026-02"}.</p><p><strong>Consolidation</strong> is where the real engineering happens. New facts are compared against existing memories. Duplicates are detected. Contradictions are resolved. Stale entries are invalidated. Without consolidation, your memory fills with noise. With it, storage drops by roughly 60% and retrieval precision improves by over 20%, according to Mem0&#8217;s benchmarks on LOCOMO.</p><p><strong>Storage</strong> persists the processed memories &#8212; typically in a vector database for semantic search, sometimes augmented with a graph database for relational reasoning. The choice of backend shapes what kinds of queries you can answer efficiently.</p><p><strong>Retrieval</strong> fetches relevant memories at query time and injects them into the agent&#8217;s prompt. The sophistication here ranges from simple keyword matching to composite scoring that weighs relevance, recency, memory type, and trust.</p><p>Every serious memory product implements some version of this pipeline. 
The differences are in the details.</p><h2>The AUDN Cycle: How Mem0 Handles Consolidation</h2><p><a href="https://mem0.ai/">Mem0</a> &#8212; arguably the clearest &#8220;memory as a product&#8221; offering &#8212; popularised what I&#8217;ll call the <strong>AUDN cycle</strong>: Add, Update, Delete, Noop.</p><p>For each candidate fact extracted from a conversation, Mem0 retrieves the top-S most similar existing memories using vector similarity. It then presents both the new fact and the existing memories to an LLM through a tool-calling interface. The LLM decides:</p><ul><li><p><strong>Add</strong>: genuinely new information, store it</p></li><li><p><strong>Update</strong>: augments an existing memory with more detail</p></li><li><p><strong>Delete</strong>: contradicts an existing memory, remove the old one</p></li><li><p><strong>Noop</strong>: already captured, skip</p></li></ul><p>This is elegant because it offloads conflict resolution to the LLM itself. The model decides whether &#8220;prefers Python&#8221; should be overwritten by &#8220;switched to Rust&#8221; or whether both should coexist. No hand-crafted rules needed.</p><p>The results are strong. On the LOCOMO benchmark, Mem0 delivers a 26% accuracy uplift over OpenAI&#8217;s built-in memory, 91% lower p95 latency compared to full-context baselines, and 90% token cost savings. The graph-enhanced variant (Mem0g) adds entity-relationship extraction for multi-hop reasoning &#8212; &#8220;what decisions led to this outcome?&#8221; becomes answerable.</p><p>The limitation is that Mem0 deletes contradicted facts. Once overwritten, the old memory is gone. For a personal assistant, that&#8217;s fine. 
For a regulated enterprise environment, it&#8217;s a problem &#8212; auditors want to see what the system believed at a given point in time, not just what it believes now.</p><h2>Temporal Knowledge Graphs: How Zep Thinks About Time</h2><p><a href="https://www.getzep.com/">Zep</a>, and its open-source engine Graphiti, take a fundamentally different approach. Where Mem0 is optimised for fast, flat fact retrieval, Zep builds a <strong>temporal knowledge graph</strong> with bi-temporal semantics.</p><p>Every fact in Zep has four timestamps: when the event occurred, when it became invalid, when the system first learned about it, and when the system stopped considering it current.</p><p>This dual timeline &#8212; event time and ingestion time &#8212; enables queries that no flat memory store can handle. &#8220;What did the agent know as of last Tuesday?&#8221; &#8220;Show me how this relationship evolved over the past month.&#8221; &#8220;When did we first learn that the customer changed their billing address?&#8221;</p><p>When new information contradicts existing facts, Zep doesn&#8217;t delete the old edge. It <strong>invalidates</strong> it &#8212; setting an <code>invalid_at</code> timestamp and preserving the full history. The knowledge graph grows richer over time, not just larger.</p><p>On benchmarks, Zep achieves up to 18.5% accuracy improvement on LongMemEval (which tests cross-session reasoning and temporal tasks) and 90% latency reduction compared to baselines. It particularly excels at multi-hop reasoning &#8212; connecting facts across multiple sessions and time periods.</p><p>The trade-off is complexity. Zep requires a graph database (Neo4j or FalkorDB), embedding infrastructure, and more operational overhead than a simple key-value memory store. 
The open-source Graphiti framework makes this accessible, but it&#8217;s still a heavier commitment than Mem0&#8217;s three-line API.</p><h2>LangMem: Memory as a Library</h2><p>LangChain&#8217;s <a href="https://langchain-ai.github.io/langmem/">LangMem</a> takes a more modular approach. Rather than being a standalone memory product, it provides composable primitives &#8212; create_memory_manager, create_search_memory_tool, create_prompt_optimizer &#8212; that plug into LangGraph&#8217;s agent framework.</p><p>LangMem separates memory into <strong>profiles</strong> (structured schemas updated in-place, like user preferences) and <strong>collections</strong> (unbounded document sets searched semantically). It supports background consolidation through a memory manager that extracts, deduplicates, and updates memories asynchronously.</p><p>The key design choice is storage agnosticism. LangMem doesn&#8217;t mandate a specific backend &#8212; it works with any store that supports save and semantic search. MongoDB, Postgres with pgvector, in-memory stores, or custom implementations all work.</p><p>For teams already invested in the LangChain ecosystem, LangMem is the path of least resistance. The trade-off is that you&#8217;re assembling pieces rather than getting a turnkey system. There&#8217;s no built-in temporal model, no graph-based reasoning, and conflict resolution is delegated to the LLM without the structured AUDN pipeline that Mem0 provides.</p><h2>Letta (MemGPT): Memory as Agent State</h2><p><a href="https://www.letta.com/">Letta</a> &#8212; the production evolution of the MemGPT research project &#8212; treats memory as a first-class component of the agent&#8217;s state. 
Agents have explicit <strong>core memory blocks</strong> (always injected into the prompt &#8212; persona, goals, preferences) and <strong>archival memory</strong> (out-of-context storage searched on demand).</p><p>The distinctive feature is that agents can explicitly write, update, and delete their own memory blocks through tool calls. Memory isn&#8217;t something that happens to the agent &#8212; it&#8217;s something the agent actively manages.</p><p>This makes Letta particularly well-suited for persistent assistants and local-LLM deployments (it works well with Ollama and vLLM). The agent maintains identity and continuity across restarts, which is critical for long-lived worker agents.</p><p>The limitation is that Letta&#8217;s memory model is agent-centric. It doesn&#8217;t natively handle cross-agent memory sharing, multi-tenant isolation, or compliance-grade audit trails. For single-agent personal assistant scenarios, it&#8217;s excellent. For enterprise multi-agent deployments, you&#8217;ll need to build additional infrastructure.</p><h2>The Gap Nobody Is Filling: Governed Memory</h2><p>Here&#8217;s what struck me as I surveyed the landscape.</p><p>Every product optimises for <strong>recall quality</strong>. Better accuracy. Lower latency. Fewer tokens. Richer reasoning. Those metrics matter. But they&#8217;re all answering the same question: &#8220;How well does the agent remember?&#8221;</p><p>Nobody is answering: &#8220;Is it <strong>safe</strong> for the agent to remember this?&#8221;</p><p>Consider what happens when an AI agent processes a customer support ticket containing a credit card number, a medical diagnosis, or an employee&#8217;s home address. The agent learns from it &#8212; stores an observation, updates its memory, maybe adjusts its behavior. But should it?</p><p>Under GDPR Article 25, that&#8217;s a data protection by design question. 
Under the EU AI Act (full enforcement August 2026), high-risk AI systems need technical documentation of every decision. Under NIS2, incident response requires reconstructing what the system knew at the time of a breach.</p><p>None of the major memory products address this. Mem0 has no PII detection on memory writes. Zep has no policy enforcement layer. LangMem delegates governance entirely to the application developer. Letta stores whatever the agent decides to store.</p><p>This isn&#8217;t a criticism of those products &#8212; they&#8217;re solving a different problem. But for European enterprises deploying AI agents in regulated environments, the gap is real.</p><h2>Dativo Talon</h2><p>With <a href="https://github.com/dativo-io/talon">Dativo Talon</a>, the open-source compliance-first AI orchestration platform I am currently building, I started from the governance side and worked toward recall quality, rather than the other way around.</p><p><a href="https://github.com/dativo-io/talon">Talon&#8217;s</a> memory architecture wraps a full governance pipeline around every write operation. Before anything hits the memory database, it passes through PII scanning (25+ EU-specific patterns covering all 27 member states), OPA policy evaluation, category validation against allow/forbid lists, policy override detection, conflict checking, and provenance tracking with trust scores. Every write &#8212; and every governance decision &#8212; generates an HMAC-signed evidence record.</p><p>The storage layer uses SQLite with FTS5 for full-text search, progressive disclosure (lightweight index entries injected into the prompt, full detail on demand), and AI-compressed observations that reduce raw agent runs from thousands of tokens down to roughly 500-token structured summaries. (N.B.
I love SQLite, and I truly believe it is an excellent solution without unnecessary overhead.)</p><p>What I am working on right now &#8212; and what motivated this article &#8212; is the consolidation layer. I am building a governed AUDN cycle inspired by Mem0&#8217;s approach, but with a critical difference: invalidated entries are <strong>preserved</strong> (Zep-style temporal invalidation), not deleted. Every consolidation decision &#8212; add, update, invalidate, noop &#8212; is governed and audited. We&#8217;re adding bi-temporal queries so any auditor can reconstruct what the agent knew at any point in time.</p><p>We&#8217;re also moving from flat timestamp retrieval to relevance-scored retrieval that weighs keyword relevance, recency, memory type (semantic/episodic/procedural), and trust score &#8212; matching the retrieval sophistication of Mem0 while preserving the audit trail.</p><p>The goal is to build the only memory system where an agent&#8217;s learning is both high-quality <strong>and</strong> compliance-grade. Where governed memory is a compliance asset, not just a developer convenience.</p><h2>When You Don&#8217;t Need Persistent Memory</h2><p>OK, memory isn&#8217;t always the answer.</p><p>If your agent handles single-turn queries &#8212; &#8220;translate this,&#8221; &#8220;summarise that document,&#8221; &#8220;generate a report&#8221; &#8212; the context window is sufficient. Adding a memory layer introduces complexity, latency, and storage costs with minimal benefit.</p><p>If your agent runs the same static workflow every time (process this CSV, send this email), procedural memory might help but semantic and episodic memory probably won&#8217;t.</p><p>If your data is already in a well-structured knowledge base, retrieval-augmented generation (RAG) over that knowledge base is likely a better fit than agent memory.
Memory shines when the knowledge comes from the interactions themselves &#8212; not from pre-existing documents.</p><p>Memory pays off when agents run repeatedly, learn from outcomes, serve multiple users, or operate in environments where context evolves over time.</p><h2>Final Thought</h2><p>Agent memory is having its &#8220;Iceberg moment.&#8221;</p><p>Just as <a href="https://blog.dativo.io/p/apache-iceberg">Iceberg standardised</a> the table format for data lakes &#8212; separating storage from compute and making data engine-independent &#8212; the memory layer is becoming the standard infrastructure for making AI agents stateful, efficient, and persistent.</p><p>The products differ in approach. Mem0 optimises for speed and simplicity. Zep optimises for temporal reasoning and relational depth. LangMem optimises for composability. Letta optimises for agent autonomy.</p><p>But the underlying pattern is converging: extract, consolidate, store, retrieve. With scoring. With lifecycle management. With conflict resolution.</p><p>What&#8217;s still missing &#8212; and what I believe will matter enormously as AI regulation matures in Europe and worldwide &#8212; is governance around that pipeline. Not as an afterthought. Not as a compliance checkbox. But as a first-class architectural concern, where every memory write is scanned, evaluated, and signed.</p><p></p><div><hr></div><p><em>If you&#8217;re building AI agents and thinking about memory architecture, I&#8217;d love to hear how you&#8217;re approaching it. 
Drop a comment or reach out &#8212; this is a fast-moving space and I&#8217;m learning from every conversation.</em></p><p><em>(1) Dativo Talon is open-source and available on <a href="https://github.com/dativo-io/talon">GitHub</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Apache Iceberg]]></title><description><![CDATA[Why Data Engineers Are Quietly Standardising on It]]></description><link>https://blog.dativo.io/p/apache-iceberg</link><guid isPermaLink="false">https://blog.dativo.io/p/apache-iceberg</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Tue, 13 Jan 2026 08:06:47 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank"
href="https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3126" height="2344" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2344,&quot;width&quot;:3126,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;white and gray rock formation on blue sea under blue sky during daytime&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="white and gray rock formation on blue sea under blue sky during daytime" title="white and gray rock formation on blue sea under blue sky during daytime" srcset="https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1599622893826-eac29bcef5a6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8aWNlYmVyZ3xlbnwwfHx8fDE3NjgyNTc2OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@sailingaroundtheworld">Christian Pfeifer</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p></p><p>Over the last 18&#8211;24 months, I&#8217;ve seen the same decision show up in very different companies. Startups building their first serious data platform. Scaleups trying to escape warehouse lock-in.
Large enterprises re-platforming analytics.</p><p>Different contexts. Same outcome.</p><blockquote><p>&#8220;Let&#8217;s standardise on Iceberg.&#8221;</p></blockquote><p>Not because it&#8217;s trendy. Not because a vendor pushed it. But because Iceberg fixes a set of structural problems data teams have been carrying since the Hive era &#8212; and warehouses never really solved.</p><p>This article explains what problem Iceberg actually solves, why it&#8217;s gaining traction now, and how teams are using it in practice &#8212; with <strong>Snowflake</strong>, <strong>Databricks</strong>, and plain object storage like AWS S3.</p><p></p><h2>The Core Problem Iceberg Solves (That Warehouses Don&#8217;t)</h2><p>Most traditional data stacks tightly couple <strong>storage, metadata, and compute</strong>. Data lives where the engine lives. Moving compute usually means copying data. Governance, retention, and lifecycle rules get re-implemented per platform. Costs become opaque very quickly.</p><p>Warehouses optimise for convenience and performance, not architectural separation.</p><p>Iceberg flips this model!</p><p>Iceberg is not a database.
<strong>It&#8217;s a table standard for data lakes.</strong></p><p>Data is stored once &#8212; typically as Parquet on object storage &#8212; and any engine that understands Iceberg can read or write it safely. That &#8220;safely&#8221; part matters more than people realise, because it&#8217;s where previous lake designs failed.</p><p></p><h2>Why Iceberg Succeeded Where Hive Tables Failed</h2><p>Hive tables and raw Parquet worked until scale and concurrency arrived. Then the cracks appeared.</p><p>They lacked atomic writes. There was no snapshot isolation. Concurrent writers were unsafe. Schema evolution was fragile and often required rewrites or downtime. These weren&#8217;t edge cases &#8212; they were structural limitations.</p><p>Iceberg fixes this by making metadata first-class.</p><p>Every Iceberg table is defined by immutable data files, immutable metadata files, and snapshots that represent a consistent table state. Writers commit atomically. Readers always see a coherent view. Time travel, rollback, concurrent writes, and controlled schema evolution become normal operations rather than special cases.</p><p>In practice, Iceberg behaves like a real database table &#8212; but lives entirely on object storage.</p><p>That&#8217;s the real breakthrough.</p><h2>Why Iceberg Is Becoming the Standard</h2><p>Iceberg isn&#8217;t winning because it&#8217;s novel. It&#8217;s winning because several things finally aligned.</p><ol><li><p>First, it is genuinely vendor-neutral. Snowflake, Databricks, AWS, Trino, Flink, and DuckDB all support it, and no single vendor controls the specification. That matters more than individual features.</p></li><li><p>Second, Iceberg enforces a clean separation of concerns. Storage lives in S3, GCS, or ADLS. Metadata lives in a catalog such as Glue, Nessie, Unity, Polaris, or Lakekeeper. Compute is whatever engine you choose today &#8212; and can change tomorrow. 
Each layer evolves independently.</p></li><li><p>Third, Iceberg is now operationally mature. Compaction, snapshot expiration, partition evolution, and schema evolution are no longer afterthoughts. Five years ago these were DIY problems. Today they&#8217;re table stakes.</p></li></ol><p>Finally, Iceberg is enterprise-ready. It works at petabyte scale, supports multi-writer workloads, and integrates with existing IAM and catalog systems. That full combination simply didn&#8217;t exist before.</p><p></p><h2>How Iceberg Is Used in Practice</h2><p>In Snowflake environments, Iceberg is increasingly used to decouple storage from the warehouse. Data lives in S3 or GCS. Snowflake manages Iceberg metadata and queries the data in place. Teams keep Snowflake for BI and analytics, avoid duplicating raw and curated data, and retain an exit option if pricing or strategy changes.</p><p>The trade-off is that Snowflake controls the metadata layer and optimisation is less transparent than with native tables. Performance, however, remains very strong for analytics workloads. This pattern is especially common in finance and large enterprise analytics teams.</p><p>In Databricks environments, Iceberg often plays a different role. Tables live on S3, metadata sits in Unity Catalog or an external catalog, and Spark is used for heavy transformations. Querying may happen in Databricks, Snowflake, or Trino.</p><p>Teams choose Iceberg here to avoid Delta lock-in, share tables across engines, and align with multi-cloud strategies. Databricks increasingly acts as a powerful compute engine rather than the system of record.</p><p>Then there&#8217;s the &#8220;bare metal&#8221; S3 model &#8212; where Iceberg really shines.</p><p>Here, S3 is the system of record. Iceberg provides table semantics. Glue, Nessie, or Lakekeeper manage metadata. Spark, Trino, Flink, or DuckDB handle compute. 
This model is popular with infra-heavy startups, platform teams, and cost-optimising organisations.</p><p>It works because storage is cheap, compute is elastic, governance logic is centralised, and there&#8217;s no per-terabyte warehouse tax. But it only scales well if teams invest in compaction, file size control, and snapshot management.</p><p></p><h2>Performance: How Iceberg Actually Compares</h2><p>Let&#8217;s be honest: Iceberg itself doesn&#8217;t make queries fast.</p><p>Performance depends on file sizes, partitioning, metadata pruning, and the compute engine. Get those wrong and Iceberg will feel slow. Get them right and it performs extremely well.</p><p>Compared to raw Parquet on S3, Iceberg is dramatically faster for analytical queries because engines can prune files using metadata and read consistent snapshots. Compared to Snowflake native tables, Snowflake still wins for small BI queries, but Iceberg becomes competitive at scale and wins on cost predictability and flexibility.</p><p>Compared to Delta Lake, performance is broadly similar. Iceberg wins on openness and portability. Delta wins on Databricks-specific optimisations.</p><p>In most real systems, the bottleneck is almost always small files and poor lifecycle management &#8212; not Iceberg itself.</p><h2>The Hidden Cost Nobody Talks About</h2><p>Iceberg introduces responsibility.</p><p>You now own compaction, snapshot expiration, retention, file size health, and cost observability across engines. Warehouses hide these concerns from you. Iceberg exposes them.</p><p>This is exactly why governance and FinOps around Iceberg are becoming critical. Once data is shared across multiple engines, someone needs to own its lifecycle and economics.</p><p>I ran into these trade-offs directly while building <a href="https://github.com/dativo-io/dativo-ingest">Dativo Ingest</a>, an open-source ingestion project built around Iceberg. 
Once Iceberg is your contract, you stop optimising for a single engine and start thinking in terms of table health, commits, and long-term operability.</p><h2>When Iceberg Is the Wrong Choice</h2><p>Iceberg is not a silver bullet.</p><p>If you only need BI on a small amount of data, don&#8217;t want to operate data infrastructure, or have no Spark or Trino experience, a warehouse alone is still a perfectly reasonable choice.</p><p>Iceberg pays off when flexibility, scale, and long-term economics matter.</p><p></p><h2>Final Thought</h2><p>Iceberg isn&#8217;t popular because it&#8217;s fashionable.</p><p>It&#8217;s popular because data teams are tired of rebuilding the same foundations in every warehouse.</p><p>Iceberg gives you a durable data layer, engine independence, and predictable long-term economics. And once teams adopt it, they rarely go back.</p><p>If you&#8217;re designing a data platform today, Iceberg should at least be in the conversation &#8212; even if Snowflake or Databricks still sit on top.</p><p>That&#8217;s the real shift we&#8217;re seeing.</p>]]></content:encoded></item><item><title><![CDATA[AI Engineers Need Culture Too]]></title><description><![CDATA[How .cursor/rules saved my pet project (and my sanity)]]></description><link>https://blog.dativo.io/p/ai-engineers-need-culture-too</link><guid isPermaLink="false">https://blog.dativo.io/p/ai-engineers-need-culture-too</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Thu, 08 Jan 2026 16:52:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!e7bc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!e7bc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e7bc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e7bc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png" width="1024" height="1536" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1536,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2110695,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/183912728?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!e7bc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!e7bc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc78a1f22-50d3-4c14-bd69-761e7ebe03d4_1024x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>TL;DR:</strong></p><ul><li><p><em>AI coding assistants can feel like magic&#8230; until they go off the rails and rewrite your repo just to get a single unit test passing.</em></p></li><li><p><em>My AI pair programmer started refactoring my project unprompted and hallucinating frameworks &#8211; chaos ensued.</em></p></li><li><p><em>The solution was writing a </em><code>.cursor/rules</code><em> file: essentially an <strong>AI rulebook</strong> to enforce the coding standards and stop the madness.</em></p></li><li><p><em>With a custom rulebook in place, the coding assistant became a helpful (fingers crossed) teammate again &#8211; smaller diffs, relevant suggestions, and far fewer &#8220;WTF?&#8221; moments in code review.</em></p></li><li><p><em>AI assistants need onboarding and culture too; we 
can&#8217;t just unleash them without guidance. Here&#8217;s how a few YAML guidelines turned my AI from rogue to rockstar.</em></p></li></ul><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data, Engineering, and Beyond! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>The Honeymoon Phase (AI Feels Magical)</h2><p>Guilty: the first few sessions with an AI pair programmer on my pet project (check it out: <a href="https://github.com/dativo-io/dativo-ingest">https://github.com/dativo-io/dativo-ingest</a>, a headless data ingestion platform) felt like living in &#8220;Ghost in the Shell&#8221;. I&#8217;d type a command, and <em>poof!</em> the solution appeared, as if Cursor had the mind of a senior engineer. Functions that used to take an afternoon to debug practically wrote themselves while I sipped coffee. I was <strong>giddy with power</strong> &#8211; code reviews came back with fewer nitpicks, and I was closing tickets faster than ever. It was the <em>honeymoon phase</em>, when the AI could do no wrong (I preferred to keep my eyes closed). <strong>At first, coding with AI felt like unlocking cheat mode on reality.</strong> I bragged to my friends that I had a tireless junior dev who never sleeps and never complains, ready to do the grunt work at 3 AM. 
What could possibly go wrong?</p><p></p><h2>The Hangover (When the AI Goes Rogue)</h2><p>One night, I asked the AI to add a single field to a class. The PR? <strong>38 files changed.</strong> It rewrote our CLI, abstracted config loaders, and migrated connectors to a fictional framework. It was confident. It was wrong. I went from AI-enhanced to <strong>AI-endangered</strong>.</p><p>Another real example: before I wrote <code>.cursor/rules</code>, the AI submitted a commit titled <em>&#8220;Major CLI Refactoring&#8221;</em> and it wasn&#8217;t kidding. It blew up our <code>cli.py</code> into <code>cli_commands.py</code>, <code>startup.py</code>, <code>job_executor.py</code>, <code>connectors/factory.py</code>, and more &#8211; 9 files touched, 2,000+ LOC rewritten. All this&#8230; when I just wanted a new job flag. LOL.</p><p>In another masterpiece, I asked for incremental sync support. The AI replied with a <strong>&#8220;Unified Incremental Strategy Framework&#8221;</strong> &#8211; complete with file sync, cursor DB sync, Airbyte catalog support, state JSONs, and a cloud deployment story. Impressive. 
Also: completely <em>unified</em> overkill.</p><p>I definitely needed to tame the beast.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GtlE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GtlE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GtlE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png" width="1024" height="1536" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1536,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2286773,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/183912728?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GtlE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!GtlE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad9be37-3fa0-439b-ae2f-a1dc0d921c5a_1024x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>Writing The Rulebook (What <code>.cursor/rules</code> Is and How It Works)</h2><p>Enter <code>.cursor/rules</code>, a Markdown, YAML-adjacent file I dropped into our repo. Cursor reads it. It learns (or at least claims to). It stops trying to redesign the system every time I blink.</p><p>I tried to lay down the law with rules like:</p><pre><code># .cursor/rules
- &#8220;Every patch MUST reduce entropy.&#8221;
- &#8220;Implement only what is requested and stop once acceptance criteria are met.&#8221;
- &#8220;No refactoring outside scope.&#8221;
- &#8220;No new platforms, frameworks, or reports.&#8221;
- &#8220;Use YAML config over hardcoded logic.&#8221;</code></pre><p>Basically, it&#8217;s policy-as-code for your AI coworker.</p><p></p><h2>How the AI&#8217;s Behaviour Changed</h2><h3>Before: The Chaos Commits</h3><ul><li><p><strong>&#8220;Connector Registry V2&#8221;</strong> introduced external catalog loading, CLI support, schema changes, doc rewrites &#8211; all in <strong>one commit</strong>.</p></li><li><p>Schema validation? Ignored. New YAML fields like <code>streams_default</code> showed up unannounced, breaking CI.</p></li><li><p>Env vars as feature toggles? Oh yeah. I found <code>ENABLE_FANCY_MODE=true</code> buried in a helper.</p></li></ul><h3>After <code>.cursor/rules</code>: The Calm</h3><ul><li><p><strong>Mimesis connector PR:</strong> 5 files, clean logic, test included, YAML-configured, schema-valid &#8211; no chaos.</p></li><li><p><strong>Metrics Export:</strong> Added <code>metrics.py</code>, config toggles in <code>runner.yaml</code>, schema updated, validation green.</p></li><li><p><strong>Schema Guardrails:</strong> When adding <code>external_id</code>, the AI updated <code>connectors.schema.json</code> <strong>in the same commit</strong> &#8211; no CI surprises.</p></li></ul><p>The AI learned. No more feature-farming. Just real <s>hardcore</s> engineering.</p><h3>Before vs After: Diff Deltas</h3><pre><code># Before `.cursor/rules`
Commit: &#8220;Unified Incremental Sync&#8221;
Files changed: 15
Includes: new abstractions, plugin system, state management</code></pre><pre><code># After `.cursor/rules`
Commit: &#8220;Add external_id to connector schema&#8221;
Files changed: 2
Includes: YAML field + validation schema</code></pre><p>This shift was no accident &#8211; it was enforcement. </p><p></p><h2>Additional Advice from <code>.cursor/rules</code> and the Field</h2><p>Here are more rules and tips pulled from the real-world <a href="https://github.com/dativo-io/dativo-ingest/blob/main/.cursor/rules/dativo-ingest-rules/RULE.md">dativo-ingest</a>&#8217;s rules file and other engineers who&#8217;ve fought the same fight:</p><h3>&#9989; Be Patch-Scoped</h3><blockquote><p>&#8220;A patch MUST only satisfy the requested acceptance criteria. All other behavior is out of scope.&#8221;</p></blockquote><p>Avoid change creep. Don&#8217;t sneak in that &#8220;quick cleanup&#8221; or bonus refactor.</p><h3>&#9989; Respect the Interface</h3><blockquote><p>&#8220;Never change interfaces or filenames unless explicitly asked.&#8221;</p></blockquote><p>Humans memorize file paths and function names. AI must not rename them casually.</p><h3>&#9989; Just Enough Testing</h3><blockquote><p>&#8220;Do not write tests for unchanged or unrelated components.&#8221;</p></blockquote><p>No 400-line test diffs when one new function was added.</p><h3>&#9989; Always Validate Config</h3><blockquote><p>&#8220;New config fields MUST be described in <code>*.schema.json</code> and tested against examples.&#8221;</p></blockquote><p>YAML-only is sacred. Forgetting the schema? 
Expect the wrath of CI.</p><h3>&#9989; Prefer Simplicity</h3><p>Inspired by Addy Osmani&#8217;s 70% Rule: <em>&#8220;AI gets you 70% of the way fast, but that last 30% is where bugs hide.&#8221;</em></p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:152543901,&quot;url&quot;:&quot;https://addyo.substack.com/p/the-70-problem-hard-truths-about&quot;,&quot;publication_id&quot;:2115638,&quot;publication_name&quot;:&quot;Elevate&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!8WxC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png&quot;,&quot;title&quot;:&quot;The 70% problem: Hard truths about AI-assisted coding&quot;,&quot;truncated_body_text&quot;:&quot;After spending the last few years embedded in AI-assisted development, I've noticed a fascinating pattern. While engineers report being dramatically more productive with AI, the actual software we use daily doesn&#8217;t seem like it&#8217;s getting noticeably better. What's going on here?&quot;,&quot;date&quot;:&quot;2024-12-04T19:12:33.735Z&quot;,&quot;like_count&quot;:1514,&quot;comment_count&quot;:75,&quot;bylines&quot;:[{&quot;id&quot;:11623675,&quot;name&quot;:&quot;Addy Osmani&quot;,&quot;handle&quot;:&quot;addyosmani&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cee7ba66-e656-4450-a0ed-c951c27ee228_1080x1080.jpeg&quot;,&quot;bio&quot;:&quot;Engineering leader at Google, #1 Bestselling Amazon author, Award-winning engineer and international speaker. I want to help you succeed. 
My writing is about software engineering, motivation, and leadership.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-11-19T09:33:50.395Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-11-29T05:13:59.015Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:2120503,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2115638,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:2115638,&quot;name&quot;:&quot;Elevate&quot;,&quot;subdomain&quot;:&quot;addyo&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Addy Osmani's newsletter on elevating your effectiveness. Join his community of 600,000 readers across social media.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:11623675,&quot;theme_var_background_pop&quot;:&quot;#FF5CD7&quot;,&quot;created_at&quot;:&quot;2023-11-19T09:34:16.230Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Addy Osmani&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2207048,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2192362,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2192362,&quot;name&quot;:&quot;Large Scale Web Apps&quot;,&quot;subdomain&quot;:&quot;largeapps&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Learn tools and techniques to build and maintain large-scale React web 
applications.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9a53806-0d0b-4025-b992-145baca33809_512x512.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:98078198,&quot;theme_var_background_pop&quot;:&quot;#99A2F1&quot;,&quot;created_at&quot;:&quot;2023-12-20T10:59:33.318Z&quot;,&quot;email_from_name&quot;:&quot;Addy and Hassan from Large Scale Apps&quot;,&quot;copyright&quot;:&quot;Addy Osmani and Hassan Djirdeh&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2224891,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2209631,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2209631,&quot;name&quot;:&quot;Deep Voice&quot;,&quot;subdomain&quot;:&quot;deepvoice&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A newsletter on how to get more motivated. 
Brought to you by Addy Osmani.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/328afbab-a375-4ffd-ac83-40300eefc225_1280x1280.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#786CFF&quot;,&quot;created_at&quot;:&quot;2023-12-28T20:05:46.081Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Addy Osmani&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;,&quot;source&quot;:null}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://addyo.substack.com/p/the-70-problem-hard-truths-about?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!8WxC!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Elevate</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">The 70% problem: Hard truths about AI-assisted coding</div></div><div 
class="embedded-post-body">After spending the last few years embedded in AI-assisted development, I've noticed a fascinating pattern. While engineers report being dramatically more productive with AI, the actual software we use daily doesn&#8217;t seem like it&#8217;s getting noticeably better. What's going on here&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 years ago &#183; 1514 likes &#183; 75 comments &#183; Addy Osmani</div></a></div><p>AI should avoid cleverness. If a simpler solution exists, pick it.</p><h3>&#9989; Don&#8217;t Narrate, Ship</h3><blockquote><p>&#8220;Avoid over-documenting implementation details or rationale unless explicitly required.&#8221;</p></blockquote><p>AI-generated README essays? Save it. We want user-facing docs, not ChatGPT&#8217;s inner monologue.</p><h2>The Bigger Picture</h2><p>Now, every AI task respects my <strong>GitOps-first</strong>, config-driven architecture. New flags go in <code>runner.yaml</code>, not <code>os.getenv</code>. Every schema change comes with a schema update and test. Our platform evolved. Our AI did too.</p><p><code>.cursor/rules</code> isn&#8217;t just instructions &#8211; it&#8217;s <strong>culture</strong>.</p><h2>AI Engineers Need Culture Too</h2><p>We wouldn&#8217;t onboard a junior dev by saying &#8220;just read the codebase.&#8221; We give them docs, a buddy, some rules. We didn&#8217;t do that for our AI&#8230; until it started acting like a junior dev with caffeine poisoning.</p><p><code>.cursor/rules</code> became my AI onboarding playbook. It cut entropy, limited scope creep, and gave the AI the same values we hold. And when your AI shares your values? 
It becomes a real teammate.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Asset-Centric Orchestration: Focus on "What," Not "How"]]></title><description><![CDATA[Why data engineering is stuck in a binary deadlock]]></description><link>https://blog.dativo.io/p/asset-centric-orchestration-focus</link><guid isPermaLink="false">https://blog.dativo.io/p/asset-centric-orchestration-focus</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Tue, 30 Dec 2025 14:10:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qTVw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I've built data platforms for years, and one thing is clear: modern "no-code" ingestion tools promise the moon but often leave you hanging. The current research establishes that data engineering is stuck in a <strong>binary deadlock</strong>. 
Organizations must choose between the <strong>Managed Service Tax</strong> (convenience at the cost of opaque billing) and the <strong>Operational Tax</strong> (flexibility at the cost of human capital).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qTVw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qTVw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qTVw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg" width="914" height="545" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:545,&quot;width&quot;:914,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:183676,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/182860362?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qTVw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qTVw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F887dde6a-fdad-43b8-8b5d-48d12ea61158_914x545.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A portrait of Guan Zhong from a stone relief at the Wu family shrines.</figcaption></figure></div><p><em>More than 2,600 years ago, the Chinese statesman <a href="https://en.wikipedia.org/wiki/Guan_Zhong">Guan Zhong</a> faced a familiar dilemma: tax people directly and risk revolt, or lower taxes and starve the state. His solution was neither. 
He eliminated most direct taxes entirely and moved revenue into infrastructure&#8212;salt, iron, trade&#8212;making taxation predictable, indirect, and almost invisible.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data, Engineering, and Beyond! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Modern data ingestion is stuck in the same deadlock. Managed services impose a hidden tax through opaque pricing and resync shocks. Code-first pipelines impose an operational tax through human toil. As an engineering leader who has faced late-night firefighting and &#8220;data swamps,&#8221; I&#8217;ve realized that the era of fragmented &#8220;best-of-breed&#8221; tooling is ending, replaced by integrated platforms that offer streamlined workflows at the cost of increasing vendor lock-in and financial unpredictability. I believe the industry should take a third path: shifting cost upstream into structure, specs, and contracts&#8212;so ingestion stops taxing people at all.</p><h2><strong>The Binary Deadlock: Fivetran, dbt, and the &#8220;Fusion&#8221; Shift</strong></h2><p>The October 2025 merger between Fivetran and dbt Labs created a unified big data ETL player. This signals a shift toward integrated stacks where ingestion and transformation are unified. 
However, this consolidation brings risks for data teams:</p><ul><li><p><strong>The Licensing Divergence:</strong> While dbt Core remains Apache 2.0, the "future vision" for transformation lies in <strong>dbt Fusion</strong>. Written in Rust, <a href="https://www.getdbt.com/blog/new-code-new-license-understanding-the-new-license-for-the-dbt-fusion-engine">Fusion is licensed under the </a><strong><a href="https://www.getdbt.com/blog/new-code-new-license-understanding-the-new-license-for-the-dbt-fusion-engine">Elastic License v2 (ELv2)</a></strong>&#8212;a "source-available" license that prohibits using it to provide managed services to others.</p></li><li><p><strong>The Innovation Tax:</strong> As dbt Core enters <a href="https://www.tobikodata.com/blog/dbt-fusion-death-of-dbt-core">"maintenance mode"</a>, innovation is reserved for proprietary engines, forcing organizations into a "dangerously dependent" relationship with a single vendor.</p></li></ul><h2><strong>GUI-First vs. Specification-First: The Hidden Operational Costs</strong></h2><p>Airbyte and Fivetran shine with slick UIs. But that magic comes at the cost of control. 
When engineering talent spends <strong><a href="https://ctomagazine.com/tackling-tech-debt-boosts-agility-data-engineering/">30% to 40% of their time "firefighting"</a></strong>&#8212;fixing broken pipelines and resolving sync failures&#8212;the long-term health of the platform goes into terminal decline.</p><ul><li><p><strong>Ingestion Debt:</strong> According to the <a href="https://ctomagazine.com/tackling-tech-debt-boosts-agility-data-engineering/">2024 Global Data Engineering Report</a>, <strong>64% of data leaders</strong> report that technical debt significantly limits their ability to achieve business goals.</p></li><li><p><strong>The &#8220;Fragility Loop&#8221;:</strong> Driven by pressure to deliver, engineers adopt brittle scripts that create a &#8220;data death cycle,&#8221; where every minor change requires manual oversight and eventually <a href="https://datalere.com/articles/two-realities-behind-data-engineering-delays">consumes 60% to 80% of the team&#8217;s capacity</a>.</p></li></ul><h2><strong>&#8220;Open-Source&#8221; Connectors: The Limits and a Solution</strong></h2><p>Meltano and the Singer ecosystem offer a code-first illusion, but the reality is a maintenance nightmare of outdated taps. To address the brittleness of ad-hoc fixes tied to a single open-source provider, I advocate for <strong>Spec-Driven Data Pipelines</strong>, which declare the expected input and the desired output. Connectors then serve as purely technical, interchangeable providers:</p><ul><li><p><strong>Precision over Prompting:</strong> Spec-Driven Development should use formal, machine-readable specifications as a single source of truth for both input and output. 
This structured collaboration aims for <strong>95% or higher accuracy</strong> in implementing specs on the first attempt.</p></li><li><p><strong>Spec-Driven Workflow:</strong> By moving intellectual effort &#8220;upstream&#8221; to the <strong>Specify</strong> and <strong>Plan</strong> phases, teams capture the &#8220;why&#8221; behind technical choices, preventing &#8220;intent-vs-implementation drift&#8221;. Of course, the technical aspects of connectors and data movers are still present, but they matter less and can be swapped for other providers when needed.</p></li></ul><h2><strong>Data Contracts: Beyond &#8220;Bring First, Decide Later&#8221;</strong></h2><p>The traditional approach of ingesting raw data and transforming it later leads to data swamps. Data quality issues cost organizations an average of <strong><a href="https://angrynerds.co/blog/data-engineering-solutions-4-business-challenges/">$12.9 million annually</a></strong>.</p><ul><li><p><strong>Fail-Closed Validation:</strong> We declare machine-readable YAML contracts at the source. If a source system change violates the contract, the deployment is rejected&#8212;a <strong>&#8220;fail-closed&#8221;</strong> gate that prevents downstream pollution.</p></li><li><p><strong>Schema-as-Code:</strong> By versioning every asset's schema in Git and running strict validation at ingestion time, we treat data movement with the same rigor as application code.</p></li></ul><h2><strong>Asset-Centric Orchestration: Focus on &#8220;What,&#8221; Not &#8220;How&#8221;</strong></h2><p>Traditional orchestrators like Airflow are task-centric, focusing on workflow execution. 
We should use an <strong>asset-centric</strong> approach to keep the focus on what really matters: &#8220;what&#8221; we expect the data pipeline to produce.</p><ul><li><p><strong>Native Lineage:</strong> In an asset-centric model, the focus is on the data products being produced. Lineage is then captured automatically in a unified graph, making it easier to trace origin and transformation.</p></li><li><p><strong>One Job per Asset:</strong> Enforcing a one-asset-per-job design pattern simplifies retry logic and ensures that if one of fifty tables fails, we re-run only that specific job rather than retrying a massive, entangled pipeline.</p></li></ul><h2><strong>The &#8220;Third Way&#8221;</strong></h2><p>After slogging through these limitations, I built <a href="https://github.com/dativo-io/dativo-ingest">dativo-ingest</a> to test whether the binary deadlock can be broken.</p><ul><li><p><strong>Headless &amp; Config-Driven:</strong> No UI needed; pipelines are defined in YAML under GitOps, making deployments auditable.</p></li><li><p><strong>Lakehouse-Native:</strong> It writes Apache Iceberg tables directly and updates metadata via Nessie, ensuring ACID semantics and propagating FinOps tags directly into table properties.</p></li><li><p><strong>Metadata-Driven Production Readiness:</strong> I specifically designed it for high-scale environments like Databricks Lakeflow, generating production-ready code automatically from formal specs.</p></li></ul><p>I am going to continue posting about my findings, but to underline it once more: Dativo Ingest isn&#8217;t just about connectors; they are extremely replaceable in the YAML configuration. 
The non-negotiable parts are the constraints&#8212;like upfront schemas and tags&#8212;because they force the discipline required to escape the data death cycle and reclaim technical agency.</p>]]></content:encoded></item><item><title><![CDATA[Ingestion into DataLake Without Illusions]]></title><description><![CDATA[When the Tools Fall Short]]></description><link>https://blog.dativo.io/p/ingestion-into-datalake-without-illusions</link><guid isPermaLink="false">https://blog.dativo.io/p/ingestion-into-datalake-without-illusions</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Sun, 28 Dec 2025 14:15:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!C0V2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!C0V2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C0V2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C0V2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3037496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/182762060?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C0V2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!C0V2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafff7cf5-6505-4ef1-87c4-7a64d2bef6fb_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Classical myth of an order imposed on chaos. (<em>Chaos &#8594; Structure &#8594; Governance</em>)</figcaption></figure></div><p></p><p>The foundational promise of the modern data stack was a seductive one: that data ingestion could be reduced to a commodity utility, a set of plug-and-play connectors that would liberate data engineers from the drudgery of "glue code" and allow them to focus on high-value analytics. For nearly a decade, this narrative fueled the meteoric rise of companies like Fivetran and dbt Labs, predicated on the idea that the "best-of-breed" modularity of a fragmented stack was inherently superior to the monolithic architectures of the past. 
However, as these systems have encountered the reality of petabyte-scale ingestion, complex multi-tenant requirements, and the stringent demands of the data lakehouse, the "Modern Data Stack" has transitioned from a celebrated innovation to a source of profound technical and economic disillusionment.</p><p>Current practitioner sentiment across communities like Reddit and Hacker News suggests that the Data Stack is increasingly viewed as a "<a href="https://en.wikipedia.org/wiki/Rube_Goldberg_machine#:~:text=A%20Rube%20Goldberg%20machine%2C%20named,in%20achieving%20a%20stated%20goal.">Rube Goldberg-esque</a>"<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> infrastructure that consumes up to 80% of an engineering team&#8217;s bandwidth just to keep the lights on. The modular dream has mutated into a maintenance nightmare characterized by unmanageable tool sprawl, opaque usage-based costs, and a fragmentation of metadata that leaves organizations with no single source of truth. 
This report analyzes the technical failures of the industry&#8217;s primary ingestion tools, the strategic implications of the recent Fivetran&#8211;dbt merger, and the emerging engineering philosophies designed to move the industry past the illusions of "plug-and-play" data movement.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s7nU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s7nU!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 424w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 848w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 1272w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s7nU!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif" width="428" height="302" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:302,&quot;width&quot;:428,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;undefined&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="undefined" title="undefined" srcset="https://substackcdn.com/image/fetch/$s_!s7nU!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 424w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 848w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 1272w, https://substackcdn.com/image/fetch/$s_!s7nU!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4caba4f7-c3e1-4568-bf16-391d78b4c533_428x302.gif 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 
7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Technical Erosion of Managed Ingestion Tools</strong></h2><p>The primary critique leveled against the current generation of ingestion tools&#8212;specifically Fivetran, Airbyte, and Meltano&#8212;is that they often prioritize ease of initial setup over long-term operational stability and cost predictability. While Fivetran may offer the "purest" managed experience, its architectural choices and billing practices have led to widespread skepticism among practitioners who value transparency and control. Conversely, open-source alternatives like Airbyte and Meltano, while offering greater flexibility, introduce significant operational "taxes" that are often understated in their marketing materials.</p><p>Building a data lake is often perceived as a simple act of "dumping" data into cheap storage, but the reality of ingestion involves complex trade-offs between managed convenience and open-source flexibility. 
Organizations frequently underestimate the "operational taxes" associated with both paths, leading to either unpredictable billing or significant engineering debt.</p><p></p><h3><strong>1. Fivetran and the Black-box Illusion of Predictability</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OS2D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OS2D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OS2D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png" width="1024" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OS2D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!OS2D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbfff5b-8279-480d-873d-82a0bf63135d_1024x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A black box without understanding what is happening inside.</figcaption></figure></div><p>Fivetran is often cited as the "purest" managed experience, yet practitioners increasingly voice skepticism regarding its architectural choices and billing transparency. <a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> Specifically:</p><h4><strong>The &#8220;MAR&#8221; Billing and &#8220;Resync Tax&#8221;</strong></h4><p>The central friction point is the <strong>Monthly Active Rows (MAR)</strong> model. 
Users frequently encounter &#8220;bill shock&#8221; because the logic for what constitutes an &#8220;active&#8221; row is often opaque and proprietary.</p><ul><li><p><strong>Unpredictability:</strong> Technical operations, such as a database migration or a broad column-type update, can trigger massive row counts, effectively acting as a &#8220;resync tax&#8221; on necessary maintenance.</p></li><li><p><strong>Throughput Constraints:</strong> Despite high costs, users have reported transfer speeds topping out at 5 MB/s during full resyncs, making recovery for multi-terabyte databases a process that can take weeks (at that rate, a single terabyte takes more than two days).</p></li></ul><h4><strong>Architectural Opacity</strong></h4><p>Critics describe Fivetran as a "black box" where business logic is buried, making audits and incident debugging complex undertakings. A specific technical critique involves Fivetran's allegedly problematic handling of the Postgres Write-Ahead Log (WAL); if the ingestion engine requires a log that has been rotated, the pipeline may get "stuck," necessitating a full, expensive resync.</p><h3><strong>2. 
The Open-Source Reality: Navigating Operational Taxes</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rLIx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rLIx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rLIx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png" width="1024" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rLIx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!rLIx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f5d368-a326-45ca-9cf0-f7edd35d0e1d_1024x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Install now, regret later.</figcaption></figure></div><p>Open-source alternatives like Airbyte and Meltano offer transparency and avoid vendor lock-in, but they introduce a "DevOps tax" that is often understated.</p><h4><strong>The Airbyte &#8220;DevOps Tax&#8221;</strong></h4><p>Self-hosting Airbyte requires significant infrastructure management, including Kubernetes orchestration, backups, and security patching.</p><ul><li><p><strong>Maintenance Effort:</strong> While Airbyte offers 600+ connectors, only about 150 are officially maintained. 
The remainder are community-contributed and can be &#8220;brittle and immature&#8221; at scale, forcing engineers to spend substantial time &#8220;babysitting&#8221; and debugging silent failures.</p></li><li><p><strong>Operational Overhead:</strong> Organizations often underestimate the labor hours required; some users report that self-hosting issues can occur several times a month, each taking hours to resolve.</p></li></ul><h4><strong>The Meltano &#8220;Engineering Tax&#8221;</strong></h4><p>Meltano follows a &#8220;code-first&#8221; philosophy that appeals to developers but imposes a steep learning curve for non-technical users.</p><ul><li><p><strong>Talent Requirements:</strong> Managing Meltano's plugin-based architecture requires specialized skills in Python and YAML. For teams lacking this foundation, the cost of compute and support labor often outweighs any software savings.</p></li><li><p><strong>Dependency Hell:</strong> Because Meltano orchestrates various plugins (e.g., Singer taps/targets, dbt models), maintaining compatibility after updates becomes a complex governance task that can lead to &#8220;pipeline rot.&#8221;</p></li></ul><h3><strong>3. 
Total Cost of Ownership (TCO) Spectrum</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-SW7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-SW7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 424w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 848w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 1272w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-SW7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp" width="640" height="360" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:360,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The Matrix's real-world legacy - from red pill incels to ...&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The Matrix's real-world legacy - from red pill incels to ..." title="The Matrix's real-world legacy - from red pill incels to ..." srcset="https://substackcdn.com/image/fetch/$s_!-SW7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 424w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 848w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 1272w, https://substackcdn.com/image/fetch/$s_!-SW7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a5606f0-744d-4380-9b08-5774b4dbc8a4_640x360.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Red pill of truth or blue pill of denial.</figcaption></figure></div><p>The decision between managed and open-source is essentially a choice of where to pay the "tax": in OpEx fees or in human capital. Human capital is the largest and most frequently underestimated expense on the open-source path. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iw-9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iw-9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 424w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 848w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 1272w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iw-9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png" width="986" height="390" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1fc70b9-70d5-4555-9417-c5343df33614_986x390.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:390,&quot;width&quot;:986,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.dativo.io/i/182762060?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iw-9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 424w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 848w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 1272w, https://substackcdn.com/image/fetch/$s_!iw-9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fc70b9-70d5-4555-9417-c5343df33614_986x390.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 
20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>4. 
Technical Debt: The Data Swamp Consequences</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4jvK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4jvK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg" width="768" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!4jvK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Poorly managed ingestion, regardless of the tool, leads to &#8220;ingestion debt&#8221; and the creation of a &#8220;data swamp.&#8221;</p><ul><li><p><strong>Schema Drift:</strong> Source systems evolve independently; without automated change detection (e.g., CDC), a simple column rename can break downstream dashboards instantly.</p></li><li><p><strong>Over-partitioning:</strong> Sensible partitioning pays off: a partitioned query may take 30 seconds, while an unpartitioned one can take over 4 minutes. Over-partitioning erodes that gain, because creating too many small files can slow query performance significantly.</p></li><li><p><strong>Observability Gaps:</strong> When pipelines lack metadata and lineage, debugging becomes nearly impossible. Practitioners report that analytics teams can spend up to 35% of their time explaining why numbers differ rather than generating insights.</p></li></ul><h3><strong>5. 
Strategic Synthesis: Breaking the Binary Choice with the &#8220;Third Way&#8221;</strong></h3><p></p><p>Modern software engineering has traditionally forced teams into a binary deadlock: paying a heavy "managed service tax" (high variable costs and vendor lock-in) or a heavy "operational tax" (unmanageable infrastructure and labor overhead). Recognising that neither path is ideal, I am always curious about a "Third Way" strategy: a hybrid approach designed to combine the benefits of both while ensuring the <em>combined</em> cost is lower than either individual option.</p><h4><strong>The Philosophy of Combined Efficiency</strong></h4><p>To address this binary deadlock, I built <a href="https://github.com/dativo-io/dativo-ingest">dativo-ingest</a> as a framework that rejects the trade-off between control and convenience. This strategy optimizes the tax burden by accepting a minimized version of both:</p><ul><li><p><strong>Eliminating the Managed Markup:</strong> Instead of paying high MAR premiums for simple replication, the Third Way uses automated, Python-native libraries to control exactly what is synced. This avoids the &#8220;money grabs&#8221; common in managed platforms where unrequested tables or vendor-default settings inflate bills.</p></li><li><p><strong>Decoupling from Infrastructure Chaos:</strong> By utilizing lightweight frameworks that live in Git and deploy via standard CI/CD, teams avoid the &#8220;DevOps tax&#8221; of managing dedicated Kubernetes clusters. Ingestion is treated as a software component within the existing engineering stack, not as a standalone platform to be &#8220;babysat&#8221;.</p></li></ul><h4><strong>Implementation: Spec-Driven Development (SDD)</strong></h4><p>I believe that a core pillar of this Third Way is <strong>Spec-Driven Development (SDD)</strong>. 
Rather than relying on &#8220;vibe-coding&#8221; (ad-hoc prompts and conversational configuration), frameworks like <code>dativo-ingest</code> utilize formal specifications as the source of truth.</p><ul><li><p><strong>Architectural Firewalls:</strong> SDD allows teams to build production-ready pipelines with up to 95% accuracy on the first implementation. This ensures that human developers and AI agents alike can maintain consistent, maintainable, and highly auditable pipelines.</p></li><li><p><strong>Unified Control Plane:</strong> This model permits organizations to pay a negligible &#8220;operational tax&#8221; for code maintenance and a small &#8220;service tax&#8221; for the underlying compute (e.g., Lambda or serverless executors). The result is an ingestion engine that provides the transparency of open-source with the speed of managed automation, without the scaling headaches of either.</p></li></ul><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>In the context of actual infrastructure, the term implies systems that are:</p><ul><li><p><strong>Overly complex:</strong> Involving many unnecessary steps or components.</p></li><li><p><strong>Inefficient:</strong> Disproportionate to the simplicity of the goal they achieve.</p></li><li><p><strong>Convoluted:</strong> Difficult to understand, manage, or maintain due to their complexity.</p></li><li><p><strong>Prone to failure:</strong> A breakdown in any single, non-essential link can disrupt the entire process.</p></li></ul></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>The perfect problem statement - &#8220;Why is it so hard to buy things that work well?&#8221; <a 
href="https://news.ycombinator.com/item?id=42430450">https://news.ycombinator.com/item?id=42430450</a></p><p></p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[[E]Commerce 101: Architecture.]]></title><description><![CDATA[Behind the Scenes: The Architecture Powering E-Commerce]]></description><link>https://blog.dativo.io/p/ecommerce-101-architecture</link><guid isPermaLink="false">https://blog.dativo.io/p/ecommerce-101-architecture</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Tue, 03 Dec 2024 12:47:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nqOY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nqOY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nqOY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nqOY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nqOY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!nqOY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nqOY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg" width="1456" height="2188" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2188,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3552637,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nqOY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nqOY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nqOY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!nqOY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe15672eb-f114-4e31-aedc-a349c0821d72_3280x4928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Image: Marrakesh Market by <a href="https://unsplash.com/@anniespratt">Annie Spratt (via Unsplash)</a></figcaption></figure></div><h1>Introduction</h1><p>Products are often sold through multiple sales channels, or &#8220;storefronts,&#8221; each with unique marketing and point-of-sale surfaces. 
Managing these channels typically spans teams, organizations, and even company boundaries. Each storefront is powered by a diverse set of Revenue and Engagement services, collectively referred to as the &#8220;Commerce Platform&#8221; (sometimes also the &#8220;Growth Platform&#8221;).</p><p></p><p></p><h1>The Platform</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0s0I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0s0I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 424w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 848w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 1272w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0s0I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png" width="1456" height="644" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:644,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:192116,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0s0I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 424w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 848w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 1272w, https://substackcdn.com/image/fetch/$s_!0s0I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F196f25e5-95d4-430b-9b18-d768ca112cc5_3336x1476.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3><strong>1. Customer Experience Layer</strong></h3><p><strong>Focuses on customer interaction and engagement.</strong></p><h4><strong>Products &amp; Pricing</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Manage catalog, pricing, incentives, and SKU relationships in a centralized system.</p></li><li><p>Enable dynamic pricing models, experimental price testing, and reseller-specific discounts.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Marketplaces: Managing Region-Specific Pricing and Availability Across Multiple Sellers</strong></p><p>Implementing geographical pricing involves adjusting prices based on local market demand, taxes, and shipping costs, which can be complex in diverse regions. 
[<a href="https://brightdata.com/blog/proxy-101/location-based-pricing">Geographical Pricing Explained: How Location Impacts Prices</a>]</p></li><li><p><strong>Subscription-Based Storefronts: Configuring Trials, Upgrades, and Add-Ons Dynamically</strong></p><p>Offering flexible subscription models with trials and add-ons requires sophisticated billing systems to manage various customer preferences and billing cycles. [<a href="https://www.oracle.com/industrial-manufacturing/challenges-of-a-subscription-based-business-model/">The Challenges of a Subscription-Based Business Model</a>]</p></li></ol><div><hr></div><h4><strong>Personalization</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Deliver tailored recommendations, dynamic offers, and notifications.</p></li><li><p>Provide experimentation tools to optimize personalized content.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Traditional E-Commerce: Generic Recommendations Limit Engagement</strong></p><ul><li><p>Without personalized recommendations, customers may experience irrelevant product suggestions, leading to decreased engagement and sales. [<a href="https://www.symson.com/blog/the-12-biggest-pricing-challenges">The 12 Biggest Pricing Challenges</a>]</p></li></ul></li><li><p><strong>Social Media Storefronts: Limited Ability to Personalize Shopping In-App</strong></p><ul><li><p>Integrating personalized shopping experiences within social media platforms can be challenging due to data privacy concerns and platform restrictions. [<a href="https://dataweave.com/blog/why-localized-store-specific-pricing-and-availability-insights-is-critical-for-consumer-brands">Why Localized, Store-Specific Pricing and Availability Insights is Critical for Consumer Brands</a>]</p></li></ul></li></ol><div><hr></div><h3><strong>2. 
Commerce Operations Layer</strong></h3><p><strong>Manages workflows, entitlements, and payment optimization.</strong></p><h4><strong>Billing &amp; Revenue Management</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Support flexible billing models (one-time, recurring, usage-based).</p></li><li><p>Federate billing across providers and automate revenue recognition.</p></li><li><p>Use payment retries and dunning strategies to reduce revenue leakage.</p></li><li><p>Automate multi-step workflows (purchase, provisioning, refunds).</p></li><li><p>Detect and resolve errors in processes with reusable, composable workflows.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Marketplaces: Ensuring Accurate Revenue Sharing Among Sellers</strong></p><ul><li><p>Accurately distributing revenue among multiple sellers requires transparent and efficient systems to handle complex transactions. [<a href="https://www.bcg.com/publications/2024/how-b2b-marketplaces-are-rewriting-rules-of-trade">How B2B Marketplaces Are Rewriting the Rules of Trade</a>]</p></li></ul></li><li><p><strong>Custom-Built Storefronts: High Error Rates in Manual Order Workflows</strong></p><ul><li><p>Manual order processing is prone to errors, leading to delays and customer dissatisfaction. [<a href="https://www.walkme.com/blog/7-software-implementation-challenges/">7 Software Implementation Challenges &amp; How to Solve Them</a>]</p></li></ul></li><li><p><strong>Subscription-Based Storefronts: Managing Complex Billing Schedules</strong></p><ul><li><p>Handling various subscription tiers, billing cycles, and payment methods can be challenging without robust billing systems. 
[<a href="https://www.onebillsoftware.com/blog/recurring-billing-challenges-and-one-solution-to-solve-them/">5 Recurring Billing Challenges and How to Overcome Them</a>]</p></li></ul></li></ol><div><hr></div><h4><strong>Entitlements &amp; Provisioning</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Centralized management of user entitlements, groups, and access.</p></li><li><p>Automate provisioning of subscriptions and track feature compliance.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>SaaS Storefronts: Difficulty Integrating Entitlement Changes Across Systems</strong></p><ul><li><p>Synchronizing entitlement updates across various platforms can lead to inconsistencies and access issues. [<a href="https://www.walkme.com/blog/7-software-implementation-challenges/">7 Software Implementation Challenges &amp; How to Solve Them</a>]</p></li></ul></li><li><p><strong>App Stores: Synchronizing Entitlement Changes with External Platforms</strong></p><ul><li><p>Coordinating entitlement changes with third-party app stores requires seamless integration to ensure user access is up-to-date. [<a href="https://www.asbn.com/scale-your-business/ecommerce/mastering-multichannel-management-10-strategies-for-overcoming-challenges/">Mastering Multichannel Management: 10 Strategies for Overcoming Challenges</a>]</p></li></ul></li></ol><div><hr></div><h3><strong>3. 
Data Management &amp; Intelligence Layer</strong></h3><p><strong>Centralizes customer data, identity, and operational insights.</strong></p><h4><strong>Identity</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Maintain a canonical representation of users through a universal directory.</p></li><li><p>Provide membership and access management primitives (e.g., Groups and Resources).</p></li><li><p>Validate user authentication and ensure seamless interaction across systems.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Marketplace Storefronts: Identity Federation Across Different Marketplaces</strong></p><ul><li><p>Integrating user identities across various marketplaces requires robust federation protocols to maintain consistency. [<a href="https://www2.deloitte.com/content/dam/Deloitte/us/Documents/public-sector/us-fed-deloitte-identity-federation-governance-05212015.pdf">Identity Federation Governance</a>]</p></li></ul></li><li><p><strong>Social Media Storefronts: Data Alignment Issues with External CRMs</strong></p><ul><li><p>Integrating social media interactions with external CRM systems can lead to data misalignment, affecting customer insights. [<a href="https://www.forbes.com/councils/forbescommunicationscouncil/2021/09/21/13-tips-to-connect-and-integrate-crm-tools-and-social-media-efforts/">13 Tips To Connect And Integrate CRM Tools And Social Media Efforts</a>]</p></li></ul></li><li><p><strong>Client App Storefronts: Seamless Transition of Subscriptions and Roles Across Devices</strong></p><ul><li><p>Ensuring that user subscriptions, entitlements, and roles are consistently maintained across multiple devices poses technical challenges. 
[<a href="https://optimalidm.com/resources/blog/identity-management-challenges-for-retailers/">Identity Management Challenges for Retailers</a>]</p></li></ul></li></ol><div><hr></div><h4><strong>Customer Data Platform (CDP)</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Build unified customer profiles by aggregating data from various touchpoints.</p></li><li><p>Segment customers for targeted marketing initiatives.</p></li><li><p>Orchestrate campaigns, customer journeys, and multi-channel communications.</p></li><li><p>Provide actionable insights to drive customer retention and acquisition strategies.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Social Media Storefronts: Limited Access to First-Party Data for Targeting</strong></p><ul><li><p>Social media platforms often restrict access to granular user data, hindering personalized marketing efforts. [<a href="https://www.cmswire.com/digital-marketing/first-party-data-the-benefits-and-challenges-for-marketers/">First-Party Data: The Benefits and Challenges for Marketers</a>]</p></li></ul></li><li><p><strong>Marketplaces: Fragmented Customer Insights Due to Siloed Seller Data</strong></p><ul><li><p>Data silos across different sellers prevent a holistic view of customer behavior and preferences.</p></li></ul></li></ol><div><hr></div><h4><strong>Analytics &amp; Reports</strong></h4><p><strong>Functionality:</strong></p><ul><li><p>Provide real-time dashboards displaying revenue, campaign performance, and operational metrics.</p></li><li><p>Enable predictive analytics to identify opportunities for upselling and to mitigate customer churn.</p></li></ul><p><strong>Challenges:</strong></p><ol><li><p><strong>Dropshipping: Lack of Visibility into Supplier Performance Metrics</strong></p><ul><li><p>Dropshipping models often suffer from insufficient data on supplier reliability, affecting inventory management and customer satisfaction. [<a 
href="https://www.dragonsourcing.com/supplier-performance-measurement-challenges/">Supplier Performance Measurement Challenges: A Detailed Analysis</a>]</p></li></ul></li><li><p><strong>Traditional E-Commerce: Manual and Inconsistent Campaign Performance Reporting</strong></p><ul><li><p>Relying on manual processes for campaign reporting leads to inconsistencies and delays in decision-making.</p></li></ul></li></ol><p></p><h1>So what?</h1><p>The secret to a thriving e-commerce platform isn&#8217;t just the tools&#8212;it&#8217;s how you manage them. Success lies in <strong>centralized governance</strong>, <strong>industry standards</strong>, and <strong>powerful analytics</strong> that empower smarter decisions. Here&#8217;s how to make it happen:</p><ol><li><p><strong>Centralize for Control and Consistency</strong><br>Use a single playbook to guide every product update, pricing tweak, and campaign launch. With unified governance, you eliminate silos, ensure consistency, and make collaboration seamless across teams and storefronts.</p></li><li><p><strong>Speak the Same Language with Industry Standards</strong><br>Don&#8217;t reinvent the wheel. Use established standards such as SCIM for identity management, or a RACI matrix to define team roles clearly. Standardized approaches let you focus on growth, not operational chaos, while ensuring your platform scales effortlessly.</p></li><li><p><strong>Let Data Lead the Way</strong><br>Data isn&#8217;t just numbers&#8212;it&#8217;s your strategy&#8217;s backbone. Build real-time dashboards that tell you what&#8217;s working and what&#8217;s not, and use predictive analytics to stay ahead of churn and seize new opportunities. Insights like these turn guesswork into growth.</p></li></ol><p>By putting governance, standards, and analytics at the heart of your platform, you&#8217;re not just building for today&#8212;you&#8217;re setting the stage for lasting success. 
It&#8217;s time to make your e-commerce ecosystem smarter, faster, and ready to thrive.</p>]]></content:encoded></item><item><title><![CDATA[[E]Commerce 101: Storefronts.]]></title><description><![CDATA[1. Introduction to Storefronts. This article explores the various types of e-commerce storefronts, their unique characteristics, key features, and real-world examples.]]></description><link>https://blog.dativo.io/p/ecommerce-101-storefronts</link><guid isPermaLink="false">https://blog.dativo.io/p/ecommerce-101-storefronts</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Wed, 27 Nov 2024 09:50:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zOEr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zOEr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zOEr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 424w, https://substackcdn.com/image/fetch/$s_!zOEr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 848w, 
https://substackcdn.com/image/fetch/$s_!zOEr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 1272w, https://substackcdn.com/image/fetch/$s_!zOEr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zOEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png" width="1024" height="576" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1642482,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zOEr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 424w, https://substackcdn.com/image/fetch/$s_!zOEr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 848w, 
https://substackcdn.com/image/fetch/$s_!zOEr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 1272w, https://substackcdn.com/image/fetch/$s_!zOEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1779121d-485f-4731-8338-47e9f6d76a5f_1024x576.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>In E-commerce or SaaS, <strong>storefronts</strong> refer to the digital equivalents of physical retail 
shops&#8212;websites or online platforms where businesses showcase and sell their products or services. These storefronts are carefully designed to create a seamless shopping experience for customers, acting as the brand's primary customer-facing interface. They often include features such as:</p><ol><li><p><strong>Product Displays</strong>: Showcasing items with images, descriptions, prices, and specifications.</p></li><li><p><strong>Search and Navigation</strong>: Enabling users to explore product categories or search for specific items.</p></li><li><p><strong>Custom Branding</strong>: Incorporating brand elements like logos, colors, and messaging.</p></li><li><p><strong>Personalization</strong>: Tailoring the experience with recommendations and targeted offers based on user behavior.</p></li><li><p><strong>Customer Interaction</strong>: Facilitating reviews, FAQs, and chat support for better engagement.</p></li><li><p><strong>Transaction Capabilities</strong>: Secure and efficient checkout, payment processing, and order tracking.</p></li></ol><p>Storefronts are crucial in e-commerce as they influence customer impressions, conversions, and brand loyalty. 
The best storefronts are not only aesthetically pleasing but also optimized for performance, mobile compatibility, and user experience.</p><p></p><h1>Types of Storefronts</h1><h3><strong>1. Traditional E-Commerce Standalone Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BQxP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BQxP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 424w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 848w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 1272w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BQxP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png" width="1456" height="753" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/acd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:753,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1132068,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BQxP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 424w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 848w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 1272w, https://substackcdn.com/image/fetch/$s_!BQxP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd13e38-f66a-4abb-a0b2-39c0bb758dfc_3430x1774.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<ul><li><p><strong>Description</strong>: Standalone websites where businesses sell directly to consumers. These storefronts are fully branded and controlled by the business.</p></li><li><p><strong>Examples</strong>: Nike.com, Wizzair.com, ea.com.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Product catalog management</strong> with images, descriptions, and prices.</p></li><li><p>Advanced search and filtering options.</p></li><li><p>Secure checkout with multiple payment options.</p></li><li><p>Shipping and delivery integration.</p></li><li><p><strong>Own customer accounts</strong> for order tracking and preferences.</p></li><li><p>Promotions and discount management.</p></li></ul></li></ul><div><hr></div><h3><strong>2. 
Marketplace Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!237x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!237x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 424w, https://substackcdn.com/image/fetch/$s_!237x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 848w, https://substackcdn.com/image/fetch/$s_!237x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 1272w, https://substackcdn.com/image/fetch/$s_!237x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!237x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png" width="1456" height="749" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:749,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1300446,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!237x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 424w, https://substackcdn.com/image/fetch/$s_!237x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 848w, https://substackcdn.com/image/fetch/$s_!237x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 1272w, https://substackcdn.com/image/fetch/$s_!237x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97bca28d-724c-4b17-b4b8-dd5a5c0fd242_3440x1770.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p></p><ul><li><p><strong>Description</strong>: Platforms that aggregate multiple sellers, allowing them to sell their products under one digital roof. Customers can browse offerings from various vendors.</p></li><li><p><strong>Examples</strong>: Amazon Marketplace, Etsy, Allegro.</p></li><li><p><strong>Features</strong>:</p><ul><li><p>Seller registration and profile management.</p></li><li><p><strong>Multi-seller inventory management</strong>.</p></li><li><p>Order splitting and commission calculations.</p></li><li><p><strong>Ratings and reviews</strong> for sellers and products.</p></li><li><p><strong>Dispute resolution tools</strong> for buyers and sellers.</p></li></ul></li></ul><div><hr></div><h3><strong>3. 
Social Media Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S3yi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S3yi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 424w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 848w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S3yi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png" width="1456" height="786" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S3yi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 424w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 848w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!S3yi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1313f16-a360-45ae-8b53-98d8f8c1e6de_1867x1008.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<ul><li><p><strong>Description</strong>: Integrated shopping experiences on social media platforms, enabling businesses to sell directly through posts, reels, or ads.</p></li><li><p><strong>Examples</strong>: Instagram Shops, Facebook Shops, TikTok Shopping.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Product tagging</strong> in posts, reels, or videos.</p></li><li><p>In-app checkout without leaving the platform.</p></li><li><p>AI-driven <strong>product recommendations based on engagement</strong>.</p></li><li><p>Messaging tools for customer queries.</p></li><li><p>Analytics for <strong>campaign performance and conversions</strong>.</p></li></ul><p></p></li></ul><div><hr></div><h3><strong>4. 
Dropshipping Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BMAK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BMAK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 424w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 848w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 1272w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BMAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png" width="1456" height="753" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28025159-b420-4f48-a950-500e5af83033_3454x1786.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:753,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1209877,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BMAK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 424w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 848w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 1272w, https://substackcdn.com/image/fetch/$s_!BMAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28025159-b420-4f48-a950-500e5af83033_3454x1786.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Stores that act as intermediaries, where businesses don&#8217;t hold inventory but fulfill orders directly through suppliers.</p></li><li><p><strong>Examples</strong>: Oberlo-powered Shopify stores, Spocket dropshipping storefronts, Printful storefronts.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Supplier integration</strong> for inventory and order synchronization.</p></li><li><p>Automated order forwarding to suppliers.</p></li><li><p><strong>Branding customization</strong> for products and packaging.</p></li><li><p>Shipping and tracking updates for customers.</p></li></ul></li></ul><div><hr></div><h3><strong>5. 
Subscription-Based Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EnoO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EnoO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 424w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 848w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 1272w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EnoO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png" width="1456" height="734" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:734,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:404907,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EnoO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 424w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 848w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 1272w, https://substackcdn.com/image/fetch/$s_!EnoO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbf1ceb-b1e5-478f-906f-7df73cc103f1_3456x1742.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Storefronts that sell products or services on a recurring basis, emphasizing long-term customer relationships.</p></li><li><p><strong>Examples</strong>: Netflix (digital content), HelloFresh (meal kits), Dollar Shave Club (grooming products).</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Recurring billing</strong> with automated payment retries.</p></li><li><p>Flexible subscription management for <strong>upgrades or cancellations</strong>.</p></li><li><p><strong>Usage tracking and analytics</strong>.</p></li><li><p><strong>Notifications for renewals, cancellations, or offers</strong>.</p></li></ul></li></ul><div><hr></div><h3><strong>6. 
SaaS-Based Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!naUH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!naUH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 424w, https://substackcdn.com/image/fetch/$s_!naUH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 848w, https://substackcdn.com/image/fetch/$s_!naUH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 1272w, https://substackcdn.com/image/fetch/$s_!naUH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!naUH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png" width="1456" height="738" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:738,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:508931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!naUH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 424w, https://substackcdn.com/image/fetch/$s_!naUH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 848w, https://substackcdn.com/image/fetch/$s_!naUH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 1272w, https://substackcdn.com/image/fetch/$s_!naUH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb30f0b5f-8374-4bc1-8cc8-b6e871a18177_3438x1742.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Platforms that provide the tools and infrastructure for businesses to create their own storefronts.</p></li><li><p><strong>Examples</strong>: Shopify, BigCommerce, Wix E-commerce.</p></li><li><p><strong>Features</strong>:</p><ul><li><p>Drag-and-drop storefront design tools.</p></li><li><p><strong>Integration with third-party plugins for shipping, payments, and analytics</strong>.</p></li><li><p>Scalability for high traffic and large product catalogs.</p></li><li><p>Training and customer support for users.</p></li></ul></li></ul><div><hr></div><h3><strong>7. 
Custom-Built Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xaoI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xaoI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 424w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 848w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 1272w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xaoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png" width="1456" height="749" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:749,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1081858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xaoI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 424w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 848w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 1272w, https://substackcdn.com/image/fetch/$s_!xaoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a86b9cb-fff4-4faa-9ddf-8265d84b490b_3426x1762.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Unique, tailored storefronts developed specifically for a brand&#8217;s needs, often requiring significant resources and technical expertise.</p></li><li><p><strong>Examples</strong>: Tesla (custom vehicle configuration), Zara (fashion storefront with unique UI).</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Fully customizable</strong> layouts and features.</p></li><li><p>Integration with advanced technologies like AR/VR for immersive shopping.</p></li><li><p>Full flexibility in setting up the QCP, O2Q, and Q2C processes.</p></li></ul></li></ul><div><hr></div><h3><strong>8. 
Outbound Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ohRz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ohRz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 424w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 848w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 1272w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ohRz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png" width="1440" height="900" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Media item 1 of 5&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Media item 1 of 5" title="Media item 1 of 5" srcset="https://substackcdn.com/image/fetch/$s_!ohRz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 424w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 848w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 1272w, https://substackcdn.com/image/fetch/$s_!ohRz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722c0a33-55e0-40e2-87be-7359260380f8_1440x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Description</strong>: Sales-driven storefronts where software or services are sold directly through sales representatives, often using CRM systems.</p></li><li><p><strong>Examples</strong>: Salesforce-integrated sales for SaaS software, cold outreach strategies by enterprise SaaS teams.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Integration with CRM platforms like Salesforce</strong>.</p></li><li><p><strong>Automated quotes, invoices, and contracts</strong>.</p></li><li><p>Sales enablement tools for product demos and pricing.</p></li><li><p>Follow-up and task reminders for sales teams.</p></li></ul></li></ul><div><hr></div><h3><strong>9. 
Channel/Reseller Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1I1d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1I1d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 424w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 848w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 1272w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1I1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png" width="1456" height="752" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:873973,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1I1d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 424w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 848w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 1272w, https://substackcdn.com/image/fetch/$s_!1I1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac649efc-acd7-4a4b-8199-b65d6947ddb1_3432x1772.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Platforms that allow resellers or partners to distribute and sell products on behalf of a brand or manufacturer.</p></li><li><p><strong>Examples</strong>: Google Cloud Marketplace, AWS Marketplace.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>Partner</strong> onboarding and account management.</p></li><li><p><strong>Revenue sharing automation</strong>.</p></li><li><p>Access to product catalogs with <strong>tiered pricing</strong>.</p></li><li><p>Marketing collateral for reseller promotions.</p></li></ul></li></ul><div><hr></div><h3><strong>10. 
App Stores</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LChr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LChr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 424w, https://substackcdn.com/image/fetch/$s_!LChr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 848w, https://substackcdn.com/image/fetch/$s_!LChr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 1272w, https://substackcdn.com/image/fetch/$s_!LChr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LChr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png" width="1456" height="741" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:741,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2247908,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LChr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 424w, https://substackcdn.com/image/fetch/$s_!LChr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 848w, https://substackcdn.com/image/fetch/$s_!LChr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 1272w, https://substackcdn.com/image/fetch/$s_!LChr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c571f00-aed3-4fce-b63b-a06d490ef16c_3450x1756.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ul><li><p><strong>Description</strong>: Platforms for selling and distributing software applications to end-users, usually tied to an ecosystem.</p></li><li><p><strong>Examples</strong>: Apple App Store, Google Play Store, Microsoft Store.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>App submission</strong> and developer tools.</p></li><li><p>Revenue splitting and payment distribution.</p></li><li><p><strong>User reviews and app ranking</strong> algorithms.</p></li><li><p>Security compliance for apps.</p></li></ul></li></ul><div><hr></div><h3><strong>11. 
Client App Storefronts</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WabR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WabR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 424w, https://substackcdn.com/image/fetch/$s_!WabR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 848w, https://substackcdn.com/image/fetch/$s_!WabR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 1272w, https://substackcdn.com/image/fetch/$s_!WabR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WabR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png" width="700" height="501" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:501,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How 
much is Spotify Premium? - RouteNote Blog&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How much is Spotify Premium? - RouteNote Blog" title="How much is Spotify Premium? - RouteNote Blog" srcset="https://substackcdn.com/image/fetch/$s_!WabR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 424w, https://substackcdn.com/image/fetch/$s_!WabR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 848w, https://substackcdn.com/image/fetch/$s_!WabR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 1272w, https://substackcdn.com/image/fetch/$s_!WabR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676bd78a-ad14-4bc9-b66c-ec110561dd59_700x501.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 
15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Description</strong>: Businesses sell products or subscriptions directly within their mobile or desktop applications, creating a seamless in-app purchasing experience.</p></li><li><p><strong>Examples</strong>: Spotify (subscriptions), Kindle (ebooks), Adobe Creative Cloud.</p></li><li><p><strong>Features</strong>:</p><ul><li><p><strong>In-app purchases and subscriptions</strong>.</p></li><li><p>Secure payment gateways optimized for mobile.</p></li><li><p>Account management for tracking and modifying purchases.</p></li><li><p>Offline access for downloaded content or services.</p></li></ul></li></ul><div><hr></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.dativo.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Sergey Enin's substack! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Technical debt]]></title><description><![CDATA[This article delves into the concept of technical debt within the realm of software development, exploring its origins, implications, and how it can be effectively managed. It provides a comprehensive]]></description><link>https://blog.dativo.io/p/technical-debt</link><guid isPermaLink="false">https://blog.dativo.io/p/technical-debt</guid><dc:creator><![CDATA[Sergey]]></dc:creator><pubDate>Sun, 18 Aug 2024 12:41:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eOgF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>Introduction</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eOgF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eOgF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!eOgF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 848w, https://substackcdn.com/image/fetch/$s_!eOgF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 1272w, https://substackcdn.com/image/fetch/$s_!eOgF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eOgF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png" width="770" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bea64f94-9902-48cd-94cf-3430073f0741_770x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:770,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!eOgF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!eOgF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 848w, https://substackcdn.com/image/fetch/$s_!eOgF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 1272w, https://substackcdn.com/image/fetch/$s_!eOgF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea64f94-9902-48cd-94cf-3430073f0741_770x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We were supposed to release this feature a couple of weeks ago. One developer got stuck on a framework update, another struggled with reorganizing the feature flags, and a third had to dig into an old repository to initiate database changes. As a result, the team is overwhelmed. Every feature release will feel like this until we can spend a few weeks addressing our technical debt. The challenge lies in getting the business to even consider this.</p><p>Here's the effect: the minute we mention "tech debt," everyone gets upset, but no one is listening. Each person assumes they know what we're all talking about, but their interpretations differ significantly. To the business, it sounds like the engineers are asking for three weeks without releasing any features. They remember the last time they granted those weeks: within a month, the team was overwhelmed again. When business leaders are reluctant to grant a "tech debt week" because the last one didn't improve the team's capacity, how can we expect them to agree to another?</p><p>In software development, technical debt is a familiar concept. 
Under tight deadlines, developers often take shortcuts to complete projects quickly. However, over time, these shortcuts accumulate into "debt" that must eventually be "paid back." Some amount of technical debt is virtually inevitable. Leaders and teams must balance business goals with myriad design and implementation decisions. A developer or team that only makes optimal choices will struggle to ship regularly and quickly.</p><p>&#128161;</p><blockquote><p>Technical debt has an intuitive advantage when it comes to definition: the name itself is more self-explanatory than most tech terms. From a business and financial perspective, "technical debt" functions similarly to financial debt. We take on technical debt for reasons akin to taking on financial debt: we need something now (or very soon) that we don&#8217;t have the resources to pay for in full. So we borrow to get what we need [<a href="https://enterprisersproject.com/article/2020/6/technical-debt-explained-plain-english">How to explain technical debt in plain English</a>]. In software, this generally means making coding or design decisions that are suboptimal&#8212;or that we know will need to be addressed and updated in the future&#8212;to get what we want or need into production sooner.</p></blockquote><p>Here, the financial analogy becomes relevant again: we must use debt responsibly. We take on debt understanding that it will need to be repaid. Continuously accruing new debt without ever reducing existing balances will inevitably lead to problems&#8212;or even ruin&#8212;down the road.</p><h2><strong>Defining Technical Debt</strong></h2><p>If you&#8217;ve been in the software industry for any length of time, you&#8217;ve likely heard the term &#8220;technical debt.&#8221; Also known as design debt or code debt, this metaphor is widely used in the technology space. It serves as a catchall term covering everything from bugs to legacy code to missing documentation. 
But what exactly is technical debt, and why do we call it that?</p><p>&#128161;</p><blockquote><p>Technical debt was originally coined by software developer Ward Cunningham, one of the authors of the Agile Manifesto and the inventor of the wiki. He used the metaphor to explain to non-technical stakeholders why resources needed to be budgeted for refactoring. His analogy was simple: with borrowed money, you can do something sooner than you might otherwise, but until you pay back that money, you&#8217;ll be paying interest. Similarly, rushing software out the door can provide short-term benefits, but it creates a "debt" that must be repaid through refactoring.</p></blockquote><p>Years later, Cunningham described how he initially came up with the&nbsp;[<a href="http://wiki.c2.com/?WardExplainsDebtMetaphor">technical debt metaphor</a>]:</p><blockquote><p>&#8220;With borrowed money, you can do something sooner than you might otherwise, but then until you pay back that money you&#8217;ll be paying interest. I thought borrowing money was a good idea, I thought that rushing software out the door to get some experience with it was a good idea, but that of course, you would eventually go back and as you learned things about that software you would repay that loan by refactoring the program to reflect your experience as you acquired it.&#8221;</p></blockquote><p>The metaphor of technical debt is abstract, leading to various interpretations. However, it generally <a href="https://www.sciencedirect.com/science/article/pii/S0950584917305098">refers to the consequences of prioritizing quick delivery over perfect implementation</a>. <strong>Technical debt accumulates when developers take shortcuts&#8212;using outdated libraries, writing poor quality code, or skipping proper testing and documentation&#8212;to meet deadlines.</strong></p><p>Traditional software development follows a phase-based approach: feature development, alpha, beta, and golden master (GM). 
Each phase ideally addresses residual issues from the previous one. In practice, those issues often get deferred, and the backlog of technical debt grows. New bugs then appear as quickly as old ones are fixed, which makes a stable release hard to reach. As unresolved issues pile up, they become more daunting to tackle: development slows, customers grow frustrated, schedules slip, and quality suffers.</p><p>In agile methodologies, the focus on rapid delivery and iterative progress can inadvertently contribute to technical debt. Agile teams often prioritize getting a working product out quickly, which can lead to shortcuts in planning, design, coding, testing, or documentation. This approach allows for quick releases and adaptability, but the small compromises add up over time, creating a significant burden that slows down future development and reduces overall software quality. Managing this debt effectively is crucial to maintaining the long-term health of the project.</p><p>Developers often compromise best practices due to tight deadlines, immediate business needs, and changing requirements. Technical debt arises from these compromises and represents the "build now, fix later" mentality. While it may offer short-term gains, it incurs long-term costs that can hinder future development.</p><p>Why build something knowing it will break down later? Technical debt is like taking a loan to get ahead quickly, with the understanding that it will require more work to fix later. 
It's a necessary trade-off in a fast-paced industry where speed often takes precedence over perfection.</p><h3><strong>Is Tech Debt really Bad?</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1hQa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1hQa!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1hQa!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif" width="480" height="270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!1hQa!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!1hQa!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e0e604c-3c7f-4349-a22d-4ad213e0ac3b_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>&#128161;</p><blockquote><p>Technical debt is neither inherently good nor bad&#8212;it&#8217;s simply a tool. Like financial debt, it has both advantages and disadvantages. The key is to manage it responsibly. In a competitive market, the pressure to develop and ship quickly is immense, especially for startups. This often leads to the inevitable trade-off between taking on technical debt or delaying a launch.</p></blockquote><p>Constantly deferring bugs and creating a backlog is dangerous. As these unresolved issues pile up, they become increasingly daunting, causing delays and reducing software quality. Customers may become frustrated with persistent defects, leading to dissatisfaction and potential loss of business.</p><p>Software developers often find themselves compromising best practices due to tight deadlines and shifting requirements. Technical debt results from these compromises, embodying the "build now, fix later" approach. 
While it may seem like a quick fix, it can have severe long-term consequences.</p><p>Agile teams generally view technical debt as an unavoidable part of the development process. Most software products carry some degree of technical debt, reflecting the reality of rapid development cycles where working software is the primary measure of progress. The consensus is that technical debt is manageable if addressed proactively and strategically.</p><h3><strong>Technical Debt: 3 Definitions in Plain English</strong></h3><p>Part of responsible technical debt management lies in a straightforward understanding of the concept and how it manifests. To that end, here are three more definitions from industry experts [<a href="https://enterprisersproject.com/article/2020/6/technical-debt-explained-plain-english">How to explain technical debt in plain English</a>]:</p><ol><li><p><strong>Justin Stone, Senior Director of Secure DevOps Platforms at Liberty Mutual Insurance:</strong></p><ol><li><p>Technical debt is the result of the design or implementation decisions you make and how those decisions age over time if they aren&#8217;t incrementally adjusted or improved. The longer you hold fast to those designs and implementations without incremental adjustments or improvements, the larger the debt, or effort, becomes to make those needed changes.</p></li></ol></li></ol><ol start="2"><li><p><strong>Christian Nelson, VP of Engineering at Carbon Five:</strong></p><ol><li><p>Technical debt is when the implementation &#8211; the code &#8211; for a product becomes unnecessarily complex, inconsistent, or otherwise difficult to understand. While there is no perfect code, code that contains technical debt [moves] farther [away] from a good solution for the problem it solves. The more debt, the farther the code misses the target. 
Technical debt makes it harder to understand what the code does, which makes it harder to build upon, and ultimately results in poor productivity and defects in the product.</p></li></ol></li></ol><ol start="3"><li><p><strong>Justin Brodley, VP Cloud Operations &amp; Engineering at Ellie Mae and Co-host of The Cloud Pod:</strong></p><ol><li><p>Technical debt is the cost of technical decisions that are made for the immediacy, simplicity, or [budget] that, while easy today, will slow you down or increase your operational costs/risks [over time]. Most often it&#8217;s related to technical products, but can be found in most business processes and use cases. Many times this technical debt can turn into 'human spackle,' where knowledge workers do repetitive tasks that could be automated.</p></li></ol></li></ol><h2><strong>Impact</strong></h2><p>Technical debt is the consequence of choosing quick and cheap technology solutions over robust and efficient ones. This can occur due to time constraints, budget limitations, or lack of awareness about future needs. 
While often discussed in the context of software development, technical debt has significant implications for business operations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4jvK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4jvK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg" width="768" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!4jvK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4jvK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92546611-bcad-446b-b45e-44fe776c3af6_768x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>Key Impacts of Technical Debt</strong></h3><ol><li><p><strong>Future Costs and Challenges </strong>&#128200;</p><ol><li><p><strong>Increased Effort and Resources:</strong> Like financial debt, technical debt incurs a cost that must be repaid in the future. Addressing suboptimal solutions will require additional resources, time, and effort.</p></li></ol></li></ol><ol start="2"><li><p><strong>Impact on Business Operations </strong>&#129320;</p><ol><li><p><strong>Operational Inefficiencies:</strong> Technical debt can lead to inefficiencies, reduced productivity, increased maintenance costs, system failures, and security vulnerabilities. It can also hinder an organization's ability to adapt to new technologies or market changes.</p></li></ol></li></ol><ol start="3"><li><p><strong>User Dissatisfaction and Revenue Loss </strong>&#128148;</p><ol><li><p><strong>Negative User Experience:</strong> Technical debt often appears as bugs, which degrade the user experience. 
This translates to increased expenses for customer service and lower revenues due to customer defections.</p></li></ol></li></ol><ol start="4"><li><p><strong>Longer Development Cycles </strong>&#128075;</p><ol><li><p><strong>Delayed Time to Market:</strong> As technical debt worsens, it becomes harder for developers to work within the existing code base, splitting time between developing new features and correcting old ones. This slows the software development lifecycle and delays time to market.</p></li></ol></li></ol><ol start="5"><li><p><strong>Reduced Productivity and Innovation </strong>&#128555;</p><ol><li><p><strong>Limited Innovation:</strong> Severe technical debt forces developers to service existing issues rather than focusing on building innovative new features.</p></li></ol></li></ol><ol start="6"><li><p><strong>Potential Security Problems </strong>&#129302;</p><ol><li><p><strong>Increased Vulnerabilities:</strong> Technical debt can leave systems open to more vulnerabilities. These can be exploited by threat actors or insider threats, leading to security breaches laden with financial risk, including direct loss of assets, loss of business, and regulatory fines and penalties.</p></li></ol></li></ol><h3><strong>Views on Technical Debt Effects</strong></h3><ol><li><p><strong>Shaun McCormick:</strong></p><ol><li><p>I view technical debt as any code that decreases agility as the project matures. 
Note how I didn&#8217;t say bad code (as that is often subjective) or broken code.</p></li><li><p><a href="https://www.bigeng.io/why-the-way-we-look-at-technical-debt-is-wrong/">McCormick suggests that true technical debt is always intentional and not accidental.</a></p></li></ol></li></ol><ol start="2"><li><p><strong><a href="https://hackernoon.com/the-fallacy-of-technical-debt-202f7406337e">Gaminer's Explanation</a>:</strong></p><ol><li><p>Technical debt happens when you take shortcuts in writing your code to achieve your goal faster, but at the cost of uglier, harder-to-maintain code. It&#8217;s called technical debt because it&#8217;s like taking out a loan. You can accomplish more today than you normally could, but you end up paying a higher cost later.</p></li></ol></li></ol><ol start="3"><li><p><strong>Uncle Bob:</strong></p><ol><li><p>A mess is not technical debt. A mess is just a mess. Technical debt decisions are made based on real project constraints. They are risky, but they can be beneficial. The decision to make a mess is never rational. It&#8217;s always based on laziness and unprofessionalism and has no chance of paying off in the future. A mess is always a loss.</p></li><li><p>Uncle Bob supports McCormick&#8217;s claim that bad code does not qualify as technical debt. By his definition, <a href="https://sites.google.com/site/unclebobconsultingllc/a-mess-is-not-a-technical-debt">taking on technical debt is always intentional and strategic</a>. This supports the idea that not every instance of technical debt falls into the same category. 
Understanding these nuances helps in making informed decisions about when and how to incur and manage technical debt.</p></li></ol></li></ol><h2><strong>Classification of Technical Debt</strong></h2><h3><strong>By intent and context</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cvw4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cvw4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cvw4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg" width="768" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!cvw4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cvw4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafe86b2-b279-473d-9250-6ff01f09ff73_768x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Technical debt can be categorized into three primary types, each with distinct characteristics and implications:</p><ol><li><p><strong>Intentional Technical Debt</strong></p><ol><li><p><strong>Description:</strong> Created deliberately to expedite product delivery to market. This strategic decision is often made with the understanding that the debt will need to be addressed later.</p></li><li><p><strong>Impact:</strong> Can accelerate time-to-market but requires planned efforts for future repayment and refactoring.</p></li></ol></li></ol><ol start="2"><li><p><strong>Unintentional Technical Debt</strong></p><ol><li><p><strong>Description:</strong> Arises from sloppiness, unexpected complexity, or a lack of technical expertise. 
It includes poorly written code, overlooked design flaws, and inadequate documentation.</p></li><li><p><strong>Impact:</strong> Leads to increased maintenance costs, reduced productivity, and potential system failures due to unplanned inefficiencies and errors.</p></li></ol></li></ol><ol start="3"><li><p><strong>Environmental Technical Debt</strong></p><ol><li><p><strong>Description:</strong> Accrued over time due to the natural evolution of the software environment and lack of active management. This includes outdated technologies, deprecated libraries, and unaddressed infrastructure issues.</p></li><li><p><strong>Impact:</strong> Causes performance degradation, security vulnerabilities, and increased downtime, requiring ongoing management and updates to mitigate its effects.</p></li></ol></li></ol><h4><strong>Martin Fowler's Technical Debt Quadrant</strong></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6cj9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6cj9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6cj9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6cj9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!6cj9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6cj9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!6cj9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6cj9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6cj9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!6cj9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F372ad28d-62ef-4816-9a9b-90ddb75d1196_1024x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>To better understand and categorize technical debt, it&#8217;s helpful to use Martin Fowler's "Technical Debt Quadrant," which classifies the type of technical debt based on intent and context:</p><ul><li><p><strong>Prudent and Deliberate:</strong> The team knows it is incurring debt but chooses to prioritize shipping and deal with the consequences later. 
This decision is acceptable if the stakes are small or the payoff for an earlier release is greater than the costs of the technical debt.</p></li></ul><ul><li><p><strong>Reckless and Deliberate:</strong> The team knows about the consequences but still prioritizes speed over quality.</p></li></ul><ul><li><p><strong>Prudent and Inadvertent:</strong> The team learns how the solution should have been implemented after the implementation.</p></li></ul><ul><li><p><strong>Reckless and Inadvertent:</strong> The team lacks experience and blindly implements the solution, not realizing they are creating significant debt.</p></li></ul><p><strong>The left side of this quadrant, which includes reckless debt, should be avoided at all costs to maintain long-term software health and project success.</strong></p><h3><strong>By type</strong></h3><p>In software development and enterprise IT, <strong>technical debt can manifest in different forms</strong>, each with its own causes and consequences. Understanding these types helps in effectively managing and mitigating technical debt.</p><p>In 2014, a group of academics proposed a framework for categorizing technical debt based on its nature. According to their paper, published by the Software Engineering Institute as &#8220;Towards an Ontology of Terms on Technical Debt,&#8221; there are 13 distinct types of technical debt:</p><p><strong>1. Architecture Debt</strong></p><p>Description:</p><ul><li><p>Issues within the software architecture that hinder scalability, performance, or adaptability. Examples include monolithic architectures in need of microservices refactoring or poor modularization.</p></li></ul><p>Impact:</p><ul><li><p>Limits scalability and performance, making it difficult to introduce new features or adapt to new technologies. It can also lead to increased maintenance costs and complexity over time.</p></li></ul><p><strong>2. 
Build Debt</strong></p><p>Description:</p><ul><li><p>Problems related to the build process, such as slow, unreliable, or overly complex build pipelines.</p></li></ul><p>Impact:</p><ul><li><p>Increases the time required to build and deploy software, leading to slower development cycles, delayed releases, and increased frustration among developers.</p></li></ul><p><strong>3. Code Debt</strong></p><p>Description:</p><ul><li><p>Poor coding practices, lack of standardization, inadequate code comments, and outdated or inefficient coding techniques.</p></li></ul><p>Impact:</p><ul><li><p>Hinders code maintenance and scalability, making it harder to implement new features and increasing the likelihood of bugs and system failures.</p></li></ul><p><strong>4. Defect Debt</strong></p><p>Description:</p><ul><li><p>Accumulated unresolved defects or bugs that degrade software quality.</p></li></ul><p>Impact:</p><ul><li><p>Leads to lower user satisfaction, increased maintenance efforts, and potential system downtime. It can also cause a backlog of technical issues that slow down new development.</p></li></ul><p><strong>5. Design Debt</strong></p><p>Description:</p><ul><li><p>Flawed or outdated software design, including overly complex designs, improper use of patterns, and lack of modularity.</p></li></ul><p>Impact:</p><ul><li><p>Impedes scalability and the ability to introduce new features. It can also make the system more fragile and difficult to maintain.</p></li></ul><p><strong>6. Documentation Debt</strong></p><p>Description:</p><ul><li><p>Insufficient or outdated documentation that makes it difficult for team members to understand the system and the rationale behind certain decisions.</p></li></ul><p>Impact:</p><ul><li><p>Reduces efficiency in maintenance and development, leading to longer onboarding times for new team members and increased reliance on tribal knowledge.</p></li></ul><p><strong>7. 
Infrastructure Debt</strong></p><p>Description:</p><ul><li><p>Issues related to the environment in which the software operates, such as outdated servers, inadequate deployment practices, or the absence of disaster recovery plans.</p></li></ul><p>Impact:</p><ul><li><p>Results in performance issues, increased downtime, and higher operational risks. It can also make scaling the infrastructure more difficult and costly.</p></li></ul><p><strong>8. People Debt</strong></p><p>Description:</p><ul><li><p>Issues arising from team dynamics, skill gaps, or lack of training.</p></li></ul><p>Impact:</p><ul><li><p>Leads to suboptimal solutions and lower productivity. It can also cause increased turnover and a lack of cohesion within the development team.</p></li></ul><p><strong>9. Process Debt</strong></p><p>Description:</p><ul><li><p>Inefficient or outdated development processes and methodologies, including poor communication practices, lack of agile methodologies, and insufficient collaboration tools.</p></li></ul><p>Impact:</p><ul><li><p>Slows down development and reduces efficiency, leading to longer release cycles and higher project costs.</p></li></ul><p><strong>10. Requirement Debt</strong></p><p>Description:</p><ul><li><p>Unclear, incomplete, or changing requirements that lead to suboptimal implementation.</p></li></ul><p>Impact:</p><ul><li><p>Causes rework and delays, as developers may need to go back and adjust features to meet evolving requirements. This can also lead to misaligned expectations between stakeholders and the development team.</p></li></ul><p><strong>11. Service Debt</strong></p><p>Description:</p><ul><li><p>Issues related to service versioning, integration, and support for legacy systems.</p></li></ul><p>Impact:</p><ul><li><p>Leads to integration challenges and operational inefficiencies, making it difficult to maintain and upgrade services.</p></li></ul><p><strong>12. 
Test Automation Debt</strong></p><p>Description:</p><ul><li><p>Lack of automated testing, leading to increased manual testing and higher risk of defects.</p></li></ul><p>Impact:</p><ul><li><p>Slows down the development process and increases the likelihood of undetected bugs in production. It also makes regression testing more time-consuming and error-prone.</p></li></ul><p><strong>13. Test Debt</strong></p><p>Description:</p><ul><li><p>Inadequate test coverage and testing practices, including lack of unit tests, integration tests, and performance tests.</p></li></ul><p>Impact:</p><ul><li><p>Increases the risk of defects in production, leading to system failures and customer dissatisfaction. It also makes it harder to ensure the quality and stability of the software over time.</p></li></ul><p>Each type of technical debt presents unique challenges and necessitates specific strategies for management and resolution. Recognizing and addressing these different forms of debt is crucial for maintaining a healthy and sustainable IT ecosystem within organizations.</p><h2><strong>Pay the technical debt</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hxmb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hxmb!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 424w, https://substackcdn.com/image/fetch/$s_!Hxmb!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 848w, 
https://substackcdn.com/image/fetch/$s_!Hxmb!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!Hxmb!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hxmb!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif" width="320" height="250" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:320,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!Hxmb!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 424w, https://substackcdn.com/image/fetch/$s_!Hxmb!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 848w, 
https://substackcdn.com/image/fetch/$s_!Hxmb!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!Hxmb!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58570b8a-849e-43f4-8fd0-432d50719f02_320x250.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>Measuring Technical Debt</strong></h3><p>Properly measuring technical debt helps ensure it doesn't accrue a problematic amount of interest and allows 
for effective management. However, measuring technical debt can be challenging. Several metrics can be used to measure technical debt, including the <a href="https://www.techtarget.com/searchitoperations/tip/How-to-track-and-measure-technical-debt">Technical Debt Ratio</a> (TDR), defect ratios, code quality, completion time, and code reworking or refactoring.</p><h4><strong>Key Metrics for Measuring Technical Debt</strong></h4><h4><strong>Technical Debt Ratio (TDR)</strong></h4><p><strong>Technical Debt Ratio (TDR)</strong> is a metric that estimates the future cost of technical debt relative to the overall development cost of a software system. The formula for TDR is:</p><p>&#128161;</p><blockquote><p>Technical Debt Ratio = (Remediation Cost / Development Cost) &#215; 100</p></blockquote><ul><li><p><strong>Remediation Cost:</strong> The estimated cost to fix the accumulated technical debt in the software system.</p></li></ul><ul><li><p><strong>Development Cost:</strong> The cost incurred in developing the software system.</p></li></ul><p>By multiplying the ratio by 100, we express TDR as a percentage, making it easier to interpret and compare.</p><h4><strong>Interpreting Technical Debt Ratio</strong></h4><ul><li><p><strong>Ideal TDR:</strong> <strong>A commonly cited target is a TDR of 5% or less</strong>. 
At 5%, the cost to fix the technical debt equals 5% of the total development cost, which indicates a manageable level of technical debt.</p></li></ul><ul><li><p><strong>High TDR:</strong> A high TDR indicates a significant amount of technical debt, suggesting that future maintenance and development efforts will be costly and time-consuming.</p></li></ul><ul><li><p><strong>Low TDR:</strong> A low TDR suggests that technical debt is under control, and future development efforts will be less hindered by the need for extensive refactoring or bug fixing.</p></li></ul><h4><strong>Example Calculation</strong></h4><p>Suppose a software project has the following costs:</p><ul><li><p><strong>Remediation Cost:</strong> $50,000 (the estimated cost to fix technical debt)</p></li></ul><ul><li><p><strong>Development Cost:</strong> $1,000,000 (the total cost of developing the software)</p></li></ul><p>Using the TDR formula:</p><p>TDR = (50,000 / 1,000,000) &#215; 100 = 5%</p><p>This means that the cost to fix the technical debt is 5% of the total development cost, which sits at the commonly cited 5% target and is considered manageable.</p><h3><strong>Defect Ratios</strong></h3><p>Compares the number of new defects with the number of defects that have been closed. A ratio that stays high means defects arrive faster than they are fixed, which indicates accumulating technical debt.</p><h3><strong>Code Quality</strong></h3><p>Evaluates the quality of the codebase using metrics such as cyclomatic complexity, code duplication, and adherence to coding standards.</p><h3><strong>Completion Time</strong></h3><ul><li><p><strong>Description:</strong> Tracks the time taken to complete tasks. 
Increased completion times may indicate rising technical debt.</p></li></ul><h3><strong>Code Reworking or Refactoring</strong></h3><ul><li><p><strong>Description:</strong> Measures the frequency and extent of code changes due to reworking or refactoring efforts.</p></li></ul><h3><strong>Capitalizing on Technical Debt</strong></h3><p>Software engineering salaries typically come from <a href="https://swizec.com/blog/the-3-budgets/?utm_source=hackernewsletter&amp;utm_medium=email&amp;utm_term=working">three budgets</a>, influencing day-to-day work and career paths:</p><ol><li><p><strong>Sales/Marketing</strong></p></li></ol><ol start="2"><li><p><strong>Research and Development</strong></p></li></ol><ol start="3"><li><p><strong>Maintenance</strong></p></li></ol><p><strong>Maintenance:</strong> Often focused on cost optimization, this budget includes sysadmins and platform engineers. Maintenance work is seen as a cost to minimize, and it is often undervalued.</p><blockquote><p>One of the responsibilities of software engineering leaders is to ensure that the work their team performs is properly capitalized. Software that increases digital assets should also be added to financial assets, reducing tax liabilities. Maintenance work, however, is considered an expense and cannot be capitalized.</p></blockquote><p>Technical debt can sometimes be packaged into mini-projects that can be capitalized, such as migrating to new infrastructure or significant rewrites leading to performance improvements. For example, resolving issues with an over-reliance on object-oriented programming constructs in a Scala codebase could result in a more maintainable system and improved performance. Identifying and addressing a group of technical-debt tickets can constitute a small project if the backlog is large enough.</p><h4><strong>Tracking Technical Debt</strong></h4><p>To stay ahead of and remain accountable for technical debt, teams need to track it through change management processes. 
One effective way to do this is by creating a technical debt registry.</p><h4><strong>Technical Debt Registry</strong></h4><p>A technical debt registry is a document that lists all existing issues, explains their consequences, suggests resources to fix the problems, and categorizes them by severity. As new problems arise and decisions are made, changes can be logged using a ticket or tracking system and prioritized in the registry.</p><p>Some project management tools include features to improve code quality and manage backlogs, helping teams to track and manage technical debt more effectively.</p><h3><strong>Best Practices to Become Free of Technical Debt</strong></h3><ol><li><p><strong>Track Tech Debt Meticulously </strong>&#128373;&#65039;<strong> </strong>Monitor, track, and prioritize tech debt like any other development challenge. You can't fix what you can't see.</p></li></ol><ol start="2"><li><p><strong>Categorize Tech Debt </strong>&#129534; Distinguish between intentional (good) and unintentional (bad) tech debt to prioritize effectively.</p></li></ol><ol start="3"><li><p><strong>Prioritize Critical Issues </strong>&#10071; Address tech debt causing application failures or security flaws first. 
Less impactful issues can wait.</p></li></ol><ol start="4"><li><p><strong>Integrate Tech Debt in Agile Processes </strong>&#129501;<strong> </strong>Ensure paying off tech debt is part of daily development, not indefinitely postponed.</p></li></ol><ol start="5"><li><p><strong>Set and Adhere to Quality Standards </strong>&#128105;&#8205;&#128187;<strong> </strong>Prevent accidental tech debt by discouraging sloppy coding practices.</p></li></ol><ol start="6"><li><p><strong>Reward Maintenance Work </strong>&#127959;&#65039;<strong> </strong>Recognize efforts in maintaining and improving existing software, not just new developments.</p></li></ol><ol start="7"><li><p><strong>Avoid Sudden Schedule Changes </strong>&#128197;<strong> </strong>Provide realistic schedules to prevent tech debt from accruing due to last-minute changes.</p></li></ol><h4><strong>Taming Your Team's Debt</strong></h4><p>Working with legacy code often means inheriting technical debt. Here are steps to manage it:</p><ol><li><p><strong>Define Technical Debt:</strong></p><ol><li><p>Technical debt is the gap between what was promised and what was delivered, including shortcuts taken to meet deadlines.</p></li><li><p>Clear communication between development and product management is essential to distinguish between tech debt, architectural changes, and new features.</p></li></ol></li></ol><ol start="2"><li><p><strong>Prioritize in Sprint Planning:</strong></p><ol><li><p>Include tech debt in sprint planning, not in a separate backlog.</p></li></ol></li></ol><ol start="3"><li><p><strong>Maintain a Strict Definition of Done:</strong></p><ol><li><p>Avoid adding separate testing tasks to user stories. Ensure testing is part of the original story to prevent tech debt.</p></li></ol></li></ol><ol start="4"><li><p><strong>Automate Testing:</strong></p><ol><li><p>When a bug is found, create an automated test for it. Fix the bug and rerun the test to ensure it is resolved. 
This is key to maintaining quality.</p></li></ol></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9a7m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9a7m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 424w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 848w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 1272w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9a7m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png" width="768" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!9a7m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 424w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 848w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 1272w, https://substackcdn.com/image/fetch/$s_!9a7m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49c28655-9ebb-4ce7-a862-0ff4f3800d2b_768x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4><strong>Action Items for Managing Technical Debt</strong></h4><ol><li><p><strong>Educate Product Owners:</strong> Highlight the true cost of technical debt to ensure accurate story point values for future resolutions.</p></li></ol><ol start="2"><li><p><strong>Modularize Architecture:</strong> Adopt modular design and strictly manage tech debt in new components. As agility in new components becomes evident, extend these practices.</p></li></ol><ol start="3"><li><p><strong>Write Automated Tests:</strong> Use automated tests and continuous integration to prevent bugs. 
When new bugs are found, create tests to catch them early in the future.</p></li></ol><p>By following these best practices, teams can manage technical debt effectively, ensuring sustainable development and maintaining software quality.</p><h2><strong>Boeing's Technical Debt Case Study: The Cost and Evolution</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I_8G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I_8G!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 424w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 848w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 1272w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I_8G!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif" width="480" height="288" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:288,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;notion image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="notion image" title="notion image" srcset="https://substackcdn.com/image/fetch/$s_!I_8G!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 424w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 848w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 1272w, https://substackcdn.com/image/fetch/$s_!I_8G!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84aa106f-09c0-479a-9519-56acd9264aed_480x288.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>The Genesis of Technical Debt</strong></h3><p>Boeing's technical debt began forming between 2015 and 2018, driven by a combination of design flaws, production challenges, and regulatory oversights. This period marked a critical juncture for the company as it faced mounting pressure to stay ahead of its competitors, particularly Airbus.</p><h4><strong>Accelerated Production Goals</strong></h4><p>In response to fierce competition from Airbus, Boeing expedited the development and production of the 737 MAX. This urgency strained the supply chain and led to numerous quality issues. Internal reports revealed chaotic production conditions, with out-of-sequence work and parts shortages forcing makeshift solutions like using concrete blocks on engine pylons to prevent tipping due to missing engines. 
This hasty approach to production compromised the integrity of the final product, leading to severe consequences later on (<a href="https://www.politico.com/news/magazine/2024/02/26/former-boeing-employee-speaks-out-00142948">Politico</a>) (<a href="https://www.faa.gov/newsroom/updates-boeing-737-9-max-aircraft">FAA</a>).</p><h4><strong>MCAS System Flaws</strong></h4><p>The Maneuvering Characteristics Augmentation System (MCAS) was designed to make the 737 MAX handle like previous models. However, inadequate testing and insufficient pilot training turned MCAS into a critical factor in two fatal crashes. Boeing initially did not disclose MCAS to pilots, which exacerbated the safety risks. This lack of transparency created a dangerous scenario where pilots were unprepared to handle the system's unexpected behavior (<a href="https://link.springer.com/article/10.1007/s11948-020-00252-y">SpringerLink</a>) (<a href="https://democrats-transportation.house.gov/committee-activity/boeing-737-max-investigation">HouseTransCom</a>).</p><h4><strong>Management and Regulatory Oversights</strong></h4><p>Boeing&#8217;s management prioritized speed over safety, a decision that proved costly in the long run. The minimal FAA oversight during manufacturing allowed quality control issues to persist unchecked, further contributing to the technical debt. 
This lack of stringent regulatory supervision created an environment where safety protocols were often bypassed in favor of meeting production deadlines (<a href="https://link.springer.com/article/10.1007/s11948-020-00252-y">SpringerLink</a>) (<a href="https://www.wemu.org/npr-national-news/2024-01-12/the-faa-is-tightening-oversight-of-boeing-and-will-audit-production-of-the-737-max-9">WEMU</a>).</p><h3><strong>Consequences</strong></h3><p>The accumulation of technical debt had severe and far-reaching impacts on Boeing, affecting both its financial health and its reputation.</p><h4><strong>Alaska Airlines Incident</strong></h4><p>In early 2024, an Alaska Airlines flight experienced a midair blowout of a cabin panel door plug on a nearly new 737 MAX 9, forcing an emergency landing and highlighting ongoing quality control issues. This incident underscored the persistent problems within Boeing&#8217;s production processes and raised further questions about the reliability of the 737 MAX aircraft (<a href="https://www.marketscreener.com/quote/stock/BOEING-4816/news/Boeing-taps-debt-market-to-raise-10-billion-sources-46567255/">MarketScreener</a>) (<a href="https://www.houstonpublicmedia.org/npr/2024/01/06/1223296736/faa-orders-grounding-of-certain-boeing-737-max-9-planes-after-alaska-airlines-incident/">Houston Public Media</a>).</p><h4><strong>Reputational Damage</strong></h4><p>The 737 MAX crashes and subsequent investigations severely damaged Boeing&#8217;s reputation. Whistleblowers and employees highlighted systemic issues within the company&#8217;s culture and practices, exacerbating the trust deficit. 
This reputational damage has long-term implications, affecting customer trust and investor confidence (<a href="https://www.politico.com/news/magazine/2024/02/26/former-boeing-employee-speaks-out-00142948">Politico</a>) (<a href="https://link.springer.com/article/10.1007/s11948-020-00252-y">SpringerLink</a>).</p><h4><strong>Financial Impact</strong></h4><p>Boeing reported a GAAP net loss of $355 million in Q1 2024 and a negative free cash flow of $3.9 billion. To manage liquidity and upcoming debt maturities, Boeing raised $10 billion from the debt market. The total costs for technical debt and related issues are estimated to be around $20 billion, including settlements, fines, and additional safety measures. These financial strains have forced the company to rethink its strategies and focus on long-term sustainability (<a href="https://www.fitchratings.com/research/corporate-finance/fitch-revises-boeing-outlook-to-stable-15-03-2024#:~:text=URL%3A%20https%3A%2F%2Fwww.fitchratings.com%2Fresearch%2Fcorporate">Fitch Ratings</a>) (<a href="https://boeing.mediaroom.com/2024-04-24-Boeing-Reports-First-Quarter-Results">MediaRoom</a>) (<a href="https://www.marketscreener.com/quote/stock/BOEING-4816/">MarketScreener</a>).</p><h3><strong>Mitigation Efforts</strong></h3><p><a href="https://www.newsweek.com/boeing-whistleblower-richard-cuevas-says-problems-are-tip-iceberg-1920307">Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'</a></p><p><a href="https://www.newsweek.com/boeing-whistleblower-richard-cuevas-says-problems-are-tip-iceberg-1920307">A Boeing whistleblower raised concerns with Newsweek about the company's safety issues and workplace culture.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!VBzE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VBzE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VBzE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg" width="768" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Boeing Whistleblower Issues New Warning: 'Tip of the 
Iceberg'&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'" title="Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'" srcset="https://substackcdn.com/image/fetch/$s_!VBzE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VBzE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b780ef7-b0d1-4efc-9ffd-5b36ecef6db7_768x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><a href="https://www.newsweek.com/boeing-whistleblower-richard-cuevas-says-problems-are-tip-iceberg-1920307">https://www.newsweek.com/boeing-whistleblower-richard-cuevas-says-problems-are-tip-iceberg-1920307</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QZ0P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QZ0P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QZ0P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!QZ0P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QZ0P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QZ0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg" width="1011" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1011,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'" title="Boeing Whistleblower Issues New Warning: 'Tip of the Iceberg'" srcset="https://substackcdn.com/image/fetch/$s_!QZ0P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!QZ0P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QZ0P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QZ0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f4d1c4-6c3a-47c7-82e4-0f029ae73f71_1011x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In response to the crisis, Boeing implemented several strategies to address its technical debt and restore confidence in its operations.</p><h4><strong>Production Adjustments</strong></h4><p>Boeing slowed production rates to ensure higher quality and safety standards. This step was crucial in addressing out-of-sequence work and quality control failures that had plagued the production lines. By taking a more measured approach, Boeing aimed to produce safer and more reliable aircraft (<a href="https://boeing.mediaroom.com/2024-04-24-Boeing-Reports-First-Quarter-Results">MediaRoom</a>) (<a href="https://democrats-transportation.house.gov/committee-activity/boeing-737-max-investigation">HouseTransCom</a>).</p><h4><strong>Debt Financing</strong></h4><p>To stabilize its financial situation, Boeing raised $10 billion through bond markets. This move enhanced liquidity and helped the company manage its debt maturities more effectively. By securing additional funds, Boeing was able to invest in critical areas that required immediate attention (<a href="https://www.marketscreener.com/quote/stock/BOEING-4816/news/Boeing-taps-debt-market-to-raise-10-billion-sources-46567255/">MarketScreener</a>) (<a href="https://www.marketscreener.com/quote/stock/BOEING-4816/">MarketScreener</a>).</p><h4><strong>Quality Management Improvements</strong></h4><p>Boeing established a new safety board and appointed a chief safety officer to enhance its quality management systems. These changes were made in response to FAA feedback and internal audits, aiming to create a more robust framework for monitoring and improving safety standards. 
The new safety board is tasked with overseeing all aspects of production and ensuring compliance with regulatory requirements (<a href="https://www.politico.com/news/magazine/2024/02/26/former-boeing-employee-speaks-out-00142948">Politico</a>) (<a href="https://www.faa.gov/newsroom/updates-boeing-737-9-max-aircraft">FAA</a>).</p><h4><strong>Regulatory Compliance and Training</strong></h4><p>Boeing made significant revisions to MCAS: it now uses inputs from both angle of attack (AOA) sensors, its activation is limited, and pilots can manually override it. Additionally, Boeing mandated pilot simulator training for the redesigned system. These measures were critical in addressing the flaws that had led to previous accidents and ensuring that pilots were adequately prepared to handle the aircraft (<a href="https://link.springer.com/article/10.1007/s11948-020-00252-y">SpringerLink</a>) (<a href="https://democrats-transportation.house.gov/committee-activity/boeing-737-max-investigation">HouseTransCom</a>).</p><h4><strong>Cultural Shift and Employee Engagement</strong></h4><p>To address the underlying cultural issues, Boeing has initiated programs aimed at fostering a culture of safety and transparency within the organization. It has also launched employee engagement initiatives to encourage open communication and empower workers to report safety concerns without fear of retaliation. These efforts are intended to rebuild trust within the company and create a more collaborative work environment.</p><h4><strong>Long-Term Strategic Planning</strong></h4><p>Boeing is also focusing on long-term strategic planning to ensure sustainable growth and prevent the recurrence of similar issues. This includes investing in new technologies, improving supply chain management, and exploring new markets. 
By taking a holistic approach, Boeing aims to strengthen its position in the industry and build a more resilient organization.</p><p>These steps represent Boeing's concerted efforts to rectify the systemic issues that led to its technical debt. By addressing both the immediate and long-term challenges, Boeing aims to ensure a safer, more reliable, and financially stable future for its operations.</p><h2><strong>Conclusion</strong></h2><p>Technical debt, like financial debt, is a normal part of software development that needs careful handling. While it allows for quick progress, it can lead to long-term problems like lower software quality and slower development. The goal is to balance fast delivery with maintaining good, clean systems. To manage technical debt effectively, teams should keep track of it, prioritize fixing the most important issues, include debt resolution in their regular workflow, and understand its impact. By following good practices and promoting a culture of quality and responsibility, teams can reduce the risks of technical debt and ensure ongoing, high-quality software development.</p>]]></content:encoded></item></channel></rss>