Skip to content

Allocation Case Study Full

The Missing Infrastructure for Enterprise AI Governance

AI skeptics say it produces ungoverned code. AI believers say it's transformative. They're both right. Here's the architecture — and the proof.


If you follow enterprise AI, you know the debate.

The believers: AI is transformative. It understands intent, generates systems from natural language, collapses months of development into days. The productivity leap is real and irreversible.

The skeptics: AI-generated code can't be governed. Business logic is path-dependent. Dependencies are implicit. Constraints get enforced in some paths and silently missed in others. You can't audit what you can't trace.

Both camps are right.

And remarkably, AI itself agrees. When we asked GitHub Copilot to generate business logic from natural language requirements, it produced 220 lines of procedural code — with critical bugs. When we prompted it about edge cases, it found the bugs. Then, unprompted, it wrote a comprehensive analysis explaining why procedural code for business logic cannot be correct.

"AI alone generates broken code. AI + Declarative Rules generates working systems."

That's not our claim. That's Copilot's conclusion about its own output.

This is the story of what "AI + Declarative Rules" looks like in practice — told through a real system, with real code, and a result that makes the debate obsolete.


Why the Skeptics Are Precisely Right

Before the story, the diagnosis — because most AI governance discussions get this wrong.

The problem isn't code quality. It's structural. AI generates code by pattern matching — inferring what comes next from what it has seen. For most programming tasks, this works remarkably well. But enterprise business logic has a specific property that pattern matching cannot handle: transitive dependency chains.

Here's a concrete example. Five simple business rules:

  1. Copy unit_price from Product to Item
  2. Item amount = quantity × unit_price
  3. Order total = sum of Item amounts
  4. Customer balance = sum of unshipped Order totals
  5. Customer balance ≤ credit_limit

Copilot generated 220 lines of procedural code for these five rules. When we asked "what if the order's customer_id changes?" — Copilot found a critical bug. When we asked "what if the item's product_id changes?" — another critical bug.

Both bugs follow the same pattern: foreign key changes require updating both old and new parents. Procedural code handles one direction and silently misses the other. The system doesn't crash. Tests pass. The numbers are quietly wrong until an audit finds them.

Then Copilot diagnosed why:

Dependency graphs have exponential combinations of change paths. AI generates code sequentially — functions for each operation — but cannot enumerate all change paths, cannot prove completeness, and cannot handle transitive dependencies. Even asking "what if X changes?" only finds bugs for that specific X. We'd need to ask about every attribute in every table.

This isn't a capability gap that better models will close. It's a paradigm mismatch. Dependency graphs require deterministic computation, not inference. As Copilot concluded: "This is not about code quality or AI capability. Procedural code for dependency graphs becomes prohibitively complex because it must handle all change paths with explicit code and prove that all paths are covered — a difficult verification problem."

The skeptics are precisely right. And "be more careful with prompts" doesn't solve it. The architecture is the problem.


A Three-Chapter Story

Chapter 1: Procedural Code — Four Years, Never Shipped

A US state agency needed to allocate charges across departments and GL accounts.

The logic is straightforward to describe: when a charge arrives against a project, cascade it through two levels — split across departments by funding percentages, then split each department's share across GL accounts by charge definition percentages. Both levels must total exactly 100%. A charge may only post if the project's funding definition is active. All totals must roll up correctly to project and GL account balances.

Two developers spent four years trying to build this in procedural code.

They never shipped it.

Not because they were slow. Because enforcing these rules correctly across every code path — every API, every integration, every edge case, while maintaining consistency as requirements evolved — is intractable in procedural code at scale. The dependency chains are real. Missing one path breaks the books silently.

Four years. Two developers. No working system. The skeptics were right.


Chapter 2: Declarative Rules — One Weekend

Here's what changed everything — and note carefully: this happened before AI.

A developer with expertise in declarative rules rebuilt the same system over a weekend.

The paradigm shift: instead of writing how to enforce each constraint across every code path, you declare what must be true — and the engine handles the rest:

# 100% validation — maintained automatically, on every path, always
Rule.sum(derive=models.DeptChargeDefinition.total_percent,
         as_sum_of=models.DeptChargeDefinitionLine.percent)
Rule.formula(derive=models.DeptChargeDefinition.is_active,
             as_expression=lambda row: 1 if row.total_percent == 100 else 0)

# Roll-ups — automatically correct across all sources, all paths
Rule.sum(derive=models.Project.total_charges, as_sum_of=models.Charge.amount)
Rule.sum(derive=models.GlAccount.total_allocated,
         as_sum_of=models.ChargeGlAllocation.amount)

# Two-level cascade — declared once, fires on every charge insert, every path
Allocate(provider=models.Charge,
         recipients=funding_lines_for_charge,
         creating_allocation=models.ChargeDeptAllocation,
         while_calling_allocator=allocate_charge_to_dept)

Allocate(provider=models.ChargeDeptAllocation,
         recipients=charge_def_lines_for_dept_allocation,
         creating_allocation=models.ChargeGlAllocation,
         while_calling_allocator=allocate_dept_to_gl)

These rules fire at commit time — for every path, automatically, without exception. The engine handles what Copilot said procedural code cannot: it builds the complete dependency graph at startup, listens to every attribute change, fires only the affected rules in dependency order, handles old and new parents automatically on FK changes. There is no path that can bypass it. There is no change it can miss.

Think of a spreadsheet. You don't write code to handle "what if cell B5 changes." You declare that B10 = SUM(B1:B9) and the spreadsheet handles dependencies automatically. That's what the rules engine does for multi-table business logic, at commit time, across every path.

44x less code. What defeated two developers over four years — what Copilot confirmed cannot be made correct in procedural form — becomes a handful of declared invariants. All correct. All auditable. All automatically enforced.

The system that never shipped in four years: done in a weekend.

Rules are the governance infrastructure that enterprise AI is missing. They've existed for decades. The barrier was the learning curve — thinking declaratively is a different paradigm.


Chapter 3: AI + Rules + CE — Five Minutes From a Business Prompt

Now AI enters. And this is where the believers are completely right.

Here is what it takes to build the same allocation system with GenAI-Logic today:

Departments own a series of General Ledger Accounts. Departments also own Department Charge Definitions — each defines what percent of an allocated cost flows to each of the Department's GL Accounts. An active Department Charge Definition must cover exactly 100% (derived: total_percent = sum of lines; is_active = 1 when total_percent == 100).

Project Funding Definitions define which Departments fund a designated percent of a Project's costs. An active Project Funding Definition must cover exactly 100%.

When a Charge is received against a Project, cascade-allocate it in two levels: Level 1 — allocate the Charge amount to each Department per their Project Funding Line percent Level 2 — allocate each department amount to that Department's GL Accounts per their Charge Definition line percents

Constraint: a Charge may only be posted if the Project's Project Funding Definition is active.

Charges can be placed by contractors who may supply only a minimal project description — use AI Rules to find an Active Project based on a fuzzy match to project name and past charges from the contractor.

A business prompt. No rules syntax. No declarative training. Written the way a business analyst writes a specification.

Five minutes. Working system — database, API, admin UI, two-level cascade, 100% validation, automated roll-ups, AI fuzzy project matching, full audit trail — governed from the first commit.

The progression: - Procedural code: four years, two developers, never shipped - Declarative rules: one weekend, requires deep expertise
- AI + Rules + CE: five minutes, business prompt, no rules training required


What Makes the Difference: Context Engineering

The same business prompt given to a native AI tool produces procedural code. The cascade is nested loops. The constraint lives in one endpoint. Dependencies are implicit. FrankenCode — working in the demo, broken in production.

The difference is Context Engineering — a 9,000-line architectural knowledge layer that lives in the repository alongside the code. CE teaches AI the rules DSL: what rule types exist, how they interact, how to structure cascades, aggregations, and constraints. Without CE, AI pattern-matches to what it knows: procedural code. With CE, AI translates business intent into declared invariants on data.

Procedural intent in. Commit-enforced, path-independent governance out.

This is what closes the learning curve that kept rules inaccessible for decades. You no longer need to think declaratively. You write the business requirement. CE handles the translation.


And AI Has a Runtime Role Too

Contractors submitting charges provide only rough project descriptions. Matching "roads repair, north district" to the correct active project requires probabilistic judgment — exactly what AI is brilliant at.

# AI runs FIRST — identifies the project probabilistically
Rule.early_row_event(on_class=models.Charge,
                     calling=identify_project_for_charge)

# Deterministic rules take over — constraint enforced, cascade fired, roll-ups maintained
Rule.early_row_event(on_class=models.Charge, calling=check_active_funding)
Allocate(provider=models.Charge, ...)

AI proposes. Rules govern what commits. The agent reasons freely; the commit gate enforces invariants. This is how you deploy AI safely: let it do what it's brilliant at — probabilistic reasoning, fuzzy inference, natural language — while the rules engine handles what it structurally cannot: transitive dependencies, completeness, commit-time enforcement.


The Missing Infrastructure

The governance gap in enterprise AI isn't a research problem. It isn't waiting for better models. Copilot told us so itself. It's an infrastructure problem — and infrastructure problems have infrastructure solutions.

GenAI-Logic provides three elements that each require the others:

Logic Automation — the rules engine, hooked into the ORM commit event, inside the transaction. Every path — APIs, agents, UIs, batch processes, future integrations — converges on one control point. Dependency graph computed at startup. Old and new parents handled automatically. Nothing bypasses it by architecture, not discipline. The audit trail is structural: requirement → rule → execution log.

Generative AI — translating business intent into governed rules. Five minutes from a business prompt. AI continues as development partner throughout: generating test suites from the rules, explaining any rule or transaction in plain language, supporting incremental changes in your IDE.

Context Engineering — ensuring AI generates rules, not FrankenCode. The bridge between business intent and what the rules engine needs. Without it, the same AI, the same prompt, produces procedural code that fails exactly as Copilot demonstrated.

Remove any one: the system degrades. Rules without AI requires deep declarative expertise — the weekend, not five minutes. AI without rules produces FrankenCode — the demo, not the governed system. Both without CE means AI generates procedural code even when rules are the intent.

You can't govern paths. You can govern the commit.


What This Means Now

Every enterprise deploying AI at scale is discovering the same gap. Agents that bypass business constraints. Calculations that drift. Audit trails that don't exist. Boards asking questions nobody can answer cleanly.

This isn't a future problem. It's happening now, in production systems, everywhere AI is being applied to real business logic.

The infrastructure to solve it exists today — open source, proven on real enterprise systems. The allocation system that two developers spent four years on and never shipped is now a five-minute business prompt. Governed from the first commit. Auditable by design. Correct on every path, including paths that don't exist yet.

AI alone generates broken code. AI + Declarative Rules generates working systems.

Copilot said it. The architecture proves it.


The reference implementation is open source: github.com/ApiLogicServer/charge-allocation-reference. The Copilot self-diagnosis: declarative-vs-procedural-comparison.md. GenAI-Logic combines Logic Automation, Generative AI, and Context Engineering to produce governed enterprise systems from natural language. Learn more at genai-logic.com.