Champaign Magazine

champaignmagazine.com


Ceiling, Floor, and Slope: A Falsifiable Dynamical Model of Synchronization for Gradual AGI

By W.H.L. and Claude (Sonnet 5)

Gradual AGI series #4 — Champaign Magazine

Abstract

Prior work in this series proposed Ceiling, Floor, and Slope as vocabulary for the rate at which societies assimilate advancing AI capability, reframing “Gradual AGI” as a civilizational assimilation rate rather than a single capability threshold. That vocabulary was descriptive: it named a gap without deriving its dynamics. This paper develops a minimal discrete-time dynamical specification of the same problem — a recurrence over the Ceiling-Floor gap with derived fixed points and stability conditions, a threshold-gated coupling between capability and adoption, a synchronization rate that decays as a function of the remaining gap rather than of elapsed time, and a two-tier Floor distinguishing continuous practice from discrete institutional gates subject to both AI-compressible and structurally fixed delay. Under threshold-heterogeneous adoption, the model recovers Rogers’ diffusion-of-innovations curve as a special case, situating the framework within a longer tradition — Ogburn’s cultural lag, David’s account of general-purpose-technology adoption lag, Perez’s installation/deployment periodization — rather than beside it. A loss function is derived directly from the state variable rather than assumed as a weighted sum, and six dynamical regimes, including an overtrust regime absent from the original vocabulary, are shown to follow from the model rather than being independently posited. The framework is tested against four concurrently-tracked domains — professional Go, AI-assisted drug development, frontier language model releases, and autonomous vehicle deployment — using real longitudinal adoption data rather than illustrative examples. 
Findings are reported symmetrically: confirmed threshold-gating and heterogeneity effects across all four cases, alongside two unresolved results — no domain examined produces a confirmed instance of the model’s overtrust regime, and whether the synchronization rate approaches zero or a permanent nonzero residual remains empirically open throughout.

1. Introduction

Gradual AGI reframes the question of transformative AI not as when a threshold is crossed, but as how fast a civilization assimilates capability once it has been. The Ceiling/Floor/Slope vocabulary introduced in prior installments of this series captures that reframing descriptively: frontier capability (Ceiling) advances largely unconstrained by institutional pace, while realized practice — professional, regulatory, everyday (Floor) — lags behind at a rate (Slope) set not by the technology but by the society absorbing it. The central claim is that the rate of civilizational assimilation, not the timing of any single capability milestone, is the more consequential and more measurable quantity. The object of study, stated precisely, is synchronization dynamics between any advancing capability and the institutions absorbing it; AI is this paper’s motivating domain and, as §5 shows, currently its empirically richest one, but nothing in Sections 2–4 is specific to artificial intelligence.

This is not a new problem, and AI is not its first occasion. The gap between what a technology can do and what a society has absorbed has been a recurring subject across sociology, economic history, and the economics of innovation for over a century. Ogburn (1922) gave it its first systematic name — cultural lag — arguing that a society’s material culture (tools, techniques, technologies) changes faster than its adaptive, non-material culture (laws, norms, institutions), and that the interval between the two is itself a legitimate object of study rather than a mere transitional footnote. Rogers (1962) gave the absorption side of that gap its most influential empirical shape, the S-curve of diffusion across a heterogeneous population of adopters. Economic historians studying general-purpose technologies found the same structure in harder numbers: David (1990) showed that the productivity gains from electrification took roughly three decades to materialize, not because factories were slow to install dynamos, but because realizing their value required reorganizing plant layout around distributed motors rather than centralized steam shafts — the technology’s ceiling had moved; the floor had not yet reorganized around it. Bresnahan and Trajtenberg’s (1995) treatment of general-purpose technologies formalized why this lag is not incidental: a technology whose value depends on complementary reinvention by its users will, in a decentralized economy, be adopted too little and too late relative to its technical potential. Perez (2002) found a version of the same two-phase structure recurring across five historical technological revolutions at the scale of entire economies — an Installation Period in which a new technology and its financial capital outrun the institutions meant to absorb them, a Turning Point, and a Deployment Period in which the surrounding economic and social fabric is reorganized around it.

Read together, this literature establishes that a capability-institution gap is a general feature of technological change, not a distinctive property of AI — and that the more durable contributions in this tradition are the ones that gave the gap a measurable structure (David’s three-decade lag; Perez’s periodization; Rogers’ adoption curve) rather than resting on the observation that a gap exists. Section 2 shows that Rogers’ diffusion curve is in fact recoverable as a special case of the dynamical model developed here — specifically, the case of a population with heterogeneous adoption thresholds absorbing a Ceiling that has locally stopped moving. Where this paper’s model departs from Rogers is precisely where the Ceiling does not hold still: repeated re-acceleration of adoption at each new capability threshold is a prediction this framework makes and that a fixed-target diffusion model has no mechanism to produce.

This literature is historical and sociological in character; a separate, more formal tradition addresses adoption dynamics directly, and it is worth being explicit about where this paper sits relative to it rather than presenting the recurrence in Section 2 as sui generis. Bass diffusion models and epidemic, SIR-style adoption models share this paper’s interest in population-level adoption curves but generally treat the underlying innovation or contagion source as fixed rather than advancing. System dynamics and adaptive control offer richer feedback and multi-variable machinery than the minimal two-variable recurrence developed here, at the cost of the tractability that makes falsification against real, concurrent data feasible. Network contagion models add a topology this paper’s population-level aggregation deliberately abstracts away. What distinguishes the present model is not superiority to these traditions but scope: a moving Ceiling coupled to threshold-gated, heterogeneous adoption, developed to be minimal enough to calibrate against real, concurrent, cross-domain data rather than a single retrospective case.

This paper is the latest installment of the Gradual AGI series, which has argued in prior work that AGI is better understood as a recursive, unevenly distributed resource than as a singular arrival event, and which introduced Ceiling, Floor, and Slope as this problem’s vocabulary for the specific case of artificial intelligence — most recently in Gradual AGI as Synchronization for Transformative Adoption (v1.2). That groundwork is presupposed here rather than re-argued.

What the vocabulary in v1.2 had not yet done was earn its place in the tradition described above. The most durable results in that literature are domain-specific and retrospective: David’s three-decade estimate is a property of one historical technology; Perez’s periodization is read off completed cycles, not predicted in advance of them. Ogburn’s own framework, for all its foundational value, remained a descriptive taxonomy rather than a generative one: it names the lag without producing testable claims about its rate, its stability, or the conditions under which it widens rather than closes. v1.2’s Ceiling/Floor/Slope vocabulary risked the same fate. An editorial review preceding its publication noted directly that the paper’s notation was descriptive rather than generative, and that nothing yet followed from C, F, S, and L(H,M) as stated. This paper treats that as the specific, tractable version of a general problem this literature has faced repeatedly: whether a real-time, cross-domain, falsifiable dynamics of technology-society synchronization is achievable at all, or whether the field’s best available accounts are necessarily retrospective. Sections 2–4 attempt the former. Section 5 tests it, concurrently, across four live domains rather than one completed historical case, which is itself a departure from how this literature has typically proceeded.

If this paper has a single central claim, it is that synchronization is threshold-gated: coupling between capability and adoption switches on only once a credible threshold is crossed, not gradually as capability merely increases (§2.2). Ceiling, Floor, and Slope are the vocabulary; threshold-gating is the mechanism that makes the vocabulary generate testable behavior. Building from that claim, the paper’s contribution is fourfold. First, a minimal dynamical specification (§2) in which the gap’s fixed point, stability regimes, threshold-gated coupling, and a non-constant, gap-dependent synchronization rate are derived consequences of the recurrence rather than independent assumptions. Second, a loss function (§3) defined as a monotonic transform of the gap itself, with the four qualitative drivers proposed in earlier drafts retained only as a diagnostic taxonomy for interpreting the rate parameter, not as an uncalibrated weighted sum. Third, a regime classification (§4), including an overtrust regime — adoption outrunning demonstrated capability — that the descriptive version of the framework had no mechanism to express. Fourth, four fully calibrated real-domain instantiations (§5): Go, a closed case with a saturated Ceiling and decade-long adoption data; AlphaFold-driven drug development, an open case with a non-saturating Ceiling and an explicit two-tier institutional floor; frontier model releases, the richest case for heterogeneous and multi-actor gating; and Tesla FSD, a jurisdiction-gated case. Negative and unresolved findings are reported as substantive results rather than omitted: no case examined produces a confirmed instance of the overtrust regime the model predicts should exist, and whether the synchronization rate decays to zero or to a permanent nonzero residual remains empirically open in every domain tested. Section 6 treats both as the paper’s principal open questions rather than as loose ends.

This paper is deliberately scoped to synchronization dynamics — how the system evolves under the assumptions of §2 — not to optimization or control: how a trajectory among those dynamics should be deliberately steered. The distinction matters and is maintained throughout: §3.4 treats minimization as descriptive, a property the recurrence already has, not a policy lever any actor pulls. A planned companion paper, provisionally titled Gradual AGI as Optimization, will take up the normative question — which trajectory a society should intentionally pursue, under multiple stakeholder objectives and a genuinely non-convex landscape — building on the dynamics this paper establishes rather than anticipating its answers. Positioned against the broader Gradual AGI research program, this paper occupies the Dynamics layer of a progression running from Ontology (AGI as abundant, recursive resources, established in prior installments) through Dynamics (the present model) to Control and Governance (subsequent work); it does not anticipate results from those later stages.

The remainder of the paper proceeds as follows. Section 2 develops the dynamical specification. Section 3 derives the loss function. Section 4 classifies dynamical regimes. Section 5 presents the four domain instantiations and a cross-case synthesis. Section 6 discusses limitations and open questions, including the gate-oscillation phenomenon identified in Section 5 that the current regime classification does not yet cover.

2. A Minimal Dynamical Specification

The model developed in this section rests on three primitive components. The Ceiling (C) is the frontier capability state of an evolving technological system. The Floor (F) is the realized societal state reflecting practical assimilation of that capability. Synchronization is the dynamical process governing how the Floor evolves relative to the Ceiling — encompassing not a single number but the threshold-gating (§2.2), decay behavior (§2.3), and institutional gating (§2.4) that jointly determine it. Slope, operationalized below as the realized rate S_t, is this paper’s empirical window onto that process, not a synonym for it: the phenomenon under study is synchronization; Slope is one of its measurable observables, in the same sense that heat flux is a measurable observable of the broader process of heat transfer. The remainder of this section develops a minimal, falsifiable dynamical realization of synchronization so defined.

2.0 State variables

Let C_t (Ceiling), F_t (Floor), and S_t (Slope) be discrete-time variables, with gap G_t = C_t - F_t. S_t is the realized synchronization rate — the empirically measurable fraction of the current gap closed in one period:

      S_t \equiv \Delta F_t/ G_t = (F_{t+1} - F_t)/ (C_t - F_t)  (1)

\beta _t, introduced below, is the structural parameter the model predicts S_t should equal; where measured S_t diverges from predicted \beta _t, that divergence is diagnostic (§3.3). One clarification here is load-bearing for everything that follows. Within a single period, \beta _t and S_t cannot diverge: rearranging F_{t+1} = F_t + \beta G_t for \beta yields exactly S_t’s definition, so a single-period structural estimate is tautologically identical to the realized rate. This is not a flaw to patch over; it means the model’s falsifiable content never lives in a single-period comparison. It lives in whether a constrained functional form of \beta across multiple periods, either a constant \bar{\beta} (§2.2) or the gap-dependent decay \beta (G_t) (§2.3), predicts the observed sequence of S_t values better than the alternatives it competes against. Made explicit, the comparison is a residual sequence, not a single number:

   e_t = \hat{\beta} (G_t) - S_t, t = 1, \ldots, T                         (2)

where \hat{\beta} (G_t) is a candidate functional form, with its fitted parameters, evaluated at period t’s observed gap. It is not a per-period estimate, but the prediction a single fitted form makes at each t once its parameters are estimated across the whole series. For the constant form this reduces to \hat{\bar{\beta}} with no G_t-dependence; for §2.3’s decay form it is \hat{\beta} _\infty + (\hat{\beta} _0 - \hat{\beta} _\infty)(G_t/G_0)^{\hat{p}}. A form is preferred over a competitor if it yields the smaller \sum e_t^2 across the available multi-period data for a given domain; this is the sense in which §5’s reported rates are falsifiable rather than descriptive. Divergence between a fitted form’s prediction and the realized S_t sequence, not between a single \beta _t and S_t, is what would falsify a given specification. Throughout §5, every reported synchronization rate is a fitted functional-form estimate over multiple periods, never a single-period value presented as an independent prediction.
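The comparison in equation (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the estimation procedure used in §5: the series is synthetic, the helper names (`simulate`, `realized_rates`, `fit_constant`, `fit_decay`) are hypothetical, and the decay form is fitted by a coarse grid over p with ordinary least squares for the remaining two parameters.

```python
# Synthetic illustration only: these series are invented, not data from §5.

def simulate(c0, f0, alpha, beta_fn, periods):
    """Generate C_t and F_t from eq. (3), allowing a gap-dependent beta."""
    C, F = [c0], [f0]
    for _ in range(periods):
        gap = C[-1] - F[-1]
        F.append(F[-1] + beta_fn(gap) * gap)
        C.append(C[-1] + alpha)
    return C, F

def realized_rates(C, F):
    """Eq. (1): S_t = (F_{t+1} - F_t) / (C_t - F_t), returned with the gaps G_t."""
    gaps = [c - f for c, f in zip(C[:-1], F[:-1])]
    rates = [(F[t + 1] - F[t]) / gaps[t] for t in range(len(gaps))]
    return gaps, rates

def sse(pred, obs):
    """Sum of squared residuals e_t from eq. (2)."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs))

def fit_constant(rates):
    """Constant form: the least-squares estimate is the mean realized rate."""
    b = sum(rates) / len(rates)
    return b, sse([b] * len(rates), rates)

def fit_decay(gaps, rates, p_grid):
    """Decay form beta_inf + (beta_0 - beta_inf) * (G_t/G_0)^p.
    For a fixed p the form is linear in x = (G_t/G_0)^p, so OLS gives
    the intercept (beta_inf) and slope (beta_0 - beta_inf)."""
    g0, n, best = gaps[0], len(rates), None
    for p in p_grid:
        x = [(g / g0) ** p for g in gaps]
        mx, my = sum(x) / n, sum(rates) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0:
            continue  # no variation in the gap: decay form not identifiable
        slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, rates)) / sxx
        beta_inf = my - slope * mx
        err = sse([beta_inf + slope * xi for xi in x], rates)
        if best is None or err < best["sse"]:
            best = {"p": p, "beta_inf": beta_inf,
                    "beta_0": beta_inf + slope, "sse": err}
    return best

# Assumed "true" process for the synthetic series: beta_inf=0.05, beta_0=0.5, p=1.
C, F = simulate(c0=10.0, f0=0.0, alpha=0.2,
                beta_fn=lambda g: 0.05 + 0.45 * (g / 10.0), periods=12)
gaps, rates = realized_rates(C, F)
constant_beta, constant_sse = fit_constant(rates)
decay = fit_decay(gaps, rates, p_grid=(0.5, 1.0, 2.0))
print(f"constant: beta={constant_beta:.3f}, SSE={constant_sse:.4g}")
print(f"decay: {decay}")
```

One caution on the design: because the decay form contains the constant form as a special case, its in-sample sum of squares can never be larger, so with few calibration points a held-out or penalized comparison would likely be needed before preferring it.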

C_t is domain-specific latent capability, not a cross-domain common currency. Go’s C_t is measured through move-quality metrics, AlphaFold’s through structure-prediction and trial-success benchmarks, frontier models’ through task-completion time horizons, FSD’s through SAE autonomy levels — these are not commensurate, and the model does not require them to be. What travels across domains is the functional form of the recurrence, not the units of C_t itself: \alpha, \beta, and G_t are estimated separately within each domain from that domain’s own proxy for C_t and F_t, and only the qualitative regime classification (§4) and structural comparisons (§5.5) cross domain boundaries. A claim that C_t has crossed threshold H in Go carries no numerical relationship to the same claim in AlphaFold; each is a separate, within-domain estimation problem.

C_t is bounded above by C_\infty — complete understanding, possibly literally unbounded — but C_\infty is never assigned a numerical value or fit to data; it functions only as a limiting case against which one specific claim is checked. That claim is precise: §2.1 treats \alpha as locally constant, and this is only a safe approximation if either C_\infty is infinite or the horizon under study is short relative to the distance remaining to C_\infty. If C_\infty is finite and near, \alpha would itself decay as C_t approaches it, by the same diminishing-returns logic §2.3 gives \beta as G_t approaches zero. No domain examined in this paper is close enough to any plausible C_\infty for this to be tested; the term’s function is to bound the constant-\alpha assumption’s scope of validity, not to be measured.

2.1 Base recurrence

            C_{t+1} = C_t + \alpha, F_{t+1} = F_t + \beta G_t    (3)

Substituting gives a single first-order recurrence in the gap:

             G_{t+1} = \alpha + (1 - \beta)G_t                               (4)

The fixed point is G* = \alpha/\beta (for \beta nonzero). For any positive \alpha and \beta, G* is greater than zero: the gap need not, and structurally cannot, reach zero while capability keeps advancing. This is a derived result, not an assumed one. Stability follows from |1 - \beta| < 1:

• \beta = 0: no coupling; G_t grows without bound if \alpha > 0, or stays fixed if \alpha = 0.

• 0 < \beta < 1: monotonic convergence to G* (the ordinary catch-up case).

• \beta = 1: the gap reaches G* = \alpha in a single period.

• 1 < \beta < 2: damped oscillation around G* (overtrust and correction).

• \beta = 2: sustained, undamped oscillation around G*.

• \beta > 2: divergent oscillation.
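The stability cases above can be checked numerically. The sketch below iterates equation (4) directly and compares it with the closed-form solution G_t = G* + (1 - \beta)^t (G_0 - G*), which follows from the linear recurrence. The parameter values are illustrative assumptions, not calibrated estimates.

```python
def simulate_gap(alpha, beta, g0, steps):
    """Iterate eq. (4): G_{t+1} = alpha + (1 - beta) * G_t."""
    gaps = [g0]
    for _ in range(steps):
        gaps.append(alpha + (1.0 - beta) * gaps[-1])
    return gaps

def closed_form(alpha, beta, g0, t):
    """G_t = G* + (1 - beta)^t * (G_0 - G*), with G* = alpha / beta."""
    g_star = alpha / beta
    return g_star + (1.0 - beta) ** t * (g0 - g_star)

ALPHA, G0, STEPS = 1.0, 10.0, 40   # illustrative values only
paths = {beta: simulate_gap(ALPHA, beta, G0, STEPS)
         for beta in (0.5, 1.5, 2.0, 2.5)}

for beta, path in paths.items():
    g_star = ALPHA / beta
    print(f"beta={beta}: G*={g_star:.3f}, first gaps="
          f"{[round(g, 3) for g in path[:5]]}, final gap={path[-1]:.3g}")
```

A negative gap in the oscillating runs means the Floor has overshot the Ceiling for that period, which is how the recurrence expresses overtrust.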

2.2 Threshold-gated coupling, generalized to heterogeneous thresholds

\beta is not a free-floating constant available at any C_t. Empirical evidence (a pre-2016 Go engine below human strength produced no measurable improvement in professional play) indicates a threshold: synchronization pressure requires C_t to have credibly exceeded the relevant baseline H, not merely to be increasing.

        \beta _t = \bar{\beta} \cdot \mathbb{1}[C_t > H]     (or smoothed: \bar{\beta} \cdot \sigma (k(C_t - H)))             (5)

A single shared threshold H is defensible where one relevant baseline exists (Go: professional strength; AlphaFold: prior standard of care) but not where adopters differ in what “good enough” means to them. Generalizing: for sub-population i with threshold H_i,

        \beta _t^{(i)} = \bar{\beta} \cdot \mathbb{1}[C_t > H_i], F_t = \int F_t^{(i)} di                    (6)

This is more than a bookkeeping refinement. As C_t climbs continuously through a population’s distribution of thresholds, successively larger slices are “activated” into convergence one after another, producing an S-shaped aggregate adoption curve as an emergent property of threshold heterogeneity, with no S-curve assumed anywhere in the model. Rogers-style diffusion becomes a derivable special case — specifically the case \alpha ≈ 0, a roughly fixed innovation diffusing through a population that differs only in when it crosses its own bar — rather than a competing framework this paper’s vocabulary merely relabels.

Un-aggregated, a person’s stated reason for non-adoption is diagnostic of which mechanism applies: “not good enough” is H_i not yet crossed (this section); “too expensive” or “too hard to use” are frictions operating after the gate has opened (§2.4, §3.3).

This aggregation can be made explicit rather than left qualitative. Suppose threshold H_i is distributed across the population according to a cumulative distribution function \Psi (H) (distinct from the multi-actor aggregation function \Phi introduced in §2.4, which plays an unrelated role). A segment has crossed threshold exactly when C_t exceeds its H_i, so the fraction of the population past threshold at time t is \Psi (C_t). Aggregate adoption is then directly proportional to \Psi (C_t) in the \alpha \approx 0 special case: precisely Rogers’ S-curve, with \Psi playing the role of the cumulative adopter-threshold distribution. This is a derivation, not an analogy: the S-shape follows from \Psi’s own sigmoidal shape under standard threshold-distribution assumptions (e.g. a normal or logistic H_i), not from an assumption layered onto the model. It should be stated plainly, however, that §5.3’s domain instantiation reports segment-level adoption curves (enterprise versus consumer, by firm size, by jurisdiction) as separate empirical points consistent with this distribution, not as a fitted aggregate \Psi (C_t); fitting the full distribution against data is a natural next empirical step this paper does not complete.
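A short sketch of the activated fraction \Psi (C_t) may help make the emergence of the S-shape concrete. It assumes logistically distributed thresholds with hypothetical location and scale values, and a Ceiling that climbs in equal steps; none of these numbers comes from §5.

```python
import math

def logistic_cdf(h, mu, s):
    """Psi(H) for logistically distributed thresholds (an assumed distribution)."""
    return 1.0 / (1.0 + math.exp(-(h - mu) / s))

def fraction_past_threshold(c_t, mu, s):
    """Share of sub-populations whose threshold H_i lies below C_t, i.e. Psi(C_t)."""
    return logistic_cdf(c_t, mu, s)

MU, S = 5.0, 1.0                            # hypothetical threshold location and spread
ceiling = [0.5 * t for t in range(21)]      # C_t climbs from 0 to 10 in equal steps
activated = [fraction_past_threshold(c, MU, S) for c in ceiling]
newly_activated = [b - a for a, b in zip(activated, activated[1:])]
peak = newly_activated.index(max(newly_activated))

for t in (0, 5, 10, 15, 20):
    print(f"t={t:2d}  C_t={ceiling[t]:4.1f}  activated share={activated[t]:.3f}")
print("period with the largest newly activated slice:", peak)
```

The per-period increments rise, peak where C_t passes the centre of the threshold distribution, and then fall, which is the S-shape arising from heterogeneity alone. The sketch tracks only which segments have been activated, not the subsequent convergence of each segment's Floor.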

Assumptions. (A1) Capability grows monotonically: C_{t+1} = C_t + \alpha, \alpha \geq 0. (A2) Synchronization is threshold-gated: \beta _t is active only once C_t exceeds the relevant H_i (§2.2). (A3) Coupling is positive once activated: 0 < \beta < 2. Note what these assumptions do not include: they say nothing yet about what follows from them, which is the content of the theorem below — stated separately from its own premises, unlike an earlier draft of this result that listed threshold-gating and positive coupling among its “conclusions” despite both being assumed outright.

Theorem (Synchronization under Persistent Capability Growth, informal). Given A1–A3: (i) the Ceiling–Floor gap converges to a non-zero steady state G* = \alpha/\beta whenever \alpha > 0 — perfect synchronization is not the generic outcome of continual capability growth (§2.1); (ii) convergence is monotonic for 0 < \beta < 1 and oscillatory for 1 < \beta < 2 (§2.1); (iii) under heterogeneous thresholds, aggregate adoption emerges as a Rogers-type S-curve without an S-curve being assumed (§2.2); (iv) as \alpha approaches 0, the model reduces exactly to classical diffusion (§2.2).

Each of (i)–(iv) is established here for the minimal discrete-time recurrence of §2.1–2.2 specifically, not claimed as a realization-independent result. Whether the same four properties survive under continuous-time, nonlinear, or stochastic realizations of A1–A3 is a natural target for future formalization; this paper establishes them for one concrete, falsifiable realization, which is what makes §5’s calibration against real data possible in the first place.

2.3 Non-constant synchronization rate

       \beta (G_t) = \beta _\infty + (\beta _0 - \beta _\infty)(G_t/G_0)^p             (7)

Preferred over a pure time-decay form because it is causally motivated: the remaining gap changes character as it narrows, not merely ages. \beta _0 is the initial rate just after threshold-crossing; \beta _\infty (greater than or equal to zero) is the asymptotic rate for the hardest residual component; p governs how sharply the transition occurs.

If \beta _\infty = 0 and the Ceiling has stopped advancing (\alpha \approx 0): the model predicts eventual full convergence, but at an ever-slowing rate. If \beta _\infty > 0 and \alpha > 0: the model predicts a genuine, permanent residual gap. Because \beta now depends on the gap, the fixed point solves \beta (G*) \cdot G* = \alpha, which approaches G* = \alpha/\beta _\infty as the residual becomes small relative to G_0. This is a formal counterpart to the observation, in the Go case, that AI’s move selection carries none of the personality or emotional content human players bring to five thousand years of the game. Under §2.2’s generalization, \beta _\infty may itself be sub-population-indexed (\beta _\infty^{(i)}), a further, currently open, empirical question. Distinguishing these regimes requires at least three well-scoped, same-metric calibration points per domain.

The power-law form is adopted as the minimal monotonic parameterization consistent with this causal story, not asserted as uniquely correct; exponential, logistic, or hyperbolic decay would each satisfy the same qualitative requirement — monotonic decline from \beta _0 toward \beta _\infty — and are not ruled out by anything in this paper. Distinguishing between these functional forms, like distinguishing \beta _\infty = 0 from \beta _\infty > 0, requires more calibration points than any domain in §5 currently provides. This should be read plainly: no case in this paper tests β(G_t)’s specific functional form, only the general claim that β declines as G_t shrinks. §2.3 is a theoretical refinement whose specific parameterization remains empirically open, not a fitted or tested result.
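The behavior of equation (7) inside the gap recurrence can be explored numerically. The sketch below is illustrative only: all parameter values are assumptions, and the clipping of G_t/G_0 to [0, 1] is a guard added here, not part of the model. It relies on the fact that, with a gap-dependent rate, a fixed point of equation (4) must satisfy \beta (G*) \cdot G* = \alpha.

```python
def beta_of_gap(gap, g0, beta_0, beta_inf, p):
    """Eq. (7). The ratio is clipped to [0, 1] as a numerical guard (an assumption)."""
    ratio = min(max(gap / g0, 0.0), 1.0)
    return beta_inf + (beta_0 - beta_inf) * ratio ** p

def simulate_gap(alpha, g0, beta_0, beta_inf, p, steps):
    """Eq. (4) with beta replaced by the gap-dependent beta(G_t)."""
    gaps = [g0]
    for _ in range(steps):
        g = gaps[-1]
        gaps.append(alpha + (1.0 - beta_of_gap(g, g0, beta_0, beta_inf, p)) * g)
    return gaps

G0, BETA_0, P, STEPS = 10.0, 0.5, 1.0, 500   # hypothetical values

# Case 1: saturated Ceiling (alpha = 0) and beta_inf = 0.
saturated = simulate_gap(0.0, G0, BETA_0, 0.0, P, STEPS)

# Case 2: advancing Ceiling (alpha > 0) and beta_inf > 0.
ALPHA, BETA_INF = 0.01, 0.05
advancing = simulate_gap(ALPHA, G0, BETA_0, BETA_INF, P, STEPS)
g_end = advancing[-1]
fixed_point_residual = beta_of_gap(g_end, G0, BETA_0, BETA_INF, P) * g_end - ALPHA

print(f"case 1: gap after {STEPS} periods = {saturated[-1]:.4f} (still closing)")
print(f"case 2: residual gap = {g_end:.5f}, alpha/beta_inf = {ALPHA / BETA_INF:.5f}")
```

In the first run the gap keeps shrinking but each period closes a smaller fraction of what remains. In the second the gap settles at a nonzero value slightly below \alpha/\beta _\infty, the limit it would approach if the residual were negligible relative to G_0.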

2.4 Two-tier floor, established-reality gating, and multi-actor determination of θ

Not every domain’s adoption process is a single continuously-adjusting quantity. Where adoption culminates in a discrete institutional decision — regulatory approval, legal certification — rather than continuous practice, F_t must be decomposed:

F_t^{gate} = \mathbb{1}[F_{t-\tau }^{cont} \geq \theta]     (8)

where F_t^{cont} evolves per §2.1–2.3, \theta is an institutional acceptance threshold, and \tau is a lag further decomposed:

\tau = \tau _{compress}(C_t) + \tau _{floor}                         (9)

\tau _{compress} is the portion of delay attributable to search, design, or discovery work — it responds to Ceiling growth and shrinks as C_t improves (a roughly 3–4x observed compression in AI-assisted drug discovery’s preclinical phase). \tau _{floor} is the portion gated by established reality (structurally fixed, capability-insensitive delay) — biological observation windows, mandated regulatory review periods — and is structurally insensitive to C_t: no improvement in capability shortens the time it takes to safely observe a drug’s effect in a human body, or the time a regulator is required to take.
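Equations (8) and (9) can be sketched as a small gate function. This is a stylized illustration: the functional form of `tau_compress`, the linear F^{cont} series, and every number below are assumptions chosen only to show that capability growth shortens the compressible part of the lag while the structurally fixed part remains.

```python
import math

def tau_compress(c_t, base_delay=6.0):
    """Hypothetical form: compressible delay shrinks as capability C_t grows."""
    return base_delay / (1.0 + c_t)

def total_lag(c_t, tau_floor):
    """Eq. (9), rounded up to whole periods for indexing."""
    return math.ceil(tau_compress(c_t) + tau_floor)

def gate_is_open(t, f_cont, ceiling, theta, tau_floor):
    """Eq. (8): the gate reads the continuous floor as it stood tau periods ago."""
    tau = total_lag(ceiling[t], tau_floor)
    if t - tau < 0:
        return False          # not enough history yet
    return f_cont[t - tau] >= theta

def first_open_period(f_cont, ceiling, theta, tau_floor):
    for t in range(len(f_cont)):
        if gate_is_open(t, f_cont, ceiling, theta, tau_floor):
            return t
    return None

PERIODS, THETA, TAU_FLOOR = 30, 0.5, 4
f_cont = [t / 10 for t in range(PERIODS)]        # stylized continuous floor, crosses theta at t=5
advancing = [float(t) for t in range(PERIODS)]   # Ceiling keeps improving
frozen = [0.0] * PERIODS                         # Ceiling never improves

open_advancing = first_open_period(f_cont, advancing, THETA, TAU_FLOOR)
open_frozen = first_open_period(f_cont, frozen, THETA, TAU_FLOOR)
print("gate opens at t =", open_advancing, "(advancing Ceiling)")
print("gate opens at t =", open_frozen, "(frozen Ceiling)")
```

In this toy setup the gate can never open earlier than the threshold-crossing period plus \tau _{floor}, however fast the Ceiling advances.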

Neither \theta nor \tau _{floor} is typically set by a single actor. Regulatory bodies, national-security and geopolitical actors, vendors, infrastructure and hardware providers, investors, and open-source communities each hold an effective threshold preference, and what is observed as the gate is whatever emerges from their interaction at a given place and time:

              \theta _t^{(loc)} = \Phi (\{(w_k^{(loc,t)}, \theta _k)\}_{k \in K})                                                       (10)

where K spans regulatory, national-security, vendor, infrastructure, investor, open-source community, and adopting-population actors, and w_k^{(loc,t)} is that actor’s influence weight — which varies by jurisdiction and by time. The aggregation function Φ is a real design decision. The cases examined in this paper (§5.3, §5.4) point toward a veto structure rather than an averaging one: the gate is set by whichever sufficiently-influential actor is most restrictive, not by a blend of preferences. Formally, \theta _t^{(loc)} equals the maximum \theta _k taken over all actors k whose influence weight w_k^{(loc,t)} exceeds a minimum threshold of relevance. Actors split directionally: regulatory and national-security actors generally push θ up (more caution, slower access); vendors, investors, and open-source advocates generally push it down (market share, returns, wider access).
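The veto structure described above can be written out as a short function. The actor names, weights, thresholds, and relevance cutoff below are hypothetical placeholders, included only to contrast veto-max aggregation with a weighted average.

```python
def veto_max_threshold(actors, w_min):
    """theta_loc = max theta_k over actors whose influence weight exceeds w_min.
    Returns (theta_loc, binding_actor), or (None, None) if no actor is relevant."""
    relevant = {name: a for name, a in actors.items() if a["w"] > w_min}
    if not relevant:
        return None, None
    binding = max(relevant, key=lambda name: relevant[name]["theta"])
    return relevant[binding]["theta"], binding

def weighted_average_threshold(actors):
    """The averaging alternative, shown only for contrast."""
    total_w = sum(a["w"] for a in actors.values())
    return sum(a["w"] * a["theta"] for a in actors.values()) / total_w

# Hypothetical actors for one jurisdiction at one time; not estimates from §5.
actors = {
    "regulator":         {"w": 0.60, "theta": 0.80},
    "national_security": {"w": 0.05, "theta": 0.95},
    "vendor":            {"w": 0.70, "theta": 0.30},
    "open_source":       {"w": 0.30, "theta": 0.20},
}

theta_veto, binding_actor = veto_max_threshold(actors, w_min=0.10)
theta_avg = weighted_average_threshold(actors)
print(f"veto-max gate: {theta_veto} (binding actor: {binding_actor})")
print(f"weighted-average gate: {theta_avg:.3f}")
```

Lowering `w_min` far enough to admit the low-influence national-security actor would move the gate to that actor's threshold, which shows how sensitive the veto form is to the relevance cutoff.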

One refinement is earned by the evidence rather than assumed: \theta _k is not always constant across model types. The EU AI Act’s Article 53(2) applies a lighter regulatory threshold to open-weight models than to closed ones below a fixed compute ceiling — a single regulatory actor applying a different threshold depending on whether a model is open or closed. Openness is therefore a property that changes which threshold an actor applies, not a separate actor class in its own right.

F_t^{gate} is a downstream readout of F_t^{cont}, not a variable that feeds back into the C_t/F_t recurrence itself: crossing the gate changes what is observed, not the underlying dynamics generating F_t^{cont}. Where a gate-crossing event plausibly does feed back into adoption dynamics — regulatory approval itself accelerating subsequent uptake, for instance — that would be a further extension this paper does not model.

2.5 Methodological caveat: three sources of apparent, non-genuine convergence

A measured narrowing of G_t does not always indicate positive \beta. Three distinct confounds, all biasing naive estimates upward, have been identified in this paper’s data:

• Endogenous task-space narrowing: a solution space that mechanically shrinks as a single task instance nears completion (Go’s endgame, where few legal non-losing moves remain), independent of any learning across calendar time.

• Established-reality gating: near-zero apparent gap during a \tau _{floor} window reflecting an unfinished external clock, not high \beta.

• Population-mixture conflation: aggregating sub-populations with heterogeneous H_i (§2.2) into a single F_t masks a mixture of regimes (some pre-threshold, some converging at different rates) as if it were one smooth trend.

Any empirical S_t requires a measurement window, and where relevant a population segment, scoped to exclude these — a general precondition for treating a measured rate as evidence about \beta, not a domain-specific footnote. A fourth, related instance is identified empirically in §5.3: fixed-scale capability benchmarks (accuracy percentages bounded at 100%) can show apparent convergence between systems purely as an artifact of approaching their own ceiling, independent of the true underlying gap.
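The fixed-scale benchmark artifact is easy to reproduce in a toy setting. The sketch assumes a logistic link from latent capability to a 0–100% accuracy score; the link and the values are assumptions, and the latent gap between the two systems is held constant by construction.

```python
import math

def accuracy(latent):
    """Assumed logistic link from latent capability to an accuracy score capped at 100%."""
    return 100.0 / (1.0 + math.exp(-latent))

LATENT_GAP = 1.0                                   # true gap, constant by construction
leader = [float(k) for k in range(1, 8)]           # latent capability of the stronger system
follower = [x - LATENT_GAP for x in leader]        # always exactly LATENT_GAP behind
observed_gap = [accuracy(a) - accuracy(b) for a, b in zip(leader, follower)]

for a, gap in zip(leader, observed_gap):
    print(f"leader latent={a:.0f}  leader acc={accuracy(a):6.2f}%  observed gap={gap:5.2f} pts")
```

The measured gap in percentage points shrinks steadily even though the latent gap never changes, so a narrowing read off such a benchmark is not by itself evidence of positive \beta.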

3. Loss as a Derived Quantity

3.1 Why a weighted sum fails

The natural first instinct is a weighted sum, L = w_1E + w_2A + w_3I + w_4R, over epistemic instability, agency degradation, institutional mismatch, and reality drift. That introduces four free, uncalibrated weights to explain a quantity we already have a state variable for. This doesn’t make the framework more falsifiable; each uncalibrated term is a new opportunity for circularity, since nothing in the paper specifies how E, A, I, R are independently measured, let alone weighted. A model with four unconstrained parameters explaining one already-defined quantity (G_t) is strictly worse than a model with none.

3.2 Loss as a function of the gap

The loss function, denoted L(H,M) in v1.2’s original, undefined notation, is not a new object here. It is a monotonic transform of G_t, the state variable Section 2 already governs:

                                              L_t = G_t^2                               (11)

The quadratic form is preferred over L_t = G_t for the same reason it’s preferred in classical control theory: it penalizes large gaps disproportionately and yields the standard result that \min L_t is achieved exactly at the fixed point G* = \alpha/\beta (or \alpha/\beta _\infty, under §2.3’s decay model). Under §2.2’s population generalization, this extends to an aggregate loss L_t = \int G_t^{(i)2} di, minimized not by uniformly improving β everywhere, but by closing the highest-gap segments first — itself a testable prediction about which sub-populations should show fastest improvement. This is the same logic underlying Lyapunov-function stability analysis and linear-quadratic-regulator design in control theory, where a quadratic cost is the conventional choice precisely because it is the simplest function guaranteeing a unique, well-behaved minimum at a system’s equilibrium; §6.5 returns to this choice as a modeling decision rather than a uniquely forced one.

\min L(H,M) is not a separate problem to be solved from outside the system. It is a restatement of what convergence under §2.1’s recurrence already means: the system is already, by construction, moving toward the infimum of L_t as t grows. This closes the paper’s first literal use of the word “optimization,” which earlier drafts of this vocabulary gestured toward without a defined objective.
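A brief numerical check of this descriptive reading, under assumed parameter values: for a trajectory that starts above G* in the monotonic regime (0 < \beta < 1), the loss of equation (11) falls every period and approaches (G*)^2 from above without any actor minimizing it.

```python
ALPHA, BETA, G0, STEPS = 1.0, 0.25, 20.0, 30   # illustrative values only
G_STAR = ALPHA / BETA                          # fixed point of eq. (4)

gaps = [G0]
for _ in range(STEPS):
    gaps.append(ALPHA + (1.0 - BETA) * gaps[-1])   # eq. (4)

loss = [g ** 2 for g in gaps]                      # eq. (11): L_t = G_t^2

print(f"G* = {G_STAR}, (G*)^2 = {G_STAR ** 2}")
print("L_0, L_10, L_30 =", round(loss[0], 3), round(loss[10], 3), round(loss[-1], 3))
```

In the oscillatory regime (1 < \beta < 2) the gap alternates around G*, so L_t need not fall in every period even though the deviation from G* shrinks; the monotone decline shown here is specific to the catch-up case.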

3.3 E, A, I, R as a diagnostic taxonomy, not a weighting scheme

Retire the four terms as loss-components. Reintroduce them as candidate explanations for why \beta or \tau _{floor} take the values they do in a given domain — a taxonomy for diagnosis, not decomposition. Following §2.4’s multi-actor extension, the institutional-mismatch term is itself actor-decomposable rather than a single scalar:

      I_t = \max_k [w_k^{(loc,t)} \cdot I_k]                            (12)

with each I_k tagged by its actual stated reason — national security (export controls, defense-contract restrictions), safety or reliability (regulatory review), data sovereignty (data-localization requirements), commercial positioning (vendor access tiers). This is a genuinely more diagnostic taxonomy than a single catch-all number, and it costs nothing new to derive: it is already implied by §2.4’s veto-max aggregation.
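As a rough illustration of how Eq. (12) picks out a binding actor, here is a minimal sketch. The `ActorConstraint` type, its field names, and every number in the example are hypothetical; the paper does not report values for w_k or I_k.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActorConstraint:
    """One actor's contribution to institutional mismatch (all fields hypothetical)."""
    actor: str
    reason: str       # the actor's stated reason, e.g. "data sovereignty"
    weight: float     # w_k^{(loc,t)}: the actor's weight in this locale and period
    mismatch: float   # I_k: the actor's own mismatch score


def binding_constraint(constraints):
    """Return (I_t, binding actor) under veto-max aggregation, Eq. (12)."""
    if not constraints:
        raise ValueError("at least one actor constraint is required")
    binding = max(constraints, key=lambda c: c.weight * c.mismatch)
    return binding.weight * binding.mismatch, binding


# Made-up numbers for illustration only.
example = [
    ActorConstraint("export-control agency", "national security", 0.9, 0.5),
    ActorConstraint("regulator", "safety or reliability", 0.6, 0.9),
    ActorConstraint("vendor", "commercial positioning", 0.3, 0.4),
]

i_t, who = binding_constraint(example)
print(f"I_t = {i_t:.2f}, binding actor: {who.actor} ({who.reason})")
```

Returning the binding actor alongside the scalar is the point of the sketch: the maximum answers the diagnostic question of who is currently binding and why, which a weighted sum would blur.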

| Term | Reframed as | Diagnostic question |
| --- | --- | --- |
| Institutional mismatch (I) | Actor-decomposed driver of θ and τ_{floor} | Which actor, with which stated reason, is currently binding? |
| Epistemic instability (E) | Driver of low β_t post-threshold | Is C_t’s output too unreliable or contested for H to credibly adopt, even post-threshold? |
| Agency degradation (A) | Driver of β_∞ | Does adoption erode the human capacity that would otherwise sustain further catch-up? |
| Reality drift (R) | Driver of α’s reliability | Is C_t itself measuring something that has drifted from the ground truth it’s meant to track? |

This taxonomy is incomplete by design, not exhaustive. Go’s residual gap — attributed, per Lee Sedol, to the personality and emotional content AI’s play doesn’t carry — doesn’t map cleanly onto any of the four; it may be a genuinely distinct category (stylistic or expressive residue) or evidence the taxonomy needs a fifth term. Flagging this openly is more useful than forcing a fit.

This taxonomy earns its keep only if it can distinguish cases that would otherwise look identical. Regime 0 (pre-threshold) and Regime 1 (stagnation) are the clearest test: both show S_t \approx 0, but they imply different diagnostic categories — Regime 0 implies no E/A/I/R term is yet active, while Regime 1 implies I or E is actively suppressing an already-open gate. The operational distinguishing criterion is independent evidence that C_t has crossed H_i — a documented capability benchmark, an expert consensus, a credentialing event — rather than inferring the crossing from the adoption trend itself, which is exactly the population-mixture confound flagged in §2.5. Where such independent evidence is unavailable, a case should be reported as ambiguous between Regimes 0 and 1 rather than assigned to one by default; §5 follows this discipline throughout.

3.4 The derived minimization problem

Restating the minimization with the constraints that Section 2 actually earns rather than merely asserts:

  \min_t L_t     subject to     0 \leq \beta _t \leq 2,     \tau _{floor} \geq \tau _{floor}^{\min}                 (13)

Two results follow that were not visible in the descriptive version of the framework. First, G_t \geq 0 is not guaranteed by definition — it is possible for F_t to exceed C_t, adoption outrunning demonstrated capability. This is a distinct failure mode, overtrust, given its own regime in Section 4 rather than ruled out by assumption. Second, improving β cannot buy past τ_{floor}: β governs the rate of continuous convergence, while τ_{floor} is a hard, additive delay set by established reality, structurally orthogonal to β. No matter how efficient reciprocal learning becomes, L_t cannot fall below what τ_{floor} permits. Synchronization efficiency and institutional readiness are not substitutes for each other; they are independent constraints on the same objective.

This minimization is descriptive, not normative: it names where the system’s own dynamics already carry it, not a lever any actor pulls. The recurrence, not an optimizing agent, drives L_t toward its infimum — the system’s asymptotic behavior corresponds to the infimum of L_t, rather than any actor selecting β or τ_{floor} to achieve it. Where an actual actor does intervene to alter β or τ_{floor} — a regulator tightening τ_{floor}, a vendor investing to raise β — that intervention changes the recurrence’s parameters and is properly modeled as a shift in the multi-actor gating of §2.4, not as solving the minimization problem stated here. The word “optimization” should be read throughout in this descriptive, attractor sense, not a normative, policy-lever sense.

The stability analysis of §2.1, restated here as a minimization, is also what generates the regime taxonomy of §4: each regime corresponds to a distinct qualitative relationship between S_t and the stability boundaries already derived, not an independently posited category.

4. Regime Classification

Every regime below falls directly out of S_t’s value in G_{t+1} = \alpha + (1 - S_t)G_t; none requires a new assumption beyond what §2–§3 already established. The boundary at S_t = 1 deserves particular attention: since \Delta F_t \leq G_t under ordinary catch-up, S_t \leq 1 is the expected ceiling on a single period’s closure. S_t > 1 is therefore not a faster-catch-up case. It necessarily means F_{t+1} exceeds C_t: adoption has overshot demonstrated capability. This single observation is what turns the overtrust regime from an ad hoc addition into the natural continuation of the same stability analysis that produced the rest of the table.

Regimes are defined structurally, by \beta and its stability implications (§2.1); the table below gives each regime’s empirical signature — the pattern a correctly-measured S_t would show if that regime held — since β itself is never observed directly (§2.0). This distinction matters most at the S_t = 1 boundary: a brief, noisy excursion above 1 in low-frequency or coarsely-sampled data is not the same claim as a sustained Regime 4 trace, and §2.5’s measurement-artifact cautions apply here as much as to the gap itself. Throughout §5, a regime classification is asserted only where multiple periods of data support it, never from a single-period S_t reading.

| Regime | Empirical Signature (S_t) | Behavior of G_t | Interpretation |
| --- | --- | --- | --- |
| 0: Pre-threshold | No defined S_t (C_t ≤ H_i) | undefined | No synchronization pressure yet for this segment; the precondition for a gap to close has not been met. |
| 1: Stagnation | C_t > H_i, S_t ≈ 0 | grows without bound | Capability credibly superior, uptake absent anyway. Likely I- or E-driven (§3.3). |
| 2: Lagging convergence | 0 < S_t < 1 | shrinks to G* = α/S_t | The generic real-world case. Bifurcates into 2a (β∞ = 0, slow full convergence) and 2b (β∞ > 0, permanent residual). |
| 3: Steady-state parity | S_t = 1 | plateaus at α | Floor closes each period’s addition in full but never touches the compounding one-period lag. |
| 4: Overtrust, damped | 1 < S_t < 2 | overshoots, decays back | Adoption briefly outruns capability, then self-corrects, e.g. a safety incident followed by a pullback in permitted use. |
| 5: Overtrust, runaway | S_t ≥ 2 | divergent oscillation | Repeated overcorrection, never settling; the most structurally dangerous regime. |
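The signatures in the table can be written down as a small classifier. This is a minimal sketch, not a procedure the paper itself applies: the tolerance `tol` used for "approximately 0" and "approximately 1" and the `min_periods` persistence rule are hypothetical choices, and whether the threshold has been crossed must be supplied from independent evidence rather than inferred from S_t.

```python
def classify_regime(s_t, threshold_crossed=True, tol=0.05):
    """Map a single S_t reading to the regime whose signature it matches.

    `tol` is a hypothetical tolerance band; the paper does not specify one.
    """
    if not threshold_crossed:
        return "0: pre-threshold"        # C_t <= H_i, so S_t is undefined
    if s_t < -tol:
        return "unclassified"            # the table has no row for a retreating Floor
    if s_t <= tol:
        return "1: stagnation"
    if s_t < 1 - tol:
        return "2: lagging convergence"
    if s_t <= 1 + tol:
        return "3: steady-state parity"
    if s_t < 2:
        return "4: overtrust, damped"
    return "5: overtrust, runaway"


def classify_series(s_values, threshold_crossed=True, min_periods=3):
    """Assert a regime only if the last `min_periods` readings all agree."""
    labels = [classify_regime(s, threshold_crossed) for s in s_values]
    if len(labels) >= min_periods and len(set(labels[-min_periods:])) == 1:
        return labels[-1]
    return "ambiguous"


print(classify_series([0.15, 0.17, 0.16]))   # sustained readings in one band
print(classify_series([0.5, 1.3, 0.6]))      # a single excursion above 1
```

The second call returns "ambiguous" rather than Regime 4, which mirrors the rule that a brief excursion above 1 is not a sustained overtrust trace.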

Two structural notes are worth stating explicitly here rather than leaving implicit. First, Regimes 0 and 1 are easy to conflate empirically but should not be conflated theoretically: both show S_t \approx 0 and a growing gap, and the difference (whether C_t has actually crossed H_i) is exactly the kind of thing §2.5’s methodological caveat warns about. With §2.2’s per-segment H_i, this becomes empirically separable rather than merely theoretically distinct: a segment’s stated reason for non-adoption (capability-gated versus friction-gated versus price-gated) can identify which regime applies directly, rather than being inferred solely from the shape of an aggregate trend.

Second, a single domain can occupy multiple regimes simultaneously across segments, not just one classification for the whole population. Free-tier, paid-individual, and enterprise adoption of the same underlying C_t may sit in three different regimes at once. This is not a hypothetical: §5.3 and §5.4 each document a real, dated instance of exactly this — one Ceiling-adjacent event producing opposite regime behavior in two different segments of the same population at the same time.

One phenomenon identified empirically in §5.3 is deliberately left unclassified here rather than forced into Regimes 4–5: sustained oscillation of the threshold \theta _t itself, driven by contest among directionally-opposed actors (§2.4), is a distinct mechanism from adoption overshooting a fixed threshold. Regimes 4 and 5 describe S_t oscillating around a stable \theta; the phenomenon in §5.3 is \theta itself oscillating under multi-actor contest. Section 6.3 discusses this as an open extension rather than resolving it here.

5. Domain Instantiation

Four cases, chosen deliberately as structural opposites and tested concurrently rather than retrospectively. The selection criterion is explicit: cases were chosen to maximize structural diversity in how the model’s mechanisms could be stress-tested, not to maximize the evidence supporting the model. Go is closed — a single, dated Ceiling event, a decade of aftermath, effectively no institutional gate. AlphaFold-driven drug development is open — a still-expanding Ceiling, a live institutional gate that has not yet fired. Frontier model releases exercise the richest heterogeneity and multi-actor gating of any case. Tesla FSD is jurisdiction-fragmented, with the same technology facing independent institutional gates across regions around a finite, named target. Where a case failed to produce the pattern it was selected to test — most notably FSD’s absent Regime 4 trace, §5.4 — that failure is reported as data, not omitted. A fifth candidate case, autonomous AI-driven scientific research, was considered and excluded from full treatment (§6.8): it confirms only Regime 0, which Go’s own pre-2016 data already establishes inside a fully calibrated case. More generally, cases requiring continuous, high-frequency measurement — the kind that could in principle catch a damped oscillation mid-cycle rather than only at annual checkpoints — do not yet exist for any AI domain; all four selected cases have at least annual observational resolution, which is a real limit on what this paper’s data could show even where the underlying dynamics might justify a finer-grained claim.

5.1 Closed case: Go

Threshold crossing (§2.2). C_t \leq H for essentially all of Go engine history through 2015: Crazy Stone, well below professional strength, produced no measurable change in professional move quality. The gate opens sharply at t_0 = March 2016 (AlphaGo–Lee Sedol), reinforced by AlphaGo Zero’s 100–0 result against the original AlphaGo in 2017, after which C_t is well-approximated as saturated: \alpha \approx 0 for the remainder of the series. This is the cleanest possible realization of the Regime 0-to-2 transition, a step function in C_t rather than a gradual crossing, which is part of why the domain is analytically tractable but also why it cannot test anything about \alpha’s behavior when \alpha is nonzero.

Calibration, scoped per §2.5. The only calibration data not subject to the endgame-narrowing confound is a panel study restricted to a game’s first 30 moves (Choi, Kang, Kim & Kim, forthcoming), giving two same-metric anchor points:

       G_0 \approx 2.47 pp (pre-2017),     G_1 \approx 1.71 pp (2017–19 average)   (1)

which implies S_t \approx 15–17% per year over this window. With only two anchor points, this is a provisional point estimate, not a confidence interval: no standard error can be meaningfully reported from two observations, and the assumption of a roughly constant rate across the 2017–19 window is untested. This should be read as the paper’s best currently available estimate for this domain, not a precisely bounded measurement. The 2022 Korean Baduk League whole-game figures (37.5% for the top-ranked player, 28.5% league average move-match rate) are reported here only to illustrate the endgame-narrowing confound identified in §2.5 — they are not used to corroborate, bound, or otherwise triangulate the two-point estimate above, since a statistic already flagged as confounded cannot simultaneously serve as supporting evidence for a separate claim. A separate dataset study independently confirms the phase-scoping logic itself, showing post-2016 move-coincidence growth concentrated almost entirely in the opening, with markedly smaller growth elsewhere in the game.
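One way to reproduce the 15–17% range from the two anchors is sketched below. It assumes \alpha \approx 0 (as stated for Go) and roughly two years between the anchor points; the elapsed interval is an assumption made here for illustration, since the panel’s exact dating is not given above.

```python
def annual_rate_linear(g_start, g_end, years):
    """Average share of the initial gap closed per year, without compounding."""
    return (1.0 - g_end / g_start) / years


def annual_rate_compound(g_start, g_end, years):
    """Constant S solving g_end = g_start * (1 - S) ** years (the alpha = 0 case)."""
    return 1.0 - (g_end / g_start) ** (1.0 / years)


G0, G1 = 2.47, 1.71   # percentage-point gaps at the two anchor points
YEARS = 2.0           # ASSUMPTION: elapsed time between anchors

linear = annual_rate_linear(G0, G1, YEARS)
compound = annual_rate_compound(G0, G1, YEARS)
print(f"linear:   {linear:.1%}")
print(f"compound: {compound:.1%}")
```

With these inputs the linear reading gives about 15.4% per year and the compounding reading about 16.8%, so the spread in the reported range can be read as the difference between the two conventions rather than as a statistical interval.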

Regime and open question. Go sits in Regime 2 throughout the observed window. Whether it is 2a or 2b — full eventual convergence versus a permanent residual — is undetermined by current data; distinguishing them requires a third, later, opening-phase-scoped anchor point that a dedicated search did not locate. Lee Sedol’s remark about personality and emotional content in play is suggestive of \beta _\infty > 0 but is testimony, not measurement.

What this case validates, and what it cannot. Go is strong evidence for §2.2 (the threshold gate) and for S_t as a computable, falsifiable quantity distinct from the ratio form it replaced. It has essentially nothing to say about §2.4 — there is no institutional gate in professional Go — and nothing to say about Regimes 4–5, since S_t never approached 1 in the data available.

5.2 Open case: AlphaFold and AI-driven drug development

Non-saturating Ceiling. Unlike Go, \alpha \neq 0 here and shows no sign of approaching zero: AlphaFold 2 (2020, structure prediction) to AlphaFold 3 (2024, molecular complexes) to conformational dynamics and de novo generative design (2025–26, Chai-2’s zero-shot antibody design success rate moving from under 0.1% to 16% in roughly a year). This is the case that actually exercises the full recurrence from §2.1 rather than its \alpha = 0 special case.

Two-tier floor, per §2.4. The continuous signal and the discrete gate diverge sharply:

F_t^{cont}: 52% → 80–90% Phase I success rate (historical baseline → AI-discovered)   (2)

     F_t^{gate}: 0 (no AI-designed drug has reached full approval as of mid-2026)                 (3)

The τ decomposition: illustrative evidence, not yet a calibration. Insilico Medicine’s Rentosertib program is the best currently available evidence for this decomposition, and it is worth being precise about what that means before reporting the numbers: Insilico’s platform predates and runs parallel to AlphaFold’s own structure-prediction lineage rather than descending from it, so what follows is illustrative of the τ split, not a calibration of the AlphaFold case specifically. Isomorphic Labs is the literal AlphaFold-descended program; it entered Phase I only in 2025 and has no efficacy readout yet, meaning the direct test this paper would need has not yet reached the gate. With that caveat: target discovery to preclinical candidate nomination took roughly 4–6 years conventionally, versus roughly 18 months AI-assisted — a 3–4x compression, and this is where the compression stops. Elapsed time from Phase I start to Phase IIa readout was roughly four years, showing no clear compression versus the conventional Phase I plus Phase II baseline. Summing forward through Phase III and NDA review, still ahead and neither historically AI-compressible, the best-documented AI-native drug program on record projects a total discovery-to-approval timeline of roughly 9–13 years — barely below the 10–15 year historical average. This is consistent with \tau _{floor} dominating \tau _{compress}, not a confirmed measurement of it, and is the best available answer to why five to six years past the Ceiling event was never going to be enough — pending the AlphaFold-descended case actually reaching the gate.
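The arithmetic behind the claim that \tau _{floor} dominates can be laid out explicitly. This is a back-of-envelope sketch using only the figures quoted above; counting every stage after candidate nomination as uncompressed is an assumption made here for illustration, motivated by the absence of reported compression in those stages.

```python
# Figures quoted in the text for the Rentosertib program (all in years).
DISCOVERY_CONVENTIONAL = (4.0, 6.0)   # target discovery to preclinical candidate
DISCOVERY_AI = 1.5                    # the same stage, AI-assisted
PROJECTED_TOTAL = (9.0, 13.0)         # projected discovery-to-approval range


def uncompressed_share(compressed_years, total_years):
    """Fraction of the total timeline lying outside the compressed stage."""
    return 1.0 - compressed_years / total_years


# Years saved in the one stage where compression was observed.
years_saved = tuple(c - DISCOVERY_AI for c in DISCOVERY_CONVENTIONAL)

# ASSUMPTION: everything after candidate nomination counts toward tau_floor.
shares = tuple(uncompressed_share(DISCOVERY_AI, t) for t in PROJECTED_TOTAL)

print(f"years saved in discovery: {years_saved[0]} to {years_saved[1]}")
print(f"uncompressed share of timeline: {shares[0]:.0%} to {shares[1]:.0%}")
```

On these numbers, between roughly 83% and 88% of the projected timeline sits in stages showing no compression, which is the sense in which a large speed-up in discovery moves the total only slightly.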

Regime classification. On F^{cont}, this domain is squarely Regime 2 — ordinary lagging convergence, and a fast one. On F^{gate}, it is better read as Regime 2 with τ still in progress than as Regime 0 or 1: the threshold was unambiguously crossed (CASP14, ratified by the 2024 Nobel Prize in Chemistry), so near-zero terminal movement reflects an uncompleted institutional clock, not stagnation.

What this case validates, and what it cannot. AlphaFold is the necessary complement to Go: it is the only one of the two that tests \alpha \neq 0, the two-tier floor, and the \tau _{compress}/\tau _{floor} split. It has nothing to say yet about overtrust — adoption has, if anything, lagged demonstrated capability throughout — and nothing to say about \beta _\infty, since F^{gate} has never fired even once.

5.3 Multi-segment case: frontier model releases

Ceiling: METR as the calibration metric, other benchmarks as supporting evidence. This domain is the one place in the paper where several plausible proxies for C_t exist side by side, and per §2.0’s operationalization discipline, the calibration below commits to one of them rather than an average or composite: METR’s time-horizon metric — the length of task that a model completes autonomously at 50% reliability — because it is the only available proxy that is not bounded at a fixed ceiling. The series: GPT-2 (2019, roughly 2 seconds) to Claude 3.7 Sonnet (early 2025, roughly 50 minutes) to Claude Opus 4.5 (November 2025, roughly 4.9 hours) to GPT-5.2 (December 2025, roughly 5.9 hours) to Claude Opus 4.6 (February 2026, roughly 12 hours, wide confidence interval). Doubling time is roughly 7 months across 2019–2025, with evidence — flagged by METR itself as sensitive to noise and task-suite composition — of acceleration toward roughly 4 months in 2023–25 data.
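The roughly 7-month doubling time can be sanity-checked from two of the endpoints quoted above. The sketch below assumes about 72 months between GPT-2 and Claude 3.7 Sonnet (the text gives only the year and "early 2025") and a clean exponential between those two points; it is a two-point check, not a reproduction of METR’s own fit.

```python
import math


def doubling_time_months(horizon_start, horizon_end, months_elapsed):
    """Doubling time implied by exponential growth between two observations."""
    doublings = math.log2(horizon_end / horizon_start)
    return months_elapsed / doublings


GPT2_HORIZON_S = 2.0            # roughly 2 seconds (2019)
SONNET_37_HORIZON_S = 50 * 60   # roughly 50 minutes (early 2025), in seconds
MONTHS_ELAPSED = 72             # ASSUMPTION: about six years between the two

dt = doubling_time_months(GPT2_HORIZON_S, SONNET_37_HORIZON_S, MONTHS_ELAPSED)
print(f"implied doubling time: {dt:.1f} months")
```

The two endpoints imply about 10.5 doublings and a doubling time near 6.8 months, consistent with the figure reported for 2019–2025.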

MMLU, HumanEval, and GSM8K are reported here only as a second, independent illustration of §2.5’s measurement-artifact caveat, not as alternative definitions of C_t for this domain: every frontier model now scores above 90% on each, and the apparent narrowing of the gap between models could equally be read as a fixed-scale ceiling effect or as genuine capability convergence. This interpretation is contested in the broader literature, and this paper’s measurement-artifact reading takes one side of an active debate rather than reporting a settled fact. What is not contested is that METR’s non-saturating metric shows continued, undiminished separation between models over the same period in which the bounded benchmarks compress — that contrast is the evidentiary basis for treating the compression as at least partly artifactual, independent of which side of the broader debate one takes.

Heterogeneous thresholds (§2.2), the richest case for this mechanism. Adoption fractures along multiple independent axes simultaneously: enterprise versus consumer, firm size (large firms at roughly 52% adoption versus small firms at roughly 17.4%), and jurisdiction (EU average 20.0%, ranging from Denmark at 42.0% to Romania at 5.2%).

Multi-tier floor: F_t is segment-indexed, per §2.2, not a single series. Consistent with the heterogeneous-threshold extension, F_t in this domain is reported per segment (F_t^{consumer}, F_t^{enterprise}, F_t^{paid}) rather than as one undifferentiated Floor value; the series below are separate empirical points on the threshold distribution §2.2 derives, not competing measurements of a single quantity. Consumer adoption (ChatGPT weekly active users): 100 million (January 2023) to 200 million (August 2024) to 300 million (December 2024) to 400 million (February 2025) to 700 million (July 2025) to 800 million (October 2025) to 900 million (February 2026), followed by 1 billion monthly active users (June 2026), a monthly rather than weekly count and so not directly comparable with the earlier points. Enterprise adoption: roughly 20% (2020) to 55% (2024) to 65–72% (Q1 2026). Paid conversion: roughly 5–6% of active users. These are enough real points to fit an actual adoption curve, in principle testable against §2.2’s derived prediction that pure threshold-heterogeneity diffusion should look like a single clean S-curve, while genuine moving-Ceiling diffusion should show re-acceleration at each capability tier crossing.

Multi-actor gating, worked (§2.4). Following a Commerce Department export-control restriction limiting Anthropic’s Mythos and Fable models for foreign nationals, OpenAI voluntarily imposed its own limits on new releases at the government’s request — one actor’s veto against one vendor propagating to a second, unrelated vendor with no direct restriction applied to it. This is the cross-actor propagation effect the veto-max form predicts. Real market behavior independently confirms openness-conditional \theta: Mistral AI (France) reached a $14 billion valuation in September 2025 on explicitly sovereign-AI positioning, and Germany’s Aleph Alpha built its architecture around mandatory data residency because European government clients required exactly those terms as a condition of adoption.

A distinct oscillation phenomenon, detailed in §6.3. US chip-export policy toward China moved repeatedly on the same underlying question: a 2022 ban, 2025 expansion, an April 2025 halt on H20 exports, a July 2025 reversal three months later, further loosening in December 2025, and withdrawal of a proposed re-tightening in February 2026, with active smuggling prosecutions running throughout — even within the single actor class of “government,” the executive branch loosened while Congress pushed to re-tighten. This is the paper’s clearest empirical instance of what §6.3 names and formalizes as endogenous gate oscillation: the threshold itself moving under multi-actor contest, rather than adoption oscillating around a fixed threshold (Regimes 4–5).

A dated, suggestive instance of simultaneous divergent regimes across segments. In February 2026, a public dispute between Anthropic and the U.S. Department of Defense over Claude’s usage restrictions led federal agencies to begin phasing Claude out — while, in the same window, Claude’s consumer mobile market share rose sharply, with independent trackers tying the rise to the same event. This is consistent with the multi-segment model’s prediction that different H_i populations can occupy different regimes simultaneously, but it rests on a single event with plausible alternative explanations on the consumer side — a concurrent marketing push or feature release could independently explain the share increase — and should be read as suggestive evidence for multi-regime coexistence, not a dispositive test of it.

What this case validates, and what it cannot yet. Non-saturating C_t (second confirmation of the measurement-artifact point), §2.2’s heterogeneity extension across the most axes of any case, and multi-regime coexistence (§4). It cannot yet supply a clean Regime 4/5 trace — the DoD episode is one data point, not a damped-oscillation series — nor test \beta _\infty, since the curve has not visibly begun decelerating.

5.4 Jurisdiction-gated case: Tesla FSD

A domain-local, finite Ceiling rather than C∞. Unlike the AI-capability cases, FSD has a nameable, bounded target: SAE Level 5 (full autonomy, no steering wheel required). Current state is “Supervised” (Level 2), with “Unsupervised” limited to pilot deployment — useful as a contrasting case with a known finite endpoint rather than an open-ended one.

Jurisdiction-indexed θ and τ_{floor} (§2.4), the clearest confirmation in any case. Identical software faces asynchronous, independent institutional gates: live as of mid-2026 in the US, Canada, Mexico, Australia, New Zealand, South Korea, the Netherlands, Lithuania, and China (limited); still blocked across most of the EU pending a UNECE regulatory framework submitted for approval in June 2026. China’s gate is not safety-regulatory at all — it is data-sovereignty law, forcing a domestic training pipeline before any approval process could proceed — a distinct sub-type of institutional mismatch from AlphaFold’s clinical/regulatory τ_{floor}, and evidence that τ_{floor} can be gated on data-governance grounds independent of demonstrated safety.

Real, multi-point adoption series, again showing threshold-gated acceleration (§2.2). Active FSD users: roughly 400,000 (2021), growing by roughly 100,000 per year through 2022–23, then by roughly 200,000 in 2024 (33.3% growth), reaching 1.1 million users and a 12.4% fleet take-rate by January 2026, and 1.28 million users at a 14% take-rate by March 2026 (51% year-over-year). The 2024 acceleration coincides exactly with FSD v12’s launch — the first end-to-end neural-network architecture — a third independent instance, after Go’s 2016 crossing and the drug-discovery field’s 2017 crossing, of adoption-rate acceleration tracking a real capability threshold rather than drifting. Segment heterogeneity again: take-rates above 50% among premium Model S/X owners versus 20–30% among mass-market Model 3/Y owners.
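The acceleration is easier to see as year-over-year growth rates. The sketch below rebuilds the series from the increments quoted above, treating each figure as an end-of-year count, which is an assumption since the text does not give exact dates before 2026.

```python
# Reconstructed from the quoted increments; end-of-year dating is an ASSUMPTION.
users = {2021: 400_000}
users[2022] = users[2021] + 100_000
users[2023] = users[2022] + 100_000
users[2024] = users[2023] + 200_000

growth = {year: users[year] / users[year - 1] - 1.0 for year in (2022, 2023, 2024)}
for year, rate in growth.items():
    print(f"{year}: {users[year]:,} users, {rate:.1%} growth")

# Cross-check: 1.28 million users at 51% year-over-year implies this March 2025 base.
implied_march_2025 = 1_280_000 / 1.51
print(f"implied March 2025 base: {implied_march_2025:,.0f}")
```

On this reading growth slows from 25% to 20% before jumping to 33.3% in 2024, and the implied March 2025 base of roughly 848,000 sits plausibly just above the reconstructed 800,000 for 2024.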

An honest negative result on Regime 4. Three concurrent NHTSA investigations escalated through mid-2026 — an Engineering Analysis covering 3.2 million vehicles over reduced-visibility crashes, a separate inquiry into 80 documented traffic-law violations, and an active audit query into crash-reporting compliance — yet the consumer adoption series shows no dip anywhere across this same window. Regulatory distrust and consumer adoption appear decoupled rather than coupled, which does not support the damped-oscillation hypothesis this case was originally proposed to test. This is presented as a genuine, mildly surprising finding rather than forced into the expected shape. The decoupling is also consistent with — though not, on its own, confirmation of — the multi-segment model’s prediction that a consumer population’s threshold H_i and a regulator’s threshold \theta _k can operate as independent parameter spaces (§2.2, §2.4) rather than a single coupled Floor. Reading the result this way is appealing precisely because it fits the model, which is a reason for caution as much as endorsement: a single decoupled case is consistent with several explanations beyond genuine segment-independence — regulatory action may simply not yet be salient enough to affect purchase decisions — and should not be overweighted as validation until a second, independent jurisdiction-gated case is available for comparison.

What this case validates, and what it cannot yet. Jurisdiction-indexed \theta/\tau _{floor} (the strongest case for this), a new gating sub-type (data sovereignty), and a third independent confirmation of threshold-gated acceleration. It has not, so far, produced the Regime 4/5 trace it was selected to test.

5.5 Cross-case synthesis and what remains untested

MechanismGoAlphaFoldFrontier ModelsFSD
Threshold gate (§2.2)✓ (clean)
α ≠ 0
Heterogeneous H_i (§2.2)age, skillenterprise/consumer, firm size, jurisdictionvehicle tier
Two-tier floor, τ split (§2.4)✓ (illustrative)partial✓ (new gate type)
Multi-actor gating, veto-max (§2.4)✓ (richest)✓ (single-actor-dominant)
β(G_t) decay (§2.3)suggestive, unresolvedtoo early
Multi-regime coexistence (§4)✓ (dated event)✓ (ongoing)
Regime 4/5 (overtrust)one data pointnone found
Gate oscillation (named, unclassified, §6.3)◇ identified
Measurement-artifact caveat (§2.5)✓ (origin case)clinical-clock analog✓ (2nd confirmation)

Three things are worth stating plainly rather than smoothing over, now that four cases are complete. No case has produced a clean Regime 4 or 5 trace — the closest candidate, chip-export oscillation, turns out on inspection to be a different phenomenon entirely (§6.3), which is itself a useful negative result rather than a failure to find one. \beta _\infty remains unresolved everywhere it has been asked. And AI Scientist, considered as a fifth case, was dropped from full treatment: it offered nothing past confirming Regime 0, which Go’s pre-2016 data already does inside a fully calibrated case; it remains worth a single mention as a live boundary test in Section 6 rather than its own subsection.

6. Discussion and Limitations

6.1 What this paper establishes

Sections 2–4 answer the original charge against the descriptive vocabulary directly. Section 2 opens by naming Ceiling, Floor, and Synchronization as primitives and Slope as synchronization’s measurable observable, then states its central result as a theorem with assumptions and consequences kept explicitly separate, rather than folding threshold-gating and positive coupling into the theorem’s “conclusions” despite both being assumed outright, as an earlier draft of this result did. G_t’s fixed point, stability conditions, and regime boundaries are derived from that stated structure rather than asserted, and Rogers’ diffusion curve falls out as a special case rather than standing beside the framework as an uncredited relative. Section 5 tests this against four domains simultaneously rather than one retrospective case, itself a departure from how the literature surveyed in Section 1 has typically proceeded: Ogburn’s, David’s, and Perez’s accounts are each read off completed or long-running historical episodes.

Collectively, these results establish this paper’s model as a falsifiable dynamical account of synchronization between technological capability and societal adoption, not merely a redescription of the phenomenon in new vocabulary. Five contributions follow: (i) explicit primitive definitions separating Ceiling, Floor, and Synchronization as the phenomenon under study from Slope as its measurable observable (§2.0); (ii) a formally specified dynamic with assumptions and consequences kept separate (§2.1–2.2); (iii) derived stability conditions and a regime taxonomy that follow from that dynamic rather than being independently posited (§2.1, §4); (iv) empirical evaluation against four concurrently-tracked domains using real longitudinal data rather than a single retrospective case (§5); and (v) a set of falsifiable open questions that delimit, rather than obscure, the model’s current scope — taken up next. One limitation applies across (ii)–(iv) together: no case in this paper involves formal statistical fitting of the recurrence itself. Data sparsity limits current tests to order-of-magnitude calibration against one or two anchor points, not maximum-likelihood or regression estimation; a proper econometric treatment of any single domain would require longer, denser panels than exist yet for any case examined here, and is left as a natural next empirical step rather than attempted in this paper. What follows are the places where this paper’s own model remains open, incomplete, or actively contradicted by the data gathered for it. Consistent with the position stated in Section 1, these are reported as findings, not concealed as gaps.

6.2 β∞: the paper’s central unresolved question

Whether the synchronization rate decays to zero (full eventual convergence, arbitrarily slow) or to a strictly positive asymptote (a permanent residual gap) determines whether G* is finite or the system converges only in the limit — and it is unresolved in every domain tested. Go’s data is suggestive of \beta _\infty > 0 but rests on two same-metric anchor points, not three; every open-Ceiling case (AlphaFold, frontier models, FSD) is too early in its own trajectory to show deceleration at all. This is not a minor loose end — it is the single largest determinant of whether the framework predicts eventual parity or permanent stratification between capability and adoption, and it cannot be resolved without longer time series than currently exist for any domain examined here.

A further question sits underneath this one, and this paper does not resolve it either: if \beta _\infty > 0 in a given domain, is the residual gap a technology limitation — some component of capability that AI genuinely cannot transfer, closable in principle by a better model — or a standing human preference — a gap society declines to close because the human element is the point, not closable by any amount of additional capability? Go’s case gestures at the second reading: Lee Sedol’s remark about personality and emotional content in play is a claim about what should remain human, not about what AI cannot yet do. The two readings imply opposite policy responses — investment in closing a technology gap versus institutional protection of a deliberately preserved one — and nothing in this paper’s data distinguishes them, since both predict the same observable signature, \beta _\infty > 0. Distinguishing them would likely require direct evidence of preference (stated willingness to adopt further versus stated refusal) rather than adoption-rate data alone, which is a data-collection problem future work would need to solve directly rather than infer from the kind of trajectory data this paper relies on throughout.

6.3 The missing regime: endogenous gate oscillation

Section 5.3 identified a phenomenon the six-regime table in Section 4 does not classify, and named it there: endogenous gate oscillation. This is sustained oscillation of the threshold \theta _t itself, driven by contest among directionally opposed actors (chip export policy toward China moving repeatedly under tension between the executive branch and Congress, 2022–2026). It is distinct from the exogenous threshold crossing of §2.2, where C_t crosses H_i once, driven by an external, monotonically improving capability signal, and does not revisit it. Regimes 4–5 were built to capture adoption oscillating around a fixed \theta; this is the threshold itself oscillating, a different mechanism entirely. It is worth connecting explicitly to Perez’s Turning Point: the period Perez places between a technological revolution’s Installation and Deployment phases, historically marked by financial crashes and institutional realignment, is structurally the same phenomenon at the scale of an entire technological revolution rather than a single policy question. Formally, this would require treating \theta _t^{(loc)} itself as a dynamical variable rather than a fixed parameter, plausibly \theta _{t+1} = \theta _t + \gamma _k for whichever actor k currently holds effective veto power, with \gamma _k signed by that actor’s directional preference (§2.4), but this paper does not develop that model.
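The candidate recurrence can be made concrete with a minimal sketch. It assumes the identity of the veto-holding actor is supplied from outside as a fixed schedule. The actor names, step sizes, and schedule are hypothetical illustrations, not values fit to the chip-export timeline of §5.3.

```python
from typing import Dict, List


def simulate_threshold(theta0: float,
                       gammas: Dict[str, float],
                       controller: List[str]) -> List[float]:
    """Iterate theta_{t+1} = theta_t + gamma_k, where k = controller[t].

    `controller` is an externally supplied schedule of veto holders,
    so the switching here is imposed on the model, not generated by it.
    """
    path = [theta0]
    for k in controller:
        path.append(path[-1] + gammas[k])
    return path


# Hypothetical signed step sizes: the security actor raises the gate,
# the vendor actor lowers it (sign convention assumed for illustration).
gammas = {"security": 0.25, "vendor": -0.25}

# Hypothetical tighten / loosen / tighten schedule, two periods each.
controller = ["security"] * 2 + ["vendor"] * 2 + ["security"] * 2

path = simulate_threshold(1.0, gammas, controller)
print(path)  # [1.0, 1.25, 1.5, 1.25, 1.0, 1.25, 1.5]
```

The threshold path reverses direction only where the schedule changes hands, which is why this version describes oscillation without explaining it.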

One refinement is worth flagging before leaving this as future work, since without it the oscillation is not actually endogenous: as stated, the switching of \gamma _k between actors is itself exogenous, imposed from outside the model rather than generated by it. A genuinely endogenous version would need each actor’s influence to be a function of the system’s own state variables. A natural candidate defines vendor or market influence as scaling with realized continuous adoption, W_{vendor}(F_t^{cont}), on the reasoning that market share plausibly translates into lobbying weight. Security or regulatory influence, by contrast, would scale with the capability level or with sudden widening of the gap, W_{sec}(C_t, G_t), on the reasoning that crossing a perceived danger threshold plausibly triggers state actors to override economic ones. The controlling actor at any time would then be whichever actor’s influence function is largest, and oscillation would emerge from each actor’s power base periodically undermining its own position: a security actor raising \theta suppresses F_t^{cont}, which weakens the vendor actor’s case for lowering it, until continued growth abroad strengthens that case enough to re-seize the veto. That is one plausible reading of the loosen/tighten/loosen pattern in §5.3’s chip-export timeline. This is offered as a sketch of what a genuinely endogenous formalization would require, not a specification: neither W_{vendor} nor W_{sec}, nor the switching rule between them, is fit to or tested against any data in this paper. It is flagged here as the most concrete direction for follow-on work, not resolved.
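The state-dependent switching rule described above might be sketched as follows. Everything specific here is an assumption made for illustration: the linear forms of the two influence functions, their coefficients, the step sizes, and the reading of the gap as G_t = C_t - F_t^{cont}. The sketch covers only the choice of controlling actor and the resulting one-step move in \theta. It does not specify how adoption responds to the gate, so it makes no claim about whether oscillation actually emerges.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class State:
    C: float       # capability ceiling C_t
    F_cont: float  # continuous-practice floor F_t^{cont}
    theta: float   # institutional gate threshold theta_t


def w_vendor(F_cont: float, a: float = 1.0) -> float:
    """Hypothetical form: vendor influence grows linearly with adoption."""
    return a * F_cont


def w_sec(C: float, G: float, b: float = 0.5, c: float = 0.5) -> float:
    """Hypothetical form: security influence grows with capability and gap."""
    return b * C + c * G


# Hypothetical signed steps: the vendor lowers the gate, security raises it.
GAMMA = {"vendor": -0.25, "security": 0.25}


def controlling_actor(s: State) -> str:
    """Return the actor whose influence function is largest in state s."""
    G = s.C - s.F_cont  # assumed gap definition
    return "vendor" if w_vendor(s.F_cont) > w_sec(s.C, G) else "security"


def step_theta(s: State) -> Tuple[str, float]:
    """One update theta_{t+1} = theta_t + gamma_k with k chosen from state."""
    k = controlling_actor(s)
    return k, s.theta + GAMMA[k]


# Low adoption, wide gap: security influence dominates and the gate rises.
print(step_theta(State(C=4.0, F_cont=1.0, theta=1.0)))  # ('security', 1.25)
# High adoption, narrow gap: vendor influence dominates and the gate falls.
print(step_theta(State(C=4.0, F_cont=3.5, theta=1.0)))  # ('vendor', 0.75)
```

Closing the loop would require a rule for how F_t^{cont} responds to \theta, which is the part of the formalization the text leaves open.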

6.4 The aggregation function Φ

Every multi-actor case examined (Mythos/Fable export controls, FSD’s China gate, chip export policy) was consistent with veto-max: the most restrictive sufficiently-influential actor sets the effective gate, not a weighted blend of preferences. No case in this paper’s data contradicts veto-max, but none clearly required it over a negotiated, weighted-average form either — the cases gathered were all unilateral-restriction episodes by design, not multilateral standard-setting processes, so this is a form selected on the evidence available rather than confirmed against a genuine alternative. A case involving convergent international standard-setting (coordinated regulatory frameworks rather than unilateral national action — an ISO standards process is one plausible example) would be a more direct test of whether Φ should sometimes take an averaging form instead, and none was available for this paper.
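The difference between the two candidate forms of Φ can be shown with a small sketch. It assumes that a higher preferred threshold means a more restrictive actor, and that "sufficiently influential" can be expressed as a minimum influence cutoff. The function names, the cutoff, and the example numbers are all hypothetical.

```python
from typing import Sequence


def phi_veto_max(thresholds: Sequence[float],
                 influence: Sequence[float],
                 min_influence: float) -> float:
    """Veto-max: the most restrictive actor above the influence cutoff
    sets the effective gate (higher threshold = more restrictive, assumed)."""
    eligible = [th for th, w in zip(thresholds, influence) if w >= min_influence]
    if not eligible:
        raise ValueError("no actor meets the influence cutoff")
    return max(eligible)


def phi_weighted_mean(thresholds: Sequence[float],
                      influence: Sequence[float]) -> float:
    """Negotiated alternative: influence-weighted average of preferences."""
    total = sum(influence)
    return sum(th * w for th, w in zip(thresholds, influence)) / total


# Three hypothetical actors: preferred gate levels and influence weights.
thresholds = [1.0, 2.0, 4.0]
influence = [0.5, 0.25, 0.25]

print(phi_veto_max(thresholds, influence, min_influence=0.2))  # 4.0
print(phi_weighted_mean(thresholds, influence))                # 2.0
```

The two forms diverge most when a restrictive actor holds modest influence, which is the configuration a multilateral standard-setting case would need to exhibit in order to discriminate between them.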

6.5 Loss function form

§3.2 motivates L_t = G_t^2 by analogy to Lyapunov-function stability analysis and linear-quadratic-regulator design, where a quadratic cost is the conventional choice precisely because it guarantees a unique, well-behaved minimum at equilibrium. That analogy motivates the choice; it does not uniquely force it. L_t = |G_t| or other monotonic transforms would preserve the same minimization argument — alignment of \min L_t with the recurrence’s own fixed point — without the quadratic penalty’s specific large-deviation behavior. Nothing in the four domain instantiations distinguishes between these forms, since none involved fitting L_t directly against data. This is worth stating plainly rather than letting the control-theoretic motivation stand as if it were empirically established: §3.2’s analogy explains why quadratic loss is a reasonable default, not why it is the only defensible one.
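The point can be stated compactly. The superscript labels below are introduced here for illustration and are not the paper's notation. Both candidate forms are strictly increasing in |G_t|, so they rank any two gap states identically and share a minimizer over any set of attainable gaps. They differ only in how steeply they penalize large deviations:

```latex
\[
L^{\mathrm{quad}}_t = G_t^2, \qquad L^{\mathrm{abs}}_t = |G_t|
\]
\[
|G_a| < |G_b| \iff G_a^2 < G_b^2
\]
\[
\frac{L^{\mathrm{quad}}(2G)}{L^{\mathrm{quad}}(G)} = 4,
\qquad
\frac{L^{\mathrm{abs}}(2G)}{L^{\mathrm{abs}}(G)} = 2
\qquad (G \neq 0)
\]
```

Doubling the gap quadruples the quadratic loss but only doubles the absolute loss, so the choice between them is a choice about how much weight large gaps should carry, which adoption data alone does not settle.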

6.6 A general methodological caution, now confirmed twice independently

Section 2.5’s warning against mistaking task-structural exhaustion or measurement-scale artifacts for genuine synchronization was motivated by Go’s endgame-narrowing problem, but Section 5.3 found the identical phenomenon in an entirely unrelated domain — AI benchmarks (MMLU, HumanEval, GSM8K) approaching their fixed upper scale and being read as capability convergence when the underlying capability gap, measured by non-saturating instruments like task-completion time horizon, continued widening. Two independent domains producing the same artifact is reasonable grounds to treat this as a general risk for any future application of this framework, not a Go-specific footnote: any bounded-scale proxy for C_t or F_t will tend to show artificial convergence as either variable approaches the proxy’s ceiling, independent of the true underlying dynamics.
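The artifact is easy to reproduce on synthetic data. The sketch below uses two made-up exponential series for C_t and F_t and a hypothetical saturating instrument of the form scale * x / (x + k). None of these are the paper's data or the actual scoring rule of any benchmark. The true gap widens in every period, while the gap measured on the bounded proxy peaks and then shrinks.

```python
def bounded_proxy(x: float, k: float = 10.0, scale: float = 100.0) -> float:
    """Hypothetical saturating instrument mapping x >= 0 onto [0, scale)."""
    return scale * x / (x + k)


periods = range(6)

# Synthetic series on an unbounded scale (illustrative growth rates only).
C = [10.0 * 2.0 ** t for t in periods]  # capability doubles each period
F = [10.0 * 1.5 ** t for t in periods]  # adoption grows 50% each period

true_gap = [c - f for c, f in zip(C, F)]
proxy_gap = [bounded_proxy(c) - bounded_proxy(f) for c, f in zip(C, F)]

for t in periods:
    print(f"t={t}  true gap={true_gap[t]:8.2f}  proxy gap={proxy_gap[t]:6.2f}")
```

In this toy run the proxy gap falls from period 3 onward even though the underlying gap keeps growing, because both series are being compressed against the instrument's upper bound.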

6.7 On C∞ and the constant-α assumption

As noted in §2.0, treating \alpha as locally constant is an approximation whose long-run validity depends on whether C_\infty is finite — and this is unfalsifiable given current knowledge. The one domain with an actual continuous, non-saturating capability series (frontier models’ task-horizon metric) shows \alpha accelerating in log-space over 2023–2025, which is, if anything, mild evidence against near-term deceleration rather than for it — but the series is short, and METR’s own reporting flags substantial uncertainty at the frontier. This remains a genuinely open question the paper does not resolve, only bounds.
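One simple way to check for acceleration in log-space is to compare successive log growth rates. The sketch below assumes equally spaced observations and reads \alpha as the per-period increment in log capability, which is an interpretive assumption rather than the definition in §2.0. The example series are synthetic and are not METR's reported values.

```python
import math
from typing import List, Sequence


def log_growth_rates(series: Sequence[float]) -> List[float]:
    """Per-period log growth, log(x_{t+1} / x_t), for equally spaced data."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]


def is_accelerating(series: Sequence[float]) -> bool:
    """True if every log growth rate exceeds the previous one."""
    rates = log_growth_rates(series)
    return all(r2 > r1 for r1, r2 in zip(rates, rates[1:]))


# Synthetic examples: constant doubling versus a shrinking doubling time.
steady = [1.0, 2.0, 4.0, 8.0]     # ratios 2, 2, 2
speeding = [1.0, 2.0, 8.0, 64.0]  # ratios 2, 4, 8

print(is_accelerating(steady))    # False
print(is_accelerating(speeding))  # True
```

With only a handful of observations such a check is weak evidence either way, which is consistent with the caution about the short series.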

6.8 AI Scientist as a live boundary case

One domain considered for full instantiation, autonomous AI-driven scientific research, was excluded from Section 5 because it offers nothing past confirming Regime 0 (pre-threshold), which Go’s pre-2016 data already establishes inside a fully calibrated case. It remains worth naming here as the clearest currently-observable test of §2.2’s threshold gate in progress: narrow sub-tasks (antibody design) have crossed threshold while general autonomous research has not, making it a natural subject for a future paper once — or if — that gate opens.

6.9 Summary

This paper substantially addresses the specific gap identified in the vocabulary it inherits: C, F, and S now generate testable claims rather than merely naming a phenomenon already well documented elsewhere in the technology-diffusion literature. What it does not close, and states as such rather than obscuring: whether the gap this framework describes is one that eventually resolves or one that stabilizes short of resolution, and if the latter, whether that residual reflects a technology limit or a standing human preference (§6.2); whether the actors contesting institutional gates settle into equilibrium or into the endogenous oscillation named but not formalized in §6.3; and whether the specific functional forms chosen here (quadratic loss, veto-max aggregation, gap-dependent rather than time-dependent decay) are the right ones or merely defensible ones given the data available. Each is left as a stated, falsifiable open question rather than a rhetorical gesture toward future work.

Consistent with the scope stated in Section 1, all of the above concerns dynamics — how the system evolves — not optimization or control — how a trajectory among these dynamics should be deliberately chosen. That question, and the non-convex, multi-stakeholder landscape it opens onto, is reserved for the planned companion paper, Gradual AGI as Optimization. This paper’s contribution is the Dynamics layer it depends on: a system whose behavior is precise enough to be wrong, tested against real and concurrent data rather than a single retrospective case, and honest about exactly where it currently is wrong or untested.

By formalizing synchronization as a falsifiable dynamical problem, rather than treating technological diffusion as a descriptive metaphor, this model provides a foundation upon which future work on optimization, governance, and AGI-inclusive societal design can be constructed — not by anticipating what that work will find, but by giving it a system precise enough to build on and specific enough to be wrong.

References

Working bibliography. Peer-reviewed and theoretical sources below are verified in full; news/industry/data sources are verified for outlet and approximate date, with items needing a final precision check flagged individually. A formatting and link-access-date pass is recommended before submission.

Theoretical and methodological literature (§1, §2.2)

Bresnahan, T.F. & Trajtenberg, M. (1995). “General Purpose Technologies: ‘Engines of Growth’?” Journal of Econometrics, 65(1), 83–108.

Originally circulated as Bresnahan & Trajtenberg (1989), NBER/Stanford working paper; please verify the published citation independently before submission.

David, P.A. (1990). “The Dynamo and the Computer: An Historical Perspective on the Modern Productivity Paradox.” American Economic Review, 80(2), 355–361.

Ogburn, W.F. (1922). Social Change with Respect to Culture and Original Nature. New York: B.W. Huebsch.

Perez, C. (2002). Technological Revolutions and Financial Capital: The Dynamics of Bubbles and Golden Ages. Cheltenham: Edward Elgar.

Rogers, E.M. (1962). Diffusion of Innovations. New York: Free Press.

Standard/well-established citation; not independently re-verified via search this session.

This series: Gradual AGI (§1)

W.H.L., GPT-5.5. (2026, June 29). “Gradual AGI as Synchronization for Transformative Adoption” (v1.2). Champaign Magazine. https://champaignmagazine.com/2026/06/29/gradual-agi-as-synchronization-for-transformative-adoption/

W.H.L., GPT-5.5. (2026). “Gradual AGI as Abundant Resources.” Champaign Magazine.

W.H.L., ChatGPT. (2026). “Gradual AGI as Epistemic Extension.” Champaign Magazine.

W.H.L., Claude Sonnet 4. (2025). “First Principles of AGI-Inclusive Humanity.” Champaign Magazine.

W.H.L., Gemini 2.5 Pro, GPT-4o. (2025). “Philosophical Framework for AGI-Inclusive Humanity.” Champaign Magazine.

Case 1: Go (§5.1)

Choi, J., Kang, S., Kim, J., & Kim, W. (forthcoming 2025). “How Does Artificial Intelligence Improve Human Decision-Making? Evidence from the AI-Powered Go Program.” Strategic Management Journal. Preprint: arXiv:2310.08704.

Professional Go Dataset (PGD) (2022). Move-coincidence rate by game phase, opening vs. non-opening. arXiv:2205.00254.

MIT Technology Review (2026, February). Coverage of the 2022 Korean Baduk League study on AI-move-match rates.

Original underlying league study not independently located; recommend locating primary source before submission.

News coverage, various outlets (2026, March 9). Lee Sedol / Enhans 10th-anniversary demonstration event, Seoul.

News coverage, various outlets (2026, April 29). Demis Hassabis “Google for Korea” visit; game with Shin Jin-seo, conversation with Lee Sedol.

News coverage, various outlets (2026, May 21). Cho Hun-hyun–Lee Chang-ho AI-paired exhibition match (KataGo).

News coverage, various outlets (2025, November). Korea Baduk Association’s request to Google DeepMind for a Shin Jin-seo vs. 2016 AlphaGo commemorative match.

Case 2: AlphaFold and AI-driven drug development (§5.2)

Jumper, J. et al. (2021). “Highly Accurate Protein Structure Prediction with AlphaFold.” Nature, 596, 583–589.

The Nobel Prize in Chemistry 2024 (Hassabis, Jumper, Baker). Nobel Prize announcement.

Insilico Medicine et al. (2025, June). Rentosertib Phase IIa results for idiopathic pulmonary fibrosis. Nature Medicine.

News coverage / company disclosures: Isomorphic Labs funding and partnership announcements (Novartis/Lilly agreements, January 2024; $600M raise, March–April 2025; $2.1B raise, May 2026).

News coverage / company disclosures: Generate:Biomedicines, GB-0895 Phase III initiation (December 2025).

News coverage / company disclosures: Chai Discovery, Chai-2 model performance and funding ($130M, December 2025).

Industry and regulatory analysis (2026). FDA/EMA AI drug-development guidance timeline and projected timing of the first fully AI-discovered drug approval.

Composite of multiple industry-analyst sources; recommend consolidating to primary regulatory documents where possible.

Tufts Center for the Study of Drug Development; FDAReview.org; RAND Corporation (2025). Conventional drug-development timeline and cost estimates.

Composite of multiple standard industry sources; recommend citing primary Tufts CSDD report directly if available.

News coverage (2026, June 19). John Jumper’s departure from Google DeepMind to Anthropic.

Case 3: Frontier model releases (§5.3)

METR (Model Evaluation and Threat Research) (2025; updated 2026). “Measuring AI Ability to Complete Long Tasks” (time-horizon metric) and subsequent analysis updates.

Reporting on frontier-benchmark saturation (MMLU, HumanEval, GSM8K) and SWE-bench Verified score progression.

Composite of multiple technical-press sources.

McKinsey & Company. The State of AI / Global AI Survey series (2020, 2024, Q1 2026 releases).

Multiple editions cited for different years’ topline figures; recommend citing each edition separately by exact publication date.

Eurostat (2024–2025). EU enterprise AI adoption statistics by member state and firm size.

OpenAI; third-party statistics aggregators (2023–2026). ChatGPT active-user and paid-subscriber figures over time.

Composite of company disclosures and third-party aggregation; recommend verifying each data point against primary company statements.

News coverage (2026, February). Anthropic–U.S. Department of Defense dispute over Claude usage restrictions; concurrent Claude consumer market-share shift.

News coverage (2026). Export-control restriction on Anthropic’s Mythos and Fable models for foreign nationals; subsequent OpenAI self-imposed release restrictions.

Stanford HAI, AI Index (2024–2025 editions). Foundation Model Transparency Index.

AI Incident Database (2024–2025). Recorded incident counts.

International AI Safety Report (2026).

News coverage (2022–2026). US semiconductor export-control policy toward China.

News coverage (2025, September). Mistral AI valuation and sovereign-AI positioning.

Reporting on Aleph Alpha’s data-residency architecture and European government client requirements.

European Union (2024). Artificial Intelligence Act, Article 53(2) (open-weight model provisions).

Case 4: Tesla FSD (§5.4)

NHTSA filings and news coverage (2024–2026). Investigations into reduced-visibility crashes, traffic-law violations, and crash-reporting compliance.

News coverage (2026). FSD jurisdictional availability status and EU/UNECE R-171 regulatory status, including the UNECE framework submitted June 2026.

News coverage (2026). China data-sovereignty requirements affecting FSD deployment.

Tesla, Inc. company disclosures and news coverage (2021–2026). FSD active-user and take-rate figures over time; FSD v12 launch (2024); company-reported safety-rate statistics.

The company-reported crash-rate figure is methodologically contested and should be flagged as such wherever cited in text.

Boundary case: AI Scientist (§6.8)

Lu, C. et al. (2024). “The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery.” Sakana AI.

Sakana AI (2025). “The AI Scientist-v2”: first fully AI-generated paper to pass peer review at a machine-learning workshop.

Tie, R. et al. (2025). Systematic evaluation finding no current framework completes full autonomous research cycles.

Author/venue detail approximate; recommend verifying full citation before submission.

Anthropic (2025). Evaluation of Claude Sonnet 4.5 on entry-level research-automation tasks.

Stanford HAI (2026). Reporting on frontier-model performance on graduate-level physics problems.

Zou, J. et al. (“Virtual Lab”). Stanford University. AI-designed antibody binders for COVID-19 variants outperforming prior human designs in experimental testing.


Byline

Publication date of current version: 07.04.2026
Version number: 1.4
Authors: W.H.L., Claude Sonnet 5
Peer reviews: GPT-5.5, Gemini 3.5, Grok 4, DeepSeek-V4, Qwen3.7-Plus


