The Geometry of Pricing: Turning Prediction Into a Decision

Nish · February 5, 2026

Pricing looks like a prediction problem until you try to ship it. The obvious question is “what price will this customer accept?” but that is usually the wrong level of abstraction. In practice, the useful question is “for each price I am allowed to offer, what is $P(\text{accept} \mid p)$, and which one should I choose under the business constraints?”

That framing turns pricing from a single-number prediction problem into a decision problem. The model estimates behaviour. A separate optimisation layer chooses the action. A constraint layer makes sure the action is commercially safe.

This post walks through that system in the context of broadband regrade pricing: customers moving between product tiers, prices being recommended in real time, and the messy reality that the mathematically clean answer still has to survive business rules, price floors, product ladders, locked offers, and evaluation uncertainty.

TL;DR

  • Do not train a model to predict the “right price” directly. Train it to estimate $P(\text{accept} \mid p)$ as a function of price.
  • Once you have an acceptance curve, price selection becomes an explicit expected-value optimisation problem.
  • The hard part is not just the model. It is the constraint system around it: floors, ceilings, product hierarchy rules, current-price anchors, overrides, and evaluation.
  • The most useful production design is modular: model for probabilities, optimiser for candidate selection, ladder module for coherence, and evaluation engine for counterfactual impact.
  • In production, operational constraints can compress the model’s freedom so much that the theoretical uplift becomes small and volatile. That is not a modelling failure. It is a system design reality.

The Setup

Imagine you run a broadband business. You have millions of customers on different plans. Some are in contract, some are out of contract, some are close to renewal, and some are likely to churn. Every day, a subset of them enter a regrade journey: they consider switching to another product in the portfolio.

At that point, the business has to answer a very practical question: what price should we show?

It is tempting to treat this as a regression task. Take customer features, product features, channel features, and historical pricing data, then predict a price. That sounds simple enough.

The problem is that pricing is not just prediction. It is prediction plus choice under constraints.

If you predict one price, you have not said what trade-off you are making between acceptance probability and revenue. You have not said whether you are willing to lose a little conversion for a higher monthly price. You have not said how to handle a faster product being accidentally cheaper than a slower product. You have not said what happens if the customer’s current product is priced above what they already pay.

Those are not modelling details. They are the actual decision.

Why Predicting The Price Directly Fails

The most natural first instinct is to train a regression model on historical accepted prices.

customer, product, channel, context -> accepted_price

This fails for a subtle reason: the model has no incentive structure.

It learns the average historical price associated with acceptance. That price is not necessarily revenue-maximising. It is not necessarily conservative. It is not necessarily aligned with the business’s current risk appetite. It is just the average result of past decisions, with all of their biases, rules, overrides, and historical quirks baked in.

You can try to patch this after the fact. Cap the output. Floor it. Add a margin. Nudge certain segments up or down. But now the model is optimised for one objective and the decision layer is bending it toward another. Over time, the corrections stack up, interactions become hard to reason about, and nobody can clearly explain why a customer received a particular price.

This is a sign that the framing is wrong.

The Better Question

Instead of asking:

What price will this customer accept?

we ask:

For every price we could offer, what is $P(\text{accept} \mid p)$ for this customer?

This gives us an acceptance curve rather than a single price.

For a given customer and product, the model estimates:

\[P(\text{accept} \mid \text{customer}, \text{product}, \text{channel}, p)\]

Once you have that curve, the pricing decision becomes explicit. For each candidate price, compute expected value:

\[\operatorname{EV}(p) = P(\text{accept} \mid \text{customer}, \text{product}, \text{channel}, p) \cdot p\]

Then choose the candidate with the highest expected value, subject to the constraints the business cares about.

That is the key shift. The model does not decide the price directly. The model estimates the probability of acceptance at each possible price. The optimiser decides which price to serve.
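As a minimal sketch of that calculation, the snippet below scores a handful of candidate prices against a toy acceptance curve. The prices and probabilities are made-up illustrative numbers, not outputs of the real model.

```python
# Hypothetical candidate prices (GBP per month) and acceptance probabilities.
# The probabilities are illustrative only and fall as price rises.
candidates = [30, 32, 34, 36, 38, 40]
p_accept = [0.60, 0.58, 0.55, 0.50, 0.40, 0.28]

# Expected value of each candidate: P(accept | p) * p.
expected_value = [prob * price for prob, price in zip(p_accept, candidates)]

# The optimiser, not the model, picks the price with the highest expected value.
best_index = max(range(len(candidates)), key=lambda i: expected_value[i])
best_price = candidates[best_index]
```

In this toy curve the best candidate sits in the middle of the grid: the cheapest price converts best but earns less per acceptance, and the most expensive price earns more per acceptance but rarely converts.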

Figure: Expected value reframes pricing as a decision over a candidate grid: acceptance probability falls as price rises, revenue per accepted offer rises with price, and their product peaks at an interior optimum, the chosen candidate price.

Building The Acceptance Model

The core model is an XGBoost binary classifier. Each training row represents a historical recommendation event: a customer was shown a product at a specific price, and we observe whether they accepted it.

The target is simple:

accepted_offer = 1 if the customer accepted, else 0

The important feature engineering choice is how price enters the model.

We do not feed raw price directly. Raw price is too product-specific. A price of £35 might be cheap for one product and expensive for another. Instead, we calculate a relative pricing signal:

discount = offered_price - average_historical_price_for_product

The average historical price is a rolling weighted mean of recently accepted prices for the same product and channel. It acts as a market anchor: the typical price customers have recently seen and accepted for that product.

The discount feature then answers a more useful question:

How far away from normal is this offer?

If the discount is negative, the offer is cheaper than the recent market anchor. If it is positive, the offer is more expensive.
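A rough sketch of the anchor and the discount feature, under stated assumptions: the post only says the anchor is a rolling weighted mean, so the exponential recency weighting and the `decay` value here are illustrative choices, and `market_anchor` is a hypothetical helper name.

```python
def market_anchor(accepted_prices, decay=0.9):
    """Recency-weighted mean of accepted prices, ordered oldest to newest.

    Assumption: exponential decay weights. The real weighting scheme is not
    specified in the text.
    """
    n = len(accepted_prices)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * p for w, p in zip(weights, accepted_prices)) / sum(weights)

# Recently accepted prices for one product and channel (made-up values).
recent_accepted = [34.0, 35.0, 36.0, 35.0]
anchor = market_anchor(recent_accepted)

# discount = offered_price - anchor: negative means cheaper than the anchor.
offered_price = 33.0
discount = offered_price - anchor
```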

The Monotonic Constraint

There is one structural rule we want the model to respect:

As price increases relative to the market anchor, $P(\text{accept})$ should not increase.

In XGBoost, this can be expressed as a monotonic constraint on the discount feature. The model is allowed to learn nonlinear effects, interactions, and segment-specific behaviour, but it is not allowed to learn that making a product more expensive makes customers more likely to accept it.

This is more than regularisation. It is an economic invariant.

Without this constraint, the optimiser can become unstable. If the model has local bumps where a higher price appears to increase acceptance probability, the expected-value calculation may chase noise. A monotonic acceptance curve makes the downstream optimisation much better behaved.
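One way to express the constraint is XGBoost's `monotone_constraints` parameter, which accepts a string with one sign per feature in column order (`-1` for non-increasing, `0` for unconstrained). The sketch below only builds that parameter. The feature names are a hypothetical three-column subset, and `monotone_constraint_string` is an illustrative helper, not part of XGBoost.

```python
def monotone_constraint_string(feature_names, decreasing=("discount",)):
    """Build a constraint string such as "(-1,0,0)" in feature column order."""
    signs = [-1 if name in decreasing else 0 for name in feature_names]
    return "(" + ",".join(str(sign) for sign in signs) + ")"

# Hypothetical feature subset; the real model uses around 200 features.
feature_names = ["discount", "tenure_months", "in_contract"]

# Parameters that would be passed to the XGBoost trainer or classifier.
params = {
    "objective": "binary:logistic",
    "monotone_constraints": monotone_constraint_string(feature_names),
}
```

Only the discount column is constrained. Every other feature stays free to interact with price however the data suggests.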

The full model used around 200 features, including:

  • Pricing signals such as discount, ARPU deltas, and historical discount behaviour.
  • Customer lifecycle signals such as tenure, contract status, and current product holdings.
  • Geographic market signals such as local churn rates and competitive density.
  • Behavioural signals such as call history, complaints, and sentiment.
  • Channel and product encodings.

The model’s job is not to understand every business rule. Its job is narrower: estimate acceptance probability for a customer, product, channel, and candidate price.

From Probabilities To Prices

With an acceptance model in place, the decision layer can choose a price.

For each customer and product combination, the system evaluates a candidate grid:

  1. Start from the average historical price for that product and channel.
  2. Generate 41 candidate prices in £1 increments from baseline - £20 to baseline + £20.
  3. Remove candidates outside the product’s commercial floor and ceiling.
  4. Score every remaining candidate with one batched predict_proba call.
  5. Compute expected value for each candidate.
  6. Select the candidate with the highest score.

The scoring formula is:

\[\operatorname{score}(p) = P(\text{accept} \mid p) \cdot p\]

In practice, we can add a small weighting term:

\[\operatorname{score}(p) = P(\text{accept} \mid p) \cdot p \cdot \operatorname{weight}(d)\]

Here $d$ is the candidate's discount relative to the market anchor. This weighting term acts as a configurable risk preference. More on that later.

The important engineering detail is that this is vectorised. We repeat the customer’s features 41 times, append the candidate-specific discount value, and make one batch inference call. There is no iterative search and no per-price loop through the model.

That makes the pricing path fast enough for real-time recommendation, while still letting the decision layer inspect the full local price curve.
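The grid steps above can be sketched as follows. `predict_accept_batch` is a hypothetical stand-in for the single batched `predict_proba` call, using a toy logistic curve rather than the trained model, and the anchor, floor, and ceiling values are made up.

```python
import math

def candidate_grid(anchor, floor, ceiling, half_width=20, step=1):
    """Candidates from anchor - 20 to anchor + 20 in 1-unit steps, kept within bounds."""
    grid = [anchor + k * step for k in range(-half_width, half_width + 1)]
    return [p for p in grid if floor <= p <= ceiling]

def predict_accept_batch(discounts):
    """Hypothetical stand-in for one batched model call over all candidates."""
    return [1.0 / (1.0 + math.exp(0.3 * (d - 4.0))) for d in discounts]

def choose_price(anchor, floor, ceiling):
    prices = candidate_grid(anchor, floor, ceiling)
    discounts = [p - anchor for p in prices]      # candidate-specific feature
    probs = predict_accept_batch(discounts)       # one call, all candidates
    scores = [prob * p for prob, p in zip(probs, prices)]
    best = max(range(len(prices)), key=scores.__getitem__)
    return prices[best], probs[best], scores[best]

price, prob, score = choose_price(anchor=35, floor=28, ceiling=45)
```

In the real system the customer's feature row would be repeated once per candidate and only the discount column would vary. The sketch collapses that to a list of discounts to stay self-contained.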

The Product Ladder Problem

If the system priced one product at a time, the grid search would be enough. But regrade journeys usually show a menu of products at once.

Broadband products sit in a hierarchy, but the hierarchy is not a purely technical fact. It is partly a business decision.

One business might order products mostly by speed: a slower product, then a mid-tier product, then a faster product, then an ultrafast product. Another might treat features as the dominant axis: bundled services, router quality, support level, contract flexibility, or channel-specific offers. In practice, the ordering may be a matrix of speed and features rather than a single clean ladder.

That makes the constraint problem harder than it first looks. Before you can enforce a product ladder, you have to define what the ladder means. In other domains, the ordering axis might be storage, seats, coverage, service level, risk tier, or something else entirely.

Once that hierarchy is defined, it creates commercial rules such as:

A faster product must never be cheaper than a slower product.

The model does not know this rule. It optimises each customer-product pair independently. That means the raw optimal prices might look like this:

| Tier | Product | Raw model price |
|------|---------|-----------------|
| 1    | Basic   | £32             |
| 2    | Mid     | £29             |
| 3    | Fast    | £41             |

The tier 2 price is lower than tier 1. That might be explainable from the model’s local expected-value surface, but it is incoherent as a customer-facing product ladder.

There is also a second hard rule:

The recommended price at the customer’s current tier must not exceed what they currently pay.

If someone is already on a product, you cannot frame “stay where you are but pay more” as a regrade recommendation. That is a price rise wearing a trench coat.

Each product also has its own floor and ceiling, and those bounds can conflict once prices have to be monotonic across the ladder. Tightening one tier can cascade into adjacent tiers.

This is where a deterministic constraint layer becomes necessary.

Isotonic Regression And Cumulative Constraints

The ladder module takes the model’s raw per-product optimal prices and makes them globally coherent.

It works in stages.

Stage 1: Smooth The Ladder

First, apply isotonic regression to find the closest non-decreasing sequence to the raw model prices.

In plain English: change the prices as little as possible while removing hierarchy violations.

For example:

raw prices:       £32, £29, £41, £44, £43
smoothed prices:  £30.5, £30.5, £41, £43.5, £43.5

The exact output depends on weights and implementation details, but the principle is the same. Isotonic regression gives a minimal-deviation monotonic sequence. It is a mathematically clean nudge rather than a hand-written patch.
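For the unweighted case, the smoothing step can be sketched with the pool-adjacent-violators algorithm, the standard way to compute an isotonic fit. This is a minimal pure-Python version for illustration. In practice a library implementation such as scikit-learn's `IsotonicRegression` would be the more usual choice, and the post does not say which one is used.

```python
def isotonic_non_decreasing(values, weights=None):
    """Closest non-decreasing sequence in (weighted) least squares, via PAVA."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, item count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the newest block breaks the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, w_b, n_b = blocks.pop()
            mean_a, w_a, n_a = blocks.pop()
            total = w_a + w_b
            pooled = (mean_a * w_a + mean_b * w_b) / total
            blocks.append([pooled, total, n_a + n_b])
    smoothed = []
    for mean, _, count in blocks:
        smoothed.extend([mean] * count)
    return smoothed

raw_prices = [32, 29, 41, 44, 43]
smoothed_prices = isotonic_non_decreasing(raw_prices)
```

Each violating pair is replaced by its average, which is why £32 and £29 both become £30.5 while the £41 tier is left alone.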

Stage 2: Anchor The Current Tier

Next, tighten the ceiling at the customer’s current tier to their current price.

If the customer currently pays £38 for tier 2, then tier 2’s ceiling becomes at most £38, even if the product’s general commercial ceiling is higher.

This prevents the system from recommending an increase at the current product level.

Stage 3: Propagate Bounds Across The Ladder

Local floors and ceilings are not enough. If tier 3 has a floor of £40, then tiers above it must also be at least £40 if the ladder is non-decreasing. If tier 2 has a ceiling of £35, then tiers below it must not exceed £35.

So the system converts local bounds into cumulative bounds:

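\[\operatorname{cumulative\_floor}_i = \max(f_0, f_1, \ldots, f_i)\]
\[\operatorname{cumulative\_ceiling}_i = \min(c_i, c_{i+1}, \ldots, c_n)\]

Here $f_i$ and $c_i$ are the local floor and ceiling of tier $i$, with tiers indexed from $0$ at the bottom of the ladder to $n$ at the top.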

If there are required price gaps between tiers, those gaps are included in the cumulative calculation too.

The final sequence is clipped into those cumulative bounds and checked for monotonicity again.
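A minimal sketch of the cumulative bounds and the clipping step, assuming tiers are ordered from lowest to highest and ignoring required price gaps. The floors, ceilings, and prices are made-up numbers, with the second tier's ceiling already tightened to a hypothetical current price of £30.

```python
from itertools import accumulate

def cumulative_bounds(floors, ceilings):
    """Running max of floors from the bottom, running min of ceilings from the top."""
    cum_floors = list(accumulate(floors, max))
    cum_ceilings = list(accumulate(reversed(ceilings), min))[::-1]
    return cum_floors, cum_ceilings

def clip_ladder(prices, floors, ceilings):
    """Clip a non-decreasing price ladder into its cumulative bounds."""
    cum_floors, cum_ceilings = cumulative_bounds(floors, ceilings)
    clipped = []
    for price, low, high in zip(prices, cum_floors, cum_ceilings):
        if low > high:
            # Infeasible tier: a real system needs an explicit fallback policy.
            raise ValueError("floor exceeds ceiling after propagation")
        clipped.append(min(max(price, low), high))
    return clipped

smoothed = [30.5, 30.5, 41, 43.5, 43.5]
floors = [25, 28, 42, 38, 42]
ceilings = [40, 30, 50, 55, 60]   # tier index 1 anchored at the current price
final_prices = clip_ladder(smoothed, floors, ceilings)
```

Because both cumulative bounds are themselves non-decreasing, clipping a non-decreasing ladder into them keeps it non-decreasing. The second monotonicity check then acts as a safety net rather than a repair step.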

Figure idea: Show a five-product ladder before and after constraint enforcement. On the left, raw model prices with two hierarchy violations highlighted. On the right, the adjusted ladder inside shaded floor/ceiling bands, with the current tier anchored to the customer’s existing price.

This module is deliberately separate from the model. The model and optimiser produce row-level prices. The ladder module owns cross-product coherence.

In edge cases, the constraints cannot all be satisfied at once. When that happens, the system prioritises commercial safety over marginal expected-value gains. That is the right trade-off. It is better to leave a little revenue on the table than serve an incoherent or rule-violating ladder.

The Weighting Dial

The basic expected-value score is neutral:

\[\operatorname{score}(p) = P(\text{accept} \mid p) \cdot p\]

But pricing is rarely neutral in practice. Sometimes the business wants to be more conservative, especially for customers at risk of churn. Sometimes it wants to be more aggressive, especially where retention risk is low.

The system exposes this through a single weighting value:

\[\operatorname{score}(p) = P(\text{accept} \mid p) \cdot p \cdot (1 + w d)\]

Here $d$ is the candidate's discount relative to the market anchor and $w$ is the weighting value. When $w = 0$, the system maximises raw expected value. When $w$ is negative, above-anchor prices are penalised and lower prices become relatively more attractive. When $w$ is positive, the system leans toward higher prices.
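To make the dial concrete, here is a small sketch that sweeps the weighting value over a toy acceptance curve. The curve, the price range, and the scale of the weighting values are all assumptions for illustration: the discount is taken in pounds, so the values are kept small enough that the multiplier stays positive.

```python
import math

def accept_prob(discount):
    """Hypothetical monotone acceptance curve, standing in for the trained model."""
    return 1.0 / (1.0 + math.exp(0.3 * (discount - 4.0)))

def weighted_score(price, anchor, w):
    d = price - anchor                      # discount relative to the anchor
    return accept_prob(d) * price * (1.0 + w * d)

def best_price(prices, anchor, w):
    return max(prices, key=lambda p: weighted_score(p, anchor, w))

anchor = 35
prices = list(range(28, 46))

# Sweep from conservative (negative) to aggressive (positive) weighting.
sweep = {w: best_price(prices, anchor, w) for w in (-0.02, -0.01, 0.0, 0.01)}
```

Reading `sweep` from left to right shows the chosen price moving upward as the weighting value increases, which is the behaviour the dial is meant to expose.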

This is useful because the risk preference is explicit. It is not hidden inside the model.

To tune it, we sweep the weighting value across a range for each customer segment. In this case, segments were defined by contract lifecycle stage and product type. For each value, we measure:

  • Mean price adjustment relative to the unweighted optimum.
  • Mean predicted acceptance probability.

The result is a trade-off curve. As the weighting becomes more conservative, prices fall and predicted acceptance rises. But the relationship is not linear. There is usually a point where acceptance improves meaningfully with little revenue sacrifice, followed by a region where the business gives away margin for almost no additional acceptance.

Different segments have different curves. An out-of-contract customer in a competitive area may have a steep trade-off. A recently renewed customer in a less competitive area may be much flatter.

Figure idea: Plot weighting value on the x-axis from -0.5 to 0. Use one y-axis for mean price adjustment and another for mean acceptance probability. Add one curve per segment so the reader can see which groups are more sensitive to conservative pricing.

This is one of the cleanest parts of the design. The model estimates the acceptance curve. The weighting dial expresses business risk appetite. The two are connected, but not tangled together.

Evaluating Counterfactual Prices

Offline evaluation is hard because the historical data only tells us what happened at the prices that were actually offered. It does not directly tell us what would have happened if we had offered a different price.

To evaluate a new pricing strategy, we need counterfactual acceptance probabilities.

The framework uses a simple monotonic behavioural assumption:

  • If a customer accepted price $A$, they would also have accepted any lower price $M < A$.
  • If a customer rejected price $R$, they would also have rejected any higher price $M > R$.

That gives us three regions for a model-recommended price $M$.

Region 1: Definitely Accepted

If the model recommends a price at or below a price the customer actually accepted, assign acceptance probability $1.0$.

\[M \le A \quad \Rightarrow \quad P(\text{accept } M) = 1.0\]

We know the customer accepted a higher or equal price, so the lower model price would have been accepted too. The downside is that we may have left money on the table.

Region 2: Definitely Rejected

If the model recommends a price at or above a price the customer actually rejected, assign acceptance probability $0.0$.

\[M \ge R \quad \Rightarrow \quad P(\text{accept } M) = 0.0\]

We know the customer rejected a lower or equal price, so the higher model price would not have worked.

Region 3: The Risk Region

The ambiguous region is between known accepted and rejected prices. Here, we estimate acceptance probability from historical behaviour rather than using the model’s own prediction.

If the customer accepted at price $A$ and the model proposes a higher price $M$:

\[P(\text{accept } M) = \frac{P(X \ge M)}{P(X \ge A)}\]

If the customer rejected at price $R$ and the model proposes a lower price $M$:

\[P(\text{accept } M) = 1 - \frac{1 - P(X \ge M)}{1 - P(X \ge R)}\]

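Here $X$ can be read as the highest price a customer would accept, so $P(X \ge p)$ is the historical acceptance rate at price $p$ among comparable customers. The probabilities come from nearest-match lookups in historical grouped data, using dimensions such as channel, product, and ARPU band.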

The important point is that we do not evaluate the model using its own predictions. That would be circular. Instead, the evaluation engine uses rule-first logic where outcomes are known and historical nearest-neighbour grounding where they are not.
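The three regions can be sketched as one function. `survival` is a hypothetical callable returning the historical $P(X \ge p)$ for a price. Here it is a toy linear curve rather than a nearest-match lookup, and the function assumes the lookup never returns a value that would make a denominator zero.

```python
def counterfactual_accept_prob(model_price, observed_price, accepted, survival):
    """Counterfactual P(accept model_price) given one observed offer outcome."""
    if accepted:
        if model_price <= observed_price:
            return 1.0    # Region 1: accepted a higher or equal price
        # Risk region above an accepted price: P(X >= M) / P(X >= A).
        return survival(model_price) / survival(observed_price)
    if model_price >= observed_price:
        return 0.0        # Region 2: rejected a lower or equal price
    # Risk region below a rejected price: 1 - (1 - P(X >= M)) / (1 - P(X >= R)).
    return 1.0 - (1.0 - survival(model_price)) / (1.0 - survival(observed_price))

def toy_survival(price):
    """Made-up P(X >= price): 1.0 at £20 falling linearly to 0.0 at £60."""
    return min(1.0, max(0.0, (60.0 - price) / 40.0))

above_accepted = counterfactual_accept_prob(40, 36, True, toy_survival)
below_rejected = counterfactual_accept_prob(36, 40, False, toy_survival)
```

With the toy curve, raising an accepted £36 offer to £40 keeps roughly five in six acceptances, while lowering a rejected £40 offer to £36 recovers roughly one in five rejections.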

Figure idea: Draw a number line for one session. Mark the accepted offer $A$, the rejected offer $R$, and a candidate model price $M$. Shade $M \le A$ green, $A < M < R$ amber, and $M \ge R$ red. Label the probability rule for each region.

This is intentionally conservative. We only claim certainty where monotonic behaviour gives us certainty. Everywhere else, we estimate from observed historical behaviour and avoid extrapolating beyond the data.

Where The Gains Come From

An average uplift number is not enough to understand a pricing system. You need to know what kind of uplift it is.

The three-region evaluation framework lets us decompose outcomes into four categories:

| Category | Meaning |
|----------|---------|
| Accepted + Higher Price | We charged more and the customer still accepted. |
| Accepted + Lower Price | We underpriced and left revenue on the table. |
| Rejected + Higher Price | We overpriced and the customer walked away. |
| Rejected + Lower Price | We were conservative, but the customer was not going to accept anyway. |

For each category, gain is computed as:

\[\operatorname{gain} = \sum_i m_i \cdot \hat{q}_i - \sum_i b_i \cdot y_i\]

Here $m_i$ is the model price, $\hat{q}_i$ is the counterfactual acceptance probability, $b_i$ is the baseline price, and $y_i$ is the observed transaction indicator.
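A small sketch of the per-category gain, using four made-up sessions that are already labelled with their category. The tuple layout and the numbers are illustrative assumptions.

```python
from collections import defaultdict

# (category, model price m, counterfactual prob q, baseline price b, observed y)
sessions = [
    ("Accepted + Higher Price", 40.0, 0.8, 36.0, 1),
    ("Accepted + Lower Price", 34.0, 1.0, 36.0, 1),
    ("Rejected + Higher Price", 42.0, 0.0, 40.0, 0),
    ("Rejected + Lower Price", 36.0, 0.2, 40.0, 0),
]

def gain_by_category(rows):
    """Sum of m * q minus sum of b * y, accumulated within each category."""
    gains = defaultdict(float)
    for category, m, q, b, y in rows:
        gains[category] += m * q - b * y
    return dict(gains)

gains = gain_by_category(sessions)
total_gain = sum(gains.values())
```

In this toy data the only positive contribution comes from the lower offer to a previously rejecting customer, which is the kind of story an average uplift number would hide.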

This decomposition matters because two strategies can have the same average uplift for very different reasons.

One strategy might win by finding customers who tolerate higher prices. Another might win by converting previously lost customers with slightly lower offers. Those are different commercial stories, and they imply different next steps.

Tuning The Acceptance Model

The model tuning process has one important trap: sessions contain multiple rows.

A single regrade session may show several products at several prices. If one row from a session lands in training and another lands in validation, the validation score is contaminated. The model has effectively seen part of the same customer decision context during training.

The fix is group-aware cross-validation, using session ID as the group. Every row from a session stays together.

session_123 rows -> training fold only
session_456 rows -> validation fold only

No split is allowed to break a session apart.
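The grouping rule can be sketched in a few lines of plain Python. This is an illustrative stand-in: in practice scikit-learn's `GroupKFold` provides the same guarantee, and the session IDs below are made up.

```python
def group_k_fold(session_ids, n_splits=3):
    """Yield (train_idx, valid_idx) pairs that never split a session."""
    unique_sessions = sorted(set(session_ids))
    # Assign whole sessions, not rows, to folds.
    fold_of = {s: i % n_splits for i, s in enumerate(unique_sessions)}
    for fold in range(n_splits):
        valid = [i for i, s in enumerate(session_ids) if fold_of[s] == fold]
        train = [i for i, s in enumerate(session_ids) if fold_of[s] != fold]
        yield train, valid

# One entry per training row; several rows can share a session.
session_ids = ["s1", "s1", "s2", "s3", "s3", "s3", "s4"]
splits = list(group_k_fold(session_ids, n_splits=3))
```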

For Bayesian optimisation with Hyperopt, we also separate early stopping from cross-validation:

  1. Run early stopping once on a fixed held-out evaluation split.
  2. Use that to choose n_estimators.
  3. Fix n_estimators.
  4. Run group-aware cross-validation without fold-specific early stopping.

That keeps trials comparable. Otherwise, each fold can stop at a different number of trees, adding noise to the hyperparameter search.

This is a small design choice, but it avoids a surprising amount of evaluation instability.

Where The Model Lives

The ML model is only one component in a larger pricing machine.

In simplified form, the production path looks like this:

flowchart TD
    A["Customer enters<br/>regrade journey"] --> B["Prescriptive<br/>pricing"]
    B --> C["Business rule<br/>filtering"]
    C --> D["Price<br/>overrides"]
    D --> E["Tier<br/>sorting"]
    E --> F["ML pricing for<br/>eligible tier(s)"]
    F --> G{"Locked offer<br/>exists?"}
    G -->|Yes| H["Override with<br/>locked offer"]
    G -->|No| I["Final price<br/>served"]
    H --> I
    class F focus;
    class C,D,E guardrail;
    class G decision;
    class H guardrail;
    class I terminal;
The ML model is one decision step inside a larger pricing path, with rule filtering before it and locked-offer overrides after it.

That placement matters. The model does not see the whole product catalogue. It only sees products that have already survived filtering. Its recommendation can still be overridden by locked offers.

The product floor the model sees is also not a single simple number. It can be derived from multiple configuration layers:

  • Target price settings for the product.
  • Price bounds as a fallback.
  • Regrade policies that can raise the floor.

The final floor is the maximum of these sources. That is deliberate defence in depth. The business defines a safe window. The model optimises inside it. The ladder module ensures cross-product coherence. Locked offers provide a final manual override.
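As a rough illustration of "maximum of the sources", the helper below combines whichever floors are configured. The function name, argument names, and the treatment of missing sources are assumptions, not the production configuration schema.

```python
def effective_floor(target_floor=None, bounds_floor=None, policy_floors=()):
    """Return the highest floor among the configured sources.

    Assumption: a source that is not configured is passed as None and ignored.
    """
    sources = [f for f in (target_floor, bounds_floor, *policy_floors) if f is not None]
    if not sources:
        raise ValueError("no floor configured for this product")
    return max(sources)

# Made-up values: a regrade policy raises the floor above the other sources.
floor = effective_floor(target_floor=30.0, bounds_floor=27.0, policy_floors=[32.0])
```

Taking the maximum means any single layer can tighten the window but none can loosen it, which is what makes the layering a defence in depth.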

The probabilistic core is surrounded by deterministic safety layers.

Measuring The Full Revenue Picture

Session-level gain is useful, but the business ultimately cares about total revenue impact across the base.

That means accounting for at least three effects.

| Effect | What it captures |
|--------|------------------|
| ARPU delta investment | Revenue given up when a customer regrades to a lower monthly price. |
| Churn customer value | Lifetime revenue lost when a customer churns. |
| Churn save value | Value of customers retained by a successful regrade, net of processing cost. |

The weighting parameter trades off between these effects.

An aggressive model extracts more revenue from customers who accept, but may save fewer customers at risk of churn. A conservative model may save more customers, but with a larger ARPU investment.

This is why the decision framing matters. The business is not really optimising model AUC or log loss. It is optimising a portfolio-level financial trade-off.

The Full System

Putting the pieces together, the system is:

flowchart TD
    A["Historical regrade<br/>sessions"] --> B["Feature engineering<br/>and economic anchors"]
    B --> C["XGBoost<br/>acceptance model"]
    C --> D["Candidate grid +<br/>expected-value maximisation"]
    D --> E["Ladder enforcement<br/>with isotonic smoothing<br/>and constraints"]
    E --> F["Production-safe<br/>prices"]
    class C,D focus;
    class E guardrail;
    class F terminal;
The system composes a probability model, an optimisation layer, and deterministic constraint enforcement into production-safe prices.

Each layer has a specific job.

The model estimates acceptance probabilities. The candidate grid converts those probabilities into row-level prices. The ladder module makes the full product menu coherent. The evaluation engine estimates what would have happened under alternative prices.

This modularity is what makes the system explainable. If a price changes, you can usually identify whether it came from the acceptance curve, the risk weighting, a floor, a ceiling, a ladder adjustment, or a locked offer.

That is much easier to reason about than a single model that tries to learn everything at once.

What I Find Elegant About This

The elegant part is not the model architecture. It is the reframing.

“Predict the best price” gives you a regression model with no explicit notion of trade-off.

“Predict acceptance probability as a function of price” gives you a behavioural model that supports downstream decision-making.

The monotonic constraint is also satisfying because it encodes one simple economic assumption: making the offer more expensive should not make it more attractive. That one structural assumption makes the expected-value surface much easier to optimise.

The ladder enforcement is another example of clean separation. Instead of bolting rules onto the side of the model, the system defines a deterministic module whose only job is to make the price vector commercially coherent.

And the counterfactual evaluation approach has the right kind of humility. It uses certainty where the data gives certainty, estimates only in the ambiguous region, and avoids pretending the model can validate itself.

The Gap Between Theory And Reality

In isolation, this system is clean. The acceptance model is calibrated around a useful economic signal. The expected-value optimisation is direct. The ladder enforcement is principled. The evaluation framework is conservative.

But production environments have opinions.

The ML model may only get to price certain tiers. Locked offers may override the exact customers where personalisation would matter most. Floor prices may ratchet upward because of commercial policies outside the model. The candidate grid may be narrow before the model ever sees it.

In practice, these constraints can compress the model’s effective pricing freedom.

In one representative weekly readout, the ML-priced strategy had a higher regrade rate than the control, roughly 27% versus 25%, but also a slightly larger ARPU delta, roughly -£1.64 versus -£1.02. Net revenue impact was broadly comparable across strategies, around -£107k to -£110k.

That is not failure, but it is humbling.

The model may want to price at the peak of the expected-value curve. After floors, ceilings, anchoring, locked offers, and tier restrictions, the served price can end up close to where a simpler heuristic would have landed anyway.

The main compression mechanisms were:

  • Tier restriction: ML pricing only influenced part of the ladder.
  • Floor ratcheting: regrade policies pushed floors upward and narrowed the candidate grid.
  • Locked offer overrides: high-value customers were often removed from the model’s control.
  • Metric sensitivity: weekly cohorts were too small to make revenue impact estimates stable.

This is an important lesson for production ML. Sometimes the bottleneck is not model quality. It is the amount of decision freedom the surrounding system gives the model.

What Comes Next

The next improvements are less about model sophistication and more about operational integration.

Useful next steps would be:

  • Expand ML pricing beyond a single tier so the model can influence more of the product ladder.
  • Revisit floor and ceiling configurations to make sure they reflect current market reality rather than historical conservatism.
  • Find safe ways to apply personalised pricing to high-value customers currently handled by locked offers.
  • Use longer evaluation windows to reduce weekly volatility and measure cumulative impact more reliably.

This is the unglamorous part of ML in production. The model can be ready before the organisation is ready to give it more room.

Closing that gap is not only an engineering problem. It is a trust problem. It requires evidence, stakeholder alignment, careful guardrails, and patience.

Final Thought

The general lesson is that decision models live inside systems.

It is not enough to ask whether the model can produce a better recommendation. You also have to ask whether the surrounding process allows that recommendation to matter.

In this case, the strongest idea was not a clever loss function or a more exotic algorithm. It was asking a better question:

Not “what is the right price?” but “what is the acceptance curve, and what decision should we take on top of it?”

That shift makes the trade-offs visible. Once the trade-offs are visible, the business can tune them, constrain them, evaluate them, and decide how much freedom the model has earned.

Resources

Citation Information

If you find this content useful & plan on using it, please consider citing it using the following format:

@misc{nish-blog,
  title = {The Geometry of Pricing: Turning Prediction Into a Decision},
  author = {Nish},
  howpublished = {\url{https://www.nishbhana.com/The-Geometry-Of-Pricing/}},
  note = {[Online; accessed]},
  year = {2026}
}
