Reduce Return Fraud in Ecommerce Without Triggering Chargebacks

Your Return Rate Is Above 15% and BFCM Is Five Weeks Out — Here's the Actual Problem

A 22% return rate on a new product line with a $4.17 per-order contribution margin means a single return wipes out the margin on five clean orders. At BFCM volume, that is not a customer service issue. It is a unit economics emergency.

The math compounds fast. Return rates spike 35–50% above baseline during November through January. A founder already running at 18% pre-BFCM is looking at a potential 24–27% return rate (18% × 1.35 to 18% × 1.50) during the exact window they need margin to hold. If your per-order contribution margin on a high-return SKU is already compressed to single digits, a holiday-level return spike does not just hurt. It inverts the economics of running the promotion at all.

But the return rate is not the root cause. The root cause is the absence of any written rule that gives your support staff authority to say no.

When a VA or part-time support rep is handling your Gorgias queue with no written decision logic, every ambiguous return gets approved. Not because they are careless — because they have no written authority to deny anything without escalating to you, and escalating to you on every ticket is not sustainable. So they approve. Every time. And the return rate climbs.

The policy trap most DTC operators fall into is not being too generous or too strict. It is having no decision rule at all. Before you rewrite a single line of your policy page, pull your last 90 days of return data from Shopify admin and calculate your actual per-order contribution margin on your highest-return SKU. If that margin, net of the expected cost of returns (return rate times what a single return costs you), is negative, you have a unit economics problem that needs a written rule before BFCM opens, not a policy conversation with your VA every time a ticket comes in.
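A minimal sketch of that margin check, assuming the margin on a returned order is lost entirely and that each return carries a flat cost you estimate yourself. The function name and the $20 cost per return are illustrative assumptions, not figures from Shopify.

```python
def net_margin_per_order(contribution_margin: float,
                         return_rate: float,
                         cost_per_return: float) -> float:
    """Expected margin per order once returns are priced in.

    Assumes a returned order earns no margin and also costs
    `cost_per_return` (shipping, handling, lost product value).
    """
    if not 0.0 <= return_rate <= 1.0:
        raise ValueError("return_rate must be a fraction between 0 and 1")
    kept = (1.0 - return_rate) * contribution_margin
    lost = return_rate * cost_per_return
    return kept - lost


# $4.17 margin, 22% return rate, assumed $20 cost per return.
margin = net_margin_per_order(4.17, 0.22, 20.0)
print(f"Net margin per order: ${margin:.2f}")
```

A negative result means the SKU loses money on every order at that return rate, before any holiday spike is applied.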

Why Tightening Your Return Policy Alone Creates a Chargeback Problem

The instinct when return rates climb is to tighten the policy. Shorter window. Stricter conditions. Fewer exceptions. That instinct is correct in direction and wrong in execution, because a blanket tightening does not distinguish between a serial returner and a first-time buyer who received a defective item.

When a legitimate buyer who received a defective item hits a blanket denial, they do not accept it. They file a chargeback. Now the merchant is facing a dispute with a 72-hour representment window, a potential Shopify Payments account health flag, and a permanently lost customer, all of which are worse than the original return cost.

The binary trap is real: a policy too lenient becomes a magnet for dishonest customers; a policy too strict drives away genuine buyers. But that trap only exists if the policy applies the same rule to every return request regardless of customer history or return reason. The trap is not inherent to strict policies — it is inherent to undifferentiated policies.

ASOS and other large retailers have added explicit serial-returner language to their policies — language like "if we notice an unusual pattern of returns activity that doesn't sit right, then we might have to deactivate the account" — not because they tightened across the board, but because they built a distinction. The policy is strict for customers with a documented pattern. It is fast and clean for customers without one.

That distinction is the entire mechanism. Review your last five denied or disputed returns. Identify whether each was a first-time buyer with a legitimate reason — QUALITY_DEFECT, DAMAGED_IN_TRANSIT — or a repeat customer with a documented pattern — SIZING cited on order 4 of 7. If you cannot answer that question from your current records, your policy has no decision logic. It has a blanket rule, and a blanket rule is what generates chargebacks when volume hits.

How to Identify Serial Returners Before They Hit Your BFCM Queue

You do not need a third-party fraud tool to identify serial returners. You need three data points from Shopify admin and a threshold you define in writing before BFCM opens.

The threshold that gives you a defensible written rule: three or more return requests within any rolling 90-day window, or a lifetime return rate above 40% of orders placed. A customer who has placed 7 orders and returned 5 has a 71% lifetime return rate and clears the lifetime threshold outright; if three of those returns fall inside 90 days, they clear the rolling-window threshold too. That customer is not a sizing problem. That customer is a serial returner, and they need to be flagged before they place an order during your BFCM sale.
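The two-part threshold can be sketched as a single check. This is one possible reading, assuming "rolling 90-day window" means three returns whose first and last dates are no more than 90 days apart; the function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def is_serial_returner(orders_placed: int,
                       return_dates: list[date],
                       window_days: int = 90,
                       max_in_window: int = 3,
                       lifetime_rate_limit: float = 0.40) -> bool:
    """True if either threshold in the written rule is met."""
    if orders_placed <= 0:
        return False
    # Rule 1: lifetime return rate above 40% of orders placed.
    if len(return_dates) / orders_placed > lifetime_rate_limit:
        return True
    # Rule 2: three or more returns no more than 90 days apart.
    dates = sorted(return_dates)
    window = timedelta(days=window_days)
    for i in range(len(dates) - max_in_window + 1):
        if dates[i + max_in_window - 1] - dates[i] <= window:
            return True
    return False
```

Keeping both limits as parameters lets you write the numbers into the policy once and change them in one place.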

Time-to-return is the second signal, and Shopify admin exposes it without any additional tooling. A customer who submits a return request within 48 hours of confirmed delivery on multiple separate orders is not experiencing sizing issues. They are exhibiting a wardrobing pattern. Two sub-48-hour returns across two separate orders is a documented pattern, not a coincidence — and it is the kind of signal that holds up in a chargeback representment because it is timestamped in your order history.
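As a sketch of the time-to-return signal, assuming you have a delivery timestamp and a return-request timestamp for each returned order (the tuple layout and function names here are hypothetical, not a Shopify export format):

```python
from datetime import datetime, timedelta


def count_fast_returns(events, limit_hours=48):
    """Count returns requested within `limit_hours` of delivery.

    events: list of (delivered_at, return_requested_at) datetime pairs,
    one pair per returned order.
    """
    limit = timedelta(hours=limit_hours)
    return sum(
        1 for delivered, requested in events
        if timedelta(0) <= requested - delivered <= limit
    )


def shows_wardrobing_pattern(events, min_fast_returns=2):
    """Two or more fast returns on separate orders counts as a pattern."""
    return count_fast_returns(events) >= min_fast_returns
```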

Return reason consistency is the third signal. A customer who cites SIZING on every return across multiple product categories — activewear, outerwear, accessories — is either genuinely unlucky or using the most sympathetic reason code available. Cross-referencing return reason against product category and order frequency surfaces this pattern in a spreadsheet. It takes about 20 minutes on a 90-day export.

Before BFCM opens, export your Shopify order and return history for the last 90 days. For each customer with more than one return, calculate: (1) total returns divided by total orders placed, (2) average days between delivery confirmation and return request submission, (3) return reason consistency across orders. Flag anyone above the 40% lifetime return rate threshold or with two or more sub-48-hour returns. That is your pre-BFCM watchlist, built from data you already have, in a tool you already use.
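The watchlist steps above could be sketched like this, assuming the export has been loaded into a list of dictionaries. The field names (`customer`, `returned`, `hours_to_return`, `reason`) are hypothetical and need mapping from your own columns; the fast-return cutoff is a parameter, defaulting to the 48 hours used for the wardrobing signal.

```python
from collections import Counter, defaultdict


def build_watchlist(orders, rate_limit=0.40, fast_hours=48, min_fast=2):
    """Return {customer: metrics} for customers who trip either flag."""
    by_customer = defaultdict(list)
    for row in orders:
        by_customer[row["customer"]].append(row)

    watchlist = {}
    for customer, rows in by_customer.items():
        returns = [r for r in rows if r["returned"]]
        if len(returns) < 2:
            continue  # only customers with more than one return
        rate = len(returns) / len(rows)
        hours = [r["hours_to_return"] for r in returns]
        fast = sum(1 for h in hours if h <= fast_hours)
        top_reason, top_count = Counter(
            r["reason"] for r in returns).most_common(1)[0]
        if rate > rate_limit or fast >= min_fast:
            watchlist[customer] = {
                "return_rate": round(rate, 2),
                "avg_days_to_return": round(sum(hours) / len(hours) / 24, 1),
                "fast_returns": fast,
                "top_reason": top_reason,
                # Share of returns that cite the most common reason.
                "reason_consistency": round(top_count / len(returns), 2),
            }
    return watchlist
```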

The Risk Scoring Framework That Gives Your VA a Decision Rule, Not a Judgment Call

The reason your VA approves every return is not that they lack judgment. It is that judgment is not a written rule, and without a written rule, the path of least resistance is always approval. A 0–10 risk score built from seven return reason categories converts a judgment call into a written rule your VA can execute without calling you.

The seven return reason categories cover the full range of what comes through a support queue: SIZING, QUALITY_DEFECT, CHANGED_MIND, DAMAGED_IN_TRANSIT, NOT_AS_DESCRIBED, SUSPECTED_WARDROBING, and SERIAL_PATTERN. Each carries a different risk weight. QUALITY_DEFECT and DAMAGED_IN_TRANSIT score low — these are legitimate product failures and fast approvals. SUSPECTED_WARDROBING and SERIAL_PATTERN score high. The score is additive across a customer's history, not per-ticket, which means a customer who has submitted three SIZING returns in 90 days accumulates a higher score than a first-time buyer submitting the same reason code once.
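A minimal sketch of an additive score, assuming per-return weights that you set yourself. The numbers below are hypothetical; the text fixes only the ordering (defect and damage low, wardrobing and serial pattern high), and the cap at 10 is an assumption to keep the score on the 0–10 scale.

```python
# Hypothetical weights per return; tune these to your own data.
REASON_WEIGHTS = {
    "QUALITY_DEFECT": 0,
    "DAMAGED_IN_TRANSIT": 0,
    "NOT_AS_DESCRIBED": 1,
    "SIZING": 2,
    "CHANGED_MIND": 2,
    "SUSPECTED_WARDROBING": 4,
    "SERIAL_PATTERN": 5,
}


def risk_score(reason_history: list[str]) -> int:
    """Sum the weights across a customer's returns, capped at 10.

    reason_history holds one reason code per return in the lookback
    window (for example, the last 90 days), including the current one.
    """
    total = sum(REASON_WEIGHTS[reason] for reason in reason_history)
    return min(total, 10)
```

With these example weights, one SIZING return scores 2, while three SIZING returns in the window score 6, which is the accumulation the paragraph describes.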

Three risk tiers map directly to five decision outcomes:

LOW (0–3 points): APPROVE_REFUND or APPROVE_EXCHANGE — fast, clean resolution, no friction for legitimate buyers
MEDIUM (4–6 points): OFFER_STORE_CREDIT or APPROVE_EXCHANGE, with a 15% restocking fee deducted if the customer insists on a cash refund instead
HIGH (7–10 points): DENY_WITH_REASON or ESCALATE_TO_FOUNDER
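The tier table can be written down as a lookup so the score, not the person, selects the allowed outcomes. This is a sketch of one way to encode it; the function names are hypothetical.

```python
TIER_OUTCOMES = {
    "LOW": ("APPROVE_REFUND", "APPROVE_EXCHANGE"),
    "MEDIUM": ("OFFER_STORE_CREDIT", "APPROVE_EXCHANGE"),
    "HIGH": ("DENY_WITH_REASON", "ESCALATE_TO_FOUNDER"),
}


def risk_tier(score: int) -> str:
    """Map a 0-10 risk score to LOW, MEDIUM, or HIGH."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 3:
        return "LOW"
    if score <= 6:
        return "MEDIUM"
    return "HIGH"


def allowed_outcomes(score: int) -> tuple[str, ...]:
    """Outcomes a support rep may pick from for this score."""
    return TIER_OUTCOMES[risk_tier(score)]
```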

The restocking fee structure — 15% of item price, waived if the customer accepts store credit or exchange — is not punitive for legitimate buyers. A customer with a genuine sizing issue will usually take the exchange. A serial returner who wants cash back, not store credit, will push back. That pushback is the signal. The fee creates friction exactly where friction is warranted.
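The fee arithmetic, sketched with `Decimal` to avoid floating-point rounding on currency. The resolution labels (`refund`, `store_credit`, `exchange`) are hypothetical names for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

RESTOCKING_RATE = Decimal("0.15")


def refund_amount(item_price: str, resolution: str) -> Decimal:
    """Value returned to the customer for a MEDIUM-tier return."""
    price = Decimal(item_price)
    if resolution in ("store_credit", "exchange"):
        return price  # fee waived
    if resolution == "refund":
        fee = (price * RESTOCKING_RATE).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP)
        return price - fee
    raise ValueError(f"unknown resolution: {resolution}")
```

On an $80.00 item, a cash refund comes to $68.00 after the $12.00 fee, while store credit or an exchange keeps the full $80.00 of value.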

Your VA picks from the outcome list. They do not decide the outcome — the score decides the outcome. That is the difference between a policy and a decision rule.

Map your last 10 return requests to the seven reason categories and assign a risk tier to each using the 0–3 / 4–6 / 7–10 scale. Then identify which of the five decision outcomes each should have received. If your actual outcomes — what your VA approved — do not match the scored outcomes, you have a policy inconsistency problem that a written decision tree would close before BFCM opens.

How to Deny a Fraudulent Return Without Triggering a Chargeback

A denial that follows a documented decision rule, uses non-accusatory language, and offers a documented alternative is defensible in a chargeback representment. A denial that reads like a blanket refusal is not.

The chargeback defense window is 72 hours from notification to submit representment evidence. The evidence that wins a representment is not your policy page text — it is the documented decision trail: the risk score on file, the return reason the customer submitted, their order and return history, and the written outcome your policy specifies for that risk tier. If you cannot produce that trail within 72 hours, the representment fails on documentation, not on merit.
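One way to make that trail producible on demand is to save a small record at the moment each decision is made. The sketch below assumes you store it as JSON alongside the ticket; the class and field names are hypothetical, not a Shopify or Gorgias schema.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class ReturnDecision:
    order_id: str
    customer_id: str
    return_reason: str      # reason code the customer submitted
    risk_score: int         # 0-10 score on file at decision time
    risk_tier: str          # LOW, MEDIUM, or HIGH
    outcome: str            # one of the five decision outcomes
    policy_clause: str      # clause cited in the response
    prior_orders: int
    prior_returns: int
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())


def to_evidence_json(decision: ReturnDecision) -> str:
    """Serialize the decision so it can be attached to a dispute."""
    return json.dumps(asdict(decision), indent=2, sort_keys=True)
```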

Denial language that names the specific policy clause being applied is harder to dispute than language that reads as discretionary. "Your return request falls outside our 30-day return window" or "this account has exceeded the return frequency threshold defined in our policy" gives the customer a specific clause to contest — and if the clause is disclosed at purchase, the contest fails. Vague language like "we are unable to process your return at this time" gives the customer nothing to contest and everything to dispute.

Offering a documented alternative in the denial response — store credit, exchange, or a partial refund with the restocking fee applied — creates a paper trail showing the merchant acted in good faith. A customer who rejects a documented alternative and files a chargeback is in a materially weaker representment position than a customer who received a flat denial with no alternative offered. The alternative offer is not a concession. It is chargeback defense documentation.

Write one denial response template for each of the three scenarios where denial is appropriate: (1) outside the return window, (2) serial returner threshold exceeded, (3) suspected wardrobing based on time-to-return pattern. Each template should cite the specific policy clause, offer a documented alternative, and include a line confirming the customer was notified of the policy at purchase. That template set is your representment package — built before you need it, not assembled in a 72-hour window while volume is still running.
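A sketch of the three-template set using Python's `string.Template`. The wording, scenario keys, and the closing line about checkout disclosure are illustrative assumptions; replace them with the clauses and disclosure points from your own policy.

```python
from string import Template

# Illustrative wording only; cite your real policy clauses.
DENIAL_TEMPLATES = {
    "OUTSIDE_WINDOW": Template(
        "Hi $name, your return request for order $order_id falls outside "
        "our return window ($clause). "
    ),
    "SERIAL_THRESHOLD": Template(
        "Hi $name, this account has exceeded the return frequency "
        "threshold defined in our policy ($clause). "
    ),
    "SUSPECTED_WARDROBING": Template(
        "Hi $name, the return request for order $order_id does not meet "
        "our item condition and timing requirements ($clause). "
    ),
}

CLOSING = Template(
    "We can offer $alternative instead. This policy was shown at checkout "
    "and in your order confirmation."
)


def render_denial(scenario: str, **fields: str) -> str:
    """Fill a denial template; raises KeyError if a field is missing."""
    body = DENIAL_TEMPLATES[scenario].substitute(fields)
    return body + CLOSING.substitute(fields)
```

Using `substitute` rather than `safe_substitute` means a message with a missing clause or alternative fails loudly instead of being sent incomplete.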

Setting Your New Product Line Return Window Before BFCM Exposes the Sizing Problem

A new product line with an undiagnosed sizing issue needs a tighter return window and a separate risk scoring baseline — not the same 30-day policy as your established SKUs. The return rate data you collect in the first 21 days is the only signal you have before BFCM volume amplifies the problem into something you cannot diagnose in real time.

A 21-day return window for any SKU launched within the prior 90 days is tighter than your standard 30-day window, but it is defensible because it is disclosed at purchase and applied consistently. The tighter window also accelerates the return data you need. Returns that come in within 21 days tell you whether the problem is sizing — returns cluster around specific size codes — or quality — returns are distributed across sizes but cite QUALITY_DEFECT. Those two diagnoses require different responses, and you need the data before BFCM opens, not after.

A 22% return rate on a new activewear line where every return cites SIZING is a product data problem, not a fraud problem. But it becomes a fraud problem the moment serial returners learn that SIZING is the reason code that always gets approved. The fix is not to stop approving SIZING returns — it is to apply the serial returner threshold to SIZING-coded returns the same way you apply it to every other reason code, and to add a fit guide to the product page that removes the legitimate sizing ambiguity. If the size chart is accurate and a customer still cites SIZING on their third return in 60 days, that is not a sizing problem.

The pre-BFCM audit for a new product line has two outputs: a return rate baseline by SKU and size code that tells you whether to pull a SKU from BFCM promotions, and a flagged customer list of anyone who has already hit the serial returner threshold on the new line before the sale opens. For every SKU launched in the last 90 days, pull return rate by size code from Shopify admin. If any single size code is above 30%, that is a fit problem: add a fit note to the product page before BFCM and consider excluding that size from promotional pricing. Then run the serial returner threshold check on the new line separately from your full catalog, because the return rate baseline is different and a customer who has returned twice from a new line in 30 days is already one return away from the three-in-90-days threshold.
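The size-code audit can be sketched as a simple tally, assuming each sold line item is reduced to a `(sku, size_code, returned)` tuple. That layout and the SKU names in the example are hypothetical.

```python
from collections import defaultdict


def return_rate_by_size(order_lines, flag_above=0.30):
    """Return (rates, flagged) keyed by (sku, size_code).

    order_lines: iterable of (sku, size_code, returned) tuples,
    one per unit sold.
    """
    sold = defaultdict(int)
    returned = defaultdict(int)
    for sku, size, was_returned in order_lines:
        sold[(sku, size)] += 1
        if was_returned:
            returned[(sku, size)] += 1
    rates = {key: returned[key] / sold[key] for key in sold}
    flagged = sorted(key for key, rate in rates.items() if rate > flag_above)
    return rates, flagged


# Example: size S of a hypothetical SKU returns 4 of 10 units.
lines = ([("LEG-01", "S", True)] * 4 + [("LEG-01", "S", False)] * 6
         + [("LEG-01", "M", True)] * 1 + [("LEG-01", "M", False)] * 9)
rates, flagged = return_rate_by_size(lines)
```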

The Actual Goal Is Not a Better Policy — It's a Policy Your Staff Can Execute Without You

Return fraud is cheap to commit against stores where every denial requires escalation. The serial returner knows that the path of least resistance for an under-resourced support team is approval. The friction that stops serial returners is not a stricter policy — it is a policy with written decision rules that a VA can apply without calling the founder. The policy is only as effective as the person executing it, and that person needs a rule, not a judgment call.

The five decision outcomes — APPROVE_REFUND, APPROVE_EXCHANGE, OFFER_STORE_CREDIT, DENY_WITH_REASON, ESCALATE_TO_FOUNDER — cover every return scenario a support ticket will surface. The critical design point is that ESCALATE_TO_FOUNDER is not a fallback for ambiguity. It is a specific outcome for a specific condition: typically a HIGH-tier customer with a chargeback history, or a disputed defective item claim above a defined dollar threshold. If ESCALATE_TO_FOUNDER is the outcome whenever your VA is uncertain, you have not written a policy — you have written a routing rule that still puts every hard decision on your desk.

The measure of whether your policy is working is not your overall return rate in isolation. It is your return rate segmented by risk tier. If your LOW-tier return rate is stable and your HIGH-tier return rate is falling, the policy is working. If your overall return rate is falling but your chargeback rate is rising, you have tightened the wrong variable — you are denying legitimate buyers and sending them to their bank.
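Segmenting by tier takes only a short tally. The sketch below assumes each order row already carries the customer's risk tier plus return and chargeback flags; those field names are hypothetical.

```python
def rates_by_tier(orders):
    """Return {tier: {"return_rate": ..., "chargeback_rate": ...}}.

    orders: list of dicts with keys tier, returned, chargeback.
    """
    stats = {}
    for order in orders:
        s = stats.setdefault(
            order["tier"], {"orders": 0, "returns": 0, "chargebacks": 0})
        s["orders"] += 1
        s["returns"] += bool(order["returned"])
        s["chargebacks"] += bool(order["chargeback"])
    return {
        tier: {
            "return_rate": s["returns"] / s["orders"],
            "chargeback_rate": s["chargebacks"] / s["orders"],
        }
        for tier, s in stats.items()
    }
```

Running this on the same date range before and after the policy change gives the two comparisons the paragraph asks for: return rate per tier, and chargeback rate alongside it.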

Define the one condition under which your VA must escalate to you. Not a list of conditions — one condition. Write it as a single sentence and put it at the top of your decision tree. Everything else should resolve to one of the other four outcomes without your involvement. If you cannot write that sentence in under two minutes, your policy still has a judgment gap — and that gap is what makes return fraud cheap to commit at BFCM volume, when your queue is running at 4x baseline and your VA is making 40 decisions a day instead of 10.