Delivered · Cleo CIC · SuiteScript 2.x · Map/Reduce

Retail EDI Integration & Stabilization

How we eliminated 2AM fire drills and built predictable EDI posting

8 weeks · Lead Developer · Last revised: 2025-01-07

2AM incidents: Weekly → Near zero (~95% reduction)

Manual intervention: Daily → Exception-only (~80% reduction)

Partner onboarding: 2-3 weeks → 3-5 days (70% faster)

1. The Problem

The operations team was drowning in EDI failures. Every week, someone would get paged at 2AM because a batch posting had failed silently, leaving orders stuck in limbo. The existing integration had no retry logic, no exception handling, and no way to identify which specific transactions had failed. Partner onboarding was a nightmare—each new retail partner meant weeks of mapping adjustments and prayer.

2. Investigation

  • Mapped existing data flows from Cleo CIC to NetSuite (846 inventory, 850 orders, 856 ASNs)
  • Identified 7 distinct failure modes in the existing scripts
  • Found that 60% of failures were due to item matching issues (GTIN/UPC mismatches)
  • Discovered governance limit violations during peak posting times
  • Documented previously unrecorded field mappings across 4 retail partners
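
Many GTIN/UPC mismatches of the kind noted above come down to the same product code stored at different lengths (a 12-digit UPC-A in one system, a zero-padded 13- or 14-digit form in another). A minimal sketch of normalizing codes before comparison, using the standard GS1 mod-10 check digit. The function names are ours for illustration and are not taken from the project's matching engine:

```javascript
// Compute the GS1 mod-10 check digit for the data digits of a GTIN
// (every digit except the last one).
function gs1CheckDigit(dataDigits) {
  let sum = 0;
  // Walk right to left: the rightmost data digit is weighted 3, then 1, 3, 1...
  for (let i = 0; i < dataDigits.length; i++) {
    const digit = Number(dataDigits[dataDigits.length - 1 - i]);
    sum += digit * (i % 2 === 0 ? 3 : 1);
  }
  return (10 - (sum % 10)) % 10;
}

// Normalize a GTIN-8, UPC-A (12), EAN-13, or GTIN-14 string to 14 digits.
// Returns null when the input has the wrong shape or a bad check digit.
function normalizeGtin(raw) {
  const digits = String(raw).replace(/[\s-]/g, '');
  if (!/^\d+$/.test(digits) || ![8, 12, 13, 14].includes(digits.length)) {
    return null;
  }
  const expected = gs1CheckDigit(digits.slice(0, -1));
  if (expected !== Number(digits.slice(-1))) {
    return null; // check digit mismatch: likely a keying or truncation error
  }
  return digits.padStart(14, '0');
}

// The same product sent as UPC-A and as a zero-padded 13-digit code
// normalizes to one key.
const fromUpc = normalizeGtin('036000291452');
const fromEan = normalizeGtin('0036000291452');
```

Comparing on the normalized 14-digit key means a partner that sends padded codes and an item record that stores the bare UPC still match, while a code with a bad check digit is rejected early instead of silently matching nothing.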

3. Architecture

We designed a resilient pipeline with three core principles: fail safely, recover automatically, and surface exceptions with context. The new architecture introduces an exception queue as a first-class citizen, retryable Map/Reduce jobs with exponential backoff, and a matching engine that gracefully handles GTIN/UPC variations.
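
To make "exponential backoff" concrete, here is a rough sketch of how a retry schedule could be computed. The base delay, cap, and attempt limit are illustrative defaults, and the idea of storing a `retryAt` timestamp for a later job to pick up is an assumption on our part, not a description of the deployed scripts:

```javascript
// Delay in minutes before the next attempt, given how many attempts have
// already failed (1-based). Doubles each time, capped at maxMinutes.
function backoffMinutes(attempts, baseMinutes = 5, maxMinutes = 240) {
  const delay = baseMinutes * Math.pow(2, attempts - 1);
  return Math.min(delay, maxMinutes);
}

// Decide what happens to a failed transaction: schedule another attempt,
// or hand it to the exception queue once the attempt limit is reached.
function planRetry(attempts, now = new Date(), maxAttempts = 5) {
  if (attempts >= maxAttempts) {
    return { action: 'queue_exception', retryAt: null };
  }
  const retryAt = new Date(now.getTime() + backoffMinutes(attempts) * 60 * 1000);
  return { action: 'retry', retryAt };
}
```

With these defaults the waits run 5, 10, 20, and 40 minutes before the transaction lands in the exception queue, which gives transient failures time to clear without retrying forever.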

[Diagram: Retail EDI Integration Architecture (labels: EDI 850, JSON)]

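    // Exception queue handler with context.
    // Assumes `record` is the N/record module; inferAction, isCritical, and
    // notify are helpers defined elsewhere in the project.
    function handleException(context) {
        const exception = {
            transactionId: context.txnId,
            failureType: context.error.type,
            payload: context.rawPayload,
            attemptCount: context.attempts,
            lastAttempt: new Date().toISOString(),
            suggestedAction: inferAction(context.error)
        };

        // record.create() takes no `values` option: set each field, then save.
        // The custrecord_edi_* field IDs are illustrative placeholders.
        const rec = record.create({ type: 'customrecord_edi_exception' });
        Object.keys(exception).forEach(function (key) {
            rec.setValue({
                fieldId: 'custrecord_edi_' + key.toLowerCase(),
                value: exception[key]
            });
        });
        rec.save();

        // Alert only on business-critical failures
        if (isCritical(context.error)) {
            notify.ops(exception);
        }
    }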

Exception handling with context preservation
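
The handler above calls `inferAction` and `isCritical` without showing them. One plausible shape is a lookup table keyed by failure type. The failure type names and action strings below are entirely hypothetical, since this write-up does not list the real categories:

```javascript
// Hypothetical failure types, suggested actions, and criticality flags.
const FAILURE_RULES = new Map([
  ['ITEM_NOT_FOUND', {
    action: 'Add or correct the item GTIN/UPC mapping, then replay',
    critical: false
  }],
  ['GOVERNANCE_LIMIT', {
    action: 'No action needed; the job retries automatically',
    critical: false
  }],
  ['INVALID_FORMAT', {
    action: 'Check the partner mapping for a format change',
    critical: true
  }]
]);

// Unknown failure types fall back to manual review and are treated as
// critical, so a new kind of failure is never silently ignored.
const FALLBACK_RULE = { action: 'Review the raw payload manually', critical: true };

function ruleFor(error) {
  return FAILURE_RULES.get(error.type) || FALLBACK_RULE;
}

function inferAction(error) {
  return ruleFor(error).action;
}

function isCritical(error) {
  return ruleFor(error).critical;
}
```

Keeping the rules in one table means adding a failure mode is a one-line change, and the default for anything unrecognized errs on the side of alerting.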

4. Implementation

  • Built GTIN/UPC matching engine with fallback logic and fuzzy matching
  • Implemented Map/Reduce with governance-aware throttling (pauses before limits)
  • Created exception queue with replay capability and suggested actions
  • Added comprehensive logging with transaction correlation IDs
  • Deployed incrementally with shadow mode validation before cutover
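
The "pauses before limits" idea from the list above can be sketched as a loop that checks the remaining governance budget before each unit of work and defers the rest. The usage reader is injected so the sketch runs outside NetSuite; in a SuiteScript 2.x script it would typically wrap `runtime.getCurrentScript().getRemainingUsage()` from N/runtime. The 200-unit reserve and the function name are illustrative assumptions:

```javascript
// Process items until the remaining governance units drop to the reserve.
// Anything not processed is returned as `deferred` for a later run.
function processWithinBudget(items, handler, getRemainingUsage, reserveUnits = 200) {
  const processed = [];
  let index = 0;
  while (index < items.length && getRemainingUsage() > reserveUnits) {
    processed.push(handler(items[index]));
    index++;
  }
  return { processed, deferred: items.slice(index) };
}

// Usage with a simulated meter: 1000 units available, 300 units per item.
let units = 1000;
const result = processWithinBudget(
  ['a', 'b', 'c', 'd', 'e'],
  (item) => { units -= 300; return item.toUpperCase(); },
  () => units
);
// result.processed is ['A', 'B', 'C'] and result.deferred is ['d', 'e']
```

Stopping with a reserve in hand leaves enough budget to record what was deferred, so the job ends cleanly instead of dying mid-transaction on a governance error.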

5. Outcome

Three months post-deployment, the team has had exactly one 2AM incident (caused by an upstream partner format change, not our code). The exception queue now handles edge cases gracefully, giving the ops team clear next steps instead of mystery failures. Partner onboarding dropped from weeks to days thanks to the new matching engine.

Lessons Learned

  • Shadow mode deployment was critical: it caught 3 edge cases before production
  • Exception queue design matters more than retry logic
  • Governance limits are real constraints that need first-class handling
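
For readers unfamiliar with shadow mode: the new pipeline runs alongside the old one without posting anything, and its output is compared against what the old integration actually posted. A minimal sketch of that comparison, with a hypothetical function name and flat example records:

```javascript
// Compare the record the legacy integration posted with what the new
// pipeline would have posted for the same transaction. Returns one entry
// per field whose values differ (including fields present on one side only).
function diffRecords(legacy, shadow) {
  const fields = new Set([...Object.keys(legacy), ...Object.keys(shadow)]);
  const diffs = [];
  for (const field of fields) {
    // JSON comparison is enough for flat values and arrays; nested objects
    // with differently ordered keys would need a deeper comparison.
    if (JSON.stringify(legacy[field]) !== JSON.stringify(shadow[field])) {
      diffs.push({ field, legacy: legacy[field], shadow: shadow[field] });
    }
  }
  return diffs;
}

// Example: the two pipelines disagree on quantity only.
const diffs = diffRecords(
  { poNumber: 'PO-1001', quantity: 12 },
  { poNumber: 'PO-1001', quantity: 10 }
);
```

An empty diff for every transaction over a representative period is the signal that cutover is safe; any non-empty diff is an edge case to investigate before it reaches production.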

First written: 2024-06-15 · Last revised: 2025-01-07


Jorge Muñoz | Senior Full-Stack Integration Engineer | NetSuite + EDI (Cleo CIC)