Master Data Cleanup: Lessons from a Carve-Out Disaster

After a four-year career at Faurecia, I transitioned to a new challenge at a company carved out from Kodak called Carestream. I joined as Chief Data Officer and Enterprise Architect, and one of my first major challenges was tackling their legacy SAP environment—and the poor quality of master data within it.

The Kodak Legacy

If you’ve ever been to Rochester, NY, you’ve seen Kodak’s presence: its massive headquarters dominates the skyline. It’s an ironic story: the very company that pioneered digital photography was ultimately driven into bankruptcy by it. Kodak’s leadership clung to the film business, a market that had brought immense profit, even as digital camera technology, based on Kodak’s own invention, exploded worldwide. In protecting their golden goose, they missed the next revolution.

Their opulent headquarters reflected a company at its peak: lavish interiors, executive luxury—especially on the top floors. But after the downfall, Kodak’s healthcare division, which produced X-ray equipment, was sold to a private equity firm in Toronto and reborn as Carestream.

A Walkway, a Split, and a Mess

The transition from Kodak to Carestream was both literal and symbolic. Carestream took over a building adjacent to Kodak’s headquarters, connected by a glass walkway. After the deal closed, they physically rolled servers on wheels across the walkway: servers containing SAP data, licenses, and systems. Once the migration was done, the connection between the two buildings was sealed permanently. It was a fitting metaphor: Kodak was history, and Carestream had to make its own future.

Unfortunately, that future began with a nightmare of bad data.

The Master Data Disaster

Carestream inherited SAP 4.5, an ancient and heavily customized system that was no longer upgradeable. The system contained 153,000 customer records, of which 39% were duplicates—around 55,000 in the U.S. and another 46,000 internationally. Vendor data was just as bad, with duplicate vendor accounts, conflicting invoices, and inconsistent contract records. The SAP environment was so corrupted that moving to SAP HANA—the company’s aspiration—was impossible in its current state.

I started with department reviews to understand the scope of the problem. What I found was shocking. In the billing department, a sales order entry operator showed me a dropdown list of customer types in SAP—with nearly 200 options. These ranged from photo-developing kiosks to hospitals, liquor stores, and tobacco distributors. No one could—or would—scroll through 200 choices. People just picked the few they recognized, compounding the inconsistency every single day.

In 30 years of IT leadership, I’d seen my share of bad systems, but this one ranked near the top. Over-customization, poor governance, and a lack of discipline had turned what should have been a structured ERP system into chaos.

Finding a Way Out

The path forward came from an SAP event, where I discovered a company called Winshuttle. Their platform sat on top of SharePoint and could integrate directly with SAP—even legacy systems like ours. With some coding extensions, it allowed us to validate and clean master data in real time.

We connected Dun & Bradstreet’s API into Winshuttle, giving operators access to accurate, authoritative company data. When a user entered a new customer, Winshuttle would display all possible duplicate accounts side-by-side—sometimes with subtle differences like:

  • “ABC Engineering”
  • “ABC Engineering, Inc.”
  • “ABC Engineering Corporation”

These inconsistencies had plagued us for years. Now the operator could see them all at once, compare them against Dun & Bradstreet’s verified data, and decide which record was correct. Winshuttle provided guidance based on programmed rules, but the final decision rested with a human; this was before AI-based data matching.
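
To make the idea concrete, here is a minimal sketch of how name variants like the ones above could be grouped as duplicate candidates for an operator to review. It is not Winshuttle’s logic or API. The function names, the suffix list, and the sample records are all hypothetical, and a real rule set would also compare addresses, tax IDs, and D-U-N-S numbers.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation",
    "co", "company", "llc", "ltd", "limited",
}

def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def find_duplicate_candidates(new_name, existing):
    """Return existing (id, name) records whose normalized name equals new_name's."""
    key = normalize_name(new_name)
    return [(cid, name) for cid, name in existing if normalize_name(name) == key]

# Illustrative records only; the IDs and the fourth company are made up.
existing_customers = [
    ("C001", "ABC Engineering"),
    ("C002", "ABC Engineering, Inc."),
    ("C003", "ABC Engineering Corporation"),
    ("C004", "Acme Imaging Ltd."),
]

# The operator types a new customer; the three ABC variants come back as candidates.
candidates = find_duplicate_candidates("ABC Engineering Corp.", existing_customers)
for cid, name in candidates:
    print(cid, name)
```

The sketch only proposes candidates. As in our process, deciding which record survives is left to a person.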

Once a record was confirmed, the system automatically updated SAP, and batch jobs would run overnight to merge duplicates and enforce standardized master data records. The same process was applied to vendors, purchasing, and inventory data.
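
The overnight merge step can be pictured with a small sketch like the one below. It assumes the operators’ decisions are available as a mapping from each duplicate ID to its surviving ID. It does not use SAP’s or Winshuttle’s actual interfaces, and every name in it (`merge_map`, `run_merge_batch`, the order fields) is hypothetical.

```python
def resolve_survivor(customer_id, merge_map):
    """Follow duplicate -> survivor links until reaching a record that is not merged."""
    seen = set()
    while customer_id in merge_map:
        if customer_id in seen:
            raise ValueError(f"cycle in merge decisions at {customer_id}")
        seen.add(customer_id)
        customer_id = merge_map[customer_id]
    return customer_id

def run_merge_batch(orders, merge_map):
    """Return orders re-pointed to surviving customers, plus the duplicate IDs to block."""
    merged_orders = [
        {**order, "customer_id": resolve_survivor(order["customer_id"], merge_map)}
        for order in orders
    ]
    return merged_orders, sorted(merge_map)

# Illustrative decisions: C003 was merged into C002, which was later merged into C001.
merge_map = {"C002": "C001", "C003": "C002"}
orders = [
    {"order_id": "SO-1", "customer_id": "C003"},
    {"order_id": "SO-2", "customer_id": "C001"},
]

merged_orders, blocked = run_merge_batch(orders, merge_map)
print(merged_orders)
print(blocked)
```

Following the chain of decisions matters because a record chosen as the survivor one day may itself be merged into another record later. The cycle check guards against two operators merging a pair of records into each other.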

The Results

Within nine months, we had transformed Carestream’s data landscape.

  • Duplicate customer and vendor records were virtually eliminated.
  • Master data processes were standardized and automated.
  • The company could finally consider moving toward SAP HANA.

It was one of the most satisfying data quality turnarounds of my career, and a demonstration of how process design, automation, and discipline can rescue even the most tangled ERP environments.

If you’re struggling with master data quality in SAP or any other ERP system and want to learn more about this approach, feel free to reach out. My email is below. If there’s enough interest, I’ll publish a follow-up post with technical details and process documentation for implementing a similar cleanup strategy.