What a Broken Coffee Table Taught Me About Distribution's Data Problem
A few years ago, I broke a caster wheel on my parents' coffee table. Simple fix. I just needed a replacement with the right bolt spacing. So, I did what any normal person does now - went online to buy it.
I checked distributor website after distributor website with no luck. Turns out, only one website would let me filter by bolt spacing: Grainger. Every other site was a dead end. To find what I needed anywhere else, I would've had to call a rep. Here's what makes that embarrassing: I work in this industry. I know these products. If I couldn't find it self-serve, your customers have no shot.
And according to Gartner, 67% of B2B buyers prefer a rep-free buying experience. The problem isn't that customers don't want to buy online. It's that our websites won't let them.
The PIM didn't fix the problem. It housed it
Most distributors invested in a PIM, thinking it would solve their product data problem. It didn't.
That's not a knock on any specific vendor. It's just what PIMs are. Digital filing cabinets. They store data. They don't enrich it, validate it, or fill in the blanks. Up until now, the unspoken deal when you license a PIM has been that you bring clean data, and the PIM will help you manage it.
Except that deal falls apart immediately in distribution.
And how could it not? We’re juggling 1,200 vendors sending data in wildly inconsistent formats. Spreadsheets, PDFs, supplier portals, the occasional photo of a product label emailed by a sales rep. And that’s just the intake problem. We then have to map, enrich, and normalize every piece of incoming data across anywhere from tens of thousands to millions of individual SKUs.
When a single vendor changes their data format, it can ripple across thousands of affected SKUs, and that’s just one vendor out of 1,200. Multiply that across your entire supplier base, and you have a task that is not just time-consuming but functionally impossible to do well by hand.
Some companies build internal data teams to do this manually. But most can't justify the headcount, so they try the alternatives.
Option one. Shared data services. Technically accurate, but they give every other distributor member the exact same data. This creates SEO issues when Google sees duplicate content across dozens of sites and penalizes everyone. You paid for data and got worse SEO rankings in return.
Option two. Offshore agencies. Slow and chronically wrong on technical products. One distributor we talked to in the construction supply space hired a firm to enrich its mortar product line. After the distributor paid thousands of dollars, the firm came back with specs for military mortars instead of mortar, the cement paste used in construction. These types of expensive mix-ups happen all the time. Neither option solves the underlying issue.
The $5B problem you can’t see
The costs of lost sales, customer returns, and hours of manual data cleanup add up quickly. NAED research puts the cost of bad product data for electrical distributors and manufacturers at $5 billion annually. And that's just one vertical.
Every customer who can't find what they need on your site is part of that number. Here's how it plays out: a customer searches "3/4 inch brass fitting" and nothing comes up. Not because you don't carry it. But because it's cataloged as "0.75 in. brass connector." That customer doesn't call to flag the mismatch. They just leave your site, and you never knew they were there.
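To make that mismatch concrete, here is a minimal Python sketch of how a search layer might reduce both phrasings to the same tokens. It is an illustration under stated assumptions, not how any particular platform works: the `SYNONYMS` and `UNIT_WORDS` tables and the `normalize` function are hypothetical names invented for this example.

```python
import re
from fractions import Fraction

# Hypothetical tables for illustration; real ones would be built from
# your own catalog and search logs.
SYNONYMS = {"connector": "fitting"}
UNIT_WORDS = {"in", "in.", "inch", "inches"}


def normalize(text):
    """Reduce a query or catalog title to a tuple of comparable tokens."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(",")
        if word in UNIT_WORDS:
            # Collapse every spelling of "inch" to one unit token.
            tokens.append("in")
        elif re.fullmatch(r"\d+/[1-9]\d*", word):
            # Convert a fraction such as 3/4 to its decimal form.
            tokens.append(str(float(Fraction(word))))
        elif re.fullmatch(r"\d*\.\d+|\d+", word):
            # Put plain numbers in the same decimal form.
            tokens.append(str(float(word)))
        else:
            # Map known synonyms to a single preferred term.
            tokens.append(SYNONYMS.get(word, word))
    return tuple(tokens)


customer_query = normalize("3/4 inch brass fitting")
catalog_title = normalize("0.75 in. brass connector")
print(customer_query == catalog_title)
```

With both sides normalized the same way, the customer's wording and the catalog's wording compare as equal, so the search has something to return.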
What AI changes for your product data
For a long time, there was genuinely no good answer. Automation at the scale and accuracy distribution requires simply didn't exist. Your options were bad data or expensive humans slowly producing bad data. That's now changed.
New AI models can scrape manufacturer websites, parse spec sheets, and normalize attributes across your entire catalog automatically. They can also normalize your entire product taxonomy at scale and notice when different suppliers are describing the exact same spec in different ways. Take color: one supplier calls it "Navy," another calls it "00-Blue," a third calls it "Dark Blue." AI resolves all three to "Blue" in milliseconds, consistently, across every category in your catalog. That is what makes the search filters on your ecommerce site actually work.
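The color example can be sketched in a few lines of Python. This is not an AI model: a hand-written lookup table stands in for whatever resolves the labels, purely to show what the normalized output looks like. `CANONICAL_COLORS` and `canonical_color` are hypothetical names for this illustration.

```python
from typing import Optional

# Hypothetical mapping from supplier labels to one canonical filter value.
# In the scenario described above, a model would infer this mapping;
# here it is written by hand.
CANONICAL_COLORS = {
    "navy": "Blue",
    "00-blue": "Blue",
    "dark blue": "Blue",
}


def canonical_color(label: str) -> Optional[str]:
    """Return the canonical color, or None so unknown labels can be reviewed."""
    return CANONICAL_COLORS.get(label.strip().lower())


supplier_labels = ["Navy", "00-Blue", "Dark Blue"]
resolved = {canonical_color(label) for label in supplier_labels}
print(resolved)
```

Once all three labels resolve to one value, a single "Blue" filter can cover products from all three suppliers. Returning `None` for anything unrecognized is a deliberate choice here: it is safer to flag an unknown label for review than to guess.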
This allows you to 10x the work of every employee manually enriching or organizing your product data. And when your product data is clean and consistent, everything downstream works better.
Search returns results because the terminology matches what customers actually type. Filters work because attributes are normalized across categories, not just within a single product line. Product pages convert more because the specs, images, and descriptions give customers all the information they need to make a confident purchase. And your SEO improves because your content is unique, not the exact same copy as the twelve other distributors pulling descriptions from the same data broker.
What to do next: Be your own customer for 10 minutes
If you're not sure where to start, do what I did when I was looking for that caster wheel. Go to your own website and try to buy something with a specific spec using only search and filters.
Can you identify the right item without calling a rep? From there, pull your zero-results search report. Most eCommerce platforms log every search that returns nothing. That report is a direct map of your data problems, written in your customers' words, not yours.
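If your platform only gives you a raw search log, a rough version of that report takes a few lines of Python. This sketch assumes a hypothetical export in which each row has a `query` and a `result_count` field; your platform's field names will differ.

```python
from collections import Counter


def zero_result_report(search_log, top_n=10):
    """Count searches that returned nothing, most frequent first."""
    counts = Counter(
        row["query"].strip().lower()  # treat case and spacing variants as one query
        for row in search_log
        if row["result_count"] == 0
    )
    return counts.most_common(top_n)


# Made-up sample rows in the assumed export format.
sample_log = [
    {"query": "3/4 inch brass fitting", "result_count": 0},
    {"query": "3/4 Inch Brass Fitting ", "result_count": 0},
    {"query": "caster wheel", "result_count": 14},
    {"query": "navy cable tie", "result_count": 0},
]

report = zero_result_report(sample_log)
for query, misses in report:
    print(misses, query)
```

The queries at the top of the list are the ones costing you the most visits, which makes them a sensible place to start checking how those products are cataloged.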
Next, pull up five of your best-selling SKUs next to Grainger's. Note what they show that you don't. The missing specs, the absent filters, the thin product pages.
This is a great way to start mapping what's broken on your website. The next question is where the problem lives. Pick your worst-performing category and trace the data back to its source. A buying group feed? An offshore agency? A supplier portal nobody has touched in two years? That's where to start enriching your data.
That used to mean hiring a team and doing it by hand, category by category, SKU by SKU. Now there are AI tools that scrape, parse, and normalize your catalog automatically, at a fraction of the time and cost.
Sure, the gap between what you carry and what your customers can find is a data problem. But now, for the first time, it's solvable at scale.
Every day you wait, someone is doing exactly what I did, going from site to site, looking for a part they know you carry. Most of them aren't waiting around to find it on yours.
OTHER ARTICLES BENJ COHEN HAS WRITTEN FOR ELECTRICAL WHOLESALING (OR ABOUT PROTON.AI)
The Art & Science of Artificial Intelligence
EW first met Benj at a 2019 NAW meeting and was so impressed with his background that we wanted our readers to learn about him and proton.ai in this article in April 2019.
Proton.ai Launches New Product Information Management (PIM) System
October 2025
New Proton Report Reveals Why Distributors Can’t Afford to “Wait and See” on Tariffs
June 2025
Proton.ai Becomes First Distribution CRM to Achieve SOC 2 Type II Certification
September 2024
Salespeople Can Learn to Love Your New CRM
July 2024
How to Grow Your Business Organically
August 2023
How AI Can Personalize Sales Experiences for Your Customers
February 2023
About the Author
Benj Cohen
Benj Cohen grew up in his family’s distribution business, Benco Dental, a dental supply business started by his great-grandfather in the 1930s. He blended this background with a Harvard degree in applied math to found proton.ai (www.proton.ai), a company dedicated to bringing artificial intelligence to distribution companies and others in the B2B world to deliver large ROI. Cohen was the subject of a 2019 Electrical Wholesaling feature on AI, “The Art & Science of Artificial Intelligence,” and has written frequently on the topic for EW. You can contact him at [email protected].


