I'll never forget the look on the IT director's face when we found credit card numbers in their HR system.
It was 2017, and I was conducting a PCI DSS pre-assessment for a regional restaurant chain. They were confident about their scope—payment terminals connected to their payment processor, and that was it. Clean. Simple. Manageable.
Then we found 847 credit card numbers in their employee expense reimbursement database.
"But... how?" the IT director stammered. "We never store card data!"
Famous last words I've heard more times than I can count in my 15+ years doing PCI assessments.
Here's the uncomfortable truth: organizations rarely know where all their cardholder data actually lives. They know where it's supposed to be. But data has a nasty habit of spreading like water, flowing into every crack and crevice of your IT environment.
Under PCI DSS 4.0, if you can't find it, you can't protect it. And if you can't protect it, you're one breach away from catastrophic fines, the loss of your ability to process cards, and potentially closing your doors.
Let me show you how to find that hidden payment data before an attacker does.
Why Cardholder Data Discovery Is Your Most Critical PCI Task
I've assessed over 200 organizations for PCI compliance, and I can tell you with absolute certainty: data discovery failures are the #1 reason companies fail their first PCI assessment.
Not weak passwords. Not missing patches. Not inadequate firewalls.
It's cardholder data in places nobody knew about.
"You can't secure what you can't see. And in my experience, most organizations can only see about 60% of their actual cardholder data footprint."
The Real Cost of Unknown Data Locations
Let me share a cautionary tale that still makes me wince.
In 2019, I was called in post-breach to help a mid-sized e-commerce company. They'd been breached six months earlier—attackers stole 23,000 credit card numbers. The company had passed their PCI assessment just four months before the breach.
How?
The assessment covered their production payment systems. But developers had copied production data to a staging environment for testing. That staging environment wasn't in the documented cardholder data environment (CDE). It wasn't protected. It wasn't even mentioned in the assessment.
The attackers found it in about eight hours.
The aftermath:
$2.7 million in card brand fines
$4.1 million in forensic investigation and customer notification
$890,000 in legal settlements
Payment processor terminated their merchant account
Business closed 14 months later
All because of cardholder data they didn't know they had.
Understanding What You're Really Looking For
Before we dive into discovery techniques, let's be crystal clear about what constitutes cardholder data under PCI DSS.
Primary Account Number (PAN): The Crown Jewel
The PAN is the card number itself—typically 13-19 digits, though 16 is most common. This is what attackers want, and this is what PCI DSS protects above all else.
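Important: Partial PANs can still count. Properly truncated PANs are treated differently from full PANs, but if different truncated or hashed versions of the same PAN sit in separate locations in your environment, they can potentially be correlated to reconstruct the original, and PCI DSS expects additional controls to prevent that. I've seen organizations fail assessments because they assumed masked data was automatically out of scope.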
Sensitive Authentication Data: The Forbidden Fruit
Here's where many organizations get into trouble. PCI DSS absolutely prohibits storing certain data after authorization, even if encrypted:
Data Element | Storage Permitted | Common Violation |
|---|---|---|
Full magnetic stripe data | ❌ Never | Legacy POS systems logging swipes |
CAV2/CVC2/CVV2/CID | ❌ Never | E-commerce platforms "for convenience" |
PIN/PIN Block | ❌ Never | Custom payment applications |
I once found CVV codes in a customer service ticketing system where agents had copied and pasted card details from phone calls. The company had been storing thousands of CVV codes for over three years, a PCI DSS violation that could have resulted in massive fines if discovered by their acquirer.
Cardholder Data Elements: The Supporting Cast
Beyond the PAN, you also need to track:
Data Type | Example | PCI DSS Requirement |
|---|---|---|
Cardholder Name | John Smith | Must be protected if stored with PAN |
Service Code | 3-digit code on magnetic stripe | Must be protected if stored with PAN |
Expiration Date | 12/25 | Must be protected if stored with PAN |
"Every piece of cardholder data is a liability. The less you store, the less you have to protect, and the smaller your PCI scope becomes."
The Hidden Places Where Cardholder Data Lurks
In my years of assessments, I've found cardholder data in the most unexpected places. Let me walk you through the usual suspects—and the surprising ones.
1. Application and Web Server Logs
This is where I find cardholder data about 40% of the time.
A healthcare provider I worked with had a perfectly secure payment application. But their web server was configured to log all POST requests—including the payment form submissions. They had 18 months of credit card numbers sitting in plain text log files.
Where to look:
Web server access logs (Apache, Nginx, IIS)
Application logs (custom software, commercial platforms)
Error logs (these often dump entire requests during failures)
Debug logs (developers love verbose logging)
API gateway logs
Load balancer logs
Real example: I once found over 12,000 credit card numbers in Apache access logs because the payment form was using GET instead of POST, putting card numbers in the URL. Every web request was logged with the full PAN visible.
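To make the GET-versus-POST problem concrete, here is a minimal Python sketch that flags query-string parameters whose values look like a PAN in a common/combined-format access log line. The function name, the sample log line, and the `card` parameter are hypothetical, and the pattern only finds candidates (13-19 digits); anything it flags still needs validation.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# 13-19 digits, not embedded in a longer run of digits (candidates only).
PAN_CANDIDATE = re.compile(r"(?<!\d)\d{13,19}(?!\d)")

# Request portion of a common/combined log line, e.g. "GET /path?query HTTP/1.1"
REQUEST = re.compile(r'"(?:GET|POST|PUT|DELETE|HEAD|PATCH) (\S+) HTTP/[\d.]+"')


def query_params_with_pan_candidates(log_line: str) -> list[str]:
    """Return the names of query parameters whose values look like a PAN."""
    match = REQUEST.search(log_line)
    if not match:
        return []
    query = urlsplit(match.group(1)).query
    flagged = []
    for name, value in parse_qsl(query, keep_blank_values=True):
        digits = re.sub(r"[ -]", "", value)  # tolerate spaces and dashes
        if PAN_CANDIDATE.fullmatch(digits):
            flagged.append(name)
    return flagged


# Hypothetical log line using a well-known test card number.
line = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /pay?card=4111111111111111&exp=1225 HTTP/1.1" 200 512')
print(query_params_with_pan_candidates(line))
```

Reporting the parameter name rather than the value is deliberate: the scan output itself should not become another place where full PANs are stored.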
2. Database Backups and Archives
Organizations religiously protect their production databases but completely forget about backups.
I assessed a restaurant chain that had excellent encryption on their production payment database. But they backed up to an unencrypted NAS device every night. Six years of backups. Over 2 million credit card numbers in plain text.
Where to look:
Local backup directories
Network-attached storage (NAS) devices
Tape backup archives
Cloud backup storage (S3, Azure Blob, etc.)
Snapshot copies
Archived database dumps
Old server images before decommissioning
3. Development and Testing Environments
This is my #2 most common finding—developers using production data for testing.
The logic seems sound: "We want realistic test data." The execution is catastrophic: production cardholder data copied to less secure environments.
Real story: A payment gateway provider I assessed had three separate environments where developers had copied production data:
Local development machines (12 developer laptops)
Shared test server (no access controls)
Demo environment for sales (accessible from the public internet)
They thought they were protecting two databases. They were actually protecting seventeen.
Where to look:
Development databases
QA/testing environments
User acceptance testing (UAT) systems
Developer workstations
Docker containers
Virtual machine images
Demo systems for sales/training
4. Email Systems and Mailboxes
People email payment information. They shouldn't, but they do.
I've found cardholder data in emails in every assessment I've ever conducted. Customer service sends card details to accounting. Sales emails failed transaction information to IT. Customers email their card numbers when they can't complete a web form.
Where to look:
Exchange/Office 365 mailboxes
Gmail/Google Workspace
Archived email systems
PST files on user computers
Email server logs
Helpdesk/ticketing systems
Shared mailboxes
Shocking discovery: At a hotel chain, I found that the night audit process involved the desk clerk emailing daily transaction reports—including full card numbers—to the accounting department. Every single day. For four years. Over 500,000 credit card numbers in email.
5. Customer Service and Ticketing Systems
Support agents deal with payment issues, and they often document everything.
I assessed a SaaS company where customer service agents were pasting entire card details into Zendesk tickets to troubleshoot payment failures. They had nearly 8,000 tickets containing credit card numbers, accessible to 47 different employees.
Where to look:
Zendesk, Freshdesk, ServiceNow tickets
CRM systems (Salesforce, HubSpot)
Internal chat systems (Slack, Teams, Discord)
Knowledge base articles
Training documentation
Screen recordings and screenshots
6. File Shares and Document Management
The wild west of unstructured data.
Where to look:
Windows file shares
SharePoint sites
Google Drive/Dropbox folders
OneDrive for Business
Document management systems
Scanned documents
PDF receipts and invoices
Excel spreadsheets
Word documents
Nightmare scenario: I found a "Payment Issues" folder on a shared drive with over 2,300 Excel spreadsheets containing customer payment information, including full PANs, going back seven years. No encryption. No access controls. Anyone in the company could access it.
7. Third-Party and Cloud Services
Data you've sent to vendors is still your responsibility under PCI DSS.
Where to look:
CRM systems
Marketing automation platforms
Analytics platforms (Google Analytics, Mixpanel)
Chat systems (Intercom, Drift)
Payment facilitators
Subscription management systems
Fraud prevention services
Building Your Data Discovery Strategy
Now that you know where to look, let's talk about how to actually find this stuff.
Phase 1: Documentation Review (Week 1)
Start with what you think you know.
Action items:
Document all systems that should handle cardholder data
Review data flow diagrams (create them if they don't exist)
Interview process owners in every department
Review payment workflows end-to-end
Document all third-party payment services
Pro tip: Don't trust existing documentation. I've never found documentation that was 100% accurate. It's a starting point, not the truth.
Phase 2: Network Discovery (Week 2)
Map what's actually connected to what.
I use a combination of tools:
Tool Type | Purpose | Examples |
|---|---|---|
Network mapping | Discover all connected systems | Nmap, Nessus, Qualys |
Traffic analysis | See what communicates with payment systems | Wireshark, tcpdump, NetFlow |
Asset management | Inventory all systems | Lansweeper, ServiceNow CMDB |
Real technique I use: Set up network traffic capture on your payment system for 30 days. Watch where data flows. You'll be shocked at what connects to your payment systems that nobody documented.
Phase 3: Automated Data Discovery (Weeks 3-4)
Time to hunt for actual cardholder data.
The tools I recommend:
Tool Category | Purpose | Cost Range |
|---|---|---|
Data Discovery Tools | Scan for PAN patterns | $5K-$50K/year |
DLP Solutions | Continuous monitoring | $10K-$100K/year |
Custom Scripts | Targeted searches | Free (time investment) |
Tools I've used successfully:
Ground Labs Card Recon: Best overall data discovery tool I've used
Spirion (formerly Identity Finder): Great for file shares and endpoints
GTB Technologies: Good for large enterprise environments
Varonis: Excellent for unstructured data in file shares
BigID: Strong for cloud environments
Phase 4: Manual Investigation (Weeks 5-6)
Automated tools find maybe 80% of data. The last 20% requires human intelligence.
My manual discovery checklist:
Databases:
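-- List text columns wide enough to hold a PAN (candidate columns to search)
-- This is a simplified example - adjust for your database
-- (for example, PostgreSQL reports 'character varying' and leaves the length NULL for text)
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE data_type IN ('varchar', 'char', 'text', 'character varying', 'character')
  AND (character_maximum_length >= 13 OR character_maximum_length IS NULL);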
Then search those columns for number patterns matching credit card formats.
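As a rough illustration of that second step, the Python sketch below walks every text column in a database and counts rows containing PAN-like digit runs. It uses an in-memory SQLite database purely so the example is self-contained; the table, column names, and sample rows are hypothetical, and on another database engine you would swap the catalog queries for the `information_schema` query above.

```python
import re
import sqlite3

# 13-19 digits, optionally separated by single spaces or dashes (candidates only).
PAN_CANDIDATE = re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def scan_text_columns(conn: sqlite3.Connection) -> list[tuple[str, str, int]]:
    """Return (table, column, hit_count) for text columns holding PAN-like values."""
    findings = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'").fetchall()]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for column_info in columns:
            column, col_type = column_info[1], column_info[2]
            if not any(t in col_type.upper() for t in ("CHAR", "TEXT", "CLOB")):
                continue  # skip numeric and binary columns in this sketch
            rows = conn.execute(f'SELECT "{column}" FROM "{table}"').fetchall()
            hits = sum(1 for (value,) in rows
                       if isinstance(value, str) and PAN_CANDIDATE.search(value))
            if hits:
                findings.append((table, column, hits))
    return findings


# Hypothetical demo data using a well-known test card number.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expense_reports (id INTEGER, employee TEXT, notes VARCHAR(200))")
conn.executemany("INSERT INTO expense_reports VALUES (?, ?, ?)", [
    (1, "A. Example", "Paid with card 4111 1111 1111 1111, exp 12/25"),
    (2, "B. Example", "Mileage claim, no card used"),
])
findings = scan_text_columns(conn)
print(findings)
```

On a large production table you would sample or page through rows rather than fetch everything, but the shape of the search is the same.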
Log files:
# Search for potential 16-digit PANs (first digit 2-6 covers the major brands,
# including Mastercard's 2221-2720 range)
grep -r -E '\b[2-6][0-9]{15}\b' /var/log/
File systems:
# Find files modified in last 2 years that might contain payment data
find /path/to/search -type f -mtime -730 \
  -exec grep -l -E '\b[2-6][0-9]{15}\b' {} \;
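When grep isn't available, or when card numbers are written with spaces or dashes, the same file system sweep can be sketched in Python. This is a minimal example under stated assumptions: the function name is mine, the demo files are hypothetical, and it treats every file as bytes and scans line by line, so it will miss PANs inside compressed or binary formats.

```python
import os
import re
import tempfile

# Bytes pattern: 13-19 ASCII digits, optionally separated by single spaces or dashes.
PAN_CANDIDATE = re.compile(rb"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def files_with_pan_candidates(root: str) -> list[str]:
    """Return sorted paths under root that contain at least one PAN-like string."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    if any(PAN_CANDIDATE.search(line) for line in handle):
                        hits.append(path)
            except OSError:
                # Unreadable file: a real scan should record this, not ignore it.
                continue
    return sorted(hits)


# Hypothetical demo: two small files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "expenses.csv"), "w") as f:
        f.write("date,card,amount\n2023-10-10,4111 1111 1111 1111,42.50\n")
    with open(os.path.join(root, "readme.txt"), "w") as f:
        f.write("No payment data here, order 12345.\n")
    found_names = [os.path.basename(p) for p in files_with_pan_candidates(root)]
print(found_names)
```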
Phase 5: Validation and Documentation (Week 7)
Not every 16-digit number is a credit card number.
I learned this the hard way when my scanner flagged 47,000 "credit card numbers" that turned out to be:
ISBN numbers for books
Tracking numbers
Customer account IDs
Random 16-digit strings
Validation techniques:
Luhn Algorithm Check: All valid credit card numbers pass the Luhn checksum
BIN Range Validation: The first 6 to 8 digits identify the card issuer
Context Analysis: Is it stored with cardholder name, expiration date?
Format Verification: Check for proper spacing, separators
Key insight: A 16-digit number that passes Luhn, starts with 4 (Visa) or 5 (Mastercard), and is stored alongside a name and expiration date? That's almost certainly real cardholder data.
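The Luhn check is simple enough to script yourself. Here is a minimal Python sketch (the function name is mine) that accepts a candidate with optional spaces or dashes and applies the standard checksum. Keep in mind that roughly one in ten random digit strings will pass, so Luhn narrows the list of findings but does not confirm them.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits pass the Luhn checksum and the length is 13-19."""
    stripped = number.replace(" ", "").replace("-", "")
    if not (stripped.isascii() and stripped.isdigit()):
        return False
    if not 13 <= len(stripped) <= 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit and subtracting 9
    # whenever the doubled value exceeds 9.
    for index, ch in enumerate(reversed(stripped)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Well-known test card number, then the same number with the last digit changed.
print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111111111111112"))
```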
The Discovery Tools Comparison: What Actually Works
After using dozens of data discovery tools, here's my honest assessment:
Tool | Best For | Limitations | My Rating |
|---|---|---|---|
Ground Labs Card Recon | Comprehensive discovery across all systems | Expensive, complex setup | ⭐⭐⭐⭐⭐ |
Spirion | File shares, endpoints, email | Less effective for databases | ⭐⭐⭐⭐ |
GTB Technologies | Large enterprises, continuous monitoring | Overkill for small orgs | ⭐⭐⭐⭐ |
Varonis | Unstructured data in file shares | Limited database scanning | ⭐⭐⭐⭐ |
Custom Scripts | Targeted, specific searches | Requires technical expertise | ⭐⭐⭐ |
Manual Grep/Regex | Small-scale investigations | Not scalable | ⭐⭐⭐ |
"The best data discovery tool is the one that finds data nobody knew existed. The second-best tool is the one your team will actually use consistently."
Common Discovery Pitfalls (And How I've Learned to Avoid Them)
Pitfall #1: Trusting Your Initial Scope
A retail company told me their cardholder data was "only in the payment terminal and processor." I found it in 14 locations, including:
Email server (complaint forwards from customers)
HR system (corporate card data)
Accounting system (vendor payments)
Sales CRM (manual order entry for phone sales)
Lesson: Assume your initial scope is wrong. It usually is.
Pitfall #2: Ignoring "Old" Systems
"We don't use that server anymore."
Famous last words before I find three years of cardholder data on a "decommissioned" system that's still powered on and connected to the network.
Real example: I found 340,000 credit card numbers on a Windows Server 2003 machine that had been "retired" in 2015. It was still running, still backing up, and still accessible from the network. Nobody had checked in five years.
Lesson: If it's powered on and has a network connection, it's in scope until proven otherwise.
Pitfall #3: Overlooking Cloud Services
I assessed an organization that had "no cloud infrastructure." Then I found:
Salesforce with payment data in custom fields
Google Analytics tracking payment page URLs (with card numbers in the URL)
Intercom chat logs with customer service payment discussions
Zapier workflows moving payment data between systems
AWS S3 bucket with database dumps
They weren't lying—their IT department didn't use cloud. But marketing, sales, and customer service sure did.
Lesson: Survey every department, not just IT. Shadow IT is everywhere.
Pitfall #4: Assuming Encrypted = Safe to Ignore
Encryption doesn't remove data from PCI scope. It reduces requirements, but you still need to track and protect encrypted cardholder data.
I've seen organizations encrypt their database, think they're done, and completely ignore:
Encryption key storage (often less secure than the data)
Application memory (decrypted data in RAM)
Log files (often logging before encryption)
Backup systems (encryption often not applied)
Lesson: Encrypted data is still cardholder data. Track it, protect it, minimize it.
Building a Continuous Discovery Program
Here's a truth that'll save you immense headaches: data discovery isn't a one-time project. It's an ongoing process.
I've watched organizations invest heavily in comprehensive data discovery, find everything, document everything, and think they're done. Eighteen months later at their next assessment, there are six new data stores nobody knew about.
My Recommended Continuous Discovery Approach
Frequency | Activities | Owner |
|---|---|---|
Monthly | Automated scans of CDE systems<br>Review new system additions<br>Spot checks of high-risk areas | Security Team |
Quarterly | Full network-wide discovery scan<br>Update data flow diagrams<br>Interview process owners<br>Test tool effectiveness | Compliance Manager |
Annually | Comprehensive manual + automated discovery<br>Third-party assessment<br>Update procedures<br>Team training | CISO |
After Changes | New application deployment<br>System migrations<br>Process changes<br>Vendor changes | Change Owner |
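One cheap way to make those recurring scans useful is to compare each run against the previous one, so new data stores stand out immediately. The sketch below assumes each scan has already been summarized as a mapping from location to number of PAN candidates found; the function name and the locations are hypothetical.

```python
def compare_scans(previous: dict[str, int], current: dict[str, int]) -> dict[str, list[str]]:
    """Summarize how the set of locations with findings changed between two scans."""
    return {
        # Locations that had no findings last time but do now.
        "new": sorted(loc for loc in current if loc not in previous),
        # Locations that had findings last time and none now.
        "cleared": sorted(loc for loc in previous if loc not in current),
        # Known locations where the number of findings increased.
        "grew": sorted(loc for loc in current
                       if loc in previous and current[loc] > previous[loc]),
    }


# Hypothetical scan summaries: location -> count of PAN candidates.
last_month = {"/var/log/app": 12, "crm.notes": 3}
this_month = {"/var/log/app": 40, "s3://hypothetical-bucket/dumps": 900}
changes = compare_scans(last_month, this_month)
print(changes)
```

Anything in the "new" list is exactly the kind of surprise you want to find yourself, well before your assessor does.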
The Documentation That Saves Your Assessment
Your assessor needs to see that you actually know where your data lives. Here's what I look for:
Data Flow Diagrams
Show me:
Every system that touches cardholder data
How data flows between systems
Where data is stored (even temporarily)
Network boundaries and security controls
Third-party connections
Pro tip: Use tools like Lucidchart or Draw.io. Update them every time you discover new data locations or system changes.
Cardholder Data Inventory
System Name | Data Elements Stored | Location | Encryption Status | Business Owner | Discovery Date |
|---|---|---|---|---|---|
Production DB | PAN, Name, Exp Date | On-prem datacenter | AES-256 at rest | IT Director | 2024-01-15 |
Payment Gateway | PAN (tokenized) | Cloud - AWS | Vendor-managed | CFO | 2024-01-15 |
Backup Server | Full DB backups | On-prem datacenter | AES-256 | IT Director | 2024-02-03 |
Email Archive | Historical customer communications | Cloud - Microsoft 365 | Microsoft-managed | IT Director | 2024-02-10 |
Discovery Tool Reports
Keep evidence that you actually ran discovery scans:
Tool outputs and reports
False positive analysis
Remediation tracking for findings
Scan schedules and completion proof
What To Do When You Find Cardholder Data
Finding data is step one. Dealing with it is step two.
Decision Framework
For every instance of cardholder data you find, ask:
1. Do we have a business need to store this?
YES: Implement PCI controls → Continue to #2
NO: Securely delete → Document deletion
2. Can we reduce the data stored?
Store only what's absolutely necessary
Truncate/mask wherever possible
Implement data retention policies
3. Can we remove it from our environment entirely?
Use tokenization services
Point-to-point encryption (P2PE)
Let payment processor store it
4. If we must store it, how do we secure it?
Encryption at rest
Encryption in transit
Access controls
Monitoring and logging
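For the truncate/mask option in step 2, a small helper is often all you need at display time. The Python sketch below is an illustration only: the function name and the first-6/last-4 defaults are my assumptions, and the number of digits you are allowed to keep depends on PAN length and card brand rules, so confirm the format with your acquirer or QSA before relying on it.

```python
def mask_pan(pan: str, keep_first: int = 6, keep_last: int = 4) -> str:
    """Return the PAN with its middle digits replaced by asterisks."""
    digits = "".join(ch for ch in pan if ch in "0123456789")
    hidden = len(digits) - keep_first - keep_last
    if hidden < 1:
        # Refuse to "mask" something that would be shown in full.
        raise ValueError("value is too short to mask with these settings")
    return digits[:keep_first] + "*" * hidden + digits[-keep_last:]


# Well-known test card number.
print(mask_pan("4111 1111 1111 1111"))
print(mask_pan("4111111111111111", keep_first=0))
```

Masking only changes what is displayed. If the full PAN is still stored underneath, that storage remains in scope.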
The Deletion Process
When you find cardholder data that shouldn't exist:
Don't just hit delete.
I've seen organizations create bigger problems by hastily deleting data:
Deleted from production but forgotten in backups
Deleted files recoverable with forensic tools
Deletion violated data retention requirements
Deletion broke application functionality
My deletion checklist:
Document: Screenshot/log what you found and where
Verify: Confirm it's real cardholder data, not a false positive
Assess: Check if any legitimate business need exists
Get approval: Business owner must approve deletion
Delete from all locations: Production, backups, archives, logs
Secure deletion: Use secure file deletion tools (not just "delete")
Verify deletion: Scan again to confirm removal
Document completion: Record what was deleted and when
Review process: Why was the data there? How do we prevent recurrence?
The Remediation Tracker
Track every finding through resolution:
Finding ID | Location | Data Type | Risk Level | Remediation Plan | Owner | Due Date | Status |
|---|---|---|---|---|---|---|---|
CD-001 | Email server | 847 PANs in archived emails | Critical | Secure delete + email policy update | IT Manager | 2024-03-15 | In Progress |
CD-002 | Dev environment | Test database with real PANs | High | Data sanitization + process change | Dev Lead | 2024-03-30 | Not Started |
CD-003 | Log files | PANs in Apache error logs | Critical | Log scrubbing + application fix | App Team | 2024-03-10 | Complete |
CD-004 | File share | Excel files with payment data | Medium | Delete files + user training | Finance Mgr | 2024-04-05 | Not Started |
Real-World Discovery Success Story
Let me share a success story that demonstrates the power of thorough data discovery.
In 2021, I worked with a hospitality company—120 hotel properties processing about $400 million in card transactions annually. They'd failed their previous PCI assessment due to unclear scope.
We implemented a comprehensive discovery program:
Phase 1: Initial Discovery (Week 1-4)
Automated scanning found cardholder data in 47 locations
Manual investigation revealed 12 additional locations
Total: 59 data stores nobody had fully documented
Phase 2: Analysis (Week 5-6)
23 locations: legitimate business need, properly secured
29 locations: no business justification, scheduled for deletion
7 locations: business need existed, but data could be reduced
Phase 3: Remediation (Week 7-16)
Securely deleted data from 29 locations (4.7 million PANs)
Implemented tokenization for 7 systems (eliminated actual PAN storage)
Enhanced controls on remaining 23 systems
Reduced PCI scope by 67%
Results:
Passed PCI assessment with zero findings
Reduced annual compliance costs by $240,000
Cut breach risk exposure by an estimated 70%
Simplified operations for IT and security teams
The CFO told me: "We were treating PCI like a checkbox. The discovery process showed us we had massive risk exposure we didn't even know about. You didn't just help us comply—you prevented a disaster we didn't know was coming."
Advanced Discovery Techniques for Complex Environments
For larger, more complex organizations, basic scanning isn't enough.
Database Deep-Dive Techniques
Most scanners look for structured data in obvious places. But databases hide data in surprising ways:
Encrypted fields: Scan for fields that might contain encrypted cardholder data, then trace encryption key locations
JSON/XML columns: Modern databases store semi-structured data in JSON fields. Traditional scanners miss this.
Binary objects: PANs can be embedded in PDFs, images, or other binary objects stored in blob fields
Audit tables: Change tracking tables often duplicate cardholder data
Temp tables and staging: ETL processes create temporary copies
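For the JSON case in particular, a short recursive walk is usually enough to show where a scanner is blind. This Python sketch parses a JSON document and reports the path of every value that contains a PAN-like digit run; the function name, the path notation, and the sample document are hypothetical.

```python
import json
import re

# 13-19 digits, optionally separated by single spaces or dashes (candidates only).
PAN_CANDIDATE = re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def json_paths_with_pan_candidates(document, path: str = "$") -> list[str]:
    """Return paths (for example $.order.payment.card) of values that look like PANs."""
    hits = []
    if isinstance(document, dict):
        for key, value in document.items():
            hits.extend(json_paths_with_pan_candidates(value, f"{path}.{key}"))
    elif isinstance(document, list):
        for index, value in enumerate(document):
            hits.extend(json_paths_with_pan_candidates(value, f"{path}[{index}]"))
    elif isinstance(document, (str, int)) and not isinstance(document, bool):
        # Integers are checked too, since PANs are sometimes stored as numbers.
        if PAN_CANDIDATE.search(str(document)):
            hits.append(path)
    return hits


# Hypothetical JSON column value using well-known test card numbers.
raw = ('{"order": {"id": 1182, "payment": {"card": "4111-1111-1111-1111", '
       '"exp": "12/25"}}, "notes": ["called twice", "alt card 5555555555554444"]}')
paths = json_paths_with_pan_candidates(json.loads(raw))
print(paths)
```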
Application Memory Analysis
Data exists in application memory even if not persisted to disk.
I use memory forensics tools to capture and analyze application memory dumps:
Identify if applications hold cardholder data in memory longer than necessary
Check if data is properly scrubbed from memory after use
Verify encryption keys aren't exposed in memory
Network Traffic Analysis
Sometimes the only way to find data flows is to watch network traffic:
My technique:
Mirror traffic from critical segments
Capture 30 days of network flows
Analyze for PAN patterns in cleartext
Identify previously undocumented systems communicating with payment infrastructure
Real discovery: Found that a "reporting only" system was receiving full PANs over the network, even though nobody thought it stored cardholder data. It did—in cache, logs, and temp files.
Tools and Scripts for DIY Discovery
If you can't afford commercial tools, you can build effective discovery capabilities with open-source tools and scripts.
Card Brand BIN Ranges Reference
Understanding card number patterns helps validate findings:
Card Brand | Starting Digits | Length | Example Pattern |
|---|---|---|---|
Visa | 4 | 13, 16, or 19 | 4xxx xxxx xxxx xxxx |
Mastercard | 51-55, 2221-2720 | 16 | 5xxx xxxx xxxx xxxx |
American Express | 34, 37 | 15 | 3xxx xxxxxx xxxxx |
Discover | 6011, 622126-622925, 644-649, 65 | 16 | 6xxx xxxx xxxx xxxx |
JCB | 3528-3589 | 16 | 35xx xxxx xxxx xxxx |
Diners Club | 36, 38, 300-305 | 14 | 3xxx xxxx xxxx xx |
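The table translates almost directly into a lookup function. The Python sketch below encodes only the prefixes and lengths listed above (real BIN assignments are more detailed and change over time), and the function names are mine. Treat a match as supporting evidence for a finding, not proof.

```python
from typing import Optional


def _prefix_in(digits: str, length: int, low: int, high: int) -> bool:
    """True if the first `length` digits fall within the inclusive range."""
    return len(digits) >= length and low <= int(digits[:length]) <= high


def card_brand(pan: str) -> Optional[str]:
    """Guess the card brand from the prefix and length rules in the table above."""
    digits = pan.replace(" ", "").replace("-", "")
    if not (digits.isascii() and digits.isdigit()):
        return None
    n = len(digits)
    if digits.startswith("4") and n in (13, 16, 19):
        return "Visa"
    if n == 16 and (_prefix_in(digits, 2, 51, 55)
                    or _prefix_in(digits, 4, 2221, 2720)):
        return "Mastercard"
    if n == 15 and digits[:2] in ("34", "37"):
        return "American Express"
    if n == 16 and (digits.startswith(("6011", "65"))
                    or _prefix_in(digits, 6, 622126, 622925)
                    or _prefix_in(digits, 3, 644, 649)):
        return "Discover"
    if n == 16 and _prefix_in(digits, 4, 3528, 3589):
        return "JCB"
    if n == 14 and (digits[:2] in ("36", "38") or _prefix_in(digits, 3, 300, 305)):
        return "Diners Club"
    return None


# Well-known test card numbers.
for sample in ("4111 1111 1111 1111", "5555555555554444", "378282246310005"):
    print(sample, "->", card_brand(sample))
```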
Free and Open-Source Tools
Tool | Purpose | Best Use Case |
|---|---|---|
grep/egrep | Basic text search | Quick log file analysis |
ripgrep | Fast file searching | Large directory scans |
bulk_extractor | Forensic data extraction | Comprehensive file analysis |
Nmap | Network discovery | Mapping payment systems |
Wireshark | Network traffic analysis | Identifying data flows |
The Compliance Assessor's Perspective
Let me pull back the curtain and share what I look for as a QSA (Qualified Security Assessor).
Red Flags That Fail Assessments
Red Flag | Why It Fails | How Often I See It |
|---|---|---|
"We don't know" responses | Can't prove compliance for unknown systems | 60% of failed assessments |
No recent discovery evidence | Stale data = unknown current state | 45% of failed assessments |
Undocumented system connections | Scope gaps create vulnerabilities | 40% of failed assessments |
Surprised reactions | Indicates poor control environment | 35% of failed assessments |
Incomplete remediation | Partial fixes don't count | 30% of failed assessments |
What Impresses Assessors
1. Continuous discovery evidence: Show me monthly scan reports, quarterly reviews, documented processes.
2. Proactive findings and remediation: "We found this during our quarterly scan and here's how we fixed it" is music to my ears.
3. Clear data flow diagrams: Updated within the last 90 days, showing all systems and connections.
4. Strong change management: New systems can't be deployed without a data discovery scan first.
5. Training and awareness: Everyone understands why cardholder data discovery matters and their role in it.
"The best PCI assessments are boring. Everything is documented, controls are in place, discovery is ongoing, and there are no surprises. Make your assessment boring."
Building a Culture of Data Awareness
Technical tools are important, but culture is critical.
Training Your Team
Everyone who might encounter cardholder data needs to understand:
What cardholder data looks like:
PAN formats and variations
Associated data elements
Masked vs. unmasked data
Why it matters:
Business impact of breaches
Personal liability (yes, individuals can be held responsible)
PCI DSS requirements and penalties
What to do when they find it:
Report immediately to security team
Don't attempt to handle it themselves
Document where found and circumstances
Process Changes I've Implemented Successfully
1. Pre-deployment data scanning: New systems must be scanned for cardholder data before production deployment.
2. Quarterly data discovery reviews: Security team presents findings to management quarterly.
3. Employee reporting program: Reward (don't punish) employees who report finding cardholder data in unexpected places.
4. Developer data sanitization standards: All test data must be generated or properly sanitized, never copied from production.
5. Vendor data questionnaire: Any new vendor must complete a questionnaire about how they'll handle cardholder data.
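For the "generated, never copied" standard, here is one possible sketch of a synthetic test PAN generator in Python. The function names and the default `411111` prefix are assumptions for illustration. Because the output is Luhn-valid, a randomly generated number could in principle coincide with a real account, so many teams prefer the test card numbers published by their payment processor and use a generator like this only where those don't provide enough variety.

```python
import random
from typing import Optional


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a digit string that does not yet have one."""
    total = 0
    # The rightmost digit of `partial` sits next to the check digit, so it is doubled.
    for index, ch in enumerate(reversed(partial)):
        digit = int(ch)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def synthetic_pan(prefix: str = "411111", length: int = 16,
                  rng: Optional[random.Random] = None) -> str:
    """Build a Luhn-valid, randomly filled number for test data (not a real card)."""
    rng = rng or random.Random()
    body_length = length - len(prefix) - 1
    if body_length < 0:
        raise ValueError("prefix is too long for the requested length")
    body = "".join(str(rng.randrange(10)) for _ in range(body_length))
    partial = prefix + body
    return partial + luhn_check_digit(partial)


# A seeded generator keeps test fixtures reproducible.
sample = synthetic_pan(rng=random.Random(2024))
print(len(sample), sample[:6])
```

Tag or document generated values wherever you can: a discovery scan will flag them just like real PANs, and you want to be able to explain those hits quickly.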
Your 90-Day Data Discovery Roadmap
Let me give you a practical plan to implement comprehensive data discovery:
Timeline | Phase | Key Activities | Deliverables |
|---|---|---|---|
Days 1-7 | Foundation | Assemble team<br>Review documentation<br>Define scope<br>Select tools | Project charter<br>Team roster<br>Tool selection |
Days 8-30 | Initial Discovery | Automated scanning<br>Network analysis<br>Manual investigation<br>Interviews | Initial findings report<br>Data inventory draft |
Days 31-60 | Analysis | Validate findings<br>Prioritize risks<br>Develop plans<br>Get approvals | Remediation roadmap<br>Risk assessment<br>Budget approval |
Days 61-90 | Remediation | Execute plans<br>Implement monitoring<br>Train team<br>Document | Updated inventory<br>Process documentation<br>Training records |
Day 91+ | Continuous | Monthly scans<br>Quarterly reviews<br>Ongoing training<br>Updates | Monthly reports<br>Quarterly assessments<br>Annual audit readiness |
Final Thoughts: The Discovery Mindset
After 15+ years and hundreds of assessments, here's what I've learned:
Data discovery isn't a technical problem—it's a business problem that requires technical solutions.
The organizations that excel at data discovery share common traits:
Executive sponsorship and support
Cross-functional collaboration
Continuous improvement mindset
Investment in tools and training
Culture of security awareness
The organizations that struggle treat it as a checkbox exercise. They run a scan once, document what they find, and move on. Two years later, they're shocked to discover cardholder data everywhere.
"You can't protect what you don't know you have. And what you don't know you have is exactly what attackers will find first."
I started this article with the story of finding credit card numbers in an HR system. I want to end with a different story—one with a better outcome.
A retail organization I worked with built comprehensive data discovery into their DNA. Every new system gets scanned. Every quarter they review their entire environment. When employees find unexpected data, they report it immediately.
Last year, an employee noticed customer payment data in a supplier invoice system—a system that wasn't supposed to handle cardholder data at all. She reported it. Within 24 hours, the security team had identified the issue (a misconfigured integration), remediated it, and prevented what could have been a massive scope expansion for their PCI assessment.
The employee got recognized in a company meeting. The CISO told me: "That's the culture we wanted. Security isn't just the security team's job—it's everyone's responsibility. And it started with teaching people what cardholder data looks like and empowering them to speak up when they find it."
That's the goal: an organization where everyone is a sensor, where data discovery is continuous, and where protecting cardholder data is instinctive rather than imposed.
Your breach won't come from the systems you know about. It'll come from the cardholder data you didn't know existed, stored in a place you didn't know to protect.
Find it first. Protect it properly. Or lose it painfully.
The choice is yours.