data-testid is not a code smell - a nuanced reply to Testing Library
Kent C. Dodds and the Testing Library community argue data-testid is an a11y smell. They have a point. They also miss one. Here's why both camps are right, and how to combine them in practice.
If you've spent any time in the testing community over the last few years, you've seen the argument: data-testid is a code smell. Don't use it. Use getByRole, getByLabelText, and other semantic queries. Add a data-testid only as a last resort, and feel slightly bad about it.
The most articulate version of this argument comes from Kent C. Dodds and the Testing Library team. A more pointed version comes from TkDodo, who frames test IDs as an accessibility smell specifically.
I want to make two claims in this article:
- They're right. Their core argument is sound and most teams should listen to it.
- They're missing one thing. There's a category of element where role-based queries genuinely don't work, and the answer there isn't "fix the UI" but "have a stable handle for tests."
The result is a pragmatic middle ground that I think most teams should adopt.
The Testing Library argument, steel-manned
Let me state their case as strongly as possible, because I want to argue against the strong version.
The argument goes like this. When you write a test, you're encoding what your users care about. Users don't care about data-testid. They care about clicking on a button labeled "Save", filling in a field labeled "Email", reading a heading that says "Welcome".
If your test queries by data-testid, you're testing the implementation, not the experience. Worse: if your test can find an element via data-testid but a screen reader can't find that same element via its accessible name, you've shipped an accessibility bug and the test didn't catch it. The data-testid lets the test pass while real users (those using assistive tech, those reading the UI without prior context) get blocked.
Hence: a data-testid is often the symptom of an inaccessible UI. Adding the data-testid lets you ship the inaccessibility unchallenged. Hence: it's a smell. The right fix isn't to add the data-testid, it's to make the element accessible (proper <label>, aria-label, role, etc.), which makes it both queryable by Testing Library and usable by everyone.
This is a strong argument. I genuinely agree with most of it.
Where the argument breaks down
The argument assumes that for every UI element your QA team needs to interact with, there exists a sensible accessible name that the dev team can give it.
This is true for buttons with visible text. It's true for inputs with labels. It's true for headings and links and anything that already has semantic markup.
It's not true for a non-trivial slice of real UIs. Examples:
Multiple identical buttons
Your dashboard lists user rows, each with a "Delete" button. They're all <button>Delete</button>. They're all accessible. A screen reader user navigates by row context: "row containing Marie Dupont, Delete button". They have everything they need.
But your test? getByRole('button', { name: 'Delete' }) matches 50 elements, so it throws instead of returning one (getAllByRole would hand you all 50). You need to disambiguate. Testing Library's recommended escape hatch is within(row).getByRole('button', { name: 'Delete' }), where you first scope to a parent element. Which means you now need a stable way to identify the row.
If the row has an ID like user-row-marie-dupont, you can use it. If it doesn't, you're back to nth-child. Or to adding a data-testid on the row. Which the Testing Library philosophy considers a smell.
The thing is: there's no accessibility fix here. The UI is accessible. Multiple identical buttons in distinguishable contexts is a perfectly valid pattern. Disambiguating them on the test side requires a stable, contractual handle on the row.
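To make the failure mode concrete, here is a minimal sketch in TypeScript. It uses a toy in-memory model rather than the real Testing Library API: the El type, the within helper, and the user-row-* and delete-* IDs are all assumptions made for illustration. It mimics only two behaviours: a getByRole-style query that throws when it matches more than one element, and scoping a query to one subtree.

```typescript
// Toy in-memory model, NOT the real Testing Library API.
interface El {
  role?: string;
  name?: string;   // accessible name
  testId?: string; // value of data-testid
  children?: El[];
}

// Depth-first list of all descendants of a node (the node itself excluded).
function descendants(root: El): El[] {
  const out: El[] = [];
  for (const child of root.children ?? []) {
    out.push(child, ...descendants(child));
  }
  return out;
}

// Returns queries scoped to the subtree under `root`.
function within(root: El) {
  const all = descendants(root);
  // Like a getBy* query: exactly one match, otherwise throw.
  const one = (matches: El[], what: string): El => {
    if (matches.length !== 1) {
      throw new Error(`Expected 1 element for ${what}, found ${matches.length}`);
    }
    return matches[0];
  };
  return {
    getByRole: (role: string, opts: { name: string }): El =>
      one(all.filter((e) => e.role === role && e.name === opts.name), `role=${role}`),
    getByTestId: (id: string): El =>
      one(all.filter((e) => e.testId === id), `testId=${id}`),
  };
}

// Hypothetical dashboard: every row has a data-testid and a "Delete" button.
const users = ["marie-dupont", "jean-martin", "ana-silva"];
const table: El = {
  role: "table",
  children: users.map((u) => ({
    role: "row",
    testId: `user-row-${u}`,
    children: [{ role: "button", name: "Delete", testId: `delete-${u}` }],
  })),
};

// The unscoped query is ambiguous and throws...
let ambiguous = false;
try {
  within(table).getByRole("button", { name: "Delete" });
} catch {
  ambiguous = true;
}

// ...while the query scoped to one row finds exactly one button.
const row = within(table).getByTestId("user-row-marie-dupont");
const deleteButton = within(row).getByRole("button", { name: "Delete" });
```

In a real Testing Library suite the same shape would read roughly as within(screen.getByTestId('user-row-marie-dupont')).getByRole('button', { name: 'Delete' }): the role query still does the clicking, and the test ID only anchors the scope.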
Icon-only buttons with aria-label
Common pattern: a row of icon buttons (close, expand, settings). Each has aria-label="Close", aria-label="Expand row", aria-label="Open settings". Accessibility-wise, perfect.
Now your QA team uses Playwright's getByRole('button', { name: 'Close' }). It works.
Six months later, the design system team renames aria-label="Close" to aria-label="Dismiss" because UX research shows it's clearer. Every test that targets this button by name breaks. The dev who made the change had no idea this aria-label was load-bearing for the test suite.
Notice what happened: the team did everything right from an accessibility standpoint. The team also did everything right from a Testing Library standpoint. And yet the contract between dev and QA was implicit, brittle, and broke silently.
A data-testid="close-button" would have survived this rename. Not because data-testid is morally superior, but because it's purpose-built for the test contract. Renaming it requires opening the test file. Renaming an aria-label doesn't.
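The rename scenario can be reduced to a few lines. This is a sketch only: ButtonSnapshot is a hypothetical stand-in for a rendered button, not a DOM node, and the two predicates stand in for a name-based locator and a test-ID locator.

```typescript
// Simplified stand-in for a rendered button; not a real DOM node.
interface ButtonSnapshot {
  ariaLabel: string;
  testId: string;
}

// Two locator strategies, each expressed as a predicate over the snapshot.
const byAccessibleName = (name: string) => (b: ButtonSnapshot) => b.ariaLabel === name;
const byTestId = (id: string) => (b: ButtonSnapshot) => b.testId === id;

// The same button before and after the design-system rename.
const beforeRename: ButtonSnapshot[] = [{ ariaLabel: "Close", testId: "close-button" }];
const afterRename: ButtonSnapshot[] = [{ ariaLabel: "Dismiss", testId: "close-button" }];

const nameLocator = byAccessibleName("Close");
const idLocator = byTestId("close-button");

// Number of elements each locator matches in each version of the UI.
const matchCounts = {
  nameBefore: beforeRename.filter(nameLocator).length,
  nameAfter: afterRename.filter(nameLocator).length, // the rename empties this match
  idBefore: beforeRename.filter(idLocator).length,
  idAfter: afterRename.filter(idLocator).length,
};
```

The name-based locator stops matching after the rename while the test-ID locator keeps matching, which is the whole point: the test ID changes only when someone deliberately changes the test contract.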
Inputs without visible labels
A search bar with placeholder text "Search..." and no <label>. Accessibility-correct? Probably not, but it's a real pattern that ships in production every day. Testing Library would say: add a proper label first.
Sure. But sometimes the design fights back. The placeholder is the visual label by design. Adding a visible <label> would change the UX. Adding a hidden <label> (visually hidden but readable by screen readers) is the right a11y fix, but the QA team has to wait for the design and front-end teams to land it.
Meanwhile, the test needs to run today. data-testid="search-input" unblocks the test, and the a11y improvement can ship later as a separate change.
The pragmatic synthesis
Here's the position I think most teams should adopt:
1. Default to role-based queries. Try getByRole, getByLabelText, getByPlaceholderText, getByText first. They're more aligned with user experience and they push the team toward accessibility. This is the Testing Library philosophy and it's correct.
2. When role-based queries are ambiguous or fragile, prefer fixing the markup. If you can't disambiguate two buttons, give the parent row a stable id or use a sub-tree scope. If a button has no accessible name, add one. The Testing Library team is right that this should be the first reflex.
3. Use data-testid as the explicit test contract for the cases where (1) and (2) don't apply. Multiple identical buttons in similar contexts. Highly dynamic content where the aria-label is owned by the design system team and changes for UX reasons. Elements where the visible text is the brand-controlled CTA copy ("Sign up now" today, "Get started" next quarter).
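One way to read this hierarchy is as a small decision procedure. The sketch below is an illustration under assumptions, not a prescribed implementation: Candidate, Strategy, chooseStrategy, and the nameIsVolatile flag are hypothetical names. Since a function cannot fix markup itself, the "fix the markup" step shows up as a recommendation when no test ID is available.

```typescript
type Strategy =
  | { kind: "role"; role: string; name: string }
  | { kind: "testId"; testId: string }
  | { kind: "fix-markup"; reason: string };

interface Candidate {
  role?: string;
  name?: string;            // accessible name, if any
  testId?: string;          // value of data-testid, if any
  nameIsVolatile?: boolean; // hypothetical flag: copy owned by brand or design system
}

// `scope` is every element in the subtree being queried, target included.
function chooseStrategy(target: Candidate, scope: Candidate[]): Strategy {
  const { role, name, testId } = target;
  if (role && name) {
    const matches = scope.filter((c) => c.role === role && c.name === name).length;
    // Default: a unique, stable role + accessible name.
    if (matches === 1 && !target.nameIsVolatile) {
      return { kind: "role", role, name };
    }
  }
  // Role query is ambiguous or fragile: use the explicit test contract if present.
  if (testId) {
    return { kind: "testId", testId };
  }
  // Nothing stable to hold on to: ask for a markup fix.
  return {
    kind: "fix-markup",
    reason: name ? "ambiguous or volatile accessible name" : "no accessible name",
  };
}

// Hypothetical page fragment.
const save: Candidate = { role: "button", name: "Save" };
const deleteA: Candidate = { role: "button", name: "Delete", testId: "delete-marie-dupont" };
const deleteB: Candidate = { role: "button", name: "Delete", testId: "delete-jean-martin" };
const search: Candidate = { role: "searchbox" };
const scope = [save, deleteA, deleteB, search];

// One strategy kind per element, in the same order as `scope`.
const picks = scope.map((c) => chooseStrategy(c, scope).kind);
```

With this fragment, the unique "Save" button gets a role query, the two "Delete" buttons fall back to their test IDs, and the unlabeled search box is flagged for a markup fix.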
This isn't a betrayal of Testing Library's philosophy. It's an acknowledgment that the philosophy works perfectly for ~80% of elements and needs a fallback for the remaining 20%.
How a tool can help
Here's where I'll mention the product, briefly. TestID Hunter implements exactly this hierarchy in its scoring system:
- Solid: element has a data-testid, a stable id, OR an aria-label. All three count equally. If your team prioritizes accessibility, the aria-label wins; if your team prioritizes pragmatism, the data-testid wins; the test suite is robust either way.
- Usable: element has a role plus visible text, or a name attribute, or a placeholder. Tests work today, they may break next sprint when the copy team rewrites things.
- Weak: element has no stable attribute. Generated UUID, nth-child, deeply nested CSS path.
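The three tiers can be expressed as a short classification function. This is a simplified sketch of the rules as described above, not the extension's actual implementation: ElementAttrs, score, and the idIsGenerated flag (which stands in for whatever heuristic detects generated IDs) are assumptions.

```typescript
type Score = "Solid" | "Usable" | "Weak";

interface ElementAttrs {
  testId?: string;         // data-testid
  id?: string;
  idIsGenerated?: boolean; // hypothetical flag: true for UUID-like or framework-generated ids
  ariaLabel?: string;
  role?: string;
  visibleText?: string;
  name?: string;
  placeholder?: string;
}

function score(el: ElementAttrs): Score {
  const hasStableId = !!el.id && !el.idIsGenerated;
  // Solid: data-testid, stable id, or aria-label, all weighted equally.
  if (el.testId || hasStableId || el.ariaLabel) return "Solid";
  // Usable: role plus visible text, or a name attribute, or a placeholder.
  if ((el.role && el.visibleText) || el.name || el.placeholder) return "Usable";
  // Weak: nothing stable to select on.
  return "Weak";
}

const examples: ElementAttrs[] = [
  { testId: "close-button" },
  { ariaLabel: "Dismiss" },
  { role: "button", visibleText: "Sign up now" },
  { placeholder: "Search..." },
  { id: "f3a9c1e2", idIsGenerated: true },
];

// One score per example, in order.
const scores = examples.map((el) => score(el));
```

Note that a generated id on its own lands in Weak: it is an attribute, but not one a test can rely on.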
When the extension generates a ticket asking the dev to "stabilize" an element, it suggests data-testid as the default and mentions that an aria-label would rank just as Solid and improve accessibility. The dev gets to pick based on context.
This way the team isn't forced into a religious choice between two valid camps. Both options improve testability. One also improves a11y.
Where I'd push back on the steel-manned argument
The Testing Library argument has one subtle issue: it conflates two things, "test what users care about" and "use queries that match how assistive tech finds elements". These overlap a lot but they're not the same.
What users care about is that the button works when they click it. They care about the outcome, not the locator. A test that reliably finds the right button via data-testid and reliably exercises the workflow tests what users care about, even if the locator doesn't match how a screen reader navigates.
The accessibility argument is real but it's a separate concern. It deserves its own dedicated tooling: axe-core, automated accessibility audits, screen reader testing in CI. Trying to fold accessibility validation into the e2e test suite via locator strategy is a leaky way to do both jobs poorly.
Use role-based queries because they're robust to markup and styling changes and they nudge teams toward accessibility. Don't use them because they're "the only ethical way" to write tests. They're not, and treating them as such tends to break in the real world.
The takeaway
Both camps have a point. The Testing Library philosophy is right that role-based queries should be your default and that data-testid shouldn't be the first tool you reach for. The data-testid-first camp is right that some elements genuinely need a dedicated test contract and that's fine.
Pick the right tool for each element. Make the contract explicit either way. Don't fight a war when you can have both.