Summary
Add a rule to detect SELECT DISTINCT that may indicate a query design problem rather than a legitimate deduplication need.
Why This Matters
DISTINCT is often a band-aid for broken queries:
- Hides JOIN problems — incorrect joins produce duplicates that DISTINCT masks
- Sorting overhead — requires sorting or hashing all rows
- Symptom, not solution — the real fix is usually fixing the JOIN logic
- Performance cost — adds O(n log n) sort or O(n) hash overhead
The "DISTINCT Smell"
-- This usually means the JOIN is wrong:
SELECT DISTINCT u.name, u.email
FROM users u
JOIN orders o ON u.id = o.user_id;
-- What you probably wanted:
SELECT u.name, u.email
FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
-- Or if you need order data:
SELECT u.name, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name, u.email;
Rule Specification
| Property |
Value |
| Rule ID |
PERF014 |
| Name |
Potentially unnecessary DISTINCT |
| Category |
Performance |
| Severity |
Info |
| Default |
Enabled |
Detection Patterns
-- Should trigger (DISTINCT with JOIN — suspicious)
SELECT DISTINCT a.* FROM a JOIN b ON ...
SELECT DISTINCT users.name FROM users JOIN orders ON ...
-- Should NOT trigger (legitimate uses)
SELECT DISTINCT status FROM orders; -- Getting unique values
SELECT DISTINCT category FROM products; -- Enumeration query
Heuristics
SELECT DISTINCT + JOIN = likely smell (trigger rule)
SELECT DISTINCT column from single table = probably intentional (don't trigger)
SELECT DISTINCT * = almost always a problem (higher severity)
Suggested Message
"DISTINCT with JOIN often indicates incorrect join conditions producing unwanted duplicates. Consider: (1) reviewing JOIN conditions, (2) using EXISTS instead of JOIN, or (3) using GROUP BY if aggregation is needed."
References
Implementation Notes
- Trigger when:
DISTINCT + JOIN keywords both present
- Higher severity for:
SELECT DISTINCT *
- Lower severity / skip when: single table query
- Suggestion should mention: check JOIN conditions, consider EXISTS/GROUP BY
Summary
Add a rule to detect
SELECT DISTINCTthat may indicate a query design problem rather than a legitimate deduplication need.Why This Matters
DISTINCT is often a band-aid for broken queries:
The "DISTINCT Smell"
Rule Specification
PERF014Detection Patterns
Heuristics
SELECT DISTINCT+JOIN= likely smell (trigger rule)SELECT DISTINCT columnfrom single table = probably intentional (don't trigger)SELECT DISTINCT *= almost always a problem (higher severity)Suggested Message
References
Implementation Notes
DISTINCT+JOINkeywords both presentSELECT DISTINCT *