Summary
Add a rule to detect IN clauses with excessive number of values, which can cause performance issues and even query failures.
Why This Matters
Large IN clauses are problematic for several reasons:
- Query parsing overhead — each value must be parsed and validated
- Optimizer confusion — planners struggle with thousands of values
- Parameter limits — databases have max parameters (e.g., SQL Server: 2100, MySQL: ~65535)
- Network overhead — query text becomes huge
- Memory pressure — internal structures to hold all values
Thresholds and Limits
| Database |
Practical Limit |
Hard Limit |
| PostgreSQL |
~10K |
~32K (stack) |
| MySQL |
~1K optimal |
65535 |
| SQL Server |
~1K optimal |
2100 params |
| Oracle |
1000 |
1000 (IN list) |
Performance Degradation
-- Query with 10 values: 5ms
SELECT * FROM products WHERE id IN (1,2,3,4,5,6,7,8,9,10);
-- Query with 1000 values: 200ms (40x slower)
SELECT * FROM products WHERE id IN (1,2,3,...,1000);
-- Query with 10000 values: timeout or error
Rule Specification
| Property |
Value |
| Rule ID |
PERF019 |
| Name |
Large IN clause |
| Category |
Performance |
| Severity |
Warning |
| Default |
Enabled |
Detection Patterns
-- Should trigger (too many values)
SELECT * FROM users WHERE id IN (1,2,3,4,5,...100+ values);
-- Severity levels based on count:
-- > 50 values: Info (worth noting)
-- > 200 values: Warning (likely problem)
-- > 1000 values: Error (definitely wrong)
Suggested Message
"IN clause contains N values. Large IN lists cause performance issues. Consider: (1) temporary table with JOIN, (2) batch queries in application, (3) EXISTS with subquery, or (4) ANY(ARRAY[...]) in PostgreSQL."
Better Alternatives
-- Alternative 1: Temporary table
CREATE TEMPORARY TABLE temp_ids (id INT);
INSERT INTO temp_ids VALUES (1), (2), (3), ...;
SELECT * FROM products p JOIN temp_ids t ON p.id = t.id;
-- Alternative 2: Batch in application
-- Split into chunks: WHERE id IN (1..100), WHERE id IN (101..200), ...
-- Alternative 3: PostgreSQL array
SELECT * FROM products WHERE id = ANY(ARRAY[1,2,3,4,5]);
-- Alternative 4: Subquery if values come from another table
SELECT * FROM products WHERE id IN (SELECT product_id FROM cart WHERE user_id = 123);
References
Implementation Notes
- Count values in IN clause (count commas + 1, or parse properly)
- Thresholds: 50 (Info), 200 (Warning), 1000 (Error)
- Make thresholds configurable
- Detect both
IN (1,2,3) and NOT IN (1,2,3) patterns
- Consider: parameterized queries show as
IN (?, ?, ?) — still count
Summary
Add a rule to detect
INclauses with excessive number of values, which can cause performance issues and even query failures.Why This Matters
Large IN clauses are problematic for several reasons:
Thresholds and Limits
Performance Degradation
Rule Specification
PERF019Detection Patterns
Suggested Message
Better Alternatives
References
Implementation Notes
IN (1,2,3)andNOT IN (1,2,3)patternsIN (?, ?, ?)— still count