Histograms in Oracle Statistics
Histograms in Oracle Statistics: A Practical DBA Perspective
In Oracle databases, a histogram is a special type of column statistic that helps the optimizer understand how data is distributed within a column. While standard statistics tell Oracle the number of rows and distinct values, they assume the data is evenly spread. In reality, data is often skewed, and that is where histograms become essential.
After years of managing Oracle environments, I have seen many performance problems caused by the optimizer making poor decisions simply because it assumed uniform data distribution. Histograms help solve that problem by giving Oracle a more accurate picture of the data.
⸻
What is a Histogram?
A histogram stores information about the frequency of values in a column.
For example, in a STATUS column:
• ACTIVE = 95%
• INACTIVE = 4%
• PENDING = 1%
Without a histogram, Oracle may assume each value appears equally.
With a histogram, Oracle understands the real distribution and can choose a better execution plan.
⸻
Why Histograms Matter
• Better cardinality estimates
• Improved execution plans
• Smarter index usage
• More accurate join order
• Reduced full table scans
Example query:
SELECT *
FROM orders
WHERE status = 'PENDING';
If Oracle knows only 1% of rows are PENDING, it may choose an index instead of a full table scan.
⸻
Types of Histograms in Oracle
Oracle supports several histogram types:
• Frequency Histogram
Stores each distinct value separately. Best for columns with few distinct values.
• Top-Frequency Histogram
Tracks most common values when some values appear very frequently.
• Height-Balanced Histogram
Older method used when many distinct values exist.
• Hybrid Histogram
Introduced in newer versions for better skew handling.
⸻
How to Check Existing Histograms
You can view histogram information:
SELECT column_name,
histogram,
num_buckets
FROM dba_tab_col_statistics
WHERE owner = 'HR'
AND table_name = 'EMPLOYEES';
Output may show:
• NONE
• FREQUENCY
• TOP-FREQUENCY
• HYBRID
⸻
Gather Statistics with Histograms
To allow Oracle to create histograms automatically:
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
ownname => 'HR',
tabname => 'EMPLOYEES',
method_opt => 'FOR ALL COLUMNS SIZE AUTO'
);
END;
/
Oracle decides whether a histogram is needed.
⸻
Create Histogram for Specific Column
To explicitly collect histogram for one column:
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
ownname => 'HR',
tabname => 'EMPLOYEES',
method_opt => 'FOR COLUMNS SIZE 254 DEPARTMENT_ID'
);
END;
/
This creates a histogram on DEPARTMENT_ID.
⸻
Remove Histograms
Sometimes too many histograms can create unstable plans.
To remove them:
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
ownname => 'HR',
tabname => 'EMPLOYEES',
method_opt => 'FOR ALL COLUMNS SIZE 1'
);
END;
/
SIZE 1 disables histograms.
⸻
Check Histogram Buckets
To inspect bucket values:
SELECT endpoint_number,
endpoint_value
FROM dba_tab_histograms
WHERE owner = 'HR'
AND table_name = 'EMPLOYEES'
AND column_name = 'DEPARTMENT_ID';
This helps understand how Oracle stores value distribution.
⸻
When to Use Histograms
Histograms are useful when:
• Data is heavily skewed
• SQL filters on selective columns
• Columns contain popular values
• Optimizer chooses poor plans
• Indexes are ignored unexpectedly
⸻
When Not to Use Histograms
Avoid unnecessary histograms when:
• Column values are evenly distributed
• Column changes frequently
• Too many literals create unstable plans
• Bind variables dominate workload
Excessive histograms can lead to plan instability.
⸻
Example of Performance Improvement
Without histogram:
SELECT *
FROM customers
WHERE region = 'SINGAPORE';
Oracle may estimate:
• 20% rows returned
Actual:
• only 1%
After histogram:
• Oracle correctly estimates 1%
• Uses index
• Faster execution
⸻
Final Thoughts
Histograms are one of the most powerful but often misunderstood parts of Oracle statistics. They help the optimizer understand real data patterns instead of making assumptions.
As a DBA, understanding when to use histograms—and when to avoid them—can make a major difference in SQL performance. Properly maintained statistics combined with well-designed histograms can prevent many performance issues before they ever reach production.
In many cases, a single well-placed histogram can solve a performance issue that hours of tuning cannot.
Comments
Post a Comment
Please do not enter any spam link in comment Section suggestions are Always Appreciated. Thanks.. !