Histograms in Oracle Statistics

Histograms in Oracle Statistics: A Practical DBA Perspective


In Oracle databases, a histogram is a special type of column statistic that helps the optimizer understand how data is distributed within a column. While standard statistics tell Oracle the number of rows and distinct values, they assume the data is evenly spread. In reality, data is often skewed, and that is where histograms become essential.


After years of managing Oracle environments, I have seen many performance problems caused by the optimizer making poor decisions simply because it assumed uniform data distribution. Histograms help solve that problem by giving Oracle a more accurate picture of the data.



What is a Histogram?


A histogram stores information about the frequency of values in a column.


For example, in a STATUS column:

ACTIVE = 95%

INACTIVE = 4%

PENDING = 1%


Without a histogram, Oracle may assume each value appears equally.

With a histogram, Oracle understands the real distribution and can choose a better execution plan.



Why Histograms Matter

Better cardinality estimates

Improved execution plans

Smarter index usage

More accurate join order

Reduced full table scans


Example query:


SELECT

FROM orders

WHERE status = 'PENDING';


If Oracle knows only 1% of rows are PENDING, it may choose an index instead of a full table scan.



Types of Histograms in Oracle


Oracle supports several histogram types:

Frequency Histogram

Stores each distinct value separately. Best for columns with few distinct values.

Top-Frequency Histogram

Tracks most common values when some values appear very frequently.

Height-Balanced Histogram

Older method used when many distinct values exist.

Hybrid Histogram

Introduced in newer versions for better skew handling.



How to Check Existing Histograms


You can view histogram information:


SELECT column_name,

       histogram,

       num_buckets

FROM dba_tab_col_statistics

WHERE owner = 'HR'

AND table_name = 'EMPLOYEES';


Output may show:

NONE

FREQUENCY

TOP-FREQUENCY

HYBRID



Gather Statistics with Histograms


To allow Oracle to create histograms automatically:


BEGIN

  DBMS_STATS.GATHER_TABLE_STATS(

    ownname    => 'HR',

    tabname    => 'EMPLOYEES',

    method_opt => 'FOR ALL COLUMNS SIZE AUTO'

  );

END;

/


Oracle decides whether a histogram is needed.



Create Histogram for Specific Column


To explicitly collect histogram for one column:


BEGIN

  DBMS_STATS.GATHER_TABLE_STATS(

    ownname    => 'HR',

    tabname    => 'EMPLOYEES',

    method_opt => 'FOR COLUMNS SIZE 254 DEPARTMENT_ID'

  );

END;

/


This creates a histogram on DEPARTMENT_ID.



Remove Histograms


Sometimes too many histograms can create unstable plans.

To remove them:


BEGIN

  DBMS_STATS.GATHER_TABLE_STATS(

    ownname    => 'HR',

    tabname    => 'EMPLOYEES',

    method_opt => 'FOR ALL COLUMNS SIZE 1'

  );

END;

/


SIZE 1 disables histograms.



Check Histogram Buckets


To inspect bucket values:


SELECT endpoint_number,

       endpoint_value

FROM dba_tab_histograms

WHERE owner = 'HR'

AND table_name = 'EMPLOYEES'

AND column_name = 'DEPARTMENT_ID';


This helps understand how Oracle stores value distribution.



When to Use Histograms


Histograms are useful when:

Data is heavily skewed

SQL filters on selective columns

Columns contain popular values

Optimizer chooses poor plans

Indexes are ignored unexpectedly



When Not to Use Histograms


Avoid unnecessary histograms when:

Column values are evenly distributed

Column changes frequently

Too many literals create unstable plans

Bind variables dominate workload


Excessive histograms can lead to plan instability.



Example of Performance Improvement


Without histogram:


SELECT

FROM customers

WHERE region = 'SINGAPORE';


Oracle may estimate:

20% rows returned


Actual:

only 1%


After histogram:

Oracle correctly estimates 1%

Uses index

Faster execution



Final Thoughts


Histograms are one of the most powerful but often misunderstood parts of Oracle statistics. They help the optimizer understand real data patterns instead of making assumptions.


As a DBA, understanding when to use histograms—and when to avoid them—can make a major difference in SQL performance. Properly maintained statistics combined with well-designed histograms can prevent many performance issues before they ever reach production.


In many cases, a single well-placed histogram can solve a performance issue that hours of tuning cannot. 

Comments

Popular posts from this blog

Oracle Block Corruption - Detection and Resolution

How to clone Pluggable Database from one container to different Container Database

Restart Innodb MySQL Cluster after Complete outage(All node Down)