Personal Programming Notes

To err is human; to debug, divine.

Vertica Tip: Find Empty Tables

This post is a reminder of using Vertica’s system tables for administrating and monitoring our own tables. One common house-cleaning operation when developing/testing in Vertica is to find and drop tables that are empty (truncated) and never used again.

You might ask why the tables are not dropped directly when I truncated the table in the first place. The answer is that all those tables have some specific designs on projection segmentation and partition, and those information will be lost if I drop the tables. These tables are frequently populated with data and cleared for testing purposes, and truncating and inserting with direct hint will give a significant performance boost (see Best practices).

v_monitor schema and COLUMN_STORAGE system table

The COLUMN_STORAGE system table in v_monitor schema returns the “amount of disk storage used by each column of each projection on each node”. Therefore, to get the size of each table, you only need to aggregate the used_byte data, grouped by schema name and table name.

Query to list tables' sizes in a schema
1
2
3
4
select anchor_table_schema, anchor_table_name, sum(used_bytes)
FROM v_monitor.column_storage
where anchor_table_schema = 'some_schema'
group by anchor_table_schema, anchor_table_name

According to here, the number from the above query is the compressed size of the Vertica tables. To get the raw size of the tables, which probably only matters for license limit, perform a license audit, and query the system table license_audits in v_catalog schema. However, the most important takeaway is that empty tables will not appear in this COLUMN_STORAGE system table.

v_catalog schema and TABLES system table

The TABLES system table is probably more well-known. It contains all the information about all the tables in all the schemas. For example, to list all the tables in some schema:

Query to list all tables in a schema
1
2
select table_schema, table_name from tables
where table_schema = 'some_schema'

Another useful system table in v_catalog schema is USER_FUNCTIONS which lists all user-defined functions and their function signatures in the database.

Find all the empty (truncated) tables

Having all the tables in v_catalog.tables table and only non-empty tables in v_monitor.column_storage table, finding empty tables is pretty straight-forward in SQL:

Query to find empty tables in a schema
1
2
3
4
5
6
7
select table_name
from v_catalog.tables
where table_schema = 'some_schema'
EXCEPT
select anchor_table_name
from v_monitor.column_storage
where anchor_table_schema = 'some_schema'

External Links

  1. Finding table’s compressed size
  2. Vertica License audit
  3. COLUMN_STORAGE system table
  4. TABLES system table
  5. USER_FUNCTIONS system table