EXPORT TO PARQUET exports a table, columns from a table, or query results to files in the Parquet format. These Parquet files use Snappy compression by default.
Starting in Vertica 10.1.1, EXPORT TO PARQUET supports the GZIP, Brotli, and ZSTD compression types!
Let’s see how these compression types compare in disk usage:
verticademos=> SELECT COUNT(*) FROM big;
COUNT
-----------
134217728
(1 row)
Snappy compression (the default):
verticademos=> EXPORT TO PARQUET (directory = '/home/dbadmin/parq_snappy') AS SELECT * FROM big;
Rows Exported
---------------
134217728
(1 row)
verticademos=> \! du --summarize -h /home/dbadmin/parq_snappy
2.6G /home/dbadmin/parq_snappy
GZIP compression:
verticademos=> EXPORT TO PARQUET (directory = '/home/dbadmin/parq_gzip', compression='GZIP') AS SELECT * FROM big;
Rows Exported
---------------
134217728
(1 row)
verticademos=> \! du --summarize -h /home/dbadmin/parq_gzip
1.9G /home/dbadmin/parq_gzip
Brotli compression:
verticademos=> EXPORT TO PARQUET (directory = '/home/dbadmin/parq_Brotli', compression='Brotli') AS SELECT * FROM big;
Rows Exported
---------------
134217728
(1 row)
verticademos=> \! du --summarize -h /home/dbadmin/parq_Brotli
1.7G /home/dbadmin/parq_Brotli
ZSTD compression:
verticademos=> EXPORT TO PARQUET (directory = '/home/dbadmin/parq_ZSTD', compression='ZSTD') AS SELECT * FROM big;
Rows Exported
---------------
134217728
(1 row)
verticademos=> \! du --summarize -h /home/dbadmin/parq_ZSTD
1.7G /home/dbadmin/parq_ZSTD
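To put the sizes above in relative terms, here is a minimal shell sketch that reports how much smaller each export is than the Snappy baseline. The helper names `pct_saved` and `dir_bytes` are made up for this illustration, and the script assumes GNU `du` (as in the session above) and the same directory paths.

```shell
#!/bin/sh
# Hypothetical helper: percentage saved by "other" relative to "base".
# Both arguments must be in the same unit (bytes, GB, ...).
pct_saved() {
    awk -v base="$1" -v other="$2" 'BEGIN { printf "%.1f\n", (1 - other / base) * 100 }'
}

# Hypothetical helper: disk usage of a directory in bytes (GNU du).
dir_bytes() {
    du --summarize --block-size=1 "$1" | cut -f1
}

base_dir=/home/dbadmin/parq_snappy
if [ -d "$base_dir" ]; then
    base=$(dir_bytes "$base_dir")
    for dir in /home/dbadmin/parq_gzip /home/dbadmin/parq_Brotli /home/dbadmin/parq_ZSTD; do
        # Skip any export directory that has not been created yet.
        [ -d "$dir" ] || continue
        printf '%s: %s%% smaller than Snappy\n' "$dir" "$(pct_saved "$base" "$(dir_bytes "$dir")")"
    done
fi
```

With the rounded figures reported by `du -h` above, `pct_saved 2.6 1.9` works out to roughly 27% for GZIP, and `pct_saved 2.6 1.7` to roughly 35% for Brotli and ZSTD.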
Hint: Although Brotli and ZSTD produce similarly sized files in this example, disk space is not the only factor to weigh when choosing between them. In practice, ZSTD typically performs much better than Brotli.
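One way to compare the compression types on export time as well as size is to time each export yourself. The sketch below is only an illustration: `time_export` is a hypothetical helper, it assumes `vsql` is on the PATH and can connect with its default settings, and the `_timed` target directories are made-up names.

```shell
#!/bin/sh
# Hypothetical helper: run one export with the given compression type
# and print the elapsed wall-clock time in whole seconds.
time_export() {
    codec=$1
    dir=$2
    start=$(date +%s)
    # Assumes vsql connects using its default database, user, and host.
    vsql -c "EXPORT TO PARQUET (directory = '$dir', compression='$codec') AS SELECT * FROM big;"
    end=$(date +%s)
    echo "$codec: $((end - start)) s"
}

# Example calls (the target directories are assumptions):
#   time_export Brotli /home/dbadmin/parq_Brotli_timed
#   time_export ZSTD   /home/dbadmin/parq_ZSTD_timed
```

Whole-second resolution is coarse, but it should be enough to see a difference on a table of this size.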
Have fun!