What’s New in OpenText Vertica 23.3

The latest updates include integration with Apache Iceberg for a smarter data lakehouse

This release contains several months’ work and a lot of changes. The first change you’ll notice is the versioning system. The last Vertica release was 12.0.4, so you might have expected this to be version 13. However, OpenText releases ship once per quarter and are numbered by year and quarter, so this OpenText™ Vertica™ release is 23.3, for the third quarter of 2023. You’ll also notice that Management Console and other visual aspects of Vertica have a new color scheme and logo to reflect our new company and brand. 

Beyond the cosmetics, and even beyond all the improvements you expect in performance, security, and the rest provided in every release of Vertica, major new features now allow you to: 

  • Re-shard your Vertica database whenever you need to as data and workloads change.  
  • Save snapshots of the database at moments in time that you can revert to as needed, without overburdening your storage budget with multiple copies of the same data.  
  • Automate routing of workloads to the node or sub-cluster that makes the most sense for that type of work.  
  • Start machine learning (ML) workflows with Vertica easily using the new VerticaPyLab, which bundles all dependencies, examples, and lessons in a single installation with an easy-to-use JupyterLab interface. 

Most notably, with the addition of read and analysis capabilities on external data using Apache Iceberg as the semantic layer, Vertica is now a fully functional data lakehouse. In past versions, Vertica unified business intelligence, machine learning, and other types of advanced analytics like geospatial, event pattern, and time series data analysis into a single point of contact for any analytics. Vertica also gave you the ability to analyze any data, from structured data in our own ROS format, to semi-structured and complex data in external data lake formats like Parquet, JSON, and ORC. 

OpenText Vertica data lakehouse with Apache Iceberg Integration.

Analyzing this data with OpenText Vertica through the Apache Iceberg metadata layer gives you ACID compliance and rapid findability of data in the lake. Vertica can quickly analyze even complex data through Iceberg, even if another application has altered it since the last analysis by adding or removing columns or changing data types. Vertica’s focus on performance at scale has produced several ways to optimize queries on data lake data, and each release will bring that performance closer to the blazing speed you expect from querying internal Vertica ROS data.  
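For a concrete sense of how this works, here is a hedged sketch of exposing an Iceberg table to Vertica (the table name and S3 path are hypothetical; check the Vertica documentation for the exact DDL):

```sql
-- Hypothetical example: expose an Iceberg table as a Vertica external table.
-- No column list is given: Vertica reads the schema from the Iceberg
-- metadata file, so schema evolution in the lake is picked up automatically.
CREATE EXTERNAL TABLE sales
    ICEBERG LOCATION 's3://mybucket/warehouse/sales/metadata/v3.metadata.json';

-- Query it like any native table.
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region;
```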

Vertica’s smarter data lakehouse removes the limits from analytics 

Vertica’s lakehouse lets you analyze your data lake at the speed and concurrency you’re accustomed to in a data warehouse. Here are some things you can now do with OpenText Vertica 23.3: 

Start using Vertica machine learning easily with the new VerticaPyLab – A fast install of VerticaPy with all dependencies at once, plus an easy JupyterLab interface for choosing applications, examples, data science lessons, and more. 

Authenticate new users with just-in-time provisioning via OAuth 2.0 – The organization’s OAuth2 identity verification adds a new Vertica user with the specified role(s) on the fly, saving dbadmins a great deal of time. When a person logs in to Vertica with their preconfigured SSO OAuth token, there’s no need to create user accounts or grant roles manually. OAuth users who have not used Vertica in a while are also automatically removed. 

Use less memory and spend less time queued for resources when executing long-running queries, thanks to multi-part query plans – Vertica now breaks long-running queries into parts and allocates only the memory and compute resources needed to execute the largest part of a multi-part plan. Any unused resources within the allocated block are used to optimize the query further, so the query executes faster overall. 

Start large clusters faster using HTTPS instead of SSH – The thin Golang clients, the cluster operations library (vclusterops) and vcluster.exe, decouple the Kubernetes operator from the details of cluster operations and create databases, especially large ones, faster than admintools. Many administration operations that were formerly available only through admintools via SSH are now built into the Vertica server itself, so you can invoke them over HTTPS with no special client installed; everything is handled by the Node Management Agent (NMA). 

Automate multi-step database maintenance, machine learning pipelines, or ML model retraining when accuracy declines below a threshold – Stored procedures can now call meta-functions and nested stored procedures up to a call depth of 100, and session parameter changes made by a stored procedure now persist after the procedure completes. 
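As a rough sketch of the kind of pipeline this enables (the procedure and table names are hypothetical, and ANALYZE_STATISTICS stands in for whichever meta-functions your maintenance flow needs):

```sql
-- Hypothetical sketch: a procedure that calls a meta-function,
-- invoked from a second procedure (nested calls up to depth 100).
CREATE OR REPLACE PROCEDURE refresh_stats() LANGUAGE PLvSQL AS $$
BEGIN
    PERFORM ANALYZE_STATISTICS('public.sales');  -- meta-function call
END;
$$;

CREATE OR REPLACE PROCEDURE nightly_maintenance() LANGUAGE PLvSQL AS $$
BEGIN
    CALL refresh_stats();  -- nested stored procedure call
END;
$$;

CALL nightly_maintenance();
```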

Run Management Console (MC) on Linux with Core support – Use less expensive Linux cloud instances rather than being required to run MC on a Windows instance. 

Dynamically control which hardware, nodes, or instances are used for different purposes with workload routing – Admins can create rules that route queries from clients with a particular workload to a separate subcluster, decoupling connection from execution: a client can connect to one node and execute on a different set of nodes. Clients set a workload name by adding a workload parameter to their connection string, or with SQL syntax after connecting.  
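A hedged sketch of what the routing setup might look like (the rule, workload, and subcluster names are hypothetical; see the documentation for the exact syntax):

```sql
-- Route queries tagged with the 'analytics' workload
-- to a dedicated subcluster.
CREATE ROUTING RULE analytics_rule
    ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics;

-- A client can opt in after connecting:
SET SESSION WORKLOAD analytics;
```

Alternatively, a client can pass the workload name in its connection string (for example, a workload=analytics connection property), so routing takes effect at connect time.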

Revive to a previous state by saving “restore points,” snapshots of the database at a point in time (Eon Mode only) – A restore point stores a copy of the catalog and any changed data, not a full extra copy of the data, directly from the database server. The VBR (Vertica backup and restore) tool is not required. 
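A hedged sketch of saving a restore point (the archive name is hypothetical, and the exact syntax may differ in your version’s documentation):

```sql
-- Create an archive to group related restore points, then save one.
CREATE ARCHIVE nightly;
SAVE RESTORE POINT TO ARCHIVE nightly;
```

Reviving to a saved restore point is a cluster-level operation performed with the database management tooling rather than in SQL.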

And this just scratches the surface of the many improvements in this latest version of Vertica.  

Read the OpenText Vertica Release Notes to learn more. 

Paige Roberts

In over 25 years in the data management industry, I have worked as an engineer, a trainer, a marketer, a product manager, and a consultant. Co-author of “Accelerate Machine Learning with a Unified Analytics Architecture” and contributor to “97 Things Every Data Engineer Should Know,” both by O’Reilly Media, I now promote understanding of OpenText Analytics and AI, high scale data architecture trade-offs, open source, and how the AI revolution is changing the world.
