Page 246 – OpenText Blogs

Query Tuning with Vertica: Dos and Don’ts
Query tuning in Vertica is not an exact science. Recommendations differ based on your database. This document assumes that all nodes in the cluster are UP, your Vertica configuration is ok, and that v*perf tools have been executed.

The following diagram shows the query flow in Vertica:

Vertica Optimizer

Queries can be executed in many ways. The Vertica optimizer quickly finds the best way to execute a query. Vertica uses a cost-based optimizer. The cost model represents the costs as a function of the amount of data flowing through the plan. Each query plan alternative is associated with a cost that estimates the amount of resources Vertica needs to execute the query, including CPU, disk, memory and network. The query optimizer selects the plan with the lower costs, which is also usually the one that is faster.

The query optimizer relies on statistics and heuristics to determine the execution plan costs, including the following:
- Number of rows in the table
- Cardinality of each column
- Min/max values of each column
- Values distribution histogram for each column
- Column footprint
- The access path with the fewest expected I/O operations and lowest CPU, memory, and network usage
- Join types based on different projection choice
- Join order
- Predicates selectivity
- Data redistribution algorithms across nodes in the cluster
Do: Check the SQL

Start by looking at the SQL itself. Try to reduce complications as much as possible. For example, this SQL with 5 nested function calls and two string concatenations can be reduced:
to_char(YEAR_ISO(period_key)) ||'-W’|| lpad(to_char(WEEK_ISO(period_key)),2,'0')
Replace it with 1 function call and no string concatenations:
to_char(period_key, 'IYYY-"W"IW')
Avoid passing UDx arguments. Instead, use parameters. Keep in mind that inequality predicates and OR operators are slow.

Do: EXPLAIN Your Query

The EXPLAIN plan describes how the optimizer would like to execute a query, before the query is actually executed. You should check the following:
- GLOBAL RESEGMENTATION
- BROADCAST
- JOIN ORDER
- JOIN TYPE
- GBY TYPE
- COSTs and ROWs
- Projections being used
- Columns being materialized
Do: Update Statistics

You should update your statistics:
- After a consistent table load or update
- After a table is altered
- When a projection is refreshed
You can also run ANALYZE_STATISTICS immediately before running your benchmark.

Do: Run Your Query Using vsql

Use vsql to run your perf test:
$ vsql -AXtnqi -f query.sql -o /dev/null
Axtnqi means:
- Use unaligned output mode (A)
- Do not run commands in the vsql initialization file (X)
- Disable printing column names (tuples only) (t)
- Disable command line editing (n)
- Work quietly (q)
- Print timing information (i)
Do: Check QUERY_EVENTS

QUERY_EVENTS contains very useful information generated during either the OPTIMIZATION or EXECUTION of event categories.

Do: DDLs and Projections

DDLs and projection definitions are some of the most important optimization techniques. DDLs are used to profile your data and ensure it uses the right data types. Consider replacing fat joining or grouping columns with slick integers. Also consider flattening tables to avoid or reduce joins. Take advantage of LAPs when possible.

Avoid creating too many projections, because loads will be slower. Use the SEGMENTED BY clause to avoid resegmentation with either joins or GROUP BY. Each node should be able to group or join its own data without looking into other nodes.

Use ORDER BY to influence the GROUP BY and join type:
- Joins: projections are sorted on the joining column(s). You get a MERGE JOIN rather than a HASH JOIN. MERGE joins never spill to disk.
- GROUP BY: if grouping columns are a subset of the ones in the SORT BY clause, you get a PIPELINED GROUP BY rather than a HASH GROUP BY. Pipelined GROUP BYs never spill to disk.
Do: Profile Your Query

The query profile provides very detailed information about each single operator used during the execution. Data profiling is available in the V_MONITOR.EXECUTION_ENGINE_PROFILE if you explicitly profiled the query. Even a simple query can easily produce thousands of EXECUTION_ENGINE_PROFILEs.

The EXECUTION_ENGINE_PROFILE contains the following information:

Node name

User information

Session, transaction, and statement IDs

Plan information

Operator name

Counter name

Counter value

Counters change from one operator to another.

Do: Update System Config (If needed)

You might want to change some system parameters to improve performance. Do this with caution.

Don’t: Underestimate Data Extraction

If your query returns a large result set, moving data to the client can take a lot of time. Redirecting client output to /dev/null still implies moving data to the client. Consider instead storing the result set in a LOCAL TEMPORARY TABLE.

Useful Queries

The following query checks the data distribution for a given table. This is often useful to look into a plan when no statistics are available:
select projection_name, node_name, sum(row_count) as row_count, sum(used_bytes) as used_bytes, sum(wos_row_count) as wos_row_count, sum(wos_used_bytes) as wos_used_bytes, sum(ros_row_count) as ros_row_count, sum(ros_used_bytes) as ros_used_bytes, sum(ros_count) as ros_count from projection_storage where anchor_table_schema = :schema and anchor_table_name = :table

group by 1, 2
order by 1, 2;

The following query shows the non-default configuration parameters:
SELECT parameter_name, current_value, default_value, description FROM v_monitor.configuration_parameters WHERE current_value <> default_value ORDER BY parameter_name;

The following query checks encoding and compression for a given table:
SELECT cs.projection_name, cs.column_name, sum(cs.row_count) as row_count, sum(cs.used_bytes) as used_bytes, max(pc.encoding_type) as encoding_type, max(cs.encodings) as encodings, max(cs.compressions) as compressions FROM column_storage cs inner join projection_columns pc on cs.column_id = pc.column_id WHERE anchor_table_schema = :schema and anchor_table_name = :table GROUP BY 1, 2 ORDER BY 1, 2;

The following will retrieve the EXPLAIN PLAN for a given query:
SELECT path_line FROM v_internal.dc_explain_plans WHERE transaction_id=:trxid and statement_id=:stmtid ORDER BY path_id, path_line_index;

The following shows the resource acquisition for a given query:
SELECT a.node_name, a.queue_entry_timestamp, a.acquisition_timestamp, ( a.acquisition_timestamp - a.queue_entry_timestamp ) AS queue_wait_time, a.pool_name, a.memory_inuse_kb as mem_kb, (b.reserved_extra_memory_b/1000)::integer as emem_kb, (a.memory_inuse_kb-b.reserved_extra_memory_b/1000)::integer AS rmem_kb, a.open_file_handle_count as fhc, a.thread_count as threads FROM v_monitor.resource_acquisitions a inner join query_profiles b on a.transaction_id = b.transaction_id WHERE a.transaction_id=:trxid and a.statement_id=:stmtid ORDER BY 1, 2;

The following gives query events for a given query:
SELECT event_timestamp, node_name, event_category, event_type, event_description, operator_name, path_id, event_details, suggested_action FROM v_monitor.query_events WHERE transaction_id=:trxid and statement_id=:stmtid ORDER BY 1;

The following query shows transaction locks:
SELECT node_name,(time - start_time) as lock_wait, object_name, scope, result,description FROM v_internal.dc_lock_attempts WHERE transaction_id = :trxid ;

The following query shows threads by profile operator:
SELECT node_name, path_id, operator_name, activity_id::varchar || ',' || baseplan_id::varchar || ',' || localplan_id::varchar as abl_id, count(distinct(operator_id)) as '#Threads' FROM v_monitor.execution_engine_profiles WHERE transaction_id=:trxid and statement_id=:stmtid GROUP BY 1,2,3,4 ORDER BY 1,2,3,4;

The following query shows how you can retrieve the query execution report:
SELECT node_name , operator_name, path_id, round(sum(case counter_name when 'execution time (us)' then counter_value else null end)/1000,3.0) as exec_time_ms, sum(case counter_name when 'estimated rows produced' then counter_value else null end ) as est_rows, sum ( case counter_name when 'rows processed' then counter_value else null end ) as proc_rows, sum ( case counter_name when 'rows produced' then counter_value else null end ) as prod_rows, sum ( case counter_name when 'rle rows produced' then counter_value else null end ) as rle_pr_rows, sum ( case counter_name when 'consumer stall (us)' then counter_value else null end ) as cstall_us, sum ( case counter_name when 'producer stall (us)' then counter_value else null end ) as pstall_us, round(sum(case counter_name when 'memory reserved (bytes)' then counter_value else null end)/1000000,1.0) as mem_res_mb, round(sum(case counter_name when 'memory allocated (bytes)' then counter_value else null end )/1000000,1.0) as mem_all_mb FROM v_monitor.execution_engine_profiles WHERE transaction_id = :trxid and statement_id = :stmtid and counter_value/1000000 > 0 GROUP BY 1, 2, 3 ORDER BY case when sum(case counter_name when 'execution time (us)' then counter_value else null end) is null then 1 else 0 end asc , 5 desc ;

OpenText

June 18, 2018

Analytics, Aviator AI, Technologies

Checklists, Projections, SQL, Vertica

From identity provisioning to managing IoT ecosystems

More and more enterprises have begun their journey towards digital transformation. They are creating entirely new types of digital ecosystems that include people, applications, systems and things – both inside and outside the organization. This is an exciting new world. At its heart lies a new generation of identity management technologies and mindsets.

The pace at which organizations are embracing digital transformation is startling. The digital transformation market is estimated to reach $798.44 billion by 2025 – up from $177.27 billion in 2016. That’s an increase of over 450%. Recent research from OpenText™ into the UK Financial Services sector found that over 60% of companies were either about to or had already deployed digital transformation programs.

The identity challenge

Digital transformation represents a massive change in the way that companies operate and conduct business. In its report “2017: A ‘transformative’ year”, AIIM states that digital transformation means re-inventing the business “from the outside in” where customer, employee and partner experiences need to be central to digital transformation initiatives. The trade body suggests: “A new generation of customers and partners, too – requires a dramatically different approach to engagement, specifically one that is personalized, immediate, expressive, and immersive.”

That sounds fantastic, but the stumbling block is obvious. To take advantage of the opportunities of digital transformation, you need to provide access to your digital ecosystem with the assurance that everyone is who they say they are and that they have the right access to information only when they should.

Effective identity management becomes the key enabler for successful digital transformation. However, previous approaches to identity management have primarily delivered on helping the IT Help Desk, which is an inside-out approach to identity and access management (IAM). This traditional method applies a trade-off between application security and user convenience that cannot deliver the types of experience that AIIM suggests are necessary.

The digital ecosystem – comprised of employees, customers, suppliers, partners and other stakeholders – involve too many applications and systems that are often not in the direct control of your IT department. On top of this, disruptive technologies such as IoT are adding new “things” to the ecosystem whose identity has to be provisioned and managed as well.

Identity management: Responding to the challenge

While reading a recent research report, I came across a recommendation from Gartner about how organizations should respond to the identity challenge in digital transformation: “Emphasize the benefits of risk-taking to Identity and Access Management innovators”. I’m not sure how many IT security professionals would be happy to take this approach – risk mitigation always seems better than risk taking for sensitive corporate data – and I’m not sure it’s necessary.

Certainly, the business-to-employee (B2E) approach to identity management is fine if we limit ourselves to only employees and the systems and cloud applications that they need to connect to. I’ve written a previous blog about the need of an ‘outside-in’ model for identity. It requires a collaborative approach to delivering identity assurance – the trust that people are who they say they say are – based around a new generation and mindset of identity management. Such a platform enables you to manage the entire lifecycle of internal and external users as well as their access to all resources across your extended enterprise.

These platforms – like OpenText™ Core Secure Access – deliver a host of intelligent features, including digital identity management, authentication management, identity event streaming and identity analytics. You have the ability to create a single, central identity for everyone and every thing that can be synchronized across devices, applications, systems and resources. This increases convenience for the user while facilitating information governance and compliance.

As importantly in the hyper-connected world of digital transformation, the platform goes beyond the establishment of trusted interaction between users and organizations within your digital ecosystem. It enables the secure interoperability of the different systems and things. You have an end-to-end identity infrastructure that manages access, relationships and lifecycle for every element of your digital ecosystem.

5 key capabilities of an identity management platform

These platforms are available today. In the case of OpenText™ Covisint, it’s the platform at the core of GM OnStar serving over 12 million people everyday. Key capabilities for an identity management platform include:

1. Identity provisioning

Centralizing the process of establishing digital identities for every actor on the digital ecosystem and assigning rights reduces administration and speeds up the onboarding of new users, organizations systems and devices. The most important factor in identity provisioning is the ability to move away from identity silos where each system or application has its own rights. This also make de-provisioning quicker and more effective by only having to deactivate one identity to ensure all access rights are revoked.

2. Authentication management

While single sign on (SSO) remains an important tool in identity management, it is no longer sufficient to meet the needs of a digital ecosystem. The vast majority of data breaches in 2017 were the result of credential based cyber attacks. The platform should be able to deliver multi-factor authentication as well as support emerging authentication technologies such as biometrics. The most advanced platform allows for adaptive and risk-based authentication as well as real-time provisioning.

3. Identity federation

Identity federation allows multiple organizations to provide access to users across systems and enterprises using the same identification data. The platform manages identity federation establishing a trust relationship between different parties in the ecosystem. As digital transformation progresses, identity federation capabilities grow increasingly important to establish secure and dynamic connections between people, systems, things and services.

4. Identity governance

It’s essential that the identity management platform you select contains integrated identity governance capabilities. It should include features such as user administration, privileged identity management, identity intelligence, role-based identity administration and analytics. You must be able to define, enforce, review and audit identity management policies and map your identity function to regulatory compliance requirements and records retention policies.

5. IDaaS deployment

As companies move more services to the Cloud, Identity as a Service (IDaaS) is becoming more attractive by delivering highly secure and scalable identity management services that let organizations concentrate on developing the benefits of digital transformation in a constantly evolving digital ecosystem. Recent research showed that 57% of respondents used IDaaS for single sign on and employee portals, while one third used the approach for mobility management and multi-factor authentication.

With the new generation of identity management platforms, companies can choose to outsource their entire identity management capabilities to a trusted third-party service provider. With OpenText Core Secure Access, you can select on-premises, Cloud or hybrid Cloud deployment to suit your business and security requirements.

If you’d like know more about how identity management underpins digital transformation, it’s a key topic at Enterprise World in Toronto in July. For a personalized and private meeting, please contact us through the website or email me directly.

OpenText

June 18, 2018

Line of Business, News & Events, OpenText World, Security, Security, Supply Chain, Technologies

Business Network, cloud, cybersecurity, data security, digital transformation, IAM, information governance, IoT

Why OEM (white-label) OpenText technology?

I know, you’re thinking “Whaaatt? An OEM partnership is not even possible!”

Well, actually, it is possible and there are a variety of products available — from industry standard ISIS drivers to our analytics packages. You can check out the OpenText™ OEM page for information on all of the products we sell to OEM partners.

So why would you consider an OEM partnership with OpenText?

For the majority of industry-specialty vendors or structured data vendors, the advantage is simply that you don’t have to re-invent the wheel for a module that is not core to your product’s value proposition.

Think of it this way:

You could invest your research and development budget into building a completely bespoke content management solution, or even open source one. And, yes, you can probably make a good enough module to check off that RFI requirement or satisfy that nagging roadmap question that keeps coming up in customer advisory calls.

But what happens when the content types change/expand/die? Do you really want to keep spending your R&D budget on upgrades to what is basically a value-add?

Our OEM program is designed so that you can not only check off that box on the RFI and customer satisfaction questionnaire, but you also get a partner that actively invests — heavily — in products that are outside your sweet spot. In other words, you get a large R&D commit on a budget that makes sense for your roadmap.

When you work with OpenText, we not only help fill the gap you have identified in information management, but we understand how the product can do more — which lets you focus on your roadmap planning for your embedded products.

For example, in many transitional markets like healthcare, education, and banking, reliance on paper is slowing the transformation of these industries. Two of our key products in those spaces are OpenText™ Captiva and OpenText™ AppEnhancer (formerly known as ApplicationXtender)— both of which are available as part of our OEM program. Captiva is a top-of-the-line capture solution that can capture and digitize any type of document, email, or form, and then send it via API to any type of system of record. AppEnhancer is a content services solution that electronically stores, organizes, and manages virtually any kind of business content. It’s easy to integrate into applications as a back-end service for management of documents and extraction of data related to your processes. Oh, and by the way, these two products have out-of-the-box integration, so you can have a single embedded solution for any content to store it natively and provide access via your UI.

How would it work?

Let’s take healthcare as an example:

In a perfect world, organizations would have several different information management functions combined in a single, enterprise-wide health information management system with the flexibility to manage various use cases. But, many customers still use multiple systems to handle a variety of information sources. For example, financial information may be managed by the ERP or CPOE application while imaging is handled by DICOM systems.

To remain your customers’ preferred partner, you’re going to want to offer an application to ease the movement of information across the various administrative and clinical systems. You can either invest in several connectors or you can add the varied elements of a content services platform to your application portfolio.

I would argue that a healthcare vendor would be better off keeping up with ever-changing clinical needs (and saving their R&D dollars) by adding elements of a white-labeled content services platform.

There are three areas where adding AppEnhancer as a white-labeled content service provides value versus building extensions to a product offering:

Cost of licensing: AppEnhancer has a flexible pricing model to ensure that OEM partners can build a full business model that incorporates the additional functionality without harming revenue potential.
Maintaining permissions: Most healthcare organizations have a complex cross-departmental network of admission, transcription, and billing administration that is difficult to model without an additional layer of document and process-based security.
Reducing the cost of innovation: OpenText maintains a robust API and SDK kit for AppEnhancer, taking the cost out of maintaining back-end connectivity between the document management repository and your system. As an OpenText partner, you have access to a larger portfolio of integrated products for managing information, including Captiva, OpenText™ LiquidOffice™, OpenText™ RightFax™, or OpenText™ Magellan™.

The benefits of working with OpenText as an OEM partner means we — not you — worry about ensuring that any type of content can be made manageable by your application. Working with start-ups and large established industry veterans, our OEM team is dedicated to ensuring that we understand how our portfolio of products fit into your roadmap.

Interested in learning more?

Chris Wynder

June 15, 2018

OEM, Technologies

Content, content capture, Developers, ECM, enterprise content management, OEM, white-label

What Should I do if the Database Performance is Slow?

Troubleshoot using the following checklist if your database performance is slow.

Check if any of the following problems exist:

Step	Task	Results
1	Is the query performance slow?	If the query performance is slow, review the Query Performance checklist. If the query performance is not slow, go to Step 2.
2	Is the entire database slow?	If the whole database is slow, go to Step 3. If the whole database is not slow, your checklist is complete.
3	Check if all the nodes are UP. `=> SELECT node_name, node_address, node_state FROM nodes WHERE node_state != 'UP';`	If there is any node DOWN, Investigate why the node is down review the Node Down checklist. Restart the node. `$ admintools –t restart_nodes –d <database> -s <nodes_address>` If nodes restarted and the performance improved, your checklist is complete. If the node restarted and performance is still slow, go to Step 4. If the node did not restart, review the Node Down checklist.
4	Check if there are too many delete vectors. `=> SELECT count(*) FROM delete vectors;`	If there are more than 1000 delete vectors, review the Manage Delete Vectors checklist. If there are not too many delete vectors, go to Step 5.
5	Check if epochs are advancing. `=> SELECT current_epoch, ahm_epoch, last_good_epoch, designed_Fault_tolerance, current_fault_tolerance FROM system ;`	If epochs are not advancing, review the AHM not Advancing checklist. If epochs are advancing, go to Step 6.
6	Check if one node is slower than the others. Run a select statement for each node in the cluster and identify the there is a slower node. $ `grep -P "^v_" /opt/vertica/config/admintools.conf\|awk '{print $3}'\| awk -F, '{print $1}'`; do echo ----- $host -----; date ; vsql -h $host -c ";select /+kV/ 1 ;";date ; done	If one node is slower than the others, Investigate host performance issue. Restart the Vertica process on that node. Start: `$ admintools –t restart_node –d <database> -s` Stop: `$ admintools –t stop_node –s <node_ip/Host slow>` If all the nodes have similar performance, go to Step 7.
7	Check if the workload is balanced across all the nodes. `=> SELECT node_name,count(*) FROM dc_requests_issued WHERE time > sysdate() -1 group by 1 ORDER BY 1;`	If one node has a heavier workload, distribute the workload to all the nodes. Review the documentation on Connection Load balancing. If the workload is balanced, go to Step 8.
8	Check if there are resource rejections. `=> SELECT * FROM resource_rejections ORDER BY last_rejected_timestamp;`	If there are resource rejections, review the Query Performance checklist. If there are no significant resource rejections that justify slowness go to Step 9.
9	Check if there are sessions in queue. `=> SELECT * FROM resource_queues;`	If there are queries waiting for resources, go to Step 10.
10	Check if there are long-running sessions that are using too many resources. => SELECT r.pool_name, s.node_name AS initiator_node, s.session_id, r.transaction_id, r.statement_id, max(s.user_name) AS user_name, max(substr(s.current_statement, 1, 100)) AS statement_running, max(r.thread_count) AS threads, max(r.open_file_handle_count) AS fhandlers, max(r.memory_inuse_kb) AS max_mem, count(DISTINCT r.node_name) AS nodes_count, min(r.queue_entry_timestamp) AS entry_time, max(((r.acquisition_timestamp - r.queue_entry_timestamp))) AS waiting_queue, max(((clock_timestamp() - r.queue_entry_timestamp))) AS running_time FROM (v_internal.vs_resource_acquisitions r JOIN v_monitor.sessions s ON (((r.transaction_id = s.transaction_id) AND (r.statement_id = r.statement_id)))) WHERE (length(s.current_statement) > 0) GROUP BY r.pool_name, s.node_name, s.session_id, r.transaction_id, r.statement_id ORDER BY r.pool_name;	If there is a statement running for too long and using a high proportion of the box resources, consider stopping the statement, `=> SELECT interrupt_statement(‘session_id’,’statement_id’);` Upon statement cancellation, resources should be freed and performance should improve. If it does not improve, go to Step 11.. If the statement does not terminate properly, contact Vertica Technical Support.
11	Check if any transactions are waiting for locks. `=> SELECT * FROM locks where grant_timestamp is null;`	If transactions are waiting for locks, identify lock-holding sessions and consider wait for transaction to complete or cancel the session to free locks. `=> SELECT interrupt_stament(‘session_id’,’statement_id’);` Upon statement completion or cancellation and lock release, performance should improve. If it does not improve, go to Step 12. If the session does not terminate properly, contact Vertica Technical Support reporting a hang session.
12	Check the catalog size in memory. `=> SELECT node_name,max(ts) as ts, max(catalog_size_in_MB) as catlog_size_in_MB FROM ( SELECT node_name,trunc((dc_allocation_pool_statistics_by_second."time")::TIMESTAMP, 'SS'::VARCHAR(2)) AS ts, sum((dc_allocation_pool_statistics_by_second.total_memory_max_value - dc_allocation_pool_statistics_by_second.free_memory_min_value))/(1024*1024) AS catalog_size_in_MB from dc_allocation_pool_statistics_by_second group by 1,2) foo group by 1 ORDER BY 1 limit 50;`	If catalog is larger than the 5% of memory in the host, resource pools should be adjust to free memory needed by the catalog as Vertica process is a risk of being terminated by the kernel with OOM. Contact Vertica Technical Support to debug catalog size growth and discuss alternatives to free memory to be used to allocated catalog. Alternatives are: Adjust general pool to use less than 95%. Create an additional Pool of size of the difference needed to accommodate catalog. Adjust the METADATA resource pool to free memory for the catalog In Many cases restarting the node could free memory used by catalog, debugging with support will help to determine the best course of action.
13	Check usage of resident and virtual memory and maps of memory created. `=> SELECT * FROM ( SELECT time, node_name, files_open, other_open,sockets_open,virtual_size,resident_size,thread_count,map_count, row_number() over (partition by node_name ORDER BY time::timestamp desc) as row FROM dc_process_info ) a where row <=3 ;`	If Virtual Memory or resident memory is high, monitor to see if the numbers lower. If the numbers do not lower, contact Vertica Technical support to debug the issue. Restarting the nodes should resolve the issue but a proper debugging should be done, follow Catalog Size Debugging checklist.

Learn More

Learn more about Connection Load Balancing in the Vertica Documentation.

OpenText

June 14, 2018

Analytics, Aviator AI, Technologies

Checklists, Vertica

What Should I do When the Database Node is Down?

When database node is DOWN, troubleshoot using the following checklist.

Step	Task	Results
1	Check whether your database is UP. `$ admintools -t db_status -s UP`	If the database is UP, go to Step 2. If the database if not UP, restart your database. `$ admintools -t start_db -d <Database_name> -p <Database_password>` If the database starts, the checklist is complete. If the database does not start, see the Database Process Not Starting checklist.
2	Identify all the DOWN nodes. `=> SELECT node_name, node_address, node_state FROM nodes WHERE node_state = 'DOWN';`	Upon identification of all the DOWN node/s, proceed to Step 3.
3	Check whether you can establish a connection with the DOWN nodes using SSH. `$ ssh dbadmin @<nodedown_ip>`	If you can SSH into the node, restart Vertica process on the DOWN node. `$ admintools -t restart_node -d <database_name> -s <node_host_name or IP>` If the restart was successful, the checklist is complete. If the restart failed, go to Step 4. If you cannot SSH into the node, contact your system administrator to find if it is a port issue or a network issue.
4	To find reasons for restart failure, on the DOWN node tail startup.log. A snippet of the tail startup.log is as follows: `$ tail -f catalog-path/database-name/v_database-name_node_catalog/startup.log { "node" : "v_cdmt0_node0001", "stage" : "Database Halted", "text" : Data consistency problems found; Check that all file systems are properly mounted. Also, the --force option can be used to delete corrupted data. "timestamp" : "2016-07-31 18:17:04.122" }`	The log results shows the latest state of the DOWN node. Proceed to Step 5 to see these stages.
5	If the startup.log… a. Remains in the Waiting for cluster invitestage.	See the Spread Debugging Checklist.
	b. Remains in the Recovery stage.	See the Node Recovery Checklist.
	c. Shows an error message, Data inconsistency problems found, restart the node with the force option. `$ admintools -t restart_node -d Database_name -s node_name --force`	Upon restart the checklist is complete.
	d. Shows no new data in the startup.log after last restart, check dbLog file for errors.	If you cannot resolve errors, contact Vertica Support.
	e. Shows Shutdown Complete but the node is still DOWN. tail vertica.log and look for <ERROR> and <PANIC>.	Contact Vertica Support with PANIC report, ErrorReport.txt, and scrutinize.

OpenText

June 14, 2018

Analytics, Aviator AI, Technologies

Checklists, Vertica

What Version of Vertica am I Running?

The built-in VERSION function returns a VARCHAR that contains your Vertica node’s version information.

Example:

dbadmin=> SELECT version(); version ------------------------------------ Vertica Analytic Database v9.1.0-2 (1 row)

The Vertica version is formatted as X.Y.Z-R, where…

• X.Y is the two-digit Vertica major release number, e.g., 8.1, 9.0 and 9.1
• Z is the service pack number, e.g., 7.2.3, 8.1.1 and 9.0.1
• R is the hotfix number, e.g., 9.0.1-9 and 9.1.0-2

Have Fun!

OpenText

June 14, 2018

Analytics, Aviator AI, Technologies

Checklists, DBadmin, Vertica

OpenText Decisiv brings Predictive Research to Enterprise Search

It’s been nearly a decade since Recommind (now OpenText) first pioneered predictive coding for legal document review with OpenText™ Axcelerate™. The advent of supervised machine learning revolutionized eDiscovery with the simplest of principles: if those documents are of interest to you, these probably will be also.

Now, all users searching for content across enterprises can benefit from this same principle — and the same proprietary artificial intelligence technology — with Predictive Research in OpenText™ Decisiv™ 8.2, released in advance of OpenText Enterprise World 2018.

Decisiv augments enterprise search with the power of artificial intelligence (AI). Decisiv’s unsupervised machine learning provides sophisticated, conceptual analysis of unstructured content to help users find what they’re looking for, faster. Decisiv helps legal teams, government workers and business professionals identify useful content, and people with specific expertise, by instantly searching across a wide range of enterprise sources, even when those searching are not sure what terms are likely to yield the best results.

For example, when a Decisiv user types “asia,” the system instantly and automatically retrieves documents that (1) contain the word Asia and/or (2) are conceptually related to Asia, regardless of whether they contain the word “asia.” Decisiv not only casts a wider net, it delivers and prioritizes results based on more sophisticated relevancy analysis than simple keywords can provide.

With Predictive Research, Decisiv now adds supervised machine learning to its functionality: the system learns on a continuous basis from human decision-making. As users “pin” useful documents to their Research view for quick access, Decisiv automatically learns what they are interested in, and automatically suggests relevant documents.

Importantly, Predictive Research requires no change in how researchers operate. A single click will pin a document to the Research view, a convenient space for ready access. The Research view operates like a virtual desktop, easily collecting key content without requiring downloads or making additional copies of files. Based on sophisticated document models generated from those pinned results, Decisiv automatically – virtually instantly – suggests additional documents likely to be useful to the researcher. A single click on a suggested document enables the user to approve or reject the suggestion, automatically refining the system’s training and updating the suggestions.

Using Predictive Research also improves overall search results across the enterprise. The more a document has been pinned by users, the more likely it is to be prominently featured in Decisiv search results. This valuable contribution of tribal knowledge is leveraged across the organization to help the search engine automatically pinpoint valuable content.

As it helps deliver what users are looking for, even without their asking, Predictive Research is one of those advances that seems likely to be considered indispensable in the not-too-distant future.

Join us at Enterprise World to learn more about Predictive Research and other AI use cases for legal teams and business users across your enterprise. Register today!

OpenText

June 14, 2018

eDiscovery, OpenText World, Technologies

artificial intelligence, discovery, ediscovery, enterprise content management, enterprise search, machine learning, Predictive Analytics, Predictive Coding, security

How to choose the best EDI software and services in 2018

In our last blog: “What is Electronic Data Interchange (EDI)?” we looked at why most companies today have deployed EDI systems to trade with customers and suppliers and what are the key benefits of EDI technology for your business. In this blog, we’ll discuss the key features and capabilities you should look for when you’re selecting your EDI solution.

What is EDI?

Let’s start with a quick recap. EDI, or Electronic Data Interchange, is the transfer of structured data, by agreed message standards, from one computer system to another without the need for human interaction. Human intervention should only be required in the case of dealing with errors, for quality review, and for special situations. It provides savings in terms of man-hours, management of paper documents and storage, as well as reduction in errors and improved speed for transactions. Since its widespread adoption in the 1980s and 1990s, EDI technology has been a corporate IT stalwart and has become the standard system for large enterprises transferring electronic documents with each other – EDI payments, EDI invoices, EDI shipping documents and much more – and improving key business and supply chain processes.

Features of EDI

That electronic data interchange definition gives little idea of the major features of an EDI system. So what are the capabilities that you’ll find in the best EDI tools? Some of the features you need for your EDI systems include:

Widespread support for EDI standards
Over the preceding 50 years, there are a number of key EDI standards that have developed including the Tradacoms, ANSI X.12 or EDIFACT standards. In addition, industry EDI document formats have appeared to meet the needs of specific industries like RosettaNet in High Tech and PEPPOL in the European public sector. The EDI system that you choose should be able to accommodate all of these different EDI standards and the variations that appear within each individual standard.
Widespread support for EDI document types
While EDI technology developed initially to make exchanges of EDI invoices and EDI payments more effective, it quickly addressed the business documents that underpin other key business processes. Today, EDI document types include orders, invoices, purchase orders, shipping notices, acknowledgements, remittance advice, financial statements, quotation requests, product and sales catalogs and much, much more. The best EDI solutions offer support for all major EDI documents. There are hundreds of EDI documents to choose from.
Widespread support for communications protocols
There are a number of communications protocols that support EDI transactions and the exchange of EDI documents. AS2, ebMS, FTP, OFTP, HTTP are just a few EDI messaging protocols. To enable any-to-any EDI communications, your EDI tools will need to support all the important protocols so that you can connect with all your trading partners. The development of AS2 was important in this regard as it allowed EDI transactions to take place over the Internet – paving the way for simple web forms that opened EDI to smaller companies.
EDI translation and mapping
The EDI system you select must include EDI tools for effective document mapping and translation. EDI mapping and translation tools will take EDI data from one format and place it into another to enable the end-to-end automated flow of EDI data from the enterprise applications of the sender (usually the buyer) and the receiver (usually the supplier).
Easy onboarding of new trading partners.
Growing companies need to onboard new EDI trading partners quickly and smoothly. Your EDI solution, whether you choose EDI software or EDI services from a provider – especially when working with EDI providers like OpenText^™ – should allow you to use predefined templates that place your information into EDI formats that can quickly connect with all your trading partners. While free Electronic Data Interchange software has basic features, the lack of EDI mapping and translation or trading partner onboarding can render these EDI solutions of limited use for most organizations.
Scalability to meet EDI demand
The scale of your EDI commitment is likely to have an effect on the type of EDI system you decide to implement. Some Web EDI systems operate on a ‘per transaction’ basis, which can be economical on a smaller scale, say 500 transactions or less per month, but can quickly get out of hand at larger volumes. For a larger scale operation, you’ll want to look at all-in EDI solutions. An EDI network that incorporates all the EDI document standards you require becomes increasingly attractive.
Enterprise application integration
EDI is among the highest-value integrations in your accounting system, supply chain systems and ERP environments because it eliminates time-consuming, error-prone manual effort that would otherwise be necessary to get orders, invoices and other EDI data in and out of other enterprise applications. The more trading partners you have, the more operational costs you’ll save through EDI integration.
Value-added services
Quite often, electronic data exchange software only delivers the simple exchange of set EDI documents. However, the best EDI systems also have a range of value added services that extend what you can achieve from your EDI investment. A common service from EDI tools is secure file transfer that allows you to protect content and data containing vital corporate intellectual property. As digital transformation progresses, standard business documents are becoming larger and more varied. Managed File Transfer (MFT) allows for the secure and fast exchange of large files such as CAD or PLM files or video and rich media files.

More importantly, the move from paper to digital processes offers the potential for a much greater analysis of the data you manage. EDI documents contain some of the most important data regarding supply chain performance and the financial health of your business. The deployment of a central EDI network – with EDI integration services from reputable EDI providers – allows you to apply analytics for EDI tracking of all elements of your supply chain operations as well as identifying customer and industry trends within the data to improve customer experience, enhance new product development and increase business agility.

Types of EDI

There are many types of EDI and approaches to enabling EDI across a trading community. Whether looking at EDI for the first time or expanding an existing EDI infrastructure to support business partners across the globe, there is a method of utilizing EDI that will suit your business needs, technical capabilities and budget. These include:

Direct EDI/Point-to-point
Brought to prominence by Walmart, direct EDI, sometimes called point-to-point EDI, establishes a single connection between two business partners. In this approach, you connect with each business partner individually. It offers control for the business partners and is most commonly used between larger customers and suppliers with a lot of daily EDI transactions.
EDI Network/Point-to-multipoint
An EDI network is also known as a Value Added Networks (VAN). It is a private network – normally delivered via a third party EDI provider – where electronic business documents are exchanged between partners. The EDI provider manages the network and provides companies with mailboxes where they can send and receive EDI documents.
Web EDI
Web EDI conducts EDI using a standard Internet browser. Organizations use different online forms to exchange information with business partners. Web EDI makes EDI easy and affordable for small and medium-sized organizations and companies that have only occasional need to exchange EDI documents and data with trading partners.
Mobile EDI
Users have commonly accessed EDI either by a private network such as value added network or the Internet in order to send and receive EDI-related business documents. As mobile becomes the device of choice, EDI transactions will increasingly become mobile. There is a growing industry for developing software applications or ‘apps’ for downloading onto mobile devices and it will be only be a matter of time before you will be able to download supply chain and EDI related apps from private or corporate app stores.
Full B2B integration
While EDI really only covers the exchange of electronic documents, it is the basis for B2B integration. This can be defined as the integration, automation and optimization of key business processes that extend outside the four walls of the enterprise. In addition to data exchange, B2B integration is based around a central digital B2B backbone and includes a whole series of value-added features such as partner onboarding, community management, Managed File Transfer (MFT) and secure file transfer.

Many larger companies adopt hybrid solutions – combining different types of EDI – to ensure they can connect to their entire trading partner communities, regardless of their size, geographic location, technical capabilities of frequency of their EDI transactions.

Why choose OpenText for EDI?

EDI providers don’t come much more experienced than OpenText. Our EDI experts are at the forefront of the development of EDI with a suite of services to help organizations, whatever their size, improve their existing processes and adopt new ones. With our software and services you can exchange a wide variety of EDI transactions with your business partners, including POs, ASNs, invoices and payment instructions.

Our flexible EDI solutions incorporates a range of key components which can be adapted according to your organization’s electronic trading requirements, including:

EDI software and services
The OpenText B2B integration services products allow you to create, send, receive, print and manage EDI documents, as well as integrate to accounting and other back-office systems. Our EDI translators and EDI mapping tools convert messages from the data structures of your enterprise applications into the Tradacoms, ANSI X.12 or EDIFACT standards. OpenText range of products and services include everything from on-premises B2B integration software, to value-added network options, to B2B managed services.
Industry leading B2B integration platform
EDI is the core component of OpenText Trading Grid^™, the leading B2B integration network. Built on the strength of the OpenText Cloud, it connects more than 600,000 businesses worldwide that execute in excess of 16 billion transactions per year with a value in excess of $8 trillion. The network handles all major EDI document types, data formats and communications protocols as well as delivering a range of integrated value-added features.
B2B enablers
Our low-cost, easy-to-use alternative to connecting your smaller trading partners who still send you manual transactions via phone, fax or email, OpenText Web EDI uses a simple web “forms” application where your partners can view or create common EDI documents such as electronic purchase orders, ship notices and invoices. There is no EDI software to license or install. All that is required is an Internet connection and a browser. The Web EDI forms look like the paper equivalents so there is no training required to Web EDI.

Download this EDI basics eBook to learn more about EDI and the options you have to optimize your business transactions.

OpenText

June 12, 2018

Digital Fax, Integration, Line of Business, Supply Chain, Technologies

B2B, B2B Integration, Business Network, EDI, fax, Integration, supply chain

OpenText Business Workspaces

In my last blog, I addressed the differences between Office 365^® and OpenText^™ Extended ECM for Office 365, and talked about the benefits of connecting content from Office 365 with ERP, CRM and HCM business applications.

One of the main benefits of connected content is a complete picture of all the information you need in one place, in a UI that is familiar and easy for you to use. At OpenText, we like to call it a single source of truth. This single source of truth comes from working in an OpenText Business Workspace.

Let’s explore Business Workspaces a little further. As we dive deeper into this topic, you’ll see why this is so powerful for companies and how it fuels results, fattens the bottom line and makes customers and workers happy. It also has the added benefit of keeping content logically organized, secured and managed according to your corporate policies.

Think of a Business Workspace as a place where you have all the pertinent information right in front of you, all gathered together and easily accessible for you to start and complete your critical business operations.

For example, let’s say a customer calls and leaves a message about a missed shipment. Without leaving your familiar Microsoft^® Office 365 interface, you can find out information about this customer. Key customer data is populated from a CRM system (see screenshot below) and the account team information is also right there from your Microsoft Teams app. Now you know who on the sales account team is responsible for supporting or selling to the customer.

All the pertinent sales orders and their corresponding sales contracts are presented right there for you to easily access and find. After you write a follow up email or letter to the customer, you can file it in the correspondence folder for future reference by others on the sales or support team.

OpenText Business Workspaces integrate data from a variety of systems to offer a complete view of a customer, opportunity, asset, project or any business object, in one user-friendly interface. From this Workspace, there is full visibility across the customer’s account team, customer data and more. Integration with SAP allows you to review related data, like customer contracts, orders and deliveries.

In addition to SAP, OpenText Extended ECM for Office 365 integrates with other leading business applications like Salesforce, Oracle, Microsoft Dynamics 365 and more. This seamless connection to your critical enterprise information sources, no matter where they reside, provides you with a 360-degree view of important customer data so you can gain insights, improve productivity and fatten the bottom line.

Deidra (Dei) Jow

June 9, 2018

Supply Chain, Technologies

Business Network, Microsoft

OpenText Magellan helps G7 hear the voice of the people

Enlightened leaders take the feelings and interests of their people into account. But how can you tell what they’re thinking, about myriads of subjects? Elections are yes/no questions without nuance. Opinion polls are time-delayed and limited in how much they can ask. And letters are labor-intensive to process.

Nowadays, the usual way to take the pulse of the public is to look at what they’re saying – in digital communications to their representatives, news, or social media. The challenge is how to read through millions of words and accurately sum up what they’re about, without throwing a whole office full of aides at the problem.

Do tell

Canada is facing that challenge as it hosts the annual Group of Seven economically advanced nations’ conference June 8 and 9. The Government of Canada wants to encourage citizen engagement with the five global hot topics the G7 countries will discuss at this year’s summit. (The topics are economic growth, gender equality, climate change, peace and security, and training for the jobs of the future.)

So the G7 organizers have turned to OpenText^™, a proud Canadian company, for help displaying and analyzing public opinion on those topics, in a format that’s easy to use and share. (Hashtag #myG7 and #monG7.)

Personalized coverage of the G7 issues

The solution OpenText created is “My G7,” a self-service online dashboard that lets you see, in near-real time, what public opinions on these major issues sound like.

Digesting thousands of articles and tweets of news and social media commentary every week, the My G7 site lets you visually monitor, compare, and discover interesting facts around global opinions of the 2018 G7 Summit’s topics. It can show you what the public is saying — by topics and keywords, countries, dates, and even tone of the coverage (positive, negative, or neutral).

You can see at a glance how the discourse changes from country to country or over time and what terms are “topping the charts” — useful insights into the summit’s discussions.

How it works

MyG7.ca is built using OpenText™ Magellan™, an AI-powered analytics platform that combines open source machine learning with advanced analytics, enterprise-grade BI, and the ability to acquire, merge, manage, and analyze Big Data and Big Content in a wide range of formats.

The Magellan application automatically crawls the web for G7 Summit focused articles and tweets in both French and English, and retrieves the raw text for evaluation.

The text mining module within Magellan processes the text to identify people, places and topics, following taxonomies (classification structures) that help the software organize and interpret the various terms it encounters.

The OpenText team that created the G7 tracker gave it a custom taxonomy. Although Canada picked the top five topics, sub-topics (such as “human rights” or “hunger”) were generated from the raw text the tracker had already consumed.

Sentiment analysis: That’s great, juuust greeaat

At the same time, Magellan determines the subjectivity and sentiment or tone of the article in question, using powerful, proprietary OpenText technology. (“Subjectivity” refers to whether an article/tweet is factual or presenting an opinion, while “tone” refers to its mood – positive, negative, or neutral.)

It’s even sophisticated enough to interpret sarcasm, emojis, and other nuances. 😲

Visualizations: I see your point

The tracker’s findings are then rendered as colorful, interactive charts and visualizations (from bar and ring charts to word clouds), making them easier to take in at a glance. Magellan’s JavaScript API (one of many APIs built into Magellan) seamlessly embeds the visuals into web pages.

With just a few clicks, you can alter the parameters to analyze the G7 coverage by the numbers, tone, theme, country, or time frame, and see what changes. Or pursue your own interests to find unique insights.

Showcasing the power of unstructured data

The G7 tracker is the latest in a series of online tools OpenText has built to showcase our ability to analyze unstructured data (mainly text, in contrast to structured data such as numbers in databases). We started 2 1/2 years ago with Election Tracker, which looked at news coverage of the 2016 U.S. Presidential Election.

OpenText knows making sense of unstructured content is a challenge for many organizations. Whether it’s spotting clues in insurance claims that could indicate fraud, searching for patterns in contracts that could help a company negotiate more consistent, profitable deals, or taking the temperature of consumer sentiment about product changes, companies want content analytics solutions that are more precise and more scalable than “assign a bunch of low-level employees to read through stacks of documents and make notes based on subjective impressions.”

The Magellan Text Mining module brings at least three big advantages to solving that problem:

Smart, automated text analytics. Just as Magellan can swiftly sort through, classify, and make sense of millions of G7 comments, it can do the same with millions of documents, memos, emails, tweets, or claim forms.
Interactive visualizations. We humans are visual creatures, so we perceive relationships and patterns much faster through images than words and numbers. OpenText Magellan offers many ways to focus on just the data set you want, and visualize it in ways that make sense to you. No data science degree needed!
It’s scalable and embeddable. Magellan’s technology platform can handle even the largest data sets and heavy peak demand. And a wide range of APIs means you can embed its capacities into nearly any business application or website, whether desktop or mobile, under your own branding.

For more information on Magellan, contact us.

OpenText

June 7, 2018

Analytics, Aviator AI, Technologies

Analytics, Canada, magellan, OpenText Magellan, sentiment analysis, text analytics, unstructured content, unstructured data, Voice of Customer

Vertica Optimizer

Do: Check the SQL

Do: EXPLAIN Your Query

Do: Update Statistics

Do: Run Your Query Using vsql

Do: Check QUERY_EVENTS

Do: DDLs and Projections

Do: Profile Your Query

Do: Update System Config (If needed)

Don’t: Underestimate Data Extraction

Useful Queries

The identity challenge

Identity management: Responding to the challenge

5 key capabilities of an identity management platform

1. Identity provisioning

2. Authentication management

3. Identity federation

4. Identity governance

5. IDaaS deployment

Learn More

What is EDI?

Features of EDI

Types of EDI

Why choose OpenText for EDI?

Do tell

Personalized coverage of the G7 issues

How it works

Sentiment analysis: That’s great, juuust greeaat

Visualizations: I see your point

Showcasing the power of unstructured data