Jan Vala
I’m an ECM industry veteran based in Prague, and even after so many years in the business, I keep on learning new things. For example: did you know that long-term archiving can be exciting? Follow my blog posts to find out more.

Step-by-step Guide: Integrate Market-Leading Analytics Engines With InfoArchive

Analytics

Gaining further insights from your data is a must-have in today’s enterprise. Whether you call it analytics, data mining, business intelligence or big data, the task is the same: extract insight from a massive heap of data. But what if your data has already been archived? What if it now resides in your long-term archiving platform? Will you still be able to use it in analytics scenarios? Let me demonstrate how easily it can be done if your archiving platform is OpenText™ InfoArchive (IA).

A customer recently requested a demonstration of integration with analytics/BI tools in a workshop we were running. The question was: what are the possibilities for integrating InfoArchive with third-party analytics engines? The answer: everything in InfoArchive is exposed to the outside world through a REST API. And when I say everything, I mean every action, configuration object and search screen – literally everything. So we decided to use the REST API for the analytics integration demo.

Which analytics/BI tool to pick? A quick look at the Gartner Magic Quadrant offers some hints. I have used Tableau with InfoArchive in the past, so let’s look at another option on the Gartner list: Qlik. OpenText™ Analytics (or its open-source companion BIRT) is my other choice – for obvious reasons. Let’s get our hands dirty now!

Qlik

Qlik Sense Desktop seems to have a simple UI, but there are some powerful configuration options hidden behind the nice façade. To query a third-party source in Qlik, simply open the Data load editor and create a new connection. Pick the Qlik REST Connector and configure it. The connection configuration screen lets you specify the request URL, the request body and all necessary header values. That’s all you need for a quick test.

Now that the connection is configured, you have to tell Qlik how to process the IA REST response. Click the “Select data” button in your connection and Qlik will connect to InfoArchive, execute the query and show you the JSON results in a tree and table browser. All you need to do is pick the column names that you want Qlik to process, as shown below.

Since the IA REST response stores columns as name-value elements, we have to transpose the data. This can easily be done with some 20 lines of script in the Qlik data connection:

// Transpose the name-value columns returned by the IA REST API
Table3:
Generic LOAD * Resident [columns];

TradesTable:
LOAD Distinct [__KEY_rows] Resident [columns];

// Collect the names of all tables produced by the generic load
FOR i = 0 to NoOfTables()
  TableList:
  LOAD TableName($(i)) as Tablename AUTOGENERATE 1
  WHERE WildMatch(TableName($(i)), 'Table3.*');
NEXT i

// Join each generated table back into one row-per-record table
FOR i = 1 to FieldValueCount('Tablename')
  LET vTable = FieldValue('Tablename', $(i));
  LEFT JOIN (TradesTable) LOAD * RESIDENT [$(vTable)];
  DROP TABLE [$(vTable)];
NEXT i

We’re almost done. Let’s visualize the data in a nice report now. Select “Create new sheet” on the Qlik “App overview” page and add tables and charts to present your data. My example can be seen below.

Just click “Done” at the top of the screen and you’ll see the end-user view: browse the data, filter it, and all charts will update dynamically based on your selection. Job done!
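Outside Qlik, you can test the same query with any REST client. Here is a minimal sketch in Python using the requests library; the base URL, search resource path and authentication header are illustrative assumptions, so check your InfoArchive REST API documentation for the actual resource paths and auth scheme:

import requests

# Hypothetical host and search resource -- substitute the values
# from your own InfoArchive installation.
IA_BASE = "https://infoarchive.example.com/restapi"
SEARCH_URL = IA_BASE + "/systemdata/searches/Trades"

response = requests.post(
    SEARCH_URL,
    json={"criterions": []},                      # empty criteria: return all rows
    headers={"Authorization": "Bearer <token>"},  # auth scheme assumed; adjust to your setup
    timeout=30,
)
response.raise_for_status()

# The response nests rows -> columns as name/value pairs (the same
# structure the Qlik tree browser shows); flatten each row to a dict.
rows = response.json().get("rows", [])
records = [{col["name"]: col["value"] for col in row["columns"]} for row in rows]
for record in records:
    print(record)

This is essentially what the Qlik generic-load script above does, expressed as a dictionary comprehension instead of a table join.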

Read More

Forget on-premises: InfoArchive, Docker and Amazon AWS

InfoArchive

There are two buzzwords we have heard in the IT world for some time now: cloud and containerization. For me, 2016 proved that these two topics have moved from hype to reality, even in the biggest enterprises, and a lot of customers were asking for our solutions, like OpenText™ InfoArchive, in public clouds and/or running as Docker containers. While our engineering and PS teams are doing a great job providing these solutions, I decided to walk this route myself. Follow me on the journey if you’re interested.

I started my tests by creating a Docker Hub account. The account’s private repository will be used to store the InfoArchive Docker images and to deploy automatically from there. It is very easy to create a Docker container from InfoArchive – talk to me if you want to know more. It takes just a couple of steps and you’ll have your InfoArchive Docker container image ready. What’s next? Now let’s run this image on the Amazon EC2 Container Service (ECS).

Welcome to the “cloud world”

If you’re new to the Amazon world you might have difficulty understanding some of the terminology around Amazon ECS. I hope this post will help you with that.

ECS cluster

In the first step we need an ECS cluster. An ECS cluster consists of EC2 instances and services. EC2 instances are our “good old” virtual machines and represent our available compute resources. The work that you assign to the cluster is described as “services”. The picture below shows that our InfoArchive cluster started with 3 micro servers (each of them automatically initiated by ECS from the amzn-ami… VM image shown below).

Within a minute your cluster’s compute resources are running and waiting for you to assign them some work. Ignore the memory values in the screenshot below – I took it with 3 running tasks already occupying the memory.

InfoArchive is a “classic” three-tiered architecture product: the native XML database xDB at the back end, the InfoArchive server as middleware, and the InfoArchive web UI on top. To prepare for the scalability requirements of our deployment, we’ll run each of the tiers as dedicated containers, and we’ll “front-end” each tier with an EC2 load balancer. This approach also simplifies the configuration of the container instances, since each container only has to connect to the underlying load balancer (with its known, stable hostname/IP) instead of chasing the constantly changing IP addresses of the other container instances. At a very high level, the architecture can be depicted as shown below.

EC2 load balancers are set up quickly – my list (shown below) contains 4 instances, since I’ve also configured a dedicated public load balancer for xDB connectivity. With this step completed, the ECS cluster, its compute resources and the cluster load balancers are ready. Let’s put InfoArchive on the cluster now.
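To make the “assign work as services” step concrete, here is a minimal sketch using the AWS SDK for Python (boto3) of how one tier – the xDB container – could be registered as a task definition and started as a service on the cluster. The region, cluster name, image repository, memory size and port are illustrative assumptions, not the exact values from my setup:

import boto3

ecs = boto3.client("ecs", region_name="eu-west-1")  # region assumed

# Describe one tier of the stack as a task definition; the image
# comes from the private Docker Hub repository mentioned above.
ecs.register_task_definition(
    family="infoarchive-xdb",
    containerDefinitions=[
        {
            "name": "xdb",
            "image": "mydockerhubaccount/infoarchive-xdb:latest",  # repo name assumed
            "memory": 900,  # MiB; micro instances leave little headroom
            "portMappings": [{"containerPort": 2910, "hostPort": 2910}],  # xDB port assumed
        }
    ],
)

# Run it as a long-lived service so ECS keeps the desired count
# running; wiring it to the xDB load balancer would additionally
# take the loadBalancers parameter of create_service.
ecs.create_service(
    cluster="infoarchive-cluster",  # cluster name assumed
    serviceName="xdb-service",
    taskDefinition="infoarchive-xdb",
    desiredCount=1,
)

The same pattern repeats for the InfoArchive server and web UI tiers, each behind its own load balancer.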

Read More