Tuesday, September 17, 2013

WebCenter Enterprise Capture - Introduction and Overview

Hey there everyone,

As you may or may not know by now, Oracle has finally invited Oracle Document Capture & Oracle Distributed Document Capture into the 11g family. 

As of PS7 (release 11.1.1.8), ODC is now called Oracle WebCenter Enterprise Capture. For the purposes of this article, the 10g versions of Oracle Document Capture and Oracle Distributed Document Capture will be referred to as ODC and ODDC, respectively. The 11g version will use the WEC acronym.

Since this is a complete change from the 10g product, these articles will be broken into simple parts. The goal is to get both administrators and users comfortable with the new 11g application.


Introduction and Overview


There is no longer any separation between ODC and ODDC. WEC is a single thin-client entity. There is, however, a separation between the admin console and the user client. "Console" and "client" are the two key terms to remember.





The default port and web root for these two webapps are:

http://<dns>:16400/dc-console
http://<dns>:16400/dc-client

The first is the console, which is the new admin and manager interface. The second is the client webapp, where the actual scanning and indexing take place. The console provides access to all of the security, metadata, classification, capture, processing, and commit configurations. We'll go into further detail on each of these shortly.
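If you want to script a quick check against both entry points, here's a minimal Java sketch that builds the two URLs from a host name. The class and method names are made up for this post and are not part of WEC. The only values taken from above are the default port (16400) and the two web roots (dc-console and dc-client). If your managed server listens on a different port, pass it to build() instead.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not a WEC API.
final class CaptureUrls {
    // Default port and web roots as listed above.
    static final int DEFAULT_PORT = 16400;
    static final String CONSOLE_ROOT = "dc-console";
    static final String CLIENT_ROOT = "dc-client";

    // URL of the admin/manager console on the default port.
    static URI consoleUrl(String host) {
        return build(host, DEFAULT_PORT, CONSOLE_ROOT);
    }

    // URL of the scanning/indexing client on the default port.
    static URI clientUrl(String host) {
        return build(host, DEFAULT_PORT, CLIENT_ROOT);
    }

    // Generic builder for installs that do not use the default port.
    static URI build(String host, int port, String webRoot) {
        return URI.create("http://" + host + ":" + port + "/" + webRoot);
    }

    public static void main(String[] args) {
        // Host name is an assumption; pass your own as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("Console: " + consoleUrl(host));
        System.out.println("Client:  " + clientUrl(host));
    }
}
```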

Now that you know the two main webapps, you'll next notice that they are ADF webapps. You'll surely recognize the look and feel of the WebCenter ADF login page. If that doesn't do it, the familiar URL pattern of an ADF application will.


High-level Application Organization


Batches and documents are still the key logical groupings in WEC. There is one new, higher-level construct: the workspace.




The workspace is the new high-level container for all configurations. You can think of everything created in ODC 10g as living within a single default workspace. In 11g, you can create mutually exclusive workspaces that have their own commit configurations, metadata, security, users, etc.

In fact, you must create a workspace before you can start configuring any of your scan configurations, because you do not get a default one. You can see from the above screenshot what the console looks like with a workspace selected.

Using multiple workspaces allows managers to give users different levels of access. It also allows separate environments to be created on a single workstation. There are a number of benefits afforded by this configuration.

You can clone workspaces as well, so setup time can be lowered for similar workspaces. We'll also look into how workspaces are stored so we can see if they can be easily migrated between environments.


Processors


Workspaces in WEC allow for the configuration of 'processors'. These are the new manifestations of the Import, Commit, and Recognition servers. In addition to these, there is a new entity called the "Document Conversion Processor". 

The Import Processor has trimmed a few of the available jobs. ODC 10g offered FTP, fax, custom, email, and folder providers. WEC 11g now offers email, folder, and 'list file' jobs. The 'list file' job is similar to the options available in the 10g folder provider, but it has now been separated out into its own top-level option. Look for an upcoming post on the Import Processor that shows the new configuration options.




The Document Conversion Processor is the engine that exposes the Outside In Technology (OIT) conversion features of WEC. In 10g, the conversion features were sprinkled around the application, but now things are better organized. Also, in 10g it took a registry hack to enable OIT, but it's enabled by default in 11g. More about the Document Conversion Processor coming soon.

The Recognition Processor is still similar to the 10g Recognition Server. There are a few differences, but the general purpose and function of this processor are the same. ** Note that my patched installation does not currently show the Recognition Processor Jobs table as expected. The documentation and help pages seem to expect two tables on this page, but I'm only seeing one:




The Commit Processor contains individual profiles that can be compared to the file cabinets used in 10g. The options are similar to those available in 10g, but they have been streamlined. You'll see that there are fewer configuration options than in 10g because the metadata, lookups, and relationships have moved up to the workspace. One thing to note is that there are three available output formats: TIFF, PDF Image-Only, or Searchable PDF. ** Note that my current installation (11.1.1.8 MLR01) does not contain the Searchable PDF option. I'm currently looking into why my installation seems to differ from the documentation. (see update at bottom)






High-Level Flow

Many of the processor configurations include a 'post-processing' setting that selects the next processor to run. The available options are 'Document Conversion', 'Recognition Processor'**, and 'Commit Processor'.

** As with the Recognition Processor configurations mentioned above, my MLR01 patched instance does not show this option. This is what's shown out of the box. I'll be sure to show how to get these configurations enabled in a subsequent article. (see update at bottom)

The typical starting points are either an Import Processor job or a capture-enabled Client Profile. Client Profiles will be covered in the next article.

If enabled, Document Conversion and Recognition processes are triggered after initial ingestion. 

Finally, the commit profiles are processed. 
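To make that ordering concrete, here's a small Java sketch that models the flow as a list of stages. This is only my mental model of the sequence described above, not how WEC implements it internally. The class, enum, and flag names are invented for illustration, and the "enabled" flags stand in for whatever post-processing choices you make in the console.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the high-level flow; names are hypothetical.
final class CaptureFlow {
    // INGESTION covers both starting points: an Import Processor job
    // or a capture-enabled Client Profile.
    enum Stage { INGESTION, DOCUMENT_CONVERSION, RECOGNITION, COMMIT }

    // Returns the stages a batch would pass through, in order.
    static List<Stage> plan(boolean conversionEnabled, boolean recognitionEnabled) {
        List<Stage> stages = new ArrayList<>();
        stages.add(Stage.INGESTION);
        if (conversionEnabled) {
            stages.add(Stage.DOCUMENT_CONVERSION);
        }
        if (recognitionEnabled) {
            stages.add(Stage.RECOGNITION);
        }
        // Commit profiles are always processed last.
        stages.add(Stage.COMMIT);
        return stages;
    }

    public static void main(String[] args) {
        // Conversion on, recognition off (e.g. a non-Windows install).
        System.out.println(plan(true, false));
    }
}
```

With conversion enabled and recognition disabled, the plan is ingestion, then document conversion, then commit.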

Customization

The 10g VBA macros appear to be deprecated and incompatible with 11g. I will look further into this in an upcoming article, but I wanted to at least mention it in the overview article.

The new customization language is JavaScript. The JavaScript engine is the default in your JRE. This used to be the Rhino JavaScript engine, but I'm not sure if this is still the case. You can at least get an idea of the underlying engine here.
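If you'd rather check for yourself, the standard javax.script API can list the script engines a JRE has registered. The sketch below is a generic Java snippet, not anything WEC-specific. It reports on whatever JRE you launch it with, so run it with the same JRE your WEC managed server uses (I'm assuming you know where that lives). The class name is made up for this post.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Hypothetical utility: lists the script engines registered in the running JRE.
final class ScriptEngineInfo {
    // One descriptive line per registered engine factory.
    static List<String> describeEngines() {
        List<String> lines = new ArrayList<>();
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            lines.add(f.getEngineName() + " " + f.getEngineVersion()
                    + " -> " + f.getLanguageName() + " " + f.getLanguageVersion()
                    + " " + f.getNames());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = describeEngines();
        if (lines.isEmpty()) {
            System.out.println("No script engines registered in this JRE.");
        }
        for (String line : lines) {
            System.out.println(line);
        }
        // getEngineByName returns null when no engine answers to that name.
        ScriptEngine js = new ScriptEngineManager().getEngineByName("JavaScript");
        System.out.println("JavaScript engine available: " + (js != null));
    }
}
```

The engine name in the output tells you which implementation sits underneath. If nothing is listed, that JRE has no script engine registered at all.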

The documentation states that there are three locations where JS scripts can be leveraged:

  • Client
  • Import Processor
  • Recognition Processor
We'll show custom examples of each in the upcoming Customization article.


Summary

The purpose of this article is to show the new interfaces, the new configuration options, and to relate some of the concepts back to the well-known 10g application.

Each of the concepts shown here will be explored in full detail in upcoming articles through fully implemented examples. In addition, any issues, glitches, gotchas, or confusion points will be discussed and detailed as well.

The next article will introduce Client Profiles and show how they relate to the Classification configurations as well as the Metadata configurations.

Look for the next article later this week!

Thanks all,
-ryan

** Update @ 11:14AM - Regarding the missing Recognition Processor and Searchable PDF options, it turns out that these features are only available on Windows installations. This is a very important point to keep in mind when spec'ing out your environment!