Thursday, July 12, 2012

Diagnosing and Reporting a Product Bug

When upgrading a customized UCM instance from a pre-10g release to 11g, we encountered a subtle but critical bug in OTS (Oracle Text Search), where the result of an incremental index update sometimes differed from that of a full collection index rebuild.

In this case, the client instance had a number of custom metadata fields, including Major Revision and Minor Revision. While Major Revision was a required field, a Minor Revision value was optional. A custom component allowed a user to explicitly search for documents with a blank Minor Revision. Here is where the fun started.

All content that had been migrated from the old instance to the 11g instance could be successfully be retrieved by the component, regardless of its Major Revision and/or Minor Revision values (or lack of value). All newly added content could be retrieved using the custom component…unless the Minor Revision was blank, and the search criteria being passed to the custom component explicitly looked for a blank Minor Revision. In that one case, no search results were returned.

For example, here were the results for a search of Document Number SO-G-108. Note that there are 15 items returned, 9 of which have blank minor revisions:

When we added the condition that the minor revision value must be blank, we got:

Yes, only 7 of the 9. From the Release Dates, we could see that the 2 that were missing had been checked in on April 24, i.e. after the content migration and a full search index rebuild. Just for reference, here’s how I defined that search:

I should add that when I was running the test show above, for the one with the 15-item result set, I also ran it with &IsSoap=1 appended to the URL, and the xppMinorRev values looked identical for all 9 with blank values there, i.e. there was no visible difference in the metadata values for the 7 that came back in the more specific search versus the 2 that did not.

In order to further help pinpoint the issue, we performed another collection index rebuild, and lo and behold, the search now returned all 9 documents with the blank minor revision. We then immediately checked in another document fitting the criteria, and repeated the steps above. Sure enough, the “new” document showed up when only the Document Number was specified, but was not in the search results when the blank Minor Revision was added to the criteria. So, it was time to open a bug report with Oracle, with clear instructions on how to reproduce the errant behavior. Clearly, incremental indexing that occurs upon check-in was suspect here. The bug was opened, and Oracle replied: 

The short version of this long email is:  Bob is right, this is a bug. I’ll open one.  

Searching on nulls is fixed in ps5 for DATABASE.METADATA and DATABASE.FULLTEXT. However, I haven’t checked OracleTextSearch yet, so this is good test.

I created two text fields to test what Bob described. This query is converted to the following format before getting run on the database:

Universal format: xTextField <matches> `SO-G-108` <AND> xMinorRevision <matches> ``
Converted native query: ' ((SO#0023G#0023108) WITHIN xTextField)   and   ((idcnull) WITHIN xMinorRevision) '

Notice that the empty string `` is converted to idcnull for that actual database search.

When I look in the idctext2 table, I can see that empty xMinorRevision fields have ‘idcnull’ as their value.  So far everything makes sense.

select dDocName, xMinorRevision from idctext2 where xTextField LIKE 'SO-G-108';

DDOCNAME                       XMINORREVISION                
------------------------------ ------------------------------
PFLIES-LNX.US.001405           A                             
PFLIES-LNX.US.001406           A                             
PFLIES-LNX.US.001403           idcnull                       
PFLIES-LNX.US.001404           B       

When I run the query described above I get one hit, which is what I expect.
xTextField <matches> `SO-G-108` <AND> xMinorRevision <matches> ``

I can manually run this query on the database using the instructions in the note:
Troubleshooting OracleTextSearch in UCM 10g and 11g: How To Convert UCM Searches into SQL Queries (Doc ID 1333414.1)

This query is exactly the same as what UCM runs. It returns the expected document.
SELECT dDocName FROM idctext2 WHERE CONTAINS(dDocName, '((SO#G#108) WITHIN xTextField)   and   ((idcnull) WITHIN xMinorRevision) ')>0;

DDOCNAME                       XMINORREVISION                
------------------------------ ------------------------------
PFLIES-LNX.US.001403           idcnull 

Now, I go to optimize the field xMinorRevision and run a fast rebuild.  The fast rebuild does what it always does, and that’s update the otsmeta column, adding: 
If a minor revision exists, then it’s set to
The “sd” prefix means that the field is an SDATA section. 
However, and here’s where Bob is right about the problem, new checkins have idcnull set in the xMinorRevision field. But in the otsmeta column, the sdata section lacks IDCNULL. So yeah, looks like a bug. I’ll open one…
New checkins after the fast rebuild have this in otsmeta:

When they should have this:

Oracle subsequently supplied a patch to the client, which resolved the problem.

The intent of this article is twofold:
  1. For any reader who has experienced similar behavior – this is indeed a bug, and Oracle does have a patch to address it. I am unsure if it has been included in any official patch release.
  2. If you suspect a bug in a product, be persistent and methodical in diagnosing and reporting the problem. If the behavior surfaces in a customization (as it did here), your first priority must be to successfully demonstrate the bad behavior WITHOUT the involvement of the customization in the mix. Finally, report the bug in as much detail as possible, and provide detailed instructions on how to reproduce the bad behavior. This will go a long way in getting the problem addressed.