04 Documentation updates
Server
Full-text indexes larger in R5, smaller in 5.0.5

R5 full-text indexes have been larger than R4 indexes, sometimes by as much as 50 percent. The main cause was a problem that led to too many attachment types being indexed; this problem was resolved in QMR 5.0.5.

R5 index size can now be reduced by upgrading to 5.0.5, which incorporates a Unicode version of the internal GTR (Global Text Retrieval) engine. This version of the engine was first included in QMR 5.0.3, but not as the default search engine. Customers upgrading to 5.0.3 or 5.0.4 could elect to turn the new engine on with a NOTES.INI setting (see the "Domain Search and the Summarizer - additional information" Release Note on the Summarizer). The Unicode engine saves disk space by combining terms from all code pages (languages) into a central lexicon. The 5.0.5 implementation of the engine has the added advantage of indexing only appropriate (non-binary) attachment types.

As in R4, the size of a full-text index in R5 and R5.x depends on the amount of text data in the database, not on the overall database size. A small database with a lot of text can generate a larger index than a large database with a lot of design elements. In R4, a full-text index was typically 50 to 80 percent of the size of the data in the database. In R5, the index grew to 75 to 120 percent of the size of the data. In QMR 5.0.5, index size decreases to roughly the same size as in R4.
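The percentages above can be turned into a rough disk space estimate. The following Python sketch is an illustration only and is not part of the product: the function name and table are hypothetical, and the 5.0.5 entry simply reuses the R4 range on the assumption that "roughly the same size as in R4" can be read that way.

```python
# Index size as a percentage of the text data size, as quoted in this note.
# The "5.0.5" row is an assumption: it reuses the R4 range because the note
# says 5.0.5 indexes are roughly the same size as R4 indexes.
INDEX_PERCENT = {
    "R4": (50, 80),
    "R5": (75, 120),
    "5.0.5": (50, 80),
}


def estimate_index_size_mb(data_mb, release):
    """Return a (low, high) estimate of full-text index size in MB.

    data_mb is the amount of text data in the database, not the size
    of the database file itself.
    """
    low_pct, high_pct = INDEX_PERCENT[release]
    return (data_mb * low_pct / 100, data_mb * high_pct / 100)


if __name__ == "__main__":
    for release in ("R4", "R5", "5.0.5"):
        low, high = estimate_index_size_mb(400, release)
        print(f"{release}: {low:.0f} to {high:.0f} MB")
```

For 400 MB of text data, this gives 200 to 320 MB for R4 and 300 to 480 MB for R5. Actual index sizes depend on the content being indexed, so treat these figures as a planning range only.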

We recommend that all customers concerned with index size upgrade to QMR 5.0.5.

If you want to reduce the size of your indexes even further, you can turn off attachment indexing by setting the NOTES.INI variable Ft_Index_Attachments to 2.
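As an illustration of the setting above, the following Python sketch adds or replaces a variable in the text of a NOTES.INI file. It is a hypothetical helper, not a Lotus tool: the function name is invented, it works on text you pass in rather than on a live file, and it assumes that NOTES.INI entries are plain name=value lines whose names can be matched without regard to case.

```python
def set_ini_value(ini_text, key, value):
    """Return ini_text with key=value set.

    An existing entry for key (matched case-insensitively, which is an
    assumption about NOTES.INI) is replaced; otherwise a new line is
    appended at the end.
    """
    new_line = f"{key}={value}"
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip().lower() == key.lower():
            lines[i] = new_line
            break
    else:
        # No existing entry was found, so add one.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "[Notes]\nFT_INDEX_ATTACHMENTS=1\n"
    print(set_ini_value(sample, "Ft_Index_Attachments", 2))
```

Run against the sample text, the sketch replaces the existing entry with Ft_Index_Attachments=2. Make a backup copy of NOTES.INI before changing it by any method.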