MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Read a CTX index stoplist

with one comment

If you want to know what’s in the default or custom STOPLIST, you can use this set of tools to find, format, and output the words in a STOPLIST. This example requires you to unlock the CTXSYS account. Then, you must grant the SELECT privileges on the CTX_STOPWORDS table to your user.
 
You create a package that contains an associative array that is indexed by a single alphabetic character:

CREATE OR REPLACE PACKAGE tools IS
  TYPE list IS TABLE OF VARCHAR2(4000) INDEX BY VARCHAR2(1);
END tools;
/

You create a function to organize the list of excluded words alphabetically in an associative array:

CREATE OR REPLACE FUNCTION get_stoplist (stoplist VARCHAR2) RETURN TOOLS.LIST IS
 
  -- Define and declare collection.
  TYPE alpha_key IS TABLE OF CHARACTER;
  code ALPHA_KEY := alpha_key('A','B','C','D','E','F','G','H','I','J'
                             ,'K','L','M','N','O','P','Q','R','S','T'
                             ,'U','V','W','X','Y','Z');
 
  -- Define two collections.
  stopword TOOLS.LIST;
 
  -- Define cursor.
  CURSOR c (stoplist VARCHAR2) IS
    SELECT spw_word
    FROM ctxsys.ctx_stopwords
    WHERE spw_stoplist = UPPER(stoplist)
    ORDER BY spw_word;
 
BEGIN
 
  -- Initialize reserved and key word collections.
  FOR i IN 1..code.LAST LOOP
    FOR j IN c(stoplist) LOOP
      IF code(i) = UPPER(SUBSTR(j.spw_word,1,1)) THEN
        IF stopword.EXISTS(code(i)) THEN
          stopword(code(i)) := stopword(code(i)) || ', ' || j.spw_word;
        ELSE
          stopword(code(i)) := j.spw_word;
        END IF;
      END IF;
    END LOOP;
  END LOOP;
  RETURN stopword;
END get_stoplist;
/

You create a function to format the output at the SQL*Plus prompt:

CREATE OR REPLACE FUNCTION format_list (list_in TOOLS.LIST) RETURN BOOLEAN IS
 
  -- Declare control variables.
  CURRENT VARCHAR2(1);
  element VARCHAR2(2000);
  status BOOLEAN := TRUE;
 
BEGIN
 
  -- Read through an alphabetically indexed collection.
  IF list_in.COUNT > 0 THEN
    CURRENT := list_in.FIRST;
    element := list_in(CURRENT);
    WHILE CURRENT IS NOT NULL LOOP
      DBMS_OUTPUT.put_line(CURRENT||' '||element);
      CURRENT := list_in.NEXT(CURRENT);
 
      -- Prevent reading beyond the list with a NULL index value.
      IF CURRENT IS NOT NULL THEN
        element := list_in(CURRENT);
      END IF;
    END LOOP;
  END IF;
  RETURN status;
END format_list;
/

You call the formatted output using an anonymous block:

BEGIN
  IF format_list(get_stoplist('default_stoplist')) THEN
    NULL;
  END IF;
END;
/

You’ll see the following output:

A a, all, almost, also, although, an, and, any, are, as, at
B be, because, been, both, but, by
C can, could
D d, did, do, does
E either
F for, from
H had, has, have, having, he, her, here, hers, him, his, how, however
I i, if, in, into, is, it, its
J just
L ll
M Mr, Mrs, Ms, me, might, my
N no, non, nor, not
O of, on, one, only, onto, or, our, ours
S s, shall, she, should, since, so, some, still, such
T t, than, that, the, their, them, then, there, therefore, these, they, this, those, though, through, thus, to, too
U until
V ve, very
W was, we, were, what, when, where, whether, which, while, who, whose, why, will, with, would
Y yet, you, your, yours

 

There are permutations that would let you see XHTML formatted output. They’re left to your creativity.

The default STOPLIST is created by the $ORACLE_HOME/ctx/admin/defaults/drdeftr.sql script. You can add a word to the excluding STOPLIST by:

BEGIN
  CTX_DDL.ADD_STOPWORD('DEFAULT_STOPLIST','oh');
END;
/

Written by maclochlainn

July 5th, 2008 at 9:01 pm

Posted in Uncategorized

One Response to 'Read a CTX index stoplist'

Subscribe to comments with RSS or TrackBack to 'Read a CTX index stoplist'.

  1. […] The example is here … […]

Leave a Reply