Import glibc-mainline for 2006-08-16

git-svn-id: svn://svn.eglibc.org/fsf/trunk@4 7b3dc134-2b1b-0410-93df-9e9f96275f8d
author: gcc <gcc@7b3dc134-2b1b-0410-93df-9e9f96275f8d> 2006-08-17 01:18:26 +0000
committer: gcc <gcc@7b3dc134-2b1b-0410-93df-9e9f96275f8d> 2006-08-17 01:18:26 +0000
commit: 15f34685e7a9b5caf761af2ebf6afa20438d440b (patch)
tree: dc04ce3cdf040f198743c15b64557824de174680 /libc/manual/search.texi
parent: 1e848e0e775a36f6359161f5deb890942ef42ff3 (diff)
1 files changed, 564 insertions, 0 deletions
diff --git a/libc/manual/search.texi b/libc/manual/search.texi
new file mode 100644
index 000000000..1ac2653f1
--- /dev/null
+++ b/libc/manual/search.texi
@@ -0,0 +1,564 @@
+@node Searching and Sorting, Pattern Matching, Message Translation, Top
+@c %MENU% General searching and sorting functions
+@chapter Searching and Sorting
+
+This chapter describes functions for searching and sorting arrays of
+arbitrary objects.  You pass the appropriate comparison function to be
+applied as an argument, along with the size of the objects in the array
+and the total number of elements.
+
+@menu
+* Comparison Functions::        Defining how to compare two objects.
+				 Since the sort and search facilities
+                                 are general, you have to specify the
+                                 ordering.
+* Array Search Function::       The @code{bsearch} function.
+* Array Sort Function::         The @code{qsort} function.
+* Search/Sort Example::         An example program.
+* Hash Search Function::        The @code{hsearch} function.
+* Tree Search Function::        The @code{tsearch} function.
+@end menu
+
+@node Comparison Functions
+@section Defining the Comparison Function
+@cindex Comparison Function
+
+In order to use the sorted array library functions, you have to describe
+how to compare the elements of the array.
+
+To do this, you supply a comparison function to compare two elements of
+the array.  The library will call this function, passing as arguments
+pointers to two array elements to be compared.  Your comparison function
+should return a value the way @code{strcmp} (@pxref{String/Array
+Comparison}) does: negative if the first argument is ``less'' than the
+second, zero if they are ``equal'', and positive if the first argument
+is ``greater''.
+
+Here is an example of a comparison function which works with an array of
+numbers of type @code{double}:
+
+@smallexample
+int
+compare_doubles (const void *a, const void *b)
+@{
+  const double *da = (const double *) a;
+  const double *db = (const double *) b;
+
+  return (*da > *db) - (*da < *db);
+@}
+@end smallexample
+
+The header file @file{stdlib.h} defines a name for the data type of
+comparison functions.  This type is a GNU extension.
+
+@comment stdlib.h
+@comment GNU
+@tindex comparison_fn_t
+@smallexample
+int comparison_fn_t (const void *, const void *);
+@end smallexample
+
+@node Array Search Function
+@section Array Search Function
+@cindex search function (for arrays)
+@cindex binary search function (for arrays)
+@cindex array search function
+
+Generally searching for a specific element in an array means that
+potentially all elements must be checked.  The GNU C library contains
+functions to perform linear search.  The prototypes for the following
+two functions can be found in @file{search.h}.
+
+@comment search.h
+@comment SVID
+@deftypefun {void *} lfind (const void *@var{key}, void *@var{base}, size_t *@var{nmemb}, size_t @var{size}, comparison_fn_t @var{compar})
+The @code{lfind} function searches in the array with @code{*@var{nmemb}}
+elements of @var{size} bytes pointed to by @var{base} for an element
+which matches the one pointed to by @var{key}.  The function pointed to
+by @var{compar} is used decide whether two elements match.
+
+The return value is a pointer to the matching element in the array
+starting at @var{base} if it is found.  If no matching element is
+available @code{NULL} is returned.
+
+The mean runtime of this function is @code{*@var{nmemb}}/2.  This
+function should only be used elements often get added to or deleted from
+the array in which case it might not be useful to sort the array before
+searching.
+@end deftypefun
+
+@comment search.h
+@comment SVID
+@deftypefun {void *} lsearch (const void *@var{key}, void *@var{base}, size_t *@var{nmemb}, size_t @var{size}, comparison_fn_t @var{compar})
+The @code{lsearch} function is similar to the @code{lfind} function.  It
+searches the given array for an element and returns it if found.  The
+difference is that if no matching element is found the @code{lsearch}
+function adds the object pointed to by @var{key} (with a size of
+@var{size} bytes) at the end of the array and it increments the value of
+@code{*@var{nmemb}} to reflect this addition.
+
+This means for the caller that if it is not sure that the array contains
+the element one is searching for the memory allocated for the array
+starting at @var{base} must have room for at least @var{size} more
+bytes.  If one is sure the element is in the array it is better to use
+@code{lfind} so having more room in the array is always necessary when
+calling @code{lsearch}.
+@end deftypefun
+
+To search a sorted array for an element matching the key, use the
+@code{bsearch} function.  The prototype for this function is in
+the header file @file{stdlib.h}.
+@pindex stdlib.h
+
+@comment stdlib.h
+@comment ISO
+@deftypefun {void *} bsearch (const void *@var{key}, const void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare})
+The @code{bsearch} function searches the sorted array @var{array} for an object
+that is equivalent to @var{key}.  The array contains @var{count} elements,
+each of which is of size @var{size} bytes.
+
+The @var{compare} function is used to perform the comparison.  This
+function is called with two pointer arguments and should return an
+integer less than, equal to, or greater than zero corresponding to
+whether its first argument is considered less than, equal to, or greater
+than its second argument.  The elements of the @var{array} must already
+be sorted in ascending order according to this comparison function.
+
+The return value is a pointer to the matching array element, or a null
+pointer if no match is found.  If the array contains more than one element
+that matches, the one that is returned is unspecified.
+
+This function derives its name from the fact that it is implemented
+using the binary search algorithm.
+@end deftypefun
+
+@node Array Sort Function
+@section Array Sort Function
+@cindex sort function (for arrays)
+@cindex quick sort function (for arrays)
+@cindex array sort function
+
+To sort an array using an arbitrary comparison function, use the
+@code{qsort} function.  The prototype for this function is in
+@file{stdlib.h}.
+@pindex stdlib.h
+
+@comment stdlib.h
+@comment ISO
+@deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare})
+The @var{qsort} function sorts the array @var{array}.  The array contains
+@var{count} elements, each of which is of size @var{size}.
+
+The @var{compare} function is used to perform the comparison on the
+array elements.  This function is called with two pointer arguments and
+should return an integer less than, equal to, or greater than zero
+corresponding to whether its first argument is considered less than,
+equal to, or greater than its second argument.
+
+@cindex stable sorting
+@strong{Warning:} If two objects compare as equal, their order after
+sorting is unpredictable.  That is to say, the sorting is not stable.
+This can make a difference when the comparison considers only part of
+the elements.  Two elements with the same sort key may differ in other
+respects.
+
+If you want the effect of a stable sort, you can get this result by
+writing the comparison function so that, lacking other reason
+distinguish between two elements, it compares them by their addresses.
+Note that doing this may make the sorting algorithm less efficient, so
+do it only if necessary.
+
+Here is a simple example of sorting an array of doubles in numerical
+order, using the comparison function defined above (@pxref{Comparison
+Functions}):
+
+@smallexample
+@{
+  double *array;
+  int size;
+  @dots{}
+  qsort (array, size, sizeof (double), compare_doubles);
+@}
+@end smallexample
+
+The @code{qsort} function derives its name from the fact that it was
+originally implemented using the ``quick sort'' algorithm.
+
+The implementation of @code{qsort} in this library might not be an
+in-place sort and might thereby use an extra amount of memory to store
+the array.
+@end deftypefun
+
+@node Search/Sort Example
+@section Searching and Sorting Example
+
+Here is an example showing the use of @code{qsort} and @code{bsearch}
+with an array of structures.  The objects in the array are sorted
+by comparing their @code{name} fields with the @code{strcmp} function.
+Then, we can look up individual objects based on their names.
+
+@comment This example is dedicated to the memory of Jim Henson.  RIP.
+@smallexample
+@include search.c.texi
+@end smallexample
+
+@cindex Kermit the frog
+The output from this program looks like:
+
+@smallexample
+Kermit, the frog
+Piggy, the pig
+Gonzo, the whatever
+Fozzie, the bear
+Sam, the eagle
+Robin, the frog
+Animal, the animal
+Camilla, the chicken
+Sweetums, the monster
+Dr. Strangepork, the pig
+Link Hogthrob, the pig
+Zoot, the human
+Dr. Bunsen Honeydew, the human
+Beaker, the human
+Swedish Chef, the human
+
+Animal, the animal
+Beaker, the human
+Camilla, the chicken
+Dr. Bunsen Honeydew, the human
+Dr. Strangepork, the pig
+Fozzie, the bear
+Gonzo, the whatever
+Kermit, the frog
+Link Hogthrob, the pig
+Piggy, the pig
+Robin, the frog
+Sam, the eagle
+Swedish Chef, the human
+Sweetums, the monster
+Zoot, the human
+
+Kermit, the frog
+Gonzo, the whatever
+Couldn't find Janice.
+@end smallexample
+
+
+@node Hash Search Function
+@section The @code{hsearch} function.
+
+The functions mentioned so far in this chapter are searching in a sorted
+or unsorted array.  There are other methods to organize information
+which later should be searched.  The costs of insert, delete and search
+differ.  One possible implementation is using hashing tables.
+The following functions are declared in the the header file @file{search.h}.
+
+@comment search.h
+@comment SVID
+@deftypefun int hcreate (size_t @var{nel})
+The @code{hcreate} function creates a hashing table which can contain at
+least @var{nel} elements.  There is no possibility to grow this table so
+it is necessary to choose the value for @var{nel} wisely.  The used
+methods to implement this function might make it necessary to make the
+number of elements in the hashing table larger than the expected maximal
+number of elements.  Hashing tables usually work inefficient if they are
+filled 80% or more.  The constant access time guaranteed by hashing can
+only be achieved if few collisions exist.  See Knuth's ``The Art of
+Computer Programming, Part 3: Searching and Sorting'' for more
+information.
+
+The weakest aspect of this function is that there can be at most one
+hashing table used through the whole program.  The table is allocated
+in local memory out of control of the programmer.  As an extension the
+GNU C library provides an additional set of functions with an reentrant
+interface which provide a similar interface but which allow to keep
+arbitrarily many hashing tables.
+
+It is possible to use more than one hashing table in the program run if
+the former table is first destroyed by a call to @code{hdestroy}.
+
+The function returns a non-zero value if successful.  If it return zero
+something went wrong.  This could either mean there is already a hashing
+table in use or the program runs out of memory.
+@end deftypefun
+
+@comment search.h
+@comment SVID
+@deftypefun void hdestroy (void)
+The @code{hdestroy} function can be used to free all the resources
+allocated in a previous call of @code{hcreate}.  After a call to this
+function it is again possible to call @code{hcreate} and allocate a new
+table with possibly different size.
+
+It is important to remember that the elements contained in the hashing
+table at the time @code{hdestroy} is called are @emph{not} freed by this
+function.  It is the responsibility of the program code to free those
+strings (if necessary at all).  Freeing all the element memory is not
+possible without extra, separately kept information since there is no
+function to iterate through all available elements in the hashing table.
+If it is really necessary to free a table and all elements the
+programmer has to keep a list of all table elements and before calling
+@code{hdestroy} s/he has to free all element's data using this list.
+This is a very unpleasant mechanism and it also shows that this kind of
+hashing tables is mainly meant for tables which are created once and
+used until the end of the program run.
+@end deftypefun
+
+Entries of the hashing table and keys for the search are defined using
+this type:
+
+@deftp {Data type} {struct ENTRY}
+Both elements of this structure are pointers to zero-terminated strings.
+This is a limiting restriction of the functionality of the
+@code{hsearch} functions.  They can only be used for data sets which use
+the NUL character always and solely to terminate the records.  It is not
+possible to handle general binary data.
+
+@table @code
+@item char *key
+Pointer to a zero-terminated string of characters describing the key for
+the search or the element in the hashing table.
+@item char *data
+Pointer to a zero-terminated string of characters describing the data.
+If the functions will be called only for searching an existing entry
+this element might stay undefined since it is not used.
+@end table
+@end deftp
+
+@comment search.h
+@comment SVID
+@deftypefun {ENTRY *} hsearch (ENTRY @var{item}, ACTION @var{action})
+To search in a hashing table created using @code{hcreate} the
+@code{hsearch} function must be used.  This function can perform simple
+search for an element (if @var{action} has the @code{FIND}) or it can
+alternatively insert the key element into the hashing table.  Entries
+are never replaced.
+
+The key is denoted by a pointer to an object of type @code{ENTRY}.  For
+locating the corresponding position in the hashing table only the
+@code{key} element of the structure is used.
+
+If an entry with matching key is found the @var{action} parameter is
+irrelevant.  The found entry is returned.  If no matching entry is found
+and the @var{action} parameter has the value @code{FIND} the function
+returns a @code{NULL} pointer.  If no entry is found and the
+@var{action} parameter has the value @code{ENTER} a new entry is added
+to the hashing table which is initialized with the parameter @var{item}.
+A pointer to the newly added entry is returned.
+@end deftypefun
+
+As mentioned before the hashing table used by the functions described so
+far is global and there can be at any time at most one hashing table in
+the program.  A solution is to use the following functions which are a
+GNU extension.  All have in common that they operate on a hashing table
+which is described by the content of an object of the type @code{struct
+hsearch_data}.  This type should be treated as opaque, none of its
+members should be changed directly.
+
+@comment search.h
+@comment GNU
+@deftypefun int hcreate_r (size_t @var{nel}, struct hsearch_data *@var{htab})
+The @code{hcreate_r} function initializes the object pointed to by
+@var{htab} to contain a hashing table with at least @var{nel} elements.
+So this function is equivalent to the @code{hcreate} function except
+that the initialized data structure is controlled by the user.
+
+This allows having more than one hashing table at one time.  The memory
+necessary for the @code{struct hsearch_data} object can be allocated
+dynamically.  It must be initialized with zero before calling this
+function.
+
+The return value is non-zero if the operation were successful.  if the
+return value is zero something went wrong which probably means the
+programs runs out of memory.
+@end deftypefun
+
+@comment search.h
+@comment GNU
+@deftypefun void hdestroy_r (struct hsearch_data *@var{htab})
+The @code{hdestroy_r} function frees all resources allocated by the
+@code{hcreate_r} function for this very same object @var{htab}.  As for
+@code{hdestroy} it is the programs responsibility to free the strings
+for the elements of the table.
+@end deftypefun
+
+@comment search.h
+@comment GNU
+@deftypefun int hsearch_r (ENTRY @var{item}, ACTION @var{action}, ENTRY **@var{retval}, struct hsearch_data *@var{htab})
+The @code{hsearch_r} function is equivalent to @code{hsearch}.  The
+meaning of the first two arguments is identical.  But instead of
+operating on a single global hashing table the function works on the
+table described by the object pointed to by @var{htab} (which is
+initialized by a call to @code{hcreate_r}).
+
+Another difference to @code{hcreate} is that the pointer to the found
+entry in the table is not the return value of the functions.  It is
+returned by storing it in a pointer variables pointed to by the
+@var{retval} parameter.  The return value of the function is an integer
+value indicating success if it is non-zero and failure if it is zero.
+In the latter case the global variable @var{errno} signals the reason for
+the failure.
+
+@table @code
+@item ENOMEM
+The table is filled and @code{hsearch_r} was called with an so far
+unknown key and @var{action} set to @code{ENTER}.
+@item ESRCH
+The @var{action} parameter is @code{FIND} and no corresponding element
+is found in the table.
+@end table
+@end deftypefun
+
+
+@node Tree Search Function
+@section The @code{tsearch} function.
+
+Another common form to organize data for efficient search is to use
+trees.  The @code{tsearch} function family provides a nice interface to
+functions to organize possibly large amounts of data by providing a mean
+access time proportional to the logarithm of the number of elements.
+The GNU C library implementation even guarantees that this bound is
+never exceeded even for input data which cause problems for simple
+binary tree implementations.
+
+The functions described in the chapter are all described in the @w{System
+V} and X/Open specifications and are therefore quite portable.
+
+In contrast to the @code{hsearch} functions the @code{tsearch} functions
+can be used with arbitrary data and not only zero-terminated strings.
+
+The @code{tsearch} functions have the advantage that no function to
+initialize data structures is necessary.  A simple pointer of type
+@code{void *} initialized to @code{NULL} is a valid tree and can be
+extended or searched.  The prototypes for these functions can be found
+in the header file @file{search.h}.
+
+@comment search.h
+@comment SVID
+@deftypefun {void *} tsearch (const void *@var{key}, void **@var{rootp}, comparison_fn_t @var{compar})
+The @code{tsearch} function searches in the tree pointed to by
+@code{*@var{rootp}} for an element matching @var{key}.  The function
+pointed to by @var{compar} is used to determine whether two elements
+match.  @xref{Comparison Functions}, for a specification of the functions
+which can be used for the @var{compar} parameter.
+
+If the tree does not contain a matching entry the @var{key} value will
+be added to the tree.  @code{tsearch} does not make a copy of the object
+pointed to by @var{key} (how could it since the size is unknown).
+Instead it adds a reference to this object which means the object must
+be available as long as the tree data structure is used.
+
+The tree is represented by a pointer to a pointer since it is sometimes
+necessary to change the root node of the tree.  So it must not be
+assumed that the variable pointed to by @var{rootp} has the same value
+after the call.  This also shows that it is not safe to call the
+@code{tsearch} function more than once at the same time using the same
+tree.  It is no problem to run it more than once at a time on different
+trees.
+
+The return value is a pointer to the matching element in the tree.  If a
+new element was created the pointer points to the new data (which is in
+fact @var{key}).  If an entry had to be created and the program ran out
+of space @code{NULL} is returned.
+@end deftypefun
+
+@comment search.h
+@comment SVID
+@deftypefun {void *} tfind (const void *@var{key}, void *const *@var{rootp}, comparison_fn_t @var{compar})
+The @code{tfind} function is similar to the @code{tsearch} function.  It
+locates an element matching the one pointed to by @var{key} and returns
+a pointer to this element.  But if no matching element is available no
+new element is entered (note that the @var{rootp} parameter points to a
+constant pointer).  Instead the function returns @code{NULL}.
+@end deftypefun
+
+Another advantage of the @code{tsearch} function in contrast to the
+@code{hsearch} functions is that there is an easy way to remove
+elements.
+
+@comment search.h
+@comment SVID
+@deftypefun {void *} tdelete (const void *@var{key}, void **@var{rootp}, comparison_fn_t @var{compar})
+To remove a specific element matching @var{key} from the tree
+@code{tdelete} can be used.  It locates the matching element using the
+same method as @code{tfind}.  The corresponding element is then removed
+and a pointer to the parent of the deleted node is returned by the
+function.  If there is no matching entry in the tree nothing can be
+deleted and the function returns @code{NULL}.  If the root of the tree
+is deleted @code{tdelete} returns some unspecified value not equal to
+@code{NULL}.
+@end deftypefun
+
+@comment search.h
+@comment GNU
+@deftypefun void tdestroy (void *@var{vroot}, __free_fn_t @var{freefct})
+If the complete search tree has to be removed one can use
+@code{tdestroy}.  It frees all resources allocated by the @code{tsearch}
+function to generate the tree pointed to by @var{vroot}.
+
+For the data in each tree node the function @var{freefct} is called.
+The pointer to the data is passed as the argument to the function.  If
+no such work is necessary @var{freefct} must point to a function doing
+nothing.  It is called in any case.
+
+This function is a GNU extension and not covered by the @w{System V} or
+X/Open specifications.
+@end deftypefun
+
+In addition to the function to create and destroy the tree data
+structure, there is another function which allows you to apply a
+function to all elements of the tree.  The function must have this type:
+
+@smallexample
+void __action_fn_t (const void *nodep, VISIT value, int level);
+@end smallexample
+
+The @var{nodep} is the data value of the current node (once given as the
+@var{key} argument to @code{tsearch}).  @var{level} is a numeric value
+which corresponds to the depth of the current node in the tree.  The
+root node has the depth @math{0} and its children have a depth of
+@math{1} and so on.  The @code{VISIT} type is an enumeration type.
+
+@deftp {Data Type} VISIT
+The @code{VISIT} value indicates the status of the current node in the
+tree and how the function is called.  The status of a node is either
+`leaf' or `internal node'.  For each leaf node the function is called
+exactly once, for each internal node it is called three times: before
+the first child is processed, after the first child is processed and
+after both children are processed.  This makes it possible to handle all
+three methods of tree traversal (or even a combination of them).
+
+@table @code
+@item preorder
+The current node is an internal node and the function is called before
+the first child was processed.
+@item postorder
+The current node is an internal node and the function is called after
+the first child was processed.
+@item endorder
+The current node is an internal node and the function is called after
+the second child was processed.
+@item leaf
+The current node is a leaf.
+@end table
+@end deftp
+
+@comment search.h
+@comment SVID
+@deftypefun void twalk (const void *@var{root}, __action_fn_t @var{action})
+For each node in the tree with a node pointed to by @var{root}, the
+@code{twalk} function calls the function provided by the parameter
+@var{action}.  For leaf nodes the function is called exactly once with
+@var{value} set to @code{leaf}.  For internal nodes the function is
+called three times, setting the @var{value} parameter or @var{action} to
+the appropriate value.  The @var{level} argument for the @var{action}
+function is computed while descending the tree with increasing the value
+by one for the descend to a child, starting with the value @math{0} for
+the root node.
+
+Since the functions used for the @var{action} parameter to @code{twalk}
+must not modify the tree data, it is safe to run @code{twalk} in more
+than one thread at the same time, working on the same tree.  It is also
+safe to call @code{tfind} in parallel.  Functions which modify the tree
+must not be used, otherwise the behavior is undefined.
+@end deftypefun
author	gcc <gcc@7b3dc134-2b1b-0410-93df-9e9f96275f8d>	2006-08-17 01:18:26 +0000
committer	gcc <gcc@7b3dc134-2b1b-0410-93df-9e9f96275f8d>	2006-08-17 01:18:26 +0000
commit	15f34685e7a9b5caf761af2ebf6afa20438d440b (patch)
tree	dc04ce3cdf040f198743c15b64557824de174680 /libc/manual/search.texi
parent	1e848e0e775a36f6359161f5deb890942ef42ff3 (diff)