The choice of height function (input) depends on the application, and the basic watershed algorithm operates independently of that choice. For intensity-based image data, you might typically use some sort of gradient magnitude calculation as input. (see itk::GradientMagnitudeImageFilter)
The watershed algorithm proceeds in several steps. First, an initial classification of all points into catchment basin regions is done by tracing each point down its path of steepest descent to a local minima. Next, neighboring regions and the boundaries between them are analyzed according to some saliency measure (such as minimum boundary height) to produce a tree of merges among adjacent regions. These merges occur at different maximum saliency values. The collective set of all possible merges up to a specified saliency ``flood level'' is referred to in this documentation as a ``merge tree''. Metaphorically, the flood level is a value that reflects the amount of precipitation that is rained into the catchment basins. As the flood level rises, boundaries between adjacent segments erode and those segments merge. The minimum value of the flood level is zero and the maximum value is the difference between the highest and lowest values in the input image.
Note that once the initial analysis and segmentation is done to produce the merge tree, it is trivial to produce a hierarchy of labeled images in constant time. The complexity of the algorithm is in the computation of the merge tree. Once that tree has been created, the initial segmented image can be relabeled to reflect any maximum saliency value found in the tree by simply identifying a subset of segment merges from the tree.
The following is a brief application of segmenting 2d image data with the watershed algorithm using ITK
The following is 24 bit RGB image taken from Visible Human data. The data shown in this application is two-dimensional, but all the filters used below are completely generalized for higher dimensional data.


The following was produced from the color image above using 15 iterations of "VectorGradientAnisotropicDiffusionImageFilter" with a ConductanceTerm parameter of 0.5 and a TimeStep parameter of 0.125. Similar diffusion filters exist for scalar data.

Once the data has been smoothed, we compute an height function for the input to the WatershedImageFilter. The following edge image was produced with an itk gradient magnitude image filter.

The following is the result of watershed segmentation on the preceding edge image. The Level value is purposely set low to show some oversegmentation (too many regions identified). Each unsigned long integer label value has been mapped to a unique RGB color value for display purposes.

Increasing the Level parameter merges segments in the original image to produce an output with fewer labeled regions. To view segments with "softer'' boundaries (often finer detailed structures), set the Level to a lower value. Raise the Level parameter to capture grosser structures with more defined boundaries.
After a merge hierarchy has been created, changing the Level parameter can produce a new output in constant time as long as the Level is not raised above its initial setting. The filter saves state between updates and only re-executes if the requested Level is not already in the hierarchy (or the input data has changed). Most of the complexity of the watershed algorithm is in the generation of the hierarchy, and experience shows that for many medical image data sets, only the first 10-20 percent of the complete hierarchy is of interest. Setting the Level to a low value can therefore significantly save computation time.

For more detailed information on using the WatershedImageFilter, see the inline documentation of itkWatershedImageFilter.h and associated classes. See also InsightApplications/WatershedSegmentation.