19  Document Similarity and Distance

19.1 Objectives

  • Distance and similarity measures
  • Measuring similarity
  • Measuring distance
  • Clustering
  • Multi-dimensional scaling
  • Network analysis of document connections

19.2 Methods

Applicable methods for the objectives listed above.

19.3 Examples

Examples here.

19.4 Issues

Weighting and feature selection, and their effects on similarity and distance.
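
A minimal sketch of how weighting can change a similarity score. It compares cosine similarity on raw term counts with cosine similarity after tf-idf weighting. The three toy documents, the helper names `cosine` and `tfidf`, and the tf-idf variant used (term count times log(N / document frequency)) are all illustrative assumptions; other weighting schemes will give different numbers.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def tfidf(counts_list):
    """Weight each count by log(N / df); one of several tf-idf variants."""
    n_docs = len(counts_list)
    doc_freq = Counter()
    for counts in counts_list:
        doc_freq.update(counts.keys())  # each document counts a term once
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in counts.items()}
        for counts in counts_list
    ]


# Hypothetical toy corpus, tokenized on whitespace only.
docs = [
    "the tax cut will help the economy",
    "the tax rise will hurt the economy",
    "the team won the match",
]
counts = [Counter(doc.split()) for doc in docs]
weighted = tfidf(counts)

# Documents 0 and 2 share only the word "the".
raw_sim_02 = cosine(counts[0], counts[2])
weighted_sim_02 = cosine(weighted[0], weighted[2])

# Documents 0 and 1 share several content words.
raw_sim_01 = cosine(counts[0], counts[1])
weighted_sim_01 = cosine(weighted[0], weighted[1])

print(f"docs 0 and 2: raw={raw_sim_02:.3f}, tf-idf={weighted_sim_02:.3f}")
print(f"docs 0 and 1: raw={raw_sim_01:.3f}, tf-idf={weighted_sim_01:.3f}")
```

On raw counts, documents 0 and 2 look moderately similar only because both repeat "the". Under this tf-idf variant a term that occurs in every document receives weight zero, so the same pair has similarity zero. Removing such terms through feature selection (for example, a stopword list) would have a comparable effect.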

Computational issues.

19.5 Further Reading

Additional resources from libraries or the web.

19.6 Exercises

Add some here.