Topics form a network with an implied hierarchy. The figure above illustrates the relationship between a hypothetical set of topics and documents. Any given document contains a set of topics. Solid lines represent paths from topics to the documents they are contained in.

Topic A is related to any topic that occurs with it in a document. For example, Topics A and C are both found in Document 2. Because Topic A is found in more documents than Topic C, Topic A is an implied parent of Topic C, as shown by the line connecting the two.

In this illustration, Topic A is found in Documents 1, 2, 3, and 5 (as well as approximately 20,000 additional documents). Topic B is found in Documents 3, 4, and 5 (as well as approximately 2,000 additional documents).

Topification uses this topic network, with its implied hierarchy, to navigate and display search results. The network's characteristics become apparent when studying the paths to Document 4. Topic A is not found in Document 4, but both Topic B and Topic D are found in other documents with Topic A. If Topic A is selected as a search constraint, then Topics B, C, D, E, F, G, and H are viable topic results because each of them shares at least one document with Topic A.
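The co-occurrence rule described above can be sketched in a few lines of Python. This is only an illustration, not Topification's actual implementation. The corpus `DOCS` and the function name `viable_topics` are hypothetical. The placement of Topics A and B follows the text. The placement of Topics C and D is partly assumed, and Topics E through H are left out for brevity.

```python
# Hypothetical toy corpus: document id -> set of topics it contains.
# Topics A and B are placed as the text describes; where the text is
# silent (for example, Topic D in Document 1), the placement is assumed.
DOCS = {
    1: {"A", "D"},
    2: {"A", "C"},
    3: {"A", "B"},
    4: {"B", "D"},
    5: {"A", "B"},
}


def viable_topics(docs, selected):
    """Return every topic that shares at least one document with `selected`."""
    related = set()
    for topics in docs.values():
        if selected in topics:
            related |= topics  # collect all topics in this document
    related.discard(selected)  # the selected topic is not its own result
    return related


print(sorted(viable_topics(DOCS, "A")))  # ['B', 'C', 'D']
```

In this toy corpus, selecting Topic A offers Topics B, C, and D as follow-up choices, even though Document 4 itself does not contain Topic A.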

The diagram’s bands indicate the (relative) number of documents that contain a given topic. So, Topic A at the top of the diagram is contained in more documents than any other topic. Topic B is found in fewer documents than Topic A, but in more documents than Topics C, D, E, F, G, or H.

Notice that even though Document 4 does not contain Topic A, it lies on a path from both Topic B and Topic D. Selecting Topic B and then Topic D would therefore display Document 4 as a relevant search result.
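One way to picture this drill-down is as a filter that keeps only the documents containing every selected topic. The sketch below assumes that interpretation. The corpus `DOCS` and the function `matching_documents` are hypothetical names, and the topic placements beyond those stated in the text are assumed.

```python
# Hypothetical toy corpus: document id -> set of topics it contains.
# Only the memberships stated in the text are certain; the rest are assumed.
DOCS = {
    1: {"A", "D"},
    2: {"A", "C"},
    3: {"A", "B"},
    4: {"B", "D"},
    5: {"A", "B"},
}


def matching_documents(docs, selected_topics):
    """Return the ids of documents that contain every selected topic."""
    wanted = set(selected_topics)
    return sorted(doc_id for doc_id, topics in docs.items() if wanted <= topics)


print(matching_documents(DOCS, ["B"]))       # [3, 4, 5]
print(matching_documents(DOCS, ["B", "D"]))  # [4]
```

Each additional topic narrows the result set: Topic B alone matches Documents 3, 4, and 5, and adding Topic D leaves only Document 4.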

The implied hierarchy results from how frequently each topic occurs in the document set. A topic that appears in many documents is less specific and, therefore, higher in the hierarchy than a topic that appears in just one.

Topic D has two implied parents: Topics A and B. This broadens coverage during topic selection, because multiple paths lead to the same relevant results. Taxonomies do not have this networking property: in a taxonomy, each child has only one parent.
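The implied-parent rule (a co-occurring topic that appears in more documents is a parent) can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's algorithm. The corpus `DOCS` and the function `implied_parents` are hypothetical, the document counts are those of the toy corpus rather than the full document set, and ties in document count are assumed to produce no parent relationship.

```python
from collections import Counter

# Hypothetical toy corpus: document id -> set of topics it contains.
# Only the memberships stated in the text are certain; the rest are assumed.
DOCS = {
    1: {"A", "D"},
    2: {"A", "C"},
    3: {"A", "B"},
    4: {"B", "D"},
    5: {"A", "B"},
}


def implied_parents(docs):
    """Map each topic to the co-occurring topics found in more documents."""
    # Number of documents each topic appears in.
    counts = Counter(topic for topics in docs.values() for topic in topics)
    parents = {topic: set() for topic in counts}
    for topics in docs.values():
        for child in topics:
            for other in topics:
                # Assumption: a strictly higher document count makes a parent.
                if counts[other] > counts[child]:
                    parents[child].add(other)
    return parents


parents = implied_parents(DOCS)
print(sorted(parents["D"]))  # ['A', 'B']
print(sorted(parents["C"]))  # ['A']
```

Because the result is a set of parents per topic rather than a single parent, the structure is a network and not a tree, which is the difference from a taxonomy noted above.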