The Data Visualisation Catalogue Blog




Chart Snapshot: DocuBurst

A DocuBurst is a chart that attempts to answer ‘what is this document about?’ by visualising its text content. It provides a summarised view of one or more documents using a Sunburst Diagram generated from the semantic relationships between words found in the lexical database WordNet. When visualising multiple documents, a DocuBurst can be used as a comparison tool for textual analysis.

The occurrence counts of words in the document are overlaid onto the segments of the Sunburst Diagram to provide insights into the frequency and distribution of terms. Colour intensity (alpha) is also used to represent word frequency. When comparing multiple documents, colour is used to distinguish words from different document sources.
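As a rough illustration of the frequency overlay, the sketch below counts word occurrences in a short text and maps each count to an opacity value. The linear mapping, the `floor` parameter, the function names, and the sample text are all assumptions made for this example; they are not taken from the original DocuBurst tool.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lower-cased word occurrences in a document."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def alpha_for(count, max_count, floor=0.15):
    """Map an occurrence count to an opacity between floor and 1.0.

    Words that never occur get 0.0 so their segments stay blank.
    The linear scale is an assumption, not DocuBurst's actual mapping.
    """
    if max_count == 0 or count == 0:
        return 0.0
    return floor + (1.0 - floor) * count / max_count


# Hypothetical sample document and the words that appear in the tree.
document = "The cat saw the rabbit. The rabbit saw the cat. The rabbit ran."
tree_words = ["cat", "rabbit", "mouse"]

counts = word_counts(document)
peak = max(counts[w] for w in tree_words)
alphas = {w: alpha_for(counts[w], peak) for w in tree_words}
```

Here the most frequent word in the tree is drawn fully opaque, less frequent words are fainter, and words absent from the document are left transparent.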

The centre of the Sunburst Diagram contains the root word or synset of interest. Child nodes proceed outwards as segments in concentric rings, with each ring representing one hierarchical level. The angular width of each segment is proportional to the size of the subtree rooted at that node.
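The layout rule can be sketched in a few lines of code. The example below assumes that subtree size is measured by the number of leaf nodes, and uses a small hand-made tree in place of a real WordNet hierarchy; the `Node` class and `layout` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One word or synset in a hypothetical hyponym tree."""
    name: str
    children: List["Node"] = field(default_factory=list)


def leaf_count(node):
    """Size of the subtree rooted at node, measured in leaves (an assumption)."""
    if not node.children:
        return 1
    return sum(leaf_count(child) for child in node.children)


def layout(node, start=0.0, extent=360.0, depth=0, out=None):
    """Assign each node a ring index (depth) and an angular span in degrees."""
    if out is None:
        out = {}
    out[node.name] = (depth, start, start + extent)
    total = leaf_count(node)
    angle = start
    for child in node.children:
        # Each child receives a share of its parent's span
        # proportional to the size of the child's subtree.
        share = extent * leaf_count(child) / total
        layout(child, angle, share, depth + 1, out)
        angle += share
    return out


tree = Node("animal", [
    Node("mammal", [Node("cat"), Node("rabbit"), Node("mouse")]),
    Node("bird", [Node("dodo")]),
])
segments = layout(tree)
```

In this toy tree the root spans the full circle on ring 0, "mammal" receives three quarters of the circle on ring 1 because it holds three of the four leaves, and "bird" receives the remaining quarter.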

Interactivity can be used on a DocuBurst to zoom, filter, highlight, or refocus target words of interest.

Links to tools or code that can draw a DocuBurst:

The DocuBurst web tool is no longer online, but the video below demonstrates its functionality:

Examples of DocuBurst

Figure 1: DocuBurst of a science textbook rooted at {idea}.
DocuBurst: Visualizing Document Content Using Language Structure — Christopher Collins / Vialab


Visualizing Alice in Wonderland with “Animal” as the root word?
DocuBurst — Infoviz Wiki


DocuBurst graph rooted at “atmospheric phenomenon”.
infoviz.info


Occurrences of words in the document of interest (a science textbook).
infoviz.info
