Bertin Matrices: Introduction
What are Bertin Matrices?
Among the rich material on graphical presentation of information, J. Bertin, in
La Graphique et le Traitement Graphique de l'Information (1977),
English edition: Graphics and Graphic Information Processing (1981),
discusses the presentation of data matrices, with a particular view to seriation.
A Tribute to J. Bertin's Graphical Data Analysis,
presented at the SoftStat Conference '97, gives an appraisal of
this aspect of J. Bertin's work.
This is a web version of the SoftStat '97 presentation, recorded in 2014.
The methods discussed at SoftStat '97 have been implemented in the Voyager system. Voyager is now part of Oberon and its descendants, and comes bundled with the current Oberon system from ETHZ.
The methods have been partially re-implemented in R. The R implementation can be downloaded as the package bertin from http://bertin.r-forge.r-project.org/.
Variables are collected in a matrix to display the complete data set. By convention, J. Bertin shows variables in rows and cases in columns. To make periodic structures more visible, the data may be repeated cyclically. For example, data for the 12 months of a year then appear in 24 columns.
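The cyclic repetition can be sketched in a few lines of Python. This is only an illustration, not the Voyager or R implementation: the function name repeat_cyclically and all numbers are assumptions made up for this example, and they are not J. Bertin's hotel data.

```python
# Toy data matrix in Bertin's layout: variables in rows, cases (months) in columns.
# All values are invented for illustration; they are not Bertin's hotel data.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

rows = {
    "occupancy_pct":  [30, 35, 50, 60, 75, 90, 95, 92, 70, 55, 40, 32],
    "length_of_stay": [2, 2, 3, 3, 4, 5, 6, 6, 4, 3, 2, 2],
}


def repeat_cyclically(labels, rows, times=2):
    """Repeat the columns `times` times so periodic patterns wrap around visibly."""
    wide_labels = labels * times
    wide_rows = {name: values * times for name, values in rows.items()}
    return wide_labels, wide_rows


wide_labels, wide_rows = repeat_cyclically(MONTHS, rows)
print(len(wide_labels))  # number of columns after repetition
for name, values in wide_rows.items():
    print(f"{name:>15}: {values}")
```

With the columns doubled, the transition from December back to January sits in the middle of the display instead of being cut off at the border.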
As J. Bertin pointed out, the indexing used is arbitrary. You can rearrange rows and/or columns to reveal the information of interest. If you run a hotel, the percentage of hotel occupation and the duration of the visits are of course the most interesting variables for you. Move these variables to the top of the display, and rearrange the other variables by similarity or dissimilarity to these target variables. Time points have a natural order, so the columns are not rearranged in this example.
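One possible way to rearrange rows by similarity to the target variables is sketched below in Python. The choice of Pearson correlation as the similarity measure is an assumption made for this sketch; J. Bertin does not prescribe a particular measure, and the function names and data values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def order_rows(rows, targets):
    """Put the target rows first, then the remaining rows by decreasing
    correlation with the first target (most similar first)."""
    reference = rows[targets[0]]
    others = [name for name in rows if name not in targets]
    others.sort(key=lambda name: pearson(rows[name], reference), reverse=True)
    return list(targets) + others


# Invented toy values (six cases), not Bertin's hotel data.
rows = {
    "tourists_pct":   [55, 50, 35, 25, 40, 65],
    "occupancy_pct":  [40, 50, 70, 80, 60, 30],
    "conventions":    [0, 1, 1, 1, 0, 0],
    "length_of_stay": [2, 3, 4, 5, 3, 2],
}

row_order = order_rows(rows, targets=["occupancy_pct", "length_of_stay"])
for name in row_order:
    print(f"{name:>15}: {rows[name]}")
```

Only the row order changes; the columns keep their natural time order. Rows that are dissimilar to the target end up at the bottom, where they can be read as a contrasting group.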
Variables need not enter at their face value; they can be transformed, or derived variables can be added. In the case of the hotel data, this has already been done in the original data set. For example, the guests have been classified as tourists or business guests, and the two percentages sum to 100%. If we want, we can remove this redundant information. This may clean up the picture, but it may also hide information. For example, tourists are "anti-cyclic" to the hotel occupation and just fill the gaps. Removing this variable because it is (100% - business) would hide this point.
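The trade-off can be made concrete with a small Python sketch. It checks that one row is the complement of another, and then measures how often the complementary row moves against the occupation. The helper names and the numbers are assumptions for illustration only, and the "opposite steps" measure is just one simple way to express "anti-cyclic".

```python
def is_complement(a, b, total=100, tol=1e-9):
    """True if a[i] + b[i] equals `total` in every column (b is redundant given a)."""
    return len(a) == len(b) and all(abs(x + y - total) <= tol for x, y in zip(a, b))


def opposite_step_share(a, b):
    """Share of consecutive steps in which a and b change in opposite directions."""
    changes_a = [a1 - a0 for a0, a1 in zip(a, a[1:])]
    changes_b = [b1 - b0 for b0, b1 in zip(b, b[1:])]
    opposite = sum(1 for da, db in zip(changes_a, changes_b) if da * db < 0)
    return opposite / len(changes_a)


# Invented toy values (six cases), not Bertin's hotel data.
occupancy_pct = [40, 50, 70, 80, 60, 30]
tourists_pct = [65, 60, 35, 25, 45, 70]
business_pct = [100 - t for t in tourists_pct]  # derived, hence redundant

redundant = is_complement(business_pct, tourists_pct)
share = opposite_step_share(occupancy_pct, tourists_pct)
print(redundant, share)
```

In this toy data the tourist row is formally redundant, yet it is the row that shows the movement against the occupation directly. Dropping it would leave the reader to infer that pattern from the business row.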
And More ...
Bertin matrices are not restricted to this display. For example, J. Bertin gives a matrix representing vowel schemes of folk songs - you can easily spot yodellers.
What all matrix representations have in common is that, when the arrangement of rows and columns is arbitrary, some permutation may help to reveal information.