# Bertin Matrices: Selections and Models

## Selections

In interactive Bertin matrices, the key concept is the selection. A selection may apply to variables, to cases, or to both.

In the simplest case, we select a single variable or case and move it, thus defining a permutation. More generally, we may select any combination of cases and variables.
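As a minimal sketch of how moving a selection defines a permutation, the following Python fragment moves a block of selected indices within a display order. The function name `move_selection` and its `target` convention are assumptions made for illustration; they are not part of Bertin itself.

```python
import numpy as np

def move_selection(order, selected, target):
    """Move the `selected` entries of `order`, as one block, to position `target`.

    `order` is the current display order (a permutation of 0..n-1).
    `target` counts positions among the entries that are NOT selected.
    """
    chosen = set(selected)
    block = [i for i in order if i in chosen]      # keep current relative order
    rest = [i for i in order if i not in chosen]
    return rest[:target] + block + rest[target:]

data = np.arange(20).reshape(4, 5)   # 4 cases (rows) x 5 variables (columns)

# Move variables 3 and 4 to the front, and case 2 to the end.
col_order = move_selection(list(range(5)), selected=[3, 4], target=0)
row_order = move_selection(list(range(4)), selected=[2], target=4)

# The display is a permuted view; the underlying data are unchanged.
view = data[np.ix_(row_order, col_order)]
```

Keeping the permutation separate from the data means that the original matrix is never modified, only the order in which it is shown.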

Selections may be moved as a whole, collapsed (hidden in the display) or replaced by a surrogate variable.
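The difference between collapsing a selection and replacing it by a surrogate can be sketched as follows. Using the row mean as the surrogate is an assumption for illustration only; the text does not say how a surrogate variable is computed.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0, 10.0],
                 [4.0, 5.0, 6.0, 20.0]])   # 2 cases x 4 variables
selected = [0, 1, 2]                        # a selection of variables (columns)

# Collapse: hide the selected columns in the display; the data stay intact.
visible = [j for j in range(data.shape[1]) if j not in selected]
collapsed_view = data[:, visible]

# Surrogate: replace the selection by a single summary column.
# The mean is a hypothetical choice of summary.
surrogate = data[:, selected].mean(axis=1, keepdims=True)
surrogate_view = np.hstack([surrogate, data[:, visible]])
```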

We may keep classical statistical models in mind. In these models, selections have roles. In a regression context, for example, variables may be regressors (the independent variables) or responses (the dependent variables). In Bertin, this is supported by providing two types of selection.
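To illustrate how two selections with different roles could feed a classical model, here is a sketch using ordinary least squares on synthetic data. The names `regressors` and `response` and the data themselves are assumptions; this is not the Bertin interface.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(30, 4))                   # 30 cases x 4 variables (synthetic)
data[:, 3] = 1.0 + 2.0 * data[:, 0] - data[:, 1]  # column 3 depends on columns 0 and 1

regressors = [0, 1]   # first selection: independent variables
response = [3]        # second selection: dependent variable

# Design matrix with an intercept column, built from the regressor selection.
X = np.column_stack([np.ones(len(data)), data[:, regressors]])
Y = data[:, response]

coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
fit = X @ coef
residual = Y - fit
```

Because the synthetic response is an exact linear function of the two regressors, the estimated coefficients recover the intercept and slopes used to generate it, and the residuals are numerically zero.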

## Modelling

Visualising information is but one aspect. In statistics, as we see it
today, visualisation may be one part of an analysis. The outcome will be
a decision leading to an action. Then there is a loss (or gain)
depending on the action taken on the one hand, and the "true" state of
the world on the other. This is the common decision-theoretic setting.
Statistics has formulated a few standard problems and suggested ways to
handle them. In our Hotel example, the problem can be seen as a
prediction problem: find a model to predict occupation and duration
from the other variables.

More specifically, this is a control problem, because some of the
other variables may allow an intervention. The statistical
contribution is to find a regression model for occupation and
duration, based on the other variables.

The visualisation can be seen as one way to hint at a regression model.
There are very few classical problems. Regression is one of them, and
prediction is closely related. Classification and clustering are
another closely related pair of problems. Their relation to Bertin
matrices is direct: permuting rows and columns so that similar cases and
variables become adjacent is a visual form of clustering.

In a Bertin context, it is tempting to go beyond classical regression. As a proof of concept, we implemented nearest neighbour smoothing in Bertin, where neighbourhoods are understood as cell neighbourhoods in the matrix. Following the usual procedure in regression, this leads to a fit matrix and a residual matrix. Both are Bertin matrices linked to the original data matrix; that is, dynamic linking is preserved.
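A smoother over cell neighbourhoods, with the resulting fit and residual matrices, can be sketched as below. The moving average over a square window, truncated at the borders, is an assumed choice of smoother; the actual implementation in Bertin may differ, and `smooth_cells` and `radius` are hypothetical names.

```python
import numpy as np

def smooth_cells(matrix, radius=1):
    """Average each cell with its neighbours in the displayed matrix.

    The neighbourhood is the (2*radius+1) x (2*radius+1) window around the
    cell, truncated at the matrix borders.
    """
    n_rows, n_cols = matrix.shape
    fit = np.empty_like(matrix, dtype=float)
    for i in range(n_rows):
        for j in range(n_cols):
            block = matrix[max(i - radius, 0):i + radius + 1,
                           max(j - radius, 0):j + radius + 1]
            fit[i, j] = block.mean()
    return fit

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

fit = smooth_cells(data)    # fit matrix
residual = data - fit       # residual matrix, same shape as the data
```

Since the neighbourhoods are defined by position in the matrix, the fit depends on the current row and column order: permuting the matrix changes which cells count as neighbours.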