G. Sawitzki StatLab Heidelberg Last revision: 2014-04-19 by gs

Bertin Matrices: Patches and Local Models

It can be useful to take classical statistical methods and ask how they are reflected in a Bertin setting. Regression and classification serve as starting points. For both, traditional statistics has provided solutions in the classical framework: least squares regression and discriminant analysis. Many variations have followed.

Classification and regression trees (CART) have thrown new light on these problems. In principle, CART is a higher-dimensional strategy. In the Bertin context, we deliberately restrict ourselves to a two-dimensional display. CART suggests looking for "patches": rectangular areas that allow for an economical model. CART uses a simple strategy, splitting one variable at a point. However, this fragments the data: after k splits, each patch holds on average only a fraction 1/2^k of the observations, which carries a corresponding penalty in terms of variance.
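
The cost of fragmentation can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the Bertin implementation: it assumes independent observations with a common variance and patches of equal average size, and the function names and the sample size are hypothetical.

```python
def mean_patch_size(n, k):
    """Average number of observations per patch after k binary splits."""
    return n / 2 ** k


def variance_of_patch_mean(sigma2, n, k):
    """Variance of a patch mean, assuming i.i.d. observations with
    variance sigma2 and patches of average size n / 2**k (an idealisation)."""
    return sigma2 / mean_patch_size(n, k)


if __name__ == "__main__":
    n, sigma2 = 1024, 1.0  # hypothetical sample size and noise variance
    for k in range(6):
        # Each additional split halves the patch size and doubles the variance.
        print(k, mean_patch_size(n, k), variance_of_patch_mean(sigma2, n, k))
```

Under these assumptions the variance of a patch mean grows by the same factor 2^k by which the data are divided.
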
Friedman has pointed out that this fragmentation is not necessary. Instead of one set of patches you can have two: one used for estimation, and a second, possibly different, one used to apply the estimate for fitting. You want to keep the first large to control variance, and the second small to reduce bias.
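
The two-patch idea can be sketched in one dimension. This is an illustration under stated assumptions, not the author's or Friedman's implementation: the local model is taken to be a least squares line, patches are plain intervals on a single variable, and the names `fit_line`, `two_patch_fit`, `est_patch` and `fit_patch` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares line through (xs, ys); returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values in the estimation patch")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_patch_fit(xs, ys, est_patch, fit_patch):
    """Estimate on the (large) estimation patch, apply on the (small) fitting patch.

    est_patch and fit_patch are (low, high) intervals on x.
    Returns a list of (x, fitted value) for the points inside fit_patch.
    """
    lo, hi = est_patch
    est = [(x, y) for x, y in zip(xs, ys) if lo <= x <= hi]
    intercept, slope = fit_line([x for x, _ in est], [y for _, y in est])
    flo, fhi = fit_patch
    return [(x, intercept + slope * x) for x in xs if flo <= x <= fhi]


if __name__ == "__main__":
    # Hypothetical, noise-free data on a grid, for illustration only.
    xs = [i / 10 for i in range(101)]
    ys = [2.0 + 0.5 * x for x in xs]
    # Wide patch for estimation, narrow patch for applying the fit.
    for x, fitted in two_patch_fit(xs, ys, est_patch=(2.0, 8.0), fit_patch=(4.5, 5.5)):
        print(x, fitted)
```

The estimation patch determines how many observations enter the estimate, and hence its variance; the fitting patch determines where the estimate is trusted, and hence the bias incurred if the model holds only locally.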

This is not implemented explicitly in the R implementation of Bertin, but we suggest it as a strategic guide. The examples provided in these notes follow this strategy.

A patch
Patches with multiple models