Unlike many numerical properties we deal with, shape is a somewhat nebulous concept. We can recognize similarities between shapes, but we are often unclear about how we recognize them, and even less clear about how we might instruct a machine to recognize and classify shapes. One of the main tasks of topology is to develop methods for recognizing shapes, which it does through a set of tools called *homology*, or *Betti numbers*, named after the Italian mathematician Enrico Betti. There is one Betti number for each non-negative integer. The zeroth Betti number is simply the number of connected components.

The first Betti number counts the number of independent loops in a space, suitably defined. So, the first Betti number of the letter “A” is one, and the first Betti number of the letter “B” is two.
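To make these two counts concrete, here is a minimal sketch that treats each letter as a graph (a handful of vertices joined by edges). For a graph, the zeroth Betti number is the number of connected components, and the first Betti number is the number of edges minus the number of vertices plus the number of components. The vertex names and the particular way the letters are cut into edges are assumptions chosen for illustration; any reasonable subdivision gives the same answer.

```python
def graph_betti_numbers(vertices, edges):
    """Return (b0, b1) for a graph given as vertex and edge lists."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(vertices)
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1

    b0 = components
    b1 = len(edges) - len(vertices) + b0  # number of independent loops
    return b0, b1


# Letter "A": two legs, an apex, and a crossbar (hypothetical vertex names).
A_VERTICES = ["apex", "left_mid", "right_mid", "left_foot", "right_foot"]
A_EDGES = [("apex", "left_mid"), ("apex", "right_mid"),
           ("left_mid", "right_mid"),
           ("left_mid", "left_foot"), ("right_mid", "right_foot")]

# Letter "B": a spine with an upper and a lower bowl.
B_VERTICES = ["top", "middle", "bottom", "upper_bowl", "lower_bowl"]
B_EDGES = [("top", "middle"), ("middle", "bottom"),
           ("top", "upper_bowl"), ("upper_bowl", "middle"),
           ("middle", "lower_bowl"), ("lower_bowl", "bottom")]

print(graph_betti_numbers(A_VERTICES, A_EDGES))  # (1, 1)
print(graph_betti_numbers(B_VERTICES, B_EDGES))  # (1, 2)
```

Both letters are a single connected piece, so the zeroth Betti number is one in each case; only the loop count tells them apart.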

The higher Betti numbers count occurrences of “higher-dimensional cycles”, which again have to be suitably defined. Here are some examples.

The counts described above sound great intuitively, but it seems difficult to make mathematical sense of them, let alone instruct a computer to evaluate them. It turns out that this is possible, although it took a great deal of effort from a number of mathematicians in the early to mid 20th century. For every space (think of a subset of 2-, 3-, 4-, or n-dimensional space) and every integer k greater than or equal to zero, there is a k-th Betti number, which measures the presence of k-dimensional cycles in that space.
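One standard way to hand this computation to a machine is to build the shape out of points, edges, triangles, and so on (a simplicial complex) and reduce the problem to linear algebra: the k-th Betti number equals the number of k-simplices, minus the rank of the boundary map out of dimension k, minus the rank of the boundary map into dimension k. The sketch below assumes rational coefficients and uses exact fractions; the function names and the two tiny example complexes are illustrative choices, not something taken from the text above.

```python
from fractions import Fraction
from itertools import combinations


def matrix_rank(matrix):
    """Rank of a list-of-rows matrix via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def simplices_by_dimension(top_simplices):
    """All non-empty faces of the given simplices, grouped by dimension."""
    faces = set()
    for simplex in top_simplices:
        vertices = tuple(sorted(simplex))
        for size in range(1, len(vertices) + 1):
            faces.update(combinations(vertices, size))
    max_dim = max(len(f) for f in faces) - 1
    return [sorted(f for f in faces if len(f) == d + 1) for d in range(max_dim + 1)]


def boundary_rank(k_simplices, faces_below):
    """Rank of the boundary map from k-simplices to (k-1)-simplices."""
    row_of = {face: i for i, face in enumerate(faces_below)}
    matrix = [[0] * len(k_simplices) for _ in faces_below]
    for col, simplex in enumerate(k_simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # drop the i-th vertex
            matrix[row_of[face]][col] = (-1) ** i  # alternating sign
    return matrix_rank(matrix)


def betti_numbers(top_simplices):
    by_dim = simplices_by_dimension(top_simplices)
    ranks = [0] * (len(by_dim) + 1)  # ranks[k]: boundary map out of dimension k
    for k in range(1, len(by_dim)):
        ranks[k] = boundary_rank(by_dim[k], by_dim[k - 1])
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(len(by_dim))]


hollow_triangle = [(0, 1), (1, 2), (0, 2)]            # a combinatorial circle
hollow_tetrahedron = list(combinations(range(4), 3))  # a combinatorial sphere
print(betti_numbers(hollow_triangle))     # [1, 1]
print(betti_numbers(hollow_tetrahedron))  # [1, 0, 1]
```

The circle has one component and one loop; the sphere has one component, no loops, and one two-dimensional cycle.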

The problem, which has only come up in the last 15 years, is how to infer shape when all we have is a finite sample from the shape. Here is a typical example.

When we look at this picture, our visual system is able to detect the presence of a loop in the set. On the other hand, at a very fine-grained level, this set is just a finite discrete set of points, which has no loops or higher-dimensional cycles, only a large number of connected components, each of which contains a single point. Looking at this picture, we naturally ask ourselves whether we can construct mathematical objects like the Betti numbers that actually detect this kind of statistical pattern in such sets. This turns out to be possible, and the resulting objects are called *barcodes*, which are simply finite collections of intervals. Each *point cloud*, such as the one we have shown above, has a barcode for each non-negative dimension. Here are some examples.

The upper row of barcodes corresponds to the first Betti number. Note that long bars correspond to genuine features, in this case essential loops: the circle has one, the sphere has none, and the torus has two. The second row shows the analogues of the second Betti number. For the circle there are no two-dimensional features, while for the torus and sphere point clouds there is a single long bar, indicating that the second Betti number is one.
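For readers who want to see a barcode come out of a point cloud, here is a minimal sketch under stated assumptions. It uses the Vietoris-Rips construction (a simplex enters once all its points are within a given distance of each other) and coefficients mod 2, with the standard column-reduction algorithm; the text above does not say which construction produced the pictured barcodes, so treat this as one common choice. All function names are illustrative, and distances are rounded so that equal distances compare as equal.

```python
import math
from itertools import combinations


def rips_filtration(points, max_dim=2):
    """(value, simplex) pairs sorted so that faces precede cofaces."""
    def dist(i, j):
        return round(math.dist(points[i], points[j]), 9)

    simplices = [(0.0, (i,)) for i in range(len(points))]
    for size in range(2, max_dim + 2):
        for simplex in combinations(range(len(points)), size):
            value = max(dist(i, j) for i, j in combinations(simplex, 2))
            simplices.append((value, simplex))
    simplices.sort(key=lambda item: (item[0], len(item[1]), item[1]))
    return simplices


def persistence_barcodes(points, max_dim=2):
    """Barcodes in dimensions 0 .. max_dim - 1 as lists of (birth, death)."""
    filtration = rips_filtration(points, max_dim)
    index = {simplex: i for i, (_, simplex) in enumerate(filtration)}
    columns = []      # reduced boundary columns, stored as sets (mod 2)
    pivot_owner = {}  # creator index -> index of the simplex that kills it
    for j, (_, simplex) in enumerate(filtration):
        col = set()
        if len(simplex) > 1:
            col = {index[simplex[:i] + simplex[i + 1:]] for i in range(len(simplex))}
        # Add earlier columns until the lowest entry is unclaimed or col is empty.
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]
        columns.append(col)
        if col:
            pivot_owner[max(col)] = j

    bars = {d: [] for d in range(max_dim)}
    for creator, killer in pivot_owner.items():
        birth, simplex = filtration[creator]
        death = filtration[killer][0]
        if death > birth:  # drop zero-length bars
            bars[len(simplex) - 1].append((birth, death))
    for j, (value, simplex) in enumerate(filtration):
        never_killed = not columns[j] and j not in pivot_owner
        if never_killed and len(simplex) - 1 < max_dim:
            bars[len(simplex) - 1].append((value, math.inf))
    return bars


def circle_points(n):
    """n evenly spaced points on the unit circle (a noise-free toy sample)."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


bars = persistence_barcodes(circle_points(12))
print("dimension 0:", sorted(bars[0]))
print("dimension 1:", sorted(bars[1]))
```

On this toy circle the dimension-one barcode should contain a single long bar, matching the one essential loop described above, while the dimension-zero barcode has many short bars and one that never ends. The brute-force enumeration of all triangles only suits small samples; dedicated libraries are the practical choice for real data.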

This has been a brief description of how one uses patterns occurring in a shape to distinguish shapes from each other, and how one can do that for “point clouds”. The method for ordinary shapes is called *homology*, and for point clouds it’s called *persistent homology*.

Comment: Suppose that I have irregular hyperdimensional data that I would like to characterize in terms of shape. The data do not fit regular shapes, so one would think in terms of fractals, or actually multifractals.

Question:

a.) If you would like to characterize such hyperdimensional data through a topological analysis, what are the benefits of analyzing the data in terms of their topological properties, in contrast to a multifractal perspective?

b.) If you are interested in the ‘general’ smoothness of the data, seeking the smoothest clusters of data, what methodology would you follow?

Context: I am thinking of characterizing a turbulent flow, and of using a network approach as a frame of reference to analyze optimization results that seek to describe financial data.