Topology within mathematics can be characterized as that part of the subject which studies notions of shape.  It really consists of at least two separate threads: one in which one attempts to “measure” shape, and another in which one attempts to find compressed combinatorial representations of shape and to analyze the degree to which these representations are faithful to the shape.  The first proceeds primarily via algebraic invariants, such as homology and homotopy groups, which measure and count the instances of particular patterns within the shape in a suitably systematic way.  The second is the subject of a great deal of manifold topology, and is exemplified by the work on the “Hauptvermutung” concerning the existence of a common subdivision of any two triangulations of a manifold.

Both these threads have been extended to the world of point clouds of data.  The measurement thread is extended via the theory of persistent homology and its variants.  The representation thread is extended by various simplicial complex constructors, such as Vietoris-Rips complexes, witness complexes, and the complexes constructed by Ayasdi’s Platform.  In ordinary topology, the role of the combinatorial representations is to lend additional concreteness to the study of the shape, as well as to provide a succinct representation of it.  They serve the same purpose in the study of high dimensional and complex data sets: they provide a compressed representation of the data which retains information about the geometric relationships between data points.  The representations are also easy to work with, so they provide simple and extremely useful ways to interrogate the data and to understand the driving variables characterizing various subgroups.  At a high level, one can say that they allow for easy identification of coherent groups within the data.  The search for coherent groups, performed naively, is clearly intractable, since it requires searching through all subsets of the data set, and a data set of n points has 2^n subsets.
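To make the idea of a simplicial complex constructor concrete, here is a minimal sketch of a Vietoris-Rips construction at a single scale, using only the Python standard library.  It is an illustration under stated assumptions, not the method used by Ayasdi’s Platform: the scale parameter `epsilon`, the helper names, and the toy point cloud are all hypothetical, and the Euclidean metric is assumed.

```python
from itertools import combinations
from math import dist


def rips_edges(points, epsilon):
    """Edges (1-simplices): pairs of points at Euclidean distance <= epsilon."""
    return [
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) <= epsilon
    ]


def rips_triangles(n_points, edges):
    """Triangles (2-simplices): triples whose three pairwise edges are all present."""
    edge_set = set(edges)
    return [
        (i, j, k)
        for i, j, k in combinations(range(n_points), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    ]


def connected_components(n_points, edges):
    """Group vertex indices by connected component using union-find."""
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = {}
    for v in range(n_points):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Hypothetical toy point cloud: two well-separated clusters in the plane.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0), (5.5, 5.0)]
edges = rips_edges(points, epsilon=1.0)
triangles = rips_triangles(len(points), edges)
components = connected_components(len(points), edges)

print(edges)       # [(0, 1), (0, 2), (1, 2), (3, 4)]
print(triangles)   # [(0, 1, 2)]
print(components)  # [[0, 1, 2], [3, 4]]
```

The connected components of the complex are the simplest example of coherent groups read off from the representation: they are found by a single pass over the edges rather than by enumerating subsets of the data.  Persistent homology, roughly speaking, tracks how features such as these components and loops appear and disappear as `epsilon` varies, rather than fixing one scale as this sketch does.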

Ultimately, both sets of ideas will be useful in permitting investigators to study their data.  The representations are at the forefront, because they are what a user deals with directly.  As we move further into automation, measuring the shape of a data set and of Ayasdi’s complex outputs will be critical.  We will want, for example, to test Ayasdi constructions for the presence of geometric features such as flares and loops, so as to provide the user with the best possible “quick analysis”, automatically building complexes for the user without requiring hand selection of parameter values, metrics, and lenses.

