In this morning's session, Sara Sullam and I will be presenting our work on exploring nominal data (bibliographical, in our case study). We do it by borrowing a method from educational research: the notion of phenomenographic variation. #CHR2023🧵
First, what is #phenomenography and how could it help us? In contrast to phenomenology, which studies what phenomena are, phenomenography is concerned "only" with how they are perceived. We build on the idea that scientific inquiry could also be seen as learning at the collective level. This is especially true for interdisciplinary research conducted by a team of researchers (hello digital humanities). A theory that emerged from phenomenographic research is variation theory, i.e. one needs to experience variation to comprehend a phenomenon. And there is a very specific way to achieve this: by using patterns of contrast, generalisation and fusion.
These patterns of variation consider aspects of a phenomenon, which at a simple level could be seen as dimensions of the data. The simplest of the three patterns is contrast: the idea that to start understanding a phenomenon, one needs to consider each of its dimensions in isolation (i.e. varying it while keeping the others fixed). Our example is the translation of Italian novels from the post-war period for the UK market. We apply contrast to authors. Our way of fixing the other dimensions is to count them.
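As a rough sketch (not the code behind the paper), here is one possible reading of "contrast on authors": vary the author and collapse every other dimension into a count of its distinct values. The records, field names and the `contrast` helper are hypothetical placeholders, not data or code from our study.

```python
from collections import defaultdict

# Hypothetical records: (author, Italian publisher, UK publisher).
# Placeholder values only, not data from the paper.
records = [
    ("Author A", "IT Pub 1", "UK Pub 1"),
    ("Author A", "IT Pub 2", "UK Pub 1"),
    ("Author A", "IT Pub 2", "UK Pub 2"),
    ("Author B", "IT Pub 1", "UK Pub 3"),
    ("Author B", "IT Pub 1", "UK Pub 3"),
]
fields = ("author", "it_publisher", "uk_publisher")


def contrast(rows, field_names, vary):
    """For each value of `vary`, count the distinct values of every other dimension."""
    vary_idx = field_names.index(vary)
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for idx, name in enumerate(field_names):
            if idx != vary_idx:
                seen[row[vary_idx]][name].add(row[idx])
    # Replace each set of observed values by its size.
    return {
        key: {name: len(values) for name, values in dims.items()}
        for key, dims in seen.items()
    }


summary = contrast(records, fields, vary="author")
for author, counts in summary.items():
    print(author, counts)
```

Authors whose counts exceed one on several dimensions at once are natural candidates for a closer look.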
The second pattern is generalisation. It says we should fix our dimension of interest and vary anything independent of it. We do this by focusing on individual authors and looking at other dimensions (e.g. the interplay between Italian and UK publishers). In the example here, we consider Vasco Pratolini, who is the only author from the above graph with more than one publisher in both countries. Of course, this graph is just the start of an enquiry into why this occurs. In our case we actually had to look into the archives of publisher exchange to get an understanding, but that's beyond the topic here.
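Generalisation could be sketched in a similar way: fix one author and count how often each pairing of Italian and UK publisher occurs. Again, the records and the `generalise` helper are hypothetical and only illustrate the idea under these assumptions.

```python
from collections import Counter

# Hypothetical records: (author, Italian publisher, UK publisher).
# Placeholder values only, not data from the paper.
records = [
    ("Author A", "IT Pub 1", "UK Pub 1"),
    ("Author A", "IT Pub 1", "UK Pub 1"),
    ("Author A", "IT Pub 2", "UK Pub 2"),
    ("Author B", "IT Pub 1", "UK Pub 3"),
]


def generalise(rows, author):
    """Fix one author and count each (Italian publisher, UK publisher) pairing."""
    return Counter(
        (it_pub, uk_pub) for name, it_pub, uk_pub in rows if name == author
    )


pairs = generalise(records, "Author A")
for (it_pub, uk_pub), n in sorted(pairs.items()):
    print(f"{it_pub} -> {uk_pub}: {n}")
```

The counts only show which pairings exist, not why; that question still needs the archival work.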
PS: Sorry for the transparent image backgrounds, which don't work well on dark app themes. The graphs are easier to see in the paper, which is linked at the end of this thread.
Finally, after a first exploration, one might feel ready to see the big picture, i.e. fusion. Of course, after that one might backtrack to drill back into particular values.
One way to show multidimensional data (nominal, except for years) that we've found useful is the following graph. But more generally we need visualisation techniques that can handle multidimensional nominal data. For two dimensions, heatmaps could be a good candidate. It gets more complicated with more dimensions. Alluvial diagrams could come in handy here.
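For the two-dimensional case, the data behind a heatmap is just a table of co-occurrence counts. A minimal sketch, assuming hypothetical publisher pairs and a hypothetical `crosstab` helper (neither comes from the paper):

```python
from collections import Counter

# Hypothetical (Italian publisher, UK publisher) pairs, one per translated title.
# Placeholder values only, not data from the paper.
pairs = [
    ("IT Pub 1", "UK Pub 1"),
    ("IT Pub 1", "UK Pub 1"),
    ("IT Pub 2", "UK Pub 1"),
    ("IT Pub 2", "UK Pub 2"),
]


def crosstab(observations):
    """Build a count matrix for two nominal dimensions, ready for a heatmap."""
    counts = Counter(observations)
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    # Counter returns 0 for pairings that never occur.
    matrix = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, matrix


row_labels, col_labels, matrix = crosstab(pairs)
print("columns:", col_labels)
for label, row in zip(row_labels, matrix):
    print(label, row)
```

The resulting matrix can be handed to any plotting tool that colours cells by value. With a third or fourth dimension the table no longer fits on a flat grid, which is where alluvial diagrams become attractive.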
Finally, here's the full text. It has more context and examples. We'd love a discussion beyond the one after the presentation https://ceur-ws.org/Vol-3558/paper774.pdf