In the morning session today Sara Sullam and I will be presenting our work on exploring nominal (in our case study - bibliographical) data. We do it by borrowing a method from educational research - the notion of phenomenographic variation. 🧵

First, what is phenomenography and how could it help us? In contrast to phenomenology, which studies what phenomena are, phenomenography is concerned "only" with how they are perceived. We build on the idea that scientific inquiry can also be seen as learning at the collective level. This is especially true for interdisciplinary research conducted by a team of researchers (hello digital humanities). A theory that emerged from phenomenographic research is variation theory: the idea that one needs to experience variation to comprehend a phenomenon. And there is a very specific way to achieve this: by using patterns of contrast, generalisation and fusion.

These patterns of variation consider aspects of a phenomenon, which at a simple level could be seen as dimensions of the data. The simplest of the three patterns is contrast: the idea that to start understanding a phenomenon one needs to consider each of its dimensions in isolation (i.e. varying it while keeping the others fixed). Our example is the translation of Italian novels from the post-war period for the UK market. We apply contrast to authors. Our way of fixing the other dimensions is to count them.
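
To make the contrast pattern concrete, here is a minimal sketch in Python. It is not the code behind our graphs: the records, the field names and the `contrast` helper are all hypothetical placeholders. It varies one dimension (the author) and "fixes" every other dimension by collapsing it into a count of distinct values.

```python
from collections import defaultdict

# Hypothetical placeholder records, not the real bibliographical data.
FIELDS = ("author", "italian_publisher", "uk_publisher", "translator", "year")
records = [
    ("Author A", "IT Pub 1", "UK Pub 1", "Translator X", 1950),
    ("Author A", "IT Pub 2", "UK Pub 2", "Translator Y", 1955),
    ("Author B", "IT Pub 1", "UK Pub 1", "Translator X", 1952),
    ("Author B", "IT Pub 1", "UK Pub 1", "Translator X", 1958),
]


def contrast(rows, vary="author"):
    """For each value of `vary`, count the distinct values of every other dimension."""
    idx = FIELDS.index(vary)
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for i, field in enumerate(FIELDS):
            if i != idx:
                seen[row[idx]][field].add(row[i])
    # Collapse each set of observed values into its size.
    return {
        key: {field: len(values) for field, values in dims.items()}
        for key, dims in seen.items()
    }


summary = contrast(records)
for author, counts in summary.items():
    print(author, counts)
```

Passing a different `vary` (for example `"uk_publisher"`) applies the same pattern to another dimension.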

The second pattern is generalisation. It says we should fix our dimension of interest and vary anything independent of it. We do this by focusing on individual authors and looking at the other dimensions (e.g. the interplay between Italian and UK publishers). In the example here, we consider Vasco Pratolini, who is the only author from the above graph that has more than one publisher in both countries. Of course this graph is just the start of an enquiry into why this occurs. In our case we actually had to look into the archives of publisher exchanges to get an understanding, but that's beyond the topic here.
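
As a rough illustration of generalisation, the sketch below fixes one author and counts how often each Italian publisher is paired with each UK publisher. The data and the `publisher_pairs` helper are hypothetical placeholders, not our actual dataset or tooling.

```python
from collections import Counter

# Hypothetical placeholder records: (author, italian_publisher, uk_publisher).
translations = [
    ("Author A", "IT Pub 1", "UK Pub 1"),
    ("Author A", "IT Pub 1", "UK Pub 1"),
    ("Author A", "IT Pub 2", "UK Pub 2"),
    ("Author B", "IT Pub 1", "UK Pub 2"),
]


def publisher_pairs(rows, author):
    """Fix the author and count each (Italian publisher, UK publisher) pairing."""
    return Counter((it_pub, uk_pub) for a, it_pub, uk_pub in rows if a == author)


pairs = publisher_pairs(translations, "Author A")
for (it_pub, uk_pub), n in pairs.most_common():
    print(f"{it_pub} -> {uk_pub}: {n}")
```

A pairing that recurs for the fixed author is the kind of signal worth following up in the archives.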

PS: Sorry for the transparent backgrounds of images that don't work well on dark app themes. The graphs can be seen better in the paper which is at the end of this thread

Finally, after a first exploration, one might feel ready to see the big picture, i.e. fusion. Of course, after that one might backtrack to drill down into particular values.

One way to show multidimensional data (nominal, except for years) that we've found useful is the following graph. But more generally we need visualisation techniques that can cope with multidimensional nominal data. For two dimensions, heatmaps could be a good candidate. It gets more complicated with more dimensions. Alluvial diagrams could come in handy here.
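
For the two-dimensional case, the numbers behind a heatmap are just a table of co-occurrence counts. A minimal sketch, with hypothetical placeholder data and a hypothetical `crosstab` helper, could look like this. Because nominal categories have no natural order, the row and column order is passed in explicitly.

```python
from collections import Counter

# Hypothetical placeholder pairs: (italian_publisher, uk_publisher).
editions = [
    ("IT Pub 1", "UK Pub 1"),
    ("IT Pub 1", "UK Pub 1"),
    ("IT Pub 1", "UK Pub 2"),
    ("IT Pub 2", "UK Pub 2"),
]


def crosstab(pairs, row_order, col_order):
    """Build a matrix of co-occurrence counts in an explicitly chosen category order."""
    counts = Counter(pairs)
    # Counter returns 0 for pairings that never occur.
    return [[counts[(r, c)] for c in col_order] for r in row_order]


row_order = ["IT Pub 1", "IT Pub 2"]
col_order = ["UK Pub 1", "UK Pub 2"]
matrix = crosstab(editions, row_order, col_order)

print("".ljust(10) + "".join(c.ljust(10) for c in col_order))
for label, row in zip(row_order, matrix):
    print(label.ljust(10) + "".join(str(n).ljust(10) for n in row))
```

Reordering `row_order` or `col_order` changes only the layout, not the counts, which makes it cheap to try different arrangements.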

A note on visualisation. For no good reason, a majority of visualisation tools cannot handle categorical dimensions. Probably they can't be bothered with having to specify an explicit order for the dimension being visualised. This is only one of the reasons we found tools like Excel fast, but too limited to be useful. Programmable tools like ggplot or bokeh are more powerful, but each iteration takes too long to visualise, and that slows down the exploration process. We've found a solution that is great for two reasons. First, it allows categorical dimensions in the charts we used, and second, it allows SVG export for post-processing. So shout out to rawgraphs.io/

Finally, here's the full text. It has more context and examples. We'd love a discussion beyond the one after the presentation ceur-ws.org/Vol-3558/paper774.

Sorry, that was incorrect. Pratolini is not the only one with two publishers in the UK, but the only one who has two publishers repeatedly translating works of his.
