When "Big data" is not Enough: A Report of the BigVideo Conference in Aalborg

By: Liliana Melgar

(Also on behalf of Jaap Blom, lead developer at the CLARIAH project, and Rob Wegter, a media scholar who represented the CLARIAH pilot projects at the event.)

During three days in November 2017, around thirty researchers and practitioners interested in video analysis came to Aalborg, Denmark, for the “BigVideo sprint and mini-conference” organized by Professor Paul McIlvenny of the Department of Culture and Global Studies and Assistant Professor Jacob Davidsen of the Department of Culture and Psychology of Aalborg University. The event was sponsored by DIGHUMLAB, the national distributed research infrastructure in Denmark, and the Video Interaction Lab (VILA) of Aalborg University. This blog post gives an overview of the event and concludes with the parts that were most interesting from a CLARIAH perspective.

The main audience of the conference consisted of experts in ethnomethodological conversation analysis (EMCA), but it also included researchers from other fields, such as ethno-cinema, media studies, and digital culture, who were attracted by the conference's mostly qualitative approach to video analysis (and who were very welcome to share their views).

Participants (presenters) of the "BigVideo sprint and mini-conference" (Photo: Mathias Broth)

The purpose was to work towards a “manifesto” for innovative qualitative methods of working with video in a “human-centered data” approach. While recognizing the importance and impact of quantitative methods and big data, the organizers strongly advocate being critical and looking closely again at the qualitative side of research in this context. As a parody of the term “big data”, they used the term “big video” (not so seriously in the beginning), which shifts the focus from aggregated data towards the individual “objects”: moving images that are richer (and bigger). This need for a “sense of balance” in working with video originates from the significant changes that technology has brought to the research activities of capturing, storing, archiving, accessing, collaborating, sharing, visualizing (which often requires transforming sources into data), and publishing. The conference also focused on the tools that support these tasks and on the methodological implications of their use.

The opening keynote by Paul McIlvenny and Jacob Davidsen summarized their collaborative work on experimenting with 360-degree cameras, stereoscopic cameras, multi-track video and audio streams, multi-track subtitling, and video annotation. More references and illustrations of their work can be found in the BigVideo Manifesto, published in Nordicom Information.

Keynote by Paul McIlvenny and Jacob Davidsen

The other keynote speakers discussed the use of video in research from the perspective of their own disciplines. Anne Harris (RMIT, Australia), an expert on participatory methods, discussed video in the framework of social movement theories and ethno-cinema. Inspired by Jean Rouch, she focuses on the use and creation of “non-representational” videos for researching cultures and problematic societal situations. Robert Willim (Lund University, Sweden) showed another angle of video “as method,” in which cultural analysis comes closer to art: video devices are not meant to “objectively” capture realities for further study, but to transform them into new visual and aural digital experiences. In one of the “data sessions” that he led, we observed and discussed his experiment with a recording of natural landscapes and sounds made with a home device, which resulted in a “rarified” version of the original recording and led us to reflect on fieldwork, reality capture, and preservation.

For our work in the CLARIAH project, the most interesting and most anticipated keynote was by Adam Fouse (Aptima, USA), an expert on “interactive visualization of complex data” and on the analysis of time-based media. He is the developer and maintainer of the “ChronoViz” tool, a rich environment for the visualization, navigation, and analysis of time-based media and accompanying data with a multi-modal approach. This tool facilitates the use of multiple synchronized sources, the integration of data captured by sensor devices, geographic positioning, and the integration of notes taken during fieldwork or from observations made during the analysis.

Adam Fouse presenting ChronoViz (slide on its application domains)

Adam Fouse has also conceptualized the research process of scholars working with audio-visual data (we have done some studies on this in the context of CLARIAH as well; see our paper, Melgar et al., 2017), and he presented an overview of the workflows that take place in video-based research in different domains. Fortunately for our work on implementing an annotation tool in the CLARIAH environment, the tasks and processes that he listed are similar to those we observe among media scholars and oral historians. It is no coincidence that Unsworth (2000) refers to annotation as one of the “scholarly primitives”, which are those “basic functions common to scholarly activity across disciplines, over time, and independent of theoretical orientation.”

One of the most surprising experiences of this conference for me was getting acquainted with the method of “data sessions.” These are meetings in which communication scholars (conversation analysts, to be more precise) get together to look at videos (data) and do the analysis as a group. Coming from the field of information science myself, and being familiar with media scholars from my previous research and work at CLARIAH, I have often seen scholarly work as quite a solitary activity. Even though scholars exchange their observations with peers via different channels during their research, or work in teams, I had never come across this interesting practice of “proto-analysis” with defined protocols. In these sessions, someone brings an “unpolished” source (audio, video) and/or a transcript, and everyone writes down their observations, either freely or guided by a problem or question. The observations are then shared and discussed at the end.

Our contribution, presented by Jaap, Rob Wegter (a media scholar who participates in one of the CLARIAH pilot projects), and me, consisted of explaining the CLARIAH project and the development of an annotation tool for video, audio, images, and text for that project. Jaap explained the earlier prototypes that led to the initial ideas for the CLARIAH annotation tool, and Rob explained how he is using the initial versions of that tool to analyze documentary films.

Jaap Blom, Rob Wegter, Liliana Melgar representing CLARIAH at the BigVideo Conference

For our work in the CLARIAH project, besides Adam Fouse’s keynote, I highlight:

- The presentation about the evolution of multimodal transcription practices by Lorenza Mondada. She detailed existing practices in transcribing video data for subsequent analysis using tools such as ELAN, and explained how the limitations of transcription conventions for describing temporal relations led her to develop new ways of transcribing. An important observation she made, which I associated with the practices of oral historians, is that scholars working with audio-visual media have often relied on a textual transcript, but that the progress in tools for annotating video directly will challenge this practice.
- I also highlight the presentation by Christensen and Abildgaard about the Design Thinking Research Symposium 11 (DTRS11), which gave us new insights into the co-development approach we take in the CLARIAH project. In this workshop, participants from several disciplines were exposed to the same datasets, and had to practice and reflect on their own methods and on the differences across disciplines when using the same data. This is crucial for understanding how to facilitate data sharing that fits the needs of different communities, who can approach the same datasets with different research questions and methods.
- The presentation by Tom Koolen, from Groningen University, introduced the ADVANT proposal. This is a longer-term plan for providing annotation support to communication scholars (and other scholars in the Netherlands). A few weeks before the BigVideo conference, Tom organized a one-week workshop in Leiden called “[Collecting, Annotating & Analyzing Video Data](https://www.lorentzcenter.nl/lc/web/2017/926/participants.php3?wsid=926&venue=Snellius).”
- Finally, one of the surprises/discoveries for me at this conference was to observe how annotations are used in 3D/360 videos. In one of the data sessions, Paul and Jacob gave some of the participants the chance to interact with sophisticated devices for experiencing high-quality video and audio captures. Surprisingly, or perhaps not if we remember Unsworth’s idea of the scholarly primitives, annotation takes different forms in these videos. The analyst draws shapes in the video and records voice memos in order to note important observations. You can see some GIF animations in Hugo Huurdeman’s event report. These notes and illustrations remain stable in the video, and can be “played” simultaneously or asynchronously by other researchers (as in the data sessions) to exchange ideas that lead to possible interpretations. In one of the working sessions, we briefly discussed the need to establish conventions for these “drawings,” attaching semantics to the shapes in order to facilitate subsequent processing: e.g., finding all drawings with the shape of a hand/arm, or of circular objects. These are more advanced forms of coding data directly, and they take annotation beyond 2D video, where standards for sharing annotations are themselves still evolving (see, for example, the recent W3C standard for web annotations).
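
To make the idea of “attaching semantics to the shapes” a bit more concrete, here is a minimal sketch of how a drawn shape could be expressed with the W3C Web Annotation Data Model and then retrieved by its shape label. This is only an illustration under my own assumptions: it is not how the tools shown at the conference or the CLARIAH annotation tool store their data, the video URLs and the tag vocabulary (“hand”, “circle”) are made up, and the helper names `make_shape_annotation` and `find_by_shape` are hypothetical. It also only covers flat video with a time range and an SVG outline, not 360-degree space.

```python
# Sketch only: a drawn shape on a video segment as a W3C Web Annotation,
# tagged with a shape label so that it can be found again later.
# The URLs, tag vocabulary and helper names are illustrative assumptions.

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def make_shape_annotation(anno_id, video_url, start, end, svg_shape, shape_tag):
    """Build one annotation: a time range refined by an SVG outline, plus a tag."""
    return {
        "@context": ANNO_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "motivation": "tagging",
        # The semantic label attached to the drawing.
        "body": {"type": "TextualBody", "purpose": "tagging", "value": shape_tag},
        "target": {
            "source": video_url,
            "selector": {
                # Temporal part, using Media Fragments syntax (seconds).
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"t={start},{end}",
                # Spatial part: the drawn outline within that time range.
                "refinedBy": {"type": "SvgSelector", "value": svg_shape},
            },
        },
    }


def find_by_shape(annotations, shape_tag):
    """Return all annotations whose tagging body carries the given shape label."""
    return [
        a for a in annotations
        if a["body"].get("purpose") == "tagging" and a["body"]["value"] == shape_tag
    ]


annotations = [
    make_shape_annotation(
        "http://example.org/anno/1", "http://example.org/video/session1.mp4",
        12.5, 15, '<svg><circle cx="120" cy="80" r="30"/></svg>', "circle",
    ),
    make_shape_annotation(
        "http://example.org/anno/2", "http://example.org/video/session1.mp4",
        40, 44, '<svg><polygon points="10,10 60,20 50,70 15,60"/></svg>', "hand",
    ),
]

for anno in find_by_shape(annotations, "hand"):
    print(anno["id"], anno["target"]["selector"]["value"])
```

With a shared vocabulary for the tags, a query such as “all drawings of a hand” becomes a simple filter over the annotation bodies, which is the kind of subsequent processing we discussed in the working session.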
  
The event also included “sprint” sessions, in which we were divided into groups to work on themes for the manifesto. The three themes were “video data collection”, “video data visualization including transcription” (which I would call the “analysis phase”), and “video data sharing and dissemination.” The final session consisted of presentations with the main conclusions from the working sessions. One of the most challenging but promising aspects for our CLARIAH project is the need to consider methodological differences across disciplines when the infrastructure also accommodates communication scholars. The most important difference is that in ethnomethodological conversation analysis, the ethnographers focus significantly on the recording process, “capturing what is out there”, which brings issues of storage and sharing to the fore (for instance, in the case of the organizers, their collections may occupy as much as 8 terabytes). Media scholars, in contrast, are more used to working with sources “published” by others (e.g., broadcast materials, films, YouTube videos), and oral historians, even though they also create their sources, are less demanding in terms of capturing technologies.
  
Happily, two weeks after the event, we got the news that the manifesto had been published! We would like to thank the organizers for inviting us to be part of such an exciting, intensive, and fruitful event (and for taking us to a fantastic dinner at the Viking graveyard).
  
More information:
- The “BigVideo sprint and mini-conference” program and abstracts: https://easychair.org/smart-program/BIGVID2017/index.html
- McIlvenny, P., & Davidsen, J. (2017). A Big Video Manifesto: Re-sensing video and audio. Nordicom Information, 39(2). Retrieved from https://nordicom.gu.se/sv/system/tdf/kapitel-pdf/mcilvenny_davidsen.pdf?file=1&type=node&id=39006
- Hugo Huurdeman’s “Conference Report: Big Video Sprint Conference”, which also contains illustrations: https://www.ub.uio.no/om/prosjekter/the-visualisation-project/news/conference-report-big-video-conference.html
- Melgar Estrada, L., Koolen, M., Huurdeman, H., & Blom, J. (2017). A process model of time-based media annotation in a scholarly context. In CHIIR 2017: ACM SIGIR Conference on Human Information Interaction and Retrieval. Oslo.