Screening Desmet: reflections on the MIMEHIST-project and CLARIAH’s Media Suite.

By: Kathleen Lotze (University of Amsterdam)

From January through June 2018 I had the pleasure of working for the CLARIAH research pilot project MIMEHIST: Annotating EYE’s Jean Desmet Collection (2017-2018), led by Christian Olesen Gosvig (University of Amsterdam). The main purposes of this project were to make EYE Filmmuseum’s collection of cinema owner and distributor Jean Desmet available via the CLARIAH infrastructure (specifically the CLARIAH Media Suite), and to use and develop the annotation functionalities of the Media Suite to analyse this collection. The Desmet collection consists of approximately 950 films produced between 1907 and 1916, a business archive containing around 127,000 documents, some 1,050 posters and around 1,500 photos. The business archive was digitized in 2011 as part of the Metamorfoze project, a national digitization project for the conservation of paper-based heritage items. The collection is unique because of its large number of rare films from the transitional years of silent cinema, and because of the richness of its business archive, which holds extensive documentation of early film exhibition and distribution practices in the 1910s. Because of my expertise in local film exhibition, distribution and consumption, I was asked to contribute to two of MIMEHIST’s aims: researching Desmet’s exhibition and distribution practices during World War I, and testing the Media Suite’s annotation functionalities. During these six months I worked closely with Christian as well as with postdoctoral researcher Liliana Melgar (University of Amsterdam/Institute of Sound and Vision) and programmer Ivan Kisjes (University of Amsterdam).

Image 1: Documents from Desmet’s business archive. Source: Blog entry for EYE by Piet Dirkx, “Desmet op UNESCO werelderfgoedlijst”.

I was already familiar with Desmet and his importance for Dutch film exhibition and distribution, but in order to find my way through the enormous collections, I started by rereading Ivo Blom’s dissertation Pionierswerk. Jean Desmet en de vroege Nederlandse filmhandel en bioscoopexploitatie (1907-1916) (University of Amsterdam, 2000) and his monograph based on this dissertation, Jean Desmet and the Early Dutch Film Trade (Amsterdam University Press, 2003), both of which draw extensively on the documents contained in Desmet’s business archive. Other valuable documents that helped me navigate the seemingly endless digital pile of images were a detailed inventory made by EYE’s collection specialist for personal and business archives, Piet Dirkx, as well as earlier reports and papers written by Christian. Amongst the latter were reports for the NWO KIEM project Data-driven Film History: A Demonstrator of EYE’s Jean Desmet Collection (2014-2015), which aimed at understanding the usefulness of the data available on the Desmet Collection for research on film distribution and exhibition.

One of the questions triggered by Blom’s analysis concerns Desmet’s distribution and exhibition practices during World War I, particularly to what extent the selection of films he distributed and/or exhibited was based on personal motivations or on choices dictated by the (local) market. Previous scholarship has, for example, discussed Desmet’s distribution of newsreels and a demand for war-themed films among some of Desmet’s customers during World War I, but left some questions unanswered. The topic has not yet been studied systematically, for example in a way that elucidates the films’ distribution history and reception as expressed in Desmet’s correspondence with his customers. Linking the films in the Desmet collection to information on them in the business archive could contribute to a better understanding of the distribution histories of war-themed titles and ultimately also of film distribution in the Netherlands during the Great War.

The documents contained in Desmet’s business archive have thus offered a rich source for the socio-economic history of Dutch film distribution and exhibition in cinema’s earliest years. However, previous research (see also, for example, Karel Dibbets and Rixt Jonkman, who both studied parts of the archive in order to manually transcribe and organize the collected data in databases) also made evident that the archive is too large and diverse to be organized and transcribed manually in its entirety. A particular challenge is that the collection contains many different kinds of documents, ranging from personal letters, business letters and records of film rentals to postcards, newspaper clippings, telegrams, scraps of paper with notes, photographs etc. In addition, some documents are printed or typewritten, others handwritten.

When I started working for MIMEHIST, the film collection as well as the poster collection had already been integrated into the Media Suite, which was then available to researchers in its second (beta) version. As a complete newcomer to the Media Suite, I was well placed to explore and test its functionalities. The first logical step was to become familiar with the Media Suite, its tools and architecture 1. One of the challenges in this respect was that the terminology used there was often familiar to developers, programmers and other ICT experts, and perhaps to those who had been involved in building the Media Suite from the start, but not to a rather traditional film historian like me. While abbreviations such as OCR (“optical character recognition”, the process of converting images of text into machine-readable text) and API (“application programming interface”, a set of subroutine definitions, protocols and tools for building application software, which makes it easier to develop a computer program by providing building blocks that the programmer then puts together) had already trickled into my professional vocabulary, terms such as “elastic search” (a particularly powerful search engine) and “recipe” (an interface in which tools combine several functionalities), or certain tools and platforms such as GTAA, DIVE+ and CKAN, were new to me. In order to find my way around the Media Suite, but also to make it easier for future users who, like me, are not digital natives, I started writing a glossary of frequently used terms and abbreviations 2. For the same purpose, I also kept a log book of my test sessions, with practical examples for searching and annotating archival material contained in the Media Suite, which could be used as tutorials for future users. The log book also helped Liliana and the CLARIAH Media Studies team to identify flaws and bugs, and thus to further improve different tools and functionalities in the Media Suite.

Image 2: Screenshot of the user interface of CLARIAH’s Media Suite (version 3).

As the integration of the Desmet business archive into the Media Suite faced a number of unforeseen obstacles and was only finalized just before MIMEHIST ended 3, my main objectives shifted. Rather than doing actual research by working with the Desmet material in the Media Suite, the objectives were now, first, to assist in making the digitized version of the business archive searchable and, second, to test the research facilities in the Media Suite environment by improvising and simulating certain research steps (see below).

With regard to the first objective, I mainly assisted Ivan in making the documents from the Desmet business archive, which Ivan had meanwhile OCRed completely, searchable and accessible in the Media Suite. One of the requirements was to make the collection searchable in its entirety while also keeping the collection’s original structure, as it is now kept by EYE in its analogue form and as it was indexed by Dirkx. In accordance with Dirkx’s inventory, the collection is subdivided into six thematic sections (such as private life, fairground period etc.), altogether comprising about a thousand folders, each of which contains several envelopes with a varying number of documents. While inspection of the many documents was previously only possible folder by folder, the advantage of making the archive digitally accessible and searchable is that documents can now be searched across all folders. By opening up the archive to such cross-folder use, documents from different periods (for example, his travelling cinema period, World War I, the interwar period) or of different types (shipping orders, film catalogues, personal correspondence etc.) can be combined and examined, thereby facilitating interdisciplinary and comparative research.
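To make the idea of searching across folders while keeping the original arrangement more concrete, here is a minimal Python sketch. It is not how the Media Suite or Ivan’s tooling works internally; the record fields, the sample documents and the section names "film distribution" and "cinema exhibition" are invented for illustration. Each document simply carries its place in the hierarchy (section, folder, envelope) along with its OCR text, so a search can ignore folder boundaries and the results can still be regrouped by folder afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    section: str   # thematic section of the inventory
    folder: int    # folder number within the archive
    envelope: int  # envelope within the folder
    doc_type: str  # e.g. "shipping order", "telegram"
    year: int
    text: str      # OCR output for the scan


# Invented sample records, for illustration only.
ARCHIVE = [
    Document("fairground period", 12, 1, "personal correspondence", 1908,
             "Kermis te Nijmegen, opbrengst van de tent"),
    Document("film distribution", 88, 3, "shipping order", 1915,
             "Verzonden: programma met oorlogsfilm en journaal"),
    Document("film distribution", 91, 2, "telegram", 1916,
             "Programma niet ontvangen, gaarne spoed"),
    Document("cinema exhibition", 140, 1, "film catalogue", 1915,
             "Nieuw programma: drama, komedie, journaal"),
]


def search(archive, keyword, doc_type=None, years=None):
    """Return documents whose OCR text contains keyword, across all folders."""
    needle = keyword.lower()
    hits = []
    for doc in archive:
        if needle not in doc.text.lower():
            continue
        if doc_type is not None and doc.doc_type != doc_type:
            continue
        if years is not None and not (years[0] <= doc.year <= years[1]):
            continue
        hits.append(doc)
    return hits


def by_folder(hits):
    """Group hits by (section, folder) so the original arrangement stays visible."""
    grouped = {}
    for doc in hits:
        grouped.setdefault((doc.section, doc.folder), []).append(doc)
    return grouped


hits = search(ARCHIVE, "programma", years=(1914, 1918))
for (section, folder), docs in by_folder(hits).items():
    print(section, folder, [d.doc_type for d in docs])
```

The same `search` call can be narrowed by document type or period, which is the kind of combination across periods and types that folder-by-folder inspection did not allow.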

To improve the searchability of the archive, Ivan and I tested, for example, possibilities of clustering documents according to visual features (such as company logos or document layout), in order to be able to roughly distinguish between types of documents (shipping orders, contracts, catalogue entries, telegrams, personal correspondence etc.). We also tested possibilities of extracting data from shipping orders and contracts for the automatic creation of databases on film distribution and exhibition. This, however, turned out not to be feasible at this stage, as most of the corresponding documents are handwritten and the quality of the OCR output does not yet allow for reliable data extraction. Although these supporting activities were not originally part of my MIMEHIST work package, the collaboration with Ivan made me aware of the steps and procedures involved in making historical collections digitally accessible and searchable.
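As a rough illustration of what clustering by document layout can mean, the following Python sketch reduces each page to a tiny "layout signature" (the share of inked pixels in each quadrant) and groups pages whose signatures lie close together. This is a toy under stated assumptions, not the approach we tested: the 4x4 binary "scans", the function names and the distance threshold are all invented, and real work would rely on proper image features.

```python
import math


def layout_signature(page, grid=2):
    """Mean ink density per cell of a grid x grid partition of the page.

    `page` is a list of rows of 0/1 pixels, where 1 means ink.
    """
    rows, cols = len(page), len(page[0])
    signature = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            cell = [page[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            signature.append(sum(cell) / len(cell))
    return signature


def cluster(signatures, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed signature is
    within `threshold` (Euclidean distance), otherwise start a new cluster."""
    seeds, labels = [], []
    for sig in signatures:
        for i, seed in enumerate(seeds):
            if math.dist(sig, seed) <= threshold:
                labels.append(i)
                break
        else:
            seeds.append(sig)
            labels.append(len(seeds) - 1)
    return labels


# Invented 4x4 "scans": two pages with a dense letterhead at the top,
# one page with ink spread evenly (standing in for a handwritten note).
letterhead_a = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
letterhead_b = [[1, 1, 1, 1], [1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
handwritten = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]

pages = [letterhead_a, letterhead_b, handwritten]
labels = cluster([layout_signature(p) for p in pages])
print(labels)
```

With these sample pages the two letterhead pages end up in one group and the evenly inked page in another, which is the kind of coarse separation by document type described above.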

With regard to the second objective, the testing of annotation and research facilities in the Media Suite, I decided to start with a case study, for which I needed to go through the usual research steps that an imagined Media Suite user would take. These steps included formulating a research question, building a corpus by using the search and explore function, checking the quality of the metadata and annotating the documents, all of which has been made possible in the Media Suite environment. I chose to investigate the distribution of one particular film: The Blacksmith’s Love (1911, Francis Boggs, USA), a short drama set during the American Civil War in the 1860s. Although the film features successful actors, such as Tom Santschi, Herbert Rawlinson and Eugenie Besserer, hardly anything is known about the distribution and reception of this film in the Netherlands, except for its Dutch title Vreugde en smart, under which it was distributed in the Netherlands and which was provided in the inventory of the Desmet film collection.

As searching the business archive in the Media Suite was not possible yet, I worked with a provisional website that Ivan had built to make searching the digitized and OCRed business archive possible. Based on keyword searches in the digitized Desmet paper collection, in combination with manual detection of additional documents in their immediate environments (same folders), I gained some insights into the circulation of The Blacksmith’s Love. The film lasted approximately 15 minutes and as such it was part of a longer film program which Desmet distributed to his clients. These programs usually consisted of five to seven films and included newsreels, short comedies and short documentaries (often in the form of travelogues). The Blacksmith’s Love was usually screened after the main feature, as one of the last items in the program. Remarkably, although the film was always offered by Desmet as part of a program, it was never offered with the same films twice; in other words, each program was assembled differently. In addition, this preliminary search across the different sections and folders of the business archive also revealed that, although the film dates from the pre-war period, it was distributed by Desmet mostly during and even after the war and exhibited in different venues across the Netherlands (for a visualization of the preliminary results see this timeline). A more thorough inspection of the business archive in the Media Suite, perhaps with enhanced OCR, will most likely yield many more documents attesting to additional screenings of the film.
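The search routine described here (keyword hits, followed by a look at the other documents in the same folders, ordered in time) can be sketched in a few lines of Python. This is a hedged illustration rather than a description of Ivan’s website: the records are invented, and the use of the standard library’s `difflib` to tolerate OCR misreadings is my own assumption about how such a search could be made more forgiving.

```python
import difflib
import re

# Invented records: (folder, date, OCR text). The third one contains
# typical OCR misreadings of the Dutch title.
DOCS = [
    (301, "1915-03-02", "Programma: Journaal, Vreugde en smart, komedie"),
    (301, "1915-03-09", "Rekening voor filmhuur, week 10"),
    (417, "1919-11-20", "Vreugdc en smarl retour ontvangen"),
    (502, "1912-06-01", "Brief over huur van de zaal"),
]


def tokens(text):
    """Lower-cased words of a text, punctuation removed."""
    return re.findall(r"\w+", text.lower())


def mentions(text, title, cutoff=0.75):
    """True if every word of the title has a close match among the words of
    the text; the cutoff leaves room for single-character OCR errors."""
    words = tokens(text)
    return all(
        difflib.get_close_matches(word, words, n=1, cutoff=cutoff)
        for word in tokens(title)
    )


def hits_with_neighbours(docs, title):
    """Chronological list of (date, folder, is_direct_hit) for every document
    in a folder that contains at least one hit for the title."""
    hit_folders = {folder for folder, _, text in docs if mentions(text, title)}
    return sorted(
        (date, folder, mentions(text, title))
        for folder, date, text in docs
        if folder in hit_folders
    )


timeline = hits_with_neighbours(DOCS, "Vreugde en smart")
for date, folder, is_hit in timeline:
    print(date, folder, "hit" if is_hit else "same folder")
```

The neighbouring documents that are not direct hits still have to be read by hand, as I did; the sketch only narrows down where to look.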

Image 3: Examples from the business archive documenting the film programs that included The Blacksmith’s Love, which Desmet distributed to his clients.

In addition to the insights about the circulation of The Blacksmith’s Love, this case study offers a plethora of new points of departure for investigation, including microhistories that reveal the relationship between Desmet and his clients, changes in the composition of his programs after the war in terms of genres and the number and length of the films, or insights into the newsreels Laatste Bioscoop Wereldberichten, which Desmet allegedly assembled himself from newsreel items produced by, amongst others, the French company Pathé.

The added value of an environment such as the Media Suite is of course that, apart from offering access to the collections and the possibility of searching within and across different collections, (segments of) the archived material can be linked and annotated and subsequently shared with other researchers. In the case of The Blacksmith’s Love, information in the documents about, for example, Desmet or the film’s production and circulation details can be classified using thesauri like GTAA or DBpedia, or linked to databases such as the Internet Movie Database or Cinema Context. Classifications and links thus potentially serve purposes of data enrichment and Linked Data. In addition, the possibility of sharing research notes and results, as provided by the Media Suite, not only resonates with current trends in humanities research practice towards interdisciplinary collaboration, but also strongly encourages it.

Image 4: Screenshot of a test session, displaying the annotation tool in the Media Suite (left) and databases IMDb and Cinema Context (right).

  1. For more information about how the Media Suite has been built, see Ordelman, R., Martínez Ortiz, C., Melgar Estrada, L., Koolen, M., Blom, J., Melder, W., … Noordegraaf, J. (2018). Enabling Scholarly Research for Distributed Audiovisual and Mixed Media Data Sets in a Sustainable Infrastructure. Presented at the Digital Humanities, Mexico: DH2018.

  2. Some of the terms and definitions I wrote in that preliminary glossary will be made publicly available in the Media Suite glossary (under construction).

  3. At the time of writing this post, the Desmet paper collection is fully accessible via the Media Suite, including the mentioned OCR enrichments. See more information at the CKAN collection registry.