Conference Papers Online

Session 1: Capturing Context


A Methodological Approach to the Collection of Science Records: The "Metaprotocol" as a New Point of View[1]

Didier Devriese

Archivist  - Université libre de Bruxelles

Introduction

When I am confronted with the question ‘How are we to deal with records produced by present-day scientific research?’ I experience the anxiety common to all archivists, even when the question is limited to context-related issues. The issues that arise can be summarised as follows: how am I to select; how am I to classify, structure and describe; how am I to justify my choices and my errors? Of course we have, with time, developed effective methods,[2] and we keep reflecting on the evolution of our procedures, but there is an area of practice beyond theory (or should I say prior to theory) where the practitioner knows, almost intuitively, what should or should not be done. We also know that we continually commit errors, because we cannot refer to one single system of thought or to one single method that covers all the needs and demands - sometimes contradictory - that we are called upon to answer.

One may well wonder what role moral questions like these can play at a conference which tries to solve practical, technical problems. Of course, we try to develop systems that aim at perfection. But does that question have a technical answer?

We try to deal with the difficulties that arise by means of technical arguments, knowing full well that the perfect system does not exist - that whatever we do, we will always commit errors, like destroying data that should have survived, or failing to locate a key document.

Experiences like these suggest that we need moral as well as technical answers, that we need to justify our actual choices rather than strive for a perfection which is out of our reach. The concept of metaprotocol[3] that I would like to discuss in my conclusion will seek to reconcile the technical and the moral perspectives. 

My attempt today will seek to bridge the gap between theory and practice. Much of what I am going to say may sound familiar to you, because it corresponds to your natural experience in your professional practice. My contribution, then, will be an attempt: 

  • to formalise a number of empirical notions and bring them together in a theoretical system which may, in due course, come to constitute a convenient frame of reference;
  • to justify technical options and positions from a moral as well as a strategic perspective. This is a sine qua non underlying successful performance.

1. On the need to situate records within a context and on the question of ‘loss of meaning’.

Recent research in epistemology has led us to view science from a radically different perspective. Given that sociologists of science are seeking to redefine the role (status) of the document in the laboratory, and that the notion of document itself has undergone a considerable evolution, one can easily understand the archivist’s interest in the issue. These new insights inevitably lead archivists to reconsider the collection and processing of archive records resulting from contemporary scientific research and research management. Numerous debates in the sociology of science have recently addressed the question of the construction of scientific knowledge. The archivist will translate this question into a practical one: how can we deal with records as meaningful data if we no longer understand the activity which gave rise to them? And since we realise that knowledge of the context is indispensable, how can we satisfy this demand?[4]

1.1 The link between ‘classifying framework’ and context.

The idea that a description of the context is an indispensable prerequisite to the keeping of archives seems so self-evident that it has actually become a ‘golden rule’ in the trade. Yet the principle has received remarkably little regard in the archivist’s approach to the domain of science data. One reason may be that the notion of context itself has undergone a considerable evolution. But if the golden rules - context, content and structure - remain mere standards of reference, we are not paying them the respect they rightfully deserve. 

A comment is necessary here on the links between the notions of organisation diagram, classifying framework and context. The question is still an open one, and has in turn given rise to other questions such as the issue of the substitution of the concept of ‘series’ for the earlier ‘respect of data collections’.[5] The issues raised by classification allow us to appreciate that ‘context description’ is not as easy a task as may appear at first sight: for all our archival theory rests on the basic assumption that structures are relatively stable phenomena - a conception which links the notion of records to that of evidence,[6] and which within the domain of the history of ideas has clearly conditioned the notion of ‘evidential value’: a document ‘proves’ something, and provides ‘evidence’ of a given function or activity.

This conception is rooted in the idea of a discrete, clearly identifiable producer of records. In those cases, the classifying frame is modelled on the producer’s organisation chart. It is the producer who identifies the records produced, and it is the records which provide evidence of the producer’s activity. 

This notion, which was new and interesting at the time, and which is still in use, rests on a principle which is increasingly being called into question today - none of the terms in the logical sequence can be described with sufficient accuracy, and while the terms themselves are adequate, the content assigned to them is becoming increasingly fuzzy. These reservations apply with particular relevance to work on science records: how can one draw an ‘accurate’ organisation chart of a laboratory, how can one delineate the context within which records are produced, i.e. a scientific activity which is fuzzy by nature; how can one identify with precision records which are subject to perpetual change, whether in one laboratory or spread over several laboratories, how can one know which records to retain as evidence . . . and how can one know what the evidence is of?[7]  

As long as archivists kept gathering institutional data, the question seemed simple: the organisation chart could be reconstructed on the basis of the data/records collected and careful observation of the institution concerned, and thus an ad hoc ‘classifying framework’ could be set up. In general terms, the collection or ‘gathering’ of records (even from the viewpoint of the record keeping system, which condemns data collection as a process extraneous to the activity recorded) is always simpler to define if it is based on an a priori structural pattern than if it takes into account the actual activity taking place before the observer’s eyes. While it must be granted that a ‘record keeping system’ makes better allowance for the constant mobility of reality than an extraneous collecting system ever could, the fact remains that we must, by some device or other, draw an organisation chart in order to develop a classifying framework or, conversely, design a classifying framework which implicitly presupposes an organisational structure. This framework, no matter how adaptable and accurate with regard to the institution, activity or function that it seeks to render, is bound to cast reality in a more rigid pattern than the institution, activity or function has in real life. But this is an unavoidable difficulty inherent in the system itself. An excellent example of this problem is offered by the ‘functional’ model proposed by Helen Samuels in Varsity Letters.[8] The so-called functions described by Helen Samuels aim to delineate with greater accuracy the actual activity of the record producers rather than the activity as it appears in the institution’s organisation chart. With this aim in view, each record or document is related to a discrete productive ‘function’ rather than to an activity described by the institution. This allows, notably, a serial classification; the drawback is that it also permits a thematic arrangement.
It is hard to give an accurate delineation of the ‘functions’ involved. There is too much room for personal interpretation and, as a result, the description is as inaccurate as a classical, chart-based one.

At first sight, the only differences we have to deal with are records of a different nature, which should not raise many problems other than questions of conservation. I believe we may follow Terry Cook[9] when he claims that the appearance of the electronic support is, after all, not very different from the appearance of photography or from the substitution of wood-pulp paper for rag paper. In both cases, the change of support posed formidable conservation problems; to these are now added problems of readability. One could claim, however, that this is a merely technical issue, albeit a complex one, which could be solved by the availability of substantial human and financial resources and/or by increasing the capacity of the material itself.

A specific example is the appearance of texts saved as ‘picture memory’ on CD-ROM, a solution made possible by the exponential growth in the storage capacity of the supports and which, thanks to the improving capabilities of visual display equipment, reduces reading difficulty.

Another good example comes from observing a laboratory’s office computing equipment. The average hard disk capacity in a department whose records I am gathering has jumped within a five-year period from 300 MB to 10 GB per PC, and the machines are equipped with graphic viewers which allow them to read practically any present-day text, picture or other graphic document. Personal records are increasingly stored on hard disks, a practice which entails massive data destruction or random copying whenever the equipment is replaced.[10]

Many of us here today are aware of these issues. Gavan McCarthy has demonstrated the need for the archivist to master a number of computer techniques in order to successfully implement electronic record keeping systems.[11] His argument took into account the present-day changes in the nature of record production instruments to which I referred above.

But if one agrees that changes in the nature of records result from the local production context, and that these changes are not only technical ones, then it follows that: 

  • the question of the description of production contexts cannot be resolved by exclusive recourse to technical means; and that
  • the issues of the nature of documents and of their context of production cannot be artificially kept apart; because
  • the context is always partly dictated (given, conditioned) by the documents themselves and by the link between them, i.e. their structure.[12]

1.2 The indissociable link between context, content and structure.

The question of the context, then, cannot be separated from the two other notions, i.e. content and structure; and I would like to highlight the close link prevailing between these three concepts. I feel that one of the errors committed in recent years, notably in the establishment of science records, has been to separate the issues of context, content and structure in practice, if not in theory. The change in the nature of records is bound to raise the related issue of the change in the construction of knowledge itself, particularly in the laboratories. The changes in the nature of documents are as dependent on changes in paradigms and methods as on technical upheavals. The issue is not simply a change in support (from paper towards electronic processing, as is the case in, say, administration) but a phenomenon that actually produces a different type, not just a different shape, of records, which result from but also help to shape the research processes themselves.

Several among you have already highlighted changes in scientific practice, more specifically in exact sciences, in instances like the difficulty of keeping ‘common electronic notebooks’ where the ‘chat’ increasingly appears as one of the loci of the laboratory’s memory. The point, albeit crucial, does not exhaust the issue.[13] 

The most striking examples may be found in computer models (hydrophysics, aerodynamics, astronomy etc.) where the simulation patterns are more important than the results of the calculations; but the simulations can be understood only if the results are made visible. The ‘model’ may exist in a printable mathematical form, but it supposes the subsequent construction of a ‘calculator’, linked to the object of the study (for instance, the air/water variable in aerodynamics). It seems rather absurd, however, to record in a two-dimensional representation the results of a calculation designed to produce three-dimensional images!

If we are to maintain records of this type, it is obvious that the instrument built to produce the visible result (and its operating instructions) must be kept as well as the results themselves. Although the question has been raised, the answer remains to be found. But it is clear that the same reasoning is applicable to a great variety of disciplines. 

We have already underscored the need to penetrate into the world of the laboratory itself in order to gain insight into the new types of documents, and to develop appropriate approaches in the archivist’s procedures. This, in turn, raises the issue of the locus where scientific records are to be collected: how can archivists be sure to cover the full range of present day scientific production? Even if they agree to radically restrict their scope, they must first gain awareness of the scientific phenomenon as a whole lest entire sectors of its documentary production should be overlooked. From this viewpoint, the criteria commonly applied so far appear as all too restrictive; hence the need to broaden the range of disciplines involved, and to extend coverage to commonly neglected loci of production like, for instance, research projects carried out by groups of students and researchers during laboratory exercise sessions. 

Another question to be raised is that of the type of data to be collected. With regard to this issue, two tendencies prevail: on the one hand, a propensity to collect only those data felt to be representative of the full scope of the scientific process characterising laboratory research; and on the other, a desire to collect archives from as wide a variety of backgrounds as possible. These two approaches should be made to complement one another for two reasons: 

  • it is the only way to fairly account for the present day diversity in scientific procedures; and
  • it is a means to endow the archive collection with methodological cogency without laying it open to the indictment of pseudo-objectivity. Users of documents must be shown the multifariousness of possible viewpoints; for this reason, archivists must explicate the procedure chosen and thus clarify the manner in which the whole of their documentary corpus is constituted.
On the other hand, it is clear that the limits of exhaustive collection are rapidly reached. At our latest Liège seminar, I resorted to a mildly provocative title, "Is there a pilot on the plane?", to underscore what I felt to be the limits of collecting science archives. Let me restate the main points here.[14]

In order to obtain as faithful an image as possible of the context within which documents have been produced, it would actually be necessary to gather all the documents produced by a given laboratory. Let us at least temporarily adopt this paradox, and imagine we keep all the records, all the instruments, with their operation manuals. But in order to keep a trace of their operation, we must also record the explanations of the actual users. 

I was profoundly struck by a NASA survey, which pointed out their inability to operate the first lunar rockets, even though all the records, including the equipment, had been kept. What had been lost in the process was the ‘live’ memory of the ‘ropes’, the hands-on knowledge, as very few of the technicians who participated in the original project are still active. In France, a procedure has been set up to save this live memory, for certain manufacturing processes are being forgotten and lost. It is therefore necessary to record conversations - what sociologists call ‘the creative moment’; and we need pictures of the laboratories, as well as the tangible results of experiments: superconductors and fungi (fungi? in archives?), holograms (how does one keep a hologram?), clothing and furniture. It is not hard to see where this would lead us in the long run, to say nothing of the problems of storage and maintenance raised by the conservation of all those artefacts.

Beyond the anecdotal aspects, there are three more serious questions to be dealt with: 

  • We know from the physicists that the presence of a measuring or recording instrument may affect the results of an experiment. By the same token, our very presence may falsify the reality we are there to record [working with hidden cameras would, of course, raise other issues in terms of a moral and honour code]. Within the perspective of a permanent ‘record keeping system’, this raises the question of whether we can implement it without warping the ‘natural’ record producing activity. How, for example, could the record-keeping exclude the observer’s bias and make allowance only for the scientists’ viewpoints?
  • The major drawback remains the system’s ambition to view the set of data collected as absolutely exhaustive, that is, as a source capable of answering all possible questions. But of course we know that this is not so, and that therefore we are ‘misleading’ the users, that is, giving a false or incomplete image of the production context; and this in turn is likely to jeopardise the honest critical approach which should preside over later description and analysis.
  • If we take into account not only the needs of scientists and research staff themselves, but also the requirements of historians and sociologists of science (and presuming for the purpose that we may find scientists who would lend themselves to a project of this kind), the laboratory likely to collaborate effectively is bound to be a ‘good’ one, and will, understandably, seek to use the fact to promote its own image. The result will be a biased, stereotyped image of science, which excludes other possible approaches. And this raises the question: which kind of science are we supposed to describe? 
These questions demonstrate a basic point: entering into the laboratory encourages us not only to seek records different in nature from what we are accustomed to, but also (and perhaps above all) to understand the context within which these ‘new records’ are produced. 

I hope I have demonstrated that the archivist’s entry into the world of the laboratory does not solve all the problems; on the contrary, it raises new ones. 

2. A Technical and Moral Solution: The Simultaneous Description of Structure and Context.

The metaprotocol aims: 
  • to describe the logical interrelation linking the laboratory’s various activities - in other words, not the protocol of every single experiment, but the relationship between individual experiments, as well as with other non technical operations (e.g. administration) performed by laboratory staff.   
  • to add to each descriptive step the list of records produced, the type and nature of these records, as well as the ‘metadata’ associated with them.   
  • within this perspective, to highlight what has been produced rather than what has been kept. This approach allows us to relate documents to one another and to understand them. This link constitutes an element of the context as it will eventually be reconstructed. The structure (a key concept here) is not different from the context. 
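
The three aims above can be pictured as a small data structure. What follows is only a minimal sketch in Python, not a tool used in the work described here: every name in it (Record, Activity, Metaprotocol and all their fields) is hypothetical, and the two sample activities are invented for illustration. It assumes that activities, the records they produce, and the links between activities are described together, and that ‘produced’ is tracked independently of ‘kept’.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record produced during an activity (all fields are hypothetical)."""
    title: str
    record_type: str                 # e.g. "form", "program", "image set"
    support: str                     # e.g. "paper", "electronic"
    metadata: dict = field(default_factory=dict)
    kept: bool = False               # "produced" is tracked separately from "kept"


@dataclass
class Activity:
    """One descriptive step: an experiment or a non-technical operation."""
    name: str
    kind: str                        # e.g. "experiment", "administration"
    records: list = field(default_factory=list)


@dataclass
class Metaprotocol:
    """Activities plus the relationships between them."""
    activities: dict = field(default_factory=dict)
    links: list = field(default_factory=list)   # (source, target, relation)

    def add_activity(self, activity):
        self.activities[activity.name] = activity

    def link(self, source, target, relation):
        # Refuse links to activities that have not been described.
        for name in (source, target):
            if name not in self.activities:
                raise KeyError(name)
        self.links.append((source, target, relation))

    def produced(self):
        """Every record produced, whether or not it was kept."""
        return [r for a in self.activities.values() for r in a.records]

    def context_of(self, name):
        """Activities directly linked to the named one, in either direction."""
        related = []
        for source, target, relation in self.links:
            if source == name:
                related.append((target, relation))
            elif target == name:
                related.append((source, relation))
        return related


# Invented sample data, for illustration only.
mp = Metaprotocol()
mp.add_activity(Activity("grant application", "administration", [
    Record("funding request", "form", "paper", kept=True),
]))
mp.add_activity(Activity("flow simulation", "experiment", [
    Record("model source", "program", "electronic"),
    Record("rendered images", "image set", "electronic",
           {"dimensions": 3}, kept=True),
]))
mp.link("grant application", "flow simulation", "funds")

print(len(mp.produced()), "records produced,",
      sum(r.kept for r in mp.produced()), "kept")
print(mp.context_of("flow simulation"))
```

In this sketch the link list plays the part of the structure: asking for the context of one activity returns the activities related to it, which is one reading of the claim that structure and context are not different things.
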
The drawbacks of the metaprotocol are: 
  • The difficulty of implementing a record keeping system in the case of laboratories. It is hard to formalise a system operating in a fuzzy environment, due to changes in the laboratory setup, in document type and nature, and to the existence of non-recordable data like individual recall of know-how and hands-on experience.  
  • The need to work in the company of the scientists: this allows the identification of record ‘sources’, but raises the problem of the observer’s neutrality. The observer’s choice cannot be limited to the scientists’ choices. The metaprotocol allows archivists to steer clear of a number of obstacles, but the description is bound to remain conditioned by the scientist’s perspective.   
  • It is useful to remember that knowledge is constantly evolving, and that the recording process must, therefore, be not a one-off but a permanent activity. This point stresses the close link with the records continuum process. The principle of coupling the record keeping activity with the record production itself is fundamental, but moves well beyond simple technical considerations. The important thing is to ‘understand’ the activity.[15]

Conclusion 

The question, ‘How are we to study the scientific phenomenon if we are not able to record the documentary traces of its activity?’ may be transformed into another question, namely, ‘How are we to collect traces of an activity whose production process we are not familiar with?’

This question has taken on an almost ritual value for me, as I feel it is likely to condition our future successes and failures. It supposes that we must be able, not only to describe the context within which future records are produced, but also to recognise the influences acting upon our comprehension and selection processes. The question provides fundamental underpinnings for the implementation of a record keeping system. 

As pointed out in my introduction, recent debates in the sociology of science have often pondered the question of the construction of scientific knowledge. The question apparently has only limited relevance to our practice as archivists. Yet today we increasingly realise that this construction of knowledge conditions the records themselves - the content - as well as their producers and production - the context - and the links between them - the structure. The description of the context informs us of the changes undergone by the records themselves, both in terms of nature (new supports, like the electronic format) and in terms of type, context and structure and inter-record relationships. Links between records constitute the most important elements in the context . . . and thus the reasoning may go on in circles, highlighting the interrelatedness of all aspects.

In the particular case of science archives, a detailed, accurate description of the record production context can be performed only if the changes undergone by the activity are taken into account - changes in its nature, its organisation, its knowledge production processes and means of communication. 

This is the main aim of the metaprotocol: to inform us about the workings of the laboratory in order to improve the description of the context and the structure, and in due course condition the selection, description and comprehension of the ‘content’.

An instrument like this, definitely more convenient than the conservation of all records and artefacts, may aid us in solving the technical difficulties - selection and description - as well as the moral problems - we justify our record selection, we provide information on what has existed and thus contribute to an improved understanding of the context, and hence, to a better understanding of the records in our care. To be continued . . . 
 



[1] This text is the paper presented at the conference "Working with Knowledge", Australian Science Archives Project, Canberra, 1998. First of all, I would like to extend my thanks to ASAP for inviting me, and to my colleagues and friends who have taken the time to comment on my "thoughts": Anne Barrett (Imperial College, London), Helen Morgan (ASAP), Denise Ogilvie (Institut Pasteur, Paris), Julia Sheppard (Wellcome Institute for History of Medicine, London), Odile Welfelé (Archives nationales-CNRS, Paris), John Krige (CRHST, Paris), Russell McCaskie (CSIRO), Gavan McCarthy (ASAP), Philip Kent (CSIRO) and, "as usual", my friend Frank Scheelings (VUB). As this text has not been revised for publication, the reader will, I hope, kindly excuse its spoken style. For the same reason, the notes have been kept to a minimum. These few pages are intended as a summary: the various points raised here will be the subject of a later synthesis publication.
[2] On this point see a good history of archival science, P. Delsalle, Une histoire de l'archivistique, PUQ, 1998, and an excellent theoretical synthesis by T. Cook, "Interaction entre théorie et pratique archivistiques depuis la parution du manuel néerlandais de 1898" in ICA - XIIIe Congrès International des Archives, 3eme session plénière, Rapport principal, Beijing, 1996; the long version of this article is well worth consulting.
[3] I have taken the liberty of coining this term as follows: laboratory experiments (and this holds for any scientific procedure in general, even if the names differ) are formalised by an "experimental protocol"; this aims to describe the full set of operations that led to a result X, so that it can be reproduced and/or modified; laboratory notebooks are notably intended for drawing up such protocol(s), and that is where they are to be found. The protocol defines, among other things, the technical constraints of each particular experiment. In this sense the "metaprotocol" is meant as a formal descriptor of the full set of operations that led to the drawing up of the particular protocol or protocols.
[4] For a summary see D. Devriese, "Les archives de la recherche en milieu académique: réflexions sur les lieux de production et la philosophie de conservation", Janus, 1995.2, pp. 20-28; more generally, see the part of that issue devoted to science archives, but also, for example, the very interesting article by Richard J. Cox, "Archival Documentation Strategy: A Brief Intellectual History 1984-1994", Janus, 1995.2, pp. 76-93. This subject was addressed at the ICA/SUV – Stama and Commission internationale de Logique, Philosophie et Histoire des sciences seminar (Liège, 28-29 May 1996).
[5] Cf. Mule’s synthesis entitled "Le principe de provenance" in ICA - XIIIe Congrès International des Archives, Rapport principal, Beijing, 1996. In particular, this problem is currently being examined, as far as universities are concerned, within the ICA/SUV working groups; see http://www.usyd.edu.au/su/archives/ica_suv/
[6] T. Cook, ibid., which discusses Jenkinson, p. 19.
[7] On this point see the ICA/SUV – Stama and Commission internationale de Logique, Philosophie et Histoire des sciences seminar (Liège, 28-29 May 1996).
[8] Helen Samuels, Varsity Letters: Documenting Modern Colleges and Universities, Metuchen, New Jersey, 1992.
[9] T. Cook, "It's 10 O'Clock: Do you Know Where Your Data Are?" in Technology Review On-Line, http://www.techreview.com/articles/dec94/cook.html.
[10] See for example on this point the proceedings of the Forum "Données lisibles par machine", DLM, Brussels, 18-20 December 1996, in INSAR-Courrier Européen des Archives, II, 1997. The proceedings of the conference "The Challenge of Electronic Records for National Archives", Public Record Office, London, 27-31 July 1998, are forthcoming.
[11] G. McCarthy, "The Archivist as Computer Wizard", ICA/SUV – Stama and Commission internationale de Logique, Philosophie et Histoire des sciences seminar (Liège, 28-29 May 1996).
[12] Implicitly, this converges with the positions of Luciana Duranti and her team (see http://www.slais.ubc.ca/users/duranti/intro.htm); applying historical criticism to documents, as is commonly done in traditional archival science, should necessarily lead to this conclusion; this is, moreover, why the notion of ‘metadata’ seems very familiar to European archivists trained as historians. See in particular Peter Horsman’s contribution to the same conference and its bibliography. This subject was discussed at length at the ICA/SUV Meeting in Stockholm, Sweden, 3-4 September, 1998, "The Impact of Information Technology on Academic Archives". See Heather McNeill’s paper, to be published in the seminar proceedings. In general, and as a first approach: ICA STUDIES, Guide for Managing Electronic Records from an Archival Perspective, Studies 8, February 1997; ICA STUDIES, Electronic Records Programs. Report To the 1994/1995 Survey, December 1996; ICA STUDIES, A. Erlandson, Electronic Records Management, Studies 10, April 1997.
[13] An eloquent example of our misunderstanding is our propensity to compare the growth of electronic data in scientific institutions to that in administration (e.g. large-scale databases). The analogy is invalid, since the administration rarely gives rise to record artefacts that are radically innovative, whether in form or in substance.  
[14] See footnote 4.  
[15] On this point see in particular the papers by D. Ogilvie, O. Welfelé, J. Warnow-Blewett and J. Krige at the ICA/SUV – Stama and Commission internationale de Logique, Philosophie et Histoire des sciences seminar (Liège, 28-29 May 1996).
 




Published by: Australian Science Archives Project on ASAPWeb
Comments or questions to: ASAPWeb (asapweb@asap.unimelb.edu.au)
Prepared by: Helen Morgan
Graphics by Lisa Cianci
Date modified: 7 October 1999