EAD Conference

Opening the conference, Ishbel Barnes explained that the Scottish Archive Network aimed to open up the archival heritage of Scotland to the world. In order to do so, it had to take some important decisions regarding descriptive standards and the way in which finding aid information is communicated on the Internet. The archive world is showing considerable interest in Encoded Archival Description (EAD), and the aim of the conference was to help determine whether it should be used in the project.

Daniel Pitti, of the University of Virginia, who originally developed EAD, said that we define ourselves by what we choose to remember and to forget. He gave a vision of universal access, with which researchers could find relevant resources in one place, at any time, from anywhere. He explained the objectives of EAD: to give accurate representation of archival descriptive practices, to support intelligent access and navigation among archival materials, to help guarantee that information would survive changes in software and hardware, and to enable archives to communicate and share information about dispersed materials. EAD, he explained, was an encoding system for archive descriptions. Different parts of the description (for example the reference, the dates) are “tagged” with special characters to indicate what they are. Encoding could be either procedural, such as is used in word processing, and dedicated to a single purpose, or descriptive, allowing more flexibility and relationships. Whereas the HTML (Hypertext Markup Language) used by the present generation of world wide web page displays is largely procedural, SGML (Standard Generalised Markup Language) is descriptive. SGML has been in existence for 10 years and is an accepted ISO standard. Now a new standard, XML (Extensible Markup Language), is appearing and will be the basis for future software for browsing on the web. EAD was developed under SGML and is compatible with XML.
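The idea of descriptive tagging can be illustrated with a small sketch. It was not shown at the conference. The element names (archdesc, did, unitid, unittitle, unitdate) follow EAD conventions, but the fragment is simplified, the reference, title and dates are invented, and the function name describe is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment: the tags say what each part of the
# description is (reference, title, dates), not how it should be displayed.
SAMPLE = """
<archdesc level="fonds">
  <did>
    <unitid>EX/1</unitid>
    <unittitle>Papers of an example family</unittitle>
    <unitdate>1750-1820</unitdate>
  </did>
</archdesc>
"""


def describe(xml_text):
    """Return the tagged parts of a description as a dictionary."""
    root = ET.fromstring(xml_text.strip())
    did = root.find("did")
    return {
        "level": root.get("level"),
        "reference": did.findtext("unitid"),
        "title": did.findtext("unittitle"),
        "dates": did.findtext("unitdate"),
    }


record = describe(SAMPLE)
print(record)
```

Because the tags identify what each part is, the same text could be indexed by reference, sorted by date or displayed in several layouts without being re-keyed.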

There are broadly two types of implementation of EAD. It can be used to do everything: to create, maintain, publish and communicate information in SGML/XML. The Archives of California use it in this way, as do Glasgow University Archives. Alternatively, EAD can be used only to communicate and publish data, which is created and maintained in a relational database. This is the model used by the Swedish National Archives and the Public Record Office in London.
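The second model, where data is held in a relational database and EAD is used only as an output format, might be sketched as follows. This is an illustration under stated assumptions, not the design of any system named in this report. The table name description, its columns and the sample rows are invented, and the output is EAD-like rather than valid EAD (a real finding aid would need further wrapper elements).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per unit of description, linked to its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE description ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER, level TEXT, "
    "reference TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO description VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "fonds", "EX", "Example estate papers"),
        (2, 1, "series", "EX/1", "Correspondence"),
        (3, 2, "item", "EX/1/1", "Letter to a factor"),
    ],
)


def export_ead_like(connection):
    """Build nested EAD-like XML from the rows.

    Assumes a single top-level row and that every parent has a lower id
    than its children, so parents are always created first.
    """
    rows = connection.execute(
        "SELECT id, parent_id, level, reference, title "
        "FROM description ORDER BY id"
    ).fetchall()
    elements = {}
    root = None
    for row_id, parent_id, level, reference, title in rows:
        if parent_id is None:
            element = ET.Element("archdesc", level=level)
            root = element
        else:
            element = ET.SubElement(elements[parent_id], "c", level=level)
        did = ET.SubElement(element, "did")
        ET.SubElement(did, "unitid").text = reference
        ET.SubElement(did, "unittitle").text = title
        elements[row_id] = element
    return ET.tostring(root, encoding="unicode")


xml_output = export_ead_like(conn)
print(xml_output)
```

In this arrangement the database remains the working copy, and the tagged output can be regenerated whenever the data changes.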

Daniel concluded by advising the Scottish Archive Network to define its objectives, based on professional principles and responsibilities, and to define and articulate the evaluation criteria it would use. Archivists, he said, had a responsibility to ensure that content can be communicated and used now and in the future.

Peter Horsman, of the Netherlands Archive School in Amsterdam, said he was not going to attack EAD, but would instead ask where we are going. He took the audience on a high-speed tour of archival theory from the pre-provenance period, when there was one finding aid per repository, to the post-provenance period, when there is one finding aid per fonds. At the heart of a finding aid system is usually an inventory which contains an introduction with an administrative and custodial history, the original order of the fonds and any arrangement decisions, a classification scheme, a series of multi-level descriptions of the fonds, and an index. Like Bunyan’s Christian, Peter saw the archivist treading a narrow and sometimes difficult path. To the left, he said, lies the temptation of technology, and to the right the attraction of standards. Looking at EAD as a standard, he observed that it was based on bibliographic description techniques, and questioned how far it fitted archival methodologies and how far it would really become a standard. Differences in national practice, Peter predicted, would in a short while produce a US-EAD, a UK-EAD and a UNI-EAD, just as had developed in the library world with the MARC system. He followed this by saying that any low level standard was bound to fail. Instead of looking at data structure standards, data content standards and data value standards, we should be looking at Information System standards. The way forward was to re-design archival description, harnessing the power of the technology and treating descriptions as meta-information.

Peter proposed a simple model with three elements: input applications, which are based on archival methods, a description base, which combines archival and IT standards, and stores descriptions and contextual information, and output applications, which are user oriented and provide a variety of products, including paper copies, web pages, and on-line searching. He also made a plea for ergonomics to be used, so that, for example, archival descriptions were kept manageable and would fit on a single screen. Peter concluded by calling for vision and creativity and professional guts in the development of information systems.

In discussion following his paper, Peter agreed that making ISAD and ISAAR into ISO standards would not do any harm, but would be a long and bureaucratic process.

Kent Haworth, archivist of York University in Canada, and a member of the ICA Descriptive Standards Committee, spoke on the advantages of standards generally. They were means, not ends, and had to be practical. Standards only work when they are deliberately accepted by an organization or group. Their benefit is to clarify purpose and function, to provide benchmarks and to encourage collaboration and co-operation. They also help to influence behaviour. Standards can cover both content and structure of description. ISAD(G) is a data content and data structure standard; MARC is a data structure standard only. EAD is also a data structure standard, designed mainly by US archivists. As a Canadian, Kent explained that he came to EAD with a natural scepticism and now concludes that it offers potential solutions to some, but not all, of the problems facing archivists. One of its main advantages is that it can accommodate the multi-level descriptions archivists demand today. Another is that it is itself based on a more general standard, SGML. EAD is, like MARC, software independent, which is an important feature, as the experience of libraries suggests that new information systems are developed every 5 to 7 years. However, Kent warned, it was essential that EAD was fully compatible with international standards like ISAD and ISAAR.

Kent went on to share some of his experiences with the Canadian Archival Information Network (CAIN) and the Ontario system ARCHEION, both of which have similarities to the Scottish Archive Network project. He noted that in Ontario they had decided not to use subject headings, concluding that they would never reach agreement on, for example, railway or railroad. The Canadian archival community is further forward in its use of standards, since most archives comply with the national Rules for Archival Description (RAD). Kent called for increased knowledge of users and their needs. He strongly advocated a cycle of prototyping, testing by potential users, refining and re-testing, which he characterised as “prototype to template”.

Returning to the main theme, he warned that there was a steep learning curve for EAD, but urged the Scottish project to investigate its functionality thoroughly, and to base its decision on the question: will EAD improve access to Scottish archives?

Dick Sargent, of the Historical Manuscripts Commission (HMC), ignored EAD in his presentation, and spoke instead on standards, particularly name standards. Arguing that we should separate out the capture and the maintenance of archival information, he pointed out that the primary access point to archives was the name of their creator. He used the structure of the ISAAR(CPF) standard to explain the different areas of information that go to make up a name authority. He then spoke about the HMC ARCHON system, which has been phenomenally successful, achieving over 1 million hits per year, compared with fewer than 10,000 personal visits, letters and faxes.

Dick sketched out a vision of a UK national name authority, which would combine the knowledge of different archivists. Entries would be created locally, then submitted to a central server and merged. The resulting central file would be linked to on-line catalogues and networks and could be used within each repository’s own system. Dick also saw possibilities for linking with Geographic Information Systems (GIS) to bring in, for example, census placename data.
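The create-locally, merge-centrally workflow could be sketched along the following lines. This is only an illustration. The matching rule (ignoring case, commas and spacing), the function names and the sample names are all assumptions, and a real name authority built on ISAAR(CPF) would hold far more than a name string.

```python
from collections import defaultdict


def normalise(name):
    """Reduce a name to a crude matching key (an assumed rule)."""
    return " ".join(name.lower().replace(",", " ").split())


def merge_entries(submissions):
    """Merge (repository, name) pairs into one central file keyed by name."""
    central = defaultdict(lambda: {"forms": set(), "repositories": set()})
    for repository, name in submissions:
        entry = central[normalise(name)]
        entry["forms"].add(name)
        entry["repositories"].add(repository)
    return dict(central)


# Invented submissions from two hypothetical repositories.
submissions = [
    ("Repository A", "Smith, Jean"),
    ("Repository B", "SMITH, Jean"),
    ("Repository B", "Brown, Alexander"),
]
central_file = merge_entries(submissions)
for key, entry in sorted(central_file.items()):
    print(key, sorted(entry["repositories"]))
```

The merged entry records which repositories hold material for each name, which is the link that on-line catalogues and networks would then use.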

Chris Seifried, of the National Archives of Canada, and a member of ICA/CIT, spoke on the situation in Canada. Shamelessly flattering his audience, he began with a quotation from an 1881 Canadian government report on archives which acknowledged the debt they owed to Scotland and the Register House. He explained that the government’s policy and programme were now focused around the idea of connecting Canadians, and this affects all aspects of public life. For example, a government sponsored programme entitled Canada’s Digital Collections is hiring young people to set up websites of Canadian material, and these include a number of archive scanning projects.

Chris gave a comprehensive survey of the 8 different archival network initiatives going on in his country. First is BCAUL, the British Columbia Archival Union List, which holds around 8,500 fonds level descriptions for 163 repositories. It has a central database using the GEAC library system, with MARC-tagged records, and can output in either MARC or EAD. The Archives Network of Alberta (ANA) has around 4,000 descriptions from 25 repositories. They carry out ad hoc data capture and use MARC tags. They have no plans to use EAD. The Saskatchewan Archival Information Network and the Manitoba Archival Information Network (SAIN-MAIN) are prototyping at present. They believe that EAD is the best means of access to multi-level finding aids, which MARC cannot handle. The Ontario network, ARCHEION, already mentioned by Kent Haworth, is still being designed. It uses EAD format ASCII files. In Quebec, there are two systems. PISTARD, the Archives nationales du Québec system, uses a distributed approach. Each of the 9 regional offices uses its own system and every month a copy is loaded to a central database. It uses the descriptive rules RAD, but not MARC or EAD. The database was custom built with ORACLE software, and has built-in multi-lingual navigation. The system holds around 10,000 fonds, series and sub-series descriptions. The second Quebec system is the Réseau des Archives, which is still being studied. It will use a web crawler to maintain a central index to resources. The central server will use ORACLE or SQL Server software, but EAD is not being considered. By January 2000 it expects to hold 1,000 fonds from 160 repositories. ARCHWAY, the system run from the Nova Scotia archives, has a team that goes out and collects data. Their descriptions are compliant with RAD, but there are differences in implementation. The system uses GENCAT software, which can output in EAD, but currently makes automatic HTML conversions from the database. ARCHIVIANET, the system of the National Archives, uses BRS-Net software, which allows some multi-level navigation. There are no EAD applications, but the subject is currently being studied with great interest.

Finally, there is the country-wide system, the Canadian Archival Information Network (CAIN). This is a network of networks, developed by the Canadian Council on Archives and is based on provincial union lists. It puts great emphasis on standards and training. Most Canadian systems using SGML and EAD are connected to universities: the University of Saskatchewan supports SAIN-MAIN, while ARCHEION is supported by York University.

In conclusion, Chris recommended that the Scottish Archive Network should try EAD.

Gavan McCarthy, Director of the Australian Science and Technology Heritage Centre, said that there was not a lot of activity round EAD in Australia. The main work was being led by IT staff in the library world, and there was a small-scale investigation of EAD markup in the National Archives of Australia.

Gavan made a plea for encoded context. EAD in itself was not enough. What was needed went beyond the traditional notion of authority. He argued that archivists should follow the traditional path of separating records and content from the context of their creation, and apply this by creating separate context databases, for example name authorities. He advocated a three part approach:

• context

• records

• literature.

This underlies his organisation’s website <http://www.austhec.unimelb.edu.au> which displays information on the creators of the fonds, on the records, and on links to the published works of the creators. The website currently has around 20,000 users per week.

Göran Kristiansson, of the Swedish National Archives (Riksarkivet) spoke about their new Arkis 2 system, which has a relational database and uses EAD as an output format for displaying on their website. They set up their National Archive Database (NAD) in 1990, covering the holdings of the national archives and the provincial archives in Sweden. In 1993 it received a boost, with the provision of 1,000 young unemployed people to work on it, under a government scheme. An early principle was not to re-invent things, and accordingly the MARC-AMC standard was adopted, not to create records in a MARC system but to tag data elements to allow the export of information. One feature of the NAD work was that it led to a fonds war, since for the first time, archives throughout the country could see what other institutions held, and how they related together.

Göran demonstrated the data model for the Arkis 2 system (see illustration) which shows the relations of the different parts to each other. Arkis 2 is a relational database using SQL Server and, unlike its predecessor Arkis 1, it allows true multi-level descriptions. It has also, from the outset, been designed as an Internet available service. During the development phase of Arkis 2, EAD had emerged and its value was quickly recognized. It is used in the same way as MARC-AMC was used, as a means of tagging data elements in the system to allow the export of archival information. He emphasized that the foremost purpose of standards is not to make life easier for archivists but for users.

One interesting feature of the Arkis 2 system is the way it can display the multiple levels of information, including the automatic construction of an organisation chart, based on the descriptive levels. This will allow users to see how the levels of description are derived from organizational levels, in a graphical way.
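How a chart might be derived from descriptive levels can be sketched with a few parent-linked records. This is not the Arkis 2 implementation, which was not described in detail. The record layout, the labels and the function name render_tree are invented for illustration, and the output is a plain indented outline rather than a graphic.

```python
def render_tree(records, parent=None, depth=0):
    """Return indented lines showing each record beneath its parent."""
    lines = []
    for record_id, parent_id, label in records:
        if parent_id == parent:
            lines.append("  " * depth + label)
            # Recurse to place this record's children one level deeper.
            lines.extend(render_tree(records, record_id, depth + 1))
    return lines


# Invented records: (id, parent id, label with its descriptive level).
records = [
    (1, None, "Ministry (fonds)"),
    (2, 1, "Department A (series)"),
    (3, 2, "Registry (sub-series)"),
    (4, 1, "Department B (series)"),
]
chart = render_tree(records)
print("\n".join(chart))
```

The same parent links that support multi-level description are enough to draw the hierarchy, so no separate chart data needs to be maintained.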

Göran also reported that in Sweden a major initiative is underway, with central funding, to create a national authority database, involving various parts of the heritage sector, including archives, libraries and museums.

Discussion

One theme for discussion was users. They were mentioned frequently during the presentations, but what is really known about what they want? Peter Horsman advocated simple screens and the study of different sites. Lesley Richmond said that one way of finding out was to study audit trails of sites, to see how people actually work. Gavan said that users should be asked, and their queries and criticisms all followed up. Chris Seifried said that the question might be: what do Scots need? Dorothy Johnston mentioned the JISC user survey, which led to the conclusion that users wanted more catalogues, rather than more digital documents.

Another theme was the need for co-ordination of standards among the different UK projects. Carolynn Bain mentioned the JISC Higher Education Hub, the Scottish Archive Network, and the A2A initiative, and asked whether they were all going the same way. Dick Sargent replied that his role, as a member of each of the steering groups of these projects, was to ensure consistency, and avoid re-invention of wheels. Dorothy Johnston added that there was not always consistency within the HE initiative.

A further theme, stimulated by Frances Shaw, was the difficulties faced by small archives, with limited technical knowledge, when joining large and ambitious projects like the Scottish Archive Network. Chris Seifried suggested that, rather than concentrating on the technology, it was important to build a community.

It was agreed that, although about half the participating archives in the Scottish Archive Network were at the conference, there was a need for more information to be circulated to those that were not able to be present.

Glossary

Access Point		A basis of searching for information in an archival information system, such as name, date, subject
ASCII		American Standard Code for Information Interchange, a well established coding or tagging system for text
GIS		Geographical Information Systems, computerised systems in which the information is arranged and accessed on a geographic basis
HTML		Hypertext Markup Language, a means of marking text to allow it to be displayed on the world wide web; likely to be replaced by XML in future
ICA		The International Council on Archives, a non-governmental organization of archives and archivists from around the world.
ICA/CIT		ICA committee on Information Technology, chaired by Peter Horsman; the UK member is Ishbel Barnes.
ICA/CDS		ICA committee on descriptive standards, the body which developed ISAD and ISAAR etc.

ISAD(G) or ISAD		International Standard Archival Description (General) a standard for describing archives, developed by a committee of ICA, launched in 1994 and now in 1999 being revised in light of comments. Translated into several languages and widely used across the world. A companion to ISAAR(CPF)
ISAAR(CPF)		International Standard Archival Authority Record (Corporate, Personal and Family) a standard for describing individuals and organizations in archive finding aids developed by an ICA committee
ISO		The International Organization for Standardization, composed of national standards bodies, which develops and promotes international standards. ISAD and ISAAR are not ISO standards but could in future become such; a records management standard is being developed at present.
MARC		Machine Readable Cataloguing, a library description system which was used by many archives in the US to catalogue archival material. Main drawback is that it does not allow multi-level descriptions
MARC-AMC		A subset of MARC for Archives and Manuscripts Cataloguing
SGML		Standard Generalised Markup Language, a means of marking or encoding text to allow it to be displayed, developed about 10 years ago and an ISO standard (ISO 8879); extensively used in publishing but not directly compatible with display on the world wide web
Web browser		Software allowing a user to view and copy pages from the world wide web. Examples are Microsoft Internet Explorer, and Netscape Navigator
Web Crawler		A utility that moves automatically from website to website across the Internet gathering up pre-determined information, for example updating a central record
XML		Extensible Markup Language, a new language developed to replace HTML; it is a simplified version of SGML. It is recognised by the latest Web browser software