Difference between revisions of "VALEP:About"

Revision as of 11:29, 7 August 2021

VALEP was programmed by Maximilian Damböck and designed by Christian Damböck. It was launched in May 2021 and announced, among other places, on Daily Nous, Leiter Report, and the IVC website.

Hosts, supporters, and funding

VALEP is located on the server valep.vc.univie.ac.at, which is hosted by the University of Vienna. The Institute Vienna Circle maintains the server. Further financial support is provided by the following sources:

  • FWF research grant P31716: € 16,000 for programming in 2020 and 2021, € 2,000 for data processing in 2021
  • The Vienna Circle Society: € 5,000 for programming in 2021
  • FWF research grant P34887: at least € 12,000 for programming (2021-2023) plus approx. € 30,000 for digitization and data processing (2021-2023)

VALEP and Phaidra

In a future development phase of VALEP, projected for 2022, its data will also be mirrored in the University of Vienna's digital repository Phaidra. The staff and consultants of Phaidra have already supported the initial development of VALEP, including aspects of archival science (Susanne Blumesberger), database design and the technical integration of Phaidra (Raman Ganguly), copyright issues (Seyavash Amini Khanimani), and all details of the metadata design of VALEP (Ratislav Hudak).

Cooperation Partners of VALEP

We are looking for cooperation partners among several international archives that hold material on the history of Logical Empiricism. If you are interested in cooperating with VALEP as an institution, or if you are simply using VALEP for your own research and are interested in depositing your collected material, please contact Christian Damböck.

Archives of Scientific Philosophy (ASP), Hillman Library, University of Pittsburgh

The Archives of Scientific Philosophy has been sharing its electronic resources with VALEP. This includes scans produced by the ASP of the papers of Rudolf Carnap, Carl Gustav Hempel, Richard C. Jeffrey, Hans Reichenbach, Frank Plumpton Ramsey, and Rose Rand. The material comprises about 30,000 scans and is already available in VALEP. We would like to thank Ed Galloway for his most generous support and Clinton T. Graham for transferring the files.

The scope and mission of VALEP


VALEP is an archive management tool that is intended as a platform for the history of Logical Empiricism and related currents.

VALEP functionalities:

  • (left/red part of the window) representation of the hierarchical structure of an archive, including archives, collections, digital reproductions, shelves, boxes, folders, files
  • (middle/green part of the window) transformation of archival material into objects that belong to a certain document category and document type and that are uniquely identified by metadata such as title, description, author, and date
  • (upper right/yellow part of the window) archive nodes and documents are characterized by metadata that can be accessed in the upper right section of the screen
  • (lower right/blue part of the window) Files and documents can be viewed in an integrated document viewer (already available) and the objects can be downloaded and printed (to be implemented in 2021)

VALEP stores titles, descriptions, and the like as Unicode text. However, some metadata categories, including date, location, language, persons, and institutions, are stored in a relational database and/or handled with special formats and parsing tools, e.g., EDTF for dates and an internal tool for the mereological grasp of locations. See the metadata page for the details.
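To illustrate why a structured date format helps, the following is a minimal sketch that reads a small subset of EDTF-style strings (a year, a year and month, a full day, and intervals written with "/") and turns each into the earliest and latest day it covers. It is not VALEP's parser and not a full EDTF implementation; the function names date_span and overlaps are assumptions made for this example.

```python
import calendar
from datetime import date


def _bounds(value):
    """Return the first and last day covered by YYYY, YYYY-MM or YYYY-MM-DD."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        return date(parts[0], 1, 1), date(parts[0], 12, 31)
    if len(parts) == 2:
        last_day = calendar.monthrange(parts[0], parts[1])[1]
        return date(parts[0], parts[1], 1), date(parts[0], parts[1], last_day)
    if len(parts) == 3:
        day = date(*parts)
        return day, day
    raise ValueError(f"unsupported date string: {value!r}")


def date_span(value):
    """Return (earliest, latest) day for a single date or a start/end interval."""
    if "/" in value:
        start, end = value.split("/")
        return _bounds(start)[0], _bounds(end)[1]
    return _bounds(value)


def overlaps(value, query):
    """True if the two date strings share at least one day."""
    a_start, a_end = date_span(value)
    b_start, b_end = date_span(query)
    return a_start <= b_end and b_start <= a_end


# A range from December 24, 1924 until October 1930, with mixed precision.
span = date_span("1924-12-24/1930-10")
```

With such spans, a filter for the year 1927 finds a document dated "1924-12-24/1930-10" through overlaps("1924-12-24/1930-10", "1927"), even though no single day was ever recorded for it.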

Is VALEP Based on the Concept of Digital Humanities?

If one expects a digital humanities project to adopt sophisticated statistical methods of experimental research, then the answer is clearly no. Although the data pool built by VALEP might be used for such methods in the future, there are no plans, now or in the near future, to integrate tools for complex statistical evaluation into VALEP.

On the other hand, VALEP is certainly aiming to collect large amounts of data. The history of Logical Empiricism, together with related currents such as Neo-Kantianism, French Positivism, British Empiricism, and American Pragmatism, comprises dozens of major and probably thousands of minor individuals, including academic and private scholars. Many of the papers of these individuals can be found in public institutions and private collections. Additional material was collected by academic and private institutions. There are thousands of manuscripts and publications, and probably millions of letters between representatives of the relevant scientific movements, that might add significant research value to our studies of Logical Empiricism. VALEP allows us to store, preserve, and evaluate all these sources as soon as they become available in electronic form. We can then search and filter them according to our needs and interests to find the material relevant to us. This is, of course, also an important aspect of digital humanities.

VALEP's Innovative Approach

Existing tools for the management of archival sources include (1) tools provided by academic archives, such as the Archives of Scientific Philosophy; (2) open tools such as PhilArchive, where anybody can upload electronic documents; and (3) tools tailored to the presentation of material of a specific origin, such as the papers of Ludwig Wittgenstein. All these tools have in common that they are more or less document-oriented. They do not mirror the physical structure of an archive, but rather store documents according to a particular unit of metadata. This approach can be fruitful if the documents are processed thoroughly and the associated metadata are clear, transparent, and sufficiently complex.

However, most of the existing tools include only rudimentary metadata, and, in the case of public archives, single documents are not processed as logical units (e.g., a letter from Otto Neurath to Rudolf Carnap from December 26, 1934). Rather, the archive provides constructed units, e.g., folders that contain several letters from Carnap to Neurath from the years 1923 to 1929. Sometimes these units even include additional material unrelated to their main topic. In such cases, offering effective and meaningful metadata may not be possible at all, simply because the document units are too inconsistent.

Advantages of an Archive Oriented Presentation

When a digital archive only offers rudimentary metadata and rather ambiguous documents, it might be helpful to include a presentation of the digital material that represents the physical structure and arrangement within an archive. Archives typically structure their material into collections and subcollections (or series and subseries), shelves, boxes, folders, etc. This detailed structure often already represents a certain order, e.g., distinguishing between manuscripts and correspondence, providing a chronological arrangement, and/or selecting topics or correspondence partners. Despite remaining inconsistencies, archive users are usually able to work with a mnemotechnical approach, often supported by the archival finding aids. In order to increase the transparency and usability of digital archival sources, the material could be organized in such a way that mirrors the physical structure of the analog archive.
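As a rough illustration of what mirroring the physical structure can mean in practice, the sketch below nests a list of scan paths into a tree whose levels correspond to collection, box, and folder. The path names and the functions build_archive_tree and count_files are hypothetical and serve only this example; they do not describe how VALEP stores its archive tree.

```python
from pathlib import PurePosixPath


def build_archive_tree(file_paths):
    """Nest scan paths into a dict tree that mirrors the folder hierarchy.

    Inner nodes (collection, box, folder, ...) are dicts; files map to None.
    """
    tree = {}
    for raw in file_paths:
        parts = PurePosixPath(raw).parts
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return tree


def count_files(node):
    """Count the files below a node of the tree."""
    if node is None:
        return 1
    return sum(count_files(child) for child in node.values())


# Hypothetical scan paths; the names are made up for illustration.
scans = [
    "ExampleCollection/Box01/Folder03/page_001.jpg",
    "ExampleCollection/Box01/Folder03/page_002.jpg",
    "ExampleCollection/Box02/Folder01/page_001.jpg",
]
tree = build_archive_tree(scans)
files_in_box01 = count_files(tree["ExampleCollection"]["Box01"])
```

Because the tree is derived from the paths alone, no per-document cataloguing is needed before the material becomes browsable along the lines of the finding aid.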

The Logarithmic Scale of Archival Work


Structuring a digital archive according to its physical organization in the analog world also yields a major pragmatic benefit. If digitized copies are stored in folders that mirror the structure of the analog archive, then the entire material can be uploaded, basically, with a single mouse click. The archival work involved here is close to zero. On the other hand, if every document is to be processed individually and the archive is really large, as, for example, the Schlick Papers in Haarlem, which comprise some 50,000 pages, or the Carnap Papers at the ASP, which consist of close to 100,000 pages, then one can expect to process tens of thousands of documents. If an archivist processes, say, 20 documents per hour, this might eventually amount to several years of full-time work. So we are comparing two extremes on the scale of gains divided by working time. Most archival resources will not be processed at such a micro-level of document structure, for mere lack of time and funding. However, these sources might still be made available at the level of a representation of the physical structure of the repository.
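The order of magnitude of the document-level effort can be checked with a short calculation. The sketch below uses the rate of 20 documents per hour from the text; the figure of 1,600 working hours per year and the document counts are assumptions chosen for illustration (the page counts above are upper bounds, since a document usually spans more than one page).

```python
def processing_years(documents, docs_per_hour=20, hours_per_year=1600):
    """Estimate full-time years needed to process documents one by one.

    hours_per_year is an assumed figure for one full-time position.
    """
    hours = documents / docs_per_hour
    return hours / hours_per_year


# Hypothetical document counts in the range of "tens of thousands".
for count in (30_000, 50_000, 100_000):
    print(f"{count:>7} documents: {processing_years(count):.1f} years")
```

Under these assumptions the estimate ranges from roughly one to three full-time years for a single collection, compared with close to zero for an upload that simply mirrors the folder structure.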

Adequate metadata are important

Metadata can be needlessly complex and confusing. Therefore, a careful selection is important. For example, a document should only be associated with relevant metadata categories. Only a letter, for example, has a recipient or a place of posting, whereas a manuscript, unlike a published book or article, may not have a publication date. So one important aspect of making metadata adequate is restricting each document category to the metadata that are relevant to it.
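One simple way to express such a restriction is a table that lists the permitted metadata fields per document category. The following sketch assumes three categories and a handful of field names; both are hypothetical and are not taken from VALEP's actual metadata design.

```python
# Fields every document may carry (assumed for this example).
COMMON_FIELDS = {"title", "description", "author", "date"}

# Additional fields that only make sense for a given category (assumed).
CATEGORY_FIELDS = {
    "letter": {"recipient", "place_of_posting"},
    "manuscript": set(),
    "publication": {"publication_date"},
}


def irrelevant_fields(category, metadata):
    """Return the metadata keys that do not belong to the given category."""
    allowed = COMMON_FIELDS | CATEGORY_FIELDS[category]
    return sorted(set(metadata) - allowed)


# A manuscript record that wrongly carries a recipient.
record = {"title": "Untitled draft", "author": "N.N.", "recipient": "N.N."}
problems = irrelevant_fields("manuscript", record)
```

Here problems contains only "recipient", while the same record would pass unchanged as a letter.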

Metadata should have a consistent format, especially crucial metadata such as date and location. Dates should be able to cover not only (several) single days but also entire months or years, and date ranges, e.g., from December 24, 1924 until October 1930. Such an approach also covers cases where the date of a document is not clear or where a document was produced over a longer period and/or on different days or in different years. Locations, on the other hand, should be embedded into the mereological structure of geography. The fact that Vienna, for example, belongs to Austria and Europe but also to the Habsburg Empire and the German-speaking world is a complex relationship structure to represent, but it is necessary in order to find all Viennese locations when filtering documents from Vienna, Austria, Europe, the Habsburg Empire, or the German-speaking world. Finally, in many other cases, such as individuals, institutions, and languages, a consistently searchable and filterable layout is easily obtained if the database uses relational features and stores these items in certain predefined lists or tables. References to these predefined resources, however, should be optional, in order to keep the database structure as flexible as possible.
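The part-whole structure of locations can be pictured as a graph in which a place may have several parents. The sketch below is a simplified illustration, not VALEP's internal tool: the PART_OF table, the sample documents, and the function names are assumptions, and the table ignores that such memberships change over time.

```python
# Each place maps to the regions it is directly part of (simplified).
PART_OF = {
    "Vienna": {"Austria", "Habsburg Empire", "German-speaking world"},
    "Austria": {"Europe"},
    "Haarlem": {"Netherlands"},
    "Netherlands": {"Europe"},
}


def regions_of(place, part_of=PART_OF):
    """Return every region that contains the place, directly or indirectly."""
    found = set()
    stack = [place]
    while stack:
        current = stack.pop()
        for parent in part_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def filter_by_region(documents, region):
    """Keep documents whose place is the region itself or lies within it."""
    return [
        doc for doc in documents
        if doc["place"] == region or region in regions_of(doc["place"])
    ]


# Hypothetical documents for illustration.
documents = [
    {"title": "Letter A", "place": "Vienna"},
    {"title": "Letter B", "place": "Haarlem"},
]
in_europe = filter_by_region(documents, "Europe")
in_habsburg = filter_by_region(documents, "Habsburg Empire")
```

Filtering for Europe returns both letters, whereas filtering for the Habsburg Empire returns only the Viennese one, although neither record mentions Europe or the Habsburg Empire explicitly.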

Archival documents exist in a variety of instances (versions, chapters) across many archival repositories

An important aspect of the merging of several archival sources is that documents tend to be located not only in one folder/box that belongs to one collection. Rather, we often find the following scenarios:

  • An original document is held in archive X, whereas copies are located in other archives, e.g., carbon copies of a letter kept by the sender (which might contain relevant information that the original letter does not provide)
  • Written duplicates of a document might exist, transcriptions and translations, as well as commentaries that are located at different archival repositories.
  • A document might consist of several parts or chapters that, in turn, might be scattered in different archives (some of them might be the original, some might be copies, written duplicates, transcriptions, etc.)
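The scenarios above suggest separating the abstract document from its physical instances. The following is a minimal sketch of such a data model under that assumption; the class names, field names, and the archives "Archive X" and "Archive Y" are hypothetical and do not reproduce VALEP's database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Version:
    """One physical instance of a document held in one repository."""
    kind: str                      # e.g. "original", "carbon copy", "transcription"
    repository: str                # archive that holds this instance
    chapter: Optional[str] = None  # set when the instance covers only one part


@dataclass
class Document:
    """An abstract document that groups all of its scattered instances."""
    title: str
    versions: List[Version] = field(default_factory=list)

    def repositories(self):
        """List the archives that hold at least one instance."""
        return sorted({v.repository for v in self.versions})

    def kinds_at(self, repository):
        """List the kinds of instances held by one archive."""
        return sorted(v.kind for v in self.versions if v.repository == repository)


# A hypothetical letter with an original and two derived instances.
letter = Document(
    title="Hypothetical letter",
    versions=[
        Version("original", "Archive X"),
        Version("carbon copy", "Archive Y"),
        Version("transcription", "Archive Y"),
    ],
)
holders = letter.repositories()
```

Metadata such as author and date then need to be recorded only once, on the document, while each version carries what is specific to the copy in hand.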

A note on copyright

International copyright legislation generally allows open access to metadata, whereas facsimiles may be published online only if (α) the copyright was granted to the publishers by the copyright holder(s), or (β) the legal situation allows for publication without explicit transfer of copyright. There are two typical scenarios for (β): (β-1) publication of a document is possible if all authors involved died more than 70 years ago, which places the material in the public domain; (β-2) publication is possible if the publishers can prove that the copyright holders could not be identified, despite all reasonable efforts to do so.

Copyright protection vs. research access: The flexible VALEP approach

Another logarithmic scale emerges here. For the works of prominent individuals such as Carnap, Reichenbach, or Quine, it is often quite easy to obtain the copyright. However, their papers also contain a wealth of material authored by others (letters to Carnap, Quine, Reichenbach) or material that touches on the privacy rights of others (e.g., Carnap discussing a specific person in a letter). Resolving all the copyright issues emerging from a Nachlass can turn into a tedious and unmanageable task. Therefore, it would be advantageous if a database could deal with these issues in a flexible way. Material might either be removed from public access in its entirety (metadata plus facsimiles), or it might be publicly accessible (because metadata are unproblematic) but with restricted access to the facsimiles. Moreover, access can be restricted to the internal, non-public realm of the database, allowing access only to authorized staff.
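These graded restrictions can be pictured as a small set of access levels attached to each object. The sketch below is one possible reading of the paragraph above, not VALEP's implementation; the level names and the function visible_parts are assumptions made for this example.

```python
from enum import Enum


class Access(Enum):
    PUBLIC = "public"                # metadata and facsimiles are public
    METADATA_ONLY = "metadata_only"  # metadata public, facsimiles restricted
    INTERNAL = "internal"            # nothing is public


def visible_parts(access, is_staff=False):
    """Return which parts of an object a visitor may see."""
    if is_staff or access is Access.PUBLIC:
        return {"metadata", "facsimile"}
    if access is Access.METADATA_ONLY:
        return {"metadata"}
    return set()


# A letter whose copyright is unresolved: findable, but not viewable.
public_view = visible_parts(Access.METADATA_ONLY)
staff_view = visible_parts(Access.INTERNAL, is_staff=True)
```

The point of the middle level is that material stays findable through its metadata while the copyright question is still open, so it does not have to be withheld entirely.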

Improving Research Tools across Archival Repositories

Along the lines of these considerations, the following features would be desirable additions to the typical coverage of existing archival tools:

  • To cover the physical structure of an archive (in order to serve the mnemotechnical skills of researchers and make existing finding aids more useful)
  • To provide parts with high gains and low costs first and add the rest (very high gains and very high costs) only in those cases where the existing resources make this possible
  • To provide a flexible handling of metadata categories that tailors them to the required document categories
  • To ensure that critical metadata categories such as date and location use a flexible, consistent, and transparent format, together with suitable parsing tools (that avoid inconsistent entries)
  • To implement other critical metadata via predefined lists and tables in a relational database setting, while keeping data fields optional whenever possible
  • To provide suitable tools that enable the processing of decentralized documents that disintegrate into several versions and chapters
  • To enable keeping parts of published material restricted (access to the metadata but not to the facsimiles) or even keeping both the metadata and the facsimiles internal, as long as copyright issues remain unresolved

VALEP offers them

Indeed, the aforementioned features are all offered by VALEP. From the beginning, the design of this tool was centered around the idea of combining the representation of an archive via its physical structure with its representation via documents. The other innovative features of VALEP in part followed directly from this key idea, as is true, for example, of the implementation of versions and chapters, which mediate between (general) documents and archives, and in part entered the design on the basis of feedback from archivists and the designer's own experience at the archives.

Who can use VALEP?

VALEP is available to everybody and is free of charge in all its varieties. Typical users of VALEP might include:

  • Public and private institutions that house material on the history of Logical Empiricism and want to use VALEP as a tool that helps them distribute their sources and integrate them with other relevant material
  • Private persons who hold collections relevant to the history of Logical Empiricism and want to use VALEP not just to distribute and integrate their sources but also to safeguard them for the future
  • Researchers from all over the world who have digitized material in the archives and are willing to share it with others and/or want to use VALEP as a tool that allows them to process and better organize their sources

If you are interested in using VALEP as an institution, private person, or researcher, please contact Christian Damböck.

Future prospects

The current (and first) version of VALEP was developed in 2020/21. By fall 2021, we plan to implement, among other things, the following additional features:

  • Persistent links to all documents, versions, files, and nodes of the archive tree
  • Possibility to assign DOIs to general documents

Features to be implemented in 2022 (preliminary list)

  • Integration of Phaidra: each published VALEP object is stored in Phaidra
  • The possibility to selectively restore deleted VALEP objects
  • Possibilities to mark objects in VALEP with flags, together with advanced filter tools
  • Bundles of documents can be loaded into the file viewer
  • The sequence of JPGs that is loaded into the file viewer can be downloaded as a PDF
  • In the internal view (construction site), the nested content of any node of the archive tree can be downloaded to the local computer
  • For each node of the archive tree, the number of files that belong to this node is displayed

If you find any bugs, want to report problems, or have any other suggestions, please contact Christian Damböck.