Difference between revisions of "VALEP:About"

From VALEP
(Hosts, Supporters, and Funding)
 
(85 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  
VALEP was programmed by [mailto:maxi.damboeck@gmail.com Maximilian Damböck] and designed by [https://homepage.univie.ac.at/christian.damboeck/ Christian Damböck]. It became launched in May 2021, e.g., on [https://dailynous.com/2021/06/01/virtual-archive-of-logical-empiricism/ Daily Nous], [https://leiterreports.typepad.com/blog/2021/05/valep-the-virtual-archive-of-logical-empiricism.html Leiter Report] and  the [https://wienerkreis.univie.ac.at/forschung/valep-virtual-archiv/ IVC website].
+
VALEP was launched in May 2021 and announced, e.g., on [https://dailynous.com/2021/06/01/virtual-archive-of-logical-empiricism/ Daily Nous], [https://leiterreports.typepad.com/blog/2021/05/valep-the-virtual-archive-of-logical-empiricism.html Leiter Reports], and the [https://wienerkreis.univie.ac.at/forschung/valep-virtual-archiv/ IVC website]. The development team includes Christian Damböck (design), Catherine Schlienger and Maximilian Damböck (programming), Helena Müller (UX Design), Brigitta Arden, Lucas Baccarat, Philipp Leon Bauer, Roman Jordan, Lois M. Rendl, and Miguel De la Riva (data processing).
  
== Hosts, supporters, and financiers ==
+
== VALEP and Phaidra ==
 +
 
 +
All VALEP data are mirrored in the University of Vienna's digital repository [https://phaidra.univie.ac.at/ Phaidra]. The staff and consultants of Phaidra support the development of VALEP, including aspects of archival science (Susanne Blumesberger), database design and the technical integration of Phaidra (Raman Ganguly), copyright issues (Seyavash Amini Khanimani), and all details of the metadata design of VALEP (Rastislav Hudak).
 +
 
 +
== Cooperation Partners of VALEP ==
 +
 
 +
We are looking for cooperation partners among international archives that hold material on the history of Logical Empiricism. If you are interested in cooperating with VALEP as an institution, or if you are simply using VALEP for your own research and are interested in depositing your collected material, please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].
 +
 
 +
=== Archives of Scientific Philosophy (ASP), University of Pittsburgh Library System ===
  
VALEP is located at the server [https://valep.vc.univie.ac.at/ valep.vc.univie.ac.at], which belongs to the [https://www.univie.ac.at/ University of Vienna]. It is operated by the [https://wienerkreis.univie.ac.at/forschung/valep-virtual-archiv/ Institute Vienna Circle], which also covers the running costs for the server. Further financial support came from the following sources:
+
The [https://digital.library.pitt.edu/collection/archives-scientific-philosophy Archives of Scientific Philosophy] of the [https://www.library.pitt.edu/ University of Pittsburgh Library System] has been sharing its electronic resources with VALEP. This includes scans produced by the ASP of the Papers of Rudolf Carnap, Carl Gustav Hempel, Richard C. Jeffrey, Hans Reichenbach, Frank Plumpton Ramsey, and Rose Rand. The material comprises about 30,000 scans and is already available in VALEP. We would like to thank Ed Galloway for his most generous support and Clinton T. Graham for transferring the files.
  
* FWF research grant [https://homepage.univie.ac.at/christian.damboeck/carnap_2018-2021/index.html P31716]: € 16,000 for programming in 2020 and 2021, € 2,000 for data processing in 2021
+
=== Brenner Archiv, University of Innsbruck ===
* The [https://www.univie.ac.at/vcs/ Vienna Circle Society]: € 5,000 for programming in 2021
+
 
* FWF research grant P34887: at least € 12,000 for programming (2021-2023) plus approx. € 30,000 for digitization and data processing (2021-2023)
+
The [https://www.uibk.ac.at/brenner-archiv/index.html.de Brenner Archiv], which houses the Nachlass of Wolfgang Stegmüller, made parts of Stegmüller's correspondence as well as his dissertation and habilitation thesis available in VALEP. The material comprises more than 7,000 scans and includes correspondence with philosophers such as Hans Albert, Max Black, Rudolf Carnap, Herbert Feigl, Carl Gustav Hempel, Thomas Kuhn, Paul Lorenzen, and Joseph Sneed. Thanks go to Michael Schorner for preparing the scans and to Ulrike Tanzer for granting permission to put the material online.
 +
 
 +
=== Moritz Schlick Forschungsstelle, University of Rostock ===
 +
 
 +
In 2022 VALEP entered into a cooperation with the [https://www.iph.uni-rostock.de/forschung/moritz-schlick-forschungsstelle/ Moritz Schlick Forschungsstelle] in the form of a formal agreement between the Universities of Rostock and Vienna. As part of the cooperation, the Moritz Schlick Forschungsstelle will upload, step by step, scans of the entire Moritz Schlick Nachlass, which comprises more than 50,000 pages. These scans of the Nachlass, which is located at the [https://noord-hollandsarchief.nl/ Noord-Hollands Archief] in Haarlem, were prepared by the Moritz Schlick Forschungsstelle as part of their Moritz Schlick Edition Project. The Moritz Schlick Forschungsstelle will make the scans available in VALEP and will also process the files into documents. Future plans also include an electronic edition of the published and unpublished works of Moritz Schlick. Thanks go to Matthias Wunsch and Martin Lemke for their generous support and fruitful cooperation.
  
== VALEP and Phaidra ==
+
=== The Wittgenstein Archives at the University of Bergen (WAB) ===
  
In a future implementation which is projected for 2022 VALEP will also mirror its data in the University of Vienna's digital depository [https://phaidra.univie.ac.at/ Phaidra]. The staff and consultants of Phaidra already supported the development of the first version of VALEP; this included general questions of archival science (Susanne Blumesberger), questions of database design and the technical integration of Phaidra (Raman Ganguly), copyright issues (Seyavash Amini Khanimani), and all details of the metadata design of VALEP (Ratislav Hudak).
+
The [http://wab.uib.no/ WAB] agreed to share material from its Bergen Nachlass Edition (BNE) of the estate of Ludwig Wittgenstein which is currently available at [http://www.wittgensteinsource.org/ Wittgenstein Source] and make it available in VALEP. This will include scans of the manuscripts and, in a future stage of the cooperation, also the XML transcriptions of the works from the Wittgenstein Nachlass. Thanks go to Alois Pichler for his generous support and to the Wren Library at Trinity College Cambridge for granting the permission to upload the material in VALEP.
  
== Cooperation Partners of VALEP ==
+
== The Technical Implementation of VALEP ==
  
We are recently seeking cooperation partners among several international archives that house material on the history of Logical Empiricism. If you are interested in cooperating with VALEP as an institution or simply using VALEP for your own research and store the material that you collected in the archives please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].
+
[[File:Valep architecture.jpg|thumb|The technical implementation of VALEP]]
  
=== Archives of Scientific Philosophy (ASP), Hillman Library, University of Pittsburgh ===
+
VALEP uses the following system architecture:
 +
* A backend implemented in Java (Spring Boot) that uses a PostgreSQL database
 +
* Additional services that include
 +
** an EDTF-Service (NodeJS) for parsing data entries
 +
** (in future implementations) services for Data Import (LaTeX, external SQL databases) inside of the Java backend
 +
** (in future implementations) services for transfer from LaTeX to TEI-XML and HTML (written in Python)
 +
* Local users communicate with the backend via an Apache web server (Angular) and Keycloak (OAuth2) user access management
  
The [https://digital.library.pitt.edu/collection/archives-scientific-philosophy Archives of Scientific Philosophy] share their electronic resources with VALEP. This includes all scans of the papers of Rudolf Carnap, Carl Gustav Hempel, Richard C. Jeffrey, Hans Reichenbach, Frank Plumpton Ramsey, and Rose Rand being processed by the ASP. The material comprises about 30,000 scans and is already fully available in VALEP. We would like to thank Ed Galloway for his most generous support and Clinton T. Graham for transferring the files.
+
VALEP uses the following strategies for data security and long-term availability of data:
 +
* Daily data backup with the standard tools provided by the University of Vienna
 +
* Regular data backup to non-local computers
 +
* Persistent links via DOI
 +
* Use of standard formats such as jpg, tiff, pdf, mp3, mp4 for electronic files and LaTeX, HTML, TEI-XML for electronic editions
 +
* Mirroring of all data at [https://phaidra.univie.ac.at Phaidra]
 +
* (in future implementations) integration of data on persons, institutions, locations, etc. in existing online databases such as [https://www.wikidata.org/wiki/ Wikidata]
  
== The scope and mission of VALEP ==
+
== The Scope and Mission of VALEP ==
  
 
  [[File:VALEP-window-public.jpg|thumb]]
 
  
VALEP is an archive management tool that is intended as a platform for the history of Logical Empiricism and related currents.  
+
VALEP is an archive management and edition tool that is intended as a platform for the history of Logical Empiricism and related currents. ([https://doi.org/10.48666/875836 Presentation from September 2022])
 +
 
 +
VALEP functionalities:
 +
* (left/red part of the window) representation of the hierarchical structure of an archive, including archives, collections, digital reproductions, shelves, boxes, folders, files
 +
 
 +
* (middle/green part of the window) transformation of archival material into objects that belong to a certain document category, document type and that are uniquely identified by [[metadata]] that include title, description, author, date
 +
 
 +
* (upper right/yellow part of the window) archive nodes and documents are characterized by [[metadata]] that can be accessed in the upper right section of the screen
  
VALEP processes
+
* (lower right/blue part of the window) Files, documents, and editions can be viewed in an integrated document viewer (already available) and the objects can be downloaded and printed (to be implemented in 2021)
* (left/red part of the window) the hierarchical structures of archives that include archives, collections, digitizations, shelfs, boxes, folders, files
 
  
* (middle/green part of the window) documents that process files of an archive into objects that belong to a certain document category, document type and become specified by means of [[metadata]] that include title, description, author, date
+
VALEP records titles, descriptions and the like as Unicode text. Some [[metadata]] categories, however, including date, location, language, persons, and institutions, are stored in a relational database and/or using special formats and parsing tools, e.g., EDTF for dates, and an internal tool for the mereological structure of locations. See the [[metadata]] page for the details. In future implementations (presumably coming in 2024/25) VALEP will integrate its data on persons, institutions, and locations with external resources such as [https://www.wikidata.org/wiki/ Wikidata].
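To illustrate why a dedicated date format helps, here is a minimal Python sketch that reads a small subset of EDTF (years, year-months, full days, and intervals joined by "/") and turns each entry into an earliest and a latest calendar day. It is not VALEP's EDTF service, which runs on NodeJS; the function names <code>date_bounds</code>, <code>edtf_bounds</code>, and <code>overlaps</code> are hypothetical, and the many other EDTF features (uncertain or approximate dates, sets, open intervals) are deliberately left out.

```python
import calendar
import re
from datetime import date

# Supported subset (an assumption of this sketch): "YYYY", "YYYY-MM",
# "YYYY-MM-DD", and intervals "start/end" built from those three forms.
_DATE_RE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def date_bounds(text):
    """Return (earliest, latest) calendar day covered by a single date."""
    match = _DATE_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported date: {text!r}")
    year = int(match.group(1))
    month = int(match.group(2)) if match.group(2) else None
    day = int(match.group(3)) if match.group(3) else None
    if month is None:
        # A bare year covers January 1 through December 31.
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # A year-month covers the first through the last day of the month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    exact = date(year, month, day)
    return exact, exact


def edtf_bounds(text):
    """Return (earliest, latest) day for a single date or an interval."""
    if "/" in text:
        start, end = text.split("/", 1)
        earliest = date_bounds(start)[0]
        latest = date_bounds(end)[1]
        if earliest > latest:
            raise ValueError(f"interval ends before it starts: {text!r}")
        return earliest, latest
    return date_bounds(text)


def overlaps(entry, query):
    """True if the two date expressions share at least one day."""
    entry_start, entry_end = edtf_bounds(entry)
    query_start, query_end = edtf_bounds(query)
    return entry_start <= query_end and query_start <= entry_end
```

With these helpers, a document dated <code>1924-12-24/1930-10</code> is found by a filter for <code>1927</code>, because the two ranges share days, whereas a letter dated <code>1934-12-26</code> is not found by a filter for <code>1923/1929</code>.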
  
* (upper right/yellow part of the window) All archive nodes and documents are characterized by [[metadata]] that can be viewed in the upper right part of the window
+
=== The Place of VALEP within the Digital Humanities ===
  
* (lower right/blue part of the window) Files and documents can be watched in an integrated document viewer (already available) and they can be downloaded and printed (to be implemented in 2021)
+
VALEP's goal is the collection of large amounts of data. The history of Logical Empiricism, together with related currents, such as Neokantianism, French Positivism, British Empiricism, and American Pragmatism, comprises dozens of major and probably thousands of minor individuals, including academic and private scholars. Their papers can be found in public institutions and private collections, as can additional material collected by academic and private institutions. There are thousands of manuscripts and publications, and probably millions of letters, that might add significant research value in one way or another to our studies of Logical Empiricism. VALEP allows us to store, preserve, evaluate, and (in future implementations) also edit all these sources, as soon as they become available in digital form. One can organize, search, and filter them in VALEP according to one's needs and interests.
  
VALEP stores titles, descriptions and the like as Unicode. But some [[metadata]] categories that include date, location, language, persons, and institutions are stored here via references in a relational database and/or using special formats and parsing tools, e.g., EDTF for data, and an internal tool for the mereological grasp of locations. See the [[metadata]] page for the details.
+
=== VALEP's Innovative Approach ===
  
=== Is it digital humanities? ===
+
Existing tools for the management of archival sources include (1) tools that academic archives, such as the [https://digital.library.pitt.edu/collection/archives-scientific-philosophy Archives of Scientific Philosophy], provide; (2) open tools such as [https://philarchive.org/ PhilArchive], where anybody can upload electronic documents; (3) tools tailored to the presentation of material of a specific origin, such as the papers of [http://www.wittgensteinsource.org/ Ludwig Wittgenstein]. All these tools have in common that they are more or less '''document oriented'''. They do not mirror the physical structure of an archive, but rather store documents according to a particular unit of metadata. This approach can be fruitful if the documents are processed thoroughly and the associated metadata is clear, transparent, and sufficiently complex.
  
If one expects from a digital humanities project the adoption of sophisticated statistical methods of experimental research, then the answer is clearly no. Though the data pool being built by VALEP might in the future be used for the adoption of such methods, VALEP neither now nor in the near future is planning to integrate any tools for complex statistical evaluation.  
+
However, most of the existing tools include only rudimentary metadata, and, in the case of public archives, single documents are not processed as a logical unit (e.g. letter from Otto Neurath to Rudolf Carnap from December 26, 1934). Rather, the archive provides constructed units, e.g., folders that contain several letters from Carnap to Neurath from the years 1923 to 1929. Sometimes, these units might even include material unrelated to the main topic of the unit. In such cases, offering effective and meaningful metadata may not be possible at all, simply because the document units are too inconsistent.
  
On the other hand, VALEP is certainly aiming to collect large amounts of data. The history of Logical Empiricism, together with related currents such as Neokantianism, French Positivism, British Empiricism, and American Pragmatism, comprises of dozens of main figures and probably thousands of minor figures that include university and private scholars. The estates of many of these relevant figures are to be found in public institutions and private collections. Further material was collected by relevant universitarian and private institutions. There are thousands of manuscripts, publications, and probably millions of letters between representatives of the relevant currents that might be taken into account in one or another way, in our studies of Logical Empiricism. VALEP allows us to story any of these sources, as soon as we get them available in electronic form. Then, we can search them and filter them, in order to select the material that is relevant for us. This is, of course, also a variety of digital humanities.
+
=== Advantages of an Archive Oriented Presentation ===
  
=== Existing tools are document oriented and typically cover only rudimentary metadata ===
+
When a digital archive only provides rudimentary metadata and rather ambiguous documents, it might be helpful to present the digital material in a way that reflects the physical structure and arrangement of the archive. Archives typically structure their material into collections and subcollections (or series and subseries), shelves, boxes, folders, etc. This detailed structure already represents a certain order, e.g., distinguishing between manuscripts and correspondence, providing a chronological arrangement, and/or selecting topics or correspondence partners. Despite remaining inconsistencies, archive users are usually able to work with a mnemotechnical approach, often supported by the archival finding aids. Representing the digital material according to the physical arrangement of the analog archive might increase the transparency and usability of digital archival sources.
  
Existing tools for the management of archival sources include (1)  those tools that university archives such as the [https://digital.library.pitt.edu/collection/archives-scientific-philosophy Archives of Scientific Philosophy] use; (2) open tools such as [https://philarchive.org/ PhilArchive] where everybody might upload electronic documents; (3) tools being tailored for the presentation of the material of a specific origin such as the papers of [http://www.wittgensteinsource.org/ Ludwig Wittgenstein]. All these tools have in common that they are more or less strictly '''document oriented'''. They do not mirror the physical structure of an archive but rather store documents that form a particular unit of metadata. This approach could be fruitful, if the processing of the documents might be rather well developed and the metadata might be clear and transparent and sufficiently complex.  
+
=== The Logarithmic Scale of Archival Work ===
 +
[[File:Logarithmicscale.jpg|thumb]]
 +
Structuring a digital archive according to its physical organization in the analog world also yields a major pragmatic benefit. If digitized copies are stored in folders that mirror the structure of the analog archive, then you can upload the entire material, basically, with a single mouse click. The archival work involved is close to zero. On the other hand, if the archive is really large, as, for example, the Schlick Papers in Haarlem, which comprise some 50,000 pages, or the Carnap Papers at the ASP, which consist of close to 100,000 pages, then you can expect to process tens of thousands of documents. If an archivist processes, say, 20 documents per hour, this will eventually amount to several years of full-time work. We are thus comparing two extremes on the scale of gains divided by working time. Most archival resources won't be processed at such a micro level, for mere lack of time and funding. However, these sources can still be made available at the level of representation of the physical structure of the repository.
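The order of magnitude of the document-level effort can be checked with a short calculation. The Python sketch below takes the rate of 20 documents per hour from the text; the figure of 1,600 working hours per full-time year is an assumption of this sketch, and the function name <code>processing_effort</code> is hypothetical.

```python
def processing_effort(documents, docs_per_hour=20, hours_per_year=1600):
    """Return (hours, full-time years) needed to process the documents.

    docs_per_hour follows the example in the text; hours_per_year is an
    assumed value for one full-time position and should be adjusted.
    """
    hours = documents / docs_per_hour
    years = hours / hours_per_year
    return hours, years


# Upper-bound case: every page of a 100,000-page collection is treated
# as a separate document.
hours, years = processing_effort(100_000)
print(f"{hours:.0f} hours, about {years:.1f} full-time years")
```

Under these assumptions, 100,000 documents require 5,000 working hours, or a little more than three full-time years, whereas uploading the same material along its folder structure takes almost no archival work at all.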
  
However, the problem is that most of the existing tools cover only rather rudimentary metadata, and, in the case of the tools being used by public archives, the problem is often that they hardly process single documents as forming a logical unit of some kind (e.g. letter from Otto Neurath to Rudolf Carnap from December 26, 1934) but rather focus on those units being naturally provided by the archive, viz., folders that contain, e.g., several letters from Carnap to Neurath from the years 1923 to 1929 and sometimes might also include further material that does not directly relate to the main theme. In cases like that, complex metadata may not be possible at all, simply because the document units are too vague.
+
=== The Importance of [[Metadata]] ===
  
=== An archive oriented presentation might be helpful ===
+
Metadata can be needlessly complex and confusing. Therefore, a careful selection is important. For example, a document should only be associated with relevant metadata categories. Only a letter, for example, has a recipient or a place of posting, whereas a manuscript, unlike a published book or article, may not offer any publication date. So, one important aspect of making metadata adequate is restricting each document category to the metadata categories that are relevant to it.
  
In cases where a digital archive only covers rudimentary metadata and rather ambiguous documents, it might be most helpful to include a presentation of the digital material that represents the physical structure of an archive. Archives typically structure their material into collections and subcollections, shelfs, boxes, folders, and the like, and the finegrained structure of this organization of the material very often already represents a certain order, e.g., distinguishes between manuscripts and correspondence, puts some chronological order to the material and/or picks out certain topics or correspondence partners. Even if such an order is quite inconsistent and also covers pure chaos at times, users of an archive usually are able to use this order in a mnemotechnical way, often supported by useful finding aids that exist for an archive. Therefore, the most obvious way to make electronic archival sources more transparent and usable would be to add a perspective on the material that mirrors the physcial structure of the archive.
+
Metadata should have a consistent format. This holds especially for crucial metadata such as [[Metadata#Date|date]] and [[Metadata#Location|location]]. Dates should be able to cover not only (several) single days but also entire months or years, and date ranges, e.g. from December 24, 1924, until October 1930. Such an approach also covers cases where the date of a document is not clear or where a document was produced over a longer period and/or on different days or in different years. Locations, on the other hand, should be embedded into the mereological structure of geography. The fact that Vienna, for example, belongs to Austria and Europe but also to the Habsburg Empire and the German-speaking world is a complex relationship structure to represent, but it is necessary in order to find all Viennese locations when filtering documents for Vienna, Austria, Europe, the Habsburg Empire, or the German-speaking world. Finally, in other cases, such as individuals, institutions, and languages, an application can offer a precise search and filter function if relational features are established and the items are stored in predefined lists or tables. References to these predefined resources, however, should be optional, in order to keep the database structure as flexible as possible.
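The mereological embedding of locations can be pictured as a "part of" relation in which a place may have several parents at once. The following Python sketch is an illustration only and does not describe VALEP's internal tool; the table <code>PART_OF</code>, the sample documents, and the function names are hypothetical.

```python
# Hypothetical "part of" table. A location may belong to several larger
# regions, so the structure is a graph rather than a simple tree.
PART_OF = {
    "Vienna": {"Austria", "Habsburg Empire", "German-speaking world"},
    "Austria": {"Europe"},
    "Berlin": {"Germany", "German-speaking world"},
    "Germany": {"Europe"},
}


def containing_regions(location, part_of=PART_OF):
    """Return every region that contains the location, directly or not."""
    found = set()
    stack = [location]
    while stack:
        current = stack.pop()
        for parent in part_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def filter_by_region(documents, region, part_of=PART_OF):
    """Keep documents located in the region or in any place inside it."""
    return [
        doc for doc in documents
        if doc["location"] == region
        or region in containing_regions(doc["location"], part_of)
    ]


documents = [
    {"title": "Letter A", "location": "Vienna"},
    {"title": "Letter B", "location": "Berlin"},
]
```

Filtering these sample documents for "Habsburg Empire" keeps only Letter A, while filtering for "Europe" or "German-speaking world" keeps both, without any region other than the place of writing having to be entered for each document.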
  
=== The logarithmic scale of archival work ===
+
=== Archival Documents from Different Repositories: The Merging of Formats and [[#Versions|Versions]] ===
[[File:Logarithmicscale.jpg|thumb]]
+
An important aspect of merging several archival sources is that documents tend to be located not only in one folder/box held by one collection. Rather, we often find the following scenarios:
Note also that representing an archive in the way in which it is physically organized also yields a very strong pragmatic benefit. If a digitization of an archive is stored in folders that mirror the structure of the archive, then in VALEP one may upload the entire material, basically, with a single mouse click. The archival work being involved here is close to zero. On the other hand, if this archive is really large as, for example, the Schlick papers in Haarlem that comprise some 50,000 pages or the Carnap papers at the ASP that might come close to 100,000 pages, then one might expect that a carefull processing of the documents that belong to this archive might possibly need to process several tenthousands of documents. If an archivist, say, processes 20 documents per hour, this might amount at several years of full time work of an archivist. So, we compare here two extremely different points on the scale of gains divided by working time. Most of the available archival resources will never become processed into a fine grained document structure, for mere lack of time and financial resources. However, these sources might still be made available, at the level of representation of the physical structure of the source.  
+
* An original document is held in archive X, whereas copies are located in other archives, e.g., carbon copies of a letter kept by the sender (which might contain relevant information that the original letter does not provide)
 +
* Written duplicates, transcriptions, translations, and commentaries of a document might exist and be located at different archival repositories.
 +
* A document might consist of several parts or chapters that, in turn, might be scattered among different archives (some of them might be the original, some might be copies, written duplicates, transcriptions, etc.)
  
=== Adequate [[metadata]] are important ===
+
=== A Note on Copyright ===
  
Metadata can be needlessly complex and confusing. A careful selection is important. This includes that a document should be associated only with these metadata categories that might become relevant for it. Only a letter, for example, has a receiver or a place of posting, whereas a manuscript unlike a published book or article may not offer any publication data. So, one important aspect of making making metadata adequate is to restrict documents of a certain category to those metadata categories being relevant here.  
+
International copyright legislation generally permits open access to metadata, whereas reproductions may be published online only if (a) the copyright was granted to the publishers by the copyright holder(s), or (b) the legal situation allows for publication without explicit transfer of copyright. There are two typical scenarios for (b): (b-1) publication of a document is possible if all authors involved died more than 70 years ago, which places the material in the public domain; (b-2) publication is possible if the publishers can prove that the copyright holders could not be identified, despite all reasonable efforts to do so.
  
But metadata should also be carefully selected, regarding their format. This holds, in particular, for key metadata such as [[Metadata#Date|date]] and [[Metadata#Location|location]]. Dates should be able to cover not only (several) single days but also entire months or years, and date ranges, e.g. from December 24 1924 until October 1930. This allows one to cover also cases where the date of a document is not sufficiently localized or where a document was produced over a longer period and/or at different days or years. Locations, on the other hand, should become embedded into the mereological structure of geography. That Vienna, for example, belongs to Austria and Europe but also to the Habsburg Empire and the German speeking world, is a fact that is not easily to be reproduced but is needed in order to pick out all Viennese locations, if one filters documents from Vienna, Austria, Europe, the Habsburg Empire, or the German speeking world. Finally, in many other cases, e.g., regarding persons, institutions, languages, a consistently searchable and filterable layout is easily obtained if the database uses relational features and stores these items in certain predefined lists or tables. References to these predefined resources, however, should typically be optional, in order to keep the structure of a database as flexible as possible.  
+
=== Copyright Protection vs. Research Access: The Flexible VALEP Approach ===
  
=== Documents might have instances ([[#Versions|versions]], [[#Chapters|chapters]]) being spread over different archival sources ===
+
A logarithmic scale is emerging here. Prominent individuals, such as Carnap, Reichenbach, or Quine, could usually obtain copyright for their works easily. However, among their papers is also a wealth of material authored by others - letters TO Carnap, Quine, Reichenbach - or material that touches on privacy rights of others - e.g. Carnap discussing a specific person in a letter. Resolving all copyright issues unfolding from a Nachlass can turn into a tedious and unmanageable task. Therefore, it would be advantageous if a database could deal with these issues in a flexible way. Some material might be either removed from public access in its entirety - metadata plus reproductions - or public access will be granted (because metadata are unproblematic), but with restricted access to reproductions. Moreover, access can be restricted to the internal/non-public realm of the database, allowing access only to authorized staff.
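The three options described here (fully public, metadata public with restricted reproductions, and internal only) can be sketched as a small access model. The Python code below is a simplified illustration under stated assumptions, not VALEP's actual permission system; the names <code>Access</code>, <code>Record</code>, and <code>visible_parts</code> are hypothetical, and "authorized staff" is reduced to a single flag.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PUBLIC = "public"            # metadata and reproductions are public
    METADATA_ONLY = "metadata"   # metadata public, reproductions restricted
    INTERNAL = "internal"        # nothing public, authorized staff only


@dataclass
class Record:
    title: str
    access: Access


def visible_parts(record, is_authorized_staff=False):
    """Return which parts of a record the current user may see."""
    if is_authorized_staff or record.access is Access.PUBLIC:
        return {"metadata", "reproductions"}
    if record.access is Access.METADATA_ONLY:
        return {"metadata"}
    # INTERNAL records stay hidden from the public entirely.
    return set()
```

A letter whose copyright status is unresolved could then be stored as <code>Access.METADATA_ONLY</code>: the public sees its title, date, and author, while the scan itself remains visible only to authorized staff until the rights are cleared.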
  
An important aspect of the integration of several archival sources is that documents tend to be located not only in one folder/box that belongs to collection X. Rather, the following holds quite frequently:
+
=== Improving Research Tools Across Archival Repositories ===
* There is the original document being located in archive X, whereas copies are to be found in other archives, e.g., blueprints of an letter being kept by the sender (which might contain relevant information that the original letter does not provide)
 
* There are written duplicates of a document, transcriptions, and translations as well as commentaries that lie at very different archival locations.
 
* Finally, a document might disintegrate into several parts or chapters that, in turn, might be spread over different archives (some of them might be the orignal source, some might be copies, written duplicates, transcriptions, etc.)
 
  
=== A note on copyright ===
+
To sum up, the following features would improve existing archival tools:
 +
* representation of the physical structure of an archive (in order to support the mnemotechnical skills of researchers and make existing finding aids more useful)
 +
* concentrating on areas with high gains and low costs first, continuing with the rest -- high gains and high costs -- only when feasible through available funding
 +
* providing a flexible approach towards metadata categories, matching them with the pertinent document categories
 +
* ensuring that critical metadata categories, such as date and location, use a flexible, consistent and transparent format, together with suitable parsing tools (to avoid inconsistent entries)
 +
* implementing additional critical metadata via predefined lists and tables in a relational database setting, while keeping data fields optional whenever possible
 +
* providing efficient tools that enable the processing of related documents held at different institutions, which can exist in a variety of formats, versions, expressions, such as carbon copies, transcriptions, or annotated duplicates
 +
* allowing for parts of the published material to remain restricted - access to the metadata but not to the reproductions - or even restricting both metadata and reproductions, as long as copyright issues remain unsolved
  
The international copyright situation dictates that it is unproblematic, in principle, to make all varieties of metadata openly available, whereas facsimiles may be published online only if (α) the copyright was granted to the publishers by the copyright holders, or (β) there is a legal situation that allows publication without explicit transfer of copyright. (β) falls into two typical case types: (β-1) publication of a document is possible if all involved authors died at least 70 years ago, which makes the material public domain; (β-2) publication is possible if the publishers can prove that the copyright holders could not be identified though the publishers tried to find them in several reasonable ways.
+
=== VALEP as an Innovative Tool ===
  
=== Keep material internal as long as the copyright issues could not be positively resolved ===
+
Indeed, VALEP does provide all aforementioned features. From the start, the database design was centered around the idea of reflecting both the physical structure of an archive as well as its content. From this initial focus, all additional innovative features have naturally evolved. One example are the concepts of 'versions' and 'chapters'. They manage to intermediate between the realms of (general) documents and archives. The development of these concepts were in part inspired through feedback from archivists and the designer's own archival experience.
  
Another logarithmic scale is emerging here. It is often quite easy for big figures such as Carnap, Reichenbach, or Quine, to get copyrights granted for everything they wrote. But in their papers there is also a wealth of material that was written by others - letters TO Carnap, Quine, Reichenbach - or touches upon privacy rights of others - when Carnap talks in a letter ABOUT a person X. To solve all the copyright issues that emerge in a huge Nachlass might become a tedious and almost unmanagable task. Thereofore, it might be desirable that a an database enables to deal with these issues in a maximally flexible way. Material might be either kept internal in its entirety - metadata plus facsimiles - or it might become published (because metadata are unproblematic, in principle) but without public access to the facsimiles. Moreover, it should be possible to restrict access to the internal level of a database to those parts of the material that the account holder is allowed to see.
+
=== Who Can Use VALEP? ===
  
=== Desirable Features ===
+
VALEP is available to the general public, and it's free of charge in all its applications. Typical users of VALEP might include:
 +
* public and private institutions that hold material on the history of Logical Empiricism, using VALEP as a tool to publicize their sources and join them with other relevant material
 +
* private researchers wanting to utilize VALEP not only to distribute and merge their sources but also to preserve them for the future
 +
* researchers from all over the world who obtained digitized archival copies and are willing to share them with the research community, and/or want to process and organize their sources
  
Along the lines of these considerations, the following features would be desirable additions to the typical coverage of existing archival tools:
+
If you are interested in using VALEP as an institution, private individual, or researcher, please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].
* To cover the physical structure of an archive (in order to serve the mnemotechnical skills of researchers and make existing finding aids more useful)
 
* To provide parts with high gains and low costs first and add the rest -- very high gains and very high costs -- only in these cases where the existing resources make this possible
 
* To provide a flexible handling of metadata categories that tailor them to the required document categories
 
* To ensure that critical metadata categories such as date and location use a most flexible, consistent and transparent format, together with suitable parsing tools (that avoid inconsistent entries)
 
* To implement other critical metadata via predefined lists and tables in a relational database setting, while keeping data fields optional whenever possible
 
* To provide suitable tools that enable the processing of decentralized documents that disintegrate into several versions and chapters
 
* To enable keeping parts of published material restricted - access to the metadata but not to the facsimiles - or even keeping metadata and the facsimiles internal, as long as copyright issues remain unsolved
 
  
=== VALEP offers them ===
+
== Hosts, Supporters, and Funding ==
  
Indeed, the aforementioned features are all offered by VALEP. The design of this tool was from the beginning centered around the idea of combining representation of an archive via its physical structure with representation via documents. The rest of the innovative features of VALEP in part directly followed from this key idea -- this is true, for example, for the implementation of versions and chapters who somewhat intermediate beteween (general) documents and archives --, and in part dived into the conception on the basis of feedback from archivists and the designer's own experience at the archives.
+
VALEP is located at the server [https://valep.vc.univie.ac.at/ valep.vc.univie.ac.at], which is hosted by the [https://www.univie.ac.at/ University of Vienna]. The [https://wienerkreis.univie.ac.at/forschung/valep-virtual-archiv/ Institute Vienna Circle], is maintaining the server. Further financial support is provided by the following sources:
  
=== Who can use VALEP? ===
+
* FWF research grant [https://homepage.univie.ac.at/christian.damboeck/carnap_2018-2021/index.html P31716]: € 16,000 for programming in 2020 and 2021, € 2,000 for data processing in 2021
 +
* The [https://www.univie.ac.at/vcs/ Vienna Circle Society]: € 5,000 for programming in 2021
 +
* FWF research grant [https://www.fwf.ac.at/forschungsradar/10.55776/P34887 P34887]: about 20 percent of the entire funds, equalling to approximately € 120,000 for the development of VALEP
 +
* FWF Grant for Digital Publication [https://www.fwf.ac.at/forschungsradar/10.55776/PUD31 PUD 31-G]: € 50,000 (2023-2025) "Digital Edition of the Diaries of Rudolf Carnap 1908-1935".
 +
* FWF Grant for Digital Publication [https://www.fwf.ac.at/forschungsradar/10.55776/PUD39 PUD 39-G]: € 50,000 (2024-2026) "Otto Neurath. Manuscripts and Correspondence".
  
VALEP is available to everybody and it's free of charge in all its varieties. Typical users of VALEP might include:
+
== Future Developments ==
* Public and private institutions that house material on the history of Logical Empiricism and want to use VALEP as a tool that helps them to distribute their sources and integrate them with other relevant material
 
* Private persons that hold collections being relevant for the history of Logical Empiricism and want to use VALEP not just to distribute and integrate their sources but also to safeguard them for the future
 
* Researchers from all over the world who digitized material in the archives and are willing to share this with others and/or want to use VALEP as a tool that allows them to process and better organize their sources
 
If you are interested in using VALEP as an institution, private person, or researcher, please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].
 
  
== Future prospects ==
+
The first version of VALEP was developed in 2020/21. In 2022 we added several bugfixes as well as the integration of VALEP with [https://phaidra.univie.ac.at/ Phaidra] as a mirror for all VALEP data and the possibility to add documents as subdocuments to other documents and therefore setup arbitrarily complex hierarchical structures inside of documents.
  
The recent (and first) version of VALEP was developed in 2020/21. Until fall 2021 we plan to implement, among other things, the following additional features:
 
* Persistent links to all documents, versions, files, and nodes of the archive tree
 
* Possibility to assign DOIs to general documents
 
  
Features to be implemented in 2022 (preliminary list)
+
Future plans include:
* Integration of [https://phaidra.univie.ac.at/ Phaidra]: each published VALEP object becomes stored in Phaidra
+
* traffic protocols, display of internal data and statistics for each VALEP element
* The possibility to selectively restore deleted VALEP objects  
+
* integration of digital representations of texts and comments using LaTeX, XML, and a Git hub
* Possibilities to mark objects in VALEP with flags, together with advanced filter tools
+
* WikiData integration for persons, institutions, locations, and other metadata items
* Bundles of documents can be loaded to the file viewer
+
* more powerful filter tools that also allow to search the archive tree
* The sequence of jpgs that is loaded to the file viewer can be downloaded as a pdf
+
* option to selectively restore deleted VALEP objects  
* In the internal view (construction site) the nested content of any node of the archive tree can be downloaded to the local computer
+
* option of flagging objects in VALEP, together with advanced filter tools
* For each node of the archive tree the number of files that belong to this node becomes displayed
+
* loading bundles of documents into the file viewer
 +
* downloading JPGs within the file viewer as PDFs
 +
* option to download the nested content of any node of the archive tree to a local computer
 +
* XML and LaTeX download
 +
* displaying the number of files belonging to each node of the archive tree  
  
If you found any bugs, want to report shortcomings of VALEP or point to desired features, please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].
+
If you encounter any bugs, want to report problems, or have any feedback or suggestions, please contact [mailto:christian.damboeck@univie.ac.at Christian Damböck].

Latest revision as of 07:31, 8 October 2024

VALEP was launched in May 2021, with announcements on, e.g., Daily Nous, Leiter Report, and the IVC website. The development team includes Christian Damböck (design), Catherine Schlienger and Maximilian Damböck (programming), Helena Müller (UX Design), Brigitta Arden, Lucas Baccarat, Philipp Leon Bauer, Roman Jordan, Lois M. Rendl, and Miguel De la Riva (data processing).

VALEP and Phaidra

All VALEP data are mirrored in the University of Vienna's digital repository Phaidra. The staff and consultants of Phaidra support the development of VALEP, including aspects of archival science (Susanne Blumesberger), database design and the technical integration of Phaidra (Raman Ganguly), copyright issues (Seyavash Amini Khanimani), and all details of the metadata design of VALEP (Rastislav Hudak).

Cooperation Partners of VALEP

We are looking for cooperation partners among international archives that hold material on the history of Logical Empiricism. If you are interested in cooperating with VALEP as an institution, or if you are simply using VALEP for your own research and are interested in depositing your collected material, please contact Christian Damböck.

Archives of Scientific Philosophy (ASP), University of Pittsburgh Library System

The Archives of Scientific Philosophy of the University of Pittsburgh Library System have been sharing their electronic resources with VALEP. This includes scans produced by the ASP of the Papers of Rudolf Carnap, Carl Gustav Hempel, Richard C. Jeffrey, Hans Reichenbach, Frank Plumpton Ramsey, and Rose Rand. The material comprises about 30,000 scans and is already available in VALEP. We would like to thank Ed Galloway for his most generous support and Clinton T. Graham for transferring the files.

Brenner Archiv, University of Innsbruck

The Brenner Archiv, which houses the Nachlass of Wolfgang Stegmüller, has made parts of his correspondence, as well as his dissertation and habilitation thesis, available in VALEP. The material comprises more than 7,000 scans and includes correspondence with philosophers such as Hans Albert, Max Black, Rudolf Carnap, Herbert Feigl, Carl-Gustav Hempel, Thomas Kuhn, Paul Lorenzen, and Joseph Sneed. Thanks go to Michael Schorner for preparing the scans and to Ulrike Tanzer for granting permission to put the material online.

Moritz Schlick Forschungsstelle, University of Rostock

In 2022 VALEP entered into a cooperation with the Moritz Schlick Forschungsstelle in the form of a formal agreement between the Universities of Rostock and Vienna. As part of the cooperation, the Moritz Schlick Forschungsstelle will upload, step by step, scans of the entire Moritz Schlick Nachlass, which comprises more than 50,000 pages. These scans of the Nachlass, which is located at the Noord-Hollands Archief in Haarlem, were prepared by the Moritz Schlick Forschungsstelle as part of their Moritz Schlick Edition Project. The Moritz Schlick Forschungsstelle will make the scans available in VALEP and will also process the files into documents. Future plans also include an electronic edition of the published and unpublished works of Moritz Schlick. Thanks go to Matthias Wunsch and Martin Lemke for their generous support and fruitful cooperation.

The Wittgenstein Archives at the University of Bergen (WAB)

The WAB agreed to share material from its Bergen Nachlass Edition (BNE) of the estate of Ludwig Wittgenstein, which is currently available at Wittgenstein Source, and to make it available in VALEP. This will include scans of the manuscripts and, at a later stage of the cooperation, also the XML transcriptions of the works from the Wittgenstein Nachlass. Thanks go to Alois Pichler for his generous support and to the Wren Library at Trinity College Cambridge for granting permission to upload the material to VALEP.

The Technical Implementation of VALEP

VALEP uses the following system architecture

  • A backend that is implemented in Java (Spring Boot) and uses a PostgreSQL database
  • Additional services that include
    • an EDTF-Service (NodeJS) for parsing data entries
    • (in future implementations) services for Data Import (LaTeX, external SQL databases) inside of the Java backend
    • (in future implementations) services for transfer from LaTeX to TEI-XML and HTML (written in Python)
  • Local users communicate with the backend via an Apache web server (Angular) and Keycloak (OAuth2) user access management

VALEP uses the following strategies for data security and long-term availability of data

  • Daily data backup with the standard tools provided by the University of Vienna
  • Regular data backup to non-local computers
  • Persistent links via DOI
  • Use of standard formats such as jpg, tiff, pdf, mp3, mp4 for electronic files and LaTeX, HTML, TEI-XML for electronic editions
  • Mirroring of all data at Phaidra
  • (in future implementations) integration of data on persons, institutions, locations, etc. in existing online databases such as Wikidata

The Scope and Mission of VALEP

VALEP-window-public.jpg

VALEP is an archive management and edition tool that is intended as a platform for the history of Logical Empiricism and related currents. (Presentation from September 2022)

VALEP functionalities:

  • (left/red part of the window) representation of the hierarchical structure of an archive, including archives, collections, digital reproductions, shelves, boxes, folders, files
  • (middle/green part of the window) transformation of archival material into objects that belong to a certain document category, document type and that are uniquely identified by metadata that include title, description, author, date
  • (upper right/yellow part of the window) archive nodes and documents are characterized by metadata that can be accessed in the upper right section of the screen
  • (lower right/blue part of the window) Files, documents, and editions can be viewed in an integrated document viewer (already available) and the objects can be downloaded and printed (to be implemented in 2021)
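
The hierarchical representation in the left part of the window can be pictured as a simple tree. The following is a minimal sketch under stated assumptions, not VALEP's actual data model: the class `ArchiveNode`, its fields, and all labels and file names are hypothetical and serve only to illustrate how files attach to nested nodes.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveNode:
    """One node of a hypothetical archive tree (archive, collection, box, folder, ...)."""
    label: str
    kind: str  # e.g. "archive", "box", "folder"
    files: list[str] = field(default_factory=list)
    children: list["ArchiveNode"] = field(default_factory=list)

    def file_count(self) -> int:
        """Count the files attached to this node and to all nodes below it."""
        return len(self.files) + sum(child.file_count() for child in self.children)


# Illustrative data only; the labels and file names are invented.
folder_a = ArchiveNode("Folder 1", "folder", files=["001.jpg", "002.jpg"])
folder_b = ArchiveNode("Folder 2", "folder", files=["003.jpg"])
box = ArchiveNode("Box 1", "box", children=[folder_a, folder_b])
archive = ArchiveNode("Example Archive", "archive", children=[box])

print(archive.file_count())
```

A recursive count of this kind is one way to obtain the number of files belonging to a node of the archive tree.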

VALEP records titles, descriptions and the like as Unicode text. But some metadata categories, including date, location, language, persons, and institutions, are stored in a relational database and/or using special formats and parsing tools, e.g., EDTF for dates, and an internal tool for the mereological structure of locations. See the metadata page for the details. In future implementations (presumably coming in 2024/25) VALEP will integrate its data on persons, institutions, and locations with external resources such as Wikidata.
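
To give an impression of what a date parser for EDTF-style entries has to do, here is a minimal sketch in Python. It is not VALEP's EDTF service (which runs on NodeJS) and covers only a small subset of EDTF: plain years, year-month values, full dates, and `start/end` intervals. The function names `bounds` and `interval_bounds` are hypothetical.

```python
import calendar
import re
from datetime import date

# Accepts "YYYY", "YYYY-MM", or "YYYY-MM-DD" only (a small subset of EDTF).
_PATTERN = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def bounds(expr: str) -> tuple[date, date]:
    """Return the earliest and latest day covered by a year, year-month, or full date."""
    match = _PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unsupported date expression: {expr!r}")
    year = int(match.group(1))
    month = int(match.group(2)) if match.group(2) else None
    day = int(match.group(3)) if match.group(3) else None
    if month is None:
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    exact = date(year, month, day)
    return exact, exact


def interval_bounds(expr: str) -> tuple[date, date]:
    """Handle 'start/end' intervals as well as single expressions."""
    if "/" in expr:
        start, end = expr.split("/", 1)
        return bounds(start)[0], bounds(end)[1]
    return bounds(expr)


print(interval_bounds("1924-12-24/1930-10"))
```

Reducing every entry to an earliest and a latest day is one simple way to make imprecise dates comparable and filterable; a real EDTF implementation supports many more forms, such as uncertain or approximate dates.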

The Place of VALEP within the Digital Humanities

VALEP's goal is the collection of large amounts of data. The history of Logical Empiricism, together with related currents, such as Neokantianism, French Positivism, British Empiricism, and American Pragmatism, comprises dozens of major and probably thousands of minor individuals, including academic and private scholars. Their papers can be found in public institutions and private collections, alongside additional material collected by academic and private institutions. There are thousands of manuscripts and publications, and probably millions of letters, that might add significant research value in one way or another to our studies of Logical Empiricism. VALEP allows us to store, preserve, evaluate, and (in future implementations) also edit all these sources, as soon as they become available in digital form. One can organize, search and filter them in VALEP according to one's needs and interests.

VALEP's Innovative Approach

Existing tools for the management of archival sources include (1) tools that academic archives, such as the Archives of Scientific Philosophy, provide; (2) open tools such as PhilArchive, where anybody can upload electronic documents; (3) tools tailored to the presentation of material of a specific origin, such as the papers of Ludwig Wittgenstein. All these tools have in common that they are more or less document oriented. They do not mirror the physical structure of an archive, but rather store documents according to a particular unit of metadata. This approach can be fruitful if the documents are processed thoroughly and the associated metadata is clear, transparent and sufficiently complex.

However, most of the existing tools include only rudimentary metadata, and, in the case of public archives, single documents are not processed as a logical unit (e.g., a letter from Otto Neurath to Rudolf Carnap from December 26, 1934); rather, the archive provides constructed units, e.g., folders that contain several letters from Carnap to Neurath from the years 1923 to 1929. Sometimes, they might even include material unrelated to the main topic of the unit. In such cases, offering effective and meaningful metadata may not be possible at all, simply because the document units are too inconsistent.

Advantages of an Archive Oriented Presentation

When a digital archive only provides rudimentary metadata and rather ambiguous documents, it might be helpful to present the digital material in a way that reflects the physical structure and arrangement of the archive. Archives typically structure their material into collections and subcollections (or series and subseries), shelves, boxes, folders, etc. This detailed structure already represents a certain order, e.g., distinguishing between manuscripts and correspondence, providing a chronological arrangement, and/or selecting topics or correspondence partners. Despite remaining inconsistencies, archive users are usually able to work with a mnemotechnical approach, often supported by the archival finding aids. Representing the digital material according to the physical arrangement of the analog archive might increase the transparency and usability of digital archival sources.

The Logarithmic Scale of Archival Work

Logarithmicscale.jpg

Structuring a digital archive according to its physical organization in the analog world also yields a major pragmatic benefit. If digitized copies are stored in folders that mirror the structure of the analog archive, then you can upload the entire material, basically, with a single mouse click. The archival work involved is close to zero. On the other hand, if the archive is really large, as, for example, the Schlick Papers in Haarlem, which comprise some 50,000 pages, or the Carnap Papers at the ASP, which consist of close to 100,000 pages, then you can expect to process tens of thousands of documents. If an archivist processes, say, 20 documents per hour, this will eventually amount to several years of full-time work. We are thus comparing two extremes on the scale of gains divided by working time. Most archival resources won't be processed at such a micro level, for mere lack of time and funding. However, these sources can still be made available at the level of representation of the physical structure of the repository.

The Importance of Metadata

Metadata can be needlessly complex and confusing. Therefore, a careful selection is important. For example, a document should only be associated with relevant metadata categories. Only a letter, for example, has a recipient or a place of posting, whereas a manuscript, unlike a published book or article, may not offer any publication date. So, one important aspect of making metadata adequate is restricting each document category to the metadata relevant to that category.
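
The idea of restricting each document category to its relevant metadata can be sketched as a simple lookup. This is an illustration under stated assumptions only: the mapping `ALLOWED_FIELDS`, the field names, and the function `irrelevant_fields` are hypothetical and do not reproduce VALEP's actual metadata schema.

```python
# Hypothetical mapping from document category to the metadata fields it admits.
ALLOWED_FIELDS = {
    "letter": {"title", "author", "date", "recipient", "place_of_posting"},
    "manuscript": {"title", "author", "date"},
    "publication": {"title", "author", "date", "publication_date"},
}


def irrelevant_fields(category: str, metadata: dict[str, str]) -> set[str]:
    """Return the metadata keys that are not admitted for the given category."""
    return set(metadata) - ALLOWED_FIELDS[category]


# A manuscript has no recipient, so that key is reported as irrelevant.
print(irrelevant_fields("manuscript", {"title": "Draft", "recipient": "R. Carnap"}))
```

An input form built on such a mapping would simply not offer a recipient field for manuscripts, which keeps entries consistent without forcing every field on every document.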

Metadata should have a consistent format. This is especially crucial for metadata such as date and location. Dates should be able to cover not only (several) single days but also entire months or years, and date ranges, e.g., from December 24, 1924, until October 1930. Such an approach also covers cases where the date of a document is not clear or where a document was produced over a longer period and/or on different days or in different years. Locations, on the other hand, should be embedded into the mereological structure of geography. The fact that Vienna, for example, belongs to Austria and Europe but also to the Habsburg Empire and the German-speaking world is a complex relationship structure to represent, but a necessary one if all Viennese locations are to be found when filtering documents for Vienna, Austria, Europe, the Habsburg Empire, or the German-speaking world. Finally, in other cases, such as individuals, institutions, and languages, an application can offer precise search and filter functions when relational features are established and stored in predefined lists or tables. References to these predefined resources, however, should be optional, in order to keep the database structure as flexible as possible.
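
The part-whole structure of locations described above can be sketched as a small graph in which a place may belong to several wholes at once. This is a minimal illustration, not VALEP's internal location tool: the table `PART_OF`, the functions `wholes` and `filter_by_region`, and the document ids are hypothetical.

```python
# Hypothetical "part of" relation; a place may belong to several wholes.
PART_OF = {
    "Vienna": {"Austria", "Habsburg Empire", "German-speaking world"},
    "Austria": {"Europe"},
    "Prague": {"Europe"},
}


def wholes(place: str) -> set[str]:
    """All regions that contain `place`, directly or indirectly, plus the place itself."""
    found = {place}
    stack = [place]
    while stack:
        for parent in PART_OF.get(stack.pop(), set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def filter_by_region(documents: dict[str, str], region: str) -> list[str]:
    """Return the ids of documents whose location lies within `region`."""
    return sorted(doc for doc, place in documents.items() if region in wholes(place))


# Invented document ids mapped to their locations.
documents = {"doc-1": "Vienna", "doc-2": "Prague"}
print(filter_by_region(documents, "Europe"))
print(filter_by_region(documents, "Habsburg Empire"))
```

Because containment is followed transitively, a document located in Vienna is found when filtering for Europe even though only the link from Austria to Europe is stored explicitly.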

Archival Documents from Different Repositories: The Merging of Formats and Versions

An important aspect of merging several archival sources is that documents tend to be located not only in one folder/box held by one collection. Rather, we often find the following scenarios:

  • An original document is held in archive X, whereas copies are located in other archives, e.g., carbon copies of a letter kept by the sender (which might contain relevant information that the original letter does not provide)
  • Written duplicates, transcriptions, and translations of a document, as well as commentaries on it, might exist and be located at different archival repositories.
  • A document might consist of several parts or chapters that, in turn, might be scattered among different archives (some of them might be the original, some might be copies, written duplicates, transcriptions, etc.)
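
The scenarios above suggest a data model in which one logical document is divided into chapters, and each chapter can exist in several versions held at different repositories. The following sketch is an assumption about how such a model could look, not a description of VALEP's database: the classes `Version`, `Chapter`, and `Document`, their fields, and the archive names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    kind: str     # e.g. "original", "carbon copy", "transcription"
    archive: str  # repository that holds this version


@dataclass
class Chapter:
    title: str
    versions: list[Version] = field(default_factory=list)


@dataclass
class Document:
    title: str
    chapters: list[Chapter] = field(default_factory=list)

    def archives(self) -> set[str]:
        """All repositories that hold at least one version of some chapter."""
        return {v.archive for c in self.chapters for v in c.versions}


# Invented example: a letter whose original and carbon copy sit in two archives.
letter = Document(
    "Example letter",
    chapters=[
        Chapter(
            "Main text",
            versions=[
                Version("original", "Archive X"),
                Version("carbon copy", "Archive Y"),
            ],
        )
    ],
)
print(sorted(letter.archives()))
```

Keeping versions below chapters lets a single document record point to material scattered across repositories without duplicating the document's own metadata.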

A Note on Copyright

International copyright legislation generally permits open access to metadata, whereas reproductions may be published online only if (a) the copyright was granted to the publishers by the copyright holder(s), or (b) the legal situation allows for publication without explicit transfer of copyright. There are two typical scenarios for (b): (b-1) publication of a document is possible if all authors involved died more than 70 years ago, which places the material in the public domain; (b-2) publication is possible if the publishers can prove that the copyright holders could not be identified, despite all reasonable efforts to do so.

Copyright Protection vs. Research Access: The Flexible VALEP Approach

A logarithmic scale is emerging here, too. For prominent individuals, such as Carnap, Reichenbach, or Quine, it is usually easy to obtain the rights to everything they wrote. However, among their papers is also a wealth of material authored by others - letters TO Carnap, Quine, Reichenbach - or material that touches on the privacy rights of others - e.g., Carnap discussing a specific person in a letter. Resolving all copyright issues that arise from a Nachlass can turn into a tedious and almost unmanageable task. Therefore, it is advantageous if a database can deal with these issues in a flexible way. Material might either be kept from public access in its entirety - metadata plus reproductions - or be published (because metadata are unproblematic in principle), but with restricted access to the reproductions. Moreover, access to the internal/non-public realm of the database can be restricted to authorized staff.
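
The three situations described here (everything internal, metadata public but reproductions restricted, everything public) can be sketched as a small decision function. This is an illustration only, not legal advice and not VALEP's access logic: the enum `Access`, the function `access_level`, and its two flags are hypothetical.

```python
from enum import Enum


class Access(Enum):
    INTERNAL = "metadata and reproductions restricted"
    METADATA_ONLY = "metadata public, reproductions restricted"
    PUBLIC = "metadata and reproductions public"


def access_level(rights_cleared: bool, metadata_cleared: bool = True) -> Access:
    """Pick the most open level that the (hypothetical) clearance flags allow."""
    if not metadata_cleared:
        # E.g. privacy concerns that affect even the description of an item.
        return Access.INTERNAL
    if rights_cleared:
        return Access.PUBLIC
    return Access.METADATA_ONLY


# A letter TO Carnap by an author whose rights are still unresolved:
print(access_level(rights_cleared=False).value)
```

Defaulting to the metadata-only level reflects the point made above that metadata are unproblematic in principle, while reproductions require cleared rights.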

Improving Research Tools Across Archival Repositories

To sum up, the following features would improve existing archival tools:

  • representation of the physical structure of an archive (in order to support the mnemotechnical skills of researchers and make existing finding aids more useful)
  • concentrating on areas with high gains and low costs first, continuing with the rest -- high gains and high costs -- only when feasible through available funding
  • providing a flexible approach towards metadata categories, matching them with the pertinent document categories
  • ensuring that critical metadata categories, such as date and location, use a flexible, consistent and transparent format, together with suitable parsing tools (to avoid inconsistent entries)
  • implementing additional critical metadata via predefined lists and tables in a relational database setting, while keeping data fields optional whenever possible
  • providing efficient tools that enable the processing of related documents held at different institutions, which can exist in a variety of formats, versions, expressions, such as carbon copies, transcriptions, or annotated duplicates
  • allowing for parts of the published material to remain restricted - access to the metadata but not to the reproductions - or even restricting both metadata and reproductions, as long as copyright issues remain unsolved

VALEP as an Innovative Tool

Indeed, VALEP provides all of the aforementioned features. From the start, the database design was centered around the idea of reflecting both the physical structure of an archive and its content. From this initial focus, all additional innovative features have naturally evolved. One example is the pair of concepts 'versions' and 'chapters', which mediate between the realms of (general) documents and archives. The development of these concepts was in part inspired by feedback from archivists and by the designer's own archival experience.

Who Can Use VALEP?

VALEP is available to the general public, and it's free of charge in all its applications. Typical users of VALEP might include:

  • public and private institutions that hold material on the history of Logical Empiricism, using VALEP as a tool to publicize their sources and join them with other relevant material
  • private researchers wanting to utilize VALEP not only to distribute and merge their sources but also to preserve them for the future
  • researchers from all over the world who obtained digitized archival copies and are willing to share them with the research community, and/or want to process and organize their sources

If you are interested in using VALEP as an institution, private individual, or researcher, please contact Christian Damböck.

Hosts, Supporters, and Funding

VALEP is located at the server valep.vc.univie.ac.at, which is hosted by the University of Vienna. The Institute Vienna Circle maintains the server. Further financial support is provided by the following sources:

  • FWF research grant P31716: € 16,000 for programming in 2020 and 2021, € 2,000 for data processing in 2021
  • The Vienna Circle Society: € 5,000 for programming in 2021
  • FWF research grant P34887: about 20 percent of the entire funds, amounting to approximately € 120,000 for the development of VALEP
  • FWF Grant for Digital Publication PUD 31-G: € 50,000 (2023-2025) "Digital Edition of the Diaries of Rudolf Carnap 1908-1935".
  • FWF Grant for Digital Publication PUD 39-G: € 50,000 (2024-2026) "Otto Neurath. Manuscripts and Correspondence".

Future Developments

The first version of VALEP was developed in 2020/21. In 2022 we added several bugfixes, the integration of VALEP with Phaidra as a mirror for all VALEP data, and the possibility to add documents as subdocuments to other documents and thus set up arbitrarily complex hierarchical structures inside documents.


Future plans include:

  • traffic protocols, display of internal data and statistics for each VALEP element
  • integration of digital representations of texts and comments using LaTeX, XML, and a Git hub
  • WikiData integration for persons, institutions, locations, and other metadata items
  • more powerful filter tools that also allow searching the archive tree
  • option to selectively restore deleted VALEP objects
  • option of flagging objects in VALEP, together with advanced filter tools
  • loading bundles of documents into the file viewer
  • downloading JPGs within the file viewer as PDFs
  • option to download the nested content of any node of the archive tree to a local computer
  • XML and LaTeX download
  • displaying the number of files belonging to each node of the archive tree

If you encounter any bugs, want to report problems, or have any feedback or suggestions, please contact Christian Damböck.