The National Archives Catalog has recently introduced a full-text search feature in its catalog and all-collections search, making it easier to search the text within records. This feature is part of broader digitization and modernization efforts [2].
In addition, the National Archives is accelerating the release of large batches of digitized records, such as the recent release of over 230,000 pages of documents related to the MLK assassination [1].
To stay updated on new features and releases from the National Archives Catalog, it is recommended to follow the National Archives news and announcements directly on their official site (archives.gov) or sign up for newsletters. For example, internship newsletters highlight holdings in the catalog and transcription projects that reach over 400,000 subscribers, indicating such newsletters are an existing channel for updates [4].
The Citizen Archivist program is a 100% online volunteer program with no deadlines or assignments. Users can transcribe individual pages of a record, but not the entire record itself [3]. Transcribing text increases word searchability within the Catalog [5].
When transcribing, users should aim to transcribe exactly what they see in the document, leaving any misspellings as they appear in the original and adding the correct spelling next to each in brackets [6]. If a page is too hard to read, it's best to delete the transcription so others can give it a try [7].
Common issues to fix in extracted text include hyphenated words, stray marks, format, and line breaks [8]. If a page is only available as a large PDF, users can email the National Archives at [email protected] to request individual pages [9].
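As an illustration of the hyphenation and line-break fixes mentioned above, here is a minimal Python sketch. It is not a Catalog tool: the function name `tidy_extracted_text` and the idea of cleaning text offline before pasting it into the transcription panel are assumptions made for this example. Stray marks and formatting problems still need a human eye.

```python
import re


def tidy_extracted_text(text: str) -> str:
    """Hypothetical helper: clean common artifacts in extracted text."""
    # Rejoin words split by a hyphen at a line end: "docu-\nment" -> "document".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Turn single line breaks into spaces; blank lines stay as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs into one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


if __name__ == "__main__":
    sample = "The docu-\nment was\nsigned.\n\nNext  page"
    print(tidy_extracted_text(sample))
```

Note that this sketch also removes the hyphen from genuine compounds that happen to break at a line end (for example "well-" followed by "known"), so the result should always be checked against the page image.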
Transcribed records are manually reviewed and removed from the missions so that other users don't have to click too far to find a record that they can work on [10]. Records in foreign languages can be transcribed, but the Catalog does not have a field to sort by language [11].
Tags allow Citizen Archivists to add keywords to a record to make it more searchable. Proper names, locations, and subjects make good tags, especially when they're not already in the title or scope and content note [12]. Comments are meant to be information from outside sources, such as a genealogist adding more information about the individual not found in the record [13].
Maps can be transcribed, but you are not expected to transcribe everything. Instead, consider transcribing the legend and any titles or other text on the map rather than the place names [14]. If a page is a duplicate, it is suggested to add [duplicate page] in the transcription [15].
The program is open to anyone in the world, regardless of citizenship [16]. For best performance, use Safari, Google Chrome, or Microsoft Edge web browsers to view the Catalog [17]. If you encounter an error or find a typo in the Catalog, email the National Archives at [email protected] and include the URL of the exact page [18].
National Archives Catalog user accounts are managed through the government system, Login.gov, which uses two-factor authentication and stronger passwords [19]. If you have difficulty with the login.gov portion of your account, please see https://login.gov/help/ [20].
When transcribing a signature, please add - [signature] if you can read the name, or [signature] [illegible] if you cannot [21]. If a PDF is the only version of the record available, it can be transcribed, but large PDFs are not advised [22]. Featured Records are single records that need to be transcribed and are sorted by difficulty level [23].
Citizen Archivist events can work well in a classroom, as a library program, a community volunteer event, and more. Please contact the Community Managers at [email protected] for instructions for your event [24]. Adding tags or transcriptions to records helps improve search results and unlocks sometimes difficult-to-read text for all to understand [25].
If you have a question that isn't answered here, contact us at [email protected] [26]. If a form has handwritten answers, you should transcribe all the words, both the text on the form and the handwritten answers [27]. Once you start transcribing a page, no other Citizen Archivist can work on it until it has been inactive for 60 minutes [28].
When available, OCR (Optical Character Recognition) and Artificial Intelligence are applied to records, but this extracted text is seldom as accurate as manual transcription, especially when ink bleeds through from the other side of the page, when there are stamps and markings, and, in many cases, when a record mixes cursive and typewritten text [29]. If a page is blank, it is suggested to add [blank page] in the transcription panel [30].
If you reside outside the United States, you may have to select Back Up codes as your authentication method with login.gov [31]. When transcribing a seal, please add - [seal] and transcribe what it says at the end of the page [32]. For more information on how to formulate searches, please visit Search Tips [33].
The purpose of transcription is to increase word searchability within the Catalog; altering the format will not help searchability [35]. The comment field is not the place for words or phrases better suited as tags or to add a transcription [36].
- The National Archives Catalog applies Optical Character Recognition (OCR) and artificial intelligence to its records, but manual transcription is often more accurate, particularly for records with ink bleed-through, stamps, and a mix of cursive and typewritten text.
- Artificial intelligence is a valuable component of modernization efforts at the National Archives, including the newly introduced full-text search feature in the National Archives Catalog and all-collections search, which makes it easier to search the text within records.