Tools for Tibetan | Diamond Cutter Classics
A collection of useful tools for working with Tibetan texts: Gofer (macOS + Windows), Hypercontext, Translit, XLitToTibetan, and TibetanEnglishDictionary.
VOSviewer is a software tool for constructing and visualizing bibliometric networks. These networks may for instance include journals, researchers, or individual publications, and they can be constructed based on citation, bibliographic coupling, co-citation, or co-authorship relations. VOSviewer also offers text mining functionality that can be used to construct and visualize co-occurrence networks of important terms extracted from a body of scientific literature.
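As a rough illustration of what a term co-occurrence network is, the sketch below counts how often two terms appear in the same document and treats each count as an edge weight. It is not VOSviewer's own method or API; the function name `cooccurrence_edges` and the sample term lists are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    """Count, for every pair of terms, the number of documents containing both.

    `documents` is a list of term lists. Hypothetical helper for illustration,
    not part of VOSviewer.
    """
    edges = Counter()
    for terms in documents:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges

# Invented sample data: each inner list stands for the terms of one document.
docs = [
    ["citation", "network", "journal"],
    ["citation", "network"],
    ["journal", "author"],
]
edges = cooccurrence_edges(docs)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Each key in `edges` is a pair of terms and each value is the edge weight, which is the kind of weighted edge list that network tools generally take as input.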
Bookends is a full-featured bibliography, reference, and information management system for students and professionals. Bookends requires macOS 10.13 or later. A highly configurable, interactive, and editable interface lets you work with reference information the way you want. View Groups or Term Lists (Authors, Keywords, etc.) on the left. In the concise reference view on the right, arrange fields in any order, show just the ones that you find useful, and label them as you like. Editing or entering information is a single click away. Show attachments (pdfs, text files, images, etc.), or use the reference’s URL to show live web pages of its contents. Notecards let you enter, edit, and rearrange your thoughts, and make citing pages in footnotes a snap. Tag clouds let you visualize your terms and word use, and quickly tunnel down to the references you want.
The Programming Historian
We publish novice-friendly, peer-reviewed tutorials that help humanists learn a wide range of digital tools, techniques, and workflows to facilitate research and teaching. We are committed to fostering a diverse and inclusive community of editors, writers, and readers.
The Digital Orientalist
Practical examples and theoretical reflections on the do's and don'ts of using digital tools for your study and research in African and Asian Studies.
Multilingual Mac
Devoted to tips and other info on how to use your Mac to read and write languages other than English
Sublime Text
The sophisticated text editor for code, markup and prose. Available on Mac, Windows and Linux.
AntConc | Laurence Anthony
AntConc is a freeware corpus analysis toolkit for concordancing and text analysis, available for Windows, Linux, and Macintosh OS X. It is developed by Laurence Anthony, Professor at Waseda University, Japan.
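Concordancing means listing every occurrence of a search word together with its surrounding context (keyword in context, or KWIC). The following is a minimal sketch of that idea in plain Python, not AntConc's implementation; the function `kwic`, its `width` parameter, and the sample sentence are assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, hit, right context) for each match of `keyword`.

    `width` is the number of tokens kept on each side. Hypothetical helper,
    not AntConc's API.
    """
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Invented sample text.
sample = "The monk read the text. The text was copied by another monk."
for left, hit, right in kwic(sample, "text", width=2):
    # Right-align the left context so the hits line up in a column.
    print(f"{left:>20} [{hit}] {right}")
```

Because `\w` matches Unicode word characters in Python 3, the same tokenizer handles many non-Latin scripts that separate words with spaces; scripts without word spacing, such as Tibetan, would need a dedicated segmenter first.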
SDL Trados Studio
SDL Trados Studio Freelance is SDL's flagship program, built on translation memory. It offers a complete environment for professional translators, who can edit and proofread projects, use approved terminology, and take advantage of machine translation tools in a single, simple desktop application. If you choose SDL Trados Studio, you become part of the largest translation community in the world: the program is used by more than 200,000 users worldwide.
Highbrow | Harvard Library Lab
Highbrow applies the design principles of genome browsers to textual analysis and annotations. It shows, at a high level, which regions of a text are densely annotated and then supports zooming in to inspect annotations in detail. Initial applications were for the study of heavily annotated texts with standardized coordinate systems, such as the Bible, the Koran, or the works of Plato, but it can also be used to support student annotations of texts in a classroom setting and similar interactive cases.
TEITOK is a web-based platform for viewing, creating, and editing corpora with both rich textual mark-up and linguistic annotation, initially developed at the Centro de Linguística da Universidade de Lisboa, later at CELGA-ILTEC, and currently maintained at the ÚFAL institute of Charles University, Prague. The system has a modular design, with numerous modules serving a wide range of corpus types. More modules are added frequently, and it is possible to add custom modules as well. The source is maintained at GitLab and some conversion tools are maintained on GitHub.
SiteSucker | Rick's Apps
SiteSucker is a Macintosh application that automatically downloads websites from the Internet. It does this by asynchronously copying the site's webpages, images, PDFs, style sheets, and other files to your local hard drive, duplicating the site's directory structure. Just enter a URL (Uniform Resource Locator), press return, and SiteSucker can download an entire website.
Gephi is the leading visualization and exploration software for all kinds of graphs and networks. Gephi is open-source and free.
Where AI meets historical documents. Transkribus is a comprehensive platform for the digitisation, AI-powered text recognition, transcription and searching of historical documents.
kraken is a turn-key OCR system optimized for historical and non-Latin script material.
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. MALLET includes sophisticated tools for document classification: efficient routines for converting text to "features", a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics. In addition to classification, MALLET includes tools for sequence tagging for applications such as named-entity extraction from text. Algorithms include Hidden Markov Models, Maximum Entropy Markov Models, and Conditional Random Fields. These methods are implemented in an extensible system for finite state transducers. Topic models are useful for analyzing large collections of unlabeled text. The MALLET topic modeling toolkit contains efficient, sampling-based implementations of Latent Dirichlet Allocation, Pachinko Allocation, and Hierarchical LDA. Many of the algorithms in MALLET depend on numerical optimization. MALLET includes an efficient implementation of Limited Memory BFGS, among many other optimization methods. In addition to sophisticated Machine Learning applications, MALLET includes routines for transforming text documents into numerical representations that can then be processed efficiently. This process is implemented through a flexible system of "pipes", which handle distinct tasks such as tokenizing strings, removing stopwords, and converting sequences into count vectors. An add-on package to MALLET, called GRMM, contains support for inference in general graphical models, and training of CRFs with arbitrary graphical structure.
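The "pipes" idea described above, a chain of small steps that turn raw text into count vectors, can be sketched in a few lines of plain Python. This is only an illustration of the concept and does not use or reproduce MALLET's Java API; the function names, the tiny stopword list, and the sample documents are all assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; MALLET has its own lists).
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

def run_pipes(data, pipes):
    """Feed the output of each pipe into the next one, in order."""
    for pipe in pipes:
        data = pipe(data)
    return data

def to_count_vector(tokens, vocabulary):
    """Count how often each vocabulary term occurs in `tokens`."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

# Invented sample documents.
docs = ["The cat sat in the hat", "A hat is a hat"]
token_lists = [run_pipes(d, [tokenize, remove_stopwords]) for d in docs]
vocabulary = sorted({t for tokens in token_lists for t in tokens})
vectors = [to_count_vector(tokens, vocabulary) for tokens in token_lists]
print(vocabulary)
print(vectors)
```

Keeping each step as a separate function means a step can be swapped or reordered without touching the others, which is the main appeal of a pipe-style design.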
Natural Language Toolkit
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum. Thanks to a hands-on guide introducing programming fundamentals alongside topics in computational linguistics, plus comprehensive API documentation, NLTK is suitable for linguists, engineers, students, educators, researchers, and industry users alike. NLTK is available for Windows, Mac OS X, and Linux. Best of all, NLTK is a free, open source, community-driven project. NLTK has been called “a wonderful tool for teaching, and working in, computational linguistics using Python,” and “an amazing library to play with natural language.” Natural Language Processing with Python provides a practical introduction to programming for language processing. Written by the creators of NLTK, it guides the reader through the fundamentals of writing Python programs, working with corpora, categorizing text, analyzing linguistic structure, and more. The online version of the book has been updated for Python 3 and NLTK 3. (The original Python 2 version is still available at
