This is now an old project, superseded by the more recent Simsapa Dhamma Reader.
Table of Contents
- Simsapa Dictionary Tool
This tool generates EPUB, MOBI, Stardict (.zip) and Babylon (.gls) dictionary files.
I hope it is useful for:
- looking up Pali words more easily on different devices
- with fulltext search (looking for English words in Pali)
- in an offline context
The main results are the downloadable files:
Download Pali - English dictionaries: See the Releases page.
You will need a dictionary application for your desktop or mobile device, and a dictionary file in a format which that application can open.
For example, the StarDict format is widely supported. Search for "dictionary app stardict format for Windows / Mac / Android" or similar to find an app which works for you.
Download one of the formats and open it with the dictionary app.
Specific steps to use this with GoldenDict.
Note that the goldendict.org website only has an old version (1.0.1) for Windows.
Download the more recent v1.5 version:
- Early Access Builds for Windows
- Early Access Builds for Mac OS X
- Early Access Builds for Linux Portable
On Linux distributions, you can also install the goldendict package from your package manager.
Download one of the StarDict .zip files from the Releases page.
Extract the .zip to a folder. It will contain four files, such as:
combined-dictionary-stardict/
combined-dictionary.dict.dz
combined-dictionary.idx
combined-dictionary.ifo
combined-dictionary.syn
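Before pointing GoldenDict at the folder, it can help to confirm that the extraction produced all four files. The following is a minimal sketch; `check_stardict_folder` is a hypothetical helper name, not something shipped with this tool.

```shell
# Hypothetical helper: verify that a folder holds the four StarDict files
# for a given base name. Prints each missing file and returns non-zero
# if any of them is absent.
check_stardict_folder() {
    local dir="$1" base="$2" ext missing=0
    for ext in dict.dz idx ifo syn; do
        if [ ! -f "$dir/$base.$ext" ]; then
            echo "missing: $base.$ext"
            missing=1
        fi
    done
    return "$missing"
}
```

For the folder listed above, it could be called as `check_stardict_folder ./combined-dictionary-stardict combined-dictionary`.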
- Open GoldenDict.
- Select the Edit > Dictionaries menu. It usually opens with the Sources > Files tab open.
- Click the Add... button, and select the folder where you extracted the .zip.
- Click OK. The menu will close.
- Use the top input field to search for words.
- Use the Search > Full-text Search menu to search in the word definition texts (such as looking for English to Pali).
Add more dictionaries in other languages if you wish. For example, search for "portuguese stardict dictionary".
The StarDict format is created in two steps:
- generate an .xml with simsapa_dictionary
- use stardict-text2bin to generate the StarDict files (.idx, .dict.dz, .syn, .ifo)
This only works on Linux systems. Install the stardict-tools package, which contains the above binary:
sudo apt-get install stardict-tools
The package doesn't install the binary to /usr/local/bin, so you will have to specify the full path when using it.
On Ubuntu, the path is /usr/lib/stardict-tools/stardict-text2bin.
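Since the location may differ between distributions, a small lookup can save typing the full path each time. This is a sketch only; `find_text2bin` is a hypothetical helper, and the fallback path is the Ubuntu one mentioned above.

```shell
# Hypothetical helper: print the path of stardict-text2bin.
# Prefers a copy found on PATH, then falls back to the given path
# (default: the Ubuntu location of the stardict-tools package).
find_text2bin() {
    local fallback="${1:-/usr/lib/stardict-tools/stardict-text2bin}"
    if command -v stardict-text2bin >/dev/null 2>&1; then
        command -v stardict-text2bin
    elif [ -x "$fallback" ]; then
        echo "$fallback"
    else
        echo "stardict-text2bin not found" >&2
        return 1
    fi
}
```

The result can be stored in a variable, for example `TEXT2BIN="$(find_text2bin)"`, and used in place of the full path.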
For example, say you have a dictionary file as an MS Excel spreadsheet, dictionary.xlsx.
This has to have a Word entries sheet and a Metadata sheet (see the sample [ncped with space.xlsx](./tests/data/data with space/ncped with space.xlsx)).
First run simsapa_dictionary_linux to generate the .xml (for more CLI options, see ./src/cli.yml):
./simsapa_dictionary_linux xlsx_to_stardict_xml \
--source_path "./dictionary.xlsx" \
--output_path "./dictionary.xml"
Then run stardict-text2bin to generate the StarDict files:
/usr/lib/stardict-tools/stardict-text2bin dictionary.xml dictionary.ifo
This is going to create four files, dictionary{.idx, .dict.dz, .syn, .ifo}.
You may wish to ZIP them if you are going to distribute them.
Copy your dictionary.xlsx to a folder.
Open ./assets/xlsx_to_stardict.sh, right-click on the [Raw] button, select Save as.., and save it to the folder.
Copy simsapa_dictionary_linux there as well.
Remember to set execution rights for xlsx_to_stardict.sh and simsapa_dictionary_linux, either with chmod +x in the terminal, or the Right-click > Permissions menu in the file manager.
dictionary/
dictionary.xlsx
simsapa_dictionary_linux
xlsx_to_stardict.sh
Open this folder in a terminal and run:
./xlsx_to_stardict.sh dictionary.xlsx
The script combines the above steps and creates dictionary-stardict.zip.
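To make the combined steps concrete, here is a rough sketch of what such a wrapper could look like. It is not a copy of ./assets/xlsx_to_stardict.sh, which may differ; the function name and the `SIMSAPA_BIN` and `TEXT2BIN` override variables are assumptions made for illustration, and it expects the `zip` command to be installed.

```shell
# Sketch only: the real xlsx_to_stardict.sh may work differently.
# SIMSAPA_BIN and TEXT2BIN are hypothetical override variables.
xlsx_to_stardict() {
    local xlsx="$1"
    local name="${xlsx%.xlsx}"
    local simsapa="${SIMSAPA_BIN:-./simsapa_dictionary_linux}"
    local text2bin="${TEXT2BIN:-/usr/lib/stardict-tools/stardict-text2bin}"

    # Step 1: convert the spreadsheet to StarDict XML.
    "$simsapa" xlsx_to_stardict_xml \
        --source_path "$xlsx" \
        --output_path "$name.xml" || return 1

    # Step 2: convert the XML to the binary StarDict files.
    "$text2bin" "$name.xml" "$name.ifo" || return 1

    # Step 3: bundle the four generated files for distribution.
    zip "$name-stardict.zip" \
        "$name.idx" "$name.ifo" "$name.syn" "$name.dict.dz"
}
```

Called as `xlsx_to_stardict dictionary.xlsx`, this would leave dictionary-stardict.zip next to the spreadsheet, provided both binaries succeed.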
To see progress log messages, add a .env file with RUST_LOG=info in the folder:
echo "RUST_LOG=info" > .env
This causes the tool to print messages such as:
Running simsapa_dictionary_linux ...
[2019-12-11T13:55:30Z INFO simsapa_dictionary] 🚀 Launched
[2019-12-11T13:55:30Z INFO simsapa_dictionary::app] process_first_arg()
[2019-12-11T13:55:30Z INFO simsapa_dictionary::app] process_cli_args()
[2019-12-11T13:55:30Z INFO simsapa_dictionary] Subcommand given: XlsxToStardict
[2019-12-11T13:55:30Z INFO simsapa_dictionary::app] === Begin processing XLSX "ncped.xlsx" ===
The pyglossary tool can convert to a wide range of dictionary formats.
You can use the StarDict files as input format.
The binary executables (simsapa_dictionary.exe, _linux, _osx) are command line applications.
If you simply double-click one to run it, it will do nothing. If you run it in a terminal, it will display some usage notes.
It is a conversion utility, which can be used in small shell scripts to create or update dictionary files.
The dictionary source texts are in the simsapa-dictionary-data repo.
You can download the source text, edit and generate updated EPUB and MOBI files using this tool.
To generate MOBI files, also download Kindlegen from Amazon (free download).
- JSON format dictionaries published at suttacentral/sc-data
- Nyanatiloka: Buddhist Dictionary published by what-buddha-said.net
Use the *-stardict.zip files, extract them and add the folder to the dictionary list in GoldenDict.
Version 1.5 includes Search menu > Full text search, useful for English to Pali searches.
For Windows and OSX, download v1.5 from the Early Access Builds.
Read more on the wiki pages.
On Linux, install goldendict from your package manager.
Use one of the *.mobi files and copy it to your Kindle. It will appear in the Dictionaries category.
The *.epub files can be used with ebook readers which read the EPUB format.
- iBooks on iOS
- Calibre on desktop
Search for applications which can open or import StarDict format dictionaries.
You might have to copy-paste the link of a *-stardict.zip file from the Releases page, or download it and extract it to a folder where the dictionary application can find it.
Such apps include:
- SuttaCentral dictionary lookup: https://suttacentral.net/define/kusala
- Critical Pali Dictionary
- http://dictionary.sutta.org/
- http://www.buddha-vacana.org/toolbox/dico.html
- English-Pali Dictionary (budsas.org)
See an example dictionary content below. It starts with metadata describing the dictionary, followed by the word entries. Each word entry starts with a TOML formatted block, followed by the definition text in Markdown syntax.
Use a text editor such as Notepad++ and copy the example to a file, for example ncped-example.md.
The file extension must be .md.
Arrange the files in a folder:
dictionary/
kindlegen.exe
ncped-example.md
simsapa_dictionary.exe
On Windows, drag-and-drop ncped-example.md onto simsapa_dictionary.exe.
On Linux and Mac, open a terminal in the folder and run ./simsapa_dictionary ./ncped-example.md.
The default action is to generate a MOBI if kindlegen.exe is also present in the folder, otherwise to generate an EPUB.
More options are available, see them with simsapa_dictionary.exe --help. An overview is included below.
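To predict which format a run will produce, the default rule described above can be mirrored in a few lines of shell. This is an illustration of the rule as stated here, not the tool's actual implementation; `default_output_format` is a hypothetical name, and it checks only for kindlegen.exe because that is the file name this README mentions.

```shell
# Hypothetical helper mirroring the documented default:
# MOBI when kindlegen.exe sits in the folder, EPUB otherwise.
default_output_format() {
    local dir="${1:-.}"
    if [ -e "$dir/kindlegen.exe" ]; then
        echo mobi
    else
        echo epub
    fi
}
```

Running `default_output_format ./dictionary` before a conversion shows whether Kindlegen was placed where the tool is expected to look for it.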
ncped-example.md
--- DICTIONARY METADATA ---
``` toml
title = "New Concise Pali - English Dictionary (NCPED)"
description = "Pali - English"
creator = "Simsapa Dhamma Reader"
source = "https://simsapa.github.io"
cover_path = "default_cover.jpg"
book_id = "NcpedDictionarySimsapa"
created_date_human = ""
created_date_opf = ""
```
--- DICTIONARY WORD ENTRIES ---
``` toml
dict_label = "NCPED"
word = "ababa"
summary = "the name of a hell, or place in Avīci, where one s"
grammar = ""
inflections = []
```
ababa
masculine the name of a hell, or place in Avīci, where one suffers for an *ababa* of years.
``` toml
dict_label = "NCPED"
word = "abbhantara"
summary = "interior, internal; being within, included in, amo"
grammar = ""
inflections = []
```
abbhantara
mfn. & neuter
1. (mfn.) interior, internal; being within, included in, among; belonging to one's house, personal, intimate.
2. (n.)
1. intermediate space, interval; the inside, interior.
2. a measure of length (= 28 hatthas).
``` toml
dict_label = "NCPED"
word = "ajjhokāse"
summary = "in the open air, in the open."
grammar = ""
inflections = []
```
ajjhokāse
ind. in the open air, in the open.
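Because every entry carries a `word = "..."` line in its TOML block, the headwords in such a file can be listed with standard tools, which is handy for a quick check after editing. A minimal sketch follows; `list_words` is a hypothetical helper and assumes the line layout shown in the example above.

```shell
# Hypothetical helper: print the headword of every entry in a
# dictionary Markdown file, one per line, by reading the
# `word = "..."` lines of the TOML blocks.
list_words() {
    sed -n 's/^word = "\(.*\)"$/\1/p' "$1"
}
```

For instance, `list_words ncped-example.md | wc -l` gives the number of entries in the file.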
Use the help command to discover the command line options, or see src/cli.yml.
./simsapa_dictionary help
Both the tool and the dictionary content have some rough edges.
The dictionary entries can be edited using the files at simsapa-dictionary-data, and the dictionary formats re-generated with this tool.
Dictionary corrections or bug reports about the tool are welcome. Open an Issue here or see my email in the Cargo.toml.