Introduction

The Open Dictionary Project (ODict for short) is an open-source alternative to proprietary dictionary file formats such as Babylon and Apple Dictionaries. As with other dictionary formats, Open Dictionary files are compiled from XML (currently a draft specification) into compressed, serialized, bite-sized files. ODict is built on technologies chosen for speed, to keep lookups as fast as possible: Google Snappy (fast compression), Bleve (a fast Go indexer), and Flatbuffers (fast serialization). Entries are looked up in O(log n) time.

Motivation

ODict was developed out of the need for a fast, readily available, and open cross-platform dictionary format. Current formats such as Apple Dictionaries, Babylon, and StarDict are designed to work with one particular application (usually developed by the same company that made the format) and, as a result, are somewhat one-directional: there is official documentation on how to write a dictionary in the format, but not on how to read one. This forces users to write dictionaries that work with only one dictionary app. There are scattered third-party articles online describing these file formats and how to read them, but they are often long, convoluted, and difficult to turn into a working implementation.

Wouldn't it be nice if there were a completely open-source format, with documentation on both reading and writing it, that anyone could use in any dictionary app?

That's where ODict comes in.

How It Works

ODict files are binary files: more specifically, serialized Flatbuffers compressed with Snappy. They are generated from ODict XML markup (ODXML for short; remember it, you'll be seeing it a lot) by the ODict compiler, which is the main application for generating, reading, and searching ODict dictionaries. Details of the markup can be found elsewhere in this documentation. Most of the other repositories in the ODict organization either convert other dictionary formats to ODXML or allow you to read ODict dictionaries in the language of your choice (such as odict-java).
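
To make the pipeline concrete, here is a minimal sketch of the same idea (XML in, serialized and compressed binary out, sorted-key lookup when reading) using only the Python standard library. It is not the ODict compiler and does not produce real ODict files: `json` stands in for Flatbuffers, `zlib` stands in for Snappy, and the markup in `SAMPLE_XML` is a simplified, hypothetical stand-in for ODXML. The function names `compile_dictionary`, `load_dictionary`, and `lookup` are likewise made up for illustration.

```python
import bisect
import json
import xml.etree.ElementTree as ET
import zlib

# Hypothetical, simplified markup. The real ODXML schema is defined by
# the ODict specification and may look quite different.
SAMPLE_XML = """
<dictionary>
  <entry term="dog"><definition>A domesticated canine.</definition></entry>
  <entry term="cat"><definition>A small domesticated feline.</definition></entry>
  <entry term="bird"><definition>A feathered, winged animal.</definition></entry>
</dictionary>
"""


def compile_dictionary(xml_text: str) -> bytes:
    """Parse the XML, sort entries by term, then serialize and compress."""
    root = ET.fromstring(xml_text)
    entries = sorted(
        (
            [entry.get("term"), entry.findtext("definition", default="")]
            for entry in root.iter("entry")
        ),
        key=lambda pair: pair[0],
    )
    serialized = json.dumps(entries).encode("utf-8")  # stand-in for Flatbuffers
    return zlib.compress(serialized)  # stand-in for Snappy


def load_dictionary(blob: bytes):
    """Decompress and deserialize into parallel lists of terms and definitions."""
    entries = json.loads(zlib.decompress(blob).decode("utf-8"))
    terms = [term for term, _ in entries]
    definitions = [definition for _, definition in entries]
    return terms, definitions


def lookup(terms, definitions, term):
    """Binary search over the sorted terms: O(log n) comparisons."""
    i = bisect.bisect_left(terms, term)
    if i < len(terms) and terms[i] == term:
        return definitions[i]
    return None


if __name__ == "__main__":
    blob = compile_dictionary(SAMPLE_XML)
    terms, definitions = load_dictionary(blob)
    print(lookup(terms, definitions, "cat"))
    print(lookup(terms, definitions, "fish"))
```

Sorting the entries once at compile time is what lets the reader use binary search, which is one common way to get logarithmic lookups. Unlike this sketch, which deserializes the whole payload up front, a Flatbuffers file can be read in place without an unpacking step.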
