Introduction

The Open Dictionary Project (ODict for short) is an open-source alternative to proprietary dictionary file formats such as Babylon and Apple Dictionaries. As with other dictionary formats, Open Dictionary files are compiled from XML (currently a draft specification) into compact, compressed, serialized binary files. ODict is built on libraries chosen for lookup speed: Google Snappy for compression, RapidXML for XML parsing, and FlatBuffers for serialization. Entries are searched in O(log n) time, where n is the number of entries.
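To make the logarithmic lookup claim concrete, here is a minimal sketch of one common way to get that bound: a binary search over entries kept sorted by term. The `Entry` struct and the `lookup` function are hypothetical names used only for illustration. They are not ODict's actual schema or API, and the real compiler may organize its data differently.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical entry type for illustration only; not ODict's real schema.
struct Entry {
    std::string term;
    std::string definition;
};

// Binary search over a vector that is already sorted by term.
// Performs O(log n) string comparisons for n entries.
// Returns a pointer to the matching entry, or nullptr if the term is absent.
const Entry* lookup(const std::vector<Entry>& sorted_entries,
                    const std::string& term) {
    // lower_bound finds the first entry whose term is not less than the key.
    auto it = std::lower_bound(
        sorted_entries.begin(), sorted_entries.end(), term,
        [](const Entry& e, const std::string& key) { return e.term < key; });

    // The candidate is a hit only if its term equals the key exactly.
    if (it != sorted_entries.end() && it->term == term) {
        return &*it;
    }
    return nullptr;
}
```

The sketch only works if the entries are sorted by term ahead of time, which is a natural step to perform once when a dictionary is compiled rather than on every lookup.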

Motivation

ODict was developed out of the need for a fast, readily available, and open cross-platform dictionary format. Current formats such as Apple Dictionaries, Babylon, or StarDict are each designed to work with one particular application (usually developed by the same company that made the format), and as a result are somewhat one-directional: there is official documentation on how to write a dictionary in the format, but not on how to read one. This forces users to write dictionaries that work with only one dictionary app. Sure, there are scattered third-party articles online detailing these file formats and how to read them, but they are often long, convoluted, and difficult to implement yourself.

Wouldn't it be nice if there were a completely open-source format, with documentation on both reading and writing it, that anyone could use in any dictionary app?

That's where ODict comes in.

How It Works

ODict files are binary files: more specifically, serialized FlatBuffers compressed with Snappy. They are generated from ODict XML markup (ODXML for short; remember it, you'll be seeing it a lot), which is described elsewhere in this documentation. The ODict C++ compiler produces these binaries and is the main application for generating, reading, and searching ODict dictionaries. Most of the other repositories in the ODict organization either convert other dictionary formats to ODXML or let you read ODict dictionaries in the language of your choice (such as odict-java).

Prerequisites

To build ODict, you must have Facebook's Buck build tool installed. Additionally, you'll need a recent version of CMake, Google's flatc compiler (instructions on how to install can be found here), and perhaps some additional libraries, depending on any warnings or errors you get while building.

Building the Compiler

To use the compiler (which is not yet available through any package manager, though we hope it will be soon), clone the compiler repository, then run the build script from inside the cloned folder:

$ ./build.sh

This will build the ODict executable and library and output both to a bin directory inside the cloned folder. Run:

$ ./bin/odict

for usage instructions.