Distributed Datalog Materialisation with Dynamic Data Exchange

Temitope Ajileye and Boris Motik and Ian Horrocks
Department of Computer Science, University of Oxford, Oxford, United Kingdom.

Paper

The executables and test files published here accompany the paper by the authors listed above, which is available here.

Executable

The executable was compiled from C++ source code. Download either the Windows or the Linux version. To run a script with DMAT, use the command:
 
./DMAT.exe -shell . <script file name>

Folder Structure

 <partition root>

Partitions Generator

To partition the data (and compile the occurrences), run

./DMAT.exe -shell 

Then, in interactive mode, run

partition <number of partitions> <source1> [<source2>...]

This uses graph partitioning. For hash partitioning, run

partition hash <number of partitions> <source1> [<source2>...]

The output is a set of part files named source1.p0.etl.gz, source1.p1.etl.gz, ..., and an occurrence file named source1.ocr.gz or source1.ocr, depending on whether the input was compressed. Move each part file into the fact folder of the appropriate partition root, together with a copy of the occurrence file.
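The naming scheme and the move step above can be sketched in Python. This is only an illustration under stated assumptions: the fact folder is assumed to be called fact, part file i is assumed to belong to the partition root <prefix>-i, and the helper names expected_outputs and plan_moves are hypothetical and are not part of DMAT.

```python
from pathlib import PurePosixPath


def expected_outputs(source, num_partitions, compressed=True):
    """List the file names the partition command is described as producing."""
    parts = [f"{source}.p{i}.etl.gz" for i in range(num_partitions)]
    occurrence = f"{source}.ocr.gz" if compressed else f"{source}.ocr"
    return {"parts": parts, "occurrence": occurrence}


def plan_moves(source, num_partitions, prefix, fact_dir="fact", compressed=True):
    """Plan (action, source file, destination) steps without touching the disk.

    Assumptions (not confirmed by the DMAT documentation):
    - part file i goes to the partition root named <prefix>-<i>;
    - the fact folder inside each root is called `fact_dir`.
    """
    outputs = expected_outputs(source, num_partitions, compressed)
    plan = []
    for i, part in enumerate(outputs["parts"]):
        target = PurePosixPath(f"{prefix}-{i}") / fact_dir
        # Each part file is moved to exactly one partition root.
        plan.append(("move", part, str(target / part)))
        # Every partition root receives its own copy of the occurrence file.
        plan.append(("copy", outputs["occurrence"], str(target / outputs["occurrence"])))
    return plan


if __name__ == "__main__":
    for step in plan_moves("source1", 2, "lubm"):
        print(*step)
```

The plan is returned as data rather than executed, so it can be inspected before any file is moved; the steps could then be carried out with shutil.move and shutil.copy.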

Script Generator

For now, we recommend first preparing all partition root folders on a single server and then moving them to the appropriate servers. For networks with more than one server, we also recommend creating an additional node to serve as the terminal (it will load the entire dictionary and import the program). The script testgen.py supports this workflow, including copying via scp.

  1. Place the Python script alongside the partition root folders, named <prefix>-0, <prefix>-1, ..., <prefix>-n. The last one is for the terminal node.
  2. In the same directory create a hosts.txt file with a single server address specification on each line. 
  3. In the same directory create a settings.txt file with the following values in each line
  4. Run python testgen.py . This will populate all script folders with the appropriate scripts and create three additional sets of scripts:
    a. A transfer.sh executable to copy all partition folders to the target hosts.
    b. A transferDMAT.sh executable to copy the DMAT executable to the target hosts.
    c. A test<numthreads> executable in each partition root. You can use this to run DMAT on the hosting server.
  5. Run a, b, and c on each host to run a test.
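As a rough illustration of the copy step in this workflow, the following Python sketch reads a hosts.txt-style listing and builds one scp command per partition root. It is not the logic of testgen.py: the pairing of the i-th host with the folder <prefix>-i, the remote directory, and the helper names read_hosts and scp_commands are all assumptions made for the example.

```python
import shlex


def read_hosts(text):
    """Return one server address per non-empty line, as in hosts.txt."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def scp_commands(hosts, prefix, remote_dir="~"):
    """Build one recursive scp command per host.

    Assumption: the i-th line of hosts.txt receives the folder <prefix>-<i>.
    The commands are returned as strings and are not executed here.
    """
    commands = []
    for i, host in enumerate(hosts):
        folder = shlex.quote(f"{prefix}-{i}")  # quote in case the prefix has spaces
        commands.append(f"scp -r {folder} {host}:{remote_dir}")
    return commands


if __name__ == "__main__":
    sample = "user@server-a.example\nuser@server-b.example\n"
    for command in scp_commands(read_hosts(sample), "lubm"):
        print(command)
```

Printing the commands first makes it possible to check the host-to-folder pairing before anything is copied.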

Test Files

We have included two example rulesets: LUBM_L (lower bound) and LUBM_LE (lower bound extended with additional transitivity rules).