Peter Schwabe — Ph.D. thesis and related software

Thesis

Ph.D. thesis: High-Speed Cryptography and Cryptanalysis, Eindhoven University of Technology, The Netherlands, 2011.

pdf (printed version),
pdf (corrected version 2011-11-01), see errata,
bibtex entry,
pdf of the cover, cover design by Paula Schwabe,
pdf of the statements accompanying the thesis.

Errata

Page 8: There is a comma missing after Xiao-Ping Liang.
Page 8: I forgot to thank my former colleagues Natalie Naehrig and Anke Schmeink at the Institute for Theoretical Information Technology at RWTH Aachen.
Page 20: "NVIDIA CPUs" should be "NVIDIA GPUs".
Page 91: "E(F_p) = n" should be "|E(F_p)| = n".
Page 131: "Bernstein suggests to generate 2^b(B-ib) entries" should be "Bernstein suggests to generate 2^b+(B-ib) entries per list".
Page 145: The values for r are wrong, they should be (from top to bottom) 192, 640, 896, 1024, 1472, 1984. The discussion in the text uses the correct values.
Page 148: Reference [BCC+10] has been published in Guang Gong and Kishan Chand Gupta, editors, Progress in Cryptology — INDOCRYPT 2010, volume 6498 of Lecture Notes in Computer Science, pages 328—346, Springer-Verlag Berlin Heidelberg, 2010.
Page 151: The title of reference [BLN+09] is "FSBDay: Implementing Wagner's generalized birthday attack against the SHA-3 round-1 candidate FSB". An earlier version of the paper was presented at SHARCS 2009 with the title "FSBDay: Implementing Wagner's generalized birthday attack against the SHA-3 candidate FSB" (without the "round-1").
Page 153: Reference [Cor03] should be [Int03].
Page 157: Rainer Leupers and Heinrich Meyr are not authors of reference [KZS+09]; they are authors of the full version of the paper that was publicized as report 2009/056 of the Cryptology ePrint archive.
Page 159: Author of Reference [Mon85] should be "Peter L. Montgomery"
Page 162: Author of Reference [vdL07] should be "Wladimir J. van der Laan"

Qhasm

Most sofware described in my thesis is (at least partially) written in the qhasm programming language. All software packages on this website do not require qhasm to build the software — the assembly files resulting from compilation with qhasm are included. The qhasm files (file ending .q) are also included, in order to rebuild the assembly files from the qhasm files (for example after applying changes), first install qhasm:

wget http://cryptojedi.org/users/peter/thesis/data/qhasm-phdthesis-schwabe.tar.bz2
tar xjvf qhasm-phdthesis-schwabe.tar.bz2
cd qhasm-phdthesis-schwabe
sh do

This qhasm version is based on the official qhasm version 20070207 but contains updated machine-description files required to compile the qhasm files contained in the software packages on this website.
The qhasm command used to compile .q files into .s files depends on the architecture the .q file is written for. Available architectures and corresponding commands are as follows:

AMD64 (X64, X86_64): qhasm-amd64,
X86: qhasm-x86,
UltraSPARC: qhasm-sparc,
32-bit PowerPC (for Linux): qhasm-ppc32-linux,
32-bit PowerPC (for Mac OSX): qhasm-ppc32-macos,
Synergistic Processor Units (SPUs) of the Cell Broadband Engine: qhasm-cell-spu,
NVIDIA GT200b GPU (found in GTX 285 and GTX 295 graphics cards), 128 threads: qhasm-cudasm128.

Compilation of a .q file into a .s file using one of these commands (e.g., qhasm-x86) is done as follows:

PATH_TO_QHASM/qhasm-x86 < FILENAME.q > FILENAME.s

The qhasm command for NVIDIA GPUs (qhasm-cudasm128) translates to an "inofficial" assembly language, translation to binary format requires van der Laan's cudasm.

Chapter 3: Implementations of the Advanced Encryption Standard

All implementations of the Advanced Encryption Standard described in the thesis follow the eSTREAM API. Based on the more portable version of the eSTREAM benchmarking suite by Bernstein (version 20080905) I put together a version containing all AES-CTR and AES-GCM software described in my thesis. To build this suite and run the benchmarks please do the following:

wget http://cryptojedi.org/users/peter/thesis/data/estreambench-20080905-phdthesis-schwabe.tar.bz2
tar xjvf estreambench-20080905-phdthesis-schwabe.tar.bz2
cd estreambench-20080905-phdthesis-schwabe
chmod +x start scripts/*
(echo '';echo '';echo '';echo '';echo '') | scripts/configure
cd reports-`hostname`
(../scripts/run; echo y|../scripts/collect) > run.log 2>&1 </dev/null &

This will benchmarks all ciphers included in the benchmarking suite, the results will be collected in a structure of html files linked from reports-`hostname`/index.html.
It is also possible to only benchmark the implementations of AES; to do so simply replace the last line by

(../scripts/run aes; echo y|../scripts/collect) > run.log 2>&1 </dev/null &

Chapter 4: Elliptic-curve cryptography on Cell processors

The implementation of the Curve25519 software for the SPUs of a Cell Broadband Engine follows the SUPERCOP API. The software is included in current versions of the SUPERCOP benchmarking suite, for the thesis I used version 20101111 of the suite, this version is mirrored here. Benchmarking results of the software can be obtained as follows:

wget http://cryptojedi.org/users/peter/thesis/data/supercop-20101111-phdthesis-schwabe.tar.bz2
tar xjvf supercop-20101111-phdthesis-schwabe.tar.bz2
cd supercop-20101111-phdthesis-schwabe
sh do

This will run benchmarks of all primitives included in the SUPERCOP benchmarking suite and can take a long time (more than a week!). After the do script has finished, the results of the benchmarks are contained in the file bench/HOSTNAME/data.gz. For details about the format of this file and conversion to more human-readable format see the official SUPERCOP website.

Chapter 5: Pairing computation on AMD64 processors

Downloading and building the pairing software requires the following steps:

wget http://cryptojedi.org/users/peter/thesis/data/dclxvi-phdthesis-schwabe.tar.bz2
tar xjvf dclxvi-phdthesis-schwabe.tar.bz2
cd dclxvi-phdthesis-schwabe/
make

This will produce 6 binaries: bilintest-check, bilintest-c, bilintest-as, speedtest-check, speedtest-c, and speedtest-as.
The binaries ending on -check perform all arithmetic with included overflow check as described in the thesis. The files ending on -c are non-optimized implementations in C and the implementations ending on -as have all speed-critical functions implemented in qhasm
The bilintest binaries perform NTESTS bilinearity (and non-degeneracy) tests, the value NTESTS is defined in the Makefile. The speedtest binaries measure the cycles counts of the most speed-critical functions.

Chapter 6: Solving the elliptic-curve discrete-logarithm challenge ECC2K-130

The software for the Cell Broadband Engine and for the NVIDIA GTX 295 graphics card will be published here as soon as the discrete-logarithm computation has finished.

Chapter 7: Implementing Wagner's generalized birthday attack

Running the code we used to attack FSB₄₈ with Wagner's generalized birthday attack requires the following hardware:

8 computers with a 64-bit processor, running a 64-bit Linux system,
8 GB of RAM on each of the computers,
a hard disk with a free 700-GB data partition in each of the computers, and
some network connection between the computers (our attack used switched Gigabit ethernet).

To run the attack on this cluster perform the following steps (see also the FSBday website):

Installing MPICH-2

The implementation uses the MPI implementation MPICH-2. We followed this installation guide to set up MPICH-2 on our cluster machines running Ubuntu 9.04 (aka Jaunty Jackalope).
As a remark: We had problems with the setup of MPI when the hostname was mapped to 127.0.0.1 in /etc/hosts, we just removed this entry.

Downloading and running our code

To download and compile the code run

wget http://polycephaly.org/projects/fsbday/data/fsbday-20091213.tar.bz2
tar xjvf fsbday-20091213.tar.bz2
cd fsbday-20091213
make

If the data partition for the attack is not /dev/sda1, change the definition of IODEVICE in params.h accordingly before compiling, otherwise running the attack may destroy data on /dev/sda1!

This will produce a binary called fsbday, which you have to copy to each of the cluster nodes in the same directory. For this you can use the bash script mpicp.sh.

Running Phase 1 (Determine clamping constants)

The first phase of the attack (for the description of the two phases please read the paper) is started through the bash script fsbstart.sh, before running the script you might have to change the value of the HOSTFILE (the path to the MPI host file, set to ./hosts by default) in the script. The script takes as only argument the path to a logfile, so the first phase of the attack can be started with

./fsbstart fsbday.log

The only output on stdout is the clamping constants tried and whether they lead to success, more information including the values yielding a collision can be found in the log file.

Running Phase 2 (Compute colliding matrix positions)

To determine the matrix positions from the clamping constants you have to run the program fsbfinal twice, once for the left and once for the right half-tree. The fsbfinal program needs as input the clamping constants used to generate the lists on level 0 and the value on level 3 that leads to success. The clamping constants are output on standard output, the value you will find in the logfile n a line looking like

0: Result: 54e28b37d58e75 781538a2cdf5639d Part: 1f8

when you grep for "Result". The first number (before the colon) is the number of the node.
Let's assume the clamping constants in the rightmost quarter-tree are you determined in phase 1 are 4, 4, 0, 0, and let's assume the value as in the line above, then you first have to convert 54e28b37d58e75 to decimal, yielding 23892985608769141, same for 781538a2cdf5639d yielding 8652884530953544605 and for the part 1f8 yielding 504.
You then run fsbfinal as follows

mpirun -l -machinefile hosts -np 8 ./fsbfinal --nlists=8 --startlevel=0 --startlist=0 --cmp=23892985608769141 --remaining=8652884530953544605 --part=504 --node=0 0 0 0 0 0 0 0 0

mpirun -l -machinefile hosts -np 8 ./fsbfinal --nlists=8 --startlevel=0 --startlist=8 --cmp=23892985608769141 --remaining=8652884530953544605 --part=504 --node=0 0 0 0 0 0 0 4 4

This will print the positions (relative positions in the according blocks and half-blocks) to stderr.

For information about how to change to code to other attack scenarios please refer to the FSBday website.

Collision attack against the full hash function

To download and run the code of the simple attack against the full FSB₄₈ hash function mentioned in my thesis do the following:

wget http://cryptojedi.org/users/peter/thesis/data/fsb48-pollard.tar.bz2
tar xjvf fsb48-pollard.tar.bz2
cd fsb48-pollard
make
./fsb48-pollard

michael

peter

crypto

papers

programming

music

misc

gallery

recipes

links