Peter Schwabe — Ph.D. thesis and related software
Thesis
Ph.D. thesis: High-Speed Cryptography and Cryptanalysis, Eindhoven University of Technology, The Netherlands, 2011.
- pdf (printed version),
- pdf (corrected version 2011-11-01), see errata,
- bibtex entry,
- pdf of the cover, cover design by Paula Schwabe,
- pdf of the statements accompanying the thesis.
Errata
- Page 8: There is a comma missing after Xiao-Ping Liang.
- Page 8: I forgot to thank my former colleagues Natalie Naehrig and Anke Schmeink at the Institute for Theoretical Information Technology at RWTH Aachen.
- Page 20: "NVIDIA CPUs" should be "NVIDIA GPUs".
- Page 91: "E(Fp) = n" should be "|E(Fp)| = n".
- Page 131: "Bernstein suggests to generate 2b(B-ib) entries" should be "Bernstein suggests to generate 2b+(B-ib) entries per list".
- Page 145: The values for r are wrong, they should be (from top to bottom) 192, 640, 896, 1024, 1472, 1984. The discussion in the text uses the correct values.
- Page 148: Reference [BCC+10] has been published in Guang Gong and Kishan Chand Gupta, editors, Progress in Cryptology — INDOCRYPT 2010, volume 6498 of Lecture Notes in Computer Science, pages 328—346, Springer-Verlag Berlin Heidelberg, 2010.
- Page 151: The title of reference [BLN+09] is "FSBDay: Implementing Wagner's generalized birthday attack against the SHA-3 round-1 candidate FSB". An earlier version of the paper was presented at SHARCS 2009 with the title "FSBDay: Implementing Wagner's generalized birthday attack against the SHA-3 candidate FSB" (without the "round-1").
- Page 153: Reference [Cor03] should be [Int03].
- Page 157: Rainer Leupers and Heinrich Meyr are not authors of reference [KZS+09]; they are authors of the full version of the paper that was publicized as report 2009/056 of the Cryptology ePrint archive.
- Page 159: Author of Reference [Mon85] should be "Peter L. Montgomery"
- Page 162: Author of Reference [vdL07] should be "Wladimir J. van der Laan"
Qhasm
Most sofware described in my thesis is (at least partially) written in the qhasm programming language. All software packages on this website do not require qhasm to build the software — the assembly files resulting from compilation with qhasm are included. The qhasm files (file ending .q) are also included, in order to rebuild the assembly files from the qhasm files (for example after applying changes), first install qhasm:
tar xjvf qhasm-phdthesis-schwabe.tar.bz2
cd qhasm-phdthesis-schwabe
sh do
This qhasm version is based on the official qhasm version 20070207 but contains updated machine-description
files required to compile the qhasm files contained in the software packages on this website.
The qhasm command used to compile .q files into .s files depends on the architecture the .q file
is written for. Available architectures and corresponding commands are as follows:
- AMD64 (X64, X86_64): qhasm-amd64,
- X86: qhasm-x86,
- UltraSPARC: qhasm-sparc,
- 32-bit PowerPC (for Linux): qhasm-ppc32-linux,
- 32-bit PowerPC (for Mac OSX): qhasm-ppc32-macos,
- Synergistic Processor Units (SPUs) of the Cell Broadband Engine: qhasm-cell-spu,
- NVIDIA GT200b GPU (found in GTX 285 and GTX 295 graphics cards), 128 threads: qhasm-cudasm128.
Compilation of a .q file into a .s file using one of these commands (e.g., qhasm-x86) is done as follows:
The qhasm command for NVIDIA GPUs (qhasm-cudasm128) translates to an "inofficial" assembly language, translation to binary format requires van der Laan's cudasm.
Chapter 3: Implementations of the Advanced Encryption Standard
All implementations of the Advanced Encryption Standard described in the thesis follow the eSTREAM API. Based on the more portable version of the eSTREAM benchmarking suite by Bernstein (version 20080905) I put together a version containing all AES-CTR and AES-GCM software described in my thesis. To build this suite and run the benchmarks please do the following:
tar xjvf estreambench-20080905-phdthesis-schwabe.tar.bz2
cd estreambench-20080905-phdthesis-schwabe
chmod +x start scripts/*
(echo '';echo '';echo '';echo '';echo '') | scripts/configure
cd reports-`hostname`
(../scripts/run; echo y|../scripts/collect) > run.log 2>&1 </dev/null &
This will benchmarks all ciphers included in the benchmarking suite, the results
will be collected in a structure of html files linked from reports-`hostname`/index.html.
It is also possible to only benchmark the implementations of AES; to do so simply
replace the last line by
Chapter 4: Elliptic-curve cryptography on Cell processors
The implementation of the Curve25519 software for the SPUs of a Cell Broadband Engine follows the SUPERCOP API. The software is included in current versions of the SUPERCOP benchmarking suite, for the thesis I used version 20101111 of the suite, this version is mirrored here. Benchmarking results of the software can be obtained as follows:
tar xjvf supercop-20101111-phdthesis-schwabe.tar.bz2
cd supercop-20101111-phdthesis-schwabe
sh do
This will run benchmarks of all primitives included in the SUPERCOP benchmarking suite and can take a long time (more than a week!). After the do script has finished, the results of the benchmarks are contained in the file bench/HOSTNAME/data.gz. For details about the format of this file and conversion to more human-readable format see the official SUPERCOP website.
Chapter 5: Pairing computation on AMD64 processors
Downloading and building the pairing software requires the following steps:
tar xjvf dclxvi-phdthesis-schwabe.tar.bz2
cd dclxvi-phdthesis-schwabe/
make
This will produce 6 binaries: bilintest-check, bilintest-c, bilintest-as, speedtest-check, speedtest-c, and speedtest-as.
The binaries ending on -check perform all arithmetic with included overflow check as described in the thesis.
The files ending on -c are non-optimized implementations in C and the implementations ending on -as have all speed-critical functions implemented in
qhasm
The bilintest binaries perform NTESTS bilinearity (and non-degeneracy) tests, the value NTESTS is defined in the Makefile.
The speedtest binaries measure the cycles counts of the most speed-critical functions.
Chapter 6: Solving the elliptic-curve discrete-logarithm challenge ECC2K-130
The software for the Cell Broadband Engine and for the NVIDIA GTX 295 graphics card will be published here as soon as the discrete-logarithm computation has finished.
Chapter 7: Implementing Wagner's generalized birthday attack
Running the code we used to attack FSB48 with Wagner's generalized birthday attack requires the following hardware:
- 8 computers with a 64-bit processor, running a 64-bit Linux system,
- 8 GB of RAM on each of the computers,
- a hard disk with a free 700-GB data partition in each of the computers, and
- some network connection between the computers (our attack used switched Gigabit ethernet).
Installing MPICH-2
The implementation uses the MPI implementation
MPICH-2.
We followed this installation guide
to set up MPICH-2 on our cluster machines running Ubuntu 9.04 (aka Jaunty Jackalope).
As a remark: We had problems with the setup of MPI when the hostname was mapped to 127.0.0.1 in /etc/hosts,
we just removed this entry.
Downloading and running our code
To download and compile the code run
tar xjvf fsbday-20091213.tar.bz2
cd fsbday-20091213
make
If the data partition for the attack is not /dev/sda1, change the definition of IODEVICE in params.h accordingly before compiling, otherwise running the attack may destroy data on /dev/sda1!
This will produce a binary called fsbday, which you have to copy to each of the cluster nodes in the same directory. For this you can use the bash script mpicp.sh.
Running Phase 1 (Determine clamping constants)
The first phase of the attack (for the description of the two phases please read the paper) is started through the bash script fsbstart.sh, before running the script you might have to change the value of the HOSTFILE (the path to the MPI host file, set to ./hosts by default) in the script. The script takes as only argument the path to a logfile, so the first phase of the attack can be started with
The only output on stdout is the clamping constants tried and whether they lead to success, more information including the values yielding a collision can be found in the log file.
Running Phase 2 (Compute colliding matrix positions)
To determine the matrix positions from the clamping constants you have to run the program fsbfinal twice, once for the left and once for the right half-tree. The fsbfinal program needs as input the clamping constants used to generate the lists on level 0 and the value on level 3 that leads to success. The clamping constants are output on standard output, the value you will find in the logfile n a line looking like
when you grep for "Result". The first number (before the colon) is the number of the node.
Let's assume the clamping constants in the rightmost quarter-tree are you determined in phase 1 are 4, 4, 0, 0,
and let's assume the value as in the line above, then you first have to convert 54e28b37d58e75 to decimal, yielding 23892985608769141,
same for 781538a2cdf5639d yielding 8652884530953544605 and for
the part 1f8 yielding 504.
You then run fsbfinal as follows
mpirun -l -machinefile hosts -np 8 ./fsbfinal --nlists=8 --startlevel=0 --startlist=8 --cmp=23892985608769141 --remaining=8652884530953544605 --part=504 --node=0 0 0 0 0 0 0 4 4
This will print the positions (relative positions in the according blocks and half-blocks) to stderr.
For information about how to change to code to other attack scenarios please refer to the FSBday website.
Collision attack against the full hash function
To download and run the code of the simple attack against the full FSB48 hash function mentioned in my thesis do the following:
tar xjvf fsb48-pollard.tar.bz2
cd fsb48-pollard
make
./fsb48-pollard