Organizing literature


Most people working in research know the problem: whenever you read an interesting paper, you download the PDF and it ends up somewhere on your hard disk. The next time you want to read the same paper, finding it on the disk is too much effort, so you use your favorite web search engine to find the PDF again. Of course, if you always use web search to find a paper, you do not have to store anything on your hard disk, but this approach comes with two problems. First, some papers are hard to find online: they may disappear or be behind paywalls. Just because you found and downloaded a paper once does not mean that you will always be able to find and download it again. Second, you need to be online, and it has happened to me quite often that I needed one particular paper while I was on a flight without Internet access.
To solve this problem, I decided to organize the PDF files on my hard disk so that I can easily find papers. I can imagine various ways to organize them: by title, by year, by author, by conference or journal, etc. Since I want to access a paper by any of these search keys, I set up the following system:

Now I can just open the HTML file in my browser, search for a paper by any information I remember, and open it with one click. The script I use to generate HTML from BibTeX is called bib.py. If you also want to use it, simply set the variable LITERATURE_DIR in line 8 to the directory containing all PDF (or PostScript) files, and set the variable BIBFILE in line 9 to the path of your BibTeX file. The script generates links not only to local copies but also to online copies (from URLs given in the BibTeX file). I do not claim that the script works with all BibTeX files; it works with mine.

As a last step, I want the HTML file to be updated automatically whenever I put a new PDF or PostScript file into my literature directory or edit the BibTeX file. This is easily achieved by running a script in the background that uses inotifywait (part of the inotify-tools package) to monitor the literature directory. I wrote such a script, called watchbib.sh. It should be self-explanatory; simply set the three variables LITERATURE_DIR, BIB, and BIBHTML to the correct paths. I start it in the background together with fluxbox by putting the following line into my ~/.fluxbox/startup file:

~/programming/scripts/watchbib.sh &
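
If inotifywait is not available, the same idea can be approximated by polling. The following is only a sketch of that alternative in Python, not a port of watchbib.sh: it compares modification times every few seconds instead of reacting to inotify events. The names `snapshot`, `watch`, and `regenerate`, as well as the `max_checks` parameter (which is only there so the loop can be stopped), are made up for this example.

```python
import time
from pathlib import Path


def snapshot(literature_dir, bibfile):
    """Map every PDF/PostScript file and the BibTeX file to its modification time."""
    watched = [p for p in Path(literature_dir).iterdir()
               if p.suffix.lower() in (".pdf", ".ps")]
    watched.append(Path(bibfile))
    state = {}
    for path in watched:
        try:
            state[str(path)] = path.stat().st_mtime_ns
        except OSError:
            # The file vanished between listing and stat; treat it as absent.
            pass
    return state


def watch(literature_dir, bibfile, regenerate, interval=2.0, max_checks=None):
    """Call regenerate() whenever a watched file is added, removed, or modified."""
    previous = snapshot(literature_dir, bibfile)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        current = snapshot(literature_dir, bibfile)
        if current != previous:
            regenerate()
            previous = current
        checks += 1
```

Because only PDF, PostScript, and the BibTeX file are part of the snapshot, writing the generated HTML file into the same directory does not trigger another regeneration. Polling wakes up periodically even when nothing changes, which is why an event-based tool such as inotifywait is the nicer solution where it exists.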