Bitcoin and the blockchain in 10 minutes

(This post is an extract from another post written a year ago on my other, more personal blog.)

I finally got around to studying the math behind bitcoin.

If you more or less know what a hash is (a hash is a short string, e.g. 32 characters, that can be calculated from a file of arbitrary size; if even one byte in the file changes, the hash will be completely different; read more on Wikipedia, or ask me in the comments) and you more or less know how the public and private keys in asymmetric cryptography work (anything you encrypt (encode) with the public key can ONLY be decoded with its matching secret private key; you can SIGN any file with a secret private key, and anyone with the matching public key can prove the authenticity of that signature; read more on Wikipedia, or ask in the comments!), then you can more or less understand bitcoin in particular and cryptocurrency in general.
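To get a feeling for that avalanche behaviour, here is a minimal Python sketch using the standard hashlib module; the two messages are made up, but note how a single changed character yields a completely unrelated digest:

import hashlib

# two messages that differ in only one character
print(hashlib.sha256(b"pay Alice 1 BTC").hexdigest())
print(hashlib.sha256(b"pay Alice 2 BTC").hexdigest())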

Say you were to generate a completely random private key; you could then use a well-known procedure to derive its matching public key. By applying two successive hash functions to that public key, you have a bitcoin address!
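As an illustration, the following Python sketch derives such an address hash using the third-party ecdsa package. This is only a sketch, not a complete implementation: the final Base58Check step (version byte, checksum, Base58 encoding) that produces the familiar address string is omitted, and hashlib only offers ripemd160 if your OpenSSL build supports it.

import hashlib
from ecdsa import SigningKey, SECP256k1

sk = SigningKey.generate(curve=SECP256k1)  # the completely random private key
vk = sk.get_verifying_key()                # the matching public key
pubkey = b"\x04" + vk.to_string()          # uncompressed public key encoding

# two successive hash functions: SHA-256, then RIPEMD-160
h = hashlib.new("ripemd160", hashlib.sha256(pubkey).digest()).digest()
print(h.hex())  # Base58Check-encode this 20-byte hash to get the address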

If I were to owe you money, you could then give me that bitcoin address.

I could then pay you back by writing a specially crafted message called a bitcoin transaction, in which I state that I am transferring some bitcoins TO the address that you gave me FROM another bitcoin address (henceforth the source address), for which I have the matching secret private key.

In that message, I cryptographically sign the input part (a modified version of the whole transaction, including source and destination addresses) with the secret private key matching that source address. The signature mathematically proves that I own the bitcoins I am about to transfer, and it mathematically locks in the whole transaction, so that the destination addresses can’t be changed either. I generally also allocate a very small amount (by leaving money unaccounted for) as a transaction fee. We’ll see why in a minute.
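The real transaction format is more involved, but the sign-and-verify principle fits in a few lines of Python with the ecdsa package; the message below is a made-up stand-in for the actual serialised transaction:

from ecdsa import SigningKey, SECP256k1

sk = SigningKey.generate(curve=SECP256k1)  # private key behind the source address
vk = sk.get_verifying_key()                # public key anyone can check against

tx = b"from:SOURCE to:DEST amount:0.1"     # stand-in for the serialised transaction
signature = sk.sign(tx)

# anyone can verify; raises BadSignatureError if the transaction was tampered with
print(vk.verify(signature, tx))            # True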

I broadcast the signed transaction to the bitcoin network, where it eventually gets picked up by one or more of the bitcoin miners. Miners batch together a number of transactions into a block, together with a hash of the last successfully mined block, and a piece of random data called the nonce. They then proceed to continuously hash the block, changing the nonce every time so that the hash changes, until the first few digits of the hash are zeroes.

Because of the nature of cryptographic hashes, this will statistically take a very long time. One could get lucky and find the correct hash early, but generally it requires a whole lot of number crunching, which means kilowatts, which means actual money. The special hash resulting from this number crunching is called the proof of work.
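The following toy Python loop illustrates the principle. Note that this is only a sketch: real bitcoin mining uses a double SHA-256 over a specific block header layout and compares against a numeric difficulty target, rather than counting leading zeroes.

import hashlib
from itertools import count

block = b"hash-of-previous-block|batched-transactions|"
difficulty = 5  # required number of leading zero hex digits

for nonce in count():
    digest = hashlib.sha256(block + str(nonce).encode()).hexdigest()
    if digest.startswith("0" * difficulty):
        print(f"proof of work found: nonce={nonce} hash={digest}")
        break

Even at this toy difficulty, the loop typically needs around a million hashes; each additional required zero multiplies that effort by sixteen.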

When a miner has hit the jackpot, they broadcast the block to the network, which recognises it as the next valid block by checking the hash, and then, in a peer-to-peer fashion, irreversibly records it as the next block in the globally shared blockchain. The successful miner currently receives 12.5 bitcoins (on 2017-06-03 worth about 27,500 euro) as well as all of the included per-transaction fees. This reward is set to halve again sometime in the future.

Now you probably understand why so many people are mining so enthusiastically. (No, you can’t really participate anymore with your home PC like you could in the early days; to make any kind of impact, you have to acquire a large room full of bitcoin mining ASICs, circuitry that has been purpose-designed for one thing: bitcoin mining. On the other hand, if you play the lottery, you might as well fire up your PC.)

You could now go and print out your private key (or its QR code) and the matching bitcoin address (actually you only need the private key, the public key and address can be derived from it) and then destroy all of your computers. Whenever you need to send that bitcoin somewhere, you simply type in the private key or rather scan the QR code, and then repeat the process of creating a bitcoin transaction, using your private key.

The money is never actually stored anywhere; only transactions encoding the movement of money from one random virtual address to another are. Thanks to the hash-linked chain of proof-of-work blocks, this shared record is for all practical purposes unforgeable.
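This is worth letting sink in: an address’s balance is not a number stored in some database, it is a quantity you derive by replaying the transaction history. Here is a minimal Python sketch of that idea, using a made-up record format (real bitcoin tracks unspent transaction outputs rather than per-address balances):

# each entry is a made-up (from_address, to_address, amount) record
transactions = [
    ("COINBASE", "addr-A", 12.5),  # a mining reward creates new coins
    ("addr-A", "addr-B", 2.0),
    ("addr-B", "addr-C", 0.5),
]

def balance(address):
    """Derive an address's balance purely from the transaction history."""
    received = sum(amt for _, to, amt in transactions if to == address)
    spent = sum(amt for frm, _, amt in transactions if frm == address)
    return received - spent

print(balance("addr-B"))  # 1.5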

I find the relative simplicity of the whole thing utter genius: A usable and versatile currency backed by hard math.

Further reading

The two sources that helped me the most were Bitcoin transactions, metaphorically (Part 1) and Bitcoin transactions, technically (Part 2), both on the What does the quant say? blog.

Fixing crux-open-with on Ubuntu

The crux package for Emacs contains, amongst other useful functions (crux does stand for A Collection of Ridiculously Useful eXtensions for Emacs, after all), the function crux-open-with. This interactive function opens the file you currently have open, or the file under your cursor in dired, using the system application for that file.

On Ubuntu 16.04, this function unfortunately does not work. It looks like the application tries to start (icon visible in the dash) and then dies.

After some debugging I discovered that on Ubuntu, the system tool xdg-open (which is invoked by crux) starts gvfs-open, which in turn does not wait for the app to finish, but returns immediately.

The internal Emacs function start-process, used by crux-open-with, already runs the child process asynchronously. The early return of gvfs-open results in the sub-process closing, which means your app never gets off the ground.

To fix this, we have to get xdg-open to work in generic (i.e. non-Gnome / GVFS) mode, so that it waits for the app it spawns to return. We do this by setting the environment variable XDG_CURRENT_DESKTOP to X-Generic.

Fortunately, we can do this fairly neatly by wrapping the crux-open-with invocation in a let where we temporarily change the environment.

In my case it looks like this:

(use-package crux
  :config
  (defun cpb-crux-open-with (arg)
    "Invoke `crux-open-with' with xdg-open forced into generic mode."
    (interactive "P")
    (let ((process-environment
           (cons "XDG_CURRENT_DESKTOP=X-Generic" process-environment)))
      (crux-open-with arg)))
  (global-set-key (kbd "C-c o") 'cpb-crux-open-with))

If you don’t use use-package yet (you really should!), just copy out the part after :config.

This has been tested to work on Ubuntu 16.04 in both Unity and Gnome Shell 3.18.

Updates

  • 2017-04-18: Fixed environment variable + tested in Gnome Shell.

Miniconda3, TensorFlow, Keras on Google Compute Engine GPU instance: The step-by-step guide.

Google recently announced the availability of GPUs on Google Compute Engine instances. For my deep learning experiments I often need beefier GPUs than the puny GTX 750Ti in my desktop workstation, so this was good news. To make the GCE offering even more attractive, their GPU instances are also available in their EU datacenters, which, in terms of latency, is a big plus for me here on the southern tip of the African continent.

Last night I had some time to try this out, and in this post I would like to share with you all the steps I took to:

  1. Get a GCE instance with GPU up and running with miniconda, TensorFlow and Keras
  2. Create a reusable disk image with all software pre-installed so that I could bring up new instances ready-to-roll at the drop of a hat.
  3. Apply the pre-trained Resnet50 deep neural network on images from the web, as a demonstration that the above works. Thanks to Keras, this step is fun and fantastically straight-forward.

Pre-requisites

I started by creating a project for this work. On the Compute Engine console, check that this project is active at the top.

Before I was able to allocate GPUs to my instance, I had to fill in the “request quota increase” form available from the Compute Engine quotas page. My request for two GPUs in the EU region was approved within minutes.

I installed my client workstation’s id_rsa.pub public SSH key as a project-wide SSH key via the metadata screen.

Start an instance for the first time

I configured my GPU instance as shown in the following screenshot:

gce-create-instance.png

  • Under Machine type switch to Customize to be able to select a GPU.
  • I selected an Ubuntu 16.04 image, and changed the persistent disk to SSD.
  • I selected the europe-west1-b zone. Choose whatever is closest for you. The interface will warn you if the selection does NOT support GPUs.

After this, click on the Create button and wait for your instance to become ready.

Once it’s up and running, you’ll be able to ssh to the displayed public IP. I used ssh on my client workstation, but you could of course opt for the Google-supplied web-based version.

Install NVIDIA drivers and CUDA

I used the following handy script from the relevant GCE documentation:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda; then
  # The 16.04 installer works with 16.10.
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda -y
fi

After this, download the cuDNN debs from the NVIDIA download site using your developer account, and install the two debs using dpkg -i.

To confirm that the drivers have been installed, run the nvidia-smi command:

gce-nvidia-smi.png

Install miniconda, tensorflow and keras

I usually download the 64-bit Linux miniconda installer from conda.io and then install it into ~/miniconda3 by running the downloaded .sh script.

After this, I installed TensorFlow 1.0.1 and Keras 2.0.1 into a new conda environment by doing:

conda create -n ml python=3.6
source activate ml
conda install jupyter pandas numpy scipy scikit-image
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp36-cp36m-linux_x86_64.whl
pip install keras h5py

The keras package also installed theano, which I then uninstalled using pip uninstall theano in the active ml environment.

To test, run ipython and then type import keras. It should look like this:

gce-import-keras.png

Note that it’s picking up the TensorFlow backend, and successfully loading all of the CUDA libraries, including cuDNN.

Save your disk as an image for later

You will get billed for each minute that the instance is running. You also get billed for persistent disks that are still around, even if they are not used by any instance.

Creating a reusable disk image will enable you to delete instances and disks, and later to restart an instance with all of your software already installed.

To do this, follow the steps in the documentation, which I paraphrase and extend here:

  1. Stop the instance.
  2. In the instance list, click on the instance name itself; this will take you to the edit screen.
  3. Click the edit button, and then uncheck Delete boot disk when instance is deleted.
  4. Click the save button.
  5. Delete the instance, but double-check that delete boot disk is unchecked in the confirmation dialog.
  6. Now go to the Images screen and select Create Image with the boot disk as source.

Next time, go to the Images screen, select your image and then select Create Instance. That instance will come with all of your goodies ready to go!

Apply the ResNet50 neural network on images from the interwebs

After connecting to the instance with an SSH port redirect:

ssh -L 8889:localhost:8888 cpbotha@EXTERNAL_IP

… and then starting a jupyter notebook on the GCE instance:

cpbotha@instance-1:~$ source ~/miniconda3/bin/activate ml
(ml) cpbotha@instance-1:~$ jupyter notebook

… I can connect to the notebook at localhost:8889 on my client workstation. There I enter and execute the following code (adapted from the Keras documentation) in a cell:

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
from PIL import Image

model = ResNet50(weights='imagenet')

# adapted from https://github.com/fchollet/deep-learning-models
# to accept also a PIL image
def load_and_predict_image(img_or_path):
    target_size = (224,224)
    if isinstance(img_or_path, str):
        # a path was passed in: load from disk at ResNet50's expected size
        img = image.load_img(img_or_path, target_size=target_size)
    else:
        # a PIL image was passed in: resize it ourselves
        img = img_or_path.resize(target_size)

    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    preds = model.predict(x)
    # decode the results into a list of tuples (class, description, probability)
    # (one such list for each sample in the batch)
    print('Predicted:', decode_predictions(preds, top=3)[0])

In the next cell, I do:

from PIL import Image
import urllib.request

url1 = "https://upload.wikimedia.org/wikipedia/commons/thumb/c/c8/KuduKr%C3%BCger.jpg/1920px-KuduKr%C3%BCger.jpg"
url2 = "https://upload.wikimedia.org/wikipedia/commons/thumb/9/9d/Struthio_camelus_-_Etosha_2014_%283%29.jpg/800px-Struthio_camelus_-_Etosha_2014_%283%29.jpg"
im = Image.open(urllib.request.urlopen(url2))

load_and_predict_image(im)

To be greeted with the following results:

gce-keras-resnet.png

The pre-trained ResNet50 network initially identified the kudu I gave it (url1) as an ostrich, so I decided to make it a bit easier for the poor network by actually giving it an ostrich (url2), which it did identify with a 99.98% probability.

Looking at the photos, the former taken in the Kruger National Park and the latter in Etosha, I can imagine that the network could mistake the kudu for an ostrich due to the similar background and foreground colouring, and because it was clearly not trained on kudu.

Let’s file that under future work!

Recursive text search in project without projectile

If you’re not using projectile, but you would like to be able to perform interactive, search-as-you-type, recursive searches through the current project, this is pretty easy to do if you have counsel and find-file-in-project installed.

Add the following Emacs Lisp to your init:

(defun cpb/counsel-ag-in-project ()
  "Use `ffip' and `counsel-ag' for quick project-wide text searching."
  (interactive)
  (let ((project-root (ffip-project-root)))
    ;; if ffip could not find project-root, it will already have
    ;; shown an error message. We only have to check for non-nil.
    (if project-root
        (counsel-ag nil project-root nil
                    (format "Search in PRJ %s" project-root)))))

(global-set-key (kbd "C-c s") 'cpb/counsel-ag-in-project)

This function, which I have bound to C-c s, finds the project root using find-file-in-project’s function for this, and then invokes the handy counsel-ag on its result.

From org file with local bibtex to LaTeX and PDF

vxlabs-emacs-org-ref-pdf-example.png
Screenshot of orgmode source, PDF preview on the right, interactive citation selection in the minibuffer. Click for full resolution.

I have (co-)written a few LaTeX documents in my time.

However, as I have been writing my life and lab notes and many of my technical blog posts in Emacs orgmode for the past few years, I wanted to see how one would go about using BiBTeX references in orgmode files (using John Kitchin’s org-ref package) such that they would render correctly in orgmode’s exported LaTeX and PDF.

My use case is of course slightly different from the norm: I maintain my master bibliography using the wonderful Zotero software, and I prefer exporting document-specific BiBTeX files from that main database.

This post shows you how to do it.

Pre-requisites

  • Make sure that latexmk is installed on your system. This is one of the easiest ways to make sure that pdflatex and bibtex are executed in the correct sequence, and a sufficient number of times.
  • With M-x package-install RET org-ref install the org-ref package. This will enable interactive selection and insertion of bibtex references from your local bib file.
  • You should also have the ox-latex Emacs package installed. This is what gives orgmode its LaTeX output powers.

Emacs setup

In your init.el somewhere, use the following code to ensure that the orgmode LaTeX exporter invokes latexmk in the correct way:

(setq org-latex-pdf-process
    '("latexmk -pdflatex='pdflatex -interaction nonstopmode' -pdf -bibtex -f %f"))

A minimal but complete orgmode example

The usual org-ref workflow is that you use it to manage a single main BiBTeX file, which is configured in your init.el. As mentioned above, I prefer using small document-specific BiBTeX files.

This is what the first three lines set up, using an Emacs file variable. Note that the local bib file is specified without quotes of any kind. Also, because no path is specified, the bib file lives together with the orgmode document itself, which I prefer.

# Local Variables:
# org-ref-default-bibliography: local-bibtex-file.bib
# End:

After this, I have two LaTeX-specific lines ensuring that the output PDF looks prettier than the default. You can thank me later.

The third LaTeX line imports the natbib package, which gives me extra cite commands to differentiate between, for example, textual citing (the first example below) and parenthetical citing (the second example).

#+LATEX_CLASS_OPTIONS: [a4paper, 11pt, colorlinks=true, citecolor=., linkcolor=black, urlcolor=black]
#+LATEX_HEADER: \usepackage{fourier}
#+LATEX_HEADER: \usepackage{natbib}

#+TITLE: Example orgmode to LaTeX

* Introduction

  This is that really difficult intro section, as was also
  demonstrated by cite:huang_identification_2016.

* Conclusions

  This is the even more difficult final section. We will now use a
  parenthetical citation citep:meyer_four-level_2012.

Finally, and very importantly, I have the orgmode markup to select a suitable bibliography style, and again to specify the bib file. This will get automatically translated to the suitable LaTeX or even HTML versions.
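In my document that markup, placed at the end of the body, looks something like the following; unsrtnat is just one example of a natbib-compatible style, so substitute your own preference:

bibliographystyle:unsrtnat
bibliography:local-bibtex-file.bib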

Putting it all together

With something like the example above in your orgmode file, and a suitable bib file, you can use C-c C-e l o to export to LaTeX, run the latexmk command we configured above, and view the resulting PDF.

In the screenshot at the start of this post, you can see what this looks like on my setup. I am using PDFTools for viewing the PDF inside of Emacs (this gets updated automatically whenever I rebuild the PDF).

You can also see the ivy-based completion as I’m in the process of interactively inserting a new citation into the document.

The default helm-based completion is more attractive, but it takes over the whole display, which is why I made use of ivy for this demonstration.