Improving fastai’s mixed precision support with NVIDIA’s Automatic Mixed Precision.

TL;DR: For best results with mixed precision training, use NVIDIA’s Automatic Mixed Precision together with fastai, and remember to set any epsilons, for example in the optimizer, correctly.

Background

Newer NVIDIA GPUs such as the consumer RTX range, the Tesla V100 and others have hardware support for half-precision / fp16 tensors.

This is interesting, because many deep neural networks still function perfectly well if you store most of their parameters in the far more compact 16-bit floating point format. The newer hardware (the so-called Tensor Cores) is able to further accelerate these half-precision operations.

In other words, with one of the newer cards, you’ll be able to fit a significantly larger neural network into the usually quite limited GPU memory (with CNNs, I can work with networks that are 80% larger), and you’ll be able to train that network substantially faster.

fastai has built-in support for mixed-precision training, but NVIDIA’s AMP can do better in some cases because it uses dynamic, instead of static, loss scaling.

In the rest of this blog post, I briefly explain the two steps you need to take to get all of this working.

Step 1: Set epsilon so it doesn’t disappear under fp16

I’m mentioning this first so you don’t miss it.

Even after adding AMP to your configuration, you might still see NaNs during network training.

If you’re lucky, you will run into this post on the PyTorch forums.

In short, the torch.optim.Adam optimizer, and probably a number of other optimizers in PyTorch, take an epsilon argument which is added to possibly small denominators to avoid dividing by zero.

The default value of epsilon is 1e-8. Whoops!

Under fp16, 1e-8 underflows to 0, so it does nothing to protect your possibly-zero denominators.
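You can see the underflow for yourself with a two-line PyTorch check (a minimal sketch; the exact printed values may differ slightly between PyTorch versions):

import torch

# 1e-8 is below fp16's smallest representable value (about 6e-8), so it rounds to zero
print(torch.tensor(1e-8).half())   # tensor(0., dtype=torch.float16)
# 1e-4 is still comfortably representable in fp16
print(torch.tensor(1e-4).half())   # approximately 1e-4, dtype=torch.float16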

The fix is simple: supply a larger epsilon.

Because I’m using fastai’s Learner directly, and this takes a callable for the optimization function, I created a partial:

# create fp16-safe AdamW
# see: https://discuss.pytorch.org/t/adam-half-precision-nans/1765/4
# the default eps of 1e-8 rounds to 0 in fp16; values down to 1e-7 can still be represented
# this eps only exists to prevent divide-by-zero errors, so a larger value is safe
from functools import partial
import torch

AdamW16 = partial(torch.optim.Adam, betas=(0.9, 0.99), eps=1e-4)

# then stick model + databunch into a new Learner
# (fai here is fastai, i.e. "import fastai as fai"; ml_sm_loss and metrics come from the rest of my script)
learner = fai.basic_train.Learner(data, model, loss_func=ml_sm_loss, metrics=metrics, opt_func=AdamW16)

Step 2: Set up NVIDIA’s Automatic Mixed Precision

fastai’s built-in support for mixed precision training certainly works in many cases. However, it uses a configurable static loss scaling parameter (default 512.0), which in some cases won’t get you as far as dynamic loss scaling.

With dynamic loss scaling, the scaling factor is continuously adapted to squeeze the most out of the available precision.
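To make the idea concrete, here is a rough sketch of what a dynamic loss scaler does (my own simplified illustration in PyTorch terms, not AMP’s actual implementation):

import torch

# Simplified sketch of dynamic loss scaling -- NOT AMP's actual code.
# The loss is multiplied by a large factor so that small fp16 gradients
# don't underflow to zero; on overflow the step is skipped and the factor
# is halved, and after a long run of good steps it is increased again.
scale = 2.0 ** 16
good_steps = 0

def backward_with_dynamic_scaling(loss, optimizer):
    global scale, good_steps
    (loss * scale).backward()
    params = [p for group in optimizer.param_groups for p in group["params"]]
    overflow = any(p.grad is not None and not torch.isfinite(p.grad).all()
                   for p in params)
    if overflow:
        # gradients blew up under the current scale: skip this step, halve the scale
        scale /= 2.0
        good_steps = 0
    else:
        # unscale the gradients before the optimizer applies them
        for p in params:
            if p.grad is not None:
                p.grad.div_(scale)
        optimizer.step()
        good_steps += 1
        if good_steps % 2000 == 0:
            # a long stretch without overflow: try a larger scale again
            scale *= 2.0
    optimizer.zero_grad()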

(You could read sgugger’s excellent summary of mixed precision training on the fastai forums.)

I was trying to fit a Squeeze-and-Excitation ResNeXt-50 32×4d with image size 400×400 and batch size 24 into the 8GB of RAM of the humble but hard-working RTX 2070 in my desktop, so I needed all of the dynamic scaling help I could get.

After applying the epsilon fix mentioned above, you then install NVIDIA Apex, and finally make three small changes: one to your own training script and two to fastai’s code.

Install NVIDIA Apex

Download and install NVIDIA Apex into the Python environment you’re using for your fastai experiment.

conda activate your_fastai_env
cd ~
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

If apex does not build, you can also try without --cuda_ext --cpp_ext, although it’s best if you can get the extensions built.

Modify your training script

At the top of your training script, before any other imports (especially anything to do with PyTorch), add the following:

from apex import amp
amp_handle = amp.init(enabled=True)

This will initialise apex, enabling it to hook into a number of PyTorch calls.

Modify fastai’s training loop

You will have to modify fastai’s basic_train.py, which you should be able to find in your_env_dir/lib/python3.7/site-packages/fastai/. Check and double-check that you have the right file.

At the top of this file, before any other imports, add the following:

from apex.amp import amp
# retrieve initialised AMP handle
amp_handle = amp._DECORATOR_HANDLE

Then, edit the loss_batch function according to the following instructions and code snippet. You only add two new lines of code, which replace the loss.backward() call that you comment out.

if opt is not None:
    loss = cb_handler.on_backward_begin(loss)

    # The following lines REPLACE the commented-out "loss.backward()"
    # opt is an OptimWrapper -- unwrap to get real optimizer
    with amp_handle.scale_loss(loss, opt.opt) as scaled_loss:
        scaled_loss.backward()

    # loss.backward()

All of this is merely following NVIDIA AMP’s usage instructions, which I most recently tested on fastai v1.0.42, latest at the time of this writing.

Results

If everything goes according to plan, you should be able to obtain the following well-known graph with a much larger network than you otherwise would have been able to use.

The example learning-rate finder plot below was generated with the se-resnext50-32x4d, image size 400×400, batch size 24 on my RTX 2070, as mentioned above. The procedure documented in this post works equally well on high end units such as the V100.

lr_finder_v20181127-se_resnext50_32x4d-320-48-400-24-fp16-v5-ml_sm-aug_wd1e-5_frozen_2019-02-04_09-39-29.png

A Simple Ansible script to convert a clean Ubuntu 18.04 to a CUDA 10, PyTorch 1.0 preview, fastai, miniconda3 deep learning machine.

I have prepared a simple Ansible script which will enable you to convert a clean Ubuntu 18.04 image (as supplied by Google Compute Engine or PaperSpace) into a CUDA 10, PyTorch 1.0 preview, fastai 1.0.x, miniconda3 powerhouse, ready to live the (mixed-precision!) deep learning dream.

I built this script specifically in order to be able to do mixed-precision neural network training on NVIDIA’s TensorCores. It currently makes use of the vxlabs.com build of PyTorch 1.0, because we need full CUDA 10 for the new TensorCores.

When I run this in order to configure a V100-equipped paperspace machine with 8 cores and 30GB of RAM, it takes about 13 minutes from start to finish.

Here’s a 20x sped-up video showing the script doing its work on a GCE V100 machine, also with 8 cores and 30 GB RAM:

After running the script, you’ll be able to ssh or mosh in, type conda activate pt, and then start your NVIDIA-powered deep learning engines.

You can find the whole setup, including detailed instructions, at the ansible-ubu-to-pytorch github repo.
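For reference, a full run looks something like this, assuming you have cloned the repo and edited inventory.cfg to point at your instance (see the repo’s README for the exact inventory format):

# from inside the cloned ansible-ubu-to-pytorch directory,
# with inventory.cfg pointing at your Ubuntu 18.04 machine:
ansible-playbook -i inventory.cfg deploy.yml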

Updates

2018-11-24

Updated to latest 2018-11-24 build of PyTorch 1.0 preview with the new magma 2.4.0 packages.

To update an existing install, you can either just re-run the whole playbook, or you can run just the miniconda3-related tasks like this:

ansible-playbook -i inventory.cfg deploy.yml --tags "miniconda3"

Configuring Emacs, lsp-mode and Microsoft’s Visual Studio Code Python language server.

In a previous post I showed how to get Palantir’s Python Language Server working together with Emacs and lsp-mode.

In this post, we look at the brand-new elephant in the room, Microsoft’s own, arguably far more powerful, Python Language Server, and how to integrate it with Emacs.

Motivation

Since that previous post on Palantir’s language server, I’ve been using Emacs far more intensively for Python coding in tmux on remote machines with GPUs for deep learning.

The interactive programming possibilities of Emacs (remember that Lisp programmers have been doing this since the 60s) make for a great development solution: I can interact with my remotely running neural network code, start a long-running training, and then detach from the running tmux. When I re-attach the next morning (or the next), I can continue interactively experimenting with the still-running Python instance, for example further fine-tuning the training.

Palantir’s Python Language Server can become sluggish at times, so I switched back to elpy, which is usually quite snappy and affords an impressive suite of code-intelligence features for such a small package.

However, the elpy RPC Python process has a tendency to die quite often, and even M-x elpy-rpc-restart simply stops working at some point.

This, together with the fact that Microsoft’s own Python Language Server (the one used in both Visual Studio Code and Microsoft’s flagship Visual Studio IDE) enjoys, and will probably continue to enjoy, a far larger share of developer mindshare and attention, encouraged me to try to get this language server working with Emacs as well.

This exercise cost many more hours than I was planning to spend, but everything seems to be working now.

In this post, I will explain step-by-step how you too can enjoy the highly multi-threaded and actively developed Microsoft Python Language Server in your Emacs.

The Goal

If you follow the steps set out further down in this blog post, your Emacs too could end up looking like the one shown in the following screenshots.

emacs-ms-lsp-docbox-term_2018-11-19_16-21-42.png
LSP showing the blue documentation overlay box, with Emacs in a tmux, i.e. terminal mode, on a remote Linux machine, editing PyTorch / fastai training code.
emacs-ms-lsp-docbox-gui_2018-11-19_16-23-44.png
The same documentation showed locally, i.e. in GUI mode, on the macOS desktop.
emacs-ms-lsp-docbox-and-describe-term_2018-11-19_16-26-28.png
lsp-describe-thing-at-point on the right for more persistent documentation, blue overlay box on the left.
ms-pyls-threads_2018-11-19_16-28-45.png
C#’s strong multi-threading support comes in handy when parsing Python code! Ironic?

Two methods: The new and improved, and the old and complicated

Shortly after I wrote this blog post, Andrew Christianson turned everything here into a far easier to use Emacs package which you can conveniently download from github.

You can safely ignore the next subsection, which I’m keeping here purely for historical purposes.

It’s really great when you put together a blog post, and then someone else takes that ball and RUNS with it!

The OLD way: Four steps to combining Emacs with Microsoft’s Python Language Server

Read through all of the steps first, and then follow them carefully.

Let me know in the comments if anything should be further clarified.

Build the Microsoft Python Language Server using dotnet

If you don’t have this yet, download the .NET Core SDK for your platform from Microsoft’s .NET download site and install it.

Next, build the language server:

mkdir ~/build
cd ~/build
git clone https://github.com/Microsoft/python-language-server.git
cd python-language-server/src/LanguageServer/Impl
dotnet build -c Release

Install required Emacs packages

In Emacs, install the required and some optional packages using for example M-x package-install:

  • lsp-mode – the main language server protocol package
  • lsp-ui – UI-related LSP extras, such as the sideline info, docs, flycheck, etc.
  • company-lsp – company-backend for LSP-based code completion.
  • projectile or find-file-in-project – we use a single function from here to determine the root directory of a project.

Required change to lsp-ui-doc.el

Microsoft’s Python Language Server likes to replace spaces in the documentation it returns with &nbsp; HTML entities.

Furthermore, there seems to be an additional misunderstanding between Emacs lsp-mode and the MS PyLS with regard to the interpretation of markdown and plaintext docstrings.

Both of these issues impact the blue documentation overlay, and can be worked around by editing the lsp-ui-doc--extract function in lsp-ui-doc.el.

Right before the line with

((gethash "kind" contents) (gethash "value" contents)) ;; MarkupContent

add the following sexp:

;; cpbotha: with numpy functions, e.g. np.array for example,
;; kind=markdown and docs are in markdown, but in default
;; lsp-ui-20181031 this is rendered as plaintext see
;; https://microsoft.github.io/language-server-protocol/specification#markupcontent

;; not only that, MS PyLS turns all spaces into &nbsp; instances,
;; which we remove here. this single additional cond clause fixes all
;; of this for hover

;; as if that was not enough: import pandas as pd - pandas is returned
;; with kind plaintext but contents markdown, whereas pd is returned
;; with kind markdown. fortunately, handling plaintext with the
;; markdown viewer still looks good, so here we are.
((member (gethash "kind" contents) '("markdown" "plaintext"))
 (replace-regexp-in-string "&nbsp;" " " (lsp-ui-doc--extract-marked-string contents)))

Emacs configuration

Add the following to your Emacs init.el, and don’t forget to read the comments.

If you’re not yet using use-package, now would be a good time to upgrade.

(use-package lsp-mode
  :ensure t
  :config

  ;; change nil to 't to enable logging of packets between emacs and the LS
  ;; this was invaluable for debugging communication with the MS Python Language Server
  ;; and comparing this with what vs.code is doing
  (setq lsp-print-io nil)

  ;; lsp-ui gives us the blue documentation boxes and the sidebar info
  (use-package lsp-ui
    :ensure t
    :config
    (setq lsp-ui-sideline-ignore-duplicate t)
    (add-hook 'lsp-mode-hook 'lsp-ui-mode))

  ;; make sure we have lsp-imenu everywhere we have LSP
  (require 'lsp-imenu)
  (add-hook 'lsp-after-open-hook 'lsp-enable-imenu)

  ;; install LSP company backend for LSP-driven completion
  (use-package company-lsp
    :ensure t
    :config
    (push 'company-lsp company-backends))

  ;; dir containing Microsoft.Python.LanguageServer.dll
  (setq ms-pyls-dir (expand-file-name "~/build/python-language-server/output/bin/Release/"))

  ;; this gets called when we do lsp-describe-thing-at-point in lsp-methods.el
  ;; we remove all of the &nbsp; entities that MS PYLS adds
  ;; this is mostly harmless for other language servers
  (defun render-markup-content (kind content)
    (message kind)
    (replace-regexp-in-string "&nbsp;" " " content))
  (setq lsp-render-markdown-markup-content #'render-markup-content)

  ;; it's crucial that we send the correct Python version to MS PYLS, else it returns no docs in many cases
  ;; furthermore, we send the current Python's (can be virtualenv) sys.path as searchPaths
  (defun get-python-ver-and-syspath (workspace-root)
    "return list with pyver-string and json-encoded list of python search paths."
    (let ((python (executable-find python-shell-interpreter))
          (init "from __future__ import print_function; import sys; import json;")
          (ver "print(\"%s.%s\" % (sys.version_info[0], sys.version_info[1]));")
          (sp (concat "sys.path.insert(0, '" workspace-root "'); print(json.dumps(sys.path))")))
      (with-temp-buffer
        (call-process python nil t nil "-c" (concat init ver sp))
        (subseq (split-string (buffer-string) "\n") 0 2))))

  ;; I based most of this on the vs.code implementation:
  ;; https://github.com/Microsoft/vscode-python/blob/master/src/client/activation/languageServer/languageServer.ts#L219
  ;; (it still took quite a while to get right, but here we are!)
  (defun ms-pyls-extra-init-params (workspace)
    (destructuring-bind (pyver pysyspath) (get-python-ver-and-syspath (lsp--workspace-root workspace))
      `(:interpreter (
                      :properties (
                                   :InterpreterPath ,(executable-find python-shell-interpreter)
                                   ;; this database dir will be created if required
                                   :DatabasePath ,(expand-file-name (concat ms-pyls-dir "db/"))
                                   :Version ,pyver))
                     ;; preferredFormat "markdown" or "plaintext"
                     ;; experiment to find what works best -- over here mostly plaintext
                     :displayOptions (
                                      :preferredFormat "plaintext"
                                      :trimDocumentationLines :json-false
                                      :maxDocumentationLineLength 0
                                      :trimDocumentationText :json-false
                                      :maxDocumentationTextLength 0)
                     :searchPaths ,(json-read-from-string pysyspath))))  

  (lsp-define-stdio-client lsp-python "python"
                           #'ffip-get-project-root-directory
                           `("dotnet" ,(concat ms-pyls-dir "Microsoft.Python.LanguageServer.dll"))
                           :extra-init-params #'ms-pyls-extra-init-params)

  ;; lsp-python-enable is created by macro above 
  (add-hook 'python-mode-hook
            (lambda ()
              (lsp-python-enable))))

Conclusions

Although I would have preferred to do this without the two lsp-mode work-arounds, I am pretty satisfied with this setup.

With the number of users and development effort Microsoft’s Python Language Server has been enjoying and will probably continue to enjoy, it’s great knowing we can make use of this functionality in Emacs.

I am curious how well eglot, a smaller Emacs LSP package than lsp-mode, would do based on the integration above. (hint hint…)

Updates

2018-12-26

Andrew Christianson converted all of the complicated instructions in this blog post into a far easier to use Emacs package called lsp-python-ms!

2018-11-22

Uploaded new version of emacs-lisp init code with two improvements:

  • Thanks to reddit user cyanj for the print_function import suggestion, and commenter Erik Hetzner below for the old-style string interpolation suggestion, this setup should now also work with Python 2, in addition to 3.
  • The ms-pyls database directory is now created as a subdirectory of the bindir.

2018-11-20

So far, I have logged two issues, one with MS PyLS and one with lsp-mode, so that we can hopefully one day remove some of the work-arounds detailed above.

PyTorch 1.0 preview (Dec 6, 2018) packages with full CUDA 10 support for your Ubuntu 18.04 x86_64 systems.

(The wheel has now been updated to the latest PyTorch 1.0 preview as of December 6, 2018.)

You’ve just received a shiny new NVIDIA Turing (RTX 2070, 2080 or 2080 Ti), or maybe even a beautiful Tesla V100, and now you would like to try out mixed precision (well mostly fp16) training on those lovely tensor cores, using PyTorch on an Ubuntu 18.04 LTS x86_64 system.

tensor-core.jpg

The idea is that these tensor cores chew through fp16 much faster than they do through fp32. In practice, neural networks tolerate having large parts of themselves living in fp16, although one does have to be careful with this. Furthermore, fp16 promises to save a substantial amount of graphics memory, enabling one to train bigger models.
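A quick back-of-the-envelope check of the memory claim (a minimal sketch; it runs on CPU too, since the byte arithmetic is the same):

import torch

x32 = torch.randn(1024, 1024)                  # fp32: 4 bytes per element
x16 = x32.half()                               # fp16: same shape, 2 bytes per element
print(x32.element_size() * x32.nelement())     # 4194304 bytes
print(x16.element_size() * x16.nelement())     # 2097152 bytes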

For full fp16 support on the Turing architecture, CUDA 10 is currently the best option. Also, a number of CUDA 10 specific improvements were made to PyTorch after the 0.4.1 release.

However, PyTorch 1.0 (first release after 0.4.1) is not quite ready yet, and neither is it easy to find CUDA 10 builds of the current PyTorch 1.0 preview / PyTorch nightly.

Oh noes…

Well, fret no more!

Here you’ll be able to find a fully CUDA 10 based build (pip wheel format) of PyTorch master as of December 6, 2018 (updated!), up to and including commit b5db6ac. I’ve linked it with a fully CUDA 10 based build of MAGMA 2.4.0 as well, which I built as a conda package.

Installing and using these packages.

Ensure that you have an Ubuntu 18.04 LTS system with CUDA 10 and CUDNN installed and configured. See this great CUDA 10 howto by Puget Systems.

After this, you will also need to download CUDNN 7.1 packages for your system from the NVIDIA Developer site. An NVIDIA developer account (free signup) is required for this. I downloaded and installed libcudnn7_7.4.1.5-1+cuda10.0_amd64.deb and libcudnn7-dev_7.4.1.5-1+cuda10.0_amd64.deb but you’ll probably only need the former.

Set up a suitable conda environment with Python 3.7. Create and activate it with something like the following:

conda create -n pt python=3.7 numpy mkl mkl-include setuptools cmake cffi typing
conda activate pt
conda install -c mingfeima mkldnn

You can now download the PyTorch nightly wheel of 2018-12-06 (347MB) and install with:

pip install torch-1.0.0a0+b5db6ac+20181206-cp37-cp37m-linux_x86_64.whl

The libraries in the wheel don’t have the conda-style relative RUNPATH correctly set, so you have to set LD_LIBRARY_PATH every time you start jupyter or any other Python code. This should work:

LD_LIBRARY_PATH=$CONDA_PREFIX/lib jupyter lab

You’re now good to go!

First tests of mixed precision training with fast.ai on Tesla V100.

I fired up a Google Compute Engine node with a Tesla V100 in Amsterdam to check that everything works.

I used the latest version of the fastai library, and specifically the callbacks.fp16 notebook which forms part of the brilliant new fastai documentation generation system. See for example the generated page on the fp16 callbacks.

Below I show the MNIST example code where I tried to compare fp32 with fast.ai fp16 (well, mixed precision to be precise) training.

The simple CNN trains up to 97% accuracy in 8 seconds, which is pretty quick already, but I could not see any training speed difference between fp16 and fp32. This could very well be because the network is so tiny.
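If you want to see the raw tensor core speedup that such a tiny model hides, a large matrix multiplication is a more direct probe. This is a rough micro-benchmark sketch, assuming a CUDA device; exact numbers will vary:

import torch

def time_matmul(dtype, n=4096, iters=20):
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        a @ b
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters   # milliseconds per matmul

print("fp32:", time_matmul(torch.float32))
print("fp16:", time_matmul(torch.float16))   # should be much faster on tensor core GPUs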

However, I could confirm that the model parameters (at the very least) were all stored in fp16 floats when using the fast.ai to_fp16() Learner method.

Train CNN with fp16

from fastai import *
from fastai.vision import *
path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
model = simple_cnn((3,16,16,2))
learn = Learner(data, model, metrics=[accuracy]).to_fp16()
learn.fit_one_cycle(5)
Total time: 00:08
epoch  train_loss  valid_loss  accuracy
1      0.202592    0.139505    0.948970  (00:01)
2      0.112530    0.103523    0.967125  (00:01)
3      0.079813    0.063746    0.973994  (00:01)
4      0.066733    0.056465    0.976938  (00:01)
5      0.069775    0.055017    0.977429  (00:01)

Check that type of parameters is half:

for p in model.parameters():
    print(p.type())
torch.cuda.HalfTensor
torch.cuda.HalfTensor
torch.cuda.HalfTensor

Train CNN with fp32

model32 = simple_cnn((3,16,16,2))
learn32 = Learner(data, model32, metrics=[accuracy])
learn32.fit_one_cycle(5)
Total time: 00:08
epoch  train_loss  valid_loss  accuracy
1      0.213889    0.151780    0.942100  (00:01)
2      0.106975    0.092190    0.966634  (00:01)
3      0.084529    0.083353    0.973013  (00:01)
4      0.069017    0.066023    0.976938  (00:01)
5      0.060235    0.056738    0.980373  (00:01)

Check that type of model parameters is full float:

for p in model32.parameters():
    print(p.type())
torch.cuda.FloatTensor
torch.cuda.FloatTensor
torch.cuda.FloatTensor

Importing all of your orgmode notes into Apple Notes for mobile access.

Over the years, I’ve built up quite a collection of notes as Org mode text files. So far, it has proven to be the most expressive and the most robust note-taking modality out of a long list of candidates that I’ve tried.

Note-taking using Org mode has one big drawback however: Mobile accessibility.

In other words, consulting one’s org mode notes database from a mobile device is painful. This should not be the case; notes should be always and instantly available, even on mobile.

In this blog post, I show you how to import your complete org mode notes database into Apple Notes, including typesetting and attachments, using the org mode publishing functionality.

To be clear: Org mode on the desktop remains my primary note-taking system. The goal of importing all of my notes into Apple Notes is only to have my personal knowledge base accessible from my mobile devices.

End result

After configuring and running org-publish, and then importing the whole directory of exported HTML files and attachments into Apple Notes on macOS, your notes will look like one of the two examples below: First macOS, then iOS on the phone.

Note the Emacs-supplied syntax highlighting, and the inline image.

If you import these to your iCloud account (the default), the notes will be available on all of your iOS mobile devices.

These imported notes are fully indexed, and hence searchable from all of your devices.

screenshot_2018-10-29_13-21-06.png
An example note in Apple Notes on macOS.
screenshot_2018-10-29_13-44-23.png
The same note as above, but now in Apple Notes on iOS.

org-publish configuration

In order to make this happen, we make use of the org-publish functionality. We also configure one or two Apple Notes-specific changes to improve rendering.

Add the configuration below to your init.el.

There are two publish targets: One for the org files (called pkb4000 below), and one for all of the attachments (called pkb4000-static below).

As an aside, pkb4000 is short for Personal Knowledge Base 4000. I chose the name as a joke, as this synced directory of org mode files felt like just the Nth in a long series of knowledge base iterations. Little did I know how well this one would stick.

Remember to change both the :base-directory properties to the top-level directory of your notes database. :publishing-directory is anywhere convenient where you would like to store the published HTML files and attachments.

;; https://orgmode.org/manual/HTML-preamble-and-postamble.html
;; disable author + date + validate link at end of HTML exports
(setq org-html-postamble nil)

(defun org-html-publish-to-html-for-apple-notes (plist filename pub-dir)
  "Convert blank lines to <br /> and remove <h1> titles."
  ;; temporarily configure export to convert math to images because
  ;; apple notes obviously can't use mathjax (the default)
  (let* ((org-html-with-latex 'imagemagick)
         (outfile
          (org-publish-org-to 'html filename
                              (concat "." (or (plist-get plist :html-extension)
                                              org-html-extension
                                              "html"))
                              plist pub-dir)))
    ;; 1. apple notes handles <p> paras badly, so we have to replace all blank
    ;;    lines (which the orgmode export accurately leaves for us) with
    ;;    <br /> tags to get apple notes to actually render blank lines between
    ;;    paragraphs
    ;; 2. remove large h1 with title, as apple notes already adds <title> as
    ;; the note title
    (shell-command
     (format "sed -i \"\" -e 's/^$/<br \\/>/' -e 's/<h1 class=\"title\">.*<\\/h1>$//' %s"
             outfile))
    outfile))

(setq org-publish-project-alist
      '(("pkb4000"
         :base-directory "~/Dropbox/notes/pkb4000/"
         :publishing-directory "~/Downloads/pkb4000"
         :recursive t
         :publishing-function org-html-publish-to-html-for-apple-notes
         :section-numbers nil
         :with-toc nil)
        ("pkb4000-static"
         :base-directory "~/Dropbox/notes/pkb4000/"
         :base-extension "css\\|js\\|png\\|jpg\\|gif\\|pdf\\|mp3\\|ogg\\|swf"
         :publishing-directory "~/Downloads/pkb4000/"
         :recursive t
         :publishing-function org-publish-attachment
         )))

Publish your database and import to Apple Notes

Start the process by M-x org-publish, at which point Emacs will ask you to select a target. You have to org-publish both of the targets.

If you have a large database, Emacs might report errors in your org files. Fix these, and restart the process. org-publish caches its output, so re-runs should not take that long.

After a successful publish, import the whole :publishing-directory into Apple Notes on macOS by selecting File | Import from the menu.

Apple Notes will create a new Imported Notes folder containing your whole notes hierarchy.

This process can be easily repeated when you want to refresh the database on your Apple Notes, but unfortunately Notes will create a new Imported Notes N folder.

Alternatively, you can re-publish, and then import just the changed HTML files one by one, after which you’ll have to move them back into the correct Apple Notes folder.

Conclusion

This solves the problem of rapidly searching and consulting your whole org mode notes database from your iOS mobile device.

However, it does not yet solve the problem of importing Apple Notes you create on the mobile device back into Orgmode. This is something one could consider trying to solve using AppleScript.

Whatever the case may be, this is still a nice improvement over my previous workflow!

Bonus round: Convert orgmode buffer to Apple Note using AppleScript

Before I tried org-publish, I worked on some emacs-lisp and AppleScript to convert the current org mode buffer to HTML, and then inject that HTML into Apple Notes using AppleScript.

I am posting this here in case it might be useful to someone. However it is NOT required for the org-publish workflow described above.

This assumes that you have a folder named orgmode in Apple Notes.

Although far humbler than org-publish, this code will only create a new note if it does not exist yet. If the note already exists, it simply updates its contents from the current org file.

;; https://www.emacswiki.org/emacs/string-utils.el
(defun string-utils-escape-double-quotes (str-val)
  "Return STR-VAL with every double-quote escaped with backslash."
  (save-match-data
    (replace-regexp-in-string "\"" "\\\\\"" str-val)))

(defun string-utils-escape-backslash (str-val)
  "Return STR-VAL with every backslash escaped with an additional backslash."
  (save-match-data
    (replace-regexp-in-string "\\\\" "\\\\\\\\" str-val)))

(setq as-tmpl "set TITLE to \"%s\"
set NBODY to \"%s\"
tell application \"Notes\"
        tell folder \"orgmode\"
                if not (note named TITLE exists) then
                        make new note with properties {name:TITLE}
                end if
                set body of note TITLE to NBODY
        end tell
end tell")

(defun oan-export ()
  (interactive)
  (let ((title (file-name-base (buffer-file-name))))
    (with-current-buffer (org-export-to-buffer 'html "*orgmode-to-apple-notes*")
      (let ((body (string-utils-escape-double-quotes
                   (string-utils-escape-backslash (buffer-string)))))
        ;; install title + body into template above and send to notes
        (do-applescript (format as-tmpl title body))
        ;; get rid of temp orgmode-to-apple-notes buffer
        (kill-buffer))
      )))
