The emacsclient(1) program is used to connect to Emacs running as a daemon. emacsclient(1) can go in your EDITOR/VISUAL environment variables so that you can edit things like Git commit messages and sudoers files in your existing Emacs session, rather than starting up a new instance of Emacs. It’s not only that this is usually faster, but also that it means you have all your session state available – for example, you can yank text from other files you were editing into the file you’re now editing.

Another, somewhat different use of emacsclient(1) is to open new Emacs frames for arbitrary work, not just editing a single, given file. This can be in a terminal or under a graphical display manager. I use emacsclient(1) for this purpose about as often as I invoke it via EDITOR/VISUAL. I use emacsclient -nc to open new graphical frames and emacsclient -t to open new text-mode frames, the latter when SSHing into my work machine from home, or similar. In each case, all my buffers, command history etc. are available. It’s a real productivity boost.

Some people use systemd socket activation to start up the Emacs daemon. That way, they only ever need to invoke emacsclient, without any special options, and the daemon will be started if not already running. In my case, instead, emacsclient on PATH is a wrapper script that checks whether a daemon is running and starts one if necessary. The main reason I have this script is that I regularly use both the installed version of Emacs and in-tree builds of Emacs out of emacs.git, and the script knows how to choose what to launch and what to try to connect to. In particular, it ensures that the in-tree emacsclient(1) is not used to try to connect to the installed Emacs, which might fail due to protocol changes. And it won’t use the in-tree Emacs executable if I’m currently recompiling Emacs.

I’ve recently enhanced my wrapper script to make it possible to have the primary Emacs daemon always running under gdb. That way, if there’s a seemingly-random crash, I might be able to learn something about what happened. The tricky thing is that I want gdb to be running inside an instance of Emacs too, because Emacs has a nice interface to gdb. Further, gdb’s Emacs instance – hereafter “gdbmacs” – needs to be the installed, optimised build of Emacs, not the in-tree build, such that it’s less likely to suffer the same crash. And the whole thing must be transparent: I shouldn’t have to do anything special to launch the primary session under gdb. That is, if right after booting up my machine I execute

% emacsclient foo.txt

then gdbmacs should start, it should then start the primary session under gdb, and finally the real emacsclient(1) should connect to the primary session and request editing foo.txt. I’ve got that all working now, and there are some nice additional features. If the primary session hits a breakpoint, for example, then emacsclient requests will be redirected to gdbmacs, so that I can still edit files etc. without losing the information in the gdb session. I’ve given gdbmacs a different background colour, so that if I request a new graphical frame and it pops up with that colour, I know that the main session is wedged and I might like to investigate.

First attempt: remote attaching

My first attempt, which was running for several weeks, had a different architecture. Instead of having gdbmacs start up the primary session, the primary session would start up gdbmacs, send over its own PID, and ask gdbmacs to use gdb’s functionality for attaching to existing processes. In after-init-hook I had code to check whether we are an Emacs that just started up out of my clone of emacs.git, and if so, invoke

% emacsclient --socket-name=gdbmacs --spw/installed \
              --eval '(spw/gdbmacs-attach <the pid>)'

The --spw/installed option asks the wrapper script to start up gdbmacs using the Emacs binary on PATH, not the one in emacs.git/. (We can’t use the server-eval-at function because we need the wrapper script to start up gdbmacs if it’s not already running.)
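The wrapper script itself isn’t reproduced here, but as a rough sketch, a private option such as --spw/installed can be filtered out of the argument list before the remaining arguments are handed to the real emacsclient(1). The function name parse_wrapper_args and the labels it prints are hypothetical, chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical helper, not taken from the real wrapper script.
# Prints "installed" or "in-tree" on the first line, followed by the
# remaining arguments, one per line, with --spw/installed removed.
parse_wrapper_args() {
    flavour=in-tree
    # The list "for" iterates over is fixed when the loop starts, so
    # the positional parameters can be rebuilt safely inside it.
    for arg do
        shift
        case $arg in
            --spw/installed) flavour=installed ;;
            *) set -- "$@" "$arg" ;;
        esac
    done
    printf '%s\n' "$flavour" "$@"
}
```

Rebuilding the positional parameters in a single pass keeps the order of the other arguments and leaves arguments containing spaces, such as --eval forms, intact.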

Over in gdbmacs, the spw/gdbmacs-attach function then did something like this:

(let ((default-directory (expand-file-name "~/src/emacs/")))
  (gdb (format "gdb -i=mi --pid=%d src/emacs" pid))
  (gdb-wait-for-pending (lambda () (gud-basic-call "continue"))))

Having gdbmacs attach to the existing process is more robust than having gdbmacs start up Emacs under gdb. If anything goes wrong with attaching, or with gdbmacs more generally, you’ve still got the primary session running normally; it just won’t be under a debugger. More significantly, the wrapper script doesn’t need to know anything about the relationship between the two daemons. It just needs to be able to start up both in-tree and installed daemons, using the --spw/installed option to determine which. The complexity is all in Lisp, not shell script (the wrapper is a shell script because it needs to start up fast).

The disadvantage of this scheme is that the primary session’s stdout and stderr are not directly accessible to gdbmacs. There is a function redirect-debugging-output to deal with this situation, and I experimented with having the primary session call this and send the new output filename to gdbmacs, but it’s much less smooth than having gdbmacs start up the primary session itself.

I think most people would probably prefer this scheme. It’s definitely cleaner to have the two daemons start up independently, and then have one attach to the other. But I decided that I was willing to complexify my wrapper script in order to have the primary session’s stdout and stderr attached to gdbmacs in the normal way.

Second attempt: daemons starting daemons

In this version, the relevant logic is shifted out of Lisp into the wrapper script. When we execute emacsclient foo.txt, the script first determines whether the primary session is already running, using something like this:

[ -e /run/user/1000/emacs/server \
    -a -n "$(ss -Hplx src /run/user/1000/emacs/server)" ]

The ss(8) tool is used to determine if anything is listening on the socket. The script also uses flock(1) to have other instances of the wrapper script wait, in case they are going to cause the daemon to exit, or something. If the daemon is running, then we can just exec emacs.git/lib-src/emacsclient to handle the request. If not, we first have to start up gdbmacs:

installed_emacsclient=$(PATH=$(echo "$PATH" \
                   | sed -e "s#/directory/containing/wrapper/script##") \
                command -v emacsclient)
"$installed_emacsclient" -a '' -sgdbmacs --eval '(spw/gdbmacs-attach)'

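Deleting the wrapper’s directory from PATH textually, as the sed expression above does, can leave an empty entry behind, and POSIX shells treat an empty PATH entry as the current directory. A stricter variant, sketched here under the assumption that the wrapper’s directory appears in PATH as a whole entry, compares entries one at a time. The function name path_without and the variable wrapper_dir are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the colon-separated search path $1 with
# every entry equal to $2 removed.  Empty entries are dropped as well.
# The body runs in a subshell so the IFS and globbing changes do not
# leak into the caller.
path_without() (
    IFS=:
    set -f
    result=
    for entry in $1; do
        if [ -z "$entry" ] || [ "$entry" = "$2" ]; then
            continue
        fi
        result=${result:+$result:}$entry
    done
    printf '%s\n' "$result"
)

# Possible use, with wrapper_dir set to the wrapper's own directory:
#   installed_emacsclient=$(PATH=$(path_without "$PATH" "$wrapper_dir") \
#                               command -v emacsclient)
```

Matching whole entries also avoids mangling an unrelated PATH entry that merely contains the wrapper’s directory as a substring.
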
spw/gdbmacs-attach now does something like this:

(let ((default-directory (expand-file-name "~/src/emacs/")))
  (gdb "gdb -i=mi --args src/emacs --fg-daemon")
  (gdb-wait-for-pending
   (lambda ()
     (gud-basic-call "set cwd ~")
     (gdb-wait-for-pending
      (lambda ()
        (gud-basic-call "run"))))))

"$installed_emacsclient" exits as soon as spw/gdbmacs-attach returns, which is before the primary session has started listening on the socket, so the wrapper script uses inotifywait(1) to wait until /run/user/1000/emacs/server appears. Then it is finally able to exec ~/src/emacs/lib-src/emacsclient to handle the request.

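Whatever mechanism does the waiting, it is worth bounding it, so that a primary session that never comes up doesn’t leave the wrapper hanging forever. The following sketch deliberately swaps inotifywait(1) for plain polling, which needs no extra tools and has no window between checking for the socket and starting to watch for it. The function name wait_for_path is hypothetical, and the timeout is in whole seconds:

```shell
#!/bin/sh
# Hypothetical helper: wait until the path $1 exists, checking once a
# second, for at most $2 seconds.  Returns 0 if the path appeared and
# 1 if the timeout ran out first.
wait_for_path() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Possible use, with socket set to the primary session's socket path:
#   wait_for_path "$socket" 30 \
#       && exec ~/src/emacs/lib-src/emacsclient "$@"
```

Polling reacts up to a second later than an inotify watch would, which is small next to the time it takes to start gdbmacs and an in-tree debug build.
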
A particular kind of complexity

The wrapper script must be highly reliable. I use my primary Emacs session for everything, on the same laptop on which I do my academic work. The main way I get at it is via a window manager shortcut that executes emacsclient -nc to request a new frame, such that if there is a problem, I won’t see any error output until I open an xterm and tail ~/.swayerr or ~/.xsession-errors. And as starting gdbmacs and only then starting up less optimised, debug in-tree builds of Emacs is not fast, I would have to wait at least ten seconds without any Emacs frame popping up before I could suppose that something was wrong.

This is where the first scheme, where the complexity is all in Lisp, really seems attractive. My emacsclient(1) wrapper script has several other facilities and convenience features, some of which are general and some of which are only for my personal usage patterns, and the code for all those is now interleaved with the special cases for gdbmacs and the primary session that I’ve described in this post. There’s a lot that could go wrong, and it’s all in shell, and its output isn’t readily visible to the user. I’ve done a lot of testing, and I’m pretty confident in the script in its current form, but if I need to change or add features, I’ll have to do a lot of testing again before I can deploy to my usual laptop.

Single-threaded, readily interactively-debuggable Emacs Lisp really shines for this sort of “do exactly what I mean, as often as possible” code, and you find a lot of it in Emacs itself, third-party packages, and people’s init.el files. You can add all sorts of special cases to your interactive commands to make Emacs do just what is most useful, and have confidence that you can manage the resulting complexity. In this case, though, I’ve got piles of just this sort of complexity out in an opaque shell script. The ultimate goal, though, is debugging Emacs, such that one can run yet more DJWIM Emacs Lisp, which perhaps justifies it.

Posted Thu 03 Nov 2022 18:50:28 UTC