Do not meddle in the affairs of wizards, for you are crunchy and good with ketchup.

XS Mechanics


This article is about XS. It explains what it is, why it is, how it works, and how to use it. It includes a complete, working example of an XS module, and a stub module that you can use as a starting point for your own code. It is an express goal of this article to provide the background and information necessary for you to write your own XS modules.

This article is in five parts
November Introduction motivation, definitions, examples
December Architecture the Perl interpreter, calling conventions, data representation
January Tools h2xs, xsubpp, DynaLoader
February Modules Math::Ackermann, Set::Bit
March Align::NW Needleman-Wunsch global optimal sequence alignment

What it is

XS is a (phonetic?) acronym for eXternal Subroutine, where external means external to Perl, i.e. written in some other language, such as C or C++. With XS, we can call C subroutines directly from Perl code, as if they were Perl subroutines.

Narrowly, XS is the name of the glue language that is used to specify the subroutine interfaces and data conversions necessary to call C from Perl. More broadly, XS comprises a system of programs and facilities that work together to make this happen: h2xs, MakeMaker, xsubpp, DynaLoader, and the XS language itself. We'll talk about all these later on.

Why it is

Perl is a Swiss-army chainsaw, but there are still some things that shouldn't be done in Perl. Examples include

More generally, Perl is an applications programming language. It provides powerful facilities, such as automatic data typing, automatic memory management, hash tables, and regular expressions. These make it easy to bolt together applications, without having to attend to every little detail. The tradeoff is that these facilities have substantial run-time costs.

In contrast, C and C++ are examples of systems programming languages. They provide control over every CPU cycle and every byte, so that inner loops can be fast and critical data structures can be small. The tradeoff is that you have to program every CPU cycle and every byte in the entire program: even parts that aren't processor bound.

XS lets us have the best of both worlds. With XS, we can use Perl for the bulk of our code, and C for just those parts that require fine control over system resources.

A fork in the road

Now you have to decide whether you want to write XS, or whether you just want to get the job done.

If you just want to get the job done, consider using the Simplified Wrapper and Interface Generator (SWIG). SWIG is a software development tool that connects assorted application programming languages, such as Perl, Python, and Tcl, to assorted system programming languages, such as C, C++, and Objective-C.

SWIG is very easy to use. At its simplest, you just hand it your .c file, tell it what your application language is, and it does the rest. Here's an example, adapted from the SWIG documentation:

unix> swig -perl5 -module example example.c
unix> gcc -c example.c example_wrap.c
unix> ld -G example.o example_wrap.o -o
unix> perl5.005
use example;
print example::factorial(4), "\n";

I could write a tutorial on SWIG, but it would be superfluous: SWIG already has extensive documentation. SWIG is available online, it is free, and it works. If you just want to get the job done, this SWIG is for you.

Learning XS

If you want to write XS, you have to learn it. Learning XS is very difficult, for two reasons.

The first is that the core Perl docs, such as perlxs and perlguts, tacitly assume that you already understand XS. Accordingly, they omit or gloss over crucial assumptions and background information. This sounds bad, but it is actually rather common in the Unix world.

The second is that you can't learn XS. Not as such. Not from the top down. This problem is much more profound than the first, and it stems not from any inadequacy in the documentation, but from what XS is—and isn't.

The Perl docs refer to XS as a language, but it isn't. XS is a collection of macros. The XS langauge processor is a program called xsubpp, where pp is short for PreProcessor, and PreProcessor is a polite term for macro expander. xsubpp expands XS macros into the bits of C code necessary to connect the Perl interpreter to your C-language subroutines.

Because XS isn't a language, it lacks structure. The underlying C code has structure, but you can't see it, because it is hidden behind the macros. This makes it virtually impossible to learn XS on its own terms.

Back to Basics

In order to learn XS, you have to work from the bottom up. You have to learn the Perl C API. You have to understand Perl's internal data structures. You have to understand how the Perl stack works, and how a C subroutine gets access to it. You have to understand how C subroutines get linked into the Perl executable. You have to understand the data paths through the DynaLoader module that bind the name of a Perl subroutine to the entry point of a C subroutine.

Once you understand all this, you don't strictly need XS: you can code directly to the Perl C API, and your C code will link and run under the Perl interpreter.

If you do code directly to the Perl C API, you will find that it is difficult, error-prone, tedious, and repetitive. You keep writing the same little bits of code to move parameters on and off the Perl stack; to convert data from Perl's internal representation to C variables; to check for null pointers and other Bad Things. When you make a mistake, you don't get bad output: you crash the interpreter.


Eventually, you begin to see the advantage of wrapping these little bits of code in macros, so that you can write them once and then stop worrying about them. And what do you know, someone has already written some macros for you; there is even this macro expander called xsubpp.

Now you understand XS.

A hard enough problem

The first thing you need in order to write an XS module is a program that you absolutely cannot write in straight Perl. Writing C and XS when you could be writing Perl would be an egregious failure to be Lazy.

My favorite number cruncher used to be the Fast Fourier Transform, but as I think of it now, it seems—well—dated. It's so classical, so linear, so old-tech. Besides, it runs in O(n*log(n)), which is almost tractable in Perl.

Instead, I'm going to code up the Needleman-Wunsch (NW) dynamic programming algorithm for global optimal sequence alignment. Sequence alignment is an important problem in the bleeding-edge field of genomics. Here is

Sequence alignment is a combinatorial problem, and naive algorithms run in exponential time. The Needleman-Wunsch algorithm runs in (more or less) O(n^3), which is still bad enough that the genomics community uses specialized hardware and networked databases to do their alignments.

As a benchmark, I aligned 2 sequences of 200 characters each. This is a rather modest problem by the standards of genomics. The Perl implementation aligns them in something like 200 or 400 seconds. The exact time is irrelevant: it takes longer than I am willing to wait.

The O(n^3) step in the NW algorithm is filling in the score matrix; everything else runs in linear time. I wrote a C program that fills in the score matrix.

It runs the benchmark 200x200 alignment in 3 seconds, or about 100 times faster than the Perl implementation.

I don't want to rewrite the rest of the Perl implementation in C. Parts of the algorithm are intricate, and it relies heavily on Perl for housekeeping and memory management. It is the sort of code that is a joy in Perl and a burden in C.

Instead, I want to use the C implementation to fill in the score matrix, use the Perl implementation for everything else, and use XS to call from one to the other. In the next four parts of this article, we'll see how to do that.

Next Month: Architecture

I claimed earlier that XS must be learned from the bottom up. The bottom turns out to be the von Neumann architecture for stored-program computers, and it's a long climb up from there. Instead of taking that path, we'll start at the top, and work our way down by analysis. This will give us the concepts that we need to understand XS.

There is a lot of material to cover, but either you understand the architecture beneath XS, or you are crunchy and good with ketchup.


rather common
I can still recall my bewilderment when I first stumbled upon the awk(1) man page.
pp is short for PreProcessor
Actually, pp is short for Perl Pseudocode, but it sounded good...
Another advantage of coding to XS is that it shields your code from changes to the Perl C API.
1965, as it happens.
J. W. Cooley and J. W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series", Mathematics of Computation, Vol. 19, 1965, pp. 297-301.
Needleman, S.B. and Wunsch, C.D. 1970. "A general method applicable to the search for similarities in the amino acid sequences of two proteins" Journal of Molecular Biology. 48: 443-453.
See also
Smith, T.F. and Waterman, M.S. 1981. "Identification of common molecular subsequences" Journal of Molecular Biology. 147: 195-197
O(n^2), if the gap-open penalty is zero
linear time
The Smith-Waterman algorithm needs O(n^2) time to find the highest scoring cell in the matrix.


German courtesy of
Russian courtesy of Steve O'Neil

Creative Commons License
XS Mechanics by Steven W. McDougall is licensed under a Creative Commons Attribution 3.0 Unported License.
Steven W. McDougall / resume / / 2000 February 11