Building packages on SmartOS

This is a how-to for building pkgsrc packages on SmartOS. It builds upon and integrates knowledge and scripts from Filip Hajny and Jonathan Perkin. The first part documents how I integrated their two pkgsrc approaches for maximal gain with minimal configuration change.

The immediate impetus for getting into pkgsrc was figuring out how to build IPython in order to support its parallel computation abilities. IPython has basic support within pkgsrc, but use beyond an advanced REPL (which is how most think of IPython) is limited without additional requirements. The specific dependencies that were required were missing from the Joyent pkgsrc repository, and in some cases conflicted with what was available on the normal public repository. So the second part of this blog post documents some problems that might be encountered in building packages in this configuration.

This is very much a document of the moment, so it may no longer be relevant past the first quarter of 2013. Joyent has indicated that packaging will change, that the distinctions between 32- and 64-bit architectures may disappear, that bulk builds may be better supported, and that the big distinctions between quarterly releases may blur or disappear.

Although verbatim details in this post are very likely to change, there may be useful lessons in this and following work on the topic. Be sure to check the date of this blog post and to verify any results within your own computing environment.

Configuring pkgsrc on SmartOS

Pkgsrc, originally for NetBSD, is the native packaging system on SmartOS, which is available as a stand-alone OS, on the Joyent Public Cloud, and on SmartDataCenter-enabled cloud environments. Numerous packages — likely the majority that you’ll ever need — are pre-built for SmartOS users, but there will eventually be a time when a package or a configuration or a package version will not be to your liking. When this happens, you might want to build a package yourself to improve configuration repeatability or speed of machine provisioning.

Jonathan Perkin’s post on pkgsrc on SmartOS was the main reference and starting point for the configuration described here. Please read that article (and its followup for getting serious about fixing builds) before proceeding.

Basic setup

Once you have a machine provisioned, the standard advice applies. Log in as root, update and upgrade your existing packages, and install the basics for bootstrapping the pkgsrc environment:

pkgin -y up
pkgin -y fug
pkgin -y in gcc47 scmgit-base
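
Before moving on, it can help to confirm that the tools the later steps rely on are actually on the PATH. The following is a minimal sketch, not part of the original setup: missing_tools is a hypothetical helper name, and the command names in the usage comment (gcc, git, bmake) are assumptions about what the installed packages provide.

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not part of pkgsrc or pkgin.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Possible use on a build machine (command names are assumptions):
#   missing_tools gcc git bmake
```

An empty result means every named command was found; anything printed is worth installing or adding to the PATH before attempting a build.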

I tend to keep my pkgsrc builds in locations closer to earlier recommendations than to what Jonathan recommends, since many built-in scripts assume certain locations and it’s easier to work with them:

cd /opt && git clone git://github.com/joyent/pkgsrc.git
cd /opt/pkgsrc && git checkout joyent/release/2012Q4 && git submodule init && git submodule update

I use the pkgsrc mirror exactly as Joyent keeps it. The entire tree is huge and quickly evolving. It is not my intention to keep up with that tree by maintaining my own fork.

However, all I need to customize the packages that interest me is contained within the somewhat deprecated pk framework. pk (from long-time Joyeur Filip ‘mamash’ Hajny) is an intricate system of shell scripts, makefiles, and overlay directories that allow for more complex integration with SmartOS (e.g., the SMF framework) than is feasible to push upstream to the main pkgsrc source repository. Overall, it conflicts a bit with Jonathan’s approach that we’re referencing, but thanks to some help from both Jonathan and Filip via IRC and Twitter, we can take advantage of its most powerful parts.

Finally, I do maintain a fork of the original pk, where I can push updates and tweaks on particular packages. I install this in /opt/pkgsrc-meta, out of the way of the originally intended /opt/pk because the built-in /opt/local/etc/pk.conf is configured to import the whole of the pk framework if it finds it there. We don’t quite want this.

cd /opt && git clone git://github.com/atl/pk.git pkgsrc-meta

I recently made the default branch the most up-to-date pkgsrc_2012Q2 branch, so if you take my fork, there’s less need to check out the newest branch. Otherwise, if you are working from mamash’s original repository, for example, you will want to check out the most recent branch.

Configuration

The most interesting parts of the pk framework are the SMF integration, the INSTALL scripts, and the definition of good default options for SmartOS use. To use them, we make slightly more extensive configuration changes than Jonathan does in the default /opt/local/etc/mk.conf. The fragment below is inserted near the end of the existing mk.conf, just after the TOOLS_PLATFORM.* definitions:

## insert into
# /opt/local/etc/mk.conf

# Put our files under /opt but away from the two repos:
DISTDIR=        /opt/packages/distfiles

# Grab the tail of the branch:
PKGRELEASE!= ${TOOLS_PLATFORM.awk} -F/ '{print $$NF}' /opt/pkgsrc/.git/HEAD

# Put the finished files in /opt/packages
PACKAGES=       /opt/packages/${PKGRELEASE}/${ABI}
WRKOBJDIR=      /var/tmp/pkgsrc-build

# The SMF support files overlay is here:
METAOVERLAY=    /opt/pkgsrc-meta/meta

# The following configures that overlay:
.include "/opt/pkgsrc-meta/config/smf.inc"
# smf.inc sets SMFBASE, so we need to reset it here
SMFBASE=        ${METAOVERLAY}

# These are from Jonathan:
ALLOW_VULNERABLE_PACKAGES=      yes
SKIP_LICENSE_CHECK=             yes
MAKE_JOBS=                      4
FETCH_USING=                    curl
DEPENDS_TARGET=                 bin-install

# Adjust BINPKG_SITES depending upon the dataset chosen. It should be the URL
# from /opt/local/etc/pkgin/repositories.conf without the trailing 'All'.
.if   $(ABI) == 32
    BINPKG_SITES=   http://pkgsrc.joyent.com/sdc6/2012Q2/i386
.elif $(ABI) == 64
    BINPKG_SITES=   http://pkgsrc.joyent.com/sdc6/2012Q2/x86_64
.else
    BINPKG_SITES=   http://pkgsrc.joyent.com/sdc6/2012Q2/${ABI}
.endif

# We want to pull in additional INSTALL script templates from the overlay:
.if exists(${METAOVERLAY}/${PKGPATH}/INSTALL)
    INSTALL_TEMPLATES+=        ${METAOVERLAY}/${PKGPATH}/INSTALL
.endif

With this configuration, on an sdc:sdc:base:1.8.4 SmartMachine, I am able to build more recent 2012Q4-based packages on top of a 2012Q2 base. This general recipe works on a base64 image as well.
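
The PKGRELEASE line in the fragment is the least obvious part, so here is a rough shell equivalent for checking what it would produce. This is only a sketch: release_from_head is a hypothetical name, and it uses whatever awk is on the PATH rather than ${TOOLS_PLATFORM.awk}.

```shell
#!/bin/sh
# Print the last "/"-separated field of a git HEAD file, which is what
# the PKGRELEASE assignment in mk.conf extracts.
# release_from_head is a hypothetical helper name.
release_from_head() {
    awk -F/ '{ print $NF }' "$1"
}

# Possible use against the checkout described earlier:
#   release_from_head /opt/pkgsrc/.git/HEAD
```

When a branch is checked out, HEAD holds a line such as "ref: refs/heads/joyent/release/2012Q4", so the last field is the release name. With a detached HEAD the file holds a bare commit hash with no slashes, and the whole hash is printed instead, which would give an unhelpful PACKAGES path.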

The configuration script could be endlessly tweaked and made more generic — it is fairly dependent upon my choice of paths described above. I welcome improvements on it if you find the base useful.
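
One possible step toward a more generic configuration is to derive the BINPKG_SITES value from the pkgin repository list instead of hard-coding it. The sketch below assumes the file holds one repository URL per line with "#" comments, and that the first URL is the one wanted; binpkg_site is a hypothetical helper name.

```shell
#!/bin/sh
# Print the first repository URL found in a repositories.conf-style file,
# with any trailing slash and the final "/All" component removed.
# binpkg_site is a hypothetical helper name.
binpkg_site() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1" |
        head -n 1 |
        sed -e 's|/*$||' -e 's|/All$||'
}

# Possible use:
#   binpkg_site /opt/local/etc/pkgin/repositories.conf
```

The printed value could then be pasted into mk.conf, or compared against the hard-coded URLs to catch a mismatch between the dataset and the configuration.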

Recap

To review, the following are the key directories and files used in our pkgsrc build machine(s):

/opt/                           # writable directory for local changes
/opt/local/                     # standard local packages and configs
/opt/local/etc/mk.conf          # config hook for pkgsrc
/opt/pkgsrc/                    # pkgsrc package sources
/opt/pkgsrc-meta/               # pk framework
/opt/pkgsrc-meta/meta/          # main overlay tree for SmartOS extensions
/opt/pkgsrc-meta/config/smf.inc # main hook for configuring the above
/opt/packages/                  # local working and output directory
/opt/packages/201nQm/           # binary packages
/opt/packages/distfiles/        # upstream source cache
/var/tmp/pkgsrc-build           # working directory for builds
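
The layout above can be checked with a few lines of shell. This is a sketch under the assumption that the paths are exactly those listed; check_layout is a hypothetical helper name, and the optional prefix argument exists only so the check can be tried against a scratch tree.

```shell
#!/bin/sh
# Report which of the expected files and directories are missing.
# $1 is an optional prefix prepended to every path (default: none).
# check_layout is a hypothetical helper name.
check_layout() {
    root="${1:-}"
    for path in \
        /opt/local/etc/mk.conf \
        /opt/pkgsrc \
        /opt/pkgsrc-meta/meta \
        /opt/pkgsrc-meta/config/smf.inc \
        /opt/packages/distfiles \
        /var/tmp/pkgsrc-build
    do
        [ -e "$root$path" ] || printf 'missing: %s\n' "$path"
    done
}

# Possible use on the build machine:
#   check_layout
```

The distfiles and build directories may legitimately be reported as missing before the first build, since they may not be created until they are first needed.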

Basic builds using pkgsrc

With this configuration in place, you can proceed with building packages. All of the packages I have wanted recently compile without any problems, which speaks to both the maturity of pkgsrc and the work Joyent has done to integrate it as a first-class tool in SmartOS.

The above configuration attempts to download existing binaries from the default Joyent-provided pkgsrc repository in preference to building them from scratch. This is far preferable to the alternative of attempting to build all dependencies, and it makes it explicit which packages are necessary beyond the default packages.

The first build that we can do in order to test our configuration and the build system is the fundamental data transport for IPython’s Notebook and Parallel features: py-zmq, listed amongst IPython’s optional dependencies.

cd /opt/pkgsrc/net/py-zmq && bmake install

First, pkgsrc detects that some fundamental packages for pkgsrc builds, like digest, nbpatch, and libtool-base, are not installed, and installs them from the Joyent pkgsrc binary package repository. It then builds the zeromq package, then the Python bindings for it, and finally installs them all. The verbose output should end with:

===> Building binary package for py27-zmq-2.1.11
=> Creating binary package /opt/packages/2012Q4/64/All/py27-zmq-2.1.11.tgz
===> Install binary package of py27-zmq-2.1.11
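
To see which binary packages a build left behind, the All directory under PACKAGES can be listed. A minimal sketch, assuming the directory layout shown in the output above; list_built is a hypothetical helper name.

```shell
#!/bin/sh
# List the .tgz binary packages under <dir>/All, one file name per line.
# Prints nothing if the directory is empty or does not exist.
# list_built is a hypothetical helper name.
list_built() {
    for f in "$1"/All/*.tgz; do
        [ -e "$f" ] && basename "$f"
    done
    return 0
}

# Possible use, with the release and ABI from the output above:
#   list_built /opt/packages/2012Q4/64
```

This list is also a quick way to see which packages had to be built locally rather than fetched from the Joyent repository.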

Jonathan’s tutorial goes into much more detail on building packages, including the useful OPTIONS framework. However, very few Python-based packages use options, so I won’t be delving into that topic here. You could build hundreds of packages and not encounter any problems.

Dealing with out-of-date dependencies

Building 2012Q4-branch (and more recent work-in-progress) packages on top of a 2012Q2 base is not without complications, however. The reason why packaging systems have coherent releases is so that dependencies can progress in a structured manner. When building IPython’s optional dependencies, some packages failed to build because they required more recent packages than were installed or otherwise available on the Joyent binary package repository.

After installing py-zmq, the next basic dependency (for IPython’s notebook feature) is Tornado.

cd /opt/pkgsrc/wip/py-tornado && bmake install

This works without a hitch as of today. However, as I’ve been developing this tutorial, there were points at which the dependency chain had trouble. Tornado depends on py-curl, which depends on curl, which depends on OpenSSL, which received a recent security update. For a few weeks, the further dependency chain meant that a new version of Perl needed to be built before proceeding. Of course, since several fundamental packages depend on that, it’s unrealistic (and in some cases impossible) to uninstall all of the reverse dependency chain.

In that situation, then, it was necessary to force-delete the Perl package, and then replace it with a new build, and then repeat for OpenSSL:

pkg_delete -f perl-5.14.2nb5
cd /opt/pkgsrc/lang/perl5 && bmake install

pkg_delete -f openssl-0.9.8x
cd /opt/pkgsrc/security/openssl && bmake install

It was hard to pinpoint exactly where the dependency chain failed and what had to be replaced; I got there by trial and error. When rebuilding a package you’ve visited before, be sure to run bmake clean first.
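
Since the delete-and-rebuild steps are the same for each package, they can be wrapped in a small function. The sketch below is a dry run by default: it only prints the commands, and it executes them only if RUN is set to an empty string. replace_pkg and the RUN switch are hypothetical names introduced for this illustration.

```shell
#!/bin/sh
# Print (or, with RUN="", execute) the steps for swapping an installed
# package for a fresh build from the pkgsrc tree.
#   $1  installed package name, e.g. perl-5.14.2nb5
#   $2  package directory under /opt/pkgsrc, e.g. lang/perl5
# The body runs in a subshell so the cd does not affect the caller.
# replace_pkg and RUN are hypothetical names.
replace_pkg() (
    run="${RUN-echo}"
    $run pkg_delete -f "$1" &&
    $run cd "/opt/pkgsrc/$2" &&
    $run bmake clean &&
    $run bmake install
)

# Dry run for the two packages mentioned above:
#   replace_pkg perl-5.14.2nb5 lang/perl5
#   replace_pkg openssl-0.9.8x security/openssl
```

Reviewing the printed commands before running them for real is worthwhile here, because a force-delete of a fundamental package leaves the machine in a fragile state until the rebuild succeeds.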

Finish the dependencies with:

cd /opt/pkgsrc/textproc/py-pygments && bmake install
cd /opt/pkgsrc/devel/py-ipython013 && bmake install

The IPython build takes advantage of an SMF manifest, an INSTALL script, and other configuration items in the pk fork pulled in earlier.

The server configuration and processes are arranged for as quick a start as possible. IPython is endlessly and minutely configurable, but the two services defined are set up with very reasonable defaults. This will be discussed at length in the next post in the series.

Using newly built packages

I serve the files over HTTP from a long-running server using nginx. After building the set of packages I need, I find it worthwhile to sync all the packages to the web server and then shut down the machine. It’s time-consuming to pull down the pkgsrc git repository every time, but that can be automated, and hopefully the cost savings can make up for that.

Before shutting down the machine, I rsync the packages to my server:

cd /opt/packages && rsync -av 2012Q4 pkgsrc.atl.me:/opt/sites/pkgsrc.atl.me/

That machine serves the package files, and an ordinary pkg_add command can retrieve the files for use. Package dependencies that are satisfied in the main Joyent pkgsrc binary repository are picked up automatically, but soft dependencies amongst the newly built packages may need to be satisfied by hand.

Aside: I have read many references to pkgin being able to use multiple package repositories, which would be the obvious approach to supplementing the main binary package repository with your own local builds. I have never been able to get this working on SmartOS, and have heard confirmation that such behavior is “undefined.” If it works for you, then we’d all like to hear about it!

This is an example session of the bare minimum install needed for getting IPython installed and ready for basic parallel computation:

# export ABI=`bmake -D BSD_PKG_MK -f /opt/local/etc/mk.conf -V ABI`

# pkg_add -v http://pkgsrc.atl.me/2012Q4/$ABI/All/py27-zmq-2.1.11.tgz | grep 'registered in'
Package binutils-2.22 registered in /opt/local/pkg/binutils-2.22
Package gcc47-4.7.0nb2 registered in /opt/local/pkg/gcc47-4.7.0nb2
Package zeromq-2.1.11 registered in /opt/local/pkg/zeromq-2.1.11
Package py27-zmq-2.1.11 registered in /opt/local/pkg/py27-zmq-2.1.11

# pkg_add -v http://pkgsrc.atl.me/2012Q4/$ABI/All/py27-ipython-0.13.1.tgz | grep 'registered in'
Package py27-readline-0nb5 registered in /opt/local/pkg/py27-readline-0nb5
Package py27-ipython-0.13.1 registered in /opt/local/pkg/py27-ipython-0.13.1
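
The package URLs in the session above all follow the same pattern, so they can be composed rather than typed out. A minimal sketch, assuming the repository mirrors the PACKAGES layout (release, then ABI, then All); pkg_url is a hypothetical helper name.

```shell
#!/bin/sh
# Compose the URL of a binary package on a repository laid out as
#   <base>/<release>/<abi>/All/<package>.tgz
# A single trailing slash on the base URL is tolerated.
# pkg_url is a hypothetical helper name.
pkg_url() {
    printf '%s/%s/%s/All/%s.tgz\n' "${1%/}" "$2" "$3" "$4"
}

# Possible use, with ABI exported as in the session above:
#   pkg_add -v "$(pkg_url http://pkgsrc.atl.me 2012Q4 "$ABI" py27-zmq-2.1.11)"
```

Keeping the release and ABI as separate arguments makes it harder to mix 32- and 64-bit packages by accident.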

If you ran into a build situation where you had to rebuild and update an already-existing basic dependency, then you might have to install it with the -u (update) flag:

pkg_add -u http://pkgsrc.atl.me/2012Q4/32/All/openssl-1.0.1dnb2.tgz

I hope this provides a decent start for those who wish to explore pkgsrc on Joyent further. In the next installment, I will explore how to provision, configure, and get started using a cluster of machines connected via IPython.