This technote documents how sciplat-lab (the JupyterLab RSP container) is built.
1 Repository
The Lab container resides at the LSST SQuaRE GitHub sciplat-lab repository.
1.1 Layout
There are several different categories of files in the repository directory.
- Makefile and Dockerfile.template directly control the build process; GNU Make is used to generate a Dockerfile from the template and arguments, and then docker build generates the sciplat-lab container.
- bld provides compatibility with our old build system and is a wrapper for make; it will be removed soon.
- The stage shell files are executed during the docker build, and each controls a fairly large section of the container build.
- texlive.profile is used to control the build of TeX in the container.
- The other executable files, except for lsstlaunch.bash, are used during JupyterLab startup. The most important, and most likely to need modification, is runlab.sh, which sets up the JupyterLab environment prior to launching the Lab.
- Everything else is copied into the container during build and controls various runtime behaviors of the Lab.
1.2 Branch Conventions
Standard Lab containers (that is, dailies, weeklies, release candidates, and releases) are built from the prod branch. Experimental containers may be built from any branch. The build process enforces this condition, and will force the tag to an experimental one when building from a non-prod branch.
Note that from the GitHub perspective, prod rather than main is the default branch.
1.3 Updating the Default Branch
- Do your work in a ticket branch, as with any other repository.
- PR that ticket branch into main. Note that GitHub will offer prod as the default branch to PR into, so you will have to change the selection to main.
- Rebase (if possible) or cherry-pick the changes from main into prod_update. At the time of writing, there’s no difference between main and prod_update, but as we migrate between major versions of JupyterLab, it is possible for the two branches to diverge significantly (as they did in the JL2-JL3 transition).
- Merge prod_update into prod.
It is worth noting that the only place we use a PR in this process is getting changes into main. Typically you would build an experimental container from your branch, test that, and once satisfied, proceed with the PR.
Once your changes are on main, in the usual case where main and prod_update do not differ, the following incantation will suffice:
git checkout main && \
git pull && \
git checkout prod_update && \
git rebase main && \
git push && \
git checkout prod && \
git merge prod_update && \
git push
2 Build Process
GNU Make is used to drive the build process. The Makefile accepts four arguments and has four targets.
The arguments are as follows:
- tag – mandatory: this is the tag on the input DM Stack container, e.g. w_2021_50. If it starts with a v, that v becomes an r in the output version.
- image – optional: this is the name of the image you’re building and pushing. It defaults to docker.io/lsstsqre/sciplat-lab.
- input – optional: this is the name, and any tag prefix, of the input image you’re starting with. It defaults to docker.io/lsstsqre/centos:7-stack-lsst_distrib-. Note that if there is no tag prefix, the image name should end with a colon. Also note that if you do specify the input image, you’re on your own: SQuaRE expects its containers to be built on top of the DM Stack image. If you’re just adding things to the Stack image for your input container, you are likely to be fine, but it’s entirely possible to introduce version incompatibilities while so doing. It is certainly not going to work if you start with something that isn’t based on the Stack image.
- supplementary – optional: if specified, this turns the build into an experimental build where the tag starts with exp_ and ends with _<supplementary>.
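The tag rules above can be sketched as a small shell function. This is an illustration only, not the logic in the actual Makefile: the function name output_version is hypothetical, the tag v23_0_0 is a made-up example, and the sketch assumes that only a leading v is rewritten and that exp_ and _<supplementary> simply wrap the resulting version.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the sciplat-lab Makefile: derive the
# output version from the input tag and an optional supplementary string.
output_version() {
    tag="$1"
    supplementary="${2:-}"
    version="$tag"
    # A leading "v" on the input tag becomes an "r" in the output version.
    case "$tag" in
        v*) version="r${tag#v}" ;;
    esac
    # A supplementary string turns the build into an experimental one
    # (assumed form: exp_<version>_<supplementary>).
    if [ -n "$supplementary" ]; then
        version="exp_${version}_${supplementary}"
    fi
    printf '%s\n' "$version"
}

output_version w_2021_50             # prints w_2021_50
output_version w_2021_50 newnumpy    # prints exp_w_2021_50_newnumpy
output_version v23_0_0               # prints r23_0_0
```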
The targets are:
- clean – remove the generated Dockerfile. Not terribly useful on its own, but a good first step before running the next target (because the template rarely changes, make cannot tell on its own that the Dockerfile needs rebuilding when the arguments change).
- dockerfile – just generate the Dockerfile from the template and the arguments. Do not build or push.
- image – build the Lab container, but do not push it.
- push – build and push the container.
push is the default, and all is a synonym for it. build is a synonym for image. Note that we assume that the building user already has appropriate push credentials for the repository to which the image is pushed, and that any necessary docker login has already been performed.
If the image is built from a branch that is not prod, and the supplementary tag is not specified, the supplementary tag will be set to a value derived from the branch name. This prevents building standard containers from branches other than prod.
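A minimal sketch of that branch check follows. It is not the code in the Makefile: the function name effective_supplementary and the ticket branch name are hypothetical, and the derivation rule (replacing characters that are not valid in a Docker tag with underscores) is an assumption, since the exact rule is not described here.

```shell
#!/bin/sh
# Hypothetical sketch of the branch check; the real Makefile may derive the
# supplementary value differently.
effective_supplementary() {
    branch="$1"
    supplementary="${2:-}"
    if [ "$branch" != "prod" ] && [ -z "$supplementary" ]; then
        # Assumed rule: keep characters that are valid in a Docker tag and
        # replace everything else with an underscore.
        supplementary=$(printf '%s' "$branch" | tr -c 'A-Za-z0-9_.-' '_')
    fi
    printf '%s\n' "$supplementary"
}

effective_supplementary prod                        # prints an empty line
effective_supplementary tickets/DM-12345            # prints tickets_DM-12345
effective_supplementary tickets/DM-12345 newnumpy   # prints newnumpy
```

In a real checkout, the current branch name could be obtained with git rev-parse --abbrev-ref HEAD and passed as the first argument.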
2.1 Dockerfile template substitution
The placeholders {{TAG}}, {{IMAGE}}, {{INPUT}}, and {{VERSION}} in Dockerfile.template are replaced with the build values. Although we use double curly brackets, the substitution is nothing as sophisticated as Jinja 2: instead, we just run sed in the dockerfile target of the Makefile.
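The substitution can be sketched roughly as follows. This is an illustration of the sed approach, not the actual Makefile recipe: the function name render_dockerfile and the two-line stand-in template are hypothetical, and the real invocation may differ in its details.

```shell
#!/bin/sh
# Illustrative only: the real sed invocation lives in the Makefile.
# Reads a template on stdin and writes the rendered text to stdout.
render_dockerfile() {
    # Arguments: tag, image, input, version.
    # "|" is used as the sed delimiter because image names contain "/".
    sed -e "s|{{TAG}}|$1|g" \
        -e "s|{{IMAGE}}|$2|g" \
        -e "s|{{INPUT}}|$3|g" \
        -e "s|{{VERSION}}|$4|g"
}

# A two-line stand-in for Dockerfile.template (not the real template).
printf 'FROM {{INPUT}}{{TAG}}\nLABEL version="{{VERSION}}"\n' |
    render_dockerfile w_2021_50 \
        docker.io/lsstsqre/sciplat-lab \
        docker.io/lsstsqre/centos:7-stack-lsst_distrib- \
        w_2021_50
```

With these example values, the first rendered line is FROM docker.io/lsstsqre/centos:7-stack-lsst_distrib-w_2021_50. A plain sed substitution like this would misbehave if a value contained the delimiter or an ampersand, which is one reason to keep the arguments simple.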
2.2 Examples
Build and push the weekly 2021_50 container:
make tag=w_2021_50
Build and push an experimental container with a newnumpy supplementary tag:
make tag=w_2021_50 supplementary=newnumpy
Just create the Dockerfile for w_2021_49:
make dockerfile tag=w_2021_49
Build the newnumpy container, but don’t push it:
make image tag=w_2021_50 supplementary=newnumpy
Build and push w_2021_50 to ghcr.io:
make tag=w_2021_50 image=ghcr.io/lsst-sqre/sciplat-lab
Build and push a Telescope and Site image based on their sal-sciplat image (note differing tag format):
make tag=w_2021_49_c0023.008 input=ts-dockerhub.lsst.org/sal-sciplat: image=ts-dockerhub.lsst.org/sal-sciplat-lab
3 Modifying Lab Container Contents
This is probably why you’re reading this document.
You will need to understand the structure of Dockerfile.template a little. It is very likely that the piece you need to modify is in one of the stage*.sh scripts, although it is plausible that what you want is actually one of the container setup-at-runtime pieces.
3.1 stage*.sh scripts
Most of the action in the Dockerfile comes from five shell scripts executed by docker build as RUN actions.
These are, in order:
- stage1-rpm.sh – we will always be building on top of centos in the current regime. This stage first reinstalls all the system packages but with man pages this time (the Stack container isn’t really designed for interactive use, but ours is), and then adds some RPM packages we require, or at least find helpful, for our user environment.
- stage2-os.sh – this installs OS-level packages that are not packaged via RPM. Currently the biggest and hairiest of these is TeXLive. The conda TeX packaging story is not good, and if we don’t install TeXLive, a bunch of the export-as options in JupyterLab will not work.
- stage3-py.sh – this is probably where you’re going to be spending your time. Mamba is faster and reports errors better than conda, so we install and then use it. Anything that is packaged as a Conda package should be installed from conda-forge. However, that’s not everything we need. Thus, the first thing we do is add all the Conda packages we need. Then we do a pip install of the rest, and a little bit of bookkeeping to create a kernel for the Stack Python. It is likely that what you need to do will be done by inserting (or pinning versions of) Python packages in the mamba or pip sections.
- stage4-jup.sh – this is for installation of Jupyter packages: mostly Lab extensions, but there are also server and notebook extensions we rely upon. Use pre-built Lab extensions if at all possible, which will mean they are packaged as conda-forge or pip-installable packages and handled in the previous Python stage.
- stage5-ro.sh – this is Rubin Observatory-specific setup. This, notably, creates quite a big layer because, among other things, it checks out the tutorial notebooks as they existed at build time, and people keep checking large figure outputs into these notebooks.
3.2 Other files
The rest of the files in this directory are either things copied to various well-known locations (for example, all the local*.sh files end up in /etc/profile.d) or they control various aspects of the Lab startup process. For the most part they are moved into the container by COPY statements in the Dockerfile. They do not often need modification.
runlab.sh is the other file you are likely to need to modify. It is executed as the target user, and the last thing it does is start jupyterlab (well, almost: it also knows if it’s a dask worker or a noninteractive container, and does something different in those cases).
3.3 Indentation conventions
There’s a lot of shell scripting in here. If you’re working on the scripts, please use four-space indentation and convert tabs to spaces.
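One way to convert tabs is the standard expand utility, sketched below. The helper name untabify is hypothetical and is not part of the repository; it simply wraps expand with four-column tab stops.

```shell
#!/bin/sh
# Hypothetical helper: print its input (files or stdin) with each tab
# expanded to the next four-column tab stop.
untabify() {
    expand -t 4 "$@"
}

# Example: a tab-indented line becomes a four-space-indented one.
printf 'if true; then\n\techo ok\nfi\n' | untabify
```

To rewrite a script in place, the output could be written to a temporary file and moved over the original after review.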