author Boris Kolpackov <boris@codesynthesis.com> 2024-03-06 11:10:27 +0200
committer Boris Kolpackov <boris@codesynthesis.com> 2024-03-06 11:10:27 +0200
commit e98b7d27bc969762ec4952f82634bb6e6375b8c2
tree 4d0559ef2ee2ba98a7ad0001e2d8199e7eb3969d /doc
parent 87953be10ff92b98b63e9d478dbd8da91bd3ea5e
Document auxiliary machine semantics in manual
Diffstat (limited to 'doc')
-rw-r--r-- doc/manual.cli 295
1 files changed, 237 insertions, 58 deletions
diff --git a/doc/manual.cli b/doc/manual.cli
index 41f0eeb..885fc48 100644
--- a/doc/manual.cli
+++ b/doc/manual.cli
@@ -41,12 +41,24 @@ that are executed on the build host. Inside virtual machines/containers,
agent. Virtual machines and containers running a \c{bbot} instance in the
worker mode are collectively called \i{build machines}.
+In addition to a build machine, a build task may also require one or more
+\i{auxiliary machines} which provide additional components that are required
+for building or testing a package and that are impossible or impractical to
+provide as part of the build machine itself.
+
Let's now examine the workflow in the other direction, that is, from a worker
-to a controller. Once a build machine is booted (by the agent), the worker
-inside connects to the TFTP server running on the build host and downloads the
-\i{build task manifest}. It then proceeds to perform the build task and
-uploads the \i{build artifacts archive}, if any, followed by the \i{build
-result manifest} (which includes build logs) to the TFTP server.
+to a controller. Once a build machine (plus auxiliary machines, if any) is
+booted (by the agent), the worker inside the build machine connects to the
+TFTP server running on the build host and downloads the \i{build task
+manifest}. It then proceeds to perform the build task and uploads the \i{build
+artifacts archive}, if any, followed by the \i{build result manifest} (which
+includes build logs) to the TFTP server.
+
+Unlike build machines, auxiliary machines are not expected to run \c{bbot}.
+Instead, on boot, they are expected to upload to the TFTP server a list of
+environment variables to propagate to the build machine (see the
+\c{auxiliary-environment} task manifest value as well as \l{#arch-worker
+Worker Logic} for details).
Once an agent receives a build task for a specific build machine, it goes
through the following steps. First, it creates a directory on its TFTP server
@@ -94,12 +106,14 @@ implementation of the build artifacts upload handling.
\h#arch-machine-config|Configurations|
-The \c{bbot} architecture distinguishes between a \i{machine configuration},
-\i{build target configuration}, and a \i{build package configuration}. The
-machine configuration captures the operating system, installed compiler
-toolchain, and so on. The same build machine may be used to \"generate\"
-multiple \i{build target configurations}. For example, the same machine can
-normally be used to produce 32/64-bit and debug/optimized builds.
+The \c{bbot} architecture distinguishes between a \i{build machine
+configuration}, \i{build target configuration}, and a \i{build package
+configuration}. The machine configuration captures the operating system,
+installed compiler toolchain, and so on. The same build machine may be used to
+\"generate\" multiple \i{build target configurations}. For example, the same
+machine can normally be used to produce debug/optimized builds.
+
+\h2#arch-machine-config-build-machine|Build Machine Configuration|
The machine configuration is \i{approximately} encoded in its \i{machine
name}. The machine name is a list of components separated with \c{-}.
@@ -110,24 +124,24 @@ component.
The encoding is approximate in a sense that it captures only what's important
to distinguish in a particular \c{bbot} deployment.
-The first component normally identifies the operating system and has the
-following recommended form:
+The first three components normally identify the architecture, operating
+system, and optional variant. They have the following recommended form:
\
-[<arch>_][<class>_]<os>[_<version>]
+<arch>-[<class>_]<os>[_<version>][-<variant>]
\
For example:
\
-windows
-windows_10
-windows_10.1607
-i686_windows_xp
-bsd_freebsd_10
-linux_centos_6.2
-linux_ubuntu_16.04
-macos_10.12
+x86_64-windows
+x86_64-windows_10
+x86_64-windows_10.1607
+x86_64-windows_10-devmode
+x86_64-bsd_freebsd_10
+x86_64-linux_ubuntu_16.04
+x86_64-linux_rhel_9.2-bindist
+aarch64-macos_10.12
\
The second component normally identifies the installed compiler toolchain and
@@ -144,38 +158,53 @@ gcc
gcc_6
gcc_6.3
gcc_6.3_mingw_w64
+clang_3.9
clang_3.9_libc++
-clang_3.9_libstdc++
msvc_14
-msvc_14u3
-icc
+msvc_14.3
+clang_15.0_msvc_msvc_17.6
+clang_16.0_llvm_msvc_17.6
\
Some examples of complete machine names:
\
-windows_10-msvc_14u3
-macos_10.12-clang_10.0
-linux_ubuntu_16.04-gcc_6.3
-aarch64_linux_debian_11-gcc_12.2
+x86_64-windows_10-msvc_14.3
+x86_64-macos_10.12-clang_10.0
+aarch64-linux_ubuntu_16.04-gcc_6.3
+aarch64-linux_rhel_9.2-bindist-gcc_11
\
+\h2#arch-machine-config-build-target-config|Build Target Configuration|
+
Similarly, the build target configuration is encoded in a \i{configuration
name} using the same overall format. As described in \l{#arch-controller
Controller Logic}, target configurations are generated from machine
configurations. As a result, it usually makes sense to have the first
component identify the operating systems and the second component \- the
-toolchain with the rest identifying a particular target configuration variant,
-for example, optimized, sanitized, etc. For example:
+compiler toolchain with the rest identifying a particular target configuration
+variant, for example, optimized, sanitized, etc:
+
+\
+[<class>_]<os>[_<version>]-<toolchain>[-<variant>]
+\
+
+For example:
\
-windows-vc_14-O2
-linux-gcc_6-O3_asan
+windows_10-msvc_17.6
+windows_10-msvc_17.6-O2
+windows_10-msvc_17.6-static_O2
+windows_10-msvc_17.6-relocatable
+windows_10-clang_16.0_llvm_msvc_17.6_lld
+linux_debian_12-clang_16_libc++-static_O3
\
-While we can also specify the \c{<arch>} component in a build target
-configuration, this information is best conveyed as part of \c{<target>} as
-described in \l{#arch-controller Controller Logic}.
+Note that there is no \c{<arch>} component in a build target configuration:
+this information is best conveyed as part of \c{<target>} as described in
+\l{#arch-controller Controller Logic}.
+
+\h2#arch-machine-config-build-package-config|Build Package Configuration|
A package can be built in multiple package configurations per target
configuration. A build package configuration normally specifies the options
@@ -187,6 +216,42 @@ originate from the package manifest \c{*-build-config}, \c{*-builds},
\l{bpkg#manifest-package Package Manifest} for more information on these
values.
+
+\h2#arch-machine-config-auxiliary|Auxiliary Machines and Configurations|
+
+Besides the build machine and the build configuration that is derived from it,
+a package build may also involve one or more \i{auxiliary machines} and the
+corresponding \i{auxiliary configurations}.
+
+An auxiliary machine provides additional components that are required for
+building or testing a package and that are impossible or impractical to
+provide as part of the build machine itself. For example, a package may need
+access to a suitably configured database, such as PostgreSQL, in order to run
+its tests.
+
+The auxiliary machine name follows the same overall format as the build
+machine name except that the last component captures the information about the
+additional component in question rather than the compiler toolchain. For
+example:
+
+\
+x86_64-linux_debian_12-postgresql_16
+aarch64-linux_debian_12-mysql_8
+\
+
+The auxiliary configuration name is automatically derived from the machine
+name by removing the \c{<arch>} component. For example:
+
+\
+linux_debian_12-postgresql_16
+linux_debian_12-mysql_8
+\
+
+\N|Note that there is no generation of multiple auxiliary configurations from
+the same auxiliary machine since that would require some communication of the
+desired configuration variant to the machine.|
+
+
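The derivation of the auxiliary configuration name described above can be sketched as follows. This is only an illustration and not the bbot implementation: the function name is hypothetical, and it assumes that \c{<arch>} is always the first \c{-}-separated component and itself contains no \c{-} (as in \c{x86_64} or \c{aarch64}).

```python
def auxiliary_config_name(machine_name):
    """Derive an auxiliary configuration name from an auxiliary machine name.

    Hypothetical helper: drops the leading <arch> component, which is
    assumed to be everything before the first '-'.
    """
    arch, sep, rest = machine_name.partition("-")
    if not sep or not arch or not rest:
        raise ValueError(f"no <arch> component in machine name: {machine_name!r}")
    return rest
```

With the machine names from the examples above, \c{x86_64-linux_debian_12-postgresql_16} maps to \c{linux_debian_12-postgresql_16}.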
\h#arch-machine-header|Machine Header Manifest|
@@ TODO: need ref to general manifest overview in bpkg, or, better yet,
@@ -201,16 +266,28 @@ followed by the detailed description of each value in subsequent sections.
id: <machine-id>
name: <machine-name>
summary: <string>
+[role]: build|auxiliary
+[ram-minimum]: <kib>
+[ram-maximum]: <kib>
\
For example:
\
-id: windows_10-msvc_14-1.3
-name: windows_10-msvc_14
+id: x86_64-windows_10-msvc_14-1.3
+name: x86_64-windows_10-msvc_14
summary: Windows 10 build 1607 with VC 14 update 3
\
+\
+id: aarch64-linux_debian_12-postgresql_16-1.0
+name: aarch64-linux_debian_12-postgresql_16
+summary: Debian 12 with PostgreSQL 16 test user/database
+role: auxiliary
+ram-minimum: 2097152
+ram-maximum: 4194304
+\
+
\h2#arch-machine-header-id|\c{id}|
\
@@ -243,11 +320,34 @@ summary: <string>
The one-line description of the machine.
+\h2#arch-machine-header-role|\c{role}|
+
+\
+[role]: build|auxiliary
+\
+
+The machine role. If unspecified, then \c{build} is assumed.
+
+
+\h2#arch-machine-header-ram|\c{ram-minimum}, \c{ram-maximum}|
+
+\
+[ram-minimum]: <kib>
+[ram-maximum]: <kib>
+\
+
+The minimum and the maximum amount of RAM in KiB that the machine requires.
+The maximum amount is interpreted as the amount beyond which there will be no
+benefit. If unspecified, then it is assumed the machine will run with any
+minimum amount a deployment will provide and will always benefit from more
+RAM, respectively.
+
+
\h#arch-machine|Machine Manifest|
The build machine manifest contains the complete description of a build
machine on the build host (see the Build OS documentation for their origin and
-location). The machine manifest starts with the machine manifest header with
+location). The machine manifest starts with the machine header manifest with
all the header values appearing before any non-header values. The non-header
part of manifest synopsis is presented next followed by the detailed
description of each value in subsequent sections.
@@ -360,8 +460,11 @@ repository-url: <repository-url>
[dependency-checksum]: <checksum>
machine: <machine-name>
+[auxiliary-machine]: <machine-name>
+[auxiliary-machine-<name>]: <machine-name>
target: <target-triplet>
[environment]: <environment-name>
+[auxiliary-environment]: <environment-vars>
[target-config]: <tgt-config-args>
[package-config]: <pkg-config-args>
[host]: true|false
@@ -459,6 +562,21 @@ machine: <machine-name>
The name of the build machine to use.
+\h2#arch-task-auxiliary-machine|\c{auxiliary-machine}|
+
+\
+[auxiliary-machine]: <machine-name>
+[auxiliary-machine-<name>]: <machine-name>
+\
+
+The names of the auxiliary machines to use. These values correspond to the
+\c{build-auxiliary} and \c{build-auxiliary-<name>} values in the package
+manifest. While there each value specifies an auxiliary configuration pattern,
+here it specifies the concrete auxiliary machine name that was picked by the
+controller from the list of available auxiliary machines (sent as part of the
+task request) that match this pattern.
+
+
\h2#arch-task-target|\c{target}|
\
@@ -484,6 +602,50 @@ The name of the build environment to use. See \l{#arch-worker Worker Logic}
for details.
+\h2#arch-task-auxiliary-environment|\c{auxiliary-environment}|
+
+\
+[auxiliary-environment]: <environment-vars>
+\
+
+The environment variables describing the auxiliary machines. If any
+\c{auxiliary-machine*} values are specified, then after starting such
+machines, the agent prepares a combined list of environment variables that
+were uploaded by such machines and passes it in this value to the worker.
+
+The format of this value is a list of environment variable assignments
+one per line, in the form:
+
+\
+<name>=<value>
+\
+
+Whitespaces before \c{<name>}, around \c{=}, and after \c{<value>} as well as
+blank lines are ignored. The \c{<value>} part as a whole can be single ('\ ')
+or double (\"\ \") quoted. For example:
+
+\
+DATABASE_HOST=192.168.0.1
+DATABASE_PORT=1245
+DATABASE_USER='John \"Johnny\" Doe'
+DATABASE_NAME=\" test database \"
+\
+
+If the corresponding machine is specified as \c{auxiliary-machine-<name>},
+then its environment variables are prefixed with capitalized \c{<name>_}. For
+example:
+
+\
+auxiliary-machine-pgsql: x86_64-linux_debian_12-postgresql_16
+auxiliary-environment:
+\\
+PGSQL_DATABASE_HOST=192.168.0.1
+PGSQL_DATABASE_PORT=1245
+...
+\\
+\
+
+
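To make the \c{auxiliary-environment} format above more concrete, here is a minimal parsing sketch. It is not taken from bbot: both function names are hypothetical, no escape sequences are interpreted inside quoted values (the text above does not specify any), and "capitalized" is assumed to mean upper-casing the \c{<name>} suffix as in the \c{pgsql} to \c{PGSQL_} example.

```python
def parse_auxiliary_environment(text):
    """Parse '<name>=<value>' lines into a dict (illustrative sketch only)."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # Blank lines are ignored.
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in line: {line!r}")
        name = name.strip()    # Whitespace before <name> and around '='.
        value = value.strip()  # Whitespace around '=' and after <value>.
        if not name:
            raise ValueError(f"empty variable name in line: {line!r}")
        # The value as a whole may be single- or double-quoted; the quotes
        # are removed and everything inside is kept verbatim (assumption).
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[name] = value
    return env


def prefix_environment(env, name):
    """Prefix variables of a machine given as auxiliary-machine-<name>."""
    prefix = name.upper() + "_"
    return {prefix + key: value for key, value in env.items()}
```

For instance, parsing the first example above and then calling \c{prefix_environment(env, "pgsql")} would produce names such as \c{PGSQL_DATABASE_HOST}. Quoting is what preserves the leading and trailing spaces in a value like the \c{DATABASE_NAME} one.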
\h2#arch-task-target-config|\c{target-config}|
\
@@ -699,7 +861,7 @@ Note that the overall \c{status} value should appear before any per-operation
The \c{skip} status indicates that the received from the controller build task
checksums have not changed and the task execution has therefore been skipped
-under the assumtion that it would have produced the same result. See
+under the assumption that it would have produced the same result. See
\c{agent-checksum}, \c{worker-checksum}, and \c{dependency-checksum} for
details.
@@ -765,9 +927,9 @@ The version of the worker logic used to perform the package build task.
An agent (or controller acting as an agent) sends a task request to its
controller via HTTP/HTTPS POST method (@@ URL/API endpoint). The task request
-starts with the task request manifest followed by a list of machine manifests.
-The task request manifest synopsis is presented next followed by the detailed
-description of each value in subsequent sections.
+starts with the task request manifest followed by a list of machine header
+manifests. The task request manifest synopsis is presented next followed by
+the detailed description of each value in subsequent sections.
\
agent: <name>
@@ -776,6 +938,7 @@ toolchain-version: <standard-version>
[interactive-mode]: false|true|both
[interactive-login]: <login>
[fingerprint]: <agent-fingerprint>
+[auxiliary-ram]: <kib>
\
@@ -842,6 +1005,18 @@ authentication in which case it should respond with the 401 (unauthorized)
HTTP status code.
+\h2#arch-task-req-auxiliary-ram|\c{auxiliary-ram}|
+
+\
+[auxiliary-ram]: <kib>
+\
+
+The amount of RAM in KiB that is available for running auxiliary machines. If
+unspecified, then assume there is no hard limit (that is, the agent can
+allocate up to the host's available RAM minus the amount required to run the
+build machine).
+
+
\h#arch-task-res|Task Response Manifest|
A controller sends the task response manifest in response to the task request
@@ -969,20 +1144,24 @@ established for a particular build target. The environment has three
components: the execution environment (environment variables, etc), build
system modules, as well as configuration options and variables.
-Setting up of the environment is performed by an executable (script, batch
-file, etc). Specifically, upon receiving a build task, if it specifies the
-environment name then the worker looks for the environment setup executable
-with this name in a specific directory and for the executable called
-\c{default} otherwise. Not being able to locate the environment executable is
-an error.
-
-Once the environment setup executable is determined, the worker re-executes
-itself as that executable passing to it as command line arguments the target
-name, the path to the \c{bbot} worker to be executed once the environment is
-setup, and any additional options that need to be propagated to the re-executed
-worker. The environment setup executable is executed in the build directory as
-its current working directory. The build directory contains the build task
-\c{task.manifest} file.
+Setting up of the execution environment is performed by an executable (script,
+batch file, etc). Specifically, upon receiving a build task, if it specifies
+the environment name then the worker looks for the environment setup
+executable with this name in a specific directory and for the executable
+called \c{default} otherwise. Not being able to locate the environment
+executable is an error.
+
+In addition to the environment executable, if the task requires any auxiliary
+machines, then the \c{auxiliary-environment} value from the task manifest is
+incorporated into the execution environment.
+
+Specifically, once the environment setup executable is determined, the worker
+re-executes itself in the auxiliary environment and as that executable passing
+to it as command line arguments the target name, the path to the \c{bbot}
+worker to be executed once the environment is setup, and any additional
+options that need to be propagated to the re-executed worker. The environment
+setup executable is executed in the build directory as its current working
+directory. The build directory contains the build task \c{task.manifest} file.
The environment setup executable sets up the necessary execution environment
for example by adjusting \c{PATH} or running a suitable \c{vcvars} batch file.
@@ -2211,7 +2390,7 @@ manifest. The matched machine name, the target, the environment name,
configuration options/variables, and regular expressions are included into the
build task manifest.
-Values in the \c{<tgt-config-arg>} list can be opionally prefixed with the
+Values in the \c{<tgt-config-arg>} list can be optionally prefixed with the
\i{step id} or a leading portion thereof to restrict it to a specific step,
operation, phase, or tool in the \i{worker script} (see \l{#arch-worker Worker
Logic}). The prefix can optionally begin with the \c{+} or \c{-} character (in