// file : doc/manual.cli
// copyright : Copyright (c) 2014-2017 Code Synthesis Ltd
// license : MIT; see accompanying LICENSE file
"\name=build2-build-bot-manual"
"\subject=build bot"
"\title=Build Bot"
// NOTES
//
// - Maximum <pre> line is 70 characters.
//
"
\h0#preface|Preface|
This document describes \c{bbot}, the \c{build2} build bot.
\h1#intro|Introduction|
\h1#arch|Architecture|
The \c{bbot} architecture includes several layers for security and
manageability. At the top we have a \c{bbot} running in the \i{controller}
mode. The controller monitors various \i{build sources} for \i{build
tasks}. For example, a controller may poll \c{brep} instances for any new
packages to build as well as monitor a \c{git} repository for any new commits
to test. There can be several layers of controllers with \c{brep} being just a
special kind. A machine running a \c{bbot} instance in the controller mode is
called a \i{controller host}.
Below the controllers we have a \c{bbot} running in the \i{agent} mode
normally on Build OS. The agent polls its controllers for \i{build tasks} to
perform. A machine running a \c{bbot} instance in the agent mode is called a
\i{build host}.
The actual building is performed in the virtual machines and/or containers
that are executed on the build host. Inside virtual machines/containers,
\c{bbot} is running in the \i{worker mode} and receives build tasks from its
agent. Virtual machines and containers running a \c{bbot} instance in the
worker mode are collectively called \i{build machines}.
Let's now examine the workflow in the other direction, that is, from a worker
to a controller. Once a build machine is booted (by the agent), the worker
inside connects to the TFTP server running on the build host and downloads the
\i{build task manifest}. It then proceeds to perform the build task and
uploads the \i{build result manifest} (which includes build logs) to the TFTP
server.
Once an agent receives a build task for a specific build machine, it goes
through the following steps. First, it creates a directory on its TFTP server
with the \i{machine name} as its name and places the build task manifest
inside. Next, it makes a throw-away snapshot of the build machine and boots
it. After booting the build machine, the agent monitors the machine directory
on its TFTP server for the build result manifest (uploaded by the worker once
the build has completed). Once the result manifest is obtained, the agent
shuts down the build machine and discards its snapshot.
To obtain a build task the agent polls one or more controllers via
HTTP/HTTPS. Before each poll request the agent enumerates the available build
machines and sends this information as part of the request. The controller
responds with a build task manifest that identifies a specific build machine
to use.
If the controller has higher-level controllers (for example, \c{brep}), then
it aggregates the available build machines from its agents and polls these
controllers (just as an agent would), forwarding build tasks to suitable
agents. In this case we say that the \i{controller acts as an agent}. The
controller may also be configured to monitor build sources, such as SCM
repositories, directly in which case it generates build tasks itself.
In this architecture the build results are propagated up the chain: from a
worker, to its agent, to its controller, and so on. A controller that is the
final destination of a build result uses email to notify interested parties of
the outcome. For example, \c{brep} would send a notification to the package
owner if the build failed. Similarly, a \c{bbot} controller that monitors a
\c{git} repository would send an email to a committer if their commit caused a
build failure. The email would include a link (normally HTTP/HTTPS) to the
build logs hosted by the controller.
\h#arch-machine-config|Configurations|
The \c{bbot} architecture distinguishes between a \i{machine configuration}
and a \i{build configuration}. The machine configuration captures the
operating system, installed compiler toolchain, and so on. The same build
machine may be used to \"generate\" multiple \i{build configurations}. For
example, the same machine can normally be used to produce 32/64-bit and
debug/release builds.
The machine configuration is \i{approximately} encoded in its \i{machine
name}. The machine name is a list of components separated with \c{-}. Each
component can contain alpha-numeric characters, underscores, dots, and pluses
with the whole name being a portably-valid path component.
The encoding is approximate in the sense that it captures only what's important
to distinguish in a particular \c{bbot} deployment.
The first component normally identifies the operating system and has the
following recommended form:
\
[<arch>_][<class>_]<os>[_<version>]
\
For example:
\
windows
windows_10
windows_10.1607
i686_windows_xp
bsd_freebsd_10
linux_centos_6.2
linux_ubuntu_16.04
macos_10.12
\
The second component normally identifies the installed compiler toolchain and
has the following recommended form:
\
<id>[_<version>][_<runtime>]
\
For example:
\
gcc
gcc_6
gcc_6.3
clang_3.9_libc++
clang_3.9_libstdc++
msvc_14
msvc_14u3
icc
\
Some examples of complete machine names:
\
windows_10-msvc_14u3
macos_10.12-clang
linux_ubuntu_16.04-gcc_6.3
\
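The naming rules above can be checked mechanically. The following Python
sketch is an illustration only and not part of \c{bbot}: the
\c{parse_machine_name()} function and its regular expression are assumptions
derived from the component rules described in this section (alpha-numeric
characters, underscores, dots, and pluses, separated with \c{-}).

```python
import re

# One component: alpha-numeric characters, underscores, dots, pluses.
# (Hypothetical pattern derived from the rules in this section.)
COMPONENT = re.compile('[A-Za-z0-9_.+]+')


def parse_machine_name(name):
    # Split a machine name into its dash-separated components.
    components = name.split('-')
    for c in components:
        if COMPONENT.fullmatch(c) is None:
            raise ValueError('invalid component in ' + repr(name))
    return components


print(parse_machine_name('linux_ubuntu_16.04-gcc_6.3'))
```

For the name above the sketch yields the operating system component
\c{linux_ubuntu_16.04} and the toolchain component \c{gcc_6.3}. Empty
components and characters outside the allowed set are rejected.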
Similarly, the build configuration is encoded in a \i{configuration name}
using the same format. As described in \l{#arch-controller Controller Logic},
build configurations are generated from machine configurations. As a result,
it usually makes sense to have the first component identify the operating
system and the second component \- the toolchain with the rest identifying
a particular build configuration. For example:
\
windows-vc_14-32-debug
linux-gcc_6-cross-arm-eabi
\
\h#arch-machine-header-manifest|Machine Header Manifest|
\
SYNOPSIS
id: <machine-id>
name: <machine-name>
summary: <string>
\
The build machine header manifest contains basic information about a build
machine on the build host. A list of machine header manifests is sent by
\c{bbot} agents to controllers.
\dl|
\li|\n\c{id: <machine-id>}\n
The \i{machine-id} uniquely identifies a machine version/revision/build.
For virtual machines this can be the disk image checksum. For a container
this can be a UUID that is re-generated every time a container filesystem
is altered.|
\li|\n\c{name: <machine-name>}\n
The machine name as described above.|
\li|\n\c{summary: <string>}\n
A one-line description of the machine. For example:
\
id: windows_10-msvc_14-1.3
name: windows_10-msvc_14
summary: Windows 10 build 1607 with VC 14 update 3
\
||
\h#arch-machine-manifest|Machine Manifest|
\
SYNOPSIS
id: <machine-id>
name: <machine-name>
summary: <string>
type: <machine-type>
mac: <macaddr>
options: <machine-options>
\
The build machine manifest contains the complete description of a build
machine on the build host (see the Build OS documentation for their origin and
location). The machine manifest starts with the machine header manifest. All
the header values must appear before any non-header values.
\dl|
\li|\n\c{type: <machine-type>}\n
The machine type. Valid values are \c{kvm} (QEMU/KVM virtual machine) and
\c{nspawn} (\c{systemd-nspawn} container).|
\li|\n\c{mac: <macaddr>}\n
Optional fixed MAC address for the machine in the hexadecimal,
colon-separated format. For example:
\
mac: de:ad:be:ef:de:ad
\
If it is not specified, then a random address is generated on the first
machine bootstrap which is then reused for each build/re-bootstrap. Note
that if you specify a fixed address, then the machine can only be used by a
single \c{bbot} agent.|
\li|\n\c{options: <machine-options>}\n
Optional list of machine options. The exact semantics is machine
type-dependent (see below). A single level of quotes (either single or
double) is removed in each option before being passed on. Options can be
separated with spaces or newlines.
For \c{kvm} machines, if this value is present, then it replaces the
default network and disk configuration when starting the QEMU/KVM
hypervisor. The options are pre-processed by replacing the question
mark in \c{ifname=?} and \c{mac=?} strings with the network interface
and MAC address, respectively.||
\h#arch-task-manifest|Task Manifest|
\
SYNOPSIS
name: <package-name>
version: <package-version>
#location: <package-url>
repository: <repository-url>
trust: <repository-fp>
machine: <machine-name>
target: <target-triplet>
config: <config-vars>
warning-regex: <warning-regexes>
\
The task manifest describes a build task. It consists of two groups of values.
The first group defines the package to build. The second group defines the
build configuration to use for building the package.
\dl|
\li|\n\c{name: <package-name>}\n
Package name to test.|
\li|\n\c{version: <package-version>}\n
Package version to test.|
\li|\n\c{repository: <repository-url>}\n
The \c{bpkg} repository that contains the package and its dependencies.|
\li|\n\c{trust: <repository-fp>}\n
The SHA256 repository certificate fingerprint to trust (see the \c{bpkg}
\c{--trust} option for details). This value may be specified multiple times
to establish the authenticity of multiple certificates. If the special
\c{yes} value is specified, then all repositories will be trusted without
authentication (see the \c{bpkg} \c{--trust-yes} option).
Note that while the controller may return a task with \c{trust} values,
whether they will be used is up to the agent's configuration. For example,
some agents may only trust their internally-specified fingerprints to
prevent the \"man in the middle\" type of attacks.|
\li|\n\c{machine: <machine-name>}\n
The name of the build machine to use.|
\li|\n\c{target: <target-triplet>}\n
The target triplet to build for. If not specified, then the default target
for this machine is used (which is usually the machine itself).
Compared to the autotools terminology, the \c{machine} value corresponds
to \c{--build} (the machine we are building on) and \c{target} \- to
\c{--host} (the machine we are building for). While we use essentially
the same \i{target triplet} format as autotools for \c{target}, it is
not flexible enough for \c{machine}.|
\li|\n\c{config: <config-vars>}\n
Additional build system configuration variables.
A single level of quotes (either single or double) is removed in each
variable before being passed to \c{bpkg}. For example, the following value:
\
config: config.cc.coptions=\"-O3 -stdlib='libc++'\"
\
Will be passed to \c{bpkg} as the following (single) argument:
\
config.cc.coptions=-O3 -stdlib='libc++'
\
Variables can be separated with spaces or newlines.|
\li|\n\c{warning-regex: <warning-regexes>}\n
Additional regular expressions that should be used to detect warnings in
the logs.
A single level of quotes (either single or double) is removed in each
expression before being used for search. For example, the following value:
\
warning-regex: \"warning C4\d{3}: \"
\
Will be treated as the following (single) regular expression (with a
trailing space):
\
warning C4\d{3}:
\
Expressions can be separated with spaces or newlines. They will be added to
the following default list of regular expressions that detect the \c{build2}
toolchain warnings:
\
^warning:
^.+: warning:
\
Note that this built-in list also covers GCC and Clang warnings (for the
English locale).||
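The quoting rule shared by the \c{config} and \c{warning-regex} values (split
on spaces or newlines, then remove a single level of quotes) can be sketched
as follows. This is a minimal Python illustration, not the \c{bbot}
implementation; the \c{split_unquote()} name and the treatment of an
unterminated quote as an error are assumptions.

```python
# Quote characters: double quote and single quote.
QUOTES = (chr(34), chr(39))


def split_unquote(value):
    # Split on whitespace outside quotes, dropping one quote level.
    items, cur = [], []
    quote = None      # Quote character we are currently inside of.
    started = False   # True once the current item has any content.
    for ch in value:
        if quote is not None:
            if ch == quote:
                quote = None
            else:
                cur.append(ch)  # Includes the other kind of quote.
        elif ch in QUOTES:
            quote = ch
            started = True
        elif ch.isspace():
            if started:
                items.append(''.join(cur))
                cur, started = [], False
        else:
            cur.append(ch)
            started = True
    if quote is not None:
        raise ValueError('unterminated quote')
    if started:
        items.append(''.join(cur))
    return items


value = '''config.cc.coptions=\"-O3 -stdlib='libc++'\"
config.cxx=g++-6'''
for item in split_unquote(value):
    print(item)
```

The first variable is returned as a single item with the outer double quotes
removed and the inner single quotes preserved. The second variable, separated
by a newline, is returned unchanged.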
\h#arch-result-manifest|Result Manifest|
\
SYNOPSIS
name: <package-name>
version: <package-version>
status: <status>
configure-status: <status>
update-status: <status>
test-status: <status>
configure-log: <text>
update-log: <text>
test-log: <text>
\
The result manifest describes a build result.
\dl|
\li|\n\c{name: <package-name>}\n
Package name from the task manifest.|
\li|\n\c{version: <package-version>}\n
Package version from the task manifest.|
\li|\n\c{status: <status>}\n
An overall (cumulative) build result status. Valid values are:
\
success # All operations completed successfully.
warning # One or more operations completed with warnings.
error # One or more operations completed with errors.
abort # One or more operations were aborted.
abnormal # One or more operations terminated abnormally.
\
The \c{abort} status indicates that the operation has been aborted by
\c{bbot}, for example, because it was consuming too many resources and/or
was taking too long. Note that a task can be aborted both by the \c{bbot}
worker as well as the agent. In the latter case the whole machine is shut
down and no operation-specific status or logs will be included (@@ Maybe
we should just include 'log:' with commands that start VM, for
completeness?).
The \c{abnormal} status indicates that the operation has terminated
abnormally, for example, due to the package manager or build system crash.
Note that the overall \c{status} value should appear before any
per-operation \c{*-status} values.|
\li|\n\c{*-status: <status>}\n
A per-operation result status. Note that the \c{*-status} values should
appear in the same order as the corresponding operations were performed
and for each \c{*-status} there should be a corresponding \c{*-log}.|
\li|\n\c{*-log: <text>}\n
A per-operation result log. Note that the \c{*-log} values should appear
last and in the same order as the corresponding \c{*-status} values.||
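One way to derive the overall \c{status} from the per-operation values is to
pick the most severe one. The Python sketch below assumes that the statuses
are ordered by severity exactly as they are listed above. This ordering and
the \c{overall_status()} helper are assumptions made for illustration, not
something this document specifies.

```python
# Statuses in the order listed above, assumed to be least to most
# severe (an assumption of this sketch).
SEVERITY = ['success', 'warning', 'error', 'abort', 'abnormal']


def overall_status(operation_statuses):
    # Return the most severe of the per-operation statuses.
    return max(operation_statuses, key=SEVERITY.index)


print(overall_status(['success', 'warning', 'success']))
```

Under this assumption a build whose \c{update} operation produced warnings
while \c{configure} and \c{test} succeeded is reported as \c{warning}.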
\h#arch-task-req-manifest|Task Request Manifest|
\
SYNOPSIS
agent: <agent-name>
toolchain-name: <name>
toolchain-version: <standard-version>
fingerprint: <agent-fingerprint>
\
An agent (or controller acting as an agent) sends a task request to its
controller via HTTP/HTTPS POST method (@@ URL/API endpoint). The task request
starts with the task request manifest followed by a list of machine manifests.
\dl|
\li|\n\c{agent: <agent-name>}\n
The name of the agent host (\c{hostname}). This should be unique in a
particular \c{bbot} deployment.|
\li|\n\c{toolchain-name: <name>}\n
The \c{build2} toolchain name being used by the agent.|
\li|\n\c{toolchain-version: <standard-version>}\n
The \c{build2} toolchain version being used by the agent.|
\li|\n\c{fingerprint: <agent-fingerprint>}\n
The SHA256 fingerprint of the agent's public key. An agent may be configured
not to use the public key-based authentication in which case it does not
include this value. However, the controller may be configured to require
the authentication in which case it will respond with the
401 (unauthorized) HTTP status code.||
\h#arch-task-res-manifest|Task Response Manifest|
\
SYNOPSIS
session: <session-id>
challenge: <text>
result-url: <url>
\
A controller sends the task response manifest in response to the task request
initiated by an agent. The response is delivered as a result of the POST
method. The task response starts with the task response manifest optionally
followed by a task manifest.
\dl|
\li|\n\c{session: <session-id>}\n
An identifier assigned to this session by the controller. An empty value
indicates that the controller has no tasks at this time in which case all
the following values as well as the task manifest are absent.|
\li|\n\c{challenge: <text>}\n
Random 64-character string (nonce) used to challenge the agent's private
key. If present, then the agent must sign this string and include the
signature in the result request.
The signature should be calculated by encrypting the string with the agent's
private key and then base64-encoding the result.|
\li|\n\c{result-url: <url>}\n
The URL to post the result (upload) request to.||
\h#arch-result-req-manifest|Result Request Manifest|
\
SYNOPSIS
session: <session-id>
challenge: <text>
\
On completion of a task an agent (or controller acting as an agent) sends a
result (upload) request to the controller via HTTP/HTTPS POST method using the
URL returned in the task response. The result request starts with the
result request manifest followed by a result manifest. Note that there is no
result response and only a successful but empty POST result is returned.
\dl|
\li|\n\c{session: <session-id>}\n
The session id as returned by the controller in the task response.|
\li|\n\c{challenge: <text>}\n
The answer to the private key challenge as posed by the controller in the
task response. Must be present only if the challenge value was present in
the task response.||
\h#arch-worker|Worker Logic|
The \c{bbot} worker builds each package in a \i{build environment} that is
established for a particular build target. The environment has three
components: the execution environment (environment variables, etc), build
system modules, and configuration variables.
Setting up the environment is performed by an executable (script, batch
file, etc). Specifically, upon receiving a build task, the worker obtains its
target and looks for the environment setup executable with this name in a
specific directory. If not found or if the target is unspecified, then the
worker looks for the executable called \c{default}. Not being able to locate
the environment executable is an error.
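The lookup order can be sketched in Python as follows. The
\c{find_env_executable()} function, the \c{env_dir} parameter, and the example
target are hypothetical. The sketch only checks for the existence of a file
with the given name and does not model platform-specific extensions or
executable permissions.

```python
import os
import tempfile


def find_env_executable(env_dir, target):
    # Try the target-specific executable first, then the default.
    names = ([target] if target else []) + ['default']
    for name in names:
        path = os.path.join(env_dir, name)
        if os.path.isfile(path):
            return path
    raise FileNotFoundError('no environment setup executable')


# Demonstration with a throw-away directory that only has default.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, 'default'), 'w').close()
    found = find_env_executable(d, 'x86_64-linux-gnu')
    print(os.path.basename(found))
```

Since the demonstration directory contains no executable named after the
target, the lookup falls back to \c{default}.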
Once the environment setup executable is determined, the worker re-executes
itself as that executable passing to it as command line arguments the target
name (or empty value if not specified), the path to the \c{bbot} worker to be
executed once the environment is set up, and any additional options that need
to be propagated to the re-executed worker. The environment setup executable
is executed in the build directory as its current working directory. The build
directory contains the build task \c{manifest} file.
The environment setup executable sets up the necessary execution environment,
for example, by adjusting \c{PATH} or running a suitable \c{vcvars} batch file.
It then re-executes itself as the \c{bbot} worker passing to it as command
line arguments (in addition to worker options) the list of build system
modules (\c{<env-modules>}) and the list of configuration variables
(\c{<env-config-vars>}). The environment setup executable must execute the
\c{bbot} worker in the build directory as the current working directory.
The re-executed \c{bbot} worker then proceeds to test the package from the
repository by executing the following commands (\c{<>}-values are from the
task manifest and environment):
\
bpkg -v create <env-modules> <config-vars> <env-config-vars>
bpkg -v add <repository-url>
bpkg -v fetch --trust <repository-fp>
bpkg -v build --yes --configure-only <package-name>/<package-version>
bpkg -v update <package-name>
bpkg -v test <package-name>
\
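To make the substitution explicit, the following Python sketch assembles the
same command lines from a task described as a dictionary. The
\c{worker_commands()} function, the dictionary keys, and the example values
(including the package name, the repository URL, and the fingerprint
placeholder) are hypothetical. Passing one \c{--trust} option per \c{trust}
value is also an assumption. The sketch only builds argument lists and does
not run \c{bpkg}.

```python
def worker_commands(task, env_modules, env_config_vars):
    # Build the bpkg argument lists in the order shown above.
    name, version = task['name'], task['version']
    trust = []
    for fp in task.get('trust', []):
        trust += ['--trust', fp]  # Assumed: one option per value.
    b = ['bpkg', '-v']
    return [
        b + ['create'] + env_modules + task.get('config', [])
          + env_config_vars,
        b + ['add', task['repository']],
        b + ['fetch'] + trust,
        b + ['build', '--yes', '--configure-only',
             name + '/' + version],
        b + ['update', name],
        b + ['test', name],
    ]


task = {
    'name': 'libhello',                       # Hypothetical package.
    'version': '1.0.0',
    'repository': 'https://example.org/1/',   # Placeholder URL.
    'trust': ['<repository-fp>'],             # Placeholder value.
    'config': ['config.cc.coptions=-O3'],
}
cmds = worker_commands(task, ['cc'], ['config.c=gcc-6'])
for args in cmds:
    print(' '.join(args))
```

Keeping each command as a list of arguments avoids re-quoting values such as
configuration variables that contain spaces. The joined form is printed here
only for display.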
As an example, the following POSIX shell script can be used to set up the
environment for building C and C++ packages with GCC 6 on most Linux
distributions.
\
#!/bin/sh
# Environment setup script for C/C++ compilation with GCC 6.
#
# $1 - target
# $2 - bbot executable
# $3+ - bbot options
set -e # Exit on errors.
t=\"$1\"
shift
if test -n \"$t\"; then
echo \"unknown target: $t\" 1>&2
exit 1
fi
exec \"$@\" cc config.c=gcc-6 config.cxx=g++-6
\
\h#arch-controller|Controller Logic|
A \c{bbot} controller that issues its own build tasks maps available build
machines (as reported by agents) to \i{build configurations} according to the
\c{buildtab} configuration file. Blank lines and lines that start with \c{#}
are ignored. All other lines in this file have the following format:
\
<machine-pattern> <config> [<target>] [<config-vars>] [<warning-regex>]
\
Where \c{<machine-pattern>} is a filesystem wildcard pattern that is
matched against available machine names, \c{<config>} is the
configuration name, optional \c{<target>} is the build target, optional
\c{<config-vars>} is a list of additional build system configuration
variables, and optional \c{<warning-regex>} is a list of additional regular
expressions that should be used to detect warnings in the logs.
Regular expressions must start with \c{~}, to be distinguished from
configuration variables. Note that \c{<config-vars>} and \c{<warning-regex>}
lists have the same quoting semantics as in the \c{config} and the
\c{warning-regex} values in the build task manifest. The matched machine name,
the target, configuration variables, and regular expressions are included into
the build task manifest.
Note that each machine name is matched against every pattern and all the
patterns that match produce a configuration. If a machine does not match any
pattern, then it is ignored (meaning that this controller is not interested in
testing its packages with this machine). If multiple machines match the same
pattern, then only a single configuration using any of the machines is
produced (meaning that this controller considers these machines equivalent).
As an example, let's say we have a machine named \c{windows_10-vc_14u3}. If
we wanted to test both 32 and 64-bit builds as well as debug and release, then
we could have generated the following configurations:
\
windows*-vc_14* windows-vc_14-32-debug i686-microsoft-win32-msvc14.0 config.cc.coptions=/Z7 config.cc.loptions=/DEBUG ~\"warning C4\d{3}: \"
windows*-vc_14* windows-vc_14-32-release i686-microsoft-win32-msvc14.0 config.cc.coptions=\"/O2 /Oi\" ~\"warning C4\d{3}: \"
windows*-vc_14* windows-vc_14-64-debug x86_64-microsoft-win32-msvc14.0 config.cc.coptions=/Z7 config.cc.loptions=/DEBUG ~\"warning C4\d{3}: \"
windows*-vc_14* windows-vc_14-64-release x86_64-microsoft-win32-msvc14.0 config.cc.coptions=\"/O2 /Oi\" ~\"warning C4\d{3}: \"
\
As another example, let's say we have \c{linux_fedora_25-gcc_6} and
\c{linux_ubuntu_16.04-gcc_6}. If all we cared about is testing GCC 6 on Linux,
then our configurations could look like this (note the missing target):
\
linux*-gcc_6 linux-gcc_6-debug config.cc.coptions=-g
linux*-gcc_6 linux-gcc_6-release config.cc.coptions=-O3
\
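The matching rules described in this section can be sketched in Python as
follows. This is an illustration under stated assumptions, not the
controller's implementation: the \c{generate_configs()} function is
hypothetical, \c{fnmatch.fnmatchcase()} stands in for the filesystem wildcard
matching, and the first matching machine is picked where the text allows any
of them.

```python
from fnmatch import fnmatchcase


def generate_configs(buildtab, machines):
    # Map each buildtab pattern to at most one matching machine.
    configs = []
    for line in buildtab.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # Blank lines and comments are ignored.
        # Only the first two fields are used in this sketch.
        pattern, config = line.split()[:2]
        matches = [m for m in machines if fnmatchcase(m, pattern)]
        if matches:
            # Machines matching one pattern are equivalent.
            configs.append((config, matches[0]))
    return configs


buildtab = '''
# Test GCC 6 on any Linux machine.
linux*-gcc_6 linux-gcc_6-debug   config.cc.coptions=-g
linux*-gcc_6 linux-gcc_6-release config.cc.coptions=-O3
'''
machines = ['linux_fedora_25-gcc_6', 'linux_ubuntu_16.04-gcc_6',
            'windows_10-vc_14u3']
for config, machine in generate_configs(buildtab, machines):
    print(config, machine)
```

Both Linux machines match each pattern, so each line produces a single
configuration that uses one of them. The Windows machine matches no pattern
and is ignored.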
"