// file : doc/manual.cli
// license : MIT; see accompanying LICENSE file
"\name=build2-repository-interface-manual"
"\subject=repository interface"
"\title=Repository Interface"
// NOTES
//
// - Maximum <pre> line is 70 characters.
//
"
\h0#preface|Preface|
This document describes \c{brep}, the \c{build2} package repository web
interface. For the command line interface of \c{brep} utilities refer to the
\l{brep-load(1)}, \l{brep-clean(1)}, \l{brep-migrate(1)}, and
\l{brep-monitor(1)} man pages.
\h1#submit|Package Submission|
The package submission functionality allows uploading of package archives as
well as additional, repository-specific information via the HTTP \c{POST}
method using the \c{multipart/form-data} content type. The implementation in
\c{brep} only handles uploading as well as basic verification (checksum,
duplicates) expecting the rest of the submission and publishing logic to be
handled by a separate entity according to the repository policy. Such an
entity can be notified by \c{brep} about a new submission as an invocation of
the \i{handler program} (as part of the HTTP request) and/or via email. It
could also be a separate process that monitors the upload data directory.
The submission request without any parameters is treated as the submission
form request. If \c{submit-form} is configured, then such a form is generated
and returned. Otherwise, such a request is treated as an invalid submission
(missing parameters).
For each submission request \c{brep} performs the following steps.
\ol|
\li|Verify submission size limit.
The submission form-data payload size must not exceed \c{submit-max-size}.|
\li|Verify the required \c{archive} and \c{sha256sum} parameters are present.
The \c{archive} parameter must be the package archive upload while
\c{sha256sum} must be its 64-character SHA256 checksum calculated in the
binary mode.|
\li|Verify other parameters are valid manifest name/value pairs.
The value can only contain UTF-8 encoded Unicode graphic characters as well as
tab (\c{\\t}), carriage return (\c{\\r}), and line feed (\c{\\n}).|
\li|Check for a duplicate submission.
Each submission is saved as a subdirectory in the \c{submit-data} directory
with a 12-character abbreviated checksum as its name.|
\li|Save the package archive into a temporary directory and verify its
checksum.
A temporary subdirectory is created in the \c{submit-temp} directory, the
package archive is saved into it using the submitted name, and its checksum
is calculated and compared to the submitted checksum.|
\li|Save the submission request manifest into the temporary directory.
The submission request manifest is saved as \c{request.manifest} into the
temporary subdirectory next to the archive.|
\li|Make the temporary submission directory permanent.
Move/rename the temporary submission subdirectory to \c{submit-data} as an
atomic operation using the 12-character abbreviated checksum as its new
name. If such a directory already exists, then this is a duplicate submission.|
\li|Invoke the submission handler program.
If \c{submit-handler} is configured, invoke the handler program passing to it
additional arguments specified with \c{submit-handler-argument} (if any)
followed by the absolute path to the submission directory.
The handler program is expected to write the submission result manifest to
\c{stdout} and terminate with the zero exit status. A non-zero exit status is
treated as an internal error. The handler program's \c{stderr} is logged.
Note that the handler program should report temporary server errors (service
overload, network connectivity loss, etc.) via the submission result manifest
status values in the [500-599] range (HTTP server error) rather than via a
non-zero exit status.
The handler program assumes ownership of the submission directory and can
move/remove it. If after the handler program terminates the submission
directory still exists, then it is handled by \c{brep} depending on the
handler process exit status and the submission result manifest status value.
If the process has terminated abnormally or with a non-zero exit status or the
result manifest status is in the [500-599] range (HTTP server error), then the
directory is saved for troubleshooting by appending the \c{.fail} extension
followed by a numeric extension to its name (for example,
\c{ff5a1a53d318.fail.1}). Otherwise, if the status is in the [400-499] range
(HTTP client error), then the directory is removed. If the directory is left
in place by the handler or is saved for troubleshooting, then the submission
result manifest is saved as \c{result.manifest} into this directory, next to
the request manifest and archive.
If \c{submit-handler-timeout} is configured and the handler program does not
exit in the allotted time, then it is killed and its termination is treated as
abnormal.
If the handler program is not specified, then the following submission result
manifest is implied:
\
status: 200
message: package submission is queued
reference: <abbrev-checksum>
\
|
\li|Send the submission email.
If \c{submit-email} is configured, send an email to this address containing
the submission request manifest and the submission result manifest.|
\li|Respond to the client.
Respond to the client with the submission result manifest and its \c{status}
value as the HTTP status code.|
|
Check violations (max size, duplicate submissions, etc) that are explicitly
mentioned above are always reported with the submission result manifest. Other
errors (for example, internal server errors) might be reported with
unformatted text, including HTML.
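As an illustration of the checksum-related steps above, the following Python
sketch calculates the SHA256 checksum of an archive read in the binary mode
and derives the 12-character abbreviated checksum used as the submission
directory name. The function names and the chunk size are made up for this
example, and deriving the abbreviation by taking the leading characters of the
full checksum is an assumption rather than something stated above.

```python
import hashlib
import io


def sha256sum(stream, chunk_size=65536):
    # Hash the stream contents as raw bytes (binary mode) and
    # return the 64-character lowercase hexadecimal digest.
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        h.update(chunk)
    return h.hexdigest()


def abbrev_checksum(checksum):
    # Assumption: the abbreviated checksum is the first 12
    # characters of the full checksum.
    return checksum[:12]


if __name__ == '__main__':
    s = sha256sum(io.BytesIO(b'hello'))
    print(s, abbrev_checksum(s))
```

A client would normally pass a file opened with the \c{rb} mode instead of the
in-memory stream used here.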
If the submission request contains the \c{simulate} parameter, then the
submission service simulates the specified outcome of the submission process
without actually performing any externally visible actions (e.g., publishing
the package, notifying the submitter, etc). Note that the package submission
email (\c{submit-email}) is not sent for simulated submissions.
Pre-defined simulation outcome values are \c{internal-error-text},
\c{internal-error-html}, \c{duplicate-archive}, and \c{success}. The
simulation outcome is included into the submission request manifest and the
handler program must at least handle \c{success} but may recognize additional
outcomes.
\h#submit-request-manifest|Submission Request Manifest|
The submission request manifest starts with the values below, in that order,
optionally followed by additional values, in an unspecified order,
corresponding to the custom request parameters.
\
archive: <name>
sha256sum: <sum>
timestamp: <date-time>
[simulate]: <outcome>
[client-ip]: <string>
[user-agent]: <string>
\
The \c{timestamp} value is in the ISO-8601 \c{<YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z}
form (always UTC). Note also that \c{client-ip} can be IPv4 or IPv6.
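The value order and the timestamp form described above can be sketched in
Python as follows. This is a simplified illustration only: the function names
are hypothetical and the details of the manifest serialization (format
version, multi-line values, etc) are deliberately omitted.

```python
from datetime import datetime, timezone


def format_timestamp(dt):
    # Convert to UTC and render in the
    # <YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z form.
    utc = dt.astimezone(timezone.utc)
    return utc.strftime('%Y-%m-%dT%H:%M:%SZ')


def request_manifest_lines(archive, sha256sum, dt, extra=()):
    # The fixed values come first, in the documented order. The
    # optional and custom values are passed in extra as
    # (name, value) pairs.
    values = [('archive', archive),
              ('sha256sum', sha256sum),
              ('timestamp', format_timestamp(dt))]
    values.extend(extra)
    return [n + ': ' + v for n, v in values]


if __name__ == '__main__':
    now = datetime.now(timezone.utc)
    extra = [('client-ip', '192.0.2.1')]
    name = 'libhello-1.2.3.tar.gz'
    for l in request_manifest_lines(name, 'ab' * 32, now, extra):
        print(l)
```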
\h#submit-result-manifest|Submission Result Manifest|
The submission result manifest starts with the values below, in that order,
optionally followed by additional values if returned by the handler program.
If the submission is successful, then the \c{reference} value must be present
and contain a string that can be used to identify this submission (for
example, the abbreviated checksum).
\
status: <http-code>
message: <string>
[reference]: <string>
\
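A client could check a received result manifest against the above rules along
the following lines. This Python sketch uses a deliberately simplified parser
(one name/value pair per line) and assumes that a successful submission is one
with the status in the [200-299] range. Both function names are hypothetical.

```python
def parse_manifest(text):
    # Simplified parser: one 'name: value' pair per line. The
    # format version and multi-line values are not handled.
    values = []
    for line in text.splitlines():
        name, sep, value = line.partition(':')
        if sep and name.strip():
            values.append((name.strip(), value.strip()))
    return values


def check_result(values):
    # The status and message values must come first, in that
    # order.
    names = [n for n, _ in values]
    if names[:2] != ['status', 'message']:
        return False
    status = int(values[0][1])
    # Assumption: success means a status in the [200-299] range,
    # in which case the reference value must be present.
    if 200 <= status <= 299:
        return 'reference' in names
    return True
```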
\h1#ci|Package CI|
The CI functionality allows submission of package CI requests as well as
additional, repository-specific information via the HTTP \c{GET} and \c{POST}
methods using the \c{application/x-www-form-urlencoded} or
\c{multipart/form-data} parameters encoding. The implementation in \c{brep}
only handles reception as well as basic parameter verification expecting the
rest of the CI logic to be handled by a separate entity according to the
repository policy. Such an entity can be notified by \c{brep} about a new CI
request as an invocation of the \i{handler program} (as part of the HTTP
request) and/or via email. It could also be a separate process that monitors
the CI data directory.
The CI request without any parameters is treated as the CI form request. If
\c{ci-form} is configured, then such a form is generated and returned.
Otherwise, such a request is treated as an invalid CI request (missing
parameters).
For each CI request \c{brep} performs the following steps.
\ol|
\li|Verify the required \c{repository} and optional \c{package} parameters.
The \c{repository} parameter is the remote \c{bpkg} repository location that
contains the packages to be tested. If one or more \c{package} parameters are
present, then only the specified packages are tested. If no \c{package}
parameters are specified, then all the packages present in the repository (but
excluding complement repositories) are tested.
Each \c{package} parameter can specify either just the package name, in which
case all the versions of this package present in the repository will be
tested, or both the name and version in the \c{<name>/<version>} form (for
example, \c{libhello/1.2.3}).|
\li|Verify the optional \c{overrides} parameter.
The overrides parameter, if specified, must be the CI overrides manifest
upload.|
\li|Verify other parameters are valid manifest name/value pairs.
The value can only contain UTF-8 encoded Unicode graphic characters as well as
tab (\c{\\t}), carriage return (\c{\\r}), and line feed (\c{\\n}).|
\li|Generate CI request id and create request directory.
For each CI request a unique id (UUID) is generated and a request subdirectory
is created in the \c{ci-data} directory with this id as its name.|
\li|Save the CI request manifest into the request directory.
The CI request manifest is saved as \c{request.manifest} into the request
subdirectory created on the previous step.|
\li|Save the CI overrides manifest into the request directory.
If the CI overrides manifest is uploaded, then it is saved as
\c{overrides.manifest} into the request subdirectory.|
\li|Invoke the CI handler program.
If \c{ci-handler} is configured, invoke the handler program passing to it
additional arguments specified with \c{ci-handler-argument} (if any) followed
by the absolute path to the CI request directory.
The handler program is expected to write the CI result manifest to \c{stdout}
and terminate with the zero exit status. A non-zero exit status is treated as
an internal error. The handler program's \c{stderr} is logged.
Note that the handler program should report temporary server errors (service
overload, network connectivity loss, etc.) via the CI result manifest status
values in the [500-599] range (HTTP server error) rather than via a non-zero
exit status.
The handler program assumes ownership of the CI request directory and can
move/remove it. If after the handler program terminates the request directory
still exists, then it is handled by \c{brep} depending on the handler process
exit status and the CI result manifest status value. If the process has
terminated abnormally or with a non-zero exit status or the result manifest
status is in the [500-599] range (HTTP server error), then the directory is
saved for troubleshooting by appending the \c{.fail} extension to its
name. Otherwise, if the status is in the [400-499] range (HTTP client error),
then the directory is removed. If the directory is left in place by the
handler or is saved for troubleshooting, then the CI result manifest
is saved as \c{result.manifest} into this directory, next to the request
manifest.
If \c{ci-handler-timeout} is configured and the handler program does not
exit in the allotted time, then it is killed and its termination is treated as
abnormal.
If the handler program is not specified, then the following CI result
manifest is implied:
\
status: 200
message: CI request is queued
reference: <request-id>
\
|
\li|Send the CI request email.
If \c{ci-email} is configured, send an email to this address containing the CI
request manifest, the potentially empty CI overrides manifest, and the CI
result manifest.|
\li|Respond to the client.
Respond to the client with the CI result manifest and its \c{status} value as
the HTTP status code.|
|
Check violations that are explicitly mentioned above are always reported with
the CI result manifest. Other errors (for example, internal server errors)
might be reported with unformatted text, including HTML.
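The handling of a CI request directory that still exists after the handler
program has terminated can be summarized with the following Python sketch. The
function names and the returned action labels are made up for this
illustration and do not correspond to anything in the \c{brep}
implementation.

```python
def disposition(normal_exit, exit_status, result_status):
    # Applies only if the request directory still exists after
    # the handler program has terminated. The result_status
    # argument is only consulted for a normal, zero exit.
    if not normal_exit or exit_status != 0:
        return 'save-as-fail'
    if 500 <= result_status <= 599:
        return 'save-as-fail'
    if 400 <= result_status <= 499:
        return 'remove'
    return 'keep'


def saves_result_manifest(action):
    # The result manifest is saved into the directory if it is
    # left in place or saved for troubleshooting.
    return action in ('save-as-fail', 'keep')


if __name__ == '__main__':
    for args in [(True, 0, 200), (True, 0, 422), (True, 0, 503)]:
        print(args, disposition(*args))
```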
If the CI request contains the \c{interactive} parameter, then the CI service
provides the execution environment login information for each test and stops
them at the specified breakpoint.
Pre-defined breakpoint ids are \c{error} and \c{warning}. The breakpoint id is
included into the CI request manifest and the CI service must at least handle
\c{error} but may recognize additional ids (build phase/command identifiers,
etc).
If the CI request contains the \c{simulate} parameter, then the CI service
simulates the specified outcome of the CI process without actually performing
any externally visible actions (e.g., testing the package, publishing the
result, etc). Note that the CI request email (\c{ci-email}) is not sent for
simulated requests.
Pre-defined simulation outcome values are \c{internal-error-text},
\c{internal-error-html}, and \c{success}. The simulation outcome is included
into the CI request manifest and the handler program must at least handle
\c{success} but may recognize additional outcomes.
\h#ci-request-manifest|CI Request Manifest|
The CI request manifest starts with the values below, in that order,
optionally followed by additional values, in an unspecified order,
corresponding to the custom request parameters.
\
id: <request-id>
repository: <url>
[package]: <name>[/<version>]
[interactive]: <breakpoint>
[simulate]: <outcome>
timestamp: <date-time>
[client-ip]: <string>
[user-agent]: <string>
[service-id]: <string>
[service-type]: <string>
[service-data]: <string>
[service-action]: <action>
\
The \c{package} value can be repeated multiple times. The \c{timestamp} value
is in the ISO-8601 \c{<YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z} form (always
UTC). Note also that \c{client-ip} can be IPv4 or IPv6.
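For example, a handler program could split each \c{package} value into the
name and the optional version as in the following Python sketch. The
\c{parse_package()} function is hypothetical and only checks the overall
\c{<name>[/<version>]} shape, not the validity of the name or version.

```python
def parse_package(value):
    # Split <name>[/<version>] into (name, version), where
    # version is None if only the name is specified.
    name, sep, version = value.partition('/')
    if not name or (sep and not version):
        raise ValueError('invalid package value: ' + value)
    return name, (version if sep else None)


if __name__ == '__main__':
    for v in ['libhello', 'libhello/1.2.3']:
        print(parse_package(v))
```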
Note that some CI service implementations may serve as backends for
third-party services. The latter may initiate CI tasks, providing all the
required information via some custom protocol, and expect the CI service to
notify it about the progress. In this case the third-party service type as
well as optionally the third-party id and custom state data can be
communicated to the underlying CI handler program via the respective
\c{service-*} manifest values. Also note that normally a third-party service
has all the required information (repository URL, etc) available at the time
of the CI task initiation, in which case the \c{start} value is specified for
the \c{service-action} manifest value. If that's not the case, the CI task is
only created at the time of the initiation without calling the CI handler
program. The CI handler is then called later, once all the required
information has been asynchronously gathered by the service, and the
\c{load} value is specified for the \c{service-action} manifest value.
\h#ci-overrides-manifest|CI Overrides Manifest|
The CI overrides manifest is a package manifest fragment that should be
applied to all the packages being tested. The contained values override the
whole value groups they belong to, resetting all the group values prior to
being applied. Currently, only the following value groups can be overridden:
\
build-email build-{warning,error}-email
builds build-{include,exclude}
*-builds *-build-{include,exclude}
*-build-config
\
For the package configuration-specific build constraint overrides the
corresponding configuration must exist in the package manifest. In contrast,
the package configuration override (\cb{*-build-config}) adds a new
configuration if it doesn't exist and updates the arguments of the existing
configuration otherwise. In the former case, all the potential build
constraint overrides for such a newly added configuration must follow the
corresponding \cb{*-build-config} override.
Note that the build constraints group values (both common and build package
configuration-specific) are overridden hierarchically so that the
\c{[\b{*-}]\b{build-}{\b{include},\b{exclude}\}} overrides don't affect the
respective \c{[\b{*-}]\b{builds}} values.
Note also that the common and build package configuration-specific build
constraints group value overrides are mutually exclusive. If the common build
constraints are overridden, then all the configuration-specific constraints
are removed. Otherwise, if any configuration-specific constraints are
overridden, then for the remaining configurations the build constraints are
reset to \cb{builds:\ none}.
See \l{bpkg#manifest-package Package Manifest} for details on these values.
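As a rough illustration, a check of whether a manifest value name belongs to
one of the value groups listed in this section could look like the following
Python sketch. It is based solely on that list, treats \c{*} as a shell-style
wildcard (which is an assumption that also lets an empty prefix match), and
uses a hypothetical function name.

```python
from fnmatch import fnmatchcase

# Value name patterns transcribed from the list of overridable
# value groups.
OVERRIDABLE = [
    'build-email', 'build-warning-email', 'build-error-email',
    'builds', 'build-include', 'build-exclude',
    '*-builds', '*-build-include', '*-build-exclude',
    '*-build-config',
]


def is_overridable(name):
    # True if the value name matches one of the patterns above.
    return any(fnmatchcase(name, p) for p in OVERRIDABLE)


if __name__ == '__main__':
    for n in ['builds', 'network-build-config', 'version']:
        print(n, is_overridable(n))
```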
\h#ci-result-manifest|CI Result Manifest|
The CI result manifest starts with the values below, in that order,
optionally followed by additional values if returned by the handler program.
If the CI request is successful, then the \c{reference} value must be present
and contain a string that can be used to identify this request (for example,
the CI request id).
\
status: <http-code>
message: <string>
[reference]: <string>
\
\h1#upload|Build Artifacts Upload|
The build artifacts upload functionality allows uploading archives of files
generated as a byproduct of the package builds. Such archives as well as
additional, repository-specific information can optionally be uploaded by the
automated build bots via the HTTP \c{POST} method using the
\c{multipart/form-data} content type (see the \l{bbot \c{bbot} documentation}
for details). The implementation in \c{brep} only handles uploading as well as
basic actions and verification (build session resolution, agent
authentication, checksum verification) expecting the rest of the upload logic
to be handled by a separate entity according to the repository policy. Such an
entity can be notified by \c{brep} about a new upload as an invocation of the
\i{handler program} (as part of the HTTP request) and/or via email. It could
also be a separate process that monitors the upload data directory.
For each upload request \c{brep} performs the following steps.
\ol|
\li|Determine upload type.
The upload type must be passed via the \c{upload} parameter in the query
component of the request URL.|
\li|Verify upload size limit.
The upload form-data payload size must not exceed \c{upload-max-size} specific
for this upload type.|
\li|Verify the required \c{session}, \c{instance}, \c{archive}, and
\c{sha256sum} parameters are present. If \c{brep} is configured to perform
agent authentication, then verify that the \c{challenge} parameter is also
present. See the \l{bbot#arch-result-req Result Request Manifest} for
semantics of the \c{session} and \c{challenge} parameters.
The \c{archive} parameter must be the build artifacts archive upload while
\c{sha256sum} must be its 64-character SHA256 checksum calculated in the
binary mode.|
\li|Verify other parameters are valid manifest name/value pairs.
The value can only contain UTF-8 encoded Unicode graphic characters as well as
tab (\c{\\t}), carriage return (\c{\\r}), and line feed (\c{\\n}).|
\li|Resolve the session.
Resolve the \c{session} parameter value to the actual package build
information.|
\li|Authenticate the build bot agent.
Use the \c{challenge} parameter value and the resolved package build
information to authenticate the agent, if configured to do so.|
\li|Generate upload request id and create request directory.
For each upload request a unique id (UUID) is generated and a request
subdirectory is created in the \c{upload-data} directory with this id as its
name.|
\li|Save the upload archive into the request directory and verify its
checksum.
The archive is saved using the submitted name, and its checksum is calculated
and compared to the submitted checksum.|
\li|Save the upload request manifest into the request directory.
The upload request manifest is saved as \c{request.manifest} into the request
subdirectory next to the archive.|
\li|Invoke the upload handler program.
If \c{upload-handler} is configured, invoke the handler program passing to it
additional arguments specified with \c{upload-handler-argument} (if any)
followed by the absolute path to the upload request directory.
The handler program is expected to write the upload result manifest to
\c{stdout} and terminate with the zero exit status. A non-zero exit status is
treated as an internal error. The handler program's \c{stderr} is logged.
Note that the handler program should report temporary server errors (service
overload, network connectivity loss, etc.) via the upload result manifest
status values in the [500-599] range (HTTP server error) rather than via a
non-zero exit status.
The handler program assumes ownership of the upload request directory and can
move/remove it. If after the handler program terminates the request directory
still exists, then it is handled by \c{brep} depending on the handler process
exit status and the upload result manifest status value. If the process has
terminated abnormally or with a non-zero exit status or the result manifest
status is in the [500-599] range (HTTP server error), then the directory is
saved for troubleshooting by appending the \c{.fail} extension to its name.
Otherwise, if the status is in the [400-499] range (HTTP client error), then
the directory is removed. If the directory is left in place by the handler or
is saved for troubleshooting, then the upload result manifest is saved as
\c{result.manifest} into this directory, next to the request manifest.
If \c{upload-handler-timeout} is configured and the handler program does not
exit in the allotted time, then it is killed and its termination is treated as
abnormal.
If the handler program is not specified, then the following upload result
manifest is implied:
\
status: 200
message: <upload-type> upload is queued
reference: <request-id>
\
|
\li|Send the upload email.
If \c{upload-email} is configured, send an email to this address containing
the upload request manifest and the upload result manifest.|
\li|Respond to the client.
Respond to the client with the upload result manifest and its \c{status} value
as the HTTP status code.|
|
Check violations (max size, etc) that are explicitly mentioned above are
always reported with the upload result manifest. Other errors (for example,
internal server errors) might be reported with unformatted text, including
HTML.
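The required parameters check described above can be sketched in Python as
follows. The function name and the way the request parameters and the
authentication setting are represented (a dictionary and a boolean flag) are
assumptions made for this illustration.

```python
REQUIRED = ('session', 'instance', 'archive', 'sha256sum')


def missing_parameters(params, authenticate):
    # The params argument maps request parameter names to their
    # values. The authenticate argument tells whether the agent
    # authentication is configured.
    required = list(REQUIRED)
    if authenticate:
        required.append('challenge')
    return [n for n in required if n not in params]


if __name__ == '__main__':
    params = {'session': 's', 'instance': 'i', 'archive': 'a'}
    print(missing_parameters(params, True))
```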
\h#upload-request-manifest|Upload Request Manifest|
The upload request manifest starts with the values below, in that order,
optionally followed by additional values, in an unspecified order,
corresponding to the custom request parameters.
\
id: <request-id>
session: <session-id>
instance: <name>
archive: <name>
sha256sum: <sum>
timestamp: <date-time>
name: <name>
version: <version>
project: <name>
target-config: <name>
package-config: <name>
target: <target-triplet>
[tenant]: <tenant-id>
toolchain-name: <name>
toolchain-version: <standard-version>
repository-name: <canonical-name>
machine-name: <name>
machine-summary: <text>
\
The \c{timestamp} value is in the ISO-8601
\c{<YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z} form (always UTC).
\h#upload-result-manifest|Upload Result Manifest|
The upload result manifest starts with the values below, in that order,
optionally followed by additional values if returned by the handler program.
If the upload request is successful, then the \c{reference} value must be
present and contain a string that can be used to identify this request (for
example, the upload request id).
\
status: <http-code>
message: <string>
[reference]: <string>
\
\h1#package-review|Package Review Submission|
\h#package-review-manifest|Package Review Manifest|
The package review manifest files are per version/revision and are normally
stored on the filesystem along with other package metadata (like ownership
information). Under the metadata root directory, a review manifest file has
the following path:
\
<project>/<package>/<version>/reviews.manifest
\
For example:
\
hello/libhello/1.2.3+2/reviews.manifest
\
Note that review manifests are normally not removed when the corresponding
package archive is removed (for example, as a result of a replacement with a
revision) because reviews for subsequent versions may refer to review results
of previous versions (see below).
The package review file is a manifest list with each manifest containing
the below values in an unspecified order:
\
reviewed-by: <string>
result-<name>: pass|fail|unchanged
[base-version]: <version>
[details-url]: <url>
\
For example:
\
reviewed-by: John Doe <john@example.org>
result-build: fail
details-url: https://github.com/build2-packaging/hello/issues/1
\
The \c{reviewed-by} value identifies the reviewer. For example, a deployment
policy may require a real name and email address when submitting a review.
The \c{result-<name>} values specify the review results for various aspects of
the package. At least one result value must be present and duplicates for the
same aspect name are not allowed. For example, a deployment may define the
following aspect names: \c{build} (build system), \c{code} (implementation
source code), \c{test} (tests), \c{doc} (documentation).
The \c{result-<name>} value must be one of \c{pass} (the review passed),
\c{fail} (the review failed), and \c{unchanged} (the aspect in question hasn't
changed compared to the previous version, which is identified with the
\c{base-version} value; see below).
The \c{base-version} value identifies the previous version on which this
review is based. The idea here is that when reviewing a new revision, a patch
version, or even a minor version, it is often easier to review the difference
between the two versions than to review everything from scratch. In such
cases, if some aspects haven't changed since the previous version, then their
results can be specified as \c{unchanged}. The \c{base-version} value must be
present if at least one \c{result-<name>} value is \c{unchanged}.
The \c{details-url} value specifies a URL that contains the details of the
review (issues identified, etc). It can only be absent if none of the
\c{result-<name>} values are \c{fail} (a failed review needs an explanation
of why it failed).
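The constraints on the review manifest values described above can be sketched
as a Python check. It operates on a single manifest represented as a list of
name/value pairs. The \c{review_errors()} function and its error messages are
hypothetical, and parsing of the manifest list itself is left out.

```python
RESULTS = ('pass', 'fail', 'unchanged')


def review_errors(values):
    # The values argument is a list of (name, value) pairs of
    # one review manifest.
    errors = []
    names = [n for n, _ in values]
    results = [(n, v) for n, v in values if n.startswith('result-')]
    if 'reviewed-by' not in names:
        errors.append('missing reviewed-by')
    if not results:
        errors.append('no result-<name> value')
    aspects = [n for n, _ in results]
    if len(aspects) != len(set(aspects)):
        errors.append('duplicate result-<name> value')
    for n, v in results:
        if v not in RESULTS:
            errors.append('invalid ' + n + ' value')
    outcomes = [v for _, v in results]
    if 'unchanged' in outcomes and 'base-version' not in names:
        errors.append('missing base-version')
    if 'fail' in outcomes and 'details-url' not in names:
        errors.append('missing details-url')
    return errors
```

An empty list returned by this function means that none of the checked
constraints are violated.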
\h1#github-ci|GitHub CI Integration|
This chapter describes the integration of the \l{#ci Package CI} functionality
with GitHub.
\h#github-ci-background|GitHub CI Background|
The GitHub CI model has a number of limitations that are important to
understand in order to use the provided integration correctly. To understand
the limitations, however, we first need to understand how the integration
works, at least at the high level.
GitHub supports integration of third-party CI services into the repository
workflow by allowing such third-party services to register for events (called
\i{web hooks} in the GitHub terminology).
\N|This mechanism should not be confused with GitHub Actions, which is a GitHub
built-in CI service. As far as we understand, it uses ad hoc integration
rather than the same integration mechanism as available to third-party CI
services.|
While there are many repository workflow events, for CI the only relevant ones
are:
\ol|
\li|\i{Branch push} (BP), which is triggered when a new commit is pushed
to a branch in your repository.|
\li|\i{Pull request} (PR), which is triggered when a new pull request is
created on your repository. It is also triggered when new commits are added
to the existing PR.||
\N|Another relevant event is \i{Merge queue}. However, merge queues are
not yet supported by this integration.|
In response to these events the third-party CI service is expected to start a
number of CI jobs (called \i{checks} in the GitHub terminology) and then
report their progress and results back to GitHub to be shown to the user,
and, in case of PRs, to prevent them from being merged in case the result
is unsuccessful.
Let's examine in more detail what exactly happens in case of a branch push and
a pull request.
The branch push (BP) case is pretty straightforward: when you push a new
commit to a branch in your repository, this commit is CI'ed by the third-party
service and the result is associated with this commit. If you push another
commit, the process repeats and you get a new set of CI results associated
with the new commit. The important point here is that the CI results for each
commit are associated with that commit id (called \i{head sha} in the GitHub
terminology).
The pull request (PR) case is more complicated: the aim of a PR is to merge
one or more commits from one branch (called \i{head branch} in the GitHub
terminology) to another branch (called \i{base branch} in the GitHub
terminology). If the base branch can be fast-forwarded to the head commit of
the head branch, then we can CI this head commit and the result will be
representative of the merge. However, if base cannot be fast-forwarded, then a
general merge of the two branches must be performed, with potential conflict
resolution, etc. And in this case the CI result for the head commit may not
necessarily represent the result of the merge.
To support the general case (when the base branch cannot be fast-forwarded)
GitHub creates a tentative merge commit (called \i{test merge commit} in the
GitHub terminology) and expects the CI service to test that commit rather than
the head commit (this is what most of the major CI integrations do). See
\l{https://www.kenmuse.com/blog/the-many-shas-of-a-github-pull-request/ The
Many SHAs of a GitHub Pull Request} for additional details.
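A minimal sketch of selecting the commit to build for a PR is shown below. It
assumes the PR is available as a dictionary shaped like the GitHub pull
request object, where, as we understand it, \c{mergeable} is unset while the
test merge is still being computed and \c{merge_commit_sha} holds the test
merge commit before the PR is merged. The function name is hypothetical.

```python
def commit_to_build(pr):
    '''Return the SHA of the commit to build for a PR, or None.

    pr is assumed to be a dict shaped like the GitHub pull request object.
    '''
    mergeable = pr.get('mergeable')

    if mergeable is None:
        # GitHub has not finished computing the test merge yet.
        return None

    if not mergeable:
        # The merge has conflicts, so there is no test merge commit.
        return None

    # Build the test merge commit rather than the head commit.
    return pr['merge_commit_sha']
```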
While the PR case is more complicated, so far everything makes sense. But that
ends once we look at which commit GitHub associates the CI result with in the
case of a PR. Since the CI service is expected to test the merge commit, it
would make sense to associate the result of this test with the merge commit.
Instead, GitHub expects the CI service to report it as associated with the
head commit!
This strange decision by GitHub, which we will refer to as \"head sharing\",
has two serious consequences for trusting CI results when making decisions
about merging PRs.
Firstly, if the branch push and/or several pull requests share the same head
commit, then they will share the CI result, regardless of the state of the
PRs' base branches. Or, to put it another way, in the GitHub model there is a
single CI result per head commit that is shared by all the BPs and PRs with
this head commit.
Secondly, if the base branch of a PR moves, the CI result associated with the
PR does not get invalidated (because the PR head hasn't changed).
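Both consequences can be seen in a toy model of this behavior, sketched below
under the assumption that results are stored in a plain dictionary keyed by
the head commit id. All the names and commit ids are hypothetical and the
model is deliberately much simpler than what GitHub actually does.

```python
# Toy model of head sharing: a single CI result per head commit.
results = {}  # Maps head sha to the reported conclusion.


def report(head_sha, conclusion):
    results[head_sha] = conclusion


def result_shown_for(head_sha, base_sha=None):
    # The base_sha argument is accepted but deliberately unused: in this
    # model the base branch plays no part in looking up the result.
    return results.get(head_sha)


# A branch push with head h1 is built and reported as successful.
report('h1', 'success')

# Two PRs with the same head but different bases share that result.
pr1 = result_shown_for('h1', base_sha='b1')
pr2 = result_shown_for('h1', base_sha='b2')

# If the base of the first PR moves, the result stays as it was.
pr1_after_base_moved = result_shown_for('h1', base_sha='b3')
```

Here all three lookups yield the result of the original branch push even
though none of the corresponding merge commits were ever built.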
Let's consider two representative examples, one for each case, that show how
the GitHub behavior can lead us to make wrong decisions. But before we do that,
a last bit of terminology: we will distinguish between \i{local PRs}, those with
the head branch from the same repository, and \i{remote PRs}, those with the
head branch belonging to another user/organization (called \i{forked PR} in
the GitHub terminology).
The first representative example is a feature branch: we develop a feature in
a branch of our repository and once it is ready, we create a local PR to merge
it to the \c{master}/\c{main} branch. We typically go through the PR instead
of merging our branch directly in order to have the changes reviewed by
someone else. In this scenario, the head commit of our feature branch and of
the PR we created will be the same, which means our PR will share the CI
result with the feature branch push, which is presumably successful. This can
lead us to merging the PR based on this result even though the merge commit of
the PR may not have the same contents as the head commit of the result. For
example, we may have forgotten to rebase our feature branch on the base branch
(\c{master}/\c{main} in our example) before creating the PR and the base
branch has moved while we developed the feature. Or the review may have taken
some time and the base branch has likewise moved in the meantime. In both of
these cases, while the changes to the base branch may not render our head
commit unmergeable (for example, due to conflicts), they may render our
changes uncompilable or otherwise buggy once merged.
The second representative example is a single remote PR: someone creates a PR
with a feature or bugfix from their fork of our repository. There is no
corresponding branch push for this PR's head commit in our repository, so it
sounds like there is only one place (the PR) where the CI result associated
with this head commit will be reported. So head sharing should not be an
issue, right? While it's true that \i{spatial} sharing, that is, sharing
between a BP and/or several PRs, is not an issue in this case, \i{temporal}
sharing still is. Specifically, if the base branch moves before we examine the
PR, we may again end up merging it based on CI results that are not
representative of the merge commit.
Hopefully you see the underlying theme by now: the only way to ensure
correctness in the GitHub CI model is to make sure the PR's head and merge
commits are the same, which is only the case when the PR base branch can be
fast-forwarded to head.
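In terms of commit ancestry, the base branch can be fast-forwarded to head
exactly when the tip of the base branch is an ancestor of (or the same as)
the head commit. The following sketch checks this condition on a toy commit
graph represented as a dictionary that maps each commit to its parents. The
function names, the graph representation, and the commit ids are assumptions
made for this example.

```python
def is_ancestor(parents, ancestor, commit):
    '''Return True if ancestor is reachable from commit via parent links.'''
    stack, seen = [commit], set()
    while stack:
        c = stack.pop()
        if c == ancestor:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return False


def can_fast_forward(parents, base_tip, head_tip):
    '''Return True if base can be fast-forwarded to head.'''
    return is_ancestor(parents, base_tip, head_tip)


# History: a <- b <- c on the base branch, feature commit f made on top
# of a, and g is the same feature rebased on top of c.
parents = {'b': ['a'], 'c': ['b'], 'f': ['a'], 'g': ['c']}

behind = can_fast_forward(parents, 'c', 'f')   # False: f is behind base.
rebased = can_fast_forward(parents, 'c', 'g')  # True: g is on top of base.
```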
Thankfully, GitHub provides a branch protection rule that prevents merging of
a PR with the head branch behind base (we will refer to it as the
\i{head-behind-base} protection). Enabling this protection rule is a
prerequisite for this CI integration to work correctly.
Note, however, that even with the head-behind-base protection enabled, some of
the GitHub behavior can be counter-intuitive.
For one, GitHub does not prevent the CI build from starting if this protection
rule is violated. While this integration checks the result of this protection
rule and does not start the build if the head is behind, the CI result may
already be available (if this head is shared with a branch push and/or another
PR), in which case GitHub will show it. So you may end up with a violated
head-behind-base protection but with a successful CI result.
Another surprising consequence of the head sharing is the instantaneous
availability of the CI result, which may look suspicious. For example, if you
create a PR from a local feature branch, you may immediately see the
successful CI result because it is the same as for the branch push
to the feature branch.
Finally, note that the GitHub CI model is quite wasteful of CI resources in
general and the head sharing makes this problem even worse. Specifically,
GitHub CI builds every commit indiscriminately, regardless of what was
changed. So a minor tweak to \c{README.md} will trigger a full rebuild even
though nothing that needs building has changed. Head sharing makes the
situation worse: the CI integration cannot easily cancel an in-progress build
when a new commit is added to a PR because the result could be shared with a
branch push or another PR. Nevertheless, this integration will attempt to
cancel a stale build of a remote PR provided it's not (currently) shared.
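The cancellation condition could be sketched as follows. This is only an
illustration of the rule as stated above, not the actual logic of this
integration: the function name and its arguments are hypothetical, and it
assumes the head commits of all the branches and of all the other open PRs
are known at the time of the decision.

```python
def may_cancel_stale_build(stale_head, is_remote_pr,
                           branch_heads, other_pr_heads):
    '''Return True if the build of a superseded PR head may be canceled.'''
    if not is_remote_pr:
        # A local PR shares its head with a branch in the same repository.
        return False

    # Only cancel if no branch push and no other PR uses this head commit.
    shared = stale_head in branch_heads or stale_head in other_pr_heads
    return not shared
```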
"
|