This document describes how libxerces-c was packaged for build2. In
particular, this understanding will be useful when upgrading to a new upstream
version. See ../README-DEV for general notes on Xerces-C++ packaging.

Symlink the required upstream files and provide our own implementations for
auto-generated headers:

$ ln -s ../upstream/LICENSE
$ ln -s ../../upstream/src/xercesc/{dom,framework,parsers,sax,sax2,\
validators,xinclude} xercesc/

$ ln -s ../../upstream/src/{stricmp,strnicmp}.{h,c} xercesc/

$ mkdir xercesc/internal/ xercesc/util/

$ pushd xercesc/internal/
$ ln -s ../../../upstream/src/xercesc/internal/*.{cpp,hpp} ./

$ cd ../util/
$ ln -s ../../../upstream/src/xercesc/util/*.{cpp,hpp,c} ./
$ ln -s ../../../upstream/src/xercesc/util/{regx,FileManagers} ./

Note that the main reasons for such granular linking (we could have just
linked upstream's internal/, util/, etc.) are source code patching and reducing
the number of preprocessor macros we need to deduce in xercesc/config.h (see
the change tracking instructions below for details). As a bonus, it also
simplifies the buildfile.

$ mkdir -p Transcoders NetAccessors MsgLoaders MutexManagers
$ ln -s ../../../../upstream/src/xercesc/util/Transcoders/ICU Transcoders/
$ ln -s ../../../../upstream/src/xercesc/util/NetAccessors/Curl NetAccessors/
$ ln -s ../../../../upstream/src/xercesc/util/MsgLoaders/InMemory MsgLoaders/
$ ln -s ../../../../upstream/src/xercesc/util/MutexManagers/StdMutexMgr.{hpp,cpp} MutexManagers/

$ ln -s ../../../upstream/src/xercesc/util/XercesVersion.hpp.cmake.in \
  XercesVersion.hpp.in
$ popd
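
Since ln -s happily creates a link even if the relative target path is
mistyped, it may be worth checking that none of the links created above
dangle. A minimal sketch, assuming GNU find (for -xtype); the helper name
dangling_links is hypothetical and not part of the package:

```shell
# Hypothetical helper (not part of the package): print the symlinks under the
# given directory (current directory by default) whose targets do not exist.
# Assumes GNU find, which provides the -xtype test.
dangling_links () {
  find "${1:-.}" -xtype l | sort
}
```

Running `dangling_links xercesc` from the package root is expected to print
nothing if all the links resolve.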

Use some of the upstream's tests and examples for testing:

$ ln -s ../../../upstream/samples/src/DOMPrint tests/dom-print/
$ ln -s ../../../upstream/samples/src/SAXPrint tests/sax-print/
$ ln -s ../../../upstream/samples/src/SAX2Print tests/sax2-print/
$ ln -s ../../../upstream/samples/src/PSVIWriter tests/psvi-writer/

We also apply the following patches:

- Fix for the use-after-free error (CVE-2018-1311) triggered during the
  scanning of external DTDs (see
  https://security-tracker.debian.org/tracker/CVE-2018-1311 for details).

  At the time of this writing there is no upstream fix, only suggested
  mitigations (see https://issues.apache.org/jira/browse/XERCESC-2188 for
  details). Thus, we mitigate the issue at the expense of a memory leak, as
  is done by Debian (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=947431).

  $ cp --remove-destination ../upstream/src/xercesc/internal/IGXMLScanner.cpp \
    xercesc/internal/

  $ git apply xercesc/dtd-decl-use-after-free.patch

- The explicit template instantiation declarations and definitions patch (see
  xercesc/util/Xerces_autoconf_config.hpp for details):

  $ cp --remove-destination ../upstream/src/xercesc/util/{Janitor.hpp,JanitorExports.cpp} \
    xercesc/util/

  $ git apply xercesc/export-template-instantiations.patch

- The inline functions definition/usage order change to prevent MinGW GCC
  from complaining when compiling code that uses libxerces-c:

  $ cp --remove-destination ../upstream/src/xercesc/util/KVStringPair.hpp \
    xercesc/util/

  $ git apply xercesc/inline-funcs-def-usage-order.patch

- Patch for the net accessor test, which for some reason exits with zero
  status, printing the diagnostics to stdout, for some errors:

  $ cp ../upstream/tests/src/NetAccessorTest/NetAccessorTest.cpp \
    tests/net-accessor/

  $ git apply tests/net-accessor/handle-exception-as-error.patch

Note that the above patches are produced by the following command:

$ git diff ><patch-path>

Create xercesc/{config.h,util/Xerces_autoconf_config.hpp} using as a base the
upstream's config.h.cmake.in, config.h.in, and
Xerces_autoconf_config.hpp.cmake.in.

Re-creating xercesc/config.h from scratch every time we upgrade to a new
upstream version would be a real pain. Instead, we can (un)define only the
newly introduced macros by comparing the already defined and currently used
macro sets. Note that we can also use this approach to deduce the initial set
of macros by running the commands below for the upstream's auto-generated
config.h:

$ ln -s ../../upstream/config.h.cmake.in xercesc/config.h.cmake.in.orig
$ ln -s ../../../upstream/src/xercesc/util/Xerces_autoconf_config.hpp.cmake.in \
  xercesc/util/Xerces_autoconf_config.hpp.cmake.in.orig

$ for m in `cat xercesc/config.h.cmake.in.orig | \
            sed -n 's/.*#\s*\(define\|cmakedefine\)\s\{1,\}\([_a-zA-Z0-9]\{1,\}\)\(\s.*\)\{0,1\}$/\2/p' | sort -u`; do
    if grep -q -e "\b$m\b" `find -L . -name '*.h' -a ! -name config.h -o -name '*.c' -o -name '*.cpp' -o -name '*.hpp' -a ! -name XercesVersion.hpp  -a ! -name Xerces_autoconf_config.hpp`; then
      echo "$m"
    fi
  done >used-macros

$ cat xercesc/config.h | \
  sed -n 's/^#\s*\(define\|undef\)\s\{1,\}\([_a-zA-Z0-9]\{1,\}\)\(\s.*\)\{0,1\}$/\2/p' | \
  sort -u >defined-macros

$ diff defined-macros used-macros | grep '<' >remove-macros
$ diff defined-macros used-macros | grep '>' >add-macros
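
Note that the lines in remove-macros and add-macros produced this way keep the
diff markers ('< ' and '> '). If bare macro names are preferred, comm can
produce them directly. A minimal sketch, assuming both lists are sorted in the
same locale (which is the case for the commands above); the helper name
split_macros is hypothetical:

```shell
# Hypothetical helper: split two sorted macro lists into the names to remove
# (defined but no longer used) and the names to add (used but not yet
# defined). Writes remove-macros and add-macros to the current directory.
split_macros () {
  comm -23 "$1" "$2" >remove-macros   # Lines only in the first list.
  comm -13 "$1" "$2" >add-macros      # Lines only in the second list.
}
```

It would be called as `split_macros defined-macros used-macros`.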

We won't drop macro (un)definitions in Xerces_autoconf_config.hpp (see the
header for details). Thus, just compare the new and old
Xerces_autoconf_config.hpp.cmake.in files during an upgrade to detect the
removed and added macro definitions. Luckily, the header is not that big.
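
The comparison of the old and new Xerces_autoconf_config.hpp.cmake.in can be
partially automated. The following is only a sketch, assuming GNU sed (for \s
and \|) and reusing the macro name extraction expression from above; the
helper names macro_names and macro_changes are hypothetical. Note that it
only reports added and removed macro names, not changes to their values:

```shell
# Hypothetical helpers (not part of the package). Assume GNU sed.

# Print the sorted, unique names of the macros that are introduced with
# #define or #cmakedefine in the specified file.
macro_names () {
  sed -n 's/.*#\s*\(define\|cmakedefine\)\s\{1,\}\([_a-zA-Z0-9]\{1,\}\)\(\s.*\)\{0,1\}$/\2/p' "$1" | sort -u
}

# Given the old and new files, print "- NAME" for each macro present only in
# the old file and "+ NAME" for each macro present only in the new one.
macro_changes () {
  old=$(mktemp)
  new=$(mktemp)
  macro_names "$1" >"$old"
  macro_names "$2" >"$new"
  comm -23 "$old" "$new" | sed 's/^/- /'
  comm -13 "$old" "$new" | sed 's/^/+ /'
  rm -f "$old" "$new"
}
```

The two arguments are the old and new copies of the file; how the old copy is
obtained (for example, from the previous upstream checkout) is left open.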