author | Karen Arutyunov <karen@codesynthesis.com> | 2020-06-18 16:40:00 +0300 |
---|---|---|
committer | Boris Kolpackov <boris@codesynthesis.com> | 2020-06-19 11:27:32 +0200 |
commit | 112a83c346a537f1a5eac6fc17ee2ce3143d625b | |
tree | 11ed26fb72a571299eba7e02a225eaf07e527c58 /libbuild2/lexer+utf8.test.testscript | |
parent | 78ac6aee6dff1b608bc312fe7ada442ba83710e8 |
Fix lexer to fail on invalid UTF-8 sequences
Diffstat (limited to 'libbuild2/lexer+utf8.test.testscript')
-rw-r--r-- | libbuild2/lexer+utf8.test.testscript | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/libbuild2/lexer+utf8.test.testscript b/libbuild2/lexer+utf8.test.testscript
new file mode 100644
index 0000000..42c62ea
--- /dev/null
+++ b/libbuild2/lexer+utf8.test.testscript
@@ -0,0 +1,28 @@
+# file : libbuild2/lexer+utf8.test.testscript
+# license : MIT; see accompanying LICENSE file
+
+: valid
+:
+$* <<EOI >>EOO
+ Sommerzeit
+ Mitteleuropäische
+ EOI
+ 'Sommerzeit'
+ <newline>
+ 'Mitteleuropäische'
+ <newline>
+ EOO
+
+: invalid
+:
+: Here we spoil the UTF-8 sequence 'ä' by dropping its second byte.
+:
+cat <<EOI | sed -e 's/(rop.).(isc)/\1\2/' | $* >>EOO 2>>EOE != 0
+ Sommerzeit
+ Mitteleuropäische
+ EOI
+ 'Sommerzeit'
+ <newline>
+ EOO
+ <stdin>:2:12: error: invalid UTF-8 sequence second byte (0x69 'i')
+ EOE
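
The byte-level effect of the `invalid` test can be reproduced outside build2. The following is a minimal Python sketch, not the lexer's actual code: it removes the second byte of 'ä' from the second input line, as the `sed` command in the test is meant to do, and checks that a strict UTF-8 decoder rejects the result. All variable names are illustrative, and the mapping from Python's zero-based byte offset to the `2:12` column in the expected diagnostic is an assumption.

```python
# Illustrative only: this is not build2's lexer, just the same byte arithmetic.
line = "Mitteleuropäische".encode("utf-8")

# 'ä' (U+00E4) is encoded in UTF-8 as the two bytes 0xC3 0xA4.
i = line.index(b"\xc3\xa4")

# Drop the second byte (0xA4), keeping the lead byte 0xC3.
spoiled = line[:i + 1] + line[i + 2:]

# The lead byte is now followed by 'i' (0x69). A valid continuation byte
# must have the bit pattern 0b10xxxxxx, which 0x69 does not.
follower = spoiled[i + 1]
is_continuation = (follower & 0xC0) == 0x80

try:
    spoiled.decode("utf-8")
    decoded_ok = True
    error_offset = None
except UnicodeDecodeError as e:
    decoded_ok = False
    error_offset = e.start  # zero-based offset of the offending lead byte

print(hex(follower), is_continuation, decoded_ok, error_offset)
```

The lead byte sits at zero-based offset 11, that is, the twelfth byte of the line, which appears to correspond to the column reported in the test's expected error output.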