author Karen Arutyunov <karen@codesynthesis.com> 2020-06-18 16:40:00 +0300
committer Boris Kolpackov <boris@codesynthesis.com> 2020-06-19 11:27:32 +0200
commit 112a83c346a537f1a5eac6fc17ee2ce3143d625b
tree 11ed26fb72a571299eba7e02a225eaf07e527c58 /libbuild2/lexer+utf8.test.testscript
parent 78ac6aee6dff1b608bc312fe7ada442ba83710e8
Fix lexer to fail on invalid UTF-8 sequences
Diffstat (limited to 'libbuild2/lexer+utf8.test.testscript')
-rw-r--r-- libbuild2/lexer+utf8.test.testscript 28
1 files changed, 28 insertions, 0 deletions
diff --git a/libbuild2/lexer+utf8.test.testscript b/libbuild2/lexer+utf8.test.testscript
new file mode 100644
index 0000000..42c62ea
--- /dev/null
+++ b/libbuild2/lexer+utf8.test.testscript
@@ -0,0 +1,28 @@
+# file : libbuild2/lexer+utf8.test.testscript
+# license : MIT; see accompanying LICENSE file
+
+: valid
+:
+$* <<EOI >>EOO
+ Sommerzeit
+ Mitteleuropäische
+ EOI
+ 'Sommerzeit'
+ <newline>
+ 'Mitteleuropäische'
+ <newline>
+ EOO
+
+: invalid
+:
+: Here we spoil the UTF-8 sequence 'ä' by dropping its second byte.
+:
+cat <<EOI | sed -e 's/(rop.).(isc)/\1\2/' | $* >>EOO 2>>EOE != 0
+ Sommerzeit
+ Mitteleuropäische
+ EOI
+ 'Sommerzeit'
+ <newline>
+ EOO
+ <stdin>:2:12: error: invalid UTF-8 sequence second byte (0x69 'i')
+ EOE