From 112a83c346a537f1a5eac6fc17ee2ce3143d625b Mon Sep 17 00:00:00 2001
From: Karen Arutyunov
Date: Thu, 18 Jun 2020 16:40:00 +0300
Subject: Fix lexer to fail on invalid UTF-8 sequences

---
 libbuild2/lexer+utf8.test.testscript | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 libbuild2/lexer+utf8.test.testscript

(limited to 'libbuild2/lexer+utf8.test.testscript')

diff --git a/libbuild2/lexer+utf8.test.testscript b/libbuild2/lexer+utf8.test.testscript
new file mode 100644
index 0000000..42c62ea
--- /dev/null
+++ b/libbuild2/lexer+utf8.test.testscript
@@ -0,0 +1,28 @@
+# file : libbuild2/lexer+utf8.test.testscript
+# license : MIT; see accompanying LICENSE file
+
+: valid
+:
+$* <<EOI >>EOO
+  Sommerzeit
+  Mitteleuropäische
+  EOI
+  'Sommerzeit'
+  <newline>
+  'Mitteleuropäische'
+  <newline>
+  EOO
+
+: invalid
+:
+: Here we spoil the UTF-8 sequence 'ä' by dropping its second byte.
+:
+cat <<EOI [...] >>EOO 2>>EOE != 0
+  Sommerzeit
+  Mitteleuropäische
+  EOI
+  'Sommerzeit'
+  <newline>
+  EOO
+  <stdin>:2:12: error: invalid UTF-8 sequence second byte (0x69 'i')
+  EOE
--
cgit v1.1
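
The expected diagnostic in the 'invalid' test follows from how 'ä' is encoded: in UTF-8 it is the two bytes 0xC3 0xA4, so dropping 0xA4 leaves the lead byte 0xC3 followed directly by 'i' (0x69), which is not a continuation byte. The sketch below illustrates that check in Python. It is not build2's lexer code: the function name `first_invalid_utf8` and its return shape are hypothetical, and the check covers only lead/continuation byte structure, not overlong encodings or surrogates.

```python
def first_invalid_utf8(data: bytes):
    """Return (sequence_start, byte_position_in_sequence, offending_byte)
    for the first malformed sequence, or None if none is found.

    Simplified structural check only (hypothetical helper, not build2 code).
    """
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            need = 0                  # ASCII: no continuation bytes
        elif 0xC2 <= b <= 0xDF:
            need = 1                  # lead byte of a 2-byte sequence
        elif 0xE0 <= b <= 0xEF:
            need = 2                  # lead byte of a 3-byte sequence
        elif 0xF0 <= b <= 0xF4:
            need = 3                  # lead byte of a 4-byte sequence
        else:
            return (i, 1, b)          # invalid first byte
        for k in range(1, need + 1):
            if i + k >= len(data):
                return (i, k + 1, None)           # sequence cut off at end of input
            if data[i + k] & 0xC0 != 0x80:        # continuation bytes are 10xxxxxx
                return (i, k + 1, data[i + k])
        i += need + 1
    return None


good = "Mitteleuropäische".encode("utf-8")
bad = good.replace(b"\xa4", b"")      # drop the second byte of 'ä'

print(first_invalid_utf8(good))       # None
print(first_invalid_utf8(bad))        # (11, 2, 105)
```

The zero-based sequence start 11 corresponds to the 1-based column 12 in the expected error, and 105 is 0x69, the 'i' reported as the invalid second byte.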