userdiff-cpp: back out the digit-separators in numbers

The implementation of digit-separating single-quotes introduced a
noteworthy regression: a change in a character literal that involves
a digit splices the digit and the closing single-quote into one
word. For example, the change from 'a' to '2' is now tokenized as
'[-a'-]{+2'+} instead of '[-a-]{+2+}'.
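
The splice can be reproduced outside of git with a minimal sketch like
the one below. It assumes a POSIX regcomp() with leftmost-longest
matching; split_words() is a hypothetical helper, only the identifier
and decimal alternatives of the cpp pattern are used, and the trailing
"[^[:space:]]" alternative is an assumed fallback that turns any other
non-space character into a word of its own.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Decimal alternative as it was before this commit (quote allowed). */
const char with_quote[] =
    "[a-zA-Z_][a-zA-Z0-9_]*"
    "|[0-9][0-9.']*([Ee][-+]?[0-9]+)?[fFlLuU]*"
    "|[^[:space:]]"; /* assumed fallback, see above */

/* Decimal alternative as it is after this commit (no quote). */
const char without_quote[] =
    "[a-zA-Z_][a-zA-Z0-9_]*"
    "|[0-9][0-9.]*([Ee][-+]?[0-9]+)?[fFlLuU]*"
    "|[^[:space:]]";

/*
 * Hypothetical helper: split text into successive regex matches and
 * write them to out, separated by single spaces. Returns 0 on
 * success, -1 if the pattern does not compile. Output that does not
 * fit into out is silently dropped.
 */
int split_words(const char *pattern, const char *text,
                char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    out[0] = '\0';
    while (*text && regexec(&re, text, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);

        if (len == 0 || used + len + 2 > outsz)
            break; /* empty match or output buffer full */
        if (used)
            out[used++] = ' ';
        memcpy(out + used, text + m.rm_so, len);
        used += len;
        out[used] = '\0';
        text += m.rm_eo; /* continue after this word */
    }
    regfree(&re);
    return 0;
}
```

With with_quote, the input '2' splits into the two words ' and 2',
whereas 'a' splits into ', a and '. That mismatch is what makes the
closing quote part of the reported change. With without_quote, both
literals split into three words.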

The options to fix the regression are:

- Tighten the regular expression such that the single-quote can only
  occur between digits (that would match the official syntax).

- Remove support for digit separators.
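
For illustration only, the first option might look roughly like the
sketch below. The pattern "[0-9]+('[0-9]+)*" is a hypothetical,
integers-only simplification that is not part of this commit, and
split_words() is a hypothetical helper built on POSIX regcomp(); the
trailing "[^[:space:]]" alternative is an assumed fallback for all
other non-space characters.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical tightened pattern: a single-quote is accepted only
 * when digits precede and follow it.
 */
const char tightened[] =
    "[a-zA-Z_][a-zA-Z0-9_]*"
    "|[0-9]+('[0-9]+)*"
    "|[^[:space:]]"; /* assumed fallback, see above */

/*
 * Hypothetical helper: split text into successive regex matches and
 * write them to out, separated by single spaces. Returns 0 on
 * success, -1 if the pattern does not compile. Output that does not
 * fit into out is silently dropped.
 */
int split_words(const char *pattern, const char *text,
                char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    out[0] = '\0';
    while (*text && regexec(&re, text, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);

        if (len == 0 || used + len + 2 > outsz)
            break; /* empty match or output buffer full */
        if (used)
            out[used++] = ' ';
        memcpy(out + used, text + m.rm_so, len);
        used += len;
        out[used] = '\0';
        text += m.rm_eo; /* continue after this word */
    }
    regfree(&re);
    return 0;
}
```

Under these assumptions 1'000'000 stays a single word, while the
character literal '2' splits into ', 2 and ' because no digit follows
the closing quote.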

I chose to remove support, because

- I have not seen a lot of code make use of digit separators.

- If code does use digit separators, then the numbers are typically
  long. If a change occurs in one of the segments, it is actually
  easier to see when only that segment is highlighted as the word
  that changed instead of the whole long number.

This choice does introduce another minor regression, though, which
is highlighted in the test case: when a change occurs in the second
or later segment of a hexadecimal number, and that segment begins
with a digit but also contains letters, the segment is mistaken for
a number followed by an identifier. I can live with that.
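
The hexadecimal case can be checked with a minimal sketch like the
following. It assumes a POSIX regcomp() with leftmost-longest
matching; split_words() is a hypothetical helper, the alternative
for numbers that begin with a decimal point and the operator
alternatives are left out, and the trailing "[^[:space:]]"
alternative is an assumed fallback for all other non-space
characters.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Number alternatives before this commit (quote allowed). */
const char before[] =
    "[a-zA-Z_][a-zA-Z0-9_]*"
    "|[0-9][0-9.']*([Ee][-+]?[0-9]+)?[fFlLuU]*"
    "|0[xXbB][0-9a-fA-F']+[lLuU]*"
    "|[^[:space:]]"; /* assumed fallback, see above */

/* Number alternatives after this commit (no quote). */
const char after[] =
    "[a-zA-Z_][a-zA-Z0-9_]*"
    "|[0-9][0-9.]*([Ee][-+]?[0-9]+)?[fFlLuU]*"
    "|0[xXbB][0-9a-fA-F]+[lLuU]*"
    "|[^[:space:]]";

/*
 * Hypothetical helper: split text into successive regex matches and
 * write them to out, separated by single spaces. Returns 0 on
 * success, -1 if the pattern does not compile. Output that does not
 * fit into out is silently dropped.
 */
int split_words(const char *pattern, const char *text,
                char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    out[0] = '\0';
    while (*text && regexec(&re, text, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);

        if (len == 0 || used + len + 2 > outsz)
            break; /* empty match or output buffer full */
        if (used)
            out[used++] = ' ';
        memcpy(out + used, text + m.rm_so, len);
        used += len;
        out[used] = '\0';
        text += m.rm_eo; /* continue after this word */
    }
    regfree(&re);
    return 0;
}
```

With before, 0xdead'1eaf is one word. With after, it splits into
0xdead, ', 1 and eaf: the digit is taken as a decimal number and the
letters as an identifier, which is why only eaF is highlighted in the
updated test expectation.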

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt 2021-10-24 11:56:43 +02:00, committed by Junio C Hamano
Parent c4fdba3383
Commit 386076ec92
4 changed files: 18 additions and 18 deletions

@@ -1,21 +1,21 @@
 <BOLD>diff --git a/pre b/post<RESET>
-<BOLD>index 144cd98..64e78af 100644<RESET>
+<BOLD>index a1a09b7..f1b6f3c 100644<RESET>
 <BOLD>--- a/pre<RESET>
 <BOLD>+++ b/post<RESET>
 <CYAN>@@ -1,30 +1,30 @@<RESET>
 Foo() : x(0<RED>&&1<RESET><GREEN>&42<RESET>) { <RED>foo0<RESET><GREEN>bar<RESET>(x.<RED>find<RESET><GREEN>Find<RESET>); }
 cout<<"Hello World<RED>!<RESET><GREEN>?<RESET>\n"<<endl;
-<GREEN>(<RESET>1 <RED>-<RESET><GREEN>+<RESET>1e10 0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>.<RESET>'
+<GREEN>(<RESET>1 <RED>-<RESET><GREEN>+<RESET>1e10 0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>2<RESET>'
 // long double<RESET>
-<RED>3.141'592'653e-10l<RESET><GREEN>3.141'592'654e+10l<RESET>
+<RED>3.141592653e-10l<RESET><GREEN>3.141592654e+10l<RESET>
 // float<RESET>
 <RED>120E5f<RESET><GREEN>120E6f<RESET>
 // hex<RESET>
-<RED>0xdead'beaf<RESET><GREEN>0xdead'Beaf<RESET>+<RED>8ULL<RESET><GREEN>7ULL<RESET>
+<RED>0xdead<RESET><GREEN>0xdeaf<RESET>'1<RED>eaF<RESET><GREEN>eaf<RESET>+<RED>8ULL<RESET><GREEN>7ULL<RESET>
 // octal<RESET>
-<RED>0123'4567<RESET><GREEN>0123'4560<RESET>
+<RED>01234567<RESET><GREEN>01234560<RESET>
 // binary<RESET>
-<RED>0b10'00<RESET><GREEN>0b11'00<RESET>+e1
+<RED>0b1000<RESET><GREEN>0b1100<RESET>+e1
 // expression<RESET>
 1.5-e+<RED>2<RESET><GREEN>3<RESET>+f
 // another one<RESET>

@@ -1,16 +1,16 @@
 Foo() : x(0&42) { bar(x.Find); }
 cout<<"Hello World?\n"<<endl;
-(1 +1e10 0xabcdef) '.'
+(1 +1e10 0xabcdef) '2'
 // long double
-3.141'592'654e+10l
+3.141592654e+10l
 // float
 120E6f
 // hex
-0xdead'Beaf+7ULL
+0xdeaf'1eaf+7ULL
 // octal
-0123'4560
+01234560
 // binary
-0b11'00+e1
+0b1100+e1
 // expression
 1.5-e+3+f
 // another one

@@ -2,15 +2,15 @@ Foo():x(0&&1){ foo0( x.find); }
 cout<<"Hello World!\n"<<endl;
 1 -1e10 0xabcdef 'x'
 // long double
-3.141'592'653e-10l
+3.141592653e-10l
 // float
 120E5f
 // hex
-0xdead'beaf+8ULL
+0xdead'1eaF+8ULL
 // octal
-0123'4567
+01234567
 // binary
-0b10'00+e1
+0b1000+e1
 // expression
 1.5-e+2+f
 // another one

@@ -67,11 +67,11 @@ PATTERNS("cpp",
 /* identifiers and keywords */
 "[a-zA-Z_][a-zA-Z0-9_]*"
 /* decimal and octal integers as well as floatingpoint numbers */
-"|[0-9][0-9.']*([Ee][-+]?[0-9]+)?[fFlLuU]*"
+"|[0-9][0-9.]*([Ee][-+]?[0-9]+)?[fFlLuU]*"
 /* hexadecimal and binary integers */
-"|0[xXbB][0-9a-fA-F']+[lLuU]*"
+"|0[xXbB][0-9a-fA-F]+[lLuU]*"
 /* floatingpoint numbers that begin with a decimal point */
-"|\\.[0-9][0-9']*([Ee][-+]?[0-9]+)?[fFlL]?"
+"|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"
 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->\\*?|\\.\\*|<=>"),
 PATTERNS("csharp",
 /* Keywords */