JavaTokenParsers stringLiteral Unicode literals not fully correct

The stringLiteral function in scala.util.parsing.combinator.!JavaTokenParsers
tries to parse String literals with Unicode escape
sequence in them

```scala
  def stringLiteral: Parser[String] = 
    ("\""+"""([^"\p{Cntrl}\\]|\\[\\/bfnrt]|\\u[a-fA-F0-9]{4})*"""+"\"").r
```

However, such Unicode escapes can occur elsewhere,
including the first \ of an escape sequence,
or even instead of the " characters themselves (\u0022).

If you wish to support Unicode escapes, they should be
handled in a lower level stream Char => Char parser only,
where they would apply to string literals, numbers,
identifiers, etc.

Compile and run the following program

```scala
import scala.util.parsing.combinator.JavaTokenParsers
object StringLiteral extends JavaTokenParsers with Application {
  override def main(args: Array[String]) {
    args.foreach { a => println(a)
                        println(parseAll(stringLiteral, a)) }
  }
}
```

and pass it the strings:

```scala
  scala StringLiteral "\"trivial\"" "\"A backslash \u005c\ character\u0022"
```
and it parses the first correctly but fails on the second argument

```scala
"trivial"
[1.10] parsed: "trivial"
"A backslash \u005c\ character\u0022
[1.1] failure: string matching regex `"([^"\p{Cntrl}\\]|\\[\\/bfnrt]|\\u[a-fA-F0-9]{4})*"' expected but `"' found

"A backslash \u005c\ character\u0022
^
```

(Be careful about quoting arguments to the shell.)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

JavaTokenParsers stringLiteral Unicode literals not fully correct #324

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

JavaTokenParsers stringLiteral Unicode literals not fully correct #324

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions