Open
Labels
- A-frontend: Area: Compiler frontend (errors, parsing and HIR)
- A-macros: Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)
- A-proc-macros: Area: Procedural macros
- C-bug: Category: This is a bug.
- T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
Proc macros operate on tokens, including string/character/byte-string/byte literal tokens, which they can get from various sources.
- Source 1: Lexer.
  This is the most reliable source: the token is passed to a macro exactly as it was written in source code.
  `"C"` will be passed as `"C"`, but the same C in escaped form, `"\x43"`, will be passed as `"\x43"`.
  Proc macros can observe the difference because `ToString` (the only way to get the literal contents in the proc macro API) also prints the literal precisely.
- Source 2: Proc macro API.
  `Literal::string(s: &str)` will make you a string literal containing data `s`, approximately.
  The precise token (returned by `ToString`) will contain:
  - `escape_debug(s)` for string literals (`Literal::string`)
  - `escape_unicode(s)` for character literals (`Literal::character`)
  - `escape_default(s)` for byte string literals (`Literal::byte_string`)
- Source 3: Recovered from non-attribute AST.
  The AST is pretty-printed first and then re-tokenized.
  The precise token (returned by `ToString`) will contain:
  - precise `s` for raw AST strings
  - `escape_debug(s)` for non-raw AST strings
  - `escape_default(s)` for AST characters, bytes and byte strings (both raw and non-raw)
- Source 4: Recovered from attribute AST.
  Just an ad-hoc recovery without pretty-printing.
  The precise token (returned by `ToString`) will contain:
  - precise `s` for raw AST strings
  - `escape_default(s)` for non-raw AST strings, AST characters, bytes and byte strings (both raw and non-raw)
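
To illustrate the Source 1 behavior outside of a proc macro crate, here is a minimal sketch that uses the built-in `stringify!` macro as a stand-in for the proc macro `ToString` output. This is an assumption made for illustration: `stringify!` is not the proc macro API, and its exact output is not guaranteed to stay stable across compiler versions. The helper name `literal_spellings` is hypothetical.

```rust
/// Returns the token text of two string literals that denote the same value.
/// `stringify!` is used here only as an approximation of what a proc macro
/// would observe through `ToString` on a lexer-sourced literal token.
fn literal_spellings() -> (&'static str, &'static str) {
    (stringify!("C"), stringify!("\x43"))
}

fn main() {
    let (plain, escaped) = literal_spellings();

    // The two literals evaluate to the same string value...
    assert_eq!("C", "\x43");

    // ...but their token text differs, so a macro can tell them apart.
    assert_ne!(plain, escaped);

    println!("plain:   {plain}");
    println!("escaped: {escaped}");
}
```

A macro that compares literals by their printed text would therefore treat these two spellings as different, even though the program sees the same string.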
EDIT: Also, doc comments go through `escape_debug` when converted to `#[doc = "content"]` tokens for proc macros.
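
To make the differences between the escaping schemes concrete, the sketch below applies the standard library's `str::escape_debug`, `str::escape_unicode` and `str::escape_default` to the same input. It assumes that the `escape_debug(s)`, `escape_unicode(s)` and `escape_default(s)` named above behave like these standard library methods; the compiler's internal helpers may differ in details. The function name `escapes` is hypothetical.

```rust
/// Applies the three std escaping schemes to the same input and returns
/// the results as (escape_debug, escape_unicode, escape_default).
fn escapes(s: &str) -> (String, String, String) {
    let debug: String = s.escape_debug().collect();
    let unicode: String = s.escape_unicode().collect();
    let default: String = s.escape_default().collect();
    (debug, unicode, default)
}

fn main() {
    // One printable non-ASCII character followed by a newline.
    let (debug, unicode, default) = escapes("é\n");

    // escape_debug keeps printable non-ASCII characters as they are.
    println!("escape_debug:   {debug}");
    // escape_unicode turns every character into a \u{...} escape.
    println!("escape_unicode: {unicode}");
    // escape_default escapes non-ASCII characters but keeps short
    // escapes such as \n.
    println!("escape_default: {default}");
}
```

For the same data, the three schemes produce three different token texts, which is why literals of the same value can look different to a proc macro depending on which source they came from.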
It would be nice to
- Figure out what escaping we actually want (perhaps none?) and document the motivation behind the escaping choices.
- Get rid of the escaping differences between token sources, so that at least literals of the same kind are escaped identically.