author | Rui Ueyama <ruiu@google.com> | 2017-02-14 04:47:24 +0000 |
---|---|---|
committer | Rui Ueyama <ruiu@google.com> | 2017-02-14 04:47:24 +0000 |
commit | a66b0f4f021df4f88e17ef5c49e88059596d53e9 (patch) | |
tree | 727f9df3bc1607176241deed78d12cebb2d4d751 /lld/ELF/ScriptLexer.cpp | |
parent | 6758a6942eae016a8694bbe8d4f58f6990e1a42e (diff) |
Add file comments for ScriptParser.cpp.
Diffstat (limited to 'lld/ELF/ScriptLexer.cpp')
-rw-r--r-- | lld/ELF/ScriptLexer.cpp | 33 |
1 files changed, 31 insertions, 2 deletions
diff --git a/lld/ELF/ScriptLexer.cpp b/lld/ELF/ScriptLexer.cpp
index 6398a52a026..418ec93695f 100644
--- a/lld/ELF/ScriptLexer.cpp
+++ b/lld/ELF/ScriptLexer.cpp
@@ -7,8 +7,37 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// This file contains the base parser class for linker script and dynamic
-// list.
+// This file defines a lexer for the linker script.
+//
+// The linker script's grammar is not complex but ambiguous due to the
+// lack of the formal specification of the language. What we are trying to
+// do in this and other files in LLD is to make a "reasonable" linker
+// script processor.
+//
+// Among simplicity, compatibility and efficiency, we put the most
+// emphasis on simplicity when we wrote this lexer. Compatibility with the
+// GNU linkers is important, but we did not try to clone every tiny corner
+// case of their lexers, as even ld.bfd and ld.gold are subtly different
+// in various corner cases. We do not care much about efficiency because
+// the time spent in parsing linker scripts is usually negligible.
+//
+// Our grammar of the linker script is LL(2), meaning that it needs at
+// most two-token lookahead to parse. The only place we need two-token
+// lookahead is labels in version scripts, where we need to parse "local :"
+// as if "local:".
+//
+// Overall, this lexer works fine for most linker scripts. There's room
+// for improving compatibility, but that's probably not at the top of our
+// todo list.
+//
+// A caveat: This lexer splits an input string into tokens ahead of time,
+// so the lexer is not context aware. There's one known corner case. Let's
+// say the next string is "val*3" (without quotes). In the context where
+// the parser is expecting an expression, that should be tokenizes to
+// "val", "*" and "3". In other context, it should be just a single
+// token. (If it is in a filename context, it'll be interpeted as a glob
+// pattern, for example.) We want to fix this, but it probably needs a
+// redesign of this lexer.
 //
 //===----------------------------------------------------------------------===//
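The two points in the new file comment (ahead-of-time tokenization and the two-token lookahead for "local :") can be illustrated with a small standalone sketch. This is not the code in ScriptLexer.cpp: the functions `tokenize` and `matchLocalLabel` are hypothetical, and the character classes are assumptions chosen for illustration (here `*` and `:` count as ordinary token characters, while `{}();,` are single-character tokens).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified tokenizer (an assumption, not LLD's lexer).
// It splits the whole input up front, with no knowledge of what the
// parser will expect next. '*' and ':' are treated as token characters,
// so "val*3" and "local:" each come out as a single token.
std::vector<std::string> tokenize(const std::string &S) {
  const std::string Space = " \t\r\n";
  const std::string Punct = "{}();,";
  std::vector<std::string> Tokens;
  std::size_t Pos = 0;
  while ((Pos = S.find_first_not_of(Space, Pos)) != std::string::npos) {
    std::size_t End;
    if (Punct.find(S[Pos]) != std::string::npos) {
      End = Pos + 1; // punctuation is always a one-character token
    } else {
      End = S.find_first_of(Space + Punct, Pos);
      if (End == std::string::npos)
        End = S.size();
    }
    assert(End > Pos && "every iteration must consume input");
    Tokens.push_back(S.substr(Pos, End - Pos));
    Pos = End;
  }
  return Tokens;
}

// Returns how many tokens a "local" label occupies at index I:
// 1 for "local:", 2 for "local" followed by ":", 0 if there is no label.
// The second case is the one that needs two tokens of lookahead.
std::size_t matchLocalLabel(const std::vector<std::string> &Toks,
                            std::size_t I) {
  if (I < Toks.size() && Toks[I] == "local:")
    return 1;
  if (I + 1 < Toks.size() && Toks[I] == "local" && Toks[I + 1] == ":")
    return 2;
  return 0;
}
```

With these assumptions, `tokenize("val*3")` yields one token regardless of context, which suits a glob pattern but not an expression; splitting it into `val`, `*`, `3` would require the tokenizer to know the parser's state, which is the redesign the comment alludes to.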