fix: COMMON/ABS symbols are now defined in .dynsym#1685
fix: COMMON/ABS symbols are now defined in .dynsym#1685ostylk wants to merge 2 commits intowild-linker:mainfrom
Conversation
lapla-cogito
left a comment
There was a problem hiding this comment.
Thanks for working on this! Here is my response to the question in the PR's description:
- Probably
run_all_diffsis not configured in the configuration file described at https://github.com/wild-linker/wild/blob/main/CONTRIBUTING.md#configuration-file-for-tests. I recommend runningcp test-config-ci.toml test-config.toml. - I think we also should implement that.
Additionally, CI appears to be showing differences for section.got, which should be ignored. You can resolve this by adding directives like //#DiffIgnore:section.got.
Yeah, that solved it. Thanks.
I added this plus a test case. Is there an easier way to emit an ABS symbol? I used inline assembly.
Done. Thank you for your comments. I marked this PR as ready. |
| } else if sym.is_common(LittleEndian) { | ||
| let symbol_id = sym_def.symbol_id; | ||
| let resolution = layout.local_symbol_resolution(symbol_id).with_context(|| { | ||
| format!( | ||
| "Tried to write dynamic symbol definition without a resolution: {}", | ||
| layout.symbol_debug(symbol_id) | ||
| ) | ||
| })?; | ||
|
|
||
| let mut sym_value = resolution.value(); | ||
|
|
||
| // As common symbols are denoted by setting shndx=0xfff2 which is a special section, | ||
| // we need to put them manually into BSS/TBSS sections depending on whether they are thread | ||
| // local or not. | ||
| let section_id = if sym.st_type() == STT_TLS { | ||
| sym_value -= layout.tls_start_address(); | ||
| output_section_id::TBSS | ||
| } else { | ||
| output_section_id::BSS | ||
| }; | ||
| let section_id = layout.output_sections.primary_output_section(section_id); | ||
|
|
||
| dynamic_symbol_writer | ||
| .copy_symbol(sym, name, section_id, sym_value, ValueFlags::empty()) | ||
| .with_context(|| { | ||
| format!("Failed to copy dynamic {}", layout.symbol_debug(symbol_id)) | ||
| })?; |
There was a problem hiding this comment.
This branch seems to duplicate a lot of code from write_symbols. Would it be possible to extract common parts?
There was a problem hiding this comment.
Hmm, this refactoring looked a bit tricky for me. I attempted it here: ostylk@00abc82 . I tried extracting the section_id/symbol_value resolution logic.
Do you think this approach looks good? I am currently a bit unsure (as some special case logic is now also applied for COMMON symbols but that may be desirable?) so I didn't include that commit in this PR.
At least the unit tests pass.
There was a problem hiding this comment.
It seems to me that there's probably already a bit of duplication between writing of dynamic symbols and writing of debug symbols even prior to this PR. Given that, if we're going to attempt to deduplicate it, it'd probably make sense to do that as a separate PR. So it's fine with me to leave this bit as-is for now.
There was a problem hiding this comment.
Thanks for the attempt. Since this proves hard we can do it separately.
Previously COMMON symbols appeared in the .dynsym section as undefined eventhough resolution was successful and .symtab entries were correct.
bce6d54 to
548ea7e
Compare
|
Those riscv diffs are fine to |
|
Then we'll postpone the deduplication of symbol table logic. I ignored the riscv diffs, but added a TODO. CI should hopefully pass now. |
lapla-cogito
left a comment
There was a problem hiding this comment.
Overall looks good. Added one nit comment.
There was a problem hiding this comment.
You can simply write this test as an assembly file (there are other test files with *.s).
There was a problem hiding this comment.
Assembly tests that also run on RISC-V must enclose comments in /* */ (see https://github.com/wild-linker/wild/blob/main/wild/tests/sources/exclude-section.s).
When absolute symbols are exposed as dynamic symbol in a shared library make sure to include them correctly in .dynsym (and not with undefined address).
Fixes #1679 .
In https://github.com/ostylk/wild/blob/e9f0b316e7ac2157a53bc66e307b6c77e19ce691/libwild/src/elf_writer.rs#L3886 every dynamic symbol got set to zero if it was not associated to a "concrete" section (meaning shndx < SHN_LORESERVE).
However, every COMMON symbol is in a virtual section SHN_COMMON with an index higher than SHN_LORESERVE.
So all COMMON symbols got written as undefined into the .dynsym section.
The fix is to introduce a small special case like in https://github.com/ostylk/wild/blob/e9f0b316e7ac2157a53bc66e307b6c77e19ce691/libwild/src/elf_writer.rs#L1597 .
Now COMMON symbols are correctly written into the .dynsym section with the BSS/TBSS section.
This PR is still a draft because I have two open questions and I am unsure about them:
cargo test. Is my dev environment weird or does the test allow undefined symbols?SHN_ABS. Do we also need absolute symbols in .dynsym? If yes, that would also need to be added here. (But I'm still researching that question, just wanted to get the PR out).