Add OMF read support #807
That's logically what I would expect that API to return, so I think this is ok. I don't want to change the
I don't have any major feedback, this looks sane to me. Thanks for working on it. For testing, the preferred approach is to run
For example, you can add a test using: then manually verify |
- Fix SEGDEF length field size to depend on record type, not the P (use32) bit, and honor the B (big) bit. This fixes parsing of Borland objects.
- Parse COMDAT records per the TIS spec (separate align byte, conditional public base fields), synthesize sections and symbols for them, and support continuation and iterated data.
- Allow FIXUPP records to follow COMDAT records.
- Support Borland virtual segments (COMDEF with a segment index data type, referenced via segment indices with bit 14 set).
- Read the FIXUP M bit from the correct byte (LOCAT instead of fix data).
- Fix iterated data (LIDATA) repeat/block counts to be plain integers rather than COMDEF-style encoded values, support multiple consecutive blocks, and propagate expansion errors.
- Fix relocation section targets to use 1-based section indices.
- Rework relocation kind mapping: self-relative fixups map to Relative with an end-of-location addend; segment-relative offsets map to Absolute (FLAT frame) or SectionOffset (target-section frame).
- Return empty imports/exports like other relocatable formats.
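The reworked relocation kind mapping can be pictured with a small standalone sketch. The `Frame` and `Kind` enums and the `map_offset_fixup` function are hypothetical stand-ins, not the crate's types, and the `-size` addend is an assumption about how "end-of-location" is expressed relative to the start of the location.

```rust
/// Hypothetical, simplified view of a fixup's frame (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Frame {
    /// The frame is the FLAT group.
    Flat,
    /// The frame is the segment that contains the target.
    TargetSegment,
    /// Any other frame.
    Other,
}

/// Hypothetical stand-in for the generic relocation kind.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Relative,
    Absolute,
    SectionOffset,
    Unknown,
}

/// Map an offset fixup to a generic kind plus an addend adjustment.
/// `segment_relative` is the M bit from LOCAT (set = segment-relative);
/// `size` is the width of the fixed-up location in bytes.
fn map_offset_fixup(segment_relative: bool, frame: Frame, size: u8) -> (Kind, i64) {
    if !segment_relative {
        // Self-relative: the displacement is measured from the end of the
        // location, so the addend is shifted back by the location size
        // (assumed convention: value = target + addend - place).
        return (Kind::Relative, -i64::from(size));
    }
    let kind = match frame {
        Frame::Flat => Kind::Absolute,
        Frame::TargetSegment => Kind::SectionOffset,
        Frame::Other => Kind::Unknown,
    };
    (kind, 0)
}
```

Anything the generic kind cannot express is still available to callers through the format-specific relocation flags, so the lossy cases only affect consumers that look at the generic kind alone.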
- Add Borland and Watcom test files (from object-testfiles)
- Add objdump snapshot outputs for all OMF test files
- Verify LIDATA expansion content and COMDAT sections/symbols in tests
- Format with rustfmt; fix clippy lints
- Add module documentation
- Reject invalid record types instead of silently skipping
- Recognize the DWARF segment class as debug sections
- Add read_core,omf to the feature test matrix
- Reject zero-length records (slice panic)
- Check for overflow when computing FIXUP offsets
- Limit iterated data block nesting depth (stack overflow)
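To illustrate why a nesting limit matters, here is a minimal standalone sketch of iterated data expansion. It is not the code from this PR: `expand_lidata`, `ExpandError`, `MAX_DEPTH`, and `MAX_OUTPUT` are hypothetical names and limits. It assumes the usual block layout (repeat count as a plain 2- or 4-byte integer, a 2-byte block count, then either nested blocks or a length-prefixed byte run).

```rust
const MAX_DEPTH: usize = 8; // hypothetical nesting limit
const MAX_OUTPUT: usize = 1 << 20; // hypothetical cap on expanded size

#[derive(Debug, PartialEq, Eq)]
enum ExpandError {
    Truncated,
    TooDeep,
    TooLarge,
}

/// Split `n` bytes off the front of `data`, or fail if too few remain.
fn take<'a>(data: &mut &'a [u8], n: usize) -> Result<&'a [u8], ExpandError> {
    let input: &'a [u8] = *data;
    if input.len() < n {
        return Err(ExpandError::Truncated);
    }
    let (head, tail) = input.split_at(n);
    *data = tail;
    Ok(head)
}

fn expand_block(
    data: &mut &[u8],
    is32: bool,
    depth: usize,
    out: &mut Vec<u8>,
) -> Result<(), ExpandError> {
    if depth > MAX_DEPTH {
        return Err(ExpandError::TooDeep);
    }
    // Repeat and block counts are plain little-endian integers.
    let repeat = if is32 {
        let b = take(data, 4)?;
        u32::from_le_bytes([b[0], b[1], b[2], b[3]])
    } else {
        let b = take(data, 2)?;
        u32::from(u16::from_le_bytes([b[0], b[1]]))
    };
    let b = take(data, 2)?;
    let block_count = u16::from_le_bytes([b[0], b[1]]);

    let start = out.len();
    if block_count == 0 {
        // Leaf: one length byte followed by that many data bytes.
        let len = usize::from(take(data, 1)?[0]);
        out.extend_from_slice(take(data, len)?);
    } else {
        for _ in 0..block_count {
            expand_block(data, is32, depth + 1, out)?;
        }
    }
    // The content was written once; adjust to `repeat` copies in total.
    let once = out[start..].to_vec();
    if repeat == 0 || once.is_empty() {
        out.truncate(start);
        return Ok(());
    }
    for _ in 1..repeat {
        if out.len() + once.len() > MAX_OUTPUT {
            return Err(ExpandError::TooLarge);
        }
        out.extend_from_slice(&once);
    }
    Ok(())
}

/// Expand all consecutive iterated data blocks in `data`.
fn expand_lidata(mut data: &[u8], is32: bool) -> Result<Vec<u8>, ExpandError> {
    let mut out = Vec::new();
    while !data.is_empty() {
        expand_block(&mut data, is32, 0, &mut out)?;
    }
    Ok(out)
}
```

Because each nested block recurses, an adversarial input that nests blocks deeply could exhaust the stack without a depth check; returning an error keeps the failure recoverable for the caller.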
Adds read support for OMF (Relocatable Object Module Format), the object format used by DOS-era compilers (Borland C++, Open Watcom, MS C, etc.). Both 16-bit and 32-bit variants are supported.
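For readers unfamiliar with the format, an OMF file is a sequence of records, each framed as a type byte, a 2-byte little-endian length, and a body whose last byte is a checksum. The sketch below shows only that framing. `Record`, `FrameError`, and `next_record` are hypothetical names for illustration, not this PR's API, and the checksum is returned without being validated.

```rust
/// One framed record: type byte, payload, and trailing checksum byte.
#[derive(Debug, PartialEq, Eq)]
struct Record<'a> {
    kind: u8,
    payload: &'a [u8],
    checksum: u8,
}

#[derive(Debug, PartialEq, Eq)]
enum FrameError {
    Truncated,
    ZeroLength,
}

/// Read the next record from `data`, advancing it past the record.
/// Returns `Ok(None)` at end of input.
fn next_record<'a>(data: &mut &'a [u8]) -> Result<Option<Record<'a>>, FrameError> {
    let input: &'a [u8] = *data;
    if input.is_empty() {
        return Ok(None);
    }
    if input.len() < 3 {
        return Err(FrameError::Truncated);
    }
    let kind = input[0];
    // The length counts every byte after the length field, checksum included.
    let len = usize::from(u16::from_le_bytes([input[1], input[2]]));
    if len == 0 {
        // No room for even the checksum byte; slicing `len - 1` would underflow.
        return Err(FrameError::ZeroLength);
    }
    let body = input.get(3..3 + len).ok_or(FrameError::Truncated)?;
    let (payload, checksum) = body.split_at(len - 1);
    *data = &input[3 + len..];
    Ok(Some(Record {
        kind,
        payload,
        checksum: checksum[0],
    }))
}
```

Returning an error for a zero-length record, rather than indexing blindly, is the kind of check that avoids a slice panic on malformed input.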
OMF doesn't have a notion of sections; data is contributed to segments (`SEGDEF`) and COMDATs. This implementation maps both to sections in the unified read API:

- `SEGDEF` becomes a section, with `LEDATA`/`LIDATA` records contributing data chunks.
- `COMDAT` synthesizes a section and a defined symbol, with `ObjectComdat` tying them together. Continuation records and iterated data are supported.
- A Borland virtual segment (`COMDEF` with a segment index data type, referenced by other records via segment indices with bit 14 set) is handled the same way as COMDATs.

Since segment data is split across records (and `LIDATA` requires expansion), `data()` returns a contiguous `&'data [u8]` only when possible, and `uncompressed_data()` returns the assembled/expanded data otherwise (as discussed below).

Fixups are exposed as relocations, with the full location/mode/frame/target information preserved in `RelocationFlags::Omf`. The generic `RelocationKind` mapping is best-effort:

- Self-relative fixups map to `Relative` (with the addend adjusted to be relative to the end of the location).
- Segment-relative offsets map to `Absolute` when the frame is the `FLAT` group, or `SectionOffset` when the frame is the target's segment; otherwise `Unknown`.
- Base (segment selector) fixups map to `SectionIndex`, and far pointer fixups to `Absolute`.

Testing:

- `objdump` snapshot tests in `crates/examples/testfiles/omf` cover all test files, including objects built with Borland C++ 4.5 and Open Watcom (contributed by a user; added in the `object-testfiles` PR).
- Tests verify `LIDATA` expansion contents, COMDAT section data, and symbol properties.

To-do:

- Add test files to `object-testfiles`. (here, updated with Borland/Watcom test files)
- Add `objdump` snapshot tests and more detailed handwritten tests.

Not handled (can be follow-ups):

- Backpatch records (`BAKPAT`/`NBKPAT`): none of the test compilers emit them.
- Line number records (`LINNUM`/`LINSYM`) are skipped.

Resources:
Resolves #736