Initial commit for PE parser #28
Merged
10 commits
06e93ce Initial commit (Schamper)
2007ea6 Update exception.py (Schamper)
5509763 Remove ELF changes (Schamper)
34b753e Fix c_pe bits (Schamper)
ace4e57 Update pe.py (Schamper)
e14deec Update pe.py (Schamper)
c1ab55b Add MIPS exception (Schamper)
e0895ef Process review (Schamper)
2d7c645 Process review comments (Schamper)
b5e33d5 Add pointer types (Schamper)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,7 @@ | ||
| -from dissect.executable.elf import ELF | ||
| +from dissect.executable.elf.elf import ELF | ||
| +from dissect.executable.pe.pe import PE | ||
| | ||
| __all__ = [ | ||
| "ELF", | ||
| +"PE", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| from dissect.executable.pe.pe import PE | ||
|
|
||
| __all__ = [ | ||
| "PE", | ||
| ] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| from dissect.executable.pe.directory.base import DataDirectory | ||
| from dissect.executable.pe.directory.basereloc import BaseRelocationDirectory | ||
| from dissect.executable.pe.directory.bound_import import BoundImportDirectory | ||
| from dissect.executable.pe.directory.com_descriptor import ComDescriptorDirectory | ||
| from dissect.executable.pe.directory.debug import DebugDirectory | ||
| from dissect.executable.pe.directory.delay_import import DelayImportDirectory | ||
| from dissect.executable.pe.directory.exception import ExceptionDirectory | ||
| from dissect.executable.pe.directory.export import ExportDirectory | ||
| from dissect.executable.pe.directory.iat import IatDirectory | ||
| from dissect.executable.pe.directory.imports import ImportDirectory, ImportFunction, ImportModule | ||
| from dissect.executable.pe.directory.load_config import LoadConfigDirectory | ||
| from dissect.executable.pe.directory.resource import ( | ||
| ResourceDataEntry, | ||
| ResourceDirectory, | ||
| ResourceDirectoryEntry, | ||
| ResourceEntry, | ||
| ) | ||
| from dissect.executable.pe.directory.security import SecurityDirectory | ||
| from dissect.executable.pe.directory.tls import TlsDirectory | ||
|
|
||
| __all__ = [ | ||
| "BaseRelocationDirectory", | ||
| "BoundImportDirectory", | ||
| "ComDescriptorDirectory", | ||
| "DataDirectory", | ||
| "DebugDirectory", | ||
| "DelayImportDirectory", | ||
| "ExceptionDirectory", | ||
| "ExportDirectory", | ||
| "IatDirectory", | ||
| "ImportDirectory", | ||
| "ImportFunction", | ||
| "ImportModule", | ||
| "LoadConfigDirectory", | ||
| "ResourceDataEntry", | ||
| "ResourceDirectory", | ||
| "ResourceDirectoryEntry", | ||
| "ResourceEntry", | ||
| "SecurityDirectory", | ||
| "TlsDirectory", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| if TYPE_CHECKING: | ||
| from dissect.executable.pe.pe import PE | ||
|
|
||
|
|
||
| class DataDirectory: | ||
| """Base class for PE data directories.""" | ||
|
|
||
| def __init__(self, pe: PE, address: int, size: int): | ||
| self.pe = pe | ||
| self.address = address | ||
| self.size = size | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"<{self.__class__.__name__} address={self.address:#x} size={self.size}>" |
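
The subclasses in this pull request build on `DataDirectory` by adding lazily parsed, cached properties. The following is a minimal sketch of that pattern, not the library's code: `ToyDirectory` is hypothetical, and a plain `bytes` object stands in for the real `PE` instance so the example runs on its own.

```python
from __future__ import annotations

from functools import cached_property


class DataDirectory:
    """Stand-in for the base class in the diff; `pe` is just a bytes buffer here (assumption)."""

    def __init__(self, pe: bytes, address: int, size: int):
        self.pe = pe
        self.address = address
        self.size = size

    def __repr__(self) -> str:
        return f"<{self.__class__.__name__} address={self.address:#x} size={self.size}>"


class ToyDirectory(DataDirectory):
    """Hypothetical subclass: reads its byte range as 4-byte little-endian integers."""

    @cached_property
    def entries(self) -> list[int]:
        # Parsed on first access only, then cached on the instance.
        data = self.pe[self.address : self.address + self.size]
        return [int.from_bytes(data[i : i + 4], "little") for i in range(0, len(data), 4)]


image = bytes(16) + (1).to_bytes(4, "little") + (2).to_bytes(4, "little")
directory = ToyDirectory(image, address=0x10, size=8)
print(directory)          # <ToyDirectory address=0x10 size=8>
print(directory.entries)  # [1, 2]
```

Because `__repr__` uses `self.__class__.__name__`, a subclass gets a sensible representation without overriding it.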
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from dataclasses import dataclass | ||
| from functools import cached_property | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| from dissect.executable.pe.c_pe import c_pe | ||
| from dissect.executable.pe.directory.base import DataDirectory | ||
|
|
||
| if TYPE_CHECKING: | ||
| from collections.abc import Iterator | ||
|
|
||
|
|
||
| class BaseRelocationDirectory(DataDirectory): | ||
| """The base relocation directory of a PE file.""" | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"<BaseRelocationDirectory entries={len(self.entries)}>" | ||
|
|
||
| def __len__(self) -> int: | ||
| return len(self.entries) | ||
|
|
||
| def __iter__(self) -> Iterator[BaseRelocation]: | ||
| return iter(self.entries) | ||
|
|
||
| def __getitem__(self, idx: int) -> BaseRelocation: | ||
| return self.entries[idx] | ||
|
|
||
| @cached_property | ||
| def entries(self) -> list[BaseRelocation]: | ||
| """List of base relocation entries.""" | ||
| result = [] | ||
|
|
||
| offset = self.address | ||
| while offset < self.address + self.size: | ||
| self.pe.vfh.seek(offset) | ||
|
|
||
| block = c_pe._IMAGE_BASE_RELOCATION(self.pe.vfh) | ||
| if block.SizeOfBlock == 0: | ||
| break | ||
|
|
||
| page_rva = block.VirtualAddress | ||
|
|
||
| num_entries = (block.SizeOfBlock - len(c_pe._IMAGE_BASE_RELOCATION)) // len(c_pe.USHORT) | ||
| result.extend( | ||
| BaseRelocation(c_pe.IMAGE_REL_BASED(entry >> 12), page_rva + (entry & 0xFFF)) | ||
| for entry in c_pe.USHORT[num_entries](self.pe.vfh) | ||
| if (entry >> 12) != 0 # Skip IMAGE_REL_BASED_ABSOLUTE (0) | ||
| ) | ||
| offset += block.SizeOfBlock | ||
| offset += -offset & 3 # Align to 4 bytes | ||
|
|
||
| return result | ||
|
|
||
|
|
||
| @dataclass | ||
| class BaseRelocation: | ||
| """A single base relocation entry in the base relocation directory.""" | ||
|
|
||
| type: c_pe.IMAGE_REL_BASED | ||
| rva: int | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"<BaseRelocation rva={self.rva:#x} type={self.type.name or self.type.value}>" |
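
The block walk in `entries` can be illustrated without `dissect.cstruct`. The sketch below re-implements the same idea with the standard `struct` module over a raw buffer. The function name and the synthetic block are made up for illustration, and unlike the diff it stops on any block smaller than its 8-byte header rather than only on a zero size.

```python
from __future__ import annotations

import struct


def parse_base_relocations(data: bytes) -> list[tuple[int, int]]:
    """Return (type, rva) pairs from a raw base relocation buffer (illustrative helper)."""
    result = []
    offset = 0
    while offset + 8 <= len(data):
        # Block header: page RVA and total block size, both 32-bit little-endian.
        page_rva, size_of_block = struct.unpack_from("<II", data, offset)
        if size_of_block < 8 or offset + size_of_block > len(data):
            break  # Assumption of this sketch: treat short or truncated blocks as the end.

        num_entries = (size_of_block - 8) // 2
        for entry in struct.unpack_from(f"<{num_entries}H", data, offset + 8):
            rel_type = entry >> 12  # High 4 bits: relocation type.
            if rel_type != 0:  # Skip IMAGE_REL_BASED_ABSOLUTE padding entries.
                result.append((rel_type, page_rva + (entry & 0xFFF)))  # Low 12 bits: page offset.

        offset += size_of_block
        offset += -offset & 3  # Round up to the next multiple of 4.
    return result


# One synthetic block for page 0x1000: three relocations plus one padding entry.
block = struct.pack("<II4H", 0x1000, 16, 0x3004, 0xA010, 0x3FFC, 0x0000)
print([(t, hex(rva)) for t, rva in parse_base_relocations(block)])
```

The expression `-offset & 3` yields the number of bytes needed to reach the next 4-byte boundary (0 when already aligned), which is why adding it to `offset` aligns the cursor.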
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from functools import cached_property | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| from dissect.util.ts import from_unix | ||
|
|
||
| from dissect.executable.pe.c_pe import c_pe | ||
| from dissect.executable.pe.directory.base import DataDirectory | ||
|
|
||
| if TYPE_CHECKING: | ||
| import datetime | ||
|
|
||
|
|
||
| class BoundImportDirectory(DataDirectory): | ||
| """The bound import directory of a PE file.""" | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"<BoundImportDirectory modules={len(self.modules)}>" | ||
|
|
||
| def __len__(self) -> int: | ||
| return len(self.modules) | ||
|
|
||
| def __getitem__(self, idx: str | int) -> BoundImportModule: | ||
| if isinstance(idx, int): | ||
| return self.modules[idx] | ||
| if isinstance(idx, str): | ||
| if idx not in self._by_name: | ||
| raise KeyError(f"Bound import module {idx!r} not found") | ||
| return self._by_name[idx] | ||
| raise TypeError(f"BoundImportDirectory indices must be str or int, not {type(idx).__name__}") | ||
|
|
||
| def __contains__(self, name: str) -> bool: | ||
| if isinstance(name, str): | ||
| return name in self._by_name | ||
| return False | ||
|
|
||
| @cached_property | ||
| def modules(self) -> list[BoundImportModule]: | ||
| """List of bound imported modules.""" | ||
| result = [] | ||
|
|
||
| self.pe.vfh.seek(self.address) | ||
| while self.pe.vfh.tell() < self.address + self.size: | ||
| descriptor = c_pe.IMAGE_BOUND_IMPORT_DESCRIPTOR(self.pe.vfh) | ||
| if not descriptor: | ||
| break | ||
|
|
||
| forwarders = [] | ||
| for _ in range(descriptor.NumberOfModuleForwarderRefs): | ||
| forwarder = c_pe.IMAGE_BOUND_FORWARDER_REF(self.pe.vfh) | ||
| if not forwarder: | ||
| break | ||
|
|
||
| forwarders.append(BoundImportForwardReference(self, forwarder)) | ||
|
|
||
| result.append(BoundImportModule(self, descriptor, forwarders)) | ||
|
|
||
| return result | ||
|
|
||
| @cached_property | ||
| def _by_name(self) -> dict[str, BoundImportModule]: | ||
| """A mapping of module names to their :class:`BoundImportModule`.""" | ||
| return {module.name: module for module in self.modules} | ||
|
|
||
|
|
||
| class BoundImportModule: | ||
| """A module bound by a PE file, together with its forwarder references.""" | ||
|
|
||
| def __init__( | ||
| self, | ||
| directory: BoundImportDirectory, | ||
| descriptor: c_pe.IMAGE_BOUND_IMPORT_DESCRIPTOR, | ||
| forwarders: list[BoundImportForwardReference], | ||
| ): | ||
| self.directory = directory | ||
| self.descriptor = descriptor | ||
| self.forwarders = forwarders | ||
|
|
||
| @property | ||
| def timestamp(self) -> datetime.datetime | None: | ||
| """The timestamp of this bound import module, or ``None`` if the PE file is compiled as reproducible.""" | ||
| if self.directory.pe.is_reproducible(): | ||
| return None | ||
| return from_unix(self.descriptor.TimeDateStamp) | ||
|
|
||
| @property | ||
| def name(self) -> str: | ||
| self.directory.pe.vfh.seek(self.directory.address + self.descriptor.OffsetModuleName) | ||
| return c_pe.CHAR[None](self.directory.pe.vfh).decode() | ||
|
|
||
|
|
||
| class BoundImportForwardReference: | ||
| """A forward reference in a bound import module.""" | ||
|
|
||
| def __init__( | ||
| self, | ||
| directory: BoundImportDirectory, | ||
| descriptor: c_pe.IMAGE_BOUND_FORWARDER_REF, | ||
| ): | ||
| self.directory = directory | ||
| self.descriptor = descriptor | ||
|
|
||
| @property | ||
| def timestamp(self) -> datetime.datetime | None: | ||
| """The timestamp of this forwarder reference, or ``None`` if the PE file is compiled as reproducible.""" | ||
| if self.directory.pe.is_reproducible(): | ||
| return None | ||
| return from_unix(self.descriptor.TimeDateStamp) | ||
|
|
||
| @property | ||
| def name(self) -> str: | ||
| self.directory.pe.vfh.seek(self.directory.address + self.descriptor.OffsetModuleName) | ||
| return c_pe.CHAR[None](self.directory.pe.vfh).decode() |
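
As a rough illustration of the layout that `modules` and `name` rely on, the sketch below parses a synthetic bound import table with the standard `struct` module. The helper names and sample data are invented for the example. It assumes the usual 8-byte descriptor layout (a 32-bit timestamp followed by two 16-bit fields), that forwarder references share that shape, and that name offsets are relative to the start of the directory, as in the diff.

```python
from __future__ import annotations

import struct

# TimeDateStamp, OffsetModuleName, NumberOfModuleForwarderRefs (or Reserved for forwarders).
DESCRIPTOR = struct.Struct("<IHH")


def read_cstring(data: bytes, offset: int) -> str:
    """Read a NUL-terminated string starting at `offset`."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode()


def parse_bound_imports(data: bytes) -> list[tuple[str, int, list[str]]]:
    """Return (module name, timestamp, forwarder names) tuples (illustrative helper)."""
    result = []
    offset = 0
    while offset + DESCRIPTOR.size <= len(data):
        timestamp, name_offset, num_forwarders = DESCRIPTOR.unpack_from(data, offset)
        offset += DESCRIPTOR.size
        if not (timestamp or name_offset or num_forwarders):
            break  # An all-zero descriptor terminates the table.

        forwarders = []
        for _ in range(num_forwarders):
            # Forwarder references follow their descriptor directly.
            _, fwd_name_offset, _ = DESCRIPTOR.unpack_from(data, offset)
            offset += DESCRIPTOR.size
            forwarders.append(read_cstring(data, fwd_name_offset))

        result.append((read_cstring(data, name_offset), timestamp, forwarders))
    return result


# Synthetic directory: one module with one forwarder, a terminator, then the names.
names = b"KERNEL32.dll\x00NTDLL.DLL\x00"
table = (
    DESCRIPTOR.pack(0x5F000000, 24, 1)
    + DESCRIPTOR.pack(0x5F000001, 24 + 13, 0)
    + DESCRIPTOR.pack(0, 0, 0)
)
data = table + names
print(parse_bound_imports(data))
```

Note that the forwarder references are consumed from the same cursor as the descriptors, which is why the diff reads them inside the descriptor loop instead of seeking.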
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from functools import cached_property | ||
| from typing import BinaryIO | ||
|
|
||
| from dissect.util.stream import RangeStream | ||
|
|
||
| from dissect.executable.pe.c_pe import c_pe | ||
| from dissect.executable.pe.directory.base import DataDirectory | ||
|
|
||
|
|
||
| class ComDescriptorDirectory(DataDirectory): | ||
| """The COM descriptor directory of a PE file. | ||
|
|
||
| References: | ||
| - https://www.codeproject.com/Articles/12585/The-NET-File-Format | ||
| """ | ||
|
|
||
| @cached_property | ||
| def descriptor(self) -> c_pe.IMAGE_COR20_HEADER: | ||
| """The CLR 2.0 header descriptor.""" | ||
| self.pe.vfh.seek(self.address) | ||
| return c_pe.IMAGE_COR20_HEADER(self.pe.vfh) | ||
|
|
||
| @cached_property | ||
| def metadata(self) -> ComMetadata: | ||
| """The COM metadata directory.""" | ||
| return ComMetadata(self.pe, self.descriptor.MetaData.VirtualAddress, self.descriptor.MetaData.Size) | ||
|
|
||
|
|
||
| class ComMetadata(DataDirectory): | ||
| """The COM metadata directory of the COM descriptor.""" | ||
|
|
||
| @cached_property | ||
| def metadata(self) -> c_pe.IMAGE_COR20_METADATA: | ||
| """The CLR 2.0 metadata descriptor.""" | ||
| self.pe.vfh.seek(self.address) | ||
| return c_pe.IMAGE_COR20_METADATA(self.pe.vfh) | ||
|
|
||
| @property | ||
| def version(self) -> str: | ||
| """The version as defined in the metadata.""" | ||
| return self.metadata.Version.decode().strip("\x00") | ||
|
|
||
| @cached_property | ||
| def streams(self) -> list[ComStream]: | ||
| """A list of streams defined in the metadata.""" | ||
| result = [] | ||
|
|
||
| offset = self.address + len(self.metadata) | ||
| for _ in range(self.metadata.NumberOfStreams): | ||
| self.pe.vfh.seek(offset) | ||
| header = c_pe.IMAGE_COR20_STREAM_HEADER(self.pe.vfh) | ||
|
|
||
| result.append(ComStream(self, header.Offset, header.Size, header.Name.decode())) | ||
|
|
||
| offset += len(header) | ||
| offset += -offset & 3 # Align to 4 bytes | ||
|
|
||
| return result | ||
|
|
||
|
|
||
| class ComStream: | ||
| """A stream in the COM metadata.""" | ||
|
|
||
| def __init__(self, metadata: ComMetadata, offset: int, size: int, name: str): | ||
| self.metadata = metadata | ||
| self.offset = offset | ||
| self.size = size | ||
| self.name = name | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"<ComStream offset={self.offset:#x} size={self.size:#x} name={self.name!r}>" | ||
|
|
||
| @property | ||
| def data(self) -> bytes: | ||
| """The data of the stream.""" | ||
| self.metadata.pe.vfh.seek(self.metadata.address + self.offset) | ||
| return self.metadata.pe.vfh.read(self.size) | ||
|
|
||
| def open(self) -> BinaryIO: | ||
| """Open the stream for reading.""" | ||
| return RangeStream(self.metadata.pe.vfh, self.metadata.address + self.offset, self.size) |
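
The stream header loop in `streams` depends on each header being variable-length: two 32-bit fields followed by a NUL-terminated name padded to a 4-byte boundary. The sketch below shows that walk with the standard `struct` module. The function name and sample bytes are hypothetical, and it assumes the header table itself starts on a 4-byte boundary, so aligning the buffer offset is equivalent to aligning the address as the diff does.

```python
from __future__ import annotations

import struct


def parse_stream_headers(data: bytes, offset: int, count: int) -> list[tuple[str, int, int]]:
    """Return (name, offset, size) for `count` stream headers (illustrative helper)."""
    result = []
    for _ in range(count):
        stream_offset, stream_size = struct.unpack_from("<II", data, offset)
        offset += 8

        # The name is NUL-terminated; the next header starts at the next 4-byte boundary.
        end = data.index(b"\x00", offset)
        name = data[offset:end].decode()
        offset = end + 1
        offset += -offset & 3

        result.append((name, stream_offset, stream_size))
    return result


# Two synthetic headers: "#~" (padded to 4 bytes) and "#Strings" (padded to 12 bytes).
headers = (
    struct.pack("<II", 0x6C, 0x100) + b"#~\x00\x00"
    + struct.pack("<II", 0x16C, 0x40) + b"#Strings\x00\x00\x00\x00"
)
print(parse_stream_headers(headers, 0, 2))
```

Each returned offset is relative to the start of the metadata, which matches how `ComStream.data` and `ComStream.open` add `self.metadata.address` before reading.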