kayleeFrye_onDeck - 1 year ago
C++ Question

Using only pointer arithmetic to traverse a loaded DLL/EXE for the PE File Format

My end-goal was getting the list of DLL names from the static import data table.

I thought I could do something like this:

auto data_dirs = p_loaded_image->FileHeader->OptionalHeader.DataDirectory;

and then somehow iterate over that list of addresses to get the DLL names.

So for baby steps, I was just trying to verify that I could match the value of p_loaded_image->FileHeader->OptionalHeader.SizeOfStackCommit against a manual pointer-math equivalent. I can't seem to do this without Access Violation exceptions, which suggests that I'm doing it incorrectly.

What did I do wrong, and how specifically do I get my pointer-math query to match the value the loaded image's accessor returns for SizeOfStackCommit? If you can teach me that much, I can hopefully make progress on my DLL-name-finding WIP.

To save time, if your compiler supports std::experimental::filesystem you can start at the "// Skip to here" comment to avoid all the console and file verification boilerplate; otherwise you'll need to stub it out or change it to something friendlier to older C++ standards.

#include "Windows.h"
#include "Imagehlp.h"
#include "tchar.h"
#include "stdio.h"
#include "stdlib.h"

#include <string>
#include <vector>
#include <experimental\filesystem>

// All hard-coded values taken directly from latest PE/COFF .docx Documentation from MS:
// => http://go.microsoft.com/fwlink/p/?linkid=84140

const int MAGIC_32_NUM = 0x10b;
const int MAGIC_64_NUM = 0x20b;

// These two require C++17 || If needed, replace with older valid file-verification.
namespace fs = std::experimental::filesystem;
bool verify_loaded_file(std::string);

int _tmain(int argc, _TCHAR* argv[])
{
    std::string image_to_load;
    if (argc == 2) {
        image_to_load = argv[1];
    }
    else {
        printf("A valid path to a loadable image needs to be your only command-line parameter for %s", argv[0]);
        return -1;
    }

    bool validFile = verify_loaded_file(image_to_load);

    if (!validFile) {
        printf("A valid file path of a DLL or EXE needs to be your only command-line parameter for %s", argv[0]);
        return -1;
    }

    auto filesystem_image = fs::absolute(fs::path(image_to_load));
    std::string image_directory = filesystem_image.parent_path().string();
    std::string image_name = filesystem_image.stem().string();
    std::string image_name_and_extension = image_name + filesystem_image.extension().string();
    bool is64bit, is32bit = false;

    // To use MapAndLoad, you need to manually include Imagehlp.lib in your project.
    // The Imagehlp.h header alone does not suffice.
    LOADED_IMAGE loaded_image = { 0 };
    LOADED_IMAGE * p_loaded_image = &loaded_image;
    bool image_loaded = MapAndLoad(image_name_and_extension.c_str(), image_directory.c_str(), p_loaded_image, FALSE, TRUE);
    int error_check = GetLastError();

    if (!image_loaded) {
        printf("Something went wrong when trying to load %s0 with error code %s1", image_to_load.c_str(), error_check);
        UnMapAndLoad(p_loaded_image);
        return -1;
    }

    int magic_number = loaded_image.FileHeader->OptionalHeader.Magic;

    if (magic_number == MAGIC_32_NUM) { is32bit = true; }
    else if (magic_number == MAGIC_64_NUM) { is64bit = true; }
    else {
        printf("The magic number from the optional header wasn't detected as 32-bit or 64-bit\n");
        printf("Check Windows System Error Code: %s\n", magic_number);
        UnMapAndLoad(p_loaded_image);
        return -1;
    }

    // Skip to here
    UCHAR * module_base_address = p_loaded_image->MappedAddress;
    size_t coverted_base_address = size_t(module_base_address);

    size_t windows_optional_header_offset;
    if (is64bit) { windows_optional_header_offset = size_t(24); }
    else { windows_optional_header_offset = size_t(28); }

    size_t data_directory_optional_header_offset;
    if (is64bit) { data_directory_optional_header_offset = size_t(112); }
    else { data_directory_optional_header_offset = size_t(96); }

    size_t size_stack_commit_offset;
    if (is64bit) { size_stack_commit_offset = size_t(80); }
    else { size_stack_commit_offset = size_t(76); }

    // The commented out line below breaks with Access Violations, as does the line following it:
    // auto sum_for_size_stack = size_t(coverted_base_address + size_stack_commit_offset);
    auto sum_for_size_stack = size_t(coverted_base_address +
                                     windows_optional_header_offset +
                                     data_directory_optional_header_offset +
                                     size_stack_commit_offset);

    auto direct_access_size_stack = p_loaded_image->FileHeader->OptionalHeader.SizeOfStackCommit;
    DWORD64 * addy = &direct_access_size_stack;

    printf("Direct: %s\n", direct_access_size_stack);
    printf("Pointer-Math: %s\n", sum_for_size_stack);

    UnMapAndLoad(p_loaded_image);
    return 0;
}

//

bool verify_loaded_file(std::string file_to_verify)
{
    if (fs::exists(file_to_verify))
    {
        size_t extension_query = file_to_verify.find(".dll", 0);
        if (extension_query == std::string::npos)
        {
            extension_query = file_to_verify.find(".DLL", 0);
            if (extension_query == std::string::npos)
            {
                extension_query = file_to_verify.find(".exe", 0);
                if (extension_query == std::string::npos)
                {
                    extension_query = file_to_verify.find(".EXE", 0);
                }
                else { return true; }

                if (extension_query != std::string::npos) { return true; }
            }
            else { return true; }
        }
        else { return true; }
    }
    return false;
}


The latest documentation for the Windows PE file format is available as a .docx whitepaper attachment here: http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx

Update:



I've got my pure pointer-arithmetic traversal working up until just before my end-goal. Getting to that point required me to remove two layers of complexity.


  1. Don't screw around with loading the image through special APIs, since I'm only after the static import information; the solution was to load the EXE into a char vector to snapshot it in memory.

  2. Forget the over-complicated-sounding RVA crap. Just use byte offsets, or at least think of them that way. The PECOFF docx does a much better job of explaining RVAs than the original MSJ article. Just consider the address of element 0 of the char vector to be your base address, from which all RVAs are calculated. The docx also tells you when to use the offset versus the actual address, which is good to know.
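
The two points above can be sketched roughly as follows. This is only an illustration, not the code from my WIP: it assumes the whole file has already been read into a std::vector<unsigned char>, that the host is little-endian (as PE fields are), and the helper names read_at and size_of_stack_commit are made up for this example. The offsets used (e_lfanew at 0x3C, a 4-byte signature plus a 20-byte COFF header in front of the optional header, SizeOfStackCommit at 76 for PE32 and 80 for PE32+) are the ones listed in the PE/COFF document.

```cpp
#include <cassert>   // for the checks shown in the note below
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Copy a field of type T out of the snapshot at a byte offset.
// memcpy avoids unaligned or type-punned pointer dereferences, and the
// bounds check turns a bad offset into an exception instead of a crash.
template <typename T>
T read_at(const std::vector<unsigned char>& image, std::size_t offset) {
    if (offset > image.size() || sizeof(T) > image.size() - offset) {
        throw std::out_of_range("field lies outside the file snapshot");
    }
    T value{};
    std::memcpy(&value, image.data() + offset, sizeof(T));
    return value;
}

// Hypothetical helper: returns SizeOfStackCommit for a PE32 or PE32+ file.
std::uint64_t size_of_stack_commit(const std::vector<unsigned char>& image) {
    // e_lfanew: file offset of the "PE\0\0" signature, stored at 0x3C.
    const std::size_t pe_offset = read_at<std::uint32_t>(image, 0x3C);
    // Skip the 4-byte signature and the 20-byte COFF file header.
    const std::size_t optional_header = pe_offset + 4 + 20;
    const std::uint16_t magic = read_at<std::uint16_t>(image, optional_header);
    if (magic == 0x20b) {  // PE32+: the field is 8 bytes wide at offset 80
        return read_at<std::uint64_t>(image, optional_header + 80);
    }
    if (magic == 0x10b) {  // PE32: the field is 4 bytes wide at offset 76
        return read_at<std::uint32_t>(image, optional_header + 76);
    }
    throw std::runtime_error("unrecognised optional header magic");
}
```

Note that the field offset is relative to the start of the optional header, so the address is base + e_lfanew + 24 + 80 (or 76), and the bytes at that address still have to be read. Adding the Windows-specific-fields offset and the data-directory offset on top of that, as in the code in the question, lands somewhere else entirely. A quick check such as assert(size_of_stack_commit(bytes) == expected) against the value reported by the API accessor is one way to confirm the arithmetic.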



My program still isn't doing what I want it to, but at least I have the gist of pointer-arithmetic matching the pointer accessors, which was the goal of this question.

I think my remaining blockers are related to which structures to load with data, and where. You can build and run my WIP gist on a Win10 box, or update the value of ph_file to point to some other locally installed 64-bit program on your OS, preferably one without an .idata section. The .idata section isn't guaranteed to exist even if the image imports DLLs; Calculator.exe doesn't have one, for example.
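
One thing that may be relevant to those blockers: the import table is located through the data directory (entry 1 is the import directory), not through a section name, and the address stored there is an RVA. In a raw file snapshot, as opposed to an image mapped by the loader, an RVA that points into a section generally has to be translated to a file offset through the section table before it can be added to the base of the char vector. Below is a minimal sketch of that translation. SectionRange and rva_to_file_offset are hypothetical names, and the struct holds only the three section-header fields needed here (VirtualAddress, SizeOfRawData, PointerToRawData).

```cpp
#include <cassert>   // for the checks shown in the note below
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down view of one section table entry.
struct SectionRange {
    std::uint32_t virtual_address;      // RVA at which the section starts
    std::uint32_t size_of_raw_data;     // number of bytes stored in the file
    std::uint32_t pointer_to_raw_data;  // file offset of the section's bytes
};

// Translate an RVA to a file offset using the section that contains it.
// Returns false if no section has file-backed bytes at that RVA.
bool rva_to_file_offset(std::uint32_t rva,
                        const std::vector<SectionRange>& sections,
                        std::uint32_t& file_offset) {
    for (const SectionRange& s : sections) {
        if (rva >= s.virtual_address &&
            rva - s.virtual_address < s.size_of_raw_data) {
            file_offset = rva - s.virtual_address + s.pointer_to_raw_data;
            return true;
        }
    }
    return false;
}
```

For example, with a section at RVA 0x2000 whose bytes start at file offset 0x600, assert(rva_to_file_offset(0x2010, sections, off) && off == 0x610) would hold. RVAs that fall inside the headers are not covered by this sketch.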

Answer Source

I don't believe you that the line indicated (computation of sum_for_size_stack) causes an access violation. It's just unsigned arithmetic, which cannot overflow or result in a trap value.

I do believe that you get an access violation from printf, because you're using the %s format specifier with an argument that is not a pointer to a NUL-terminated ASCII string. I have no idea what gave you the idea that stack sizes are stored as strings, or that it's a good idea to pass size_t to a variadic function that requires const char*, but neither is true.

Pay attention to the preconditions of printf. The correct format string for a size_t parameter is %zx (hexadecimal) or %zu (decimal).
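
As a small illustration of matching the conversion to the argument type, here is a sketch that formats into a buffer with snprintf. The helper names format_size_hex and format_u64 are invented for this example. It assumes the directly accessed SizeOfStackCommit value is a 64-bit unsigned integer, which is printed here by casting to unsigned long long and using %llu.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a size_t as hexadecimal; %zx is the conversion that matches size_t.
std::string format_size_hex(std::size_t value) {
    char buffer[32];
    const int written = std::snprintf(buffer, sizeof buffer, "%zx", value);
    assert(written >= 0 && static_cast<std::size_t>(written) < sizeof buffer);
    return std::string(buffer, static_cast<std::size_t>(written));
}

// Format a 64-bit unsigned value as decimal; %llu matches unsigned long long.
std::string format_u64(unsigned long long value) {
    char buffer[32];
    const int written = std::snprintf(buffer, sizeof buffer, "%llu", value);
    assert(written >= 0 && static_cast<std::size_t>(written) < sizeof buffer);
    return std::string(buffer, static_cast<std::size_t>(written));
}
```

Applied to the question's code, the two final printf calls would use "%llu" with the direct value cast to unsigned long long, and "%zx" with sum_for_size_stack, instead of "%s".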
