cross cross - 7 months ago 24
Perl Question

Chronological trace of function calls in C++ using etrace

Background:

I have a large simulation tool, and I need to understand its logical behavior. What would help me most is the chronological order of function calls for a minimal working example.

I found several tools online, such as CygProfile and etrace. I got so desperate for a solution that I resorted to the craziest approach: using "step into" in the debugger. That is a good option for a small program, but not for a complete simulation tool.




Problem:

One of the problems I face is that the above-mentioned solutions were originally meant for C, and they generate a static object file (*.o) when compiled. The simulation tool, on the other hand, generates a shared library (.so). I don't have much knowledge of lower-level details, so my attempts to link them keep failing.

I looked specifically at the etrace documentation, and it says:


To see how to modify ptrace.c to work with a dynamic library, look at
the example2 directory. The sources there also create a stand-alone
executable, but the PTRACE_REFERENCE_FUNCTION macro is defined just as
it would be for a dynamic library.


If you look at the repo, there is no difference between the files in the example and example2 folders, except for an extra .h file in example2.

On the other hand, if you look at src/ptrace.c, it says:


When using ptrace on a dynamic library, you must set the
PTRACE_REFERENCE_FUNCTION macro to be the name of a function in the
library. The address of this function when loaded will be the first
line output to the trace file and will permit the translation of the
other entry and exit pointers to their symbolic names. You may set
the macro PTRACE_INCLUDE with any #include directives needed for
that function to be accesible to this source file.


A little below that, there is this commented-out code:

/* When using ptrace on a dynamic library, the following must be defined:
#include "any files needed for PTRACE_REFERENCE_FUNCTION"
#define PTRACE_REFERENCE_FUNCTION functionName
*/





Question:

In essence, the question is the following: how do I use etrace with a dynamic library?

Do I need to #include any files?


To trace a stand-alone program, there is no need to #include any
additional file. Just link your code against ptrace.c and use the
-finstrument-functions option as a compile option for gcc. This should do it.


How do I link C++ code that is built via makefiles against ptrace.c?

Final Note: I would appreciate it if someone could bear with my ignorance and provide a step-by-step solution to my question.




Update 1:

I managed to add the libraries related to etrace to the simulation tool, and it executes fine.

However (probably because the scripts are too old or are not meant for use with C++), I get the following error when using the Perl script that ships with etrace:

Hexadecimal number > 0xffffffff non-portable



This probably changes the nature of the question a bit, turning it into more of a Perl-related issue at this point.

If this problem is solved, I hope etrace will work with a complicated project, and I will provide the details.




Update 2:

I took the suggestions from @Harry, and I believe they would work in most projects. However, in my case I get the following from the Perl script:

Use of uninitialized value within %SYMBOLTABLE in list assignment at etrace2.pl line 99, <CALL_DATA> line 1.

\-- ???
| \-- ???
\-- ???
| \-- ???
| | \-- ???
\-- ???
| \-- ???
\-- ???
| \-- ???
\-- ???
| \-- ???
\-- ???
| \-- ???
\-- ???
| \-- ???


Because the makefiles are autogenerated, I used LD_PRELOAD to load the etrace shared library, etrace.so, which I built as follows:

gcc -g -finstrument-functions -shared -fPIC ptrace.c -o etrace.so -I <path-to-etrace>


I created the dummy etrace.h inside the tool:

#ifndef __ETRACE_H_
#define __ETRACE_H_

#include <stdio.h>

void Crumble_buy(char * what, int quantity, char * unit);


void Crumble_buy(char * what, int quantity, char * unit)
{
    printf("buy %d %s of %s\n", quantity, unit, what);
}

#endif


and used Crumble_buy for the #define and etrace.h for the #include.

Answer

Fixing the Perl Script

Hexadecimal number > 0xffffffff non-portable

This is a warning from Perl's hex function, which has detected a possibly non-portable value (anything wider than 32 bits).

At the very top of the script, add this:

use bigint qw/hex oct/;

When this tool was written, I suspect its authors were on 32-bit machines. You can compile the program as 32-bit with the -m32 flag, but if you change the Perl script as described above, you won't need to.
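To make the 32-bit limit concrete, here is a minimal C sketch that parses a hexadecimal address into a 64-bit integer and checks whether it still fits in 32 bits. The helper names parse_trace_address and exceeds_32_bits are made up for this illustration and are not part of etrace.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hexadecimal address (with or without a
 * leading "0x") into a 64-bit value. Returns 0 on success, -1 on error. */
int parse_trace_address(const char *text, uint64_t *out)
{
    char *end = NULL;

    errno = 0;
    unsigned long long value = strtoull(text, &end, 16);
    if (errno != 0 || end == text || *end != '\0') {
        return -1;  /* overflow, no digits, or trailing junk */
    }
    *out = (uint64_t)value;
    return 0;
}

/* Nonzero when the address needs more than 32 bits, which is the case
 * that Perl's built-in hex warns about. */
int exceeds_32_bits(uint64_t address)
{
    return address > UINT32_MAX;
}
```

An address such as 00000001000005b0, as printed by nm on a 64-bit system, parses fine here but exceeds the 32-bit range, which is why the unmodified script complains.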

Note: if you're on a Mac, you can't use mknod the way the script uses it to create a pipe; you need to use mkfifo instead, with no extra arguments beyond the pipe name.

On Linux, adding the bigint fix above is enough. You then need to run both commands from the same directory; I did this using example2:

../src/etrace.pl crumble
# Switch to a different terminal
./crumble

and I get this on both the Mac and Linux:

\-- main
|   \-- Crumble_make_apple_crumble
|   |   \-- Crumble_buy_stuff
|   |   |   \-- Crumble_buy
|   |   |   \-- Crumble_buy
|   |   |   \-- Crumble_buy
|   |   |   \-- Crumble_buy
|   |   |   \-- Crumble_buy
|   |   \-- Crumble_prepare_apples
|   |   |   \-- Crumble_skin_and_dice
|   |   \-- Crumble_mix
|   |   \-- Crumble_finalize
|   |   |   \-- Crumble_put
|   |   |   \-- Crumble_put
|   |   \-- Crumble_cook
|   |   |   \-- Crumble_put
|   |   |   \-- Crumble_bake

About the Dynamic Library...

When you load a dynamic library, the address in the object file is not the address that will be used at run time. To compensate, etrace takes the name of a function from a header you specify. In the case of example2, that would be the following:

#include "crumble.h"
#define PTRACE_REFERENCE_FUNCTION Crumble_buy

You would then edit the makefile to make sure that the header file can be found:

CFLAGS = -g -finstrument-functions -I.

Note the added include path, -I. (the current directory). The address of the symbol from the header (in our case, Crumble_buy) is used to calculate the offset between the address in the object file and the actual runtime address; this allows the script to compute the correct address for looking up each symbol.
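To show where the macro is consumed, here is a simplified sketch of instrumentation hooks in the spirit of ptrace.c. It is not the real ptrace.c: the function reference_marker, the helper trace_set_output, and the text format of the trace lines are all assumptions made for this example. Only the two __cyg_profile_func_* hooks and the no_instrument_function attribute are GCC's own names.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NO_INSTRUMENT __attribute__((no_instrument_function))
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Stand-in for a function exported by the traced library. In example2
 * this role is played by Crumble_buy, declared in crumble.h. */
NO_INSTRUMENT void reference_marker(void) {}
#define PTRACE_REFERENCE_FUNCTION reference_marker

static FILE *trace_out;        /* destination for trace lines */
static int reference_written;  /* reference line is emitted only once */

NO_INSTRUMENT void trace_set_output(FILE *out)
{
    trace_out = out;
    reference_written = 0;
}

NO_INSTRUMENT static void trace_line(const char *kind, void *address)
{
    if (trace_out == NULL) {
        return;
    }
    if (!reference_written) {
        /* First line: name and runtime address of the reference function.
         * The line format is made up for this sketch. */
        reference_written = 1;
        fprintf(trace_out, "reference %s 0x%" PRIxPTR "\n",
                STRINGIFY(PTRACE_REFERENCE_FUNCTION),
                (uintptr_t)&PTRACE_REFERENCE_FUNCTION);
    }
    fprintf(trace_out, "%s 0x%" PRIxPTR "\n", kind, (uintptr_t)address);
}

/* GCC calls these on every function entry and exit when the code is
 * compiled with -finstrument-functions. */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    trace_line("enter", this_fn);
}

NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    trace_line("exit", this_fn);
}
```

The sketch also shows why the header must be on the include path: the macro expands to a function name, and the compiler needs to see a declaration of that function to take its address.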

If you look at the output of nm, you get something like the following:

0000000100000960 T _Crumble_bake
00000001000005b0 T _Crumble_buy
0000000100000640 T _Crumble_buy_stuff
00000001000009f0 T _Crumble_cook

The addresses on the left are relative; that is, the actual addresses change at runtime. The etrace.pl program stores them in a hash, keyed by the decimal form of each address, like this:

$VAR1 = {
          '4294969696' => '_Crumble_bake',
          '4294969424' => '_Crumble_put',
          '4294970096' => '_main',
          '4294969264' => '_Crumble_mix',
          '4294970704' => '_gnu_ptrace_close',
          '4294967296' => '__mh_execute_header',
          '4294968752' => '_Crumble_buy',
          '4294968896' => '_Crumble_buy_stuff',
          '4294969952' => '_Crumble_make_apple_crumble',
          '4294969184' => '_Crumble_prepare_apples',
          '4294971512' => '___GNU_PTRACE_FILE__',
          '4294971504' => '_gnu_ptrace.first',
          '4294970208' => '_gnu_ptrace',
          '4294970656' => '___cyg_profile_func_exit',
          '4294970608' => '___cyg_profile_func_enter',
          '4294969552' => '_Crumble_finalize',
          '4294971508' => '_gnu_ptrace.active',
          '4294969840' => '_Crumble_cook',
          '4294969088' => '_Crumble_skin_and_dice',
          '4294970352' => '_gnu_ptrace_init'
        };

Note the leading underscore, which is there because this is on a Mac using clang. At runtime, these addresses are not correct, but their relative offsets are. If you can work out what the offset is, you can adjust the addresses you get at runtime to find the actual symbol. The code that does this follows:

  if ($offsetLine =~ m/^$REFERENCE_OFFSET\s+($SYMBOL_NAME)\s+($HEX_NUMBER)$/) {
    # This is a dynamic library; need to calculate the load offset
    my $offsetSymbol  = "_$1";   # leading underscore: Mac/clang symbol naming
    my $offsetAddress = hex $2;  # runtime address of the reference function

    # Invert the table so symbols can be looked up by name
    my %offsetTable = reverse %SYMBOLTABLE;

    print Dumper(\%offsetTable);  # debug output; requires "use Data::Dumper;"
    $baseAddress = $offsetTable{$offsetSymbol} - $offsetAddress;
    #print("offsetSymbol == $offsetSymbol\n");
    #print("offsetAddress == $offsetAddress\n");
    #print("baseAddress == $baseAddress\n");
    $offsetLine = <CALL_DATA>;
  } else {
    # This is static
    $baseAddress = 0;
  }

This is what the line #define PTRACE_REFERENCE_FUNCTION Crumble_buy is for. The C code in ptrace.c uses that macro and, if it is defined, outputs the address of that function as the first line of the trace. The Perl script then calculates the offset and adjusts all subsequent addresses by that amount before looking up the correct symbol in the hash.
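The same arithmetic can be written out in C. This is only an illustration of the idea, not code from etrace; the struct and function names are invented, and the "???" fallback mirrors the placeholder that appears in the output in Update 2 when a lookup fails.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a symbol table such as the one nm prints. */
struct symbol {
    uint64_t address;   /* address recorded in the binary */
    const char *name;
};

/* Offset to add to a runtime address to get back the recorded address.
 * Unsigned arithmetic wraps around, so this also works when the library
 * is loaded above its recorded address. */
uint64_t load_offset(uint64_t reference_recorded, uint64_t reference_runtime)
{
    return reference_recorded - reference_runtime;
}

/* Translate a runtime address and look it up; "???" when not found. */
const char *symbol_for(const struct symbol *table, size_t count,
                       uint64_t runtime_address, uint64_t offset)
{
    uint64_t recorded = runtime_address + offset;

    for (size_t i = 0; i < count; i++) {
        if (table[i].address == recorded) {
            return table[i].name;
        }
    }
    return "???";
}
```

If the reference function does not appear in the symbol table of the binary being analysed, the offset cannot be computed correctly, and every lookup falls through to the placeholder.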