LuisABOL LuisABOL - 3 months ago 10
Objective-C Question

What is `objc_msgSend_fixup`, exactly?

I'm messing around with the Objective-C runtime, trying to compile Objective-C code without linking it against `libobjc`, and I'm getting segmentation faults in a program, so I generated an assembly file from it. I don't think it's necessary to show the whole assembly file. At some point in my `main` function, I've got the following line (which, by the way, is the line after which I get the seg fault):

callq *l_objc_msgSend_fixup_alloc


and here is the definition for `l_objc_msgSend_fixup_alloc`:

.hidden l_objc_msgSend_fixup_alloc # @"\01l_objc_msgSend_fixup_alloc"
.type l_objc_msgSend_fixup_alloc,@object
.section "__DATA, __objc_msgrefs, coalesced","aw",@progbits
.weak l_objc_msgSend_fixup_alloc
.align 16
l_objc_msgSend_fixup_alloc:
.quad objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_
.size l_objc_msgSend_fixup_alloc, 16


I've reimplemented `objc_msgSend_fixup` as a function (`id objc_msgSend_fixup(id self, SEL op, ...)`) which returns `nil` (just to see what happens), but this function isn't even being called (the program crashes before calling it).

So, my question is: what is `callq *l_objc_msgSend_fixup_alloc` supposed to do, and what is `objc_msgSend_fixup` (after `l_objc_msgSend_fixup_alloc:`) supposed to be (a function or an object)?

Edit

To better explain, I'm not linking my source file against the objc library. What I'm trying to do is implement some parts of the library myself, just to see how it works. Here is a sketch of what I've done:

#include <stdio.h>
#include <objc/runtime.h>

@interface MyClass {

}
+(id) alloc;
@end

@implementation MyClass
+(id) alloc {
    // alloc the object
    return nil;
}
@end

id objc_msgSend_fixup(id self, SEL op, ...) {
    printf("Calling objc_msgSend_fixup()...\n");

    // looks for the method implementation for SEL in self's method list

    return nil; // Since this is just a test, this function doesn't need to do that
}

int main(int argc, char *argv[]) {
    MyClass *m;
    m = [MyClass alloc]; // At this point, according to the assembly code generated
    // objc_msgSend_fixup should be called. So, the program should, at least, print
    // "Calling objc_msgSend_fixup()..." on the screen, but it crashes before
    // objc_msgSend_fixup() is called...

    return 0;
}


If the runtime needs to access the object's vtable or the method list of the object's class to find the correct method to call, which function actually does this? I think it is `objc_msgSend_fixup`, in this case. So, when `objc_msgSend_fixup` is called, it receives an object as one of its parameters, and, if this object hasn't been initialized, the function fails.

So, I've implemented my own version of `objc_msgSend_fixup`. According to the assembly source above, it should be called. It doesn't matter whether the function actually looks up the implementation of the selector passed as a parameter. I just want `objc_msgSend_fixup` to be called. But it's not being called; that is, the function that looks up the object's data is never even reached, rather than being called and then causing a fault (because it returns `nil`, which, by the way, doesn't matter). The program seg faults before `objc_msgSend_fixup` is called...

Edit 2

A more complete assembly snippet:

.globl main
.align 16, 0x90
.type main,@function
main: # @main
.Ltmp20:
.cfi_startproc
# BB#0:
pushq %rbp
.Ltmp21:
.cfi_def_cfa_offset 16
.Ltmp22:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp23:
.cfi_def_cfa_register %rbp
subq $32, %rsp
movl $0, %eax
leaq l_objc_msgSend_fixup_alloc, %rcx
movl $0, -4(%rbp)
movl %edi, -8(%rbp)
movq %rsi, -16(%rbp)
movq L_OBJC_CLASSLIST_REFERENCES_$_, %rsi
movq %rsi, %rdi
movq %rcx, %rsi
movl %eax, -28(%rbp) # 4-byte Spill
callq *l_objc_msgSend_fixup_alloc
movq %rax, -24(%rbp)
movl -28(%rbp), %eax # 4-byte Reload
addq $32, %rsp
popq %rbp
ret


For `l_objc_msgSend_fixup_alloc`, we have:

.hidden l_objc_msgSend_fixup_alloc # @"\01l_objc_msgSend_fixup_alloc"
.type l_objc_msgSend_fixup_alloc,@object
.section "__DATA, __objc_msgrefs, coalesced","aw",@progbits
.weak l_objc_msgSend_fixup_alloc
.align 16
l_objc_msgSend_fixup_alloc:
.quad objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_
.size l_objc_msgSend_fixup_alloc, 16


For `L_OBJC_CLASSLIST_REFERENCES_$_`:

.type L_OBJC_CLASSLIST_REFERENCES_$_,@object # @"\01L_OBJC_CLASSLIST_REFERENCES_$_"
.section "__DATA, __objc_classrefs, regular, no_dead_strip","aw",@progbits
.align 8
L_OBJC_CLASSLIST_REFERENCES_$_:
.quad OBJC_CLASS_$_MyClass
.size L_OBJC_CLASSLIST_REFERENCES_$_, 8


`OBJC_CLASS_$_MyClass` is a pointer to the `MyClass` struct definition, which has also been generated by the compiler and is also present in the assembly code.

Answer

To understand what objc_msgSend_fixup is and what it does, it's necessary to know exactly how message sending is performed in Objective-C. Every ObjC programmer has heard at some point that the compiler transforms [obj message] statements into objc_msgSend(obj, sel_registerName("message")) calls. However, that's not entirely accurate.

To better illustrate my explanation, consider the following ObjC snippet:

[obj mesgA];
[obj mesgB];

[obj mesgA];
[obj mesgB];

In this snippet, two messages are sent to obj, each of which is sent twice. So, you might imagine that the following code is generated:

objc_msgSend(obj, sel_registerName("mesgA"));
objc_msgSend(obj, sel_registerName("mesgB"));
objc_msgSend(obj, sel_registerName("mesgA"));
objc_msgSend(obj, sel_registerName("mesgB"));

However, sel_registerName may be too costly, and calling it every time a specific method is invoked is not a smart thing to do. So the compiler generates a structure like this for each message to be sent:

typedef struct message_ref {
    id (*trampoline) (id obj, struct message_ref *ref, ...);
    union {
        const char *str;
        SEL sel;
    };
} message_ref;

So, in the example above, when the program starts, we have something like this:

message_ref l_objc_msgSend_fixup_mesgA = { &objc_msgSend_fixup, "mesgA" };
message_ref l_objc_msgSend_fixup_mesgB = { &objc_msgSend_fixup, "mesgB" };

When these messages need to be sent to obj, the compiler generates code equivalent to the following:

l_objc_msgSend_fixup_mesgA.trampoline(obj, &l_objc_msgSend_fixup_mesgA, ...);   // [obj mesgA];
l_objc_msgSend_fixup_mesgB.trampoline(obj, &l_objc_msgSend_fixup_mesgB, ...);   // [obj mesgB];

At program startup, the message reference trampolines are pointers to the objc_msgSend_fixup function. For each message_ref, when its trampoline pointer is invoked for the first time, objc_msgSend_fixup gets called, receiving the obj to which the message is to be sent and the message_ref structure through which it was called. So what objc_msgSend_fixup must do is obtain the selector for the message. Since this has to be done only once per message reference, objc_msgSend_fixup must also replace the ref's trampoline field with a pointer to another function that doesn't fix up the selector. This function is called objc_msgSend_fixedup (the selector has been fixed up). Now that the selector has been set and this doesn't have to be done again, objc_msgSend_fixup just calls objc_msgSend_fixedup, which in turn just calls objc_msgSend. After that, if the message ref's trampoline is called again, its selector is already fixed, and objc_msgSend_fixedup is the one that gets called.

In short, we could write objc_msgSend_fixup and objc_msgSend_fixedup like this:

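// Pseudocode: C has no way to forward "..." like this; actual runtimes
// typically implement these entry points in assembly.
id objc_msgSend_fixedup(id obj, struct message_ref *ref, ...);

id objc_msgSend_fixup(id obj, struct message_ref *ref, ...) {
    ref->sel = sel_registerName(ref->str);    // resolve the selector once
    ref->trampoline = &objc_msgSend_fixedup;  // later calls skip the fixup
    return objc_msgSend_fixedup(obj, ref, ...);
}

id objc_msgSend_fixedup(id obj, struct message_ref *ref, ...) {
    return objc_msgSend(obj, ref->sel, ...);
}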

This makes message sending a lot faster, since the appropriate selector is looked up only the first time the message is sent (by objc_msgSend_fixup). On later calls, the selector has already been found and the message is sent directly with objc_msgSend (via objc_msgSend_fixedup).
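The fixup-once behaviour described above can be tried out in plain C with a toy model. This is only a sketch under loud assumptions: every toy_* name is hypothetical, the "receiver" is just an int, messages take no extra arguments (so no varargs forwarding is needed), and toy_register_name is a stand-in for sel_registerName, not the real function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime types; none of this is the real libobjc. */
typedef const char *toy_sel;             /* a "selector" is an interned string */
typedef struct toy_ref toy_ref;
typedef int (*toy_tramp)(int obj, toy_ref *ref);

struct toy_ref {
    toy_tramp trampoline;                /* first field, as in message_ref */
    union {
        const char *str;                 /* before fixup: the method name */
        toy_sel sel;                     /* after fixup: the unique selector */
    };
};

static const char *toy_table[16];        /* interned selector names */
static size_t toy_count;
static int toy_register_calls;           /* how often the slow path ran */

/* Toy sel_registerName: linear search, append on a miss. */
toy_sel toy_register_name(const char *name) {
    toy_register_calls++;
    for (size_t i = 0; i < toy_count; i++)
        if (strcmp(toy_table[i], name) == 0)
            return toy_table[i];
    assert(toy_count < 16);
    toy_table[toy_count] = name;
    return toy_table[toy_count++];
}

/* Toy objc_msgSend: "dispatch" just returns obj plus the selector length. */
int toy_msg_send(int obj, toy_sel sel) {
    return obj + (int)strlen(sel);
}

/* Fast path: the selector is already resolved. */
int toy_fixedup(int obj, toy_ref *ref) {
    return toy_msg_send(obj, ref->sel);
}

/* Slow path: resolve the selector, then patch the ref so it never runs again. */
int toy_fixup(int obj, toy_ref *ref) {
    ref->sel = toy_register_name(ref->str);
    ref->trampoline = toy_fixedup;
    return toy_fixedup(obj, ref);
}

/* Sends one message through its ref, the way the compiled call site does. */
int toy_send(int obj, toy_ref *ref) {
    return ref->trampoline(obj, ref);
}
```

Starting from a ref initialised as { toy_fixup, { "mesgA" } }, two calls to toy_send on it should leave toy_register_calls at 1 and the ref's trampoline pointing at toy_fixedup.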

In the question's assembly code, l_objc_msgSend_fixup_alloc is the alloc method's message_ref structure, and the segmentation fault may have been caused by a problem in its first field (maybe it's not pointing to objc_msgSend_fixup...).
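To relate this to the question's listing, here is a rough C rendering of what the call site does: the class pointer is moved into %rdi, the address of l_objc_msgSend_fixup_alloc into %rsi, and callq *l_objc_msgSend_fixup_alloc jumps through the first .quad of that 16-byte object. The names msgref, probe_fixup and call_through are hypothetical and exist only for this sketch. It also shows that a replacement declared as objc_msgSend_fixup(id self, SEL op, ...) would receive the address of the ref in its second parameter, not a SEL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the object emitted in __objc_msgrefs:
   .quad objc_msgSend_fixup ; .quad L_OBJC_METH_VAR_NAME_ */
typedef struct msgref msgref;
typedef void *(*msgref_fn)(void *receiver, msgref *ref);

struct msgref {
    msgref_fn trampoline;   /* first .quad: what callq *ref jumps through */
    const char *name;       /* second .quad: method name until fixed up */
};

/* Records what a replacement "fixup" function actually receives. */
static void *seen_receiver;
static msgref *seen_ref;

void *probe_fixup(void *receiver, msgref *ref) {
    seen_receiver = receiver;   /* %rdi in the question's listing */
    seen_ref = ref;             /* %rsi: address of the ref, not a SEL */
    return NULL;
}

/* C equivalent of: movq cls, %rdi ; leaq ref, %rsi ; callq *ref
   Returns -1 instead of jumping when the first field is empty. */
int call_through(void *receiver, msgref *ref, void **result) {
    if (ref->trampoline == NULL)
        return -1;              /* an indirect call through 0 would typically fault */
    *result = ref->trampoline(receiver, ref);
    return 0;
}
```

A guard like the one in call_through is only a diagnostic aid for experiments of this kind. Compiled code performs the indirect call unconditionally, so if the first field was never resolved to a valid function address, the program would crash before any replacement function runs.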