Rella Rella - 10 months ago 48
C Question

How does one encode a series of images into H264 using the x264 C API?

How does one use the x264 C API to encode RGB images into H264 frames? I have already created a sequence of RGB images; how can I now transform that sequence into a sequence of H264 frames? In particular, how do I encode this sequence of RGB images into a sequence of H264 frames consisting of a single initial H264 keyframe followed by dependent H264 frames?

Answer Source

First of all, check the x264.h file: it contains more or less the reference for each function and structure. The x264.c file in the download contains a sample implementation. Most people say to base your code on that one, but I find it rather complex for beginners; it is good as an example to fall back on, however.

First you set up some parameters in a structure of type x264_param_t; a good site describing parameters is . Also take a look at the x264_param_default_preset function, which lets you target some functionality without needing to understand all of the (sometimes quite complex) parameters. Call x264_param_apply_profile afterwards (you'll probably want the "baseline" profile).

This is some example setup from my code:

x264_param_t param;
x264_param_default_preset(&param, "veryfast", "zerolatency");
param.i_threads = 1;
param.i_width = width;
param.i_height = height;
param.i_fps_num = fps;
param.i_fps_den = 1;
// Intra refresh:
param.i_keyint_max = fps;
param.b_intra_refresh = 1;
//Rate control:
param.rc.i_rc_method = X264_RC_CRF;
param.rc.f_rf_constant = 25;
param.rc.f_rf_constant_max = 35;
//For streaming:
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(&param, "baseline");
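
To make the two streaming flags a bit more concrete: with b_annexb = 1 every NAL unit in the output is prefixed with a start code (00 00 01 or 00 00 00 01), and with b_repeat_headers = 1 the SPS/PPS headers are emitted in front of every keyframe. Below is a minimal sketch, in plain C without x264, of how a receiver could walk such a byte stream and read each NAL's type (the low 5 bits of the first byte after the start code: 7 = SPS, 8 = PPS, 5 = IDR slice, 1 = non-IDR slice). The helper names next_nal_start and list_nal_types are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: finds the next 00 00 01 start code at or after pos.
 * Returns the index of the first byte after it, or len if there is none.
 * A 4-byte start code (00 00 00 01) ends in 00 00 01, so it is found too. */
static size_t next_nal_start(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i + 3;
    }
    return len;
}

/* Hypothetical helper: stores the nal_unit_type of each NAL found in buf
 * into types[] (at most max_types entries) and returns how many were found. */
static size_t list_nal_types(const uint8_t *buf, size_t len,
                             uint8_t *types, size_t max_types)
{
    assert(buf != NULL && types != NULL);
    size_t count = 0;
    size_t pos = next_nal_start(buf, len, 0);
    while (pos < len && count < max_types) {
        types[count++] = buf[pos] & 0x1F; /* low 5 bits of the NAL header */
        pos = next_nal_start(buf, len, pos);
    }
    return count;
}
```

With the parameters above you would expect the first encoded frame to contain types 7 and 8 followed by a slice, and later frames to contain mostly type 1.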

After this you can initialize the encoder as follows:

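x264_t* encoder = x264_encoder_open(&param); // returns NULL on failure
x264_picture_t pic_in, pic_out;
// w and h are the same frame dimensions as param.i_width and param.i_height
x264_picture_alloc(&pic_in, X264_CSP_I420, w, h);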

x264 expects YUV420P data (I guess some other formats as well, but that's the common one). You can use libswscale (from ffmpeg) to convert images to the right format. It is initialized like this (I assume RGB data with 24 bpp):

// Older FFmpeg releases spelled these constants PIX_FMT_RGB24 and PIX_FMT_YUV420P
struct SwsContext* convertCtx = sws_getContext(in_w, in_h, AV_PIX_FMT_RGB24, out_w, out_h, AV_PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
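
If you want to see what the conversion actually has to produce, here is a rough sketch of an RGB24 to I420 conversion without libswscale. It is not what libswscale does internally; it uses the common integer approximation of the BT.601 limited-range matrix, averages each 2x2 block for chroma, and assumes even width and height. The function name rgb24_to_i420 is hypothetical. The point is the layout: one full-resolution Y plane plus U and V planes at half resolution in both directions, each with its own stride.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of x264 or libswscale.
 * rgb: packed R,G,B bytes; rgb_stride: bytes per input row.
 * plane[0..2] / stride[0..2]: destination Y, U, V planes and their strides. */
static void rgb24_to_i420(const uint8_t *rgb, int rgb_stride, int w, int h,
                          uint8_t *const plane[3], const int stride[3])
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    for (int y = 0; y < h; y += 2) {
        for (int x = 0; x < w; x += 2) {
            int sum_r = 0, sum_g = 0, sum_b = 0;
            /* One luma sample per pixel of the 2x2 block. */
            for (int dy = 0; dy < 2; dy++) {
                for (int dx = 0; dx < 2; dx++) {
                    const uint8_t *p = rgb + (y + dy) * rgb_stride + (x + dx) * 3;
                    int r = p[0], g = p[1], b = p[2];
                    plane[0][(y + dy) * stride[0] + (x + dx)] =
                        (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
                    sum_r += r;
                    sum_g += g;
                    sum_b += b;
                }
            }
            /* One chroma sample pair per 2x2 block, from the averaged colour.
             * 32896 = 128 (rounding) + 128 * 256 (the +128 chroma offset),
             * added before the shift so the shifted value is never negative. */
            int r = (sum_r + 2) / 4, g = (sum_g + 2) / 4, b = (sum_b + 2) / 4;
            plane[1][(y / 2) * stride[1] + x / 2] =
                (uint8_t)((-38 * r - 74 * g + 112 * b + 32896) >> 8);
            plane[2][(y / 2) * stride[2] + x / 2] =
                (uint8_t)((112 * r - 94 * g - 18 * b + 32896) >> 8);
        }
    }
}
```

With an allocated x264 picture you could pass pic_in.img.plane and pic_in.img.i_stride as the last two arguments. In real code libswscale is still the better choice, since it is optimized and also handles scaling.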

Encoding is then as simple as this; for each frame do:

// data is a pointer to your packed RGB pixels (uint8_t*)
int srcstride = w*3; // RGB stride is just 3*width
sws_scale(convertCtx, &data, &srcstride, 0, h, pic_in.img.plane, pic_in.img.i_stride);
x264_nal_t* nals;
int i_nals;
int frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
if (frame_size > 0) {
    // OK: nals[0].p_payload holds frame_size bytes (all NAL payloads are contiguous)
} else if (frame_size < 0) {
    // encoding error
}
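
The return value of x264_encoder_encode has three cases according to x264.h: negative on error, zero if no NAL units were returned, and otherwise the number of bytes in the returned NALs. x264.h also states that the payloads of all output NALs are sequential in memory, so the whole frame can be written starting at nals[0].p_payload. A minimal sketch of handling the three cases, with the x264 types left out so it stands alone (write_encoded_frame is a hypothetical helper):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes one encoded frame to out.
 * Returns the number of bytes written, 0 if there was nothing to write,
 * or -1 if the encoder reported an error or the write was short. */
static int write_encoded_frame(FILE *out, const uint8_t *payload, int frame_size)
{
    if (frame_size < 0)
        return -1; /* encoder reported an error */
    if (frame_size == 0)
        return 0;  /* frame was buffered by the encoder; no output yet */
    assert(out != NULL && payload != NULL);
    if (fwrite(payload, 1, (size_t)frame_size, out) != (size_t)frame_size)
        return -1; /* short write */
    return frame_size;
}
```

In the loop above this would be called as write_encoded_frame(file, nals[0].p_payload, frame_size), where file is whatever FILE* you opened for the output.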

I hope this will get you going ;). I spent a long time getting started myself. x264 is an insanely powerful but sometimes complex piece of software.

Edit: with other parameters there will be delayed frames; this is not the case with my parameters (mostly due to the "zerolatency" tune). With delayed frames, frame_size will sometimes be zero, and after the last input frame you'll have to keep calling x264_encoder_encode (passing NULL as the input picture) as long as x264_encoder_delayed_frames does not return 0. For this functionality you should take a deeper look at x264.c and x264.h.