Devin Mitchell Devin Mitchell - 7 months ago 30
Ruby Question

Is Log4r Actually Threadsafe in a Rails 4 Application?

Setting the Scene



In an effort to make the default Rails logging more helpful in a production environment, I have been taking advantage of Log4r and its diagnostic context, specifically MDC. In addition to the logs that come out of the Rails application itself, I'm also inserting some of my own middleware so that I can get consistent logging, even in the Rack middleware.

For example, as soon as I get access to the logged-in user via Warden, I add the MDC entry needed for that piece.

def call(env)
user = env['warden'].user
user_context = user ? user.to_log_format : 'indetermined'
MDC.put :user, user_context

@app.call(env)
end


Other logged things in the Rack middleware are parent PID, request ID, etc.

The Problem



The issue is that my log entries are noticeably incorrect. Under load, I consistently see one user's identity is getting mixed in with a request that an entirely different user would have made to the API if the requests come in close enough together.

I'm told that Log4r MDC is threadsafe and Rails 4 is threadsafe by default, but obviously something isn't right. I've also wondered if Rack is the issue, but the threadsafeness--totally a word, right?--in Rails seems to be confident enough to remove Rack::Lock, then that doesn't seem right either.

What am I missing? All the information seems to say that it's threadsafe, but I'm not convinced that it is.

Under the Hood




  • Rails 4.1.10

  • Log4r 1.1.10

  • Passenger 4.0.59



References




Answer

After some research and experimenting, this has turned out to not be an issue of threadsafeness at all; it's rather an issue of stale data from another request hanging around. To understand the problem, you have to understand how the Log4r MDC stuff is stored to know what the issue is:

The MDC pieces are stored along with the thread so that they are accessible for logging throughout the whole request.

def self.put( a_key, a_value )
  self.check_thread_instance()
  Thread.current[MDCNAME][a_key] = a_value
end

So when a thread completely finishes processing a request, it will pick up another request. Unfortunately, it appears that it's still carrying the details from the previous request - nothing's been purged. So all of those old, unrelated details may end up as diagnostic context on an unrelated request.

The solution then is to ensure that the MDC pieces are purged before you start adding new context to your new request:

MDC.get_context.keys.each { |k| MDC.remove(k) }

Oops.