Question (asked by user1318499, 1 year ago)

Sponsor's Renewal function stops being called

I have a server and a client process, both running on the same machine. The client creates a client-activated object (CAO) and uses it for some time (from under 1s up to hours). The object takes a lot of memory, so I want to dispose of it as soon as possible after the client finishes with it.

I set InitialLeaseTime and RenewOnCallTime to 10s (0.1s and 15s have the same problem). For a few minutes I can see the sponsor's Renewal function being called every 10s. After several minutes the client starts doing a different kind of work and the sponsor stops being called (this seems wrong). A few minutes later, when the client tries to use the remote object, it throws an exception saying the object has been disconnected (probably because the sponsor wasn't called for a long time).

It seems like the lease manager somehow stops trying to check the lease after a while.
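
For reference, a setup like the one described might look roughly like the sketch below. This is not the asker's actual code: the class names `LeasedRemoteObject`, `KeepAliveSponsor`, and `SponsorRegistration` are hypothetical, and giving the sponsor an infinite lifetime is an assumption on my part rather than something stated in the question.

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Lifetime;

// Hypothetical server-side CAO type with a 10s lease, as described in the question.
public class LeasedRemoteObject : MarshalByRefObject
{
    public override object InitializeLifetimeService()
    {
        ILease lease = (ILease)base.InitializeLifetimeService();

        // Lease properties can only be changed while the lease is in its initial state.
        if (lease != null && lease.CurrentState == LeaseState.Initial)
        {
            lease.InitialLeaseTime = TimeSpan.FromSeconds(10);
            lease.RenewOnCallTime = TimeSpan.FromSeconds(10);
        }
        return lease;
    }
}

// Hypothetical client-side sponsor. It is a MarshalByRefObject so that the
// server's lease manager can call back into the client process.
public class KeepAliveSponsor : MarshalByRefObject, ISponsor
{
    // Called by the lease manager when the lease is about to expire.
    public TimeSpan Renewal(ILease lease)
    {
        return TimeSpan.FromSeconds(10);
    }

    // Assumption: the sponsor is itself a remoted object with its own lease in
    // the client AppDomain, so it is given an infinite lifetime here.
    public override object InitializeLifetimeService()
    {
        return null;
    }
}

public static class SponsorRegistration
{
    // Registers the sponsor on the lease of the object behind the given proxy.
    public static void Attach(MarshalByRefObject proxy, ISponsor sponsor)
    {
        ILease lease = (ILease)RemotingServices.GetLifetimeService(proxy);
        if (lease != null)
        {
            lease.Register(sponsor);
        }
    }
}
```

On the client, `SponsorRegistration.Attach(proxy, new KeepAliveSponsor())` would be called once after the object is created, and the sponsor reference kept alive for as long as the object is needed.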

Answer

Long time in-between responses, but I figure others may run into this issue too, so here goes.

I'd recommend attaching to your server in VS, going to the Debug menu, choosing 'Exceptions' (Debug > Windows > Exception Settings in newer versions), and checking 'System.Net.Sockets.SocketException' under Common Language Runtime Exceptions. The debugger will then break as soon as any socket exception is thrown, even if it is caught later.

I recently began seeing this issue myself, and after a lot of debugging I noticed that a SocketException occurred just before the Lease Manager stopped checking leases. In my case, the exception was thrown from AddressChangedCallback, with this stack trace:
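
If attaching a debugger to the server isn't practical, the same information can be captured in a log. The following is a minimal sketch, assuming .NET Framework 4.0 or later (where `AppDomain.FirstChanceException` is available). The class name `SocketExceptionLogger` and its `Count` field are made up for illustration.

```csharp
using System;
using System.Net.Sockets;
using System.Runtime.ExceptionServices;
using System.Threading;

public static class SocketExceptionLogger
{
    // Number of SocketExceptions observed so far (illustrative only).
    public static int Count;

    // Call once at server startup, before the remoting channels are registered.
    public static void Install()
    {
        AppDomain.CurrentDomain.FirstChanceException += OnFirstChance;
    }

    private static void OnFirstChance(object sender, FirstChanceExceptionEventArgs e)
    {
        // Keep this handler trivial: an exception thrown in here raises the event again.
        SocketException socketError = e.Exception as SocketException;
        if (socketError == null)
        {
            return;
        }

        Interlocked.Increment(ref Count);
        Console.Error.WriteLine("{0:o} SocketException {1}: {2}",
            DateTime.UtcNow, socketError.SocketErrorCode, socketError.Message);
    }
}
```

The event fires when the exception is thrown, before any catch block runs, so it also reports exceptions that the framework swallows internally.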

 1      [External Code]
 2      System.dll!System.Net.Dns.TryGetAddrInfo(string name = "", System.Net.AddressInfoHints flags, out System.Net.IPHostEntry hostinfo = null)
 3      System.dll!System.Net.Dns.GetAddrInfo(string name)
 4      System.dll!System.Net.Dns.InternalGetHostByName(string hostName, bool includeIPv6)
 5      System.dll!System.Net.Dns.GetHostEntry(string hostNameOrAddress)
 6      System.Runtime.Remoting.dll!System.Runtime.Remoting.Channels.CoreChannel.UpdateCachedIPAddresses()
 7      System.Runtime.Remoting.dll!System.Runtime.Remoting.Channels.CoreChannel.OnNetworkAddressChanged(object sender = null, System.EventArgs e = {System.EventArgs})
 8      mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx)
 9      mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx)
 10     mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state)
 11     System.dll!System.Net.NetworkInformation.NetworkChange.AddressChangeListener.AddressChangedCallback(object stateObject, bool signaled)
 12     mscorlib.dll!System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(object state, bool timedOut)
 13     [External Code]

In my case, AddressChangedCallback seems to fire when a network adapter goes down or changes. (You can see your network adapters by pressing Windows+R and typing ncpa.cpl. If you have two or more, switching between them may be what raises this event.) The failure inside the callback appeared to cause the remoting socket to stop reading. The next time the LeaseManager tried to use the remoting connection to ask the sponsor for a renewal, it couldn't get an answer over the dead socket. So it did the reasonable thing: it dropped that sponsor, since it could no longer be reached, and removed it from the object's list of sponsors. Since that was probably the object's only sponsor, the lease then expired and the LeaseManager disconnected the object, leaving it free for the GC to eventually pick up.
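
To check whether address changes line up with the moment the sponsor stops being called, you can subscribe to the same notification yourself and log it. This is only a diagnostic sketch. `AddressChangeMonitor` and `FormatStatus` are hypothetical names, and it does nothing to prevent the failure.

```csharp
using System;
using System.Net.NetworkInformation;

public static class AddressChangeMonitor
{
    // Call once at startup in the process you want to observe.
    public static void Install()
    {
        NetworkChange.NetworkAddressChanged += OnAddressChanged;
    }

    // Builds one log line per adapter, e.g. name, type and up/down state.
    public static string FormatStatus(string name, NetworkInterfaceType type, OperationalStatus status)
    {
        return string.Format("{0} ({1}): {2}", name, type, status);
    }

    private static void OnAddressChanged(object sender, EventArgs e)
    {
        Console.Error.WriteLine("{0:o} network address changed", DateTime.UtcNow);

        // Dump the state of every adapter so the log shows which one changed.
        foreach (NetworkInterface nic in NetworkInterface.GetAllNetworkInterfaces())
        {
            Console.Error.WriteLine("  " + FormatStatus(nic.Name, nic.NetworkInterfaceType, nic.OperationalStatus));
        }
    }
}
```

If the timestamps logged here match the last successful Renewal call, that is a strong hint you are hitting the same problem.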

One way to solve this is to override InitializeLifetimeService() so that it returns null instead of configuring lease times. This bypasses the LeaseManager entirely, so you never have to worry about the object losing its sponsor because of a socket exception: there are no leases in the first place. However, it also means the object is never released automatically, so objects and unmanaged resources can build up on the server over time. The only way around the buildup that I can see is to make your remoting object implement IDisposable and to make sure you call Dispose when you're finished with it. Basically, you can't rely on the LeaseManager to handle cleanup, so you'll have to do it yourself, the ol' fashioned way.
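
A minimal sketch of that approach might look like this. The type name `LargeRemoteObject` and its buffer are placeholders for whatever your object actually holds, and the call to `RemotingServices.Disconnect` is my own addition, so that the server drops its reference as soon as the client disposes the object.

```csharp
using System;
using System.Runtime.Remoting;

// Hypothetical CAO type; the buffer stands in for the real memory-heavy state.
public class LargeRemoteObject : MarshalByRefObject, IDisposable
{
    private byte[] buffer = new byte[1024 * 1024];
    private bool disposed;

    public bool IsDisposed
    {
        get { return disposed; }
    }

    // Returning null gives the object an infinite lifetime: no lease, no sponsor.
    public override object InitializeLifetimeService()
    {
        return null;
    }

    // The client calls this through its proxy; the body runs in the server process.
    public void Dispose()
    {
        if (disposed)
        {
            return;
        }
        disposed = true;

        // Release the expensive state explicitly.
        buffer = null;

        // Stop the remoting infrastructure from holding on to this instance,
        // so the GC can collect it.
        RemotingServices.Disconnect(this);
    }
}
```

On the client, wrapping the proxy in a `using` block (or calling `Dispose()` in a `finally`) ensures the server-side object is released even if the client code throws.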

Also worth noting: the ITrackingHandler interface (registered through TrackingServices.RegisterTrackingHandler) lets you track when remoting objects are disconnected, marshaled, and unmarshaled. It was a big help in figuring out what was going on, since I could see that an object was being disconnected instead of having to infer it from the fact that the calls stopped happening.
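
A tracking handler along those lines could be sketched as follows. `LoggingTrackingHandler`, `TrackingSetup`, and the `DisconnectedCount` field are hypothetical names. The interface and the registration call come from System.Runtime.Remoting.Services.

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Services;
using System.Threading;

public class LoggingTrackingHandler : ITrackingHandler
{
    // Number of disconnect notifications seen so far (illustrative only).
    public static int DisconnectedCount;

    public void MarshaledObject(object obj, ObjRef objRef)
    {
        Log("marshaled", objRef != null ? objRef.URI : "(no URI)");
    }

    public void UnmarshaledObject(object obj, ObjRef objRef)
    {
        Log("unmarshaled", objRef != null ? objRef.URI : "(no URI)");
    }

    // Raised when an object is disconnected, for example after its lease expires.
    public void DisconnectedObject(object obj)
    {
        Interlocked.Increment(ref DisconnectedCount);
        Log("disconnected", obj != null ? obj.GetType().FullName : "(null)");
    }

    private static void Log(string what, string detail)
    {
        Console.Error.WriteLine("{0:o} {1}: {2}", DateTime.UtcNow, what, detail);
    }
}

public static class TrackingSetup
{
    // Call once at startup in the process whose objects you want to watch.
    public static void Install()
    {
        TrackingServices.RegisterTrackingHandler(new LoggingTrackingHandler());
    }
}
```

With this installed on the server, a "disconnected" line for your object's type shows exactly when the lease manager gave up on it.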