I thought I knew C# : Garbage collection & disposal. Part-II

Okay, we are back in to collect “garbage” properly. If you haven’t the first part of this post you might want to read it here. We are following norms and conventions from the first post here too.

 Starting things from where we left off:

This write up continues after the first one where we ended things with SafeHandle class and IDisposable  interface. I do see how intriguing this is of course. But before that, like a unexpected, miserable commercial break before the actual stuff comes on your TV, let’s go have a look on something called finalization.

What you create, you “might” need to destroy:

If you know a little bit of C++ you’d know that there is a thing called destructor and the name implies it does the exactly opposite of what a constructor does. C++ kind of needs it since it doesn’t really have a proper garbage collection paradigm. But that also raises the question of whether C# has a destructor and if it does what does it do and why do we even have one since we said managed resourced would be collected automatically.

Lets jump onto some code, shall we?

class Cat
{
    ~Cart()
    {
        // cleanup things
    }
}

Wait a minute, now this is getting confusing. I have IDisposable::Dispose() and I have a destructor. Both looks like to have same responsibility. Not exactly. Before we confuse ourselves more, lets dig a bit more in the code. To be honest, C# doesn’t really have a destructor, it’s basically a syntactic sugar and inside c sharp compiler translates this segment as

protected override void Finalize()
{
    try
    {
        // Clean things here
    }
    finally
    {
        base.Finalize();
    }
}

We can pick up to things here. First thing is the destructor essentially translates to a overridden Finalize() method. And it also calls the Finalize() base class in the finally block. So, it’s essentially a recursive finalize(). Wait, we have no clue what is Finalize().

Finalize, who art thou?

Finalize is somebody who is blessed to talk with the garbage collector. If you have already questioning where the topics we discussed in last post comes in help, this is the segment. Lets start getting answers for all our confusions on last segment. First, why Finalize is overridden and where does it gets its signature. Finalize gets its signature from Object class. Okay, understandable. What would be the default implementation then? Well, it doesn’t have a default implementation. You might ask why. A little secret to slip in here, remember the mark step of garbage collector. He marks a type instance for finalization if and only if it has overridden the Finalize() method. It’s your way of telling the garbage collector that you need to do some more work before you can reclaim my memory. Garbage collector would mark this guy and put him in a finalization queue which is essentially a list of objects whose finalization codes must be run before GC can reclaim their memory. So, you are possibly left with one more confusion now. And that is, you understood garbage collector uses this method to finalize things, i.e. doing the necessary cleanup. Now, then why do we need IDisposable where we can just override finalize and be done with it, right?

Dispose, you confusing scum!

Turns out dispose intends to point out a pattern in code that you can use to release unmanaged memory. What I didn’t tell you about garbage collector before is you don’t really know when he would essentially do that, you practically have no clue. That also means if you write an app that uses unmanaged resources like crazy like reading 50-60 files at a time, you are using a lot of scarce resources at the same time. And in the end you are waiting for GC to do his job but that guy has no time table. So, hoarding these resources is not a good idea in the meantime. Since releasing unmanaged resources are developers duty, putting that over a finalize method  and waiting for GC to come and invoke that is a stupid way to go. Moreover, if you send an instance to a finalization queue, it means, GC will essentially do the memory cleanup in the next round, he will only invoke finalize this round. That also means GC has to visit you twice to clear off your unused occupied memory which you kinda need now. And the fact that you might want to release the resources NOW is a pretty good reason itself to not wait for GC to do your dirty work. And, when you actually shut down your application, Mr. GC visits you unconditionally and takes out EVERYONE he finds. I hope the need of Dispose() is getting partially clear to you now. We need a method SomeMethod() that we can call which would clean up the unmanaged resources. If we fail to call that at some point, just to make sure garbage collector can call that we will use that same method inside Finalize() so it is called anyhow. If I have not made a fool out of myself at this point, you have figured out the fact that SomeMethod() is Dispose(). Okay, so we know what we are going to do. Now we need to act on it. We will implement the dispose pattern we have been talking about. The first thing we would do here is we would try to write code that reads a couple of lines from a file. We would try to do it the unmanaged way, then move it to slowly to the managed way and in the process we would see how we can use IDisposable there too.

Doing things the unsafe way:

I stole some code from a very old msdn doc which describes a FileReader class like the following:

using System;
using System.Runtime.InteropServices;
public class FileReader
{
    const uint GENERIC_READ = 0x80000000;
    const uint OPEN_EXISTING = 3;
    IntPtr handle;
[DllImport("kernel32", SetLastError = true)]
    static extern unsafe IntPtr CreateFile(
            string FileName,                    // file name
            uint DesiredAccess,                 // access mode
            uint ShareMode,                     // share mode
            uint SecurityAttributes,            // Security Attributes
            uint CreationDisposition,           // how to create
            uint FlagsAndAttributes,            // file attributes
            int hTemplateFile                   // handle to template file
            );
[DllImport("kernel32", SetLastError = true)]
    static extern unsafe bool ReadFile(
            IntPtr hFile,                       // handle to file
            void* pBuffer,                      // data buffer
            int NumberOfBytesToRead,            // number of bytes to read
            int* pNumberOfBytesRead,            // number of bytes read
            int Overlapped                      // overlapped buffer
            );
[DllImport("kernel32", SetLastError = true)]
    static extern unsafe bool CloseHandle(
            IntPtr hObject   // handle to object
            );
    public bool Open(string FileName)
    {
        // open the existing file for reading
        handle = CreateFile(
                FileName,
                GENERIC_READ,
                0,
                0,
                OPEN_EXISTING,
                0,
                0);
if (handle != IntPtr.Zero)
            return true;
        else
            return false;
    }
    public unsafe int Read(byte[] buffer, int index, int count)
    {
        int n = 0;
        fixed (byte* p = buffer)
        {
            if (!ReadFile(handle, p + index, count, &n, 0))
                return 0;
        }
        return n;
    }
    public bool Close()
    {
        // close file handle
        return CloseHandle(handle);
    }
}

Please remember you need to check your Allow Unsafe Code checkbox in your build properties before you start using this class. Lets have a quick run on the code pasted here. I don’t intend to tell everything in details here because that is not the scope of this article. But we will build up on it, so we need to know a little bit. The DllImport attribute here is essentially something  you would need to use an external dll (thus unmanaged) and map the functions inside it to your own managed class. You can also see that’s why we have used the extern keyword here. The implementations of these methods doesn’t live in your code and thus your garbage collector can’t take responsibility of clean up here. 🙂 The next thing you would notice is the fixed statement. fixed statement essentially link up a managed type to an unsafe one and thus make sure GC doesn’t move the managed type when it collects. So, the managed one stays in one place and points to the unmanaged resource perfectly. So, what are we waiting for? Lets read a file.

static int Main(string[] args)
{
    if (args.Length != 1)
    {
        Console.WriteLine("Usage : ReadFile <FileName>");
        return 1;
    }
    if (!System.IO.File.Exists(args[0]))
    {
        Console.WriteLine("File " + args[0] + " not found.");
        return 1;
    }
    byte[] buffer = new byte[128];
    FileReader fr = new FileReader();
    if (fr.Open(args[0]))
    {
        // Assume that an ASCII file is being read
        ASCIIEncoding Encoding = new ASCIIEncoding();
        int bytesRead;
        do
        {
            bytesRead = fr.Read(buffer, 0, buffer.Length);
            string content = Encoding.GetString(buffer, 0, bytesRead);
            Console.Write("{0}", content);
        }
        while (bytesRead > 0);
        fr.Close();
        return 0;
    }
    else
    {
        Console.WriteLine("Failed to open requested file");
        return 1;
    }
}

So, this is essentially a very basic console app and looks somewhat okay. I have created a byte array of size 128 which I would use as a buffer when I read. FileReader returns 0 when it can’t read anymore. Don’t get confused seeing this.

while (bytesRead > 0)

It’s all nice and dandy to be honest. And it works too. Invoke the application (in this case the name here is  TestFileReading.exe) like the following:

TestFileReading.exe somefile.txt

And it works like a charm. But what I did here is we closed the file after use. What if something happens in the middle, something like the file not being available. Or I throw an exception in the middle. What will happen is the file would not be closed up until my process is not closed. And the GC will not take care of it because it doesn’t have anything in the Finalize() method.

Making it safe:

public class FileReader: IDisposable
{
    const uint GENERIC_READ = 0x80000000;
    const uint OPEN_EXISTING = 3;
    IntPtr handle = IntPtr.Zero;
    [DllImport("kernel32", SetLastError = true)]
    static extern unsafe IntPtr CreateFile(
            string FileName,                 // file name
            uint DesiredAccess,             // access mode
            uint ShareMode,                // share mode
            uint SecurityAttributes,      // Security Attributes
            uint CreationDisposition,    // how to create
            uint FlagsAndAttributes,    // file attributes
            int hTemplateFile          // handle to template file
            );
    [DllImport("kernel32", SetLastError = true)]
    static extern unsafe bool ReadFile(
            IntPtr hFile,                   // handle to file
            void* pBuffer,                 // data buffer
            int NumberOfBytesToRead,      // number of bytes to read
            int* pNumberOfBytesRead,     // number of bytes read
            int Overlapped              // overlapped buffer
            );
    [DllImport("kernel32", SetLastError = true)]
    static extern unsafe bool CloseHandle(
            IntPtr hObject   // handle to object
            );
    public bool Open(string FileName)
    {
        // open the existing file for reading
        handle = CreateFile(
                FileName,
                GENERIC_READ,
                0,
                0,
                OPEN_EXISTING,
                0,
                0);
        if (handle != IntPtr.Zero)
            return true;
        else
            return false;
    }
    public unsafe int Read(byte[] buffer, int index, int count)
    {
        int n = 0;
        fixed (byte* p = buffer)
        {
            if (!ReadFile(handle, p + index, count, &n, 0))
                return 0;
        }
        return n;
    }
    public bool Close()
    {
        // close file handle
        return CloseHandle(handle);
    }
    public void Dispose()
    {
        if (handle != IntPtr.Zero)
            Close();
    }
}

Now, in our way towards making things safe, we implemented IDisposable here. That exposed Dispose() and the first thing I did here is we checked whether the handle is IntPtr.Zero and if it’s not we invoked Close(). Dispose() is written this way because it should be invokable in any possible time and it shouldn’t throw any exception if it is invoked multiple times. But is it the solution we want? Look closely. We wanted to have a Finalize() implementation that will essentially do the same things if somehow Dispose() is not called. Right?

Enter the Dispose(bool) overload. We want the parameter less Dispose() to be used by only the external consumers. We would issue a second Dispose(bool) overload where the boolean parameter indicates whether the method call comes from a Dispose method or from the finalizer. It would be true if it is invoked from the parameter less Dispose() method.

With that in mind our code would eventually be this:

    public class FileReader: IDisposable
    {
        const uint GENERIC_READ = 0x80000000;
        const uint OPEN_EXISTING = 3;
        IntPtr handle = IntPtr.Zero;
        private bool isDisposed;

        SafeHandle safeHandle = new SafeFileHandle(IntPtr.Zero, true);

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe IntPtr CreateFile(
              string FileName,                  // file name
              uint DesiredAccess,              // access mode
              uint ShareMode,                 // share mode
              uint SecurityAttributes,       // Security Attributes
              uint CreationDisposition,     // how to create
              uint FlagsAndAttributes,     // file attributes
              int hTemplateFile           // handle to template file
              );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool ReadFile(
             IntPtr hFile,                // handle to file
             void* pBuffer,              // data buffer
             int NumberOfBytesToRead,   // number of bytes to read
             int* pNumberOfBytesRead,  // number of bytes read
             int Overlapped           // overlapped buffer
             );

        [DllImport("kernel32", SetLastError = true)]
        static extern unsafe bool CloseHandle(
              IntPtr hObject   // handle to object
              );

        public bool Open(string FileName)
        {
            // open the existing file for reading
            handle = CreateFile(
                  FileName,
                  GENERIC_READ,
                  0,
                  0,
                  OPEN_EXISTING,
                  0,
                  0);

            if (handle != IntPtr.Zero)
                return true;
            else
                return false;
        }

        public unsafe int Read(byte[] buffer, int index, int count)
        {
            int n = 0;
            fixed (byte* p = buffer)
            {
                if (!ReadFile(handle, p + index, count, &n, 0))
                    return 0;
            }
            return n;
        }

        public bool Close()
        {
            // close file handle
            return CloseHandle(handle);
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool isDisposing)
        {
            if (isDisposed)
                return;

            if (isDisposing)
            {
                safeHandle.Dispose();
            }

            if (handle != IntPtr.Zero)
                Close();

            isDisposed = true;
        }
    }

Now if you focus on the changes we made here is introducing the following method:

protected virtual void Dispose(bool isDisposing)

Now, this method envisions what we discussed a moment earlier. You can invoke it multiple times without any issue. There are two prominent block here.

  • The conditional block is supposed to free managed resources (Read invoking Dispose() methods of other IDisposable member/properties inside the class, if we have any.)
  • The non-conditional block frees the unmanaged resources.

You might ask why the conditional block tries to dispose managed resources. The GC takes care of that anyway right? Yes, you’re right. Since garbage collector is going to take care of the managed resources anyway, we are making sure the managed resources are disposed on demand if only someone calls the parameter less Dispose(). 

Are we forgetting something again? Remember, you have unmanaged resources and if somehow the Dispose() is not invoked you still have to make sure this is finalized by the garbage collector. Let’s write up a one line destructor here.

 ~FileReader()
 {
    Dispose(false);
 }

It’s pretty straightforward and it complies with everything we said before. Kudos! We are done with FileReader.

Words of Experience:

Although we are safe already. We indeed forgot one thing. If we invoke Dispose() now it will dispose unmanaged and managed resources both. That also means when the garbage collector will come to collect he will see there is a destructor ergo there is a Finalize() override here. So, he would still put this instance into Finalization Queue. That kind of hurts our purpose. Because we wanted to release memory as soon as possible. If the garbage collector has to come back again, that doesn’t really make much sense. So, we would like to suppress the garbage collector to invoke Finalize() if we know we have disposed it ourselves. And a single line modification to the Dispose() method would allow you to do so.

public void Dispose()
{
    Dispose(true);
    GC.SuppressFinalize(this);
}

We added the following statement to make sure if we have done disposing ourselves the garbage collector would not invoke Finalize() anymore.

GC.SuppressFinalize(this);

Now, please keep in mind that you shouldn’t write a GC.SupperssFinalize() in a derived class since your dispose method would be overridden and you would follow the same pattern and call base.Dispose(isDisposing) in the following way:

class DerivedReader : FileReader
{
   // Flag: Has Dispose already been called?
   bool disposed = false;

   // Protected implementation of Dispose pattern.
   protected override void Dispose(bool disposing)
   {
      if (disposed)
         return; 

      if (disposing) {
         // Free any other managed objects here.
         //
      }

      // Free any unmanaged objects here.
      //
      disposed = true;

      // Call the base class implementation.
      base.Dispose(disposing);
   }

   ~DerivedClass()
   {
      Dispose(false);
   }
}

It should be fairly clear to now why we are doing it this way. We want disposal to go recursively to base class. So, when we dispose the derived class resources, the base class disposes its own resources too.

To use or not to use a Finalizer:

We are almost done, really, we are. This is a section where Im supposed to tell you why you really shouldn’t use unmanaged resources whenever you need. It’s always a good idea not to write a Finalizer if you really really don’t need it. Currently we need it because you are using a unsafe file handle and we need to close it manually. To keep destructor free and as managed as possible, we should always wrap our handles in SafeHandle class and dispose the SafeHandle as a managed resource. Thus, eliminating the need for cleaning unmanaged resources and the overloaded Finalize() . You will find more about that here.

“Using” it right:

Before you figure out why I quoted the word using here, let’s finally wrap our work up. We have made our FileReader class disposable and we would like to invoke dispose() after we are done using it. We would opt for a try-catch-finally block to do it and will dispose the resources in the finally block.

FileReader fr = new FileReader();
try
{
    if (fr.Open(args[0]))
    {
        // Assume that an ASCII file is being read
        ASCIIEncoding Encoding = new ASCIIEncoding();
        int bytesRead;
        do
        {
            bytesRead = fr.Read(buffer, 0, buffer.Length);
            string content = Encoding.GetString(buffer, 0, bytesRead);
            Console.Write("{0}", content);
        }
        while (bytesRead > 0);
        return 0;
    }
    else
    {
        Console.WriteLine("Failed to open requested file");
        return 1;
    }
}
finally
{
    if (fr != null)
        fr.Dispose();
}

The only difference you see here that we don’t explicitly call Close() anymore. Because that is already handled when we are disposing the FileReader instance.

Good thing for you is that C# has essentially made things even easier than this. Remember the using statements we used in Part-I? An using statement is basically a syntactic sugar placed on a try-finally block with a call to Dispose() in the finally block just like we wrote it here. Now, with that in mind, our code-block will change to:

using (FileReader fr = new FileReader())
{
    if (fr.Open(args[0]))
    {
        // Assume that an ASCII file is being read
        ASCIIEncoding Encoding = new ASCIIEncoding();
        int bytesRead;
        do
        {
            bytesRead = fr.Read(buffer, 0, buffer.Length);
            string content = Encoding.GetString(buffer, 0, bytesRead);
            Console.Write("{0}", content);
        }
        while (bytesRead > 0);
        return 0;
    }
    else
    {
        Console.WriteLine("Failed to open requested file");
        return 1;
    }
}

Now you can go back to part-I and try to understand the first bits of code we saw. I hope it would make a bit better sense to you now. Hope you like it. Hopefully, if there’s a next part, I would talk about garbage collection algorithms.

You can find the code sample over github here.

One thought on “I thought I knew C# : Garbage collection & disposal. Part-II

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s