On Linux, Pipes, Fork and passing file descriptors to other processes.

I ran into a weird and obscure corner case involving Linux I/O, file descriptors, processes, and fork calls.

Figuring out the problem took me a solid half a day, so I believe it is worth writing about, so that others can benefit as well. The discussion uses Go because it is the language in which the original bug was written, but it should apply to every other language.

The setting of the problem is rather simple: one producer process needs to send data to a consumer process.
The producer is a long-running daemon, while the consumer is created on the spot by the producer itself using something like fork/exec.
The producer needs to receive an answer from the consumer as soon as the consumer finishes with the input.

We chose Linux pipes to implement the inter-process communication.

The algorithm was the following.

  1. Create a pipe to send data to the consumer
  2. Create a pipe to receive data from the consumer
  3. Use fork/exec to spin-up the consumer
  4. Pass to the consumer the reading end of the first pipe and the writing end of the second pipe (in reality this is done in step 3).
  5. Block on reading waiting for the result from the reading end of the second pipe

The communication protocol between the processes is very simple. First, we send 4 bytes with the size of the message, and then we send the whole message.
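A minimal sketch of this framing in Go could look like the following. The byte order of the 4-byte size is not stated above, so big-endian is an assumption here, and `writeMessage`, `readMessage` and `roundTrip` are illustrative names rather than the ones from the original code.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeMessage sends a 4-byte length header followed by the payload.
// Big-endian is an assumption; the article does not specify a byte order.
func writeMessage(w io.Writer, msg []byte) error {
	var header [4]byte
	binary.BigEndian.PutUint32(header[:], uint32(len(msg)))
	if _, err := w.Write(header[:]); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

// readMessage reads the 4-byte header, then exactly that many payload bytes.
// io.ReadFull fails if the stream ends before the message is complete.
func readMessage(r io.Reader) ([]byte, error) {
	var header [4]byte
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return nil, err
	}
	msg := make([]byte, binary.BigEndian.Uint32(header[:]))
	if _, err := io.ReadFull(r, msg); err != nil {
		return nil, err
	}
	return msg, nil
}

// roundTrip frames a message into an in-memory buffer and decodes it again.
func roundTrip(msg string) (string, error) {
	var buf bytes.Buffer
	if err := writeMessage(&buf, []byte(msg)); err != nil {
		return "", err
	}
	out, err := readMessage(&buf)
	return string(out), err
}

func main() {
	out, err := roundTrip("hello consumer")
	fmt.Println(out, err)
}
```

Because both functions only need an `io.Writer` or an `io.Reader`, the same code works on the `*os.File` values returned by `os.Pipe`.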

In general, this process works fine.

Eventually the blocking read on the result pipe completes, whether with a success or with an error, and the consumer decides how to handle everything.

However, sometimes the consumer crashes.

And when the consumer crashes we don’t receive any message on the reading side of the result pipe, which is correct since nobody wrote anything.

However, when the consumer crashed, I was expecting the pipe to close, so that the blocking read call would have the chance to return with an error.

And this did not happen. The blocking read call was there, waiting, and waiting and waiting. And several times we were forced to restart the service in production.

When you create a pipe and the program exits, for any reason, the pipe gets closed automatically by the operating system.
The bytes already in the pipe are kept in the buffer, but the pipe is closed: if you keep reading, you never block and you eventually reach EOF.
This was not happening.
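The expected behaviour, buffered data first and then EOF once the writer is gone, can be checked inside a single process. In the sketch below the function name `readAfterWriterClosed` is hypothetical, and the payload is assumed to be non-empty and shorter than the 100-byte buffer.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readAfterWriterClosed writes payload into a pipe, closes the only write
// descriptor, and then reads twice from the read end. It returns the data
// from the first read and whether the second read reported io.EOF.
func readAfterWriterClosed(payload string) (string, bool, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", false, err
	}
	defer r.Close()

	_, err = w.Write([]byte(payload))
	w.Close() // the only write descriptor goes away here
	if err != nil {
		return "", false, err
	}

	buf := make([]byte, 100)
	n, err := r.Read(buf) // returns the bytes still sitting in the pipe buffer
	if err != nil {
		return "", false, err
	}
	data := string(buf[:n])

	_, err = r.Read(buf) // nothing left and no writer: returns without blocking
	return data, err == io.EOF, nil
}

func main() {
	data, eof, err := readAfterWriterClosed("fooo")
	fmt.Printf("data=%q eof=%v err=%v\n", data, eof, err)
}
```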

In our case, something more complex was happening. When we create the pipes, 4 file descriptors are created (2 for each pipe). During the fork/exec, those 4 file descriptors are all duplicated, and two of the copies are then managed by the consumer. So there are now 8 open file descriptors: four for the first pipe and four for the second pipe.

Now the problem is clear.

The consumer exits and 4 (out of 8) file descriptors are closed. However, there is still one open file descriptor referring to the writing end of the second (result-carrying) pipe: the producer's own copy.

And this open file descriptor prevents the operating system from signalling EOF on the pipe, so the read call never finishes and the whole producer process hangs.
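This rule can be seen in isolation, without a second process, by duplicating the write descriptor by hand. In the sketch below `syscall.Dup` stands in for the leftover copy of the write end; the function name `eofNeedsLastWriter` and the 200 ms wait are my own choices for the illustration.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
	"time"
)

// eofNeedsLastWriter reports whether a read on the pipe was still blocked
// while a duplicate write descriptor was open, and whether it returned
// io.EOF once that duplicate was closed as well.
func eofNeedsLastWriter() (bool, bool, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return false, false, err
	}
	defer r.Close()

	// A second descriptor for the same write end (the "forgotten" copy).
	dupFd, err := syscall.Dup(int(w.Fd()))
	w.Close() // the original write descriptor is closed either way
	if err != nil {
		return false, false, err
	}

	result := make(chan error, 1)
	go func() {
		_, readErr := r.Read(make([]byte, 100))
		result <- readErr
	}()

	select {
	case readErr := <-result:
		// Not expected: the read returned although a writer is still open.
		syscall.Close(dupFd)
		return false, readErr == io.EOF, nil
	case <-time.After(200 * time.Millisecond):
		// Still blocked: the duplicate keeps the write end alive.
	}

	if err := syscall.Close(dupFd); err != nil {
		return true, false, err
	}
	readErr := <-result // with the last writer gone, the read can finish
	return true, readErr == io.EOF, nil
}

func main() {
	blocked, eof, err := eofNeedsLastWriter()
	fmt.Println("blocked while duplicate open:", blocked)
	fmt.Println("EOF after closing duplicate:", eof)
	fmt.Println("error:", err)
}
```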

Once the problem is understood, the fix is quite simple. After the fork/exec, it is sufficient for the producer to close its own copies of the file descriptors it no longer needs, especially the writing end of the second (result-carrying) pipe.
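For the original two-pipe setting, the fix could look roughly like the sketch below. It assumes a Unix-like system with `sh` and `cat` available; the shell one-liners are stand-ins for the real consumer, and `exchange` is a hypothetical helper name. For brevity it sends the raw message without the 4-byte size header, and it assumes messages small enough to fit in the pipe buffer.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// exchange starts the consumer command, sends it request over one pipe and
// returns everything the consumer writes to the other pipe.
// In the child, ExtraFiles[0] becomes fd 3 (request) and ExtraFiles[1] fd 4 (result).
func exchange(request string, name string, args ...string) (string, error) {
	reqRead, reqWrite, err := os.Pipe()
	if err != nil {
		return "", err
	}
	resRead, resWrite, err := os.Pipe()
	if err != nil {
		reqRead.Close()
		reqWrite.Close()
		return "", err
	}
	defer resRead.Close()

	consumer := exec.Command(name, args...)
	consumer.ExtraFiles = []*os.File{reqRead, resWrite}
	startErr := consumer.Start()

	// The fix: the producer closes its own copies of the two ends that
	// belong to the consumer, whether or not the start succeeded.
	reqRead.Close()
	resWrite.Close()
	if startErr != nil {
		reqWrite.Close()
		return "", startErr
	}

	_, writeErr := io.WriteString(reqWrite, request)
	reqWrite.Close() // the consumer sees EOF on fd 3 after the request

	// With no write descriptor left in the producer, this returns as soon as
	// the consumer closes fd 4 or dies, instead of blocking forever.
	result, readErr := io.ReadAll(resRead)
	waitErr := consumer.Wait()

	for _, e := range []error{writeErr, readErr, waitErr} {
		if e != nil {
			return string(result), e
		}
	}
	return string(result), nil
}

func main() {
	// Stand-in consumer (assumption): copies fd 3 to fd 4.
	out, err := exchange("ping", "sh", "-c", "cat <&3 >&4")
	fmt.Printf("echo consumer: %q, err: %v\n", out, err)

	// Stand-in for a crash: exits without touching either pipe.
	out, err = exchange("ping", "sh", "-c", "exit 1")
	fmt.Printf("crashing consumer: %q, err: %v\n", out, err)
}
```

In the crashing case the producer gets an error back (from the write or from `Wait`) instead of hanging, which is the behaviour I was expecting in the first place.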

It is possible to reduce the problem even further by omitting the first pipe. I created an example below that shows all the possible cases.

package main

import (
	"fmt"
	"os"
	"os/exec"
)

/**
 * Note, replace WHATEVER_PATH with what is correct in your system to
 * invoke this same executable a second time
 */
func main() {

	if len(os.Args) > 1 {

		if os.Args[1] == "write" {
			writeOnPipe()
			return
		}
		if os.Args[1] == "crash" {
			crashNoWrite()
			return
		}
	}

	// scenario base (correct implementation)
	// we create a pipe,
	// we pass the writer to the fork,
	// we close our own writer
	// the fork writes
	p1Read, p1Write, _ := os.Pipe()

	cmd1 := exec.Command("$WHATEVER_PATH/go_use_pipe_connection", "write")
	cmd1.ExtraFiles = []*os.File{p1Write}

	err1 := cmd1.Start()
	if err1 != nil {
		fmt.Println("Error in making the clone 1 start", err1)
	}

	p1Write.Close()
	b1 := make([]byte, 100)
	n1, err1 := p1Read.Read(b1)
	if err1 != nil {
		fmt.Println("Error in reading 1", err1)
	} else {
		fmt.Printf("Read %d bytes: %s\n", n1, string(b1[:n1])) // print only the bytes actually read
	}

	// scenario base (wrong implementation)
	// we create a pipe,
	// we pass the writer to the fork,
	// we **don't** close our own writer
	// the fork writes
	// here we are lucky: something got written, so the call unblocks
	p4Read, p4Write, _ := os.Pipe()

	cmd4 := exec.Command("$WHATEVER_PATH/go_use_pipe_connection", "write")
	cmd4.ExtraFiles = []*os.File{p4Write}

	err4 := cmd4.Start()
	if err4 != nil {
		fmt.Println("Error in making the clone 4 start", err4)
	}

	//p4Write.Close() // here we don't close our own writer
	b4 := make([]byte, 100)
	n4, err4 := p4Read.Read(b4)
	if err4 != nil {
		fmt.Println("Error in reading 4", err4)
	} else {
		fmt.Printf("Read %d bytes: %s\n", n4, string(b4[:n4]))
	}

	// scenario ok
	// we create a pipe,
	// we pass the writer to the fork,
	// we close our own writer
	// the fork crashes
	// the read call returns an error (EOF): no write descriptor is left open
	p2Read, p2Write, _ := os.Pipe()

	cmd2 := exec.Command("$WHATEVER_PATH/go_use_pipe_connection", "crash")
	cmd2.ExtraFiles = []*os.File{p2Write}

	err2 := cmd2.Start()
	if err2 != nil {
		fmt.Println("Error in making the clone 2 start", err2)
	}

	p2Write.Close() // here our own close

	b2 := make([]byte, 100)
	n2, err2 := p2Read.Read(b2)
	if err2 != nil {
		fmt.Println("Error in reading 2", err2)
	} else {
		fmt.Printf("Read %d bytes: %s\n", n2, string(b2[:n2]))
	}

	// scenario bug
	// we create a pipe,
	// we pass the writer to the fork,
	// we don't close our own writer
	// the fork crashes
	// the read call hangs
	p3Read, p3Write, _ := os.Pipe()

	cmd3 := exec.Command("$WHATEVER_PATH/go_use_pipe_connection", "crash")
	cmd3.ExtraFiles = []*os.File{p3Write}

	err3 := cmd3.Start()
	if err3 != nil {
		fmt.Println("Error in making the clone 3 start", err3)
	}

	// p3Write.Close() // here we don't close and we wait forever
	b3 := make([]byte, 100)
	n3, err3 := p3Read.Read(b3)
	if err3 != nil {
		fmt.Println("Error in reading 3", err3)
	} else {
		fmt.Printf("Read %d bytes: %s\n", n3, string(b3[:n3]))
	}

}

func writeOnPipe() {
	pipeToWrite := os.NewFile(3, "")
	if pipeToWrite == nil {
		fmt.Println("Error in getting the descriptor")
		return
	}
	pipeToWrite.Write([]byte("fooo"))
}

func crashNoWrite() {
	pipeToWrite := os.NewFile(3, "")
	if pipeToWrite == nil {
		fmt.Println("Error in getting the descriptor")
		return
	}
	// exit without writing anything, simulating a crash
}
