Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: race between os.(*File).Close() and os.(*File).Read() #10001

Closed
tarm opened this issue Feb 25, 2015 · 7 comments
Closed

os: race between os.(*File).Close() and os.(*File).Read() #10001

tarm opened this issue Feb 25, 2015 · 7 comments

Comments

@tarm
Copy link

tarm commented Feb 25, 2015

Here's is a simple reproduction case operating on a regular file (in this case "race.go"). This is with go 1.4.2

package main

import (
    "log"
    "os"
)

func main() {
    f, err := os.Open("race.go")
    if err != nil {
        log.Fatal(err)
    }
    ch := make(chan struct{})
    go func() {
        f.Close()
        close(ch)
    }()
    b := make([]byte, 512)
    n, err := f.Read(b)
    log.Printf("Read %v, %s, %v", n, b[:n], err)
    <-ch
}

In this case, this would obviously be a silly thing to do because there is also a larger race condition in terms of how much data is actually read from the Read() call. The real case this came up was with a serial port device (https://github.com/tarm/goserial) where the Read() and Close() might happen in different goroutines because the Read() can take a long time (serial ports can be configured to block forever until the next character). A syscall.Close() will allow the Read() to return, which is why os.(*File).Close() and Read() are useful to be able to run in different goroutines.

The problematic line in file_unix.go:close() is

        file.fd = -1 // so it can't be closed again

Conceptually, does it make sense that Close() and Read() should be allowed to run at the same time? The cost of a mutex in Close() is not important, but putting a mutex in Read() might have a performance impact. Would a fix be appropriate for the std library? Should I try to work around this in the goserial library by not using os.File?

tarm@linux ~ $ go run -race race.go 
2015/02/25 10:55:52 Read 287, package main

import (
    "log"
    "os"
)

func main() {
    f, err := os.Open("race.go")
    if err != nil {
        log.Fatal(err)
    }
    ch := make(chan struct{})
    go func() {
        f.Close()
        close(ch)
    }()
    b := make([]byte, 512)
    n, err := f.Read(b)
    log.Printf("Read %v, %s, %v", n, b[:n], err)
    <-ch
}, <nil>
==================
WARNING: DATA RACE
Write by goroutine 5:
  os.(*file).close()
      /home/tarm/gosrc/src/os/file_unix.go:109 +0x1d7
  os.(*File).Close()
      /home/tarm/gosrc/src/os/file_unix.go:98 +0x91
  main.func·001()
      /home/tarm/race.go:15 +0x53

Previous read by main goroutine:
  os.(*File).read()
      /home/tarm/gosrc/src/os/file_unix.go:191 +0x5f
  os.(*File).Read()
      /home/tarm/gosrc/src/os/file.go:95 +0xc3
  main.main()
      /home/tarm/race.go:19 +0x32a

Goroutine 5 (running) created at:
  main.main()
      /home/tarm/race.go:17 +0x29a
==================
Found 1 data race(s)
exit status 66
@davecheney
Copy link
Contributor

As I understand it, POSIX does not guarantee the result of calling read(2) in one thread and close(2) in another. This is effectively what is happening here.

@tarm
Copy link
Author

tarm commented Feb 25, 2015

Changing Close() to not write to file.fd yet still only close once per the comment is pretty straightforward (see below). However, setting file.fd to -1 has the positive side affect that file operations after Close() will not be using a stale fd at the syscall level, so we should keep the -1 assignment unless we want to take the file.closedMutex every operation which defeats the purpose.

diff --git a/src/os/file_unix.go b/src/os/file_unix.go
index ff4fc7d..1f31860 100644
--- a/src/os/file_unix.go
+++ b/src/os/file_unix.go
@@ -8,6 +8,7 @@ package os

 import (
        "runtime"
+       "sync"
        "sync/atomic"
        "syscall"
 )
@@ -26,6 +27,9 @@ type file struct {
        name    string
        dirinfo *dirInfo // nil unless directory being read
        nepipe  int32    // number of consecutive EPIPE in Write
+
+       closeMutex sync.Mutex
+       closed     bool
 }

 // Fd returns the integer Unix file descriptor referencing the open file.
@@ -99,14 +103,22 @@ func (f *File) Close() error {
 }

 func (file *file) close() error {
-       if file == nil || file.fd < 0 {
+       if file == nil {
+               return syscall.EINVAL
+       }
+
+       file.closeMutex.Lock()
+       alreadyClosed := file.closed
+       file.closed = true
+       file.closeMutex.Unlock()
+
+       if alreadyClosed {
                return syscall.EINVAL
        }
        var err error
        if e := syscall.Close(file.fd); e != nil {
                err = &PathError{"close", file.name, e}
        }
-       file.fd = -1 // so it can't be closed again

        // no need for a finalizer anymore
        runtime.SetFinalizer(file, nil)

@alobbs
Copy link

alobbs commented Feb 25, 2015

Imagine the following three events happening one after the other:

1. f1.Close()
2. f2 = os.Open()  // Same file descriptor could well be reused
3. f1.Read()

In that case, the call to f1.Read() would actually be reading from an invalid file descriptor. Setting the internal .fd property to -1 prevents that situation.

@davecheney
Copy link
Contributor

I am not disputing that there is a race. What I am suggesting is fixing
this race gives the illusion that concurrent read/close works, and POSIX
doesn't guarantee this.

On Thu, Feb 26, 2015 at 7:41 AM, Alvaro Lopez Ortega <
notifications@github.com> wrote:

Imagine the following three events happening one after the other:

  1. f1.Close()
  2. f2 = os.Open() // Same file descriptor could well be reused
  3. f1.Read()

In that case, the call to f1.Read() would actually be reading from an
invalid file descriptor. Setting the internal .fd property to -1 prevents
that situation.


Reply to this email directly or view it on GitHub
#10001 (comment).

@mikioh mikioh changed the title race between os.(*File).Close() and os.(*File).Read() os: race between os.(*File).Close() and os.(*File).Read() Feb 25, 2015
@tarm
Copy link
Author

tarm commented Feb 26, 2015

@davecheney FYI my first reply was not meant as a reply to your first comment. Looking into it more I agree with you.

From the linux man page:

It is probably unwise to close file descriptors while they may be in use by system calls in other threads in the same process. Since a file descriptor may be reused, there are some obscure race conditions that may cause unintended side effects.

I will fix this at a higher level in the serial library. Closing.

@tarm tarm closed this as completed Feb 26, 2015
@dvyukov
Copy link
Member

dvyukov commented Feb 26, 2015

The problem here is not with file.fd, but with the actual file. Whenever you call Close concurrently with Read/Write you are risking of corrupting arbitrary file or leaking sensitive data to an arbitrary socket.

@davecheney
Copy link
Contributor

@tarm on a closing note, this issue is related to #6222. If all fd io in Go went through a poller, not just the network stuff, then we could have real async close.

@golang golang locked and limited conversation to collaborators Jun 25, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants