group.of.nepali.translators team mailing list archive

[Bug 1558967] Re: libfuse2: race in fuse_daemonize() causes ' Transport endpoint is not connected' (found with cmsfs-fuse)

 

#winning

Reopening the bug report and targeting it to yakkety and xenial.

** Changed in: ubuntu-z-systems
       Status: Incomplete => Triaged

** Changed in: fuse (Ubuntu)
       Status: Incomplete => Triaged

** Changed in: fuse (Ubuntu)
   Importance: Undecided => Medium

** Also affects: fuse (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: fuse (Ubuntu Xenial)
       Status: New => Triaged

** Changed in: fuse (Ubuntu Xenial)
     Assignee: (unassigned) => Dimitri John Ledkov (xnox)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1558967

Title:
  libfuse2: race in fuse_daemonize() causes ' Transport endpoint is not
  connected' (found with cmsfs-fuse)

Status in Ubuntu on IBM z Systems:
  Triaged
Status in fuse package in Ubuntu:
  Triaged
Status in fuse source package in Xenial:
  Triaged

Bug description:
  == Comment: #21 - Hendrik Brueckner - 2016-03-16 06:44:09 ==
  Package: libfuse2
  Version: 2.9.4-1ubuntu2

  The cmsfs-fuse program is used to transfer files from a CMSFS dasd (on
  z/VM) to Linux.  The procedure is to mount, copy files, umount.  All
  commands are issued from within an application over an SSH connection.

  The problem is that the copy intermittently fails with "Transport
  endpoint is not connected".  The procedure is as follows:

     #mount cmsfs
     sudo /usr/bin/cmsfs-fuse /dev/dasdb /usr/wave/wavedisk
     # copy file 
     /bin/cp -f /usr/wave/wavedisk/WAVEDATA.SCRIPT /usr/wave/wavedata
     /bin/cp: cannot stat '/usr/wave/wavedisk/WAVEDATA.SCRIPT': Transport endpoint is not connected
     #umount
     umount /usr/wave/wavedisk

  Because the application uses JSCH to issue the commands, I worked on a
  non-Java reproducer using SSH.

  The problem can be easily re-created with ssh as follows:

  root@r3559004:~# ssh -t root@localhost  "cmsfs-fuse /dev/disk/by-path/ccw-0.0.0190 /CMSFS"
  Connection to localhost closed.
  root@r3559004:~# ls /CMSFS 
  ls: cannot access '/CMSFS': Transport endpoint is not connected

  
  Problem analysis will follow, but note that this is not specific to cmsfs-fuse; the problem might also occur with other fuse file systems that are mounted through an SSH connection.

  == Comment: #23 - Hendrik Brueckner - 2016-03-16 07:07:30 ==
  After debugging and some code review on the libfuse library, I think that
  we identified the root cause.  As suggested, the problem is not related
  to cmsfs-fuse directly.

  The cmsfs-fuse main program calls into the libfuse library using the
  fuse_main() function.  The fuse_main() function later calls the
  fuse_daemonize() function to fork the daemon process that handles the
  fuse file system I/O.

  The fuse_daemonize() function looks as follows:

  int fuse_daemonize(int foreground)
  {
          if (!foreground) {
                  int nullfd;

                  /*
                   * demonize current process by forking it and killing the
                   * parent.  This makes current process as a child of 'init'.
                   */
                  switch(fork()) {
                  case -1:
                          perror("fuse_daemonize: fork");
                          return -1;
                  case 0:
                          break;
                  default:
                          _exit(0);
                  }

                  if (setsid() == -1) {
                          perror("fuse_daemonize: setsid");
                          return -1;
                  }

                  (void) chdir("/");

                  nullfd = open("/dev/null", O_RDWR, 0);
                  if (nullfd != -1) {
                          (void) dup2(nullfd, 0);
                          (void) dup2(nullfd, 1);
                          (void) dup2(nullfd, 2);
                          if (nullfd > 2)
                                  close(nullfd);
                  }
          }
          return 0;
  }

  
  The fuse_daemonize() function calls fork() as usual.  The child proceeds with setsid() and then redirects its file descriptors to /dev/null.  The parent process simply exits.

  The child's function calls and the parent's exit create a subtle race.
  This is seen with an SSH connection.  The SSH command "ssh -t
  root@localhost cmsfs-fuse /dev/disk/by-path/ccw-0.0.0190 /CMSFS"
  runs cmsfs-fuse on an allocated pseudo-terminal device (-t
  option).

  When the parent exits, the SSH command learns that its command has
  finished and closes the connection, that is, it closes the master
  side of the pseudo-terminal.  This causes a SIGHUP to be sent to the
  process group on the pseudo-terminal.  The child might not have
  completed the setsid() call yet and is therefore terminated.  Note
  that fuse sets up its signal handlers only later, after
  fuse_daemonize() has completed.

  Even if the child gets the chance to leave its parent's process
  group and form its own process group with setsid(), the child still
  has the pseudo-terminal open as stdin, stdout, and stderr.  So the
  pseudo-terminal still behaves as the controlling terminal and might
  cause a SIGHUP to be issued when the master side is closed.

  To solve the problem, the parent has to wait until the child (the fuse
  daemon process) has completed its processing, that means, has become
  its own process group with setsid() and closed any file descriptors
  pointing to the pseudo-terminal.

  For example, using a pipe as follows could solve the problem:

  The parent waits on the pipe, then exits:

  read(waiter[0], &completed, sizeof(completed));
  _exit(0);

  The child signals its completion (after redirecting its file descriptors) with:
  completed = 1;
  write(waiter[1], &completed, sizeof(completed));
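
  The pipe handshake outlined above could be fleshed out roughly as in the
  following sketch.  It is only an illustration under stated assumptions,
  not the actual libfuse fix: the names daemonize_with_handshake(),
  wait_for_completion() and signal_completion() are hypothetical, and the
  error handling is one possible choice.  The parent closes its copy of the
  write end so that it sees end-of-file, and still exits, if the child dies
  before signalling.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: block until one byte arrives on fd.
 * Returns 0 if the byte was read, -1 on EOF or error. */
int wait_for_completion(int fd)
{
    char completed;
    ssize_t n;

    do {
        n = read(fd, &completed, sizeof(completed));
    } while (n == -1 && errno == EINTR);
    return n == (ssize_t) sizeof(completed) ? 0 : -1;
}

/* Hypothetical helper: write one byte to fd to signal completion. */
int signal_completion(int fd)
{
    char completed = 1;
    ssize_t n;

    do {
        n = write(fd, &completed, sizeof(completed));
    } while (n == -1 && errno == EINTR);
    return n == (ssize_t) sizeof(completed) ? 0 : -1;
}

/* Sketch of a daemonize routine in which the parent waits for the child. */
int daemonize_with_handshake(int foreground)
{
    int waiter[2];
    int nullfd;

    if (foreground)
        return 0;

    if (pipe(waiter) == -1) {
        perror("daemonize: pipe");
        return -1;
    }

    switch (fork()) {
    case -1:
        perror("daemonize: fork");
        close(waiter[0]);
        close(waiter[1]);
        return -1;
    case 0:
        break;                  /* child continues below */
    default:
        /* Parent: drop the write end, wait for the child, then exit. */
        close(waiter[1]);
        (void) wait_for_completion(waiter[0]);
        _exit(0);
    }

    close(waiter[0]);

    if (setsid() == -1) {
        perror("daemonize: setsid");
        close(waiter[1]);       /* parent sees EOF and exits */
        return -1;
    }

    if (chdir("/") == -1)
        perror("daemonize: chdir");

    nullfd = open("/dev/null", O_RDWR, 0);
    if (nullfd != -1) {
        (void) dup2(nullfd, 0);
        (void) dup2(nullfd, 1);
        (void) dup2(nullfd, 2);
        if (nullfd > 2)
            close(nullfd);
    }

    /* Only now is it safe for the parent to exit. */
    (void) signal_completion(waiter[1]);
    close(waiter[1]);
    return 0;
}
```

  With this ordering, the parent cannot exit (and so the SSH session cannot
  close the pseudo-terminal) before the child has called setsid() and
  redirected stdin, stdout, and stderr away from the terminal.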

  == Comment: #24 - Gerald Schaefer - 2016-03-16 08:18:20 ==
  The race can also be triggered w/o ssh, by using "setsid -c", and I can also reproduce it w/o cmsfs-fuse but with sshfs:

  root@s3545003:~# setsid -c sshfs geraldsc@tuxmaker: sshfs/
  root@s3545003:~# ls sshfs 
  ls: cannot access 'sshfs': Transport endpoint is not connected

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1558967/+subscriptions