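GlusterFS fails to mount at boot

I'm running the official GlusterFS 3.5 packages on Ubuntu 12.04, as both client and server, and everything seems to work fine except mounting GlusterFS volumes at boot. This is what I see in the log file: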

    [2014-06-17 08:20:52.969258] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/public_uploads /var/www/shared/public/uploads)
    [2014-06-17 08:20:52.998985] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
    [2014-06-17 08:20:52.999048] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
    [2014-06-17 08:20:53.000373] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused)
    [2014-06-17 08:20:53.000427] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 127.0.0.1 (No data available)
    [2014-06-17 08:20:53.000442] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
    [2014-06-17 08:20:53.013793] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x27) [0x7f686e0160f7] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x1a4) [0x7f686e019cc4] (-->/usr/sbin/glusterfs(+0xcada) [0x7f686e6ddada]))) 0-: received signum (1), shutting down
    [2014-06-17 08:20:53.013830] I [fuse-bridge.c:5444:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.

My fstab contains:

    proc /proc proc defaults 0 0
    /dev/xvda / ext4 noatime,errors=remount-ro 0 1
    /dev/xvdb none swap sw 0 0
    /dev/xvdc /var/lib/glusterfs/brick01 ext4 defaults 1 2
    127.0.0.1:/private_uploads /var/www/shared/private/uploads glusterfs defaults,_netdev 0 0

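I know this used to be a bug in Ubuntu's GlusterFS 3.2, but I understood that it had been fixed in the GlusterFS 3.4 PPA packages, as noted here: https://bugs.launchpad.net/ubuntu/+source/glusterfs/+bug/876648

I also remember this working in an experiment I ran on a virtual machine (but since it just worked, I didn't dig any deeper). I see that the gluster-client package ships an upstart job named mount-glusterfs.conf, which contains: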

    author "Louis Zuckerman <me@louiszuckerman.com>"
    description "Block the mounting event for glusterfs filesystems until the network interfaces are running"
    instance $MOUNTPOINT
    start on mounting TYPE=glusterfs
    task
    exec start wait-for-state WAIT_FOR=static-network-up WAITER=mounting-glusterfs-$MOUNTPOINT

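But I'm not quite sure how it is supposed to work, and it doesn't seem to work out of the box. Even if the glusterfs volumes are mounted after networking is up, that still happens before GlusterFS itself has started: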

     * Starting RPC portmapper replacement [ OK ]
     * Stopping rpcsec_gss daemon [ OK ]
     * Starting Start this job to wait until rpcbind is started or fails to s[ OK ]
     * Starting configure network device [ OK ]
     * Stopping Start this job to wait until rpcbind is started or fails to s[ OK ]
     * Starting Bridge socket events into upstart [ OK ]
     * Starting NSM status monitor [ OK ]
     * Stopping cold plug devices [ OK ]
     * Stopping log initial device creation [ OK ]
     * Starting load fallback graphics devices [ OK ]
     * Starting configure network device security [ OK ]
     * Starting load fallback graphics devices [fail]
     * Starting configure virtual network devices [ OK ]
     * Starting Send an event to indicate plymouth is up [ OK ]
     * Stopping Send an event to indicate plymouth is up [ OK ]
     * Starting Mount network filesystems [ OK ]
     * Stopping configure virtual network devices [ OK ]
     * Stopping Mount network filesystems [ OK ]
     * Starting Mount network filesystems [ OK ]
     * Stopping Mount network filesystems [ OK ]
     * Starting configure network device [ OK ]
     * Starting set sysctls from /etc/sysctl.conf [ OK ]
     * Stopping set sysctls from /etc/sysctl.conf [ OK ]
    The disk drive for /var/www/shared/public/uploads is not ready yet or not present.
    Continue to wait, or Press S to skip mounting or M for manual recovery
     * Starting Waiting for state [fail]
     * Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
    mountall: Event failed
    Mount failed. Please check the log file for more details.
     * Starting GNU Screen Cleanup [ OK ]
     * Starting flush early job output to logs [ OK ]
     * Starting base [ OK ]
     * Starting save udev log and update rules [ OK ]
     * Starting OpenSSH server [ OK ]
     * Stopping Failsafe Boot Delay [ OK ]
     * Starting System V initialisation compatibility [ OK ]
     * Stopping save udev log and update rules [ OK ]
     * Stopping Mount filesystems on boot [ OK ]
     * Stopping GNU Screen Cleanup [ OK ]
     * Stopping flush early job output to logs [ OK ]
     * Starting system logging daemon [ OK ]
     * Stopping System V initialisation compatibility [ OK ]
     * Starting System V runlevel compatibility [ OK ]
     * Starting save kernel messages [ OK ]
     * Starting deferred execution scheduler [ OK ]
     * Starting CPU interrupts balancing daemon [ OK ]
     * Starting regular background program processing daemon [ OK ]
     * Starting automatic crash report generation [ OK ]
     * Starting GlusterFS Management Daemon [ OK ]

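Any ideas about what is going on and/or how to fix it?

As an alternative that I'm not very fond of, I tried mounting these volumes from an upstart job. I added noauto to my glusterfs fstab entries so that they are not mounted automatically at boot, and created an upstart job with this content: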

    description "Mount public uploads"
    start on started glusterfs-server
    exec mount /var/www/shared/public/uploads

When I rebooted the server, the volume was not mounted. /var/log/upstart/mount_public_uploads.log contains:

    Mount failed. Please check the log file for more details.

And /var/log/glusterfs/var-www-shared-public-uploads.log contains:

    [2014-06-19 15:01:47.170299] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/public_uploads /var/www/shared/public/uploads)
    [2014-06-19 15:01:47.190852] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
    [2014-06-19 15:01:47.190933] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
    [2014-06-19 15:01:50.613939] I [dht-shared.c:311:dht_init_regex] 0-public_uploads-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
    [2014-06-19 15:01:50.616107] I [socket.c:3561:socket_init] 0-public_uploads-client-0: SSL support is NOT enabled
    [2014-06-19 15:01:50.616128] I [socket.c:3576:socket_init] 0-public_uploads-client-0: using system polling thread
    [2014-06-19 15:01:50.616158] I [client.c:2273:notify] 0-public_uploads-client-0: parent translators are ready, attempting connect on transport
    Final graph:
    +------------------------------------------------------------------------------+
      1: volume public_uploads-client-0
      2:     type protocol/client
      3:     option remote-host koraga.int.example.com
      4:     option remote-subvolume /var/lib/glusterfs/brick01/public_uploads
      5:     option transport-type socket
      6:     option username 51275c7d-33b4-46cc-b8e9-9c06b5dfcda5
      7:     option password 36401ce2-18e7-427e-b126-30d2d9351480
      8:     option transport.socket.ssl-enabled off
      9: end-volume
     10:
     11: volume public_uploads-dht
     12:     type cluster/distribute
     13:     subvolumes public_uploads-client-0
     14: end-volume
     15:
     16: volume public_uploads-write-behind
     17:     type performance/write-behind
     18:     subvolumes public_uploads-dht
     19: end-volume
     20:
     21: volume public_uploads-read-ahead
     22:     type performance/read-ahead
     23:     subvolumes public_uploads-write-behind
     24: end-volume
     25:
     26: volume public_uploads-io-cache
     27:     type performance/io-cache
     28:     subvolumes public_uploads-read-ahead
     29: end-volume
     30:
     31: volume public_uploads-quick-read
     32:     type performance/quick-read
     33:     subvolumes public_uploads-io-cache
     34: end-volume
     35:
     36: volume public_uploads-open-behind
     37:     type performance/open-behind
     38:     subvolumes public_uploads-quick-read
     39: end-volume
     40:
     41: volume public_uploads-md-cache
     42:     type performance/md-cache
     43:     subvolumes public_uploads-open-behind
     44: end-volume
     45:
     46: volume public_uploads
     47:     type debug/io-stats
     48:     option latency-measurement off
     49:     option count-fop-hits off
     50:     subvolumes public_uploads-md-cache
     51: end-volume
     52:
    +------------------------------------------------------------------------------+
    [2014-06-19 15:01:50.619723] E [client-handshake.c:1742:client_query_portmap_cbk] 0-public_uploads-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
    [2014-06-19 15:01:50.619795] I [client.c:2208:client_rpc_notify] 0-public_uploads-client-0: disconnected from 192.168.134.227:24007. Client process will keep trying to connect to glusterd until brick's port is available
    [2014-06-19 15:01:50.629922] I [fuse-bridge.c:4946:fuse_graph_setup] 0-fuse: switched to graph 0
    [2014-06-19 15:01:50.630166] I [fuse-bridge.c:3883:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
    [2014-06-19 15:01:50.630473] W [fuse-bridge.c:739:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
    [2014-06-19 15:01:50.642752] I [fuse-bridge.c:4787:fuse_thread_proc] 0-fuse: unmounting /var/www/shared/public/uploads
    [2014-06-19 15:01:50.643121] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6d5111c3fd] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f6d513efe9a] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x7f6d51ee91b5]))) 0-: received signum (15), shutting down
    [2014-06-19 15:01:50.643144] I [fuse-bridge.c:5444:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.

Of these, I think this is the important line:

    [2014-06-19 15:01:50.619723] E [client-handshake.c:1742:client_query_portmap_cbk] 0-public_uploads-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
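Picking that line out by eye is tedious in a log this long. As a rough sketch, the error-level entries could be filtered with grep. The helper name `gluster_errors` is hypothetical (it is not part of GlusterFS), and the pattern assumes the log file has one entry per line in the `[timestamp] E [source] ...` form shown above:

```shell
#!/bin/sh
# gluster_errors: hypothetical helper, not shipped with GlusterFS.
# Prints only the error-level ("E") entries of a GlusterFS client log.
# Assumes one log entry per line, formatted as "[date time] LEVEL [source] ...".
gluster_errors() {
    grep -E '^\[[0-9-]+ [0-9:.]+\] E \[' "$1"
}

# Example call (path taken from the question):
# gluster_errors /var/log/glusterfs/var-www-shared-public-uploads.log
```

Informational ("I") and warning ("W") entries are dropped, so entries such as the failed port lookup stand out.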

If I manually run service mount_public_uploads start, it mounts just fine. Maybe it is trying to mount before glusterfs is ready?

This appears to be a known issue which, according to README.Ubuntu, should be fixed in Ubuntu 14.04.

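For earlier Ubuntu releases, a possible workaround is to use a custom upstart job that defers mounting the volume until after the GlusterFS server has started.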
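The log in the question suggests that "started glusterfs-server" does not guarantee the brick port is available yet, so such a job may also need to retry. Below is a minimal sketch of a retry wrapper in POSIX sh. The function name `retry_mount` and the attempt count and delay in the example call are assumptions chosen for illustration, not anything provided by GlusterFS or Ubuntu:

```shell
#!/bin/sh
# retry_mount: hypothetical helper, not part of GlusterFS or Ubuntu.
# Usage: retry_mount ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
retry_mount() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example call (values are illustrative assumptions):
# retry_mount 10 3 mount /var/www/shared/public/uploads
```

In an upstart job this would go inside a script ... end script stanza in place of the single exec mount line, keeping start on started glusterfs-server and the noauto option in fstab.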

Maybe you installed glusterfs-client from Ubuntu's repository before adding the PPA? In that case you would not have the latest version.