
dispynode
*********

**dispynode.py** program should be running on each of the nodes
(servers). It executes jobs for dispy clients; i.e., jobs submitted by
JobCluster or SharedJobCluster. Usually no options are needed to run
this program; '-d' option may be useful to see log of jobs being
executed.

Below are various options to invoke dispynode program:

* "-d" enables debug messages that show trace of execution.

* "-c n" or "--cpus=n" sets the number of processing units to *n*.
  Without this option, dispynode will use all the processing units
  available on that node. If *n* is positive, it must be at least 1
  and at most number of processing units on that node; dispynode will
  then use at most *n* processors. If *n* is negative, then that many
  processing units are not used by dispynode.

* "-i addr" or "--ip_addr=addr" directs dispynode to use given
  *addr* for communication, instead of the IP address associated with
  the host name.

* "--ext_ip_addr=addr" directs dispynode to announce *addr* in
  network communication so that the node can be used if it is behind
  NAT firewall/gateway that is configured to use *addr*. See
  NAT/Firewall Forwarding below.

* "-p n" or "--node_port=n" directs dispynode to use given port *n*
  instead of default port 51348.

* "--name=name" associates given name to the node. If this option is
  not given, result of socket's *gethostname()* is used as name.

* "-s secret" or "--secret=secret" directs dispynode to use 'secret'
  for hashing handshake communication with dispy scheduler; i.e., this
  node will only work with clients that use same secret (see *secret*
  option to JobCluster and *node_secret* option to *dispyscheduler*).

* "--dest_path_prefix=path" directs dispynode to use *path* as
  prefix for storing files sent by dispy scheduler. If a cluster uses
  *dest_path* option (when creating cluster with JobCluster or
  SharedJobCluster), then *dest_path* is appened to *path* prefix.
  With this, files from different clusters can be automatically stored
  in different directories, to avoid conflicts. Unless *cleanup=False*
  option is used when creating a cluster, dispynode will remove all
  files and directories created after the cluster is terminated.

* "--scheduler_node=addr": If the node is in the same network as the
  dispy scheduler or when no jobs are scheduled at the time dispynode
  is started, this option is not necessary. However, if jobs are
  already scheduled and scheduler and node are on different networks,
  the given *addr* is used for handshake with the scheduler.

* "--scheduler_port=n" directs dispynode to use port *n* to
  communicate with scheduler. Default value is 51347. When using this
  option, make sure dispy scheduler is also directed to use same port.

* "--keyfile=path" is path to file containing private key for SSL
  communication, same as 'keyfile' parameter to ssl.wrap_socket of
  Python ssl module. This key may be stored in 'certfile' itself, in
  which case this must be *None* (default).  Same file must be used as
  *keyfile* parameter for JobCluster.

* "--certfile=path" is path to file containing SSL certificate, same
  as 'certfile' parameter to ssl.wrap_socket of Python ssl module.
  Same file must be used as *certfile* parameter for JobCluster.

* "--max_file_size n" specifies maximum size of any file transferred
  from/to clients. If size of a file transferred exceeds *n*, the file
  will be truncated. *n* can be specified as a number >= 0, with an
  optional suffix letter k (indicating *n* kilobytes), m (*n*
  megabytes), g (*n* gigabytes) or t (*n* terrabytes). If *n* is 0,
  there is no maximum limit.

* "--zombie_interval=n" indicates dispynode to assume a scheduler is
  a zombie if there is no communication from it for *n* minutes.
  dispynode doesn't terminate jobs submitted by a zombie scheduler;
  instead, when all the jobs scheduled are completed, the node frees
  itself from that scheduler so other schedulers may use the node.

* "--clean" indicates dispynode should remove any files saved from
  previous runs. dispynode saves any files sent by dispy clients and
  information about jobs' execution results that couldn't be sent to
  clients (because of network failures, clients crashed etc.). The
  cleaning is done once when the node is starting. If dispynode is
  left running for a long time, it may be advisable to periodically
  remove such files (perhaps files that were accessed before a certain
  time). Note that dispy sets the timestamps of files saved from dispy
  client computations to the timestamps on the clients, so
  modification times of such files may not be a good measure to know
  if the files are still in use.

* "--msg_timeout n" specifies timeout value in seconds for socket
  I/O operations with the client / scheduler. The default value is 5
  seconds. If the network is slow, this timeout can be increased.
  Bigger timeout values than necessary will cause longer delays in
  recognizing communication failures.


NAT/Firewall Forwarding
=======================

As explained in *dispy* documentation, *ext_ip_addr* can be used in
case dispynode is behding a NAT firewall/gateway and the NAT forwards
UDP and TCP ports 51348 to the IP address where dispynode is running.
Thus, assuming NAT firewall/gateway is at (public) IP address a.b.c.d,
dispynode is to run at (private) IP address 192.168.5.33 and NAT
forwards UDP and TCP ports 51348 to 192.168.5.33, dispynode can be
invoked as:

   dispynode.py -i 192.168.5.33 --ext_ip_addr a.b.c.d

If multiple dispynodes are needed behind a.b.c.d, then each must be
started with different 'port' argument and those ports must be
forwarded to nodes appropriately. For example, to continue the
example, if 192.168.5.34 is another node that can run dispynode, then
it can be started on it as:

   dispynode.py -i 192.168.5.34 -p 51350 --ext_ip_addr a.b.c.d

and configure NAT to forward UDP and TCP ports 51350 to 192.168.5.34.
Then dispy client can use the nodes with:

   cluster = JobCluster(compute, nodes=[('a.b.c.d', 51347), ('a.b.c.d', 51350)])
