www.marmakoide.org

The homepage of Alexandre Devert

sssched

Overview

sssched is a program that launches N tasks on M computers through a tunnel. It is designed to work in a Unix environment. It is very useful for launching many number-crunching experiments on a cluster of machines, assuming those machines are accessible through ssh. The program is written in Python by Christian Gagne and myself.

Quick manual

  1. Download the script here

  2. To run the script directly, give it execution rights with the following command

    chmod 744 sssched
    
  3. You need a file with the list of the machines you want to use, let's call that file machines.lst. It contains one machine name or IP address per line; empty lines are skipped. Here is an example of such a file

    uber-computer-01
    uber-computer-02
    uber-computer-03
    129.175.5.145
    
  4. You need a file with the list of the commands you want to execute, let's call that file tasks.lst. It contains one command per line. Empty lines are skipped.

    launchxp.sh mutation=0.1 out=run.1.out
    launchxp.sh mutation=0.1 out=run.2.out
    launchxp.sh mutation=0.1 out=run.3.out
    launchxp.sh mutation=0.1 out=run.4.out
    launchxp.sh mutation=0.9 out=run.5.out
    launchxp.sh mutation=0.9 out=run.6.out
    launchxp.sh mutation=0.9 out=run.7.out
    launchxp.sh mutation=0.9 out=run.8.out
    

    Here, the commands are calls to a launch script with parameters, the typical case for some experiments.

  5. To launch all those commands by scheduling them on the set of machines you specified, just run the following command:

    ./sssched -m machines.lst -c tasks.lst
    

    or, in a more verbose fashion

    ./sssched --machines=machines.lst --commands=tasks.lst
    

    If you have a momentary lapse of memory about the command line, ask for help

    ./sssched --help
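
Writing tasks.lst by hand gets tedious once there are many runs. Below is a minimal sketch of a generator for the task file, not part of sssched itself. launchxp.sh and its mutation= and out= parameters are taken from the example in step 4; the function names make_tasks and write_list are hypothetical and chosen only for illustration.

```python
# Hypothetical helpers for producing a task file like the one in step 4.
# They are not part of sssched; only the one-command-per-line format is.

def make_tasks(mutations, runs_per_setting):
    """Return one command line per run, with runs numbered from 1."""
    lines = []
    run_id = 1
    for mutation in mutations:
        for _ in range(runs_per_setting):
            lines.append(
                f"launchxp.sh mutation={mutation} out=run.{run_id}.out"
            )
            run_id += 1
    return lines


def write_list(path, lines):
    """Write one entry per line, as expected for machines.lst and tasks.lst."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
```

Calling write_list("tasks.lst", make_tasks([0.1, 0.9], 4)) writes the same eight commands as the example task file.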
    

Some remarks:

  • The tasks will be scheduled on the machines from the list. As soon as a machine is free, a new task is launched on it, so no time is wasted when the execution time of your experiments varies.
  • sssched always runs with a very low process priority, so it is unlikely to slow down your computer.
  • sssched detects if a machine is not reachable.
  • To use a machine N times, just put its name or IP address N times in the machines list.
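
The scheduling behaviour described in these remarks can be sketched as follows. This is an illustration written under assumptions, not the actual sssched source: it assumes one worker thread per line of the machines list and a plain call to the ssh client for each task. The names read_list, run_over_ssh and schedule are hypothetical. Process priority and the detection of unreachable machines are left out.

```python
import queue
import subprocess
import threading


def read_list(path):
    """Read a machines or tasks file: one entry per line, empty lines skipped."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def run_over_ssh(machine, command):
    """Run one command on one machine with the ssh client; return its exit status."""
    return subprocess.call(["ssh", machine, command])


def schedule(machines, commands, runner=run_over_ssh):
    """Run every command once, using one worker per entry of the machines list.

    A machine listed N times gets N workers, so it runs up to N tasks at once.
    Returns a list of (machine, command, exit_status) tuples.
    """
    pending = queue.Queue()
    for command in commands:
        pending.put(command)

    results = []
    results_lock = threading.Lock()

    def worker(machine):
        # As soon as this slot is free, it takes the next pending command.
        while True:
            try:
                command = pending.get_nowait()
            except queue.Empty:
                return
            status = runner(machine, command)
            with results_lock:
                results.append((machine, command, status))

    threads = [threading.Thread(target=worker, args=(m,)) for m in machines]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With this sketch, schedule(read_list("machines.lst"), read_list("tasks.lst")) would run the tasks over ssh. The runner parameter exists only so the scheduling logic can be tried without any remote machine, for example with runner=lambda machine, command: 0.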