This note describes the logic of mas-slave.c.

Some notable global variables are:
  pending[]
  ...

The typical calling sequence seen by a task is:

main() [User]
  [enter mas-slave.c]
  init_master_slave() [Is this called here?]
  master_slave()
    SET par_loop_get_task [MAYBE get_task_global A BETTER NAME?]
    raw_master_slave()
      SET get_task_result_global
      SET update_environment_global
      init_master_slave() [if not yet called]
      par_call() [Defined in threads-mpi.c or threads.c; Par. stuff begins]
        body_master_slave()
          runtime()
          if (is_master()) goto MASTER else goto SLAVE
==========================
MASTER:
  walltime()
  parallel_loop()
    LOOP:
      wait_for_slave() [while num_free_slaves == 0]
      task = get_task() [same as par_loop_get_task()]
      [NOTE: set_task() callable directly by user, if raw_master_slave()
       invoked by user, and user submits his own version of parallel_loop()]
      set_task(task) [if mas_slave_is_task (if there is a current task)]
        wait_for_slave() [if num_free_slaves < num_slaves]
          recv_msg()
          action = get_task_result_global()
          switch(action)
            case REDO: send_task()
            case CONTINUE: send_msg()
            case NO_ACTION: add_free_slave()
            case UPDATE: add_free_slave()
              broadcast_command(result, task) [if not is_shared_memory]
                [CURRENTLY NOT IMPLEMENTED -- ALSO NOT CALLED CURRENTLY]
              update_environment() [same as update_environment_global()]
        send_task(task)
    wait_for_all_slaves()
      wait_for_slave() [until num_free_slaves = num_slaves]
  send_msg(doneMessage)
  master_stats()
SLAVE:
  LOOP:
    result = recv_msg()
    tag = get_last_tag()
    result == doneMessage: break
    result == confirmMessage: /* ping better than confirm ? */
      send_msg(answerMessage)
    tag == BROADCAST_TAG:
      update_environment()
    else:
      do_task(result)
      send_msg(task)
  LOOP_END:
  slave_stats()
  finalize()
==================================================================
MSG() sets these:
  mas_slave_is_msg set to TRUE
  mas_slave_msg_size
  mas_slave_msg_ptr
send_msg() uses them and resets mas_slave_is_msg to FALSE
==================================================================
enum tag {DUMMY_TAG = NUM_PEND_TAGS, PING_TAG, ACK_TAG, DONE_TAG,
          BCAST_RESULT_TAG, BCAST_TASK_TAG};

Tasks and results are always accompanied by a tag less than NUM_PEND_TAGS.

Currently, the MPI datatype is always assumed to be MPI_CHAR. This will
not work, of course, for general heterogeneous architectures, but then
the user would have to declare the types of his tasks/results.
==================================================================
mas-slave.c calls attach_new_slave(). This is defined in threads-mpi.c
and threads.c by default as a function returning 0. If threads-mpi.c is
compiled with LOADER defined, attach_new_slave() is defined to call
poll_new_slave() and MPI_Spawn2(). Also, init_loaderctl() is called from
threads-mpi.c in that case.

MPI_Spawn2() is defined in mpinu/spawn2.c. If spawn2.c is not linked into
the final executable, then the executable will not require symbols from
loader/*. loader/* defines poll_new_slave() and other utilities.

When invoked, init_loaderctl() sets up a well-known port to be used by
the binary loader/loader. That binary then downloads an unexec'ed
checkpoint file (called topcjob locally), and starts it running.
./topcjob starts at main() and invokes MPI_Init(), where it talks to
MPI_Spawn2() on the master process, which integrates it as a new slave.
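==================================================================
The MSG()/send_msg() handshake and the tag convention described earlier
in this note can be sketched as follows. This is only an illustration,
not the code in mas-slave.c: the value of NUM_PEND_TAGS, the function
names MSG_sketch(), send_msg_sketch() and is_task_or_result_tag(), and
their signatures are all assumptions made for the example. The "wire"
buffer stands in for an MPI send of MPI_CHAR data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value; the note does not say how many pending tags exist. */
#define NUM_PEND_TAGS 4

/* Enum as given in the note: control tags start at NUM_PEND_TAGS. */
enum tag {DUMMY_TAG = NUM_PEND_TAGS, PING_TAG, ACK_TAG, DONE_TAG,
          BCAST_RESULT_TAG, BCAST_TASK_TAG};

/* Globals named in the note; the types are assumed. */
static int    mas_slave_is_msg   = 0;    /* nonzero while a message is staged */
static size_t mas_slave_msg_size = 0;
static void  *mas_slave_msg_ptr  = NULL;

/* Tasks and results carry a tag less than NUM_PEND_TAGS. */
static int is_task_or_result_tag(int tag)
{
    return tag < NUM_PEND_TAGS;
}

/* Assumed shape of MSG(): stage a buffer and mark it as pending. */
static void MSG_sketch(void *ptr, size_t size)
{
    mas_slave_msg_ptr  = ptr;
    mas_slave_msg_size = size;
    mas_slave_is_msg   = 1;
}

/* Assumed shape of send_msg(): copy the staged bytes into `wire` and
   reset the flag.  Returns the number of bytes copied, or 0 if nothing
   was staged or the message does not fit (the message then stays staged). */
static size_t send_msg_sketch(char *wire, size_t cap)
{
    size_t n;

    assert(wire != NULL);
    if (!mas_slave_is_msg || mas_slave_msg_size > cap)
        return 0;
    n = mas_slave_msg_size;
    if (n > 0)
        memcpy(wire, mas_slave_msg_ptr, n);
    mas_slave_is_msg = 0;   /* send_msg() resets the flag */
    return n;
}
```

A caller would stage a buffer with MSG_sketch(buf, size) and then call
send_msg_sketch(wire, sizeof wire). A second call without a new
MSG_sketch() returns 0, because the flag has been reset.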