<?xml version="1.0"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" 
               "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<sect1 id="boinc">
  <title>BOINC</title>

  <sect2>
    <title>Terminology</title>

    <para>
      The term <emphasis>workunit</emphasis> means approximately the same thing
      for DC-API as for BOINC.
    </para>

    <para>
      Both the DC-API and BOINC uses the term <emphasis>result</emphasis>, but
      they mean different things. In BOINC, results are instances of a work
      unit waiting to be downloaded or currently under execution. The DC-API
      result is what BOINC calls the <emphasis>canonical result</emphasis>.
      This means that when BOINC generates multiple results (e.g. for
      redundant computation), the DC-API will not be notified about the status
      of individual BOINC results; instead, it will be notified only if the
      canonical result is found or the whole work unit is marked as failed by
      BOINC.
    </para>

    <para>
      In the following sections, <emphasis>result</emphasis> will mean the
      DC-API term, while <emphasis>BOINC result</emphasis> will refer to the
      BOINC definition.
    </para>

    <para>
      The DC-API master application is an <emphasis>assimilator</emphasis> in
      BOINC terms.
    </para>

  </sect2>

  <sect2>
    <title>Configuration options</title>

    <sect3>
      <title>Master side</title>

      <para>
	<warning>
	  <para>
	    Important note: the directories specified by
	    <literal>WorkingDirectory</literal>,
	    <literal>ProjectRootDir</literal> and the upload &amp; download
	    directories specified in BOINC's <filename>config.xml</filename>
	    must all reside on the same filesystem since the DC-API uses the
	    <function>link()</function> and <function>rename()</function> system
	    calls.
	  </para>
	</warning>
      </para>

      <variablelist>
	<varlistentry>
	  <term>InstanceUUID</term>
	  <listitem>
	    <para>
	      REQUIRED. The value must be a Universally Unique Identifier. The
	      value must be unique for every master application running on the
	      same grid backend. If two master applications are started with the
	      same <literal>InstanceUUID</literal> value, their behaviour is
	      undefined.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>ProjectRootDir</term>
	  <listitem>
	    <para>
	      OPTIONAL. The location of the project's root directory.
	      This is the directory that contains
	      <filename>config.xml</filename> and other BOINC-related
	      subdirectories.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>UploadURL</term>
	  <listitem>
	    <para>
	      OPTIONAL. The upload handler's URL to send output files of
	      processed BOINC workunits to.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>InputURLRewriteRegExpMatch</term>
	  <listitem>
	    <para>
	      OPTIONAL. This variable along with the <literal>InputURLRewriteRegExpReplace</literal>
	      varaible can be used to rewrite input file URLs based on regular expressions. The
	      variable <literal>InputURLRewriteRegExpMatch</literal> defines the match part of the
	      regular expression, whereas the varibale <literal>InputURLRewriteRegExpReplace</literal>
	      defines the replacement part of the regular expression. An example value of this variable
	      is <literal>attic://([^/]*).*/([^/]*)$</literal>.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>InputURLRewriteRegExpReplace</term>
	  <listitem>
	    <para>
	      OPTIONAL. This variable along with the <literal>InputURLRewriteRegExpMatch</literal>
	      varaible can be used to rewrite input file URLs based on regular expressions. The
	      variable <literal>InputURLRewriteRegExpReplace</literal> defines the replace part of the
	      regular expression, whereas the varibale <literal>InputURLRewriteRegExpMatch</literal>
	      defines the match part of the regular expression. An example value of this variable
	      is <literal>http://\1/dl/redir/\2\nhttp://localhost:12345/data/\2</literal>.
	    </para>
	  </listitem>
	</varlistentry>
      </variablelist>
    </sect3>

    <sect3>
      <title>Per-client configuration</title>

      <variablelist>
	<varlistentry>
	  <term>Redundancy</term>
	  <listitem>
            <anchor id="DC-API-Boinc-Redundancy"/>
	    <para>
	      OPTIONAL. Integer value specifying the quorum required to consider
	      the work unit as valid. The default value is 1. If this value is N,

	      <itemizedlist>
		<listitem>
		  <para>
		    N + log(N) initial BOINC results will be created. If one of
		    them finishes, a new one will be created automatically until
		    the work unit either succeeds or fails.
		  </para>
		</listitem>
		<listitem>
		  <para>
		    The work unit will be considered failed if more than N +
		    log(N + 2) + 1 BOINC results fail.
		  </para>
		</listitem>
		<listitem>
		  <para>
		    The work unit will be considered failed if there are N +
		    log(N + 2) + 1 successful results but the validator could
		    not find a canonical result.
		  </para>
		</listitem>
		<listitem>
		  <para>
		    The work unit will be considered failed if the state of the
		    work unit is still not decided after 2 * (N + log(N + 2))
		    BOINC results have been received.
		  </para>
		</listitem>
	      </itemizedlist>
	      <note>
		<para>
		  When the redundancy is greater than 1, the work unit can not
		  be suspended using <function><link
		      linkend="DC-suspendWU">DC_suspendWU()</link></function>.
		
		  In the following the options are listed which allow fine tuning 
		  redundancy. These options are mutually exclusive with <function><link
	      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>.
		</para>
	      </note>
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MinQuorum</term>
	  <listitem>
	    <para>
		 OPTIONAL. Integer value specifying the quorum required to consider
		 the work unit as valid. The default value is 1.
		</para>
		<note>
		 <para>
			This option is mutually exclusive with <function><link
		      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>. 
			<literal>MinQuorum</literal>, <literal>TargetNResults</literal>, 
			<literal>MaxErrorResults</literal> and <literal>MaxTotalResults</literal> 
			should be used combined.
		 </para>
		</note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>TargetNResults</term>
	  <listitem>
	    <para>
		 OPTIONAL. Integer value specifying the number of initial BOINC results to be 
		 created. The default value is <literal>MinQuorum</literal>.
		</para>
		<note>
		 <para>
			This option is mutually exclusive with <function><link
		      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>. 
			<literal>MinQuorum</literal>, <literal>TargetNResults</literal>, 
			<literal>MaxErrorResults</literal> and <literal>MaxTotalResults</literal> 
			should be used combined.
		 </para>
		</note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxErrorResults</term>
	  <listitem>
	    <para>
		 OPTIONAL. Integer value specifying the maximum number of failed BOINC results 
		 for a work unit. The default value is 0.
		</para>
		<note>
		 <para>
			This option is mutually exclusive with <function><link
		      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>. 
			<literal>MinQuorum</literal>, <literal>TargetNResults</literal>, 
			<literal>MaxErrorResults</literal> and <literal>MaxTotalResults</literal> 
			should be used combined.
		 </para>
		</note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxTotalResults</term>
	  <listitem>
	    <para>
		 OPTIONAL. Integer value specifying the total number of BOINC results for a 
		 work unit. The default value is <literal>MinQuorum</literal>.
		</para>
		<note>
		 <para>
			This option is mutually exclusive with <function><link
		      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>. 
			<literal>MinQuorum</literal>, <literal>TargetNResults</literal>, 
			<literal>MaxErrorResults</literal> and <literal>MaxTotalResults</literal> 
			should be used combined.
		 </para>
		</note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxSuccessResults</term>
	  <listitem>
	    <para>
		 OPTIONAL. Integer value specifying the maximum number of successful BOINC results 
		 for a work unit. The default value is <literal>MinQuorum</literal>.
		</para>
		<note>
		 <para>
			This option is mutually exclusive with <function><link
		      linkend="DC-API-Boinc-Redundancy">Redundancy</link></function>. 
			<literal>MinQuorum</literal>, <literal>TargetNResults</literal>, 
			<literal>MaxErrorResults</literal> and <literal>MaxTotalResults</literal> 
			should be used combined.
		 </para>
		</note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxOutputSize</term>
	  <listitem>
	    <para>
	      OPTIONAL. Max. size of any output files the client application
	      generates.  The default is 256 KiB. If the size of an output file
	      exceeds this value, the BOINC core client will not upload that
	      file and will report the BOINC result as failed.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxMemUsage</term>
	  <listitem>
	    <para>
	      OPTIONAL. Max. memory usage of the client application. The default
	      is 128 MiB. Hosts with less available memory will not download
	      work units for this application. Also, if the applications's real
	      memory usage exceeds this limit, the BOINC core client aborts the
	      application and reports the BOINC result as failed.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxDiskUsage</term>
	  <listitem>
	    <para>
	      OPTIONAL. Max. disk usage of the client application, including all
	      output and temporary files. The default is 64 MiB. Hosts with less
	      usable disk space will not download work units for this
	      application.  Also, if the application's disk usage exceeds this
	      limit, the BOINC core client aborts the apllication and reports
	      the BOINC result as failed.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>EstimatedFPOps</term>
	  <listitem>
	    <para>
	      OPTIONAL. The estimated run-time of the client application,
	      expressed in the number of floating point operations. The default
	      is 10<superscript>13</superscript>. This value is used by the
	      BOINC server to decide whether a given host is eligible to run a
	      work unit and is also used by the BOINC core client for scheduling
	      decisions.
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>MaxFPOps</term>
	  <listitem>
	    <para>
	      OPTIONAL. Max. CPU usage of the client application, expressed in
	      the number of floating point operations. The default is
	      10<superscript>15</superscript>. If the application uses more CPU
	      time than this value divided by the CPU's speed, then the BOINC
	      core client aborts the application and reports the BOINC result as
	      failed.
	    </para>
	    <note>
	      <para>
		As per recommendations in the BOINC documentation, the value of
		<literal>MaxFPOps</literal> should be several times larger than
		the expected run time of a work unit on an avarage host.
	      </para>
	    </note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>DelayBound</term>
	  <listitem>
	    <para>
	      Time in seconds the BOINC server waits for a result to finish. If
	      a client has donwloaded a BOINC result and did not finish in the
	      given time, the result is considered failed and a new one is
	      generated.
	    </para>
	    <note>
	      <para>
		If <literal>DelayBound</literal> is smaller than the estimated
		run time of the application on a given host (calculated by
		dividing <literal>EstimatedFPOps</literal> by the host's speed),
		then the BOINC result will not be offered for download. If no
		host is fast enough to complete the application within the
		specified time limit, the result will remain unsent for an
		unspecified amount of time and DC-API will receive no feedback
		for it.
	      </para>
	    </note>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>EnableSuspend</term>
	  <listitem>
	    <para>
	      OPTIONAL. Boolean value telling if work units for this client can
	      be suspended using <function><link
		  linkend="DC-suspendWU">DC_suspendWU()</link></function> or
	      not. The default value is false.
	      <note>
		<para>
		  When the redundancy is greater than 1, the work unit can not
		  be suspended using <function><link
		      linkend="DC-suspendWU">DC_suspendWU()</link></function>,
		  regardless of the value of this configuration option.
		</para>
	      </note>
	    </para>
	  </listitem>
	</varlistentry>
	<varlistentry>
	  <term>NativeClient</term>
	  <listitem>
	    <para>
	      OPTIONAL. Boolean value telling if the client application uses the
	      native BOINC API instead of DC-API. This will prevent adding
	      DC-API specific input and output files to the workunit
	      description.
	    </para>
	  </listitem>
	</varlistentry>
      </variablelist>
    </sect3>

    <sect3>
      <title><anchor id="Boinc-Config"/>Considerations for BOINC configuration</title>

      <para>
	If you want to use master-to-client messaging, you must enable it in the
	BOINC project's configuration by making sure that the
	<literal>&lt;msg_to_host/&gt;</literal> tag is present in
	<filename>config.xml</filename>. Client-to-master messaging is always
	enabled and does not require configuration.
      </para>
    </sect3>

  </sect2>

  <sect2>
    <title>Backend-specific issues</title>

    <sect3>
      <title>Deploying the application</title>

      <para>
	Deploying the application consists of two steps: registering the client
	application(s) in the BOINC database, and running the master daemon.
      </para>
      <para>
	All client applications should be compiled for every platform you need,
	and installed under the project's <filename>apps</filename> directory.
	The BOINC name of the client application must be the same as the master
	uses when it calls <function><link
	    linkend="DC-createWU">DC_createWU()</link></function>. See the BOINC
	documentation about how the client binaries should be named and placed
	and how they should be registered in the database.
      </para>
      <para>
	The most common method of deploying the master application is to run it
	as a BOINC daemon by adding it to BOINC's
	<filename>config.xml</filename>. See the BOINC documentation for
	details. Other methods of deploying the master application depending on
	how it was designed are also possible, but the following rules must be
	fulfilled:
	<itemizedlist>
	  <listitem>
	    <para>
	      The master application must have access to the BOINC project's
	      <filename>config.xml</filename>
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      The BOINC <application>file_deleter</application> process must
	      have enough privileges to be able to remove files and directories
	      created by the master application. If the master runs under the
	      same user account as the BOINC daemons, this is usually not a
	      problem.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      The master application must be able to create files and
	      directories under the project's <filename>download</filename>
	      directory, and it must be able to access files under the project's
	      <filename>upload</filename> directory.
	    </para>
	  </listitem>
	</itemizedlist>
      </para>

      <!-- XXX Update when the validator API is added -->
      <para>
	Besides the master and client applications, you must also define a
	validator for the application in <filename>config.xml</filename>. If you
	are not using redundancy then you may use the
	<filename>sample_trivial_validator</filename> that comes with BOINC.
	This validator accepts everything without checking.
      </para>
      <para>
	If redundancy is desired, you may use the
	<application>validator_for_dcapi</application> validator which does a
	textual (meaning converting between UNIX and Windows line endings)
	comparison of the first output file.
      </para>
      <warning>
	<para>
	  If you are running multiple master applications under the same BOINC
	  project, and you want to use
	  <filename>sample_trivial_validator</filename> for any of them, then
	  you must use it for all of them. This restricition exists for any
	  other validator that is not DC-API aware, since it can not determine
	  which work unit belongs to which master and therefore which results
	  should it validate and which ones should it leave alone.
	</para>
      </warning>
    </sect3>

    <sect3>
      <title>Redundant computation</title>

      <para>
	Redundancy is very important if you are running computations on
	untrusted clients instead but may even be useful on dedicated clients to
	protect from hardware failures. Besides deliberate tampering with the
	output, clients may also produce incorrect results due to hardware
	problems like bad memory, overheating or faulty CPU or simply disk
	corruption.
      </para>
      <para>
	Redundant computing means sending the same work unit to multiple
	different clients and comparing the results. The comparison is performed
	by a tool BOINC calls <emphasis>validator</emphasis>. The validator
	usually is application-specific as it must understand the output file
	format to filter out unimportant noise (like different line endings on
	different operating systems, or small differences between floating point
	results due to the different rounding characteristics of different CPU
	architectures).
      </para>
      <para>
	Redundancy can be enabled in DC-API on a per client application basis by
	adding the appropriate <link
	  linkend="DC-API-Boinc-Redundancy"><literal>Redundancy</literal></link>
	value to the client's configuration group.
      </para>
      <para>
	If redundancy is enabled for a client application, work units for that
	client can not be suspended. The reason that it is generally impossible
	to compare the state of two BOINC results suspended at two different
	stage of their execution. If one of the suspended results is already
	corrupt and is restarted, the validator will no longer recognize the
	corruption since all new results starting from the corrupted initial
	state will produce the same but bad output.
      </para>

    </sect3>

    <sect3>
      <title>Messaging</title>

      <para>
	BOINC provides a limited messaging support that is accessible thru the
	DC-API <function><link
	    linkend="DC-sendWUMessage">DC_sendWUMessage()</link></function> and
	<function><link
	    linkend="DC-sendMessage">DC_sendMessage()</link></function>
	functions on the master and client side, respectively.
	<note>
	  <para>
	    See the note about <link linkend="Boinc-Config">configuration</link>
	    requirements for master-to-client messaging.
	  </para>
	</note>
      </para>
      <para>
	BOINC messaging has several restrictions:

	<itemizedlist>
	  <listitem>
	    <para>
	      Messages can only be sent to BOINC results that are currently
	      running. If a work unit has no running result, messages sent to it
	      are silently discarded.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      If redundancy is enabled, master-to-client messages are sent to
	      all running BOINC results regardless their state. In case of
	      client-to-master messages, the master cannot tell which BOINC
	      result sent the message. This means that "request-response" style
	      messaging is hard to implement correctly when redundancy is
	      enabled.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      Messages sent by the master are delivered only when the client
	      connects to the master next time. Since the master has no control
	      over this, the client should periodically send messages to the
	      master to force a connection if timely receiving of messages sent
	      by the master is important. Be caraful about the extra load placed
	      on the BOINC server by clients sending messages too frequently.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      When multiple messages are being queued in either direction due to
	      the client not connecting to the server frequently enough, they
	      will be delivered to the peer in an undefined order.
	    </para>
	  </listitem>
	</itemizedlist>
      </para>
    </sect3>

    <sect3>
      <title>Cancelling a running work unit</title>

      <para>
	The <function><link
	    linkend="DC-cancelWU">DC_cancelWU()</link></function> function can
	be used to cancel a running work unit. This function is implemented
	by sending a special message to all running BOINC results. This implies
	that unless clients where BOINC results for this work unit are running
	connect back to the BOINC server, the cancel request may not be
	delivered until the client finishes the computation.
      </para>
      <para>
	Due to a race condition between various components of the BOINC system,
	it is also possible that a new BOINC result is created and is sent out
	after the work unit has been cancelled. Such BOINC result will not
	receive the cancellation request and will run until it finishes its
	computation. Its result however will not be reported to the DC-API
	master, so the application should not be concerned about this.
      </para>
    </sect3>

    <sect3>
      <title><anchor id="Boinc-Result"/>When is a result reported</title>

      <!-- XXX Rework when the validator API is merged -->
      <para>
	The BOINC core client handles the completion of a BOINC result in two
	phases: first it uploads the output files, then it notifies the BOINC
	server that the result has been finised. The validator will notice the
	completion of the BOINC result only when this notification is received.
      </para>
      <para>
	However, this notification is sent only when the core client has to
	connect the BOINC server the next time, which may be a long time if the
	core client has already started processing the next BOINC result while
	the output files of the previous result were being uploaded.
      </para>
      <para>
	When there are no more work units to download, the client sleeps for a
	couple of minutes before trying again. This means that the reporting of
	the completion of the last work unit may be delayed for a couple of
	minutes even after all its output files have been uploaded.
      </para>
      <para>
	The DC-API master application will receive notification about a result
	when the validator has made its decision. This may also introduce some
	delay after all BOINC results have been completed.
      </para>
    </sect3>

    <sect3>
      <title>Work unit priority</title>

      <para>
	The priority of a work unit can be set either by using the
	<function><link
	    linkend="DC-setWUPriority">DC_setWUPriority()</link></function>
	function or by specifying it in the configuration file using the
	<literal>DefaultPriority</literal> key. Either way, the priority can
	be an arbitrary 32-bit integer.
      </para>
      <para>
	The BOINC scheduler dispatches higher priority work units first. Results
	belonging to work units with lower priorities will not be offered to
	clients until all the higher priority work units are exhausted.
      </para>
    </sect3>

    <sect3>
      <title>Common errors</title>

      <para>
	There are some common errors:

	<variablelist>
	  <varlistentry>
	    <term>No results are reported</term>
	    <listitem>
	      <para>
		Check the validator.  When there is no validator defined in
		<filename>config.xml</filename> or the validator fails for some
		reason, the DC-API master will not receive result notifications.
	      </para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>The final result is not reported</term>
	    <listitem>
	      <para>
		There is a couple minutes <link
		  linkend="Boinc-Result">delay</link> before reporting the final
		result. It is normal.
	      </para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>
	      I've fixed a bug in a client application, but results are still
	      computed using the old client
	    </term>
	    <listitem>
	      <para>
		Be sure to give the new client binary with a version number
		greater than the old client, or otherwise the clients will not
		notice that the binary has been updated and will not download
		it.
	      </para>
	    </listitem>
	  </varlistentry>
	</variablelist>
      </para>
    </sect3>

    <sect3>
      <title>Open issues</title>

      <para>
	The following list contains the known problems with DC-API's BOINC
	backend:
      </para>

      <itemizedlist>
	<listitem>
	  <para>
	    Messages are not removed from the <literal>msg_to_host</literal>
	    and <literal>msg_from_host</literal> tables by the
	    <application>db_purge</application> tool, so they need to be cleaned
	    up manually from time to time to prevent the database from being
	    filled up.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    The DC-API creates result template files in the
	    <filename>templates</filename> subdirectory in the project's root
	    directory, but those files are never removed.
	  </para>
	</listitem>
      </itemizedlist>
    </sect3>

  </sect2>
</sect1>
<!-- vim: set ai sw=2 tw=80: -->
