\documentclass[a4paper,final,10pt]{article}

\title{MLton Cross Compiler Bootstrap Overview}
\author{\'{A}noq of the Sun, Hardcore Processing
\footnote{\copyright 1998-2001 \'{A}noq of the Sun (alias Johnny Andersen)}}

\begin{document}

\maketitle

\section{Introduction}

This document gives an overview of the bootstrapping
process for turning MLton into a cross compiler.

\section{Notation}

The notation follows \cite{partialeval} and
\cite{notesdat2vprog}.
In brief, the function
$[| \cdot |]_L$ maps a program to its meaning
according to the semantics of the language $L$.
For example, running the program
$source$ written in the language $S$ with input $in_1, in_2, \ldots, in_n$ means the same as
running an interpreter $int$ for the language $S$, written in the language $L$,
with the program $source$ and its input $in_1, in_2, \ldots, in_n$ as inputs to $int$. This is
expressed with the following notation:

\begin{displaymath}
[|source|]_S [in_1, in_2, \ldots, in_n] = [|int|]_L [source, in_1, in_2, \ldots, in_n]
\end{displaymath}

Compiling the program $source$ with a compiler $comp$ written in the language $L$ is
expressed as:

\begin{displaymath}
target = [|comp|]_L [source]
\end{displaymath}

If $comp$ performs a correct translation of programs from the language $S$ to some target language $T$, the following should hold:

\begin{displaymath}
output = [|source|]_S [in_1, in_2, \ldots, in_n] = [|target|]_T [in_1, in_2, \ldots, in_n]
\end{displaymath}

For simplicity we assume that there are no termination problems.
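
As a small worked step, the two equations above can be combined by substituting the definition of $target$ into the correctness condition. This is only a restatement under the same assumptions (a correct $comp$ and no termination problems), not an additional requirement:

\begin{displaymath}
[|source|]_S [in_1, in_2, \ldots, in_n] = [| \, [|comp|]_L [source] \, |]_T [in_1, in_2, \ldots, in_n]
\end{displaymath}

Read from the inside out: $comp$ runs on $L$ and produces a program in $T$, which is then run on $T$ with the original input.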

\newpage

\section{Compilation of MLton}

\subsection{Abbreviations}

\begin{itemize}
\item $MLton_{SML}$ : The Standard ML source code for MLton
\item $MLton_{binLinux}$ : A compiled version of MLton for Linux (assumed to be on x86)
\item $Linux$ : The Linux platform (also considered a ``language'')
\end{itemize}

\subsection{Native Linux Compilation of MLton}

Compiling MLton with itself once is usually done as follows:

\begin{displaymath}
MLton_{binLinux} = [|MLton_{binLinux}|]_{Linux} [MLton_{SML}, "linux", consts]
\end{displaymath}

where the input string $"linux"$ can be considered a command-line argument naming the
platform to compile to, and $consts$ is the constants file that MLton uses during compilation.
In practice $consts$ is usually generated by MLton itself, but we pretend that it has already been generated, since
one of the purposes of this document is to find out how to generate the $consts$ files for
cross compilation.
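
In terms of the notation from the previous section, this step uses $MLton_{binLinux}$ as $comp$, $Linux$ as $L$ and $MLton_{SML}$ as $source$. Assuming the compilation is correct, the resulting binary should behave like the source it was compiled from. As a sketch, writing $SML$ for the Standard ML language and $p$ for an arbitrary program to be compiled (both names are introduced here only for illustration):

\begin{displaymath}
[|MLton_{binLinux}|]_{Linux} [p, "linux", consts] = [|MLton_{SML}|]_{SML} [p, "linux", consts]
\end{displaymath}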

\subsection{Bootstrapping MLton into a Linux-to-Win32 Cross Compiler}

To turn the compiler into a cross compiler we would need to do the following
compilation step:

\begin{displaymath}
MLton_{binWin32Cross} = [|MLton_{binLinux}|]_{Linux}[MLton_{SML}, "win32bootstrap", consts_{Win32Bootstrap}]
\end{displaymath}

Cross compiling applications with the resulting compiler is done as:

\begin{displaymath}
App_{binWin32} = [|MLton_{binWin32Cross}|]_{Linux}[App_{SML}, "win32cross", consts_{Win32Cross}]
\end{displaymath}

And we hope to run this application on Win32:

\begin{displaymath}
output = [|App_{binWin32}|]_{Win32}[in_1, in_2, \ldots, in_n]
\end{displaymath}
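
Combining the last two equations with the correctness condition from the Notation section gives the property that the whole chain is meant to satisfy. This is a sketch that assumes the cross compilation step is correct; $SML$ denotes the Standard ML language and is a name introduced only for this illustration:

\begin{displaymath}
[|App_{SML}|]_{SML}[in_1, in_2, \ldots, in_n] = [| \, [|MLton_{binWin32Cross}|]_{Linux}[App_{SML}, "win32cross", consts_{Win32Cross}] \, |]_{Win32}[in_1, in_2, \ldots, in_n]
\end{displaymath}

The inner evaluation happens on Linux and the outer one on Win32, which is what makes $MLton_{binWin32Cross}$ a cross compiler.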

The interesting parts are how $MLton_{binWin32Cross}$, $App_{binWin32}$,
$consts_{Win32Bootstrap}$ and $consts_{Win32Cross}$ are generated.
The way that
$consts_{Win32Bootstrap}$ and $consts_{Win32Cross}$ are generated
can probably be controlled just by the flags $"win32bootstrap"$
and $"win32cross"$.

To get an overview of how
$MLton_{binWin32Cross}$
is generated, and of how the $MLton_{SML}$ code should be modified to achieve that,
it is probably a good idea to pretend that the $MLton_{SML}$ code, the $consts_{Win32Bootstrap}$ file and
the resulting binary $MLton_{binWin32Cross}$ are each divided into 2 disjoint parts:

\begin{itemize}
\item The part of the code that reads and writes compiled files and other compiler input and output, or otherwise interfaces with the host environment (hostIO)
\item The part of the code that generates the code and basis library functions that are to be executed (outgen)
\end{itemize}

The constants file for the outgen part will have to be split further into 2 parts:

\begin{itemize}
\item The constants that are used to implement the outgen algorithms in the MLton code (impl)
\item The constants that are used in the produced output (prod)
\end{itemize}

We will split the cross compiling equation into 2 equations. To
simplify the notation slightly and focus on the parts we need to
solve, we rename $MLton_{binLinux}$ to
$M_n$ (MLton native), $MLton_{SML}$ to $S$, $consts_{Win32Bootstrap}$ to $c_b$ and $"win32bootstrap"$ to $"b"$:

\begin{displaymath}
MLton_{binWin32Cross_{hostIO}} = [|M_n|]_{Linux}[S_{hostIO}, "b", c_{b_{hostIO}}]
\end{displaymath}

\begin{displaymath}
MLton_{binWin32Cross_{outgen}} = [|M_n|]_{Linux}[S_{outgen}, "b", c_{b_{outgen_{impl}}}, c_{b_{outgen_{prod}}}]
\end{displaymath}

When we compile an application by running the entire $MLton_{binWin32Cross}$ on Linux,
each of the compiled parts does its work in a different way:

\begin{itemize}
\item $MLton_{binWin32Cross_{hostIO}}$ : Must produce files etc. as is done on Linux.
From this we can conclude that $S_{hostIO}$ must be compiled for Linux and that $c_{b_{hostIO}}$
should be Linux constants.
\item $MLton_{binWin32Cross_{outgen}}$ : Must generate code and basis library calls for Win32.
From this we can conclude that $S_{outgen}$ must be compiled for Linux and generate code for Win32.
This means that the constants $c_{b_{outgen_{impl}}}$ should be Linux constants
and that $c_{b_{outgen_{prod}}}$ should be Win32 constants.

\end{itemize}

The file $consts_{Win32Cross}$ should be a constants file with only Win32 constants. 
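
The conclusions above can be collected in one place. The table below only restates what has already been derived; the column headings are chosen here for illustration:

\begin{center}
\begin{tabular}{lll}
Constants & Used for & Platform \\
\hline
$c_{b_{hostIO}}$ & host I/O of the cross compiler & Linux \\
$c_{b_{outgen_{impl}}}$ & implementation of the outgen algorithms & Linux \\
$c_{b_{outgen_{prod}}}$ & output produced by outgen & Win32 \\
$consts_{Win32Cross}$ & cross compiling applications & Win32 \\
\end{tabular}
\end{center}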

\begin{thebibliography}{99}
\bibitem{notesdat2vprog} Nils Andersen, Fritz Henglein, Neil D. Jones, \emph{Notes for Dat2V-Programminglanguages at DIKU}, 1999.
\bibitem{partialeval} Carsten K. Gomard, Neil D. Jones, Peter Sestoft, \emph{Partial Evaluation and Automatic Program Generation}, Prentice Hall, 1993.
\end{thebibliography}

\end{document}
