\documentclass[11pt]{article}

%% the following two lines are essential for lhs2TeX:
%include lhs2TeX.fmt
%include lhs2TeX.sty

%% ghc -O --make -o HsDep HsDep.lhs
%% lhs2TeX --poly HsDep.lhs > HsDep.ltx; pdflatex HsDep.ltx
%% lhs2TeX --poly HsDep.lhs > HsDep.ltx; latex HsDep.ltx; dvips -Ppdf -z HsDep.dvi

\title{HsDep: Dependency Graph Generator for Haskell}
\author{\textsc{Wolfram Kahl}\\[1ex]
Software Quality Research Laboratory, McMaster University}
\date{2004-06-21}

%{{{ Settings
\parindent0pt
\parskip0.5ex
\setlength{\voffset}{-15mm}
\setlength{\textheight}{255mm}
\setlength{\hoffset}{-15mm}
\setlength{\textwidth}{160mm}
%}}}

\pagestyle{empty}
\begin{document}
\maketitle
\thispagestyle{empty}

This is a wrapper around the dependency generation facilities of GHC
that produces a dot graph of module dependencies.

Usage:

\centerline{
\texttt{HsDep $[$--excludes="\emph{modules} $...$"$]$ \emph{graphname} \emph{GHCoptions} $...$ \emph{files} $...$}
}

This calculates a dependency graph among the Haskell modules
contained in \texttt{\emph{files}}
(assuming usual Haskell file naming conventions),
except that dependencies on any of the modules listed in the
``\texttt{--excludes}'' argument are omitted
(quotes are necessary only for protecting space-separated multiple
 \texttt{\emph{modules}} against interference from the shell).
The dot representation of the resulting dependency graph
is saved as \texttt{\emph{graphname}.dot},
and dot is invoked to convert it into the PostScript file
\texttt{\emph{graphname}.ps}.
%
All the \texttt{\emph{GHCoptions} $...$} and \texttt{\emph{files} $...$}
are passed unchanged to the dependency generation invocation of GHC.

The remainder of this document is the literate Haskell implementation code.

\begin{code}
import System                        -- Haskell 98 imports
import IO
import List (partition, isPrefixOf)
import Dot                           -- companion module
\end{code}

Since full-fledged command-line parsing would require careful distinction
of HsDep options from GHC options,
we only extract the \texttt{--excludes} option in a simple way.

\begin{code}
exclopt = "--excludes="
\end{code}

After that, we assume that the first remaining argument is the dot graph name.

The implementation is presented top-down:

\begin{code}
main = do
  (excl, name : args) <- fmap (partition (exclopt `isPrefixOf`)) getArgs
  let dotfile = name ++ ".dot"
      psfile  = name ++ ".ps"
      depfile = name ++ ".depend"
      excludes = concatMap (words . drop (length exclopt)) excl
  putStrLn . unwords $ excludes
\end{code}
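
The argument handling above can be tried in isolation.
The following is only a sketch:
the helper \texttt{splitArgs} is hypothetical and not part of HsDep,
it returns \texttt{Nothing} where the pattern match in the implementation
would fail for a missing graph name,
and it imports the hierarchical \texttt{Data.List}
instead of the Haskell 98 module \texttt{List}
so that it runs on a current GHC.

\begin{verbatim}
import Data.List (isPrefixOf, partition)

exclopt :: String
exclopt = "--excludes="

-- Hypothetical helper: split the raw arguments into
-- (excluded modules, graph name, remaining arguments).
-- Yields Nothing if no graph name is left over.
splitArgs :: [String] -> Maybe ([String], String, [String])
splitArgs raw =
  case partition (isPrefixOf exclopt) raw of
    (excl, name : rest) ->
      Just (concatMap (words . drop (length exclopt)) excl, name, rest)
    _ -> Nothing

main :: IO ()
main = do
  print (splitArgs ["--excludes=Prelude List", "deps", "-isrc", "Main.hs"])
  -- prints: Just (["Prelude","List"],"deps",["-isrc","Main.hs"])
  print (splitArgs ["--excludes=Prelude"])
  -- prints: Nothing
\end{verbatim}

Since \texttt{partition} preserves the order of the remaining arguments,
the graph name is always the first argument
that does not start with \texttt{--excludes=}.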

Since it seems to be impossible to have ``\textsf{ghc -M}''
output the generated dependencies to |stdout|,
we need to save them to a file;
we use \texttt{\emph{graphname}.depend} for this purpose
and do not remove it after processing,
in case it is needed for other purposes.

We then have GHC write the dependencies into that file,
and parse the file contents into dependency pairs:

\begin{code}
  system $ unwords (mkdepCommand depfile excludes : args)
  deps <- fmap parseDepFile $ readFile depfile
\end{code}

The dependencies are then output as a dot graph,
and \textsf{dot} is invoked to convert to PostScript.

\begin{code}
  writeFile dotfile . show . dotOfDeps name $ deps
  system $ unwords ["dot", "-Tps", dotfile, ">", psfile]
\end{code}

This finishes the definition of |main|.

For writing dependencies into a given file path,
currently the following invocation is necessary:

\begin{code}
mkdepCommand depfile excludes =
  unwords . ("ghc -M" :) . map ("-optdep" ++) .
            ("-f" :) . (depfile :) . map ("--exclude-module=" ++) $ excludes
\end{code}
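
To see what this composition produces,
the following sketch repeats the definition
and prints the resulting command one word per line.
The file name and the module names are made up for illustration.

\begin{verbatim}
mkdepCommand :: FilePath -> [String] -> String
mkdepCommand depfile excludes =
  unwords . ("ghc -M" :) . map ("-optdep" ++) .
            ("-f" :) . (depfile :) . map ("--exclude-module=" ++) $ excludes

main :: IO ()
main = mapM_ putStrLn (words (mkdepCommand "g.depend" ["Prelude", "List"]))
-- prints:
--   ghc
--   -M
--   -optdep-f
--   -optdepg.depend
--   -optdep--exclude-module=Prelude
--   -optdep--exclude-module=List
\end{verbatim}

Every option intended for the dependency generator,
including the file name that follows \texttt{-f},
is prefixed with \texttt{-optdep} separately.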

We represent a module dependency as a pair of module names,
encoded as strings:

\begin{code}
type ModDep = (String,String)
\end{code}

For parsing dependency files,
we filter out comments first
and expect all remaining lines to be file dependencies,
which are converted into module dependencies by |depFromLine|.
We are only interested in the irreflexive part of this relation,
and therefore filter out dependencies arising from 
``\texttt{\emph{modname}.o :\ \emph{modname}.lhs}'' lines.

\begin{code}
parseDepFile :: String -> [ModDep]
parseDepFile = filter (uncurry (/=)) . map depFromLine . filter noComment . lines
\end{code}

Comments begin with the hash character;
we also treat empty lines as comments.

\begin{code}
noComment []       =  False
noComment (c : cs) =  c /= '#'
\end{code}

For obtaining module dependencies from file dependencies, we need to
drop the suffixes (here \texttt{.lhs}, \texttt{.hs}, \texttt{.hi},
\texttt{.o}) from both elements.  Furthermore, GHC in many cases lists
dependencies from the current directory prefixed with ``\texttt{./}'',
as well as dependencies from relative directories prefixed with
``\texttt{.../directory/to/file/}'', so we use |dropPrefix| to
eliminate those, too:

\begin{code}
depFromLine l = case words l of
  [ofile,":",depfile] -> ((dropSuffix . dropPrefix) ofile, dropPrefix $ dropSuffix depfile)
  _ -> error ("unexpected dependency line ``" ++ l ++ "''")
\end{code}

This implementation of |depFromLine|
is based on the assumption that ``\textsf{ghc -M}''
only generates single-dependency lines
and surrounds the colon with spaces;
if other output is encountered, this will be flagged as a run-time error.

The two auxiliary functions used here are straightforward:

\begin{code}
dropSuffix = reverse . tail . dropWhile ('.' /=) . reverse
dropPrefix = reverse . takeWhile ('/' /=) . reverse
\end{code}
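
The parsing steps described above can be checked on a small input.
The following sketch repeats the definitions so that it runs on its own;
the contents of \texttt{sample} are made up for illustration
and only assumed to resemble a file written by ``\textsf{ghc -M}''.

\begin{verbatim}
type ModDep = (String, String)

parseDepFile :: String -> [ModDep]
parseDepFile =
  filter (uncurry (/=)) . map depFromLine . filter noComment . lines

-- Empty lines and lines starting with a hash are skipped.
noComment :: String -> Bool
noComment []      = False
noComment (c : _) = c /= '#'

depFromLine :: String -> ModDep
depFromLine l = case words l of
  [ofile, ":", depfile] ->
    ((dropSuffix . dropPrefix) ofile, dropPrefix (dropSuffix depfile))
  _ -> error ("unexpected dependency line: " ++ l)

dropSuffix, dropPrefix :: String -> String
dropSuffix = reverse . tail . dropWhile ('.' /=) . reverse
dropPrefix = reverse . takeWhile ('/' /=) . reverse

-- Made-up sample input.
sample :: String
sample = unlines
  [ "# DO NOT DELETE: Beginning of Haskell dependencies"
  , "Dot.o : Dot.lhs"
  , "HsDep.o : HsDep.lhs"
  , "HsDep.o : ./Dot.hi"
  , ""
  , "# DO NOT DELETE: End of Haskell dependencies"
  ]

main :: IO ()
main = print (parseDepFile sample)
-- prints: [("HsDep","Dot")]
\end{verbatim}

The two lines relating an object file to its own source file
yield reflexive pairs and are filtered out,
so only the dependency of \texttt{HsDep} on \texttt{Dot} remains.
Note that \texttt{dropSuffix} relies on \texttt{tail}
and therefore fails on a name that contains no dot.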

For the generation of dot graphs,
we just turn each dependency into an edge
and insert some useful default settings:

\begin{code}
dotOfDeps :: String -> [ModDep] -> DotGraph
dotOfDeps name = DotGraph name . (settings ++) . map (\ (x,y) -> Edge x y [])

settings =
  NodeSettings
    [("shape","plaintext"), ("height","0"), ("width","0")
    ,("fontsize","20")
    ]
  : map Setting [("nodesep","0.1")
                ,("nslimit","100"), ("mclimit","100")] -- make dot work harder
\end{code}

The first three attributes produce outline-free nodes
with as little free space around them as possible.
The choice of font size,
with otherwise standard settings,
makes arrows reasonably thin and short (relative to the nodes).
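
The companion module \texttt{Dot} is not reproduced in this document,
so the exact text it generates is not shown here.
As a rough illustration only,
the following sketch renders dependency pairs with the same settings
directly as dot source text;
the function \texttt{renderDot} is hypothetical,
and quoting names with \texttt{show} is an assumption
that holds for plain module names.

\begin{verbatim}
type ModDep = (String, String)

-- Hypothetical helper: render dependency pairs as dot source text.
renderDot :: String -> [ModDep] -> String
renderDot name deps = unlines (header ++ map edge deps ++ ["}"])
  where
    header =
      [ "digraph " ++ show name ++ " {"
      , "  node [shape=plaintext, height=0, width=0, fontsize=20];"
      , "  nodesep=0.1; nslimit=100; mclimit=100;"
      ]
    edge (x, y) = "  " ++ show x ++ " -> " ++ show y ++ ";"

main :: IO ()
main = putStr (renderDot "deps" [("HsDep", "Dot")])
-- prints:
--   digraph "deps" {
--     node [shape=plaintext, height=0, width=0, fontsize=20];
--     nodesep=0.1; nslimit=100; mclimit=100;
--     "HsDep" -> "Dot";
--   }
\end{verbatim}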

Since the generated dot file can be edited and dot run again,
and dot settings can also be supplied on the dot command-line,
the inability to influence the settings chosen here
from the HsDep command line should not be a big problem.

\end{document}


%% Local variables:
%% folded-file: t
%% fold-internal-margins: 0
%% eval: (fold-set-marks "%{{{ " "%}}}")
%% eval: (fold-whole-buffer)
%% end: