\documentclass[12pt]{article}

%% the following two lines are essential for lhs2TeX:
%include lhs2TeX.fmt
%include lhs2TeX.sty

%% ghc --make -o CapsConflicts CapsConflicts.lhs
%% lhs2TeX --poly CapsConflicts.lhs > CapsConflicts.ltx; pdflatex CapsConflicts.ltx
%% lhs2TeX --poly CapsConflicts.lhs > CapsConflicts.ltx; latex CapsConflicts.ltx; dvips -Ppdf -z CapsConflicts.dvi

\title{CapsConflicts --- Detecting Directory Entries that Differ only in Capitalisation}
\author{Wolfram Kahl}
\date{2004-06-03}

\parindent0pt
\parskip0.5ex
\setlength{\voffset}{-12mm}
\setlength{\textheight}{243mm}
\setlength{\hoffset}{-10mm}
\setlength{\textwidth}{160mm}

\begin{document}

\maketitle

The default filesystem on MacOS X is case insensitive,
so directory entries that differ only in capitalisation conflict there.
Before using tools like Unison between a Mac and a real Unix box makes sense,
all these conflicts have to be resolved.
The tool here just finds them.

\begin{code}
module Main(main) where

import System.Posix.Files
import Directory
import IO
import Char
import System
import Data.FiniteMap
\end{code}

We start from the command-line arguments if there are any,
and otherwise from the current directory:

\begin{code}
main = do
  args <- getArgs
  paths <- case args of
    [] -> fmap (:[]) getCurrentDirectory
    _ -> return args
  mapM_ processDirectory paths
\end{code}

A useful utility in this context is concatenation of file paths:

\begin{code}
fpCat :: FilePath -> FilePath -> FilePath
fpCat dir file = dir ++ '/' : file
\end{code}

The following data structure for a case-insensitive finite map
could be encapsulated in a separate module
exporting only |conflictGroups|.

The conflict detection uses finite maps
that map lower-case strings to non-empty lists of directory entries.

\begin{code}
newtype DirMap = DM {unDM :: (FiniteMap String [FilePath])}

emptyDirMap = DM emptyFM

addToDirMap :: DirMap -> FilePath -> DirMap
addToDirMap (DM m) s =
  let key = map toLower s
  in DM . addToFM m key . (s :) . maybe [] id $ lookupFM m key

listToDirMap :: [FilePath] -> DirMap
listToDirMap = foldl addToDirMap emptyDirMap
\end{code}

Only groups with more than one element represent conflicts:

\begin{code}
conflictGroups :: [FilePath] -> [[FilePath]]
conflictGroups = foldFM f [] . unDM . listToDirMap
  where
    f key [] r = error "conflictGroups: impossible case"
    f key [s] r = r
    f key ss r = ss : r
\end{code}

The groups are displayed without any particular formatting effort;
each group is preceded by an empty line:

\begin{code}
groupLines :: FilePath -> [FilePath] -> [String]
groupLines path = ("" :) . map (fpCat path)

printGroups :: FilePath -> [[FilePath]] -> IO ()
printGroups path [] = return ()
printGroups path gs = putStrLn . unlines . concatMap (groupLines path) $ gs
\end{code}

We define a variant of |FilePath| concatenation
that checks whether the resulting path is a sub-directory,
not counting or following symbolic links:

\begin{code}
catSubDir :: FilePath -> FilePath -> IO (Maybe FilePath)
catSubDir dir "." = return Nothing
catSubDir dir ".." = return Nothing
catSubDir dir file = let path = fpCat dir file in
  catch (do st <- getSymbolicLinkStatus path
            if isSymbolicLink st
              then return Nothing
              else do st <- getFileStatus path
                      return $ if isDirectory st then Just path else Nothing
        )
        (\ err -> do hPutStrLn stderr ("catSubDir error: " ++ path ++ " - " ++ ioeGetErrorString err)
                     return Nothing)
\end{code}

Processing a directory prints its conflict groups
and then recurses into its sub-directories:

\begin{code}
processDirectory path = do
  contents <- catch (getDirectoryContents path)
                    (\ err -> do hPutStrLn stderr ("getDirectoryContents " ++ path ++ " - " ++ ioeGetErrorString err)
                                 return [])
  printGroups path . reverse . conflictGroups $ contents
  mapM_ (catSubDir path =>>?= processDirectory) contents
\end{code}

Here we used a variant of monadic composition,
specialised for branches that are applied only to |Just| results:

\begin{code}
(=>>?=) :: Monad m => (a -> m (Maybe b)) -> (b -> m ()) -> (a -> m ())
f =>>?= g = \ x -> f x >>= maybe (return ()) g
\end{code}

\end{document}