Cross-package documentation, part 1

It would appear to the casual observer that Haddock works fairly well across packages.  For example, I haddocked the Haddock code, and all the links to the types from the GHC API point to the haddock-docs in my GHC 6.10.3 installation.  (Yes, you need the chosen versions of the dependent packages installed and haddocked, and this doesn’t always work out optimally online. But that’s not the problem that I’m confronting.)

Yes, links work.  But that’s about it, right now.  Compare (broken)  http://www.haskell.org/ghc/dist/stable/docs/libraries/haskell98/List.html to (good) http://www.haskell.org/ghc/dist/stable/docs/libraries/base/Data-List.html .  The broken one’s source (haskell98:List) imports the good one (base:Data.List).  haskell98:List’s documentation luckily lists the *names* of the exported things, because they’re listed in that file’s export list.  But not their types or their docs.  Nor the module doc header (though that shouldn’t be copied anyway), and there’s no link to base:Data.List (which is acceptable, although maybe someone should write a haddock-header for haskell98:List that says it’s the Haskell 98 version (specified at http://www.haskell.org/onlinereport/list.html ) of the slightly expanded base-package Data.List[link]).

Anyway, the way we get any links to modules in other packages is because we read their generated .haddock files.  They’re binary and haddock doesn’t have any obvious way to make them human-readable, but they correspond to Haddock.InterfaceFile.InterfaceFile.  Which is, roughly, a list of Haddock.Types.InstalledInterface, which seems to be the interesting bit.

data InstalledInterface = InstalledInterface {
    instMod            :: Module,
    instInfo           :: HaddockModInfo Name,
    instDocMap         :: Map Name (HsDoc DocName),
    instExports        :: [Name],
    instVisibleExports :: [Name],
    instOptions        :: [DocOption],
    instSubMap         :: Map Name [Name]
}

A Module (not to be confused with GhcModule, which is defined by Haddock) is a low-information type defined in GHC that contains the package name and version and the module name.

HaddockModInfo is just the module’s header description, plus the portability:, stability:, maintainer: fields.  (HaddockModInfo is defined in GHC, oddly enough: it must be part of the parse result. It is defined in ghc:HsSyn, to be precise.)

DocOptions are hide, prune, ignore-exports, not-home. (defined in haddock code.)

The rest contain a lot of “Name”s, which is a GHC thing that refers unambiguously to the place an identifier originates.  Sufficient for making a link, but not sufficient by itself for copying the named identifier’s docs or type.  So passing over them for now… there is one interesting thing left.

A DocName contains a Name and also (if any) the module we’d like to link to in which that name is documented.  instDocMap :: Map Name (HsDoc DocName).  This gives more info on any number of Name identifiers.  (HsDoc provides formatting, DocName provides its references.)  There is no type information here at all, as far as I can tell, which will clearly need to be remedied somehow (in Interface, roughly a superset of InstalledInterface, types appear in ifaceDeclMap, though I’m not sure if that’s where they’re retrieved from for HTML-doc-printing).  But there’s another big question I need to answer: *which* names are documented in any given module’s instDocMap?  The type provides no clue, nor does its current (lack of a) doc string, nor ifaceRnDocMap.  I could look everywhere in the code where it’s generated, or I could ask David Waern… who will need to tell me if I said anything confused here anyway 🙂
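
To make the shape of instDocMap a bit more concrete, here is a toy model. Everything in it is a stand-in I made up for illustration: the Name, Module, DocName and HsDoc below are drastically cut-down imitations, not the real GHC or Haddock types, and linkTargets and exampleDocs are hypothetical names. It only shows how a doc map keyed by Name can yield link targets through the DocNames embedded in the docs.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (maybeToList)

-- Stand-ins for GHC's Name and Module; the real types carry far more.
newtype Name = Name String deriving (Eq, Ord, Show)
newtype Module = Module String deriving (Eq, Show)

-- A DocName pairs a Name with the module (if any) whose docs we'd link to.
data DocName = DocName Name (Maybe Module) deriving (Eq, Show)

-- A heavily simplified HsDoc: plain text plus references to identifiers.
data HsDoc ident
  = DocString String
  | DocIdentifier ident
  | DocAppend (HsDoc ident) (HsDoc ident)
  deriving (Eq, Show)

-- Same shape as the instDocMap field, but over the toy types.
type DocMap = Map Name (HsDoc DocName)

-- Collect every module that the documentation of a given name links to.
-- A name without an entry in the map simply has no link targets.
linkTargets :: DocMap -> Name -> [Module]
linkTargets docs n = maybe [] go (Map.lookup n docs)
  where
    go (DocString _)                 = []
    go (DocIdentifier (DocName _ m)) = maybeToList m
    go (DocAppend a b)               = go a ++ go b

-- One documented name whose doc string mentions another identifier.
exampleDocs :: DocMap
exampleDocs = Map.fromList
  [ ( Name "nub"
    , DocAppend (DocString "See also ")
                (DocIdentifier (DocName (Name "nubBy") (Just (Module "Data.List"))))
    )
  ]
```

Note that nothing in this model carries a type signature, which is the same gap as in the real instDocMap.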

7 Responses to “Cross-package documentation, part 1”

  1. David Waern Says:

    You didn’t say anything wrong here, so I’ll just answer your questions 🙂

    During HTML generation, the declarations are taken from ifaceRnExportItems which represents the export items. See the ExportItem data type – it contains LHsDecl. Creating the export items is one of the most important jobs of Haddock.Interface.Create.

    You were wondering which names are documented in a given module’s instDocMap. That is the names of all declarations in that module that have documentation. It is the same as ifaceRnDocMap in Interface, which is generated from ifaceDeclMap by taking all declared names that have documentation and renaming the documentation. ifaceDeclMap is just a map of all declarations in the module.

    Haddock could be a lot better documented here.

  2. Andrea Vezzosi Says:

    this blog is becoming a nice resource for wannabe haddock hackers 🙂
    so i’ll try asking here:
    does the .haddock interface file contain enough information to generate e.g. the documentation for that module? i’m guessing the answer is no.
    that’s a bit unfortunate though, since a machine-readable format to ship documentation that can then be converted into a desired format would be quite useful, especially when one uses binary tarballs for libraries (like when installing ghc).

  3. haddock2009 Says:

    Andrea: nope. I believe the missing bits of information are the doc-strings, which I plan to include in the .haddock files; and the type-signatures, which my current plan is to retrieve from .hi files. Luckily .hi files are present in binary distributions, so your idea might work. Do you have a particular use-case? Like, letting the end-user turn their docs into whatever strange format they like, as long as Haddock supports it? (or even machine-processable. hmm. Haddock and GHC and .hi/.haddock file versions are all very tightly coupled at the moment, by the way.)

    thanks!
    -Isaac

  4. haddock2009 Says:

    I think my next step should be the simpler bit: extend .haddock files to contain their docs. After that I might have a better guess where best to thread in the type-signatures.

  5. David Waern Says:

    The doc strings are already in the .haddock files! This is stated in the wiki-page we wrote about this task:

    http://trac.haskell.org/haddock/wiki/CrossPackageDocumentation

    🙂

  6. Andrea Vezzosi Says:

    The motivating goal for me was keeping a central hoogle index of all the packages installed, and finding that while packages shipped with ghc have the .haddock interface installed you can’t currently extract the .txt file to feed hoogle from it.

  7. David Waern Says:

    Andrea: that might be possible to do in the future, since it should be possible to fully re-create the interface given the .hi files and the .haddock file, provided that we put some missing bits into the .haddock file. Note that nothing is missing for cross-package documentation, but if you want to re-generate documentation or create a hoogle file, we need to put some more bits into the .haddock file. This is not in scope of Isaac’s project, though.
