# ==OCaml==, learning log

<p class="doc-sub">// status: seedling</p>

A running log of things I pick up while working through [Real World OCaml](https://dev.realworldocaml.org/toc.html) and poking at the language. Notes are deliberately rough — see the [[Index|garden conventions]].

Coming from a ==Rust== and ==Haskell== background, a lot of the ideas land quickly; the friction tends to be in the corners — tooling, syntax conventions, and a few words that mean subtly different things than they do elsewhere.

# The module system

The single feature that makes OCaml feel most _itself_.

- Every `.ml` file is implicitly a module. `foo.ml` is the module `Foo` — capitalisation is not cosmetic, the compiler genuinely derives the module name from the filename.
- An optional `.mli` file acts as the interface/signature. What is not exposed in the `.mli` is truly private — stronger than Rust's `pub`/`pub(crate)` dance.
- Modules can nest, be anonymous, be passed as first-class values, and be parameterised (→ functors).

```ocaml
module Counter = struct
  type t = { mutable n : int }
  let make () = { n = 0 }
  let incr c = c.n <- c.n + 1
  let get c = c.n
end
```

The corresponding signature:

```ocaml
module type COUNTER = sig
  type t
  val make : unit -> t
  val incr : t -> unit
  val get : t -> int
end
```

`open` brings names into scope; `include` copy-pastes them into the current module's signature. Mixing them up is the first intermediate-level foot-gun.

# Variants & pattern matching

Variants are algebraic data types, same idea as Rust `enum` / Haskell `data`:

```ocaml
type shape =
  | Circle of float
  | Square of float
  | Rect of float * float

let area = function
  | Circle r -> Float.pi *. r *. r
  | Square s -> s *. s
  | Rect (w, h) -> w *. h
```

Things worth remembering:

- `function` is sugar for `fun x -> match x with`. Reach for it when the last argument is the thing you're matching on.
- Exhaustiveness checks are on by default but _warnings_, not errors.
  Promote them to errors with `-warn-error +8` or a file-level `[@@@warning "@8"]`; plain `-w +8` only enables the warning, which is already on, so a missing case would still compile with nothing more than a warning.
- `when` guards let you add boolean conditions to a branch, but the exhaustiveness checker assumes a guard can fail, so a guarded branch never counts towards coverage. Use them sparingly.

## Polymorphic variants

The lesser cousin — no declaration needed, tagged with a backtick:

```ocaml
let classify n =
  if n > 0 then `Positive
  else if n < 0 then `Negative
  else `Zero
```

Handy for throwaway sum types and for library APIs that want to be open to extension. The cost is worse error messages and a slightly more complicated type scheme. Default to regular variants; reach for polymorphic ones when the extensibility is genuinely worth it.

# Functors

Parameterised modules. Given a module, produce a module.

```ocaml
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

module MakeSet (Ord : ORDERED) = struct
  type elt = Ord.t
  type t = elt list (* toy implementation *)
  let empty = []
  let add x s =
    if List.exists (fun y -> Ord.compare x y = 0) s then s else x :: s
end

module IntSet = MakeSet (Int)
```

This is how the stdlib's `Set.Make` / `Map.Make` work. It feels heavyweight compared to Haskell typeclasses or Rust traits, but the upside is explicitness — there is exactly one `compare` in scope and you can see where it came from. For ad-hoc polymorphism you can also pass first-class modules, which blurs the line with dynamic dispatch.

# Tooling — opam, dune, utop, merlin

The ecosystem used to be rough. It isn't any more.

```mermaid
%%{init: {'theme':'base','themeVariables':{'primaryColor':'#0a0907','primaryTextColor':'#efeadf','primaryBorderColor':'#f4d03f','lineColor':'#f4d03f','secondaryColor':'#050403','tertiaryColor':'#000000','background':'#000000','fontFamily':'JetBrains Mono, monospace'}}}%%
flowchart LR
  switch["opam switch"] --> dune
  src[".ml / .mli"] --> dune
  cfg["dune-project"] --> dune
  dune[["dune"]] --> bin["native / bytecode"]
  dune --> repl["utop"]
  dune -. metadata .-> lsp["merlin / ocaml-lsp"] --> editor["editor"]
  fmt["ocamlformat"] -. formats .-> src
  dune --> odoc["odoc"]
```

- **opam** — package manager. Think `cargo` crossed with a Python venv: it installs compilers _and_ libraries, and supports per-project switches (`opam switch create . 5.1.1`).
- **dune** — the de-facto build system. A `dune-project` at the root plus tiny `dune` stanza files per directory. Ludicrously fast incremental builds.
- **utop** — the good REPL. Tab completion, multi-line editing, and the thing you actually want instead of vanilla `ocaml`.
- **merlin** + **ocaml-lsp** — editor brains. Types on hover, jump-to-def, completions. Historically configured via a `.merlin` file; these days dune generates what's needed.
- **ocamlformat** — the formatter. Like `rustfmt`, with configurable profiles. Just commit a `.ocamlformat` at the project root and stop arguing.
- **odoc** — doc generator. Doc comments are `(** ... *)` (double asterisk).

A sensible starter combo: `opam` for the switch, `dune` for the build, `utop` for scratch work, `ocaml-lsp` + `ocamlformat` wired into the editor.

# Things that tripped me up

The ones worth writing down so I don't hit them twice.

- **`=` vs `==`** — exactly backwards from most languages I use. `=` is structural equality, `==` is physical (pointer) equality. `<>` and `!=` mirror them. Default to `=`.
- **`;;`** — only needed in the REPL to end a top-level phrase. In a `.ml` file it's almost always a smell.
- **Labelled and optional arguments** — `~name` is labelled, `?name` is optional (with an implicit `option` wrapper). Order of labelled args at the call site doesn't matter, which is nice, but partial application interacts with labels in ways that can surprise you.
- **No typeclasses** — `show`, `eq`, `compare` don't fall out for free. Either derive them with `[@@deriving show, eq]` (via `ppx_deriving`) or use a functor. The module-first mindset feels alien at first and then obvious.
- **`Printf`/`Format` format strings are type-checked** at compile time — `%d` for int, `%s` for string, `%a` for custom printers. Great once you know, confusing when the error message points at a literal string.
- **Mutation is allowed and normal** — `ref`, `mutable` record fields, arrays. OCaml isn't Haskell; pragmatic mutation is idiomatic when it's local. Don't fight it.
- **The `Stdlib` ↔ `Core` split** — Jane Street's `Core` is a richer standard library used heavily in _Real World OCaml_. It shadows a lot of `Stdlib`. Decide early which world you're in; mixing them is a papercut generator.

# To revisit

- [ ] GADTs in anger — I know the theory, haven't reached for them in OCaml yet.
- [ ] Effect handlers (OCaml 5) — algebraic effects in the stdlib, curious how they compare to what I've seen in other languages.
- [ ] First-class modules as a replacement for typeclass-style polymorphism.
- [ ] `ppx` rewriters — how deep is the rabbit hole.

---

Back to [[Index|Notes]] · [[Home]] · see also my reading of _Real World OCaml_ on [[Now]].
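
Scratch example for the note on type-checked format strings under "Things that tripped me up". This is a minimal sketch, not something taken from _Real World OCaml_: the `point` type and the names `pp_point` and `describe` are made up for illustration. It assumes `Format.asprintf`, where `%a` expects a printer that takes the formatter first and the value second.

```ocaml
(* Hypothetical record type, used only for this illustration. *)
type point = { x : int; y : int }

(* Custom printer in the shape Format's %a expects:
   the formatter first, then the value to print. *)
let pp_point fmt p = Format.fprintf fmt "(%d, %d)" p.x p.y

(* %s consumes a string; %a consumes a printer and then its argument. *)
let describe name p = Format.asprintf "%s is at %a" name pp_point p

let () = print_endline (describe "origin" { x = 0; y = 0 })
(* Swapping the arguments, e.g. [describe { x = 0; y = 0 } "origin"],
   is rejected by the type checker rather than failing at run time. *)
```

Running it prints `origin is at (0, 0)`. The type of `describe` is inferred entirely from the format string literal, which is why a mismatch is reported at the literal rather than at the offending argument.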