Using the current environment
Getting started
We recommend using VS Code + Jupyter as the default development stack for DataHaskell:
- VS Code as your editor
- Jupyter notebooks for literate, reproducible analysis
- A Haskell notebook kernel (currently IHaskell)
- The DataHaskell libraries (e.g. dataframe, hasktorch, plotting, etc.)
This page walks you through:
- Installing the basic tools
- Choosing an environment (Dev Container vs local install)
- Verifying everything with a “hello DataHaskell” notebook
1. Install the basics
You only need to do this once per machine.
1.1. VS Code
- Install Visual Studio Code from the official website.
- Open VS Code and install these extensions:
  - Jupyter
  - Python (used by the Jupyter extension, even if you write Haskell)
  - Dev Containers (if you plan to use the container-based environment)
  - Haskell (for syntax highlighting, type info, etc.)
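If you prefer the terminal, the same extensions can be installed with the `code` command-line launcher. This is a minimal sketch: it assumes `code` is on your PATH, and the extension IDs are the Marketplace identifiers as we understand them, so check them in the Extensions view if an install fails. The `install_extensions` helper is a hypothetical name used only for illustration.

```shell
# Marketplace IDs for Jupyter, Python, Dev Containers, and Haskell
# (assumed; confirm in the VS Code Extensions view).
EXTENSIONS="ms-toolsai.jupyter ms-python.python ms-vscode-remote.remote-containers haskell.haskell"

# Hypothetical helper: installs each extension through the `code` launcher.
install_extensions() {
  if ! command -v code >/dev/null 2>&1; then
    echo "code not found on PATH; use the Extensions view in VS Code instead" >&2
    return 1
  fi
  for ext in $EXTENSIONS; do
    code --install-extension "$ext"
  done
}

# Run the install, but do not abort the shell if `code` is unavailable.
install_extensions || true
```

You can skip the Dev Containers entry if you only plan to use the local installation.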
1.2. Git
Install Git so you can clone repositories:
- macOS: via Homebrew (brew install git) or Xcode command line tools
- Linux: via your package manager (e.g. sudo apt install git)
- Windows: Git for Windows or via WSL (Ubuntu on Windows)
1.3. (Optional but recommended) Docker
If you want the easiest, most reproducible setup, install Docker:
- Docker Desktop (macOS/Windows), or docker + docker-compose from your Linux distro
The Dev Container–based environment assumes Docker is available.
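Before moving on, it can help to confirm that the tools from this section are visible in your shell. The following is a rough sketch rather than an official check: `check_tool` is a hypothetical helper, and it only tests that a command is on your PATH, not that it is a particular version.

```shell
# Hypothetical helper: report whether a command is on PATH.
# Returns non-zero when the command is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the tools this page relies on; keep going even if one is missing.
for tool in code git docker; do
  check_tool "$tool" || true
done
```

If docker is reported as present, running `docker info` additionally confirms that the Docker daemon is running, which the Dev Container path needs.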
2. Choose an environment
You have two main options:
- Option A (recommended): VS Code Dev Container
  Everything is pre-installed in a Docker image (GHC, Cabal/Stack, IHaskell, DataFrame, etc.).
- Option B: Local installation
  Install GHC, Cabal, Jupyter, IHaskell, and the DataHaskell libraries directly on your machine.
If you’re not sure which to choose, pick Option A.
3. Option A – Dev Container (recommended)
This is the “batteries included” path. You get a pinned environment without polluting your global system.
3.1. Clone the starter repository
We provide a starter repository with a ready-made environment and example notebooks:
git clone https://github.com/DataHaskell/datahaskell-starter
cd datahaskell-starter
3.2. Open the project in VS Code
code .
You’ll get a popup asking whether you want to reopen the project in a container. Select this option and VS Code will load the environment from the DataHaskell Dockerfile.
3.3. Running the example notebook
Open the getting-started notebook. You’ll see a Select Kernel button at the top right.
Click it, go to Jupyter Environment, and choose the Haskell kernel installed there.
4. Option B – Installing everything locally
We recommend you use cabal for this section.
cabal update
cabal install --lib dataframe ihaskell-dataframe hasktorch \
  ihaskell dataframe-hasktorch time template-haskell \
  vector text containers array random unix directory regex-tdfa \
  cassava statistics monad-bayes aeson \
  --force-reinstalls
cabal install ihaskell --install-method=copy --installdir=/opt/bin
ihaskell install --ghclib=$(ghc --print-libdir) --prefix=$HOME/.local/
jupyter kernelspec install $HOME/.local/share/jupyter/kernels/haskell/
jupyter notebook
Check if this setup is working by trying out the linear regression tutorial from the DataHaskell website.
Note: installing packages globally this way (--lib together with --force-reinstalls) might break some of your existing projects.
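A quick way to confirm that the kernel was registered is to look for it in the output of `jupyter kernelspec list`. The sketch below assumes the kernel is named haskell, matching the kernels/haskell directory used in the install commands above; `has_haskell_kernel` is a hypothetical helper for illustration.

```shell
# Hypothetical helper: reads `jupyter kernelspec list` output on stdin
# and succeeds if a kernel named "haskell" appears in it.
has_haskell_kernel() {
  grep -Eq '^[[:space:]]*haskell[[:space:]]'
}

if command -v jupyter >/dev/null 2>&1 && jupyter kernelspec list | has_haskell_kernel; then
  echo "Haskell kernel is registered"
else
  echo "Haskell kernel not found; re-run the ihaskell install step"
fi
```

If the kernel is listed but notebooks fail to start, check that the directory holding the ihaskell executable (/opt/bin in the commands above) is on your PATH.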