Introduction to Docker Environments for various Classes
1. Quick setup reminder
Before using the Docker environments for my classes, you will need to have Docker Desktop set up. A quick guide is shown below. A more detailed workshop on getting familiar with Docker can be found at Introduction to Docker.
Older version of Docker Desktop
If you previously installed Docker Desktop on your system, make sure that your installation is up to date. The latest Docker Desktop and the accompanying Docker Engine contain many useful tools to help administer your images and containers.
As of Summer 2025, this material is tested on:
- Docker Desktop version 4.41.2 (191736)
- Docker Engine: 28.1.1
- Docker Compose: v2.35.1-desktop.1
By the time you read this, the versions you have may well be higher. That would be a good thing!
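To compare an installed version against the tested ones listed above, a small script can help. The following is a minimal sketch, not part of the course material: the function names are hypothetical, and it assumes the `docker` CLI is on your `PATH` and that `docker version --format` and `docker compose version --short` report versions in the usual `MAJOR.MINOR.PATCH` form.

```python
import re
import shutil
import subprocess

# Versions this material was tested on (taken from the list above).
TESTED = {
    "engine": "28.1.1",
    "compose": "v2.35.1-desktop.1",
}

VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Return the first MAJOR.MINOR.PATCH found in text as a tuple of ints."""
    match = VERSION_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(installed, tested):
    """True if the installed version is the tested version or newer."""
    return parse_version(installed) >= parse_version(tested)


def installed_versions():
    """Ask the local Docker CLI for its versions; returns {} if docker is missing."""
    if shutil.which("docker") is None:
        return {}
    commands = {
        "engine": ["docker", "version", "--format", "{{.Server.Version}}"],
        "compose": ["docker", "compose", "version", "--short"],
    }
    found = {}
    for name, command in commands.items():
        result = subprocess.run(command, capture_output=True, text=True)
        # Skip anything that failed (for example, the engine is not running).
        if result.returncode == 0 and VERSION_PATTERN.search(result.stdout):
            found[name] = result.stdout.strip()
    return found


def report():
    """Print each detected version and whether it is at least the tested one."""
    for name, installed in installed_versions().items():
        ok = is_at_least(installed, TESTED[name])
        print(f"{name}: {installed} ({'ok' if ok else 'older than tested'})")
```

Calling `report()` prints one line per component that could be detected. Suffixes such as `-desktop.1` and the leading `v` are ignored, so only the three numeric fields are compared.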
- Links to download and install Docker Desktop
Docker Desktop Terminal
- The most recent version of Docker Desktop comes with a built-in Terminal.
- In recent Docker Desktop versions (4.35.1 was the latest when this note was written), this is a default feature available in the lower right corner of the GUI.
- For earlier versions, this could show up as a beta feature.
- The remainder of this workshop will use the Docker Desktop Terminal app for consistency. All Docker CLI commands can also be executed in the standard terminal on Mac and Linux platforms.
Docker Hub
- Docker Hub is one of the public registries for Docker images (think GitHub for container images).
- You should register for a Docker Hub account at https://hub.docker.com and use it to log into your Docker Desktop environment (similar to how you link your GitHub account to GitHub Desktop, if you use GitHub Desktop).
2. CSC418-587
3. Distributed and Parallel Programming
- This environment consists of three images:
  - `base`: contains all common software.
  - `head-instructor`: is built from `base` and contains an additional Jupyter server and a Code server, designed to help instructors edit lecture notes for the course.
  - `head-student`: is built from `base` and contains only the Code server.
- When this environment is deployed, by default, there will be
  - One `head` container (from either the `head-instructor` or the `head-student` image) running a Jupyter server (if `head-instructor`) and a Code server. Users interact with the environment through these servers.
  - Two `compute` containers: `compute01` and `compute02`. They are only needed if users work on distributed MPI.
  - All containers can be connected via passwordless SSH using the built-in user account `student`. This account also has passwordless sudo power.
  - The internal `/home/student` is mounted from a volume directory shared across all three containers, effectively creating a shared storage environment.
- To set up the CSC466 environment, follow the instructions in the README.md file of the csc466env GitHub repository.
- If you are interested in tinkering with this environment, you are welcome to fork the repo into your own GitHub repository before cloning.
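The passwordless SSH setup described above can be checked from inside the head container with a short script. This is a minimal sketch under assumptions: the helper names are hypothetical, and it assumes the container names `compute01` and `compute02` resolve as hostnames on the environment's network. It uses the OpenSSH option `BatchMode=yes`, which makes `ssh` fail instead of prompting for a password, so a host whose key has not been accepted yet may also be reported as failing.

```python
import subprocess

# Container names and the user account come from the description above.
COMPUTE_NODES = ["compute01", "compute02"]
USER = "student"


def ssh_check_command(host, user=USER):
    """Build an ssh command that fails rather than asking for a password."""
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", "hostname"]


def check_nodes(nodes=COMPUTE_NODES):
    """Try each node in turn; intended to be run inside the head container."""
    results = {}
    for host in nodes:
        completed = subprocess.run(
            ssh_check_command(host), capture_output=True, text=True
        )
        # Exit code 0 means the remote `hostname` command ran without a prompt.
        results[host] = completed.returncode == 0
    return results
```

Running `check_nodes()` in the head container returns a dictionary mapping each compute node to `True` or `False`, which is a quick way to confirm the nodes are reachable before launching a distributed MPI job.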
4. Big Data Engineering
- This environment consists of four images:
  - `base`: contains all common software and is built from the `spark:3.5.2-java17-python3` image of Databricks.
  - `master-instructor`: is built from `base` and contains a Jupyter server and an additional Code server, designed to help instructors edit lecture notes for the course.
  - `master-student`: is built from `base` and contains only the Jupyter server.
  - `worker`: an almost exact copy of `base` with an additional startup script.
- When this environment is deployed, by default, there will be
  - One `master` container (from either the `master-instructor` or the `master-student` image) running a Jupyter server and a Code server (if `master-instructor`). Users interact with the environment through these servers.
  - Multiple worker containers. The number of `worker-` containers can be scaled at run time based on how `docker compose` is invoked.
  - The two internal directories, `/data` and `notebooks`, are mounted from the corresponding local directories inside the git repo into the master container. There is no external storage sharing with the worker containers (similar to how actual Spark clusters work).
- To set up the CSC467 environment, follow the instructions in the README.md file of the csc467env GitHub repository.
- If you are interested in tinkering with this environment, you are welcome to fork the repo into your own GitHub repository before cloning.
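Scaling the number of workers at run time usually comes down to the `--scale` option of `docker compose up`. The sketch below shows how such a command could be assembled. It is illustrative only: the service name `worker` and the helper names are assumptions, and the README.md of the repository remains the authoritative source for the actual invocation.

```python
import subprocess


def compose_up_command(workers, service="worker"):
    """Build a `docker compose up` command that requests `workers` replicas.

    The service name "worker" is an assumption; check the compose file in
    the repository for the real service name.
    """
    if workers < 0:
        raise ValueError("the number of workers cannot be negative")
    return ["docker", "compose", "up", "-d", "--scale", f"{service}={workers}"]


def start_cluster(workers, repo_dir="."):
    """Run the command from the directory that holds the compose file."""
    subprocess.run(compose_up_command(workers), cwd=repo_dir, check=True)
```

For example, `compose_up_command(3)` yields the equivalent of typing `docker compose up -d --scale worker=3` in the terminal, which asks Compose for one master container and three worker containers.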
5. Operating Systems (CSC331)
- This environment consists of three images:
  - `base`: contains all common software.
  - `head-instructor`: is built from `base` and contains the Code server. Volume mounts that help instructors edit lecture notes for the course are added.
  - `head-student`: is built from `base` and contains only the Code server.
- When this environment is deployed, by default, there will be
  - One `head` container (from either the `head-instructor` or the `head-student` image) running the Code server. Users interact with the environment through this server.
  - It is also possible to SSH into the container via `localhost:2222`, using the login `student` and password `goldenrams`.
  - The internal `/home/student` contains the example code from the OSTEP authors.
- To set up the CSC331 environment, follow the instructions in the the-one-ring repository.
- If you are interested in tinkering with this environment, you are welcome to fork the repo into your own GitHub repository before cloning.
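Before attempting the SSH login described above, it can be useful to confirm that the container is actually listening on port 2222. The following is a minimal sketch with hypothetical helper names. It only checks that a TCP connection can be opened and builds the matching `ssh` command; it does not log in or handle the password.

```python
import socket

# Host, port, and login come from the description above.
SSH_HOST = "localhost"
SSH_PORT = 2222
SSH_USER = "student"


def port_is_open(host=SSH_HOST, port=SSH_PORT, timeout=2.0):
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def ssh_command(host=SSH_HOST, port=SSH_PORT, user=SSH_USER):
    """Build the ssh command line for logging into the head container."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]
```

If `port_is_open()` returns `False`, the container is most likely not running yet. Otherwise, joining the list from `ssh_command()` with spaces gives the command to type in a terminal, after which `ssh` prompts for the password.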