A **monad** is defined as a triple $\langle T,\eta,\mu\rangle$ where $T$ is an endofunctor of some category $\mathsf{X}$, and $\eta:1_\mathsf{X}\Rightarrow T$ and $\mu:T^2\Rightarrow T$ are natural transformations, which are required to be such that the following two diagrams commute:
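
For reference, here are the same two conditions written as equations between natural transformations. This is the standard equational form of the monad laws, where $T\mu$ and $\mu T$ (and likewise $T\eta$ and $\eta T$) denote whiskering $\mu$ (or $\eta$) with $T$ on either side:

$$\mu\circ T\mu=\mu\circ\mu T\qquad\text{(first diagram)}$$

$$\mu\circ T\eta=1_T=\mu\circ\eta T\qquad\text{(second diagram)}$$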

The natural transformation $\eta$ is called the **unit** of the monad, and the natural transformation $\mu$ is called the **multiplication** of the monad. The commutativity of the first diagram is the **associative law** of the monad, and the second diagram expresses the **left and right unit laws** of the monad.

But what the heck does all of this mean? These conditions are super obscure, so let's try to get some intuition about how to interpret what a monad actually *does*.

Right now, the way of conceptualizing a monad that's been most helpful for me has been to think of a monad as a kind of *"wrapper"* for data. The natural transformation $\eta:1_\mathsf{X}\Rightarrow T$ expresses that any object in the category $\mathsf{X}$ *can be wrapped* in the kind of wrapper that $T$ represents, or that there is another object in the same category that is a "wrapped version" of the original object. The natural transformation $\mu:T^2\Rightarrow T$, then, expresses that "double-wrapping" an object is in some sense redundant, and a double-wrapped version of an object can be reduced to a singly-wrapped version of the same object.
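
To make the wrapper picture concrete, here is a minimal sketch in Python. It assumes the wrapper $T$ is "list of", so wrapping a value means putting it in a one-element list. The names `unit` and `join` are my own illustrative choices for $\eta$ and $\mu$, not anything fixed by the definition.

```python
from typing import List, TypeVar

A = TypeVar("A")


def unit(x: A) -> List[A]:
    """Plays the role of eta: wrap a bare value in one layer (a one-element list)."""
    return [x]


def join(xss: List[List[A]]) -> List[A]:
    """Plays the role of mu: collapse two layers of wrapping into one by concatenating."""
    return [x for xs in xss for x in xs]


wrapped = unit(3)                # one layer of wrapping
double_wrapped = unit(wrapped)   # two layers of wrapping
print(join(double_wrapped))      # reduced back to one layer
```

Other wrappers (optional values, computations with state, and so on) would need their own `unit` and `join`, but the shape of the two operations stays the same.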

Okay then, but under this interpretation, what do the two commutative diagrams mean?

The first commutative diagram addresses how a *triply-wrapped* object can be reduced to a *singly-wrapped* object. In particular, there are two potentially different ways of reducing a triple-wrapping to a single-wrapping using $\mu$: we can either reduce the two outer wrappers to a single wrapper, and then reduce the combination of the resulting wrapper and the innermost wrapper; or we can reduce the two inner wrappers to a single wrapper, and then reduce the combination of the outermost wrapper and this newly formed inner wrapper to a single wrapper. The commutativity of this diagram expresses that *these two ways of reducing a triple-wrapper are identical*.
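
The associative law can be checked on a small example. This is a sketch that again assumes the list wrapper. Here `fmap` is a hypothetical helper standing in for the action of $T$ on functions, so `fmap(join, ...)` plays the role of $T\mu$, while `join` applied directly to the triple-wrapped value plays the role of $\mu T$.

```python
def fmap(f, xs):
    """Action of T on a function: apply f underneath one layer of list."""
    return [f(x) for x in xs]


def join(xss):
    """mu: flatten two layers of list into one."""
    return [x for xs in xss for x in xs]


# A triply-wrapped value: a list of lists of lists.
triple = [[[1, 2], [3]], [[4]], []]

# Merge the two outer layers first, then the remaining two.
outer_first = join(join(triple))

# Merge the two inner layers first, then the remaining two.
inner_first = join(fmap(join, triple))

assert outer_first == inner_first
```

One example does not prove the law, of course, but it shows what the two paths around the diagram look like in practice.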

The second commutativity condition, on the other hand, tells us what happens if we add a second wrapper to pre-wrapped data and then immediately reduce the two wrappers to a single wrapper. In particular, adding another wrapper *outside* of the original wrapper and then reducing has the same effect as adding another wrapper *inside* of the original wrapper and then reducing, and in both cases we get back exactly the singly-wrapped data we started with.
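
The unit laws can be spot-checked the same way. This sketch again assumes the list wrapper, with `unit`, `join`, and `fmap` as illustrative names: `unit(wrapped)` adds a layer on the outside ($\eta T$), and `fmap(unit, wrapped)` adds a layer on the inside ($T\eta$).

```python
def unit(x):
    """eta: wrap a value in a one-element list."""
    return [x]


def fmap(f, xs):
    """Action of T on a function: apply f underneath one layer of list."""
    return [f(x) for x in xs]


def join(xss):
    """mu: flatten two layers of list into one."""
    return [x for xs in xss for x in xs]


wrapped = [1, 2, 3]

# New wrapper on the outside, then reduce.
outside = join(unit(wrapped))

# New wrapper on the inside (around each element), then reduce.
inside = join(fmap(unit, wrapped))

# Both round trips return the original singly-wrapped data.
assert outside == wrapped
assert inside == wrapped
```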

Every adjunction gives rise to a monad. To be specific, if $F\dashv G$ with $F:\mathsf{C}\leftrightarrows\mathsf{D}:G$, then the endofunctor $T=GF:\mathsf{C}\to\mathsf{C}$ carries a monad structure, with unit $\eta:1_\mathsf{C}\Rightarrow T$ given by the unit of the adjunction, and multiplication given by $\mu=G\epsilon F:T^2\Rightarrow T$, where $\epsilon:FG\Rightarrow 1_\mathsf{D}$ is the counit of the adjunction.
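
As one illustration of $\mu=G\epsilon F$, take the free monoid adjunction (my choice of example, not one from the text above): $F$ sends a set to the monoid of lists over it, and $G$ forgets the monoid structure, so $GF$ is the list wrapper. The counit $\epsilon_M$ multiplies out a list of elements of a monoid $M$. The sketch below represents a monoid by a hypothetical `(op, identity)` pair and recovers list flattening as the counit at the free monoid.

```python
from functools import reduce


def counit(op, identity, xs):
    """epsilon_M: multiply out a formal list of elements of the monoid (op, identity)."""
    return reduce(op, xs, identity)


def join(xss):
    """mu_X = G(epsilon_{F X}): the counit at the free monoid on X,
    whose operation is list concatenation and whose identity is []."""
    return counit(lambda a, b: a + b, [], xss)


# Counit at the monoid of integers under addition.
total = counit(lambda a, b: a + b, 0, [1, 2, 3])

# Counit at a free monoid, which is the multiplication of the list monad.
flat = join([[1, 2], [3], []])

print(total, flat)
```

Seen this way, "reducing a double wrapper" for lists is just "multiplying out" in the free monoid, which is where the name *multiplication* for $\mu$ starts to feel natural.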