{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bayes Nets as Causal Graphical Models\n", "This notebook accompanies Lecture 17. The exercises should be used to gain facility with computing interventional distributions and generally playing around with encoding joint distributions as Bayes Nets. \n", "\n", "To require the fewest dependencies, this notebook encode distributions in a bare-bones way. You should feel free to use matrix or object representations as you see fit.\n", "\n", "Our first task will be to encode the conditional probability distributions for the graph \n", "\n", "$B$        $E$\n", "
  $\\searrow~\\swarrow$\n", "
     $A$\n", "
  $\\swarrow~\\searrow$\n", "
$F$        $L$" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# use indices to denote values that the variable takes on\n", "B = [0.99, 0.01]\n", "E = [0.98, 0.02]\n", "# Altered from lecture -- using the outer index for the variable's value\n", "# So: P(A=1 | B=0, E=1) is accessed like so: \n", "# > A[1][0][1]\n", "# old definition: A[b][e][a] -> A[a][b][e]\n", "# A = [[[0.99, 0.01], [0.10, 0.90]], [[0.05, 0.95], [0.01, 0.99]]]\n", "A = [[[None, None], [None, None]], [[None, None], [None, None]]]\n", "for (b, bprobs) in enumerate([[[0.99, 0.01], [0.10, 0.90]], [[0.05, 0.95], [0.01, 0.99]]]):\n", " for (e, eprobs) in enumerate(bprobs):\n", " for (a, p) in enumerate(eprobs):\n", " A[a][b][e] = p\n", "# old definitions: outer index is value of A\n", "# F = [[0.95, 0.05], [0.25, 0.75]]\n", "# L = [[0.90, 0.10], [0.05, 0.95]]\n", "F = [[None, None], [None, None]]\n", "L = [[None, None], [None, None]]\n", "\n", "for (a, aprobs) in enumerate([[0.95, 0.05], [0.25, 0.75]]):\n", " for (f, p) in enumerate(aprobs):\n", " F[f][a] = p\n", "for (a, aprobs) in enumerate([[0.90, 0.10], [0.05, 0.95]]):\n", " for (l, p) in enumerate(aprobs):\n", " L[l][a] = p\n", "\n", "# Quick sanity check to ensure, e.g., P(A=0 | B=1, E=0) + P(A=1 | B=1, E=0) = 1.0\n", "assert(abs(A[0][0][0] + A[1][0][0] - 1.0) < 0.001)\n", "assert(abs(A[0][0][1] + A[1][0][1] - 1.0) < 0.001)\n", "assert(abs(A[0][1][0] + A[1][1][0] - 1.0) < 0.001)\n", "assert(abs(A[0][1][1] + A[1][1][1] - 1.0) < 0.001)\n", "\n", "assert(abs(F[0][1] + F[1][1] - 1.0) < 0.001)\n", "assert(abs(F[0][0] + F[1][0] - 1.0) < 0.001)\n", "\n", "assert(abs(L[0][1] + L[1][1] - 1.0) < 0.001)\n", "assert(abs(L[0][0] + L[1][0] - 1.0) < 0.001)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then answer the usual queries, e.g., $P(F=1\\mid A=1)$ by looking up their values. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.75\n" ] } ], "source": [ "print(F[1][1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute arbitrary queries in the usual way. For example, we can compute the marginal probability of F (i.e., $P(F)$) using the factorization of $P(B, E, A, F, L)$." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.924079, 0.075921]\n" ] } ], "source": [ "# F takes on 2 values (0 and 1), so we initialize our representation of its \n", "# parameters as a 2-vector, indexed on the value of F\n", "PF = [0.0, 0.0]\n", "for b in range(len(B)):\n", " for e in range(len(E)):\n", " for a in range(len(A)):\n", " for f in range(len(F)):\n", " for l in range(len(L)):\n", " # marginalize over P(F|A)P(L|A)P(A|E,B)P(E)P(B)\n", " PF[f] += F[f][a] * L[l][a] * A[a][b][e] * E[e] * B[b]\n", "\n", "# Let's make sure we've just computed a valid probability distribution\n", "print(PF)\n", "assert(abs((PF[0] + PF[1]) - 1.0) < 0.001)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now compute $P(F\\mid do(A:=1))$. We could express this in two ways: we can update the probability distribution, setting all instances where A=1 to have probability 1.0, or we could modify our loop. Let's do one, then the other." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# What is P(F|do(A:=1))?\n", "\n", "newA = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]\n", "\n", "PFdoA1 = [0.0, 0.0]\n", "for b in range(len(B)):\n", " for e in range(len(E)):\n", " for a in range(len(newA)):\n", " for f in range(len(F)):\n", " for l in range(len(L)):\n", " # marginalize over P(F|A)P(L|A)P(A|E,B)P(E)P(B)\n", " # Change (2): A takes on 1 with probability 1; replace A[b][e][a] with 1.0 (or remove)\n", " PFdoA1[f] += F[f][a] * L[l][a] * newA[a][b][e] * E[e] * B[b]\n", "\n", "# Let's make sure we've just computed a valid probability distribution\n", "assert(abs((PFdoA1[0] + PFdoA1[1]) - 1.0) < 0.001)\n", "\n", "# equivalently, without using newA\n", "PFdoA1_2 = [0.0, 0.0]\n", "for b in range(len(B)):\n", " for e in range(len(E)):\n", " # Change (1): manually restricting values of A from [0,1] to [1]\n", " for a in [1]:\n", " for f in [0, 1]:\n", " for l in [0, 1]:\n", " # marginalize over P(F|A)P(L|A)P(A|E,B)P(E)P(B)\n", " # Change (2): A takes on 1 with probability 1; replace A[b][e][a] with 1.0 (or remove)\n", " PFdoA1_2[f] += F[f][a] * L[l][a] * 1.0 * E[e] * B[b]\n", "\n", "# Let's make sure we've just computed a valid probability distribution\n", "assert(abs((PFdoA1_2[0] + PFdoA1_2[1]) - 1.0) < 0.001)\n", "# assert that the results are the same\n", "assert(abs((PFdoA1[0] - PFdoA1_2[0])) < 0.001)\n", "assert(abs((PFdoA1[1] - PFdoA1_2[1])) < 0.001)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have the _interventional_ distribution, we can compute the effects of setting A:=0 vs. A:=1. Since we may not have observed instances of both A:=0 and A:=1, the way to do this is to compute the _expected value_ of the effect: $\\mathbb{E}(F\\mid do(A:=0) - F\\mid do(A:=1)) = \\sum_{f\\in\\{0, 1\\}} f * P(f\\mid do(A:=0)) - \\sum_{f\\in\\{0, 1\\}} f * P(f\\mid do(A:=1))$." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Effect of setting A:=1:\t -1.75\n" ] } ], "source": [ "# What is the effect of intervention on A?\n", "\n", "# First need to compute effect of intervetion under A:=0:\n", "# TODO\n", "PFdoA0 = [-1, -1]\n", "\n", "EFdoA0 = sum([f * PFdoA1[f] for f in range(len(F))])\n", "EFdoA1 = sum([f * PFdoA0[f] for f in range(len(F))])\n", "\n", "print(\"Effect of setting A:=1:\\t\", EFdoA0 - EFdoA1)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Does P(F|A=1) differ from P(F|do(A:=1))? Why or why not?" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# What is the effect of intervention on A for the modified graph below?\n", "# use indices to denote values that the variable takes on\n", "B = [0.99, 0.01]\n", "E = [0.98, 0.02]\n", "# outer index is B; inner is E. don't do this at home.\n", "A = [[[0.99, 0.01], [0.10, 0.90]], [[0.05, 0.95], [0.01, 0.99]]]\n", "# outer index is value of A; inner is B\n", "F = [[[0.99, 0.01], [0.98, 0.02]], [[0.97, 0.03], [0.96, 0.04]]]\n", "L = [[0.90, 0.10], [0.05, 0.95]]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# practive computing each of P(F), P(F|do(A:=0)), P(F|do(A:=1)), and E[F|do(A:=0) - F|do(A:=1)]" ] } ], "metadata": { "interpreter": { "hash": "7889c71bdef5446a46fcad694a46e8ee114e56669b21828ffa64e3f126bb7780" }, "kernelspec": { "display_name": "Python 3.8.12 ('venv': venv)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }