Due at 3:00 pm on Thursday, April 9th. Please submit to Gradescope and follow the general guidelines regarding homework assignments.

This assignment is based on the research done in the Seung lab. Microscope images are taken of adjacent sections of brain tissue (serial section microscopy) and algorithms scan through these images and segment all the neurons. The hope is that these segmentations, in combination with knowledge about where the neurons form synapses with each other, can allow people to generate wiring diagrams (connectomes) accurate and large enough to learn something about the function of a brain.

In this assignment, we will use the UNet architecture to give pixelwise labelings of neuron boundaries and then we will use connected components to fill in these boundaries, giving us segmentations. The notebook for this pset can be found here: UNet.ipynb.
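To make the connected-components step concrete, here is a minimal sketch that labels regions in a toy binary map with a breadth-first flood fill. It assumes 4-connectivity and that 1 marks a non-boundary pixel; the function name `label_components` and the toy array are hypothetical, and the notebook may well use a library routine instead.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of nonzero pixels; zeros (boundaries) keep label 0."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    num_labels = 0
    for y in range(height):
        for x in range(width):
            # Skip boundary pixels and pixels that already belong to a region.
            if not mask[y][x] or labels[y][x]:
                continue
            num_labels += 1
            labels[y][x] = num_labels
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = num_labels
                        queue.append((ny, nx))
    return labels, num_labels


# Toy map: a vertical boundary (zeros) separates two regions.
toy_mask = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
labels, num_labels = label_components(toy_mask)
```

Each region enclosed by boundary pixels receives its own integer label, which is what turns a boundary map into a segmentation.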

Using a GPU should speed up training by more than a factor of 10, so we recommend using the GPU capabilities of Colaboratory for this exercise. Ensure that the GPU is enabled by clicking on Runtime -> Change runtime type -> Hardware accelerator and making sure GPU is selected.
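One way to confirm from inside the notebook that a GPU is actually visible to PyTorch is sketched below. The variable names are illustrative and not taken from the notebook; the code falls back to the CPU when no GPU is found.

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Tensors (and models, via .to(device)) must live on the chosen device.
x = torch.zeros(1, 1, 8, 8, device=device)
```

If this prints `cpu` on Colaboratory, the runtime type has most likely not been switched to GPU.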

  1. Visualize the Dataset
    • You don’t have to submit anything for this part. Just run the code under the section “Visualize Dataset” to view examples of microscope images, neuron borders, and the resulting connected components segmentation. Can you identify the neuron borders without help from the ground truth border maps?
  2. Implement UNet
    • Use the paper for guidance here, especially figure 1. Make 3 modifications to the architecture in this assignment:
      1. Output a single feature map instead of two. Sigmoid of this value gives us our predicted probability of the pixel not being a boundary.
      2. Use a network that is 4 times narrower. Note that “narrow” refers to the number of channels/feature maps, not the xy size of each feature map. So, for example, the feature maps with 64 channels in the original UNet paper now have 16 channels.
      3. Use “same” padding convention so convolution operations do not change the xy size of feature maps. This ensures the output is the same xy size as the input.
    • We have put some helper functions in the cell under “Create UNet”. Your task is to fill in the following sections:
      1. ConvBlock: perform two 3x3 convolutions (two horizontal blue arrows in figure 1)
      2. DownConvBlock: perform downsampling followed by two 3x3 convolutions (1 red arrow plus 2 horizontal blue arrows in figure 1)
      3. UpConvBlock: perform upsampling, concatenate feature maps, then do two 3x3 convolutions (1 gray, 1 green and 2 blue arrows in figure 1)
      4. UNet.forward: perform forward pass of UNet
      5. UNet.initialize_weights: This network has more than 20 layers of convolution. This makes initialization important. Implement the kernel initialization described in the paper. Initialize the biases to zero.
  3. Train UNet
    • You don’t have to submit anything for this part. Simply run the cells in the “Train UNet” section. You can stop once validation loss is not improving. This should be a good check that your UNet implementation is correct.
    • You should see both train and validation loss decrease fairly quickly.
  4. Make Segmentations
    • You don’t have to submit anything for this part. Run the cells in the “Make Segmentations” section. This uses connected components to generate a segmentation. Can you see any qualitative differences between training set and validation segmentations?
  5. Receptive Field:
    • The receptive field of a CNN is the set of pixels in the input that affect a single output pixel. We compute the receptive field for UNet using two methods:
      1. Analytic Method: Ignoring border effects, calculate the size of the receptive field for UNet. This is a tricky calculation. Also note that for this architecture, it depends on exactly which output pixel you choose (even when ignoring border effects). Your final answer should have 2 possible receptive field sizes in x and 2 in y, for a total of 4 possible receptive field sizes. Hint: The skip connections don’t change the computation here, so you can just consider the downsampling/upsampling path.
      2. Empirical Method: Use PyTorch’s auto-differentiation capabilities to compute the gradient of a central output pixel (y,x) = (256,256) with respect to the input image. (Use a randomly initialized network here.) Explain why this gradient should allow you to get a good estimate of the receptive field.
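The mechanics of the empirical method can be tried on a much smaller network first. The sketch below is not the UNet: `toy_net` is a hypothetical stand-in made of two 3x3 “same” convolutions with no nonlinearity, chosen so that the expected receptive field (3 + 2 = 5 pixels per axis) is easy to verify by hand. The image size and pixel location are also illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in network (NOT the UNet): two 3x3 "same" convolutions.
toy_net = nn.Sequential(
    nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
)

# The input must track gradients so that autograd fills in image.grad.
image = torch.randn(1, 1, 33, 33, requires_grad=True)
output = toy_net(image)

# Backpropagate from one central output pixel only.
cy, cx = 16, 16
output[0, 0, cy, cx].backward()

# Input pixels with nonzero gradient are those that influence the chosen output.
grad = image.grad[0, 0]
rows = torch.nonzero(grad.abs().sum(dim=1) > 0).flatten()
cols = torch.nonzero(grad.abs().sum(dim=0) > 0).flatten()
rf_height = int(rows.max() - rows.min()) + 1
rf_width = int(cols.max() - cols.min()) + 1
print(rf_height, rf_width)
```

With ReLU units in the network, some gradients inside the receptive field can be exactly zero for a particular input, so the extent measured this way is a lower bound on the true receptive field.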