Full problem statements can be found here. All code for this assignment can be found here.
Question 1 - Aj R Laddha
Question 2 - Pixelwise Image Segmentation - Kaustubh Verma
Code for the task can be found here
Description
This task invloved Pixelwise Image Segmentation on Iris image dataset. The image segmentation problem is a core vision problem with a longstanding history of research. Historically, this problem has been studied in the unsupervised setting as a clustering problem of given an image, produce a pixelwise prediction that segments the image into coherent clusters corresponding to objects in the image .
My Model
Here we have implemented FCN(Fully Convolutional Networks) model for Semantic Segmentation.Fully convolutional networks can efficiently learn to make dense predictions for per-pixel tasks like semantic segmentation. For this task I have implemented FCN 8 architecture which uses three concatenation of layers from pool3,pool4 and conv7 to produce final image segmentation. Such architecture enables fine detailing for segmentation task.
Question 3 - Core Point Detection - Nikhil T R
Description
In this task were asked to build and train a neural network that can find the core point of a given fingerprint. The dataset provided had 4000 images. Test images were not provided.
My Work
Initial look into the dataset showed 14 images that were polluted - they were blank. Those images were removed. I resized all the images to 250x400. The corresponding ground truths were also modified. A npz compressed archive of the images was made, along with the ground truths which was saved in a txt file for comparison later on. Scripts were also made to run the trained model on new test data.
The Model - Explained
Since the final output are co-ordinates, this is basically a regression problem. The CNN block consisted of 3 layers with kernel size 3x3 and padding being valid
. BatchNormalization was used for reducing overfitting and MaxPooling2D was used to bring down the feature size. The dense block was made of 3 layers with 128, 64 and 2 neurons respectively.
Input images were normalized to the range of 0-1 from 0-255. However, the output was not normalized. The activation of the last layer was set to relu
.