Local Pooling
Process that reduces the spatial dimensions of input data by aggregating information in local regions to create more abstract representations.
Local pooling operates on the feature maps generated by convolutional layers in neural networks, typically applying functions such as max or average pooling. Its primary purpose is to downsample, or summarize, the feature responses in localized patches of the feature map, making the output less sensitive to the exact location of features in the input image. This contributes to the network's ability to generalize from the training data to new, unseen data by providing a degree of invariance to small shifts and distortions. Max pooling takes the maximum value from each patch of the feature map, while average pooling computes the mean value. Pooling also reduces the computational cost of subsequent layers, lowers the risk of overfitting by providing an abstracted form of the input, and progressively enlarges the effective receptive field, which helps the network capture features at larger scales.
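As an illustration, the following minimal NumPy sketch implements both max and average pooling over a single-channel feature map. The function name `pool2d` and its parameters are hypothetical, chosen for this example; deep learning frameworks provide optimized, batched equivalents of this operation.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2-D feature map by aggregating each local
    size x size window with max or average pooling (illustrative sketch)."""
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    if mode == "max":
        reduce_fn = np.max
    elif mode == "avg":
        reduce_fn = np.mean
    else:
        raise ValueError("mode must be 'max' or 'avg'")
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Extract the local patch and summarize it with one value.
            patch = feature_map[i * stride : i * stride + size,
                                j * stride : j * stride + size]
            out[i, j] = reduce_fn(patch)
    return out

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 8, 3],
               [0, 9, 4, 5]], dtype=float)

print(pool2d(fm, mode="max"))  # [[6. 4.] [9. 8.]]
print(pool2d(fm, mode="avg"))  # [[3.75 2.25] [4.5  5.  ]]
```

Note that the invariance to small shifts is only approximate: with a 2×2 window and stride 2, a one-pixel shift of the input can still change the pooled output, which is why this property is often described as robustness rather than strict invariance.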
Local pooling became prominent with the rise of convolutional neural networks in the late 1980s and early 1990s, and its use spread widely in deep learning after the success of AlexNet in 2012, which made extensive use of overlapping max pooling.
Key figures in the development and popularization of local pooling include Yann LeCun, who used subsampling (average pooling) layers in the LeNet architecture for digit recognition in the late 1980s, and Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, whose AlexNet architecture won the ImageNet challenge in 2012 and heavily influenced the adoption of pooling layers in deep learning architectures.