Efficient Semantic Segmentation of Urban Traffic Images using ERFNet and Fisheye Cameras

Abstract
Efficient and accurate semantic segmentation of urban traffic images is critical for applications such as autonomous driving, traffic monitoring, and urban planning. However, the complex and dynamic nature of urban scenes, occlusions, and fisheye distortion make accurate segmentation challenging. In this paper, we propose using the Efficient Residual Factorized ConvNet (ERFNet) architecture to segment urban traffic images captured with fisheye cameras. We conducted experiments on a dataset of urban traffic images and compared ERFNet with several state-of-the-art architectures. ERFNet outperformed the other architectures in both accuracy and speed, achieving an intersection over union (IoU) score of 80.2%. It also had the lowest computational cost of the compared architectures, making it suitable for real-time applications with limited resources. These results demonstrate the potential of ERFNet for efficient semantic segmentation of urban traffic images captured with fisheye cameras and provide insights for future research in this area.
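
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a class-averaged IoU score is commonly computed from predicted and ground-truth label maps. It is an illustration only: the abstract does not state whether the 80.2% figure is averaged over classes or how absent classes are handled, so the averaging rule, the function name `mean_iou`, and the toy labels below are all assumptions rather than the authors' evaluation code.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Class-averaged IoU over flattened label maps.

    Assumption: classes absent from both prediction and ground truth
    are skipped rather than counted as 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three hypothetical classes (not data from the paper).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, num_classes=3), 3))  # 0.5
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, which average to 0.5.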

Keywords
Semantic segmentation, ERFNet, CNN, urban area traffic, fisheye camera.

Cite this paper
Pichika Ravi Kiran, Midhun Chakkaravarthy, Efficient Semantic Segmentation of Urban Traffic Images using ERFNet and Fisheye Cameras, SCIREA Journal of Computer. Volume 9, Issue 3, June 2024 | PP. 73-86. DOI: 10.54647/computer520420
