In this assignment, I explored the performance trade-offs between deep learning architectures (ResNet-18, ResNet-50) and a traditional machine learning model (an SVM) on computer vision benchmarks.
The ResNet-18 model emerged as the best performer on both MNIST and Fashion-MNIST. Despite being shallower than ResNet-50, it achieved higher accuracy (~99.09% on MNIST) with significantly lower computational overhead. On low-resolution $28 \times 28$ images, the extra depth of ResNet-50 likely led to redundant feature learning or minor degradation, suggesting that "deeper is not always better" for simpler tasks.
• Understood the practical impact of hardware acceleration (GPU vs CPU), observing a ~12.5x training speedup.
• Gained insight into how hyperparameters such as batch size and optimizer choice (Adam vs SGD) fundamentally shift convergence paths.
• Learned the importance of DataLoader optimizations such as pin_memory, which enable efficient DMA transfers from host to GPU memory.
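The last point can be sketched concretely: pin_memory=True allocates batches in page-locked host memory so host-to-GPU copies can use DMA, and pairing it with non_blocking=True overlaps those copies with compute. A minimal, illustrative setup (the random tensors stand in for MNIST data; all names are assumptions, not the assignment's actual code):

```python
# Sketch of a pinned-memory data-loading setup. The random tensors
# mirror MNIST shapes (1x28x28); dataset/loader names are illustrative.
import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(256, 1, 28, 28)     # stand-in for MNIST images
labels = torch.randint(0, 10, (256,))    # stand-in for class labels
dataset = TensorDataset(images, labels)

loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=0,     # 0 keeps the example self-contained
    pin_memory=True,   # page-locked host buffers -> DMA-friendly copies
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for x, y in loader:
    # non_blocking only overlaps the copy when the source is pinned
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
```

On a CPU-only machine pin_memory is simply ignored (with a warning in recent PyTorch), so the same code runs in both environments.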