
PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens

  • Conference paper
  • Part of the proceedings: Computer Vision – ECCV 2022 (ECCV 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13664)

Abstract

The accelerated adoption of digital cameras prompts increasing concerns about privacy and security, particularly in applications such as action recognition. In this paper, we propose an optimization framework that provides robust visual privacy protection along the human action recognition pipeline. Our framework parameterizes the camera lens to degrade video quality enough to inhibit privacy attributes and protect against adversarial attacks, while preserving the features relevant to activity recognition. We validate our approach with extensive simulations and hardware experiments.
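The abstract describes an optics-in-the-loop design: a differentiable, parameterized lens degrades the video, an action recognizer is trained on the degraded frames, and privacy attributes are suppressed adversarially. The sketch below illustrates how such a setup can be wired together. It is a minimal illustration under stated assumptions (PyTorch, a generic phase-coefficient lens model, hypothetical `action_net` and `privacy_net` classifiers); it is not the authors' implementation.

```python
# Hedged sketch of an optics-in-the-loop adversarial training step.
# All module names, bases, and hyperparameters are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ParameterizedLens(nn.Module):
    """Toy differentiable lens: learnable phase coefficients -> PSF -> blur."""

    def __init__(self, n_coeffs: int = 15, psf_size: int = 31):
        super().__init__()
        # Learnable coefficients over a phase/aberration basis (a true system
        # would use e.g. a Zernike basis; a random basis stands in here).
        self.coeffs = nn.Parameter(torch.zeros(n_coeffs))
        self.register_buffer("basis", torch.randn(n_coeffs, psf_size, psf_size))

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, C, H, W) video clip
        phase = torch.einsum("k,khw->hw", self.coeffs, self.basis)
        psf = torch.softmax(phase.flatten(), dim=0).view(1, 1, *phase.shape)
        b, t, c, h, w = frames.shape
        x = frames.reshape(b * t * c, 1, h, w)
        x = F.conv2d(x, psf, padding=psf.shape[-1] // 2)  # simulate optical blur
        return x.reshape(b, t, c, h, w)


def adversarial_step(lens, action_net, privacy_net, frames, action_y, privacy_y,
                     opt_main, opt_priv, alpha=1.0):
    """One training step: keep action accuracy, inhibit the privacy adversary."""
    # 1) Update lens + action network: minimize action loss, maximize privacy loss.
    degraded = lens(frames)
    loss_main = F.cross_entropy(action_net(degraded), action_y) \
        - alpha * F.cross_entropy(privacy_net(degraded), privacy_y)
    opt_main.zero_grad()
    loss_main.backward()
    opt_main.step()

    # 2) Update the privacy adversary on the degraded (detached) frames.
    loss_priv = F.cross_entropy(privacy_net(degraded.detach()), privacy_y)
    opt_priv.zero_grad()
    loss_priv.backward()
    opt_priv.step()
    return loss_main.item(), loss_priv.item()
```

In this sketch, `opt_main` optimizes the lens and action recognizer jointly, while `opt_priv` trains the privacy adversary in a separate step on the degraded frames; the `alpha` weight balances utility against privacy and is purely illustrative.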



Author information

Corresponding author

Correspondence to Carlos Hinojosa.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 15827 KB)

Supplementary material 2 (pdf 2099 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hinojosa, C., Marquez, M., Arguello, H., Adeli, E., Fei-Fei, L., Niebles, J.C. (2022). PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13664. Springer, Cham. https://doi.org/10.1007/978-3-031-19772-7_19


  • DOI: https://doi.org/10.1007/978-3-031-19772-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19771-0

  • Online ISBN: 978-3-031-19772-7

  • eBook Packages: Computer Science, Computer Science (R0)
