Skip to main content

Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection

  • Conference paper
  • First Online:
MultiMedia Modeling (MMM 2020)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11962))

Included in the following conference series:

Abstract

Most existing AU detection works considering AU relationships are relying on probabilistic graphical models with manually extracted features. This paper proposes an end-to-end deep learning framework for facial AU detection with graph convolutional network (GCN) for AU relation modeling, which has not been explored before. In particular, AU related regions are extracted firstly, latent representations full of AU information are learned through an auto-encoder. Moreover, each latent representation vector is feed into GCN as a node, the connection mode of GCN is determined based on the relationships of AUs. Finally, the assembled features updated through GCN are concatenated for AU detection. Extensive experiments on BP4D and DISFA benchmarks demonstrate that our framework significantly outperforms the state-of-the-art methods for facial AU detection. The proposed framework is also validated through a series of ablation studies.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
eBook
USD 89.00
Price excludes VAT (USA)
Softcover Book
USD 119.99
Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Atwood, J., Towsley, D.: Diffusion-convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1993–2001 (2016)

    Google Scholar 

  2. Bruna, J., Zaremba, W., Szlam, A., LeCun, Y.: Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203 (2013)

  3. Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, pp. 3844–3852 (2016)

    Google Scholar 

  4. Duvenaud, D.K., et al.: Convolutional networks on graphs for learning molecular fingerprints. In: Advances in Neural Information Processing Systems, pp. 2224–2232 (2015)

    Google Scholar 

  5. Ekman, R.: What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press, Oxford (1997)

    Google Scholar 

  6. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: Liblinear: a library for large linear classification. J. Mach. Learn. Res. 9(Aug), 1871–1874 (2008)

    MATH  Google Scholar 

  7. Hammond, D.K., Vandergheynst, P., Gribonval, R.: Wavelets on graphs via spectral graph theory. Appl. Comput. Harmonic Anal. 30(2), 129–150 (2011)

    Article  MathSciNet  Google Scholar 

  8. Henaff, M., Bruna, J., LeCun, Y.: Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163 (2015)

  9. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)

  10. Li, G., Zhu, X., Zeng, Y., Wang, Q., Lin, L.: Semantic relationships guided representation learning for facial action unit recognition. arXiv preprint arXiv:1904.09939 (2019)

    Article  Google Scholar 

  11. Li, W., Abtahi, F., Zhu, Z., Yin, L.: EAC-Net: a region-based deep enhancing and cropping approach for facial action unit detection. In: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 103–110. IEEE (2017)

    Google Scholar 

  12. Li, Y., Tarlow, D., Brockschmidt, M., Zemel, R.: Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493 (2015)

  13. Liu, Z., Song, G., Cai, J., Cham, T.J., Zhang, J.: Conditional adversarial synthesis of 3D facial action units. arXiv preprint arXiv:1802.07421 (2018)

  14. Martinez, B., Valstar, M.F., Jiang, B., Pantic, M.: Automatic analysis of facial actions: a survey. IEEE Trans. Affect. Comput. 10, 325–347 (2017)

    Article  Google Scholar 

  15. Masci, J., Meier, U., Cireşan, D., Schmidhuber, J.: Stacked convolutional auto-encoders for hierarchical feature extraction. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds.) ICANN 2011. LNCS, vol. 6791, pp. 52–59. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21735-7_7

    Chapter  Google Scholar 

  16. Mavadati, S.M., Mahoor, M.H., Bartlett, K., Trinh, P., Cohn, J.F.: DISFA: a spontaneous facial action intensity database. IEEE Trans. Affect. Comput. 4(2), 151–160 (2013)

    Article  Google Scholar 

  17. Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)

    Google Scholar 

  18. Ng, Y.C., Colombo, N., Silva, R.: Bayesian semi-supervised learning with graph gaussian processes. In: Advances in Neural Information Processing Systems, pp. 1683–1694 (2018)

    Google Scholar 

  19. Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs. In: International Conference on Machine Learning, pp. 2014–2023 (2016)

    Google Scholar 

  20. Song, Y., McDuff, D., Vasisht, D., Kapoor, A.: Exploiting sparsity and co-occurrence structure for action unit recognition. In: 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG). vol. 1, pp. 1–8. IEEE (2015)

    Google Scholar 

  21. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–1708 (2014)

    Google Scholar 

  22. Wang, S., Hao, L., Ji, Q.: Facial action unit recognition and intensity estimation enhanced through label dependencies. IEEE Trans. Image Process. 28(3), 1428–1442 (2019). https://doi.org/10.1109/TIP.2018.2878339

    Article  MathSciNet  MATH  Google Scholar 

  23. Wang, Z., Li, Y., Wang, S., Qiang, J.: Capturing global semantic relationships for facial action unit recognition. In: IEEE International Conference on Computer Vision (2014)

    Google Scholar 

  24. Zhang, X., et al.: A high-resolution spontaneous 3D dynamic facial expression database. In: 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1–6. IEEE (2013)

    Google Scholar 

  25. Zhao, K., Chu, W.S., De la Torre, F., Cohn, J.F., Zhang, H.: Joint patch and multi-label learning for facial action unit and holistic expression recognition. IEEE Trans. Image Process. 25(8), 3931–3946 (2016)

    Article  MathSciNet  Google Scholar 

  26. Zhao, K., Chu, W.S., Zhang, H.: Deep region and multi-label learning for facial action unit detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3391–3399 (2016)

    Google Scholar 

  27. Zou, M., Conzen, S.D.: A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 21(1), 71–79 (2005)

    Article  Google Scholar 

Download references

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants of 41806116 and 61503277. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Cuicui Zhang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Liu, Z., Dong, J., Zhang, C., Wang, L., Dang, J. (2020). Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science(), vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-37734-2_40

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37733-5

  • Online ISBN: 978-3-030-37734-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics