Generative prediction of flow field based on the diffusion model

Jiajun Hu (1), Zhen Lu (1; corresponding author, zhen.lu@pku.edu.cn), Yue Yang (1, 2; corresponding author, yyg@pku.edu.cn)

(1) State Key Laboratory for Turbulence and Complex Systems, College of Engineering, Peking University, Beijing 100871, China

(2) HEDPS-CAPT, Peking University, Beijing 100871, China

Abstract

We propose a geometry-to-flow diffusion model that takes the obstacle shape as input and predicts the flow field past the obstacle. The model is based on a learnable Markov transition kernel that recovers the data distribution from the Gaussian distribution. The Markov process is conditioned on the obstacle geometry and estimates the noise to be removed at each step; it is implemented via a U-Net. A cross-attention mechanism incorporates the geometry as a prompt. We train the geometry-to-flow diffusion model using a dataset of flows past simple obstacles, including the circle, ellipse, rectangle, and triangle. For comparison, a CNN model is trained using the same dataset. Tests are carried out on flows past obstacles with simple and complex geometries, representing interpolation and extrapolation on the geometry condition, respectively. In the test set, challenging scenarios include a cross and the characters ‘PKU’. Generated flow fields show that the geometry-to-flow diffusion model is superior to the CNN model in predicting instantaneous flow fields and handling complex geometries. Quantitative analysis of the model accuracy and of the divergence in the fields demonstrates the high robustness of the diffusion model, indicating that the diffusion model learns physical laws implicitly.

Keywords: machine learning, computational methods, vortex shedding

1 Introduction

Understanding the interaction between fluid and solid objects is crucial for optimizing various engineering systems (Shelley & Zhang, 2011). Computational fluid dynamics (CFD) plays a vital role in revealing the complex fluid-solid interactions (Tong et al., 2021) and enhancing design efficiency. However, CFD faces several challenges, including the expertise required to generate high-quality computational meshes, especially for complex geometries (Perot, 2011), and the substantial computational resources needed for high-fidelity simulations (Kern et al., 2024). These challenges have motivated the exploration of CFD-accelerating approaches.

Machine learning (ML) has emerged as a promising solution to address the limitations of traditional CFD methods (Kochkov et al., 2021). By leveraging data-driven models, ML has demonstrated the potential to significantly accelerate simulations (Vinuesa & Brunton, 2022). Various ML approaches have been applied to CFD successfully, including super-resolution of flow fields (Fukami et al., 2019), estimation of forces on a plate in flows (Tong et al., 2022), the physics-informed neural network (PINN) for flow simulation (Chu et al., 2022), and velocity interpolation in multiphase flows (Hu et al., 2024). Despite these advancements, many existing ML methods struggle with extrapolation (Wu et al., 2020). This deficit in generalization to out-of-distribution scenarios poses a challenge for their application in CFD (Fukami et al., 2024), where extrapolation on flow parameters and solid geometry is often expected.

Generative artificial intelligence has recently revolutionized many fields, particularly large language models (Biever, 2023) and image generation (Pan et al., 2023). Generative models aim to learn the underlying probability distribution of a given dataset. Once trained, they generate outputs by sampling from the distribution. They have been applied to scientific research, such as the material design (Fu et al., 2023) and active control (Kim et al., 2024). Examples in CFD include the super-resolution of turbulence field via a generative adversarial network (Kim et al., 2021) and flow field prediction using a variational autoencoder (Ranade et al., 2021). However, most existing works still focus on interpolation within the conditions listed in the training set, lacking further examination of out-of-distribution generalization.

The diffusion model (Yang et al., 2023) has emerged as a powerful probabilistic method, with predominant formulations including the denoising diffusion probabilistic model (DDPM) (Sohl-Dickstein et al., 2015) and score-based generative models (Song & Ermon, 2020). The diffusion model progressively perturbs the data distribution towards a Gaussian by injecting noise, then learns to reverse this process for sample generation. Shan et al. (2024) proposed a physics-informed diffusion model for super-resolution. Qiu et al. (2024) trained a physics-informed diffusion model to output the time evolution of flow past a cylinder and inside veins. The ability of the diffusion model to learn distributions and its potential for extrapolation make it a promising approach for accelerating CFD and exploring complex flow phenomena around bodies with diverse geometries.

In this work, we mainly focus on the two-dimensional (2D) flow past an obstacle. We develop a geometry-to-flow (G2F) diffusion model in which the obstacle geometry serves as a prompt to guide the gradual denoising process, generating the flow field around obstacles. The G2F diffusion model is trained with elementary geometries. Evaluations using simple and complex geometries validate the model in interpolation and extrapolation tasks. Our results demonstrate the model’s capability to generate flow fields for a wide range of obstacle shapes, including those outside the training distribution.

2 Geometry prompt to flow field output

2.1 DDPM

We use the obstacle geometry as a prompt to generate a flow past the obstacle as output through the DDPM. Figure 1 provides a brief overview of this G2F diffusion model. There are two diffusion processes in the DDPM – the forward diffusion perturbs data to noise, and the reverse diffusion converts noise back to data (Sohl-Dickstein et al., 2015). The DDPM learns a probabilistic representation of the underlying data, thereby facilitating the generation of high-quality samples that resemble the target distribution.

For data $\boldsymbol{z}_{0}$ sampled from a distribution $q(\boldsymbol{z}_{0})$ , the forward diffusion utilizes a Markov transition kernel

$$q\left(\boldsymbol{z}_{t}|\boldsymbol{z}_{t-1}\right)=\mathcal{N}\left(\boldsymbol{z}_{t};\sqrt{1-\beta_{t}}\,\boldsymbol{z}_{t-1},\beta_{t}\boldsymbol{I}\right),\quad t=1,\dots,T \tag{1}$$

to inject noise, generating a sequence of noisy samples $\boldsymbol{z}_{1},\dots,\boldsymbol{z}_{T}$ in a total of $T$ steps, where $\beta_{t}\in\left(0,1\right)$ is the diffusion rate, $\mathcal{N}$ denotes the Gaussian distribution, and $\boldsymbol{I}$ is the identity matrix. For $T\rightarrow\infty$, $q\left(\boldsymbol{z}_{T}\right)\approx\mathcal{N}\left(\boldsymbol{0},\boldsymbol{I}\right)$, so that $\boldsymbol{z}_{T}$ can be approximated as a Gaussian vector.
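As an illustration of kernel (1), the following minimal NumPy sketch applies the forward step repeatedly to a deliberately non-Gaussian toy field and checks that the result approaches a standard Gaussian. The constant rate `beta = 0.02`, the array shape, and the name `forward_step` are assumptions chosen for illustration only, not values taken from this work.

```python
import numpy as np

def forward_step(z_prev, beta_t, rng):
    """Draw z_t ~ N(sqrt(1 - beta_t) * z_{t-1}, beta_t * I), i.e. kernel (1)."""
    noise = rng.standard_normal(z_prev.shape)
    return np.sqrt(1.0 - beta_t) * z_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
z = np.full((4, 64, 64), 3.0)   # toy "data": a constant field, clearly not Gaussian
beta = 0.02                     # assumed constant diffusion rate (illustration only)
for _ in range(400):
    z = forward_step(z, beta, rng)

# After many steps the sample statistics are close to those of N(0, I).
z_mean, z_std = float(z.mean()), float(z.std())
```

The initial value is damped by the factor $\prod_t\sqrt{1-\beta_t}$ while the injected noise accumulates to unit variance, which is why the final field carries almost no memory of the data.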

Figure 1: Schematic of the reverse diffusion process in the G2F diffusion model. It generates a sample by removing noise in a series of steps through a U-Net. The geometry prompt is injected into the U-Net via a cross-attention mechanism, which takes the geometry as the prompt to control the generation process.

To generate samples $\boldsymbol{z}_{0}$ , DDPM conducts a reverse diffusion process on the Gaussian vector $\boldsymbol{z}_{T}$ using a learnable Markov transition kernel

$$p_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{t-1}|\boldsymbol{z}_{t};\boldsymbol{c}\right)=\mathcal{N}\left[\boldsymbol{z}_{t-1};\boldsymbol{\mu}_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{t},t\right),\boldsymbol{\Sigma}_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{t},t\right),\boldsymbol{c}\right], \tag{2}$$

where $\boldsymbol{\theta}$ denotes the model parameters, $\boldsymbol{c}$ denotes the condition, and the mean $\boldsymbol{\mu}_{\boldsymbol{\theta}}$ and variance $\boldsymbol{\Sigma}_{\boldsymbol{\theta}}$ are parameterized by a neural network (NN). The parameters $\boldsymbol{\theta}$ are trained to make the reverse diffusion process a good approximation of the forward one, by minimizing the Kullback-Leibler divergence (Kullback & Leibler, 1951) between $q\left(\boldsymbol{z}_{0},\boldsymbol{z}_{1},\dots,\boldsymbol{z}_{T}\right)$ and $p_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{0},\boldsymbol{z}_{1},\dots,\boldsymbol{z}_{T};\boldsymbol{c}\right)$. To simplify training, Ho et al. (2020) proposed a loss function based on the mean-square error (MSE) of the predicted noise,

$$\mathcal{L}=\mathbb{E}_{t,\boldsymbol{z}_{0},\boldsymbol{\epsilon}}\left[\left\|\boldsymbol{\epsilon}-\boldsymbol{\epsilon}_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{t},t;\boldsymbol{c}\right)\right\|^{2}\right], \tag{3}$$

where $\mathbb{E}$ denotes the expectation, and $\boldsymbol{\epsilon}_{\boldsymbol{\theta}}\left(\boldsymbol{z}_{t},t;\boldsymbol{c}\right)$ is an NN predicting the noise vector $\boldsymbol{\epsilon}\sim\mathcal{N}\left(\boldsymbol{0},\boldsymbol{I}\right)$.
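A single Monte Carlo estimate of the loss (3) can be sketched as follows. The sketch relies on the standard closed form of the forward process from Ho et al. (2020), $\boldsymbol{z}_{t}=\sqrt{\bar{\alpha}_{t}}\,\boldsymbol{z}_{0}+\sqrt{1-\bar{\alpha}_{t}}\,\boldsymbol{\epsilon}$ with $\bar{\alpha}_{t}=\prod_{s\leq t}(1-\beta_{s})$. The function names, the placeholder predictor, the schedule, and the array sizes are hypothetical; in particular `zero_predictor` merely stands in for the U-Net.

```python
import numpy as np

def noisy_sample(z0, t, alpha_bar, eps):
    """Closed form of the forward process: z_t = sqrt(a_bar_t) z_0 + sqrt(1 - a_bar_t) eps."""
    a = alpha_bar[t - 1]                      # t runs from 1 to T
    return np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps

def ddpm_loss(eps_model, z0, c, alpha_bar, rng):
    """One-sample estimate of loss (3); uses the mean instead of the sum of
    squares, which differs from the squared norm only by a constant factor."""
    T = len(alpha_bar)
    t = int(rng.integers(1, T + 1))           # uniform t in {1, ..., T}
    eps = rng.standard_normal(z0.shape)       # target noise
    z_t = noisy_sample(z0, t, alpha_bar, eps)
    return float(np.mean((eps - eps_model(z_t, t, c)) ** 2))

def zero_predictor(z_t, t, c):
    """Hypothetical stand-in for the U-Net: always predicts zero noise."""
    return np.zeros_like(z_t)

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 2e-2, 100)          # assumed schedule, illustration only
alpha_bar = np.cumprod(1.0 - betas)
z0 = rng.standard_normal((4, 32, 32))         # toy sample with four channels
c = np.zeros((32, 32))                        # toy geometry condition
loss = ddpm_loss(zero_predictor, z0, c, alpha_bar, rng)
```

With the zero predictor the loss is close to 1, the variance of the target noise; a trained network would have to drive it below that baseline.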

2.2 Geometry prompt

In the G2F diffusion model sketched in figure 1, we introduce the geometry prompt as the condition $\boldsymbol{c}$ in (2), and use a U-Net (Ronneberger et al., 2015) with a cross-attention mechanism (Rombach et al., 2022) to realize $\boldsymbol{\epsilon}_{\boldsymbol{\theta}}(\boldsymbol{z}_{t},t;\boldsymbol{c})$.
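For readers unfamiliar with cross-attention, a generic scaled dot-product version is sketched below in NumPy: queries come from the flattened feature map, while keys and values come from the encoded prompt. This is only the textbook mechanism; the projection matrices, token counts, and dimensions are hypothetical and do not describe the actual layers of the G2F model.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(features, prompt, Wq, Wk, Wv):
    """features: (n_pix, d_f) flattened feature map, used as queries.
    prompt:   (n_tok, d_c) encoded geometry, used as keys and values."""
    Q = features @ Wq
    K = prompt @ Wk
    V = prompt @ Wv
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)  # (n_pix, n_tok)
    return weights @ V, weights

rng = np.random.default_rng(0)
n_pix, n_tok, d_f, d_c, d = 64, 16, 32, 8, 24   # hypothetical sizes
features = rng.standard_normal((n_pix, d_f))
prompt = rng.standard_normal((n_tok, d_c))
Wq = rng.standard_normal((d_f, d))
Wk = rng.standard_normal((d_c, d))
Wv = rng.standard_normal((d_c, d))
out, weights = cross_attention(features, prompt, Wq, Wk, Wv)
```

Each row of `weights` sums to one, so every feature location receives a convex combination of the prompt values.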

The U-Net consists of a shrinking path, an expanding path, and some skip connections between the two. The shrinking path gradually compresses a low-dimensional, high-resolution field to a high-dimensional, low-resolution field. It contains five layers, with sizes of (64, 64, 256), (32, 32, 512), (16, 16, 1024), (8, 8, 1024), and (4, 4, 2048), followed by an average pool layer to obtain encoded data with size of (1, 1, 2048). The expanding path follows the reverse process.

To incorporate the geometry prompt, we first encode the geometry information into a vector of 0 and 1, representing the fluid and solid, respectively. The prompt is encoded by two NNs, having the same sizes as the last and the second-to-last layers in the expanding path. Outputs from these two NNs are then combined into the last and the second-to-last layers in the expanding path. The cross-attention (Rombach et al., 2022) in the last two layers balances the interpolation and extrapolation in our practice.
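The binary encoding of the geometry described above can be sketched as a rasterization onto a uniform grid, with 1 for solid and 0 for fluid. The grid size and view box below reuse the values quoted for the flow-field data in § 2.3; whether the prompt uses the same resolution is an assumption, and `geometry_mask` is a hypothetical helper name.

```python
import numpy as np

def geometry_mask(inside, nx=128, ny=128, xlim=(-2.0, 10.0), ylim=(-4.0, 4.0)):
    """Rasterize an obstacle to a 0/1 array of shape (ny, nx): 1 = solid, 0 = fluid.
    `inside(X, Y)` returns True where the point lies within the obstacle."""
    x = np.linspace(xlim[0], xlim[1], nx)
    y = np.linspace(ylim[0], ylim[1], ny)
    X, Y = np.meshgrid(x, y, indexing="xy")
    return inside(X, Y).astype(np.float32)

def unit_circle(X, Y):
    """Circle of radius 1 centred at the origin."""
    return X**2 + Y**2 <= 1.0

mask = geometry_mask(unit_circle)
prompt = mask.ravel()          # flattened vector of 0 and 1
```

Any shape with an inside/outside test (a cross, a letter outline) can be encoded the same way by swapping the `inside` function.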

Note that DDPM generates results from random noise, so its outputs differ even with the same prompt. Here we focus on the geometry prompt, so each generated flow field is associated only with a geometry, without specifying a particular time.
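The stochastic character of the sampling can be illustrated with a short ancestral-sampling loop. The sketch uses the fixed-variance sampler of Ho et al. (2020), with the step variance set to $\beta_t$, as a simplification; this is swapped in for the learned variance $\boldsymbol{\Sigma}_{\boldsymbol{\theta}}$ of (2). The predictor `toy_predictor` is a hypothetical stand-in for the trained U-Net, so the output is not a flow field; only the dependence on the random seed is of interest.

```python
import numpy as np

def ddpm_sample(eps_model, c, shape, betas, rng):
    """Ancestral sampling with fixed step variance beta_t (Ho et al., 2020)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    z = rng.standard_normal(shape)                      # z_T ~ N(0, I)
    for t in range(len(betas), 0, -1):                  # t = T, ..., 1
        b, a, ab = betas[t - 1], alphas[t - 1], alpha_bar[t - 1]
        eps_hat = eps_model(z, t, c)                    # predicted noise
        mean = (z - b / np.sqrt(1.0 - ab) * eps_hat) / np.sqrt(a)
        noise = rng.standard_normal(shape) if t > 1 else 0.0
        z = mean + np.sqrt(b) * noise                   # no noise at the final step
    return z

def toy_predictor(z_t, t, c):
    """Hypothetical stand-in for the trained noise predictor."""
    return 0.1 * z_t

T = 400
betas = 1e-4 + 5e-5 * np.arange(1, T + 1)               # schedule quoted in section 2.3
c = np.zeros((16, 16))                                  # same prompt for all samples
sample_a = ddpm_sample(toy_predictor, c, (4, 16, 16), betas, np.random.default_rng(1))
sample_b = ddpm_sample(toy_predictor, c, (4, 16, 16), betas, np.random.default_rng(2))
sample_a_again = ddpm_sample(toy_predictor, c, (4, 16, 16), betas, np.random.default_rng(1))
```

Two different seeds give different samples for the same prompt, while repeating a seed reproduces the sample exactly.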

2.3 Datasets

We evaluate the G2F diffusion model based on 2D flows past various obstacles. The dataset includes a range of obstacle geometries, as illustrated in figure 2. The G2F diffusion model is trained using the flow fields around simple geometries, including the circle, ellipse, rectangle, and triangle. They are often used as building blocks to construct complex geometries. To assess the model’s ability to extrapolate to different geometries, the test set includes flows around obstacles not present in the training set. This includes several cases with parallelogram obstacles. Furthermore, the model is tested on a cross and the characters ‘PKU’, which have concave shapes.

The geometry complexity is characterized by the roundness $r=4\pi S/C^{2}$ , where $S$ and $C$ are the area and circumference, respectively. The roundness is bounded by $r=1$ for the circle and $r\rightarrow 0$ for extremely complex geometry. The cross in the test set has $r=0.321$ , smaller than all shapes in the training set, and the characters ‘PKU’ have $r\approx 0$ .
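The roundness formula is easy to check on a few reference shapes. The sketch below evaluates $r=4\pi S/C^{2}$ for a circle, a square, and an equilateral triangle; these particular shapes and sizes are chosen for illustration and are not claimed to match the dimensions used in the dataset.

```python
import math

def roundness(area, perimeter):
    """Roundness r = 4 * pi * S / C**2; equals 1 for a circle."""
    return 4.0 * math.pi * area / perimeter**2

# Circle of radius 1: S = pi, C = 2 * pi.
r_circle = roundness(math.pi, 2.0 * math.pi)
# Unit square: S = 1, C = 4, giving pi / 4.
r_square = roundness(1.0, 4.0)
# Equilateral triangle of side 1: S = sqrt(3) / 4, C = 3.
r_triangle = roundness(math.sqrt(3.0) / 4.0, 3.0)
```

The values are 1 for the circle, $\pi/4\approx 0.785$ for the square, and about 0.605 for the equilateral triangle, so $r$ decreases as the outline departs from a circle.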

Focusing on the geometry condition, we evaluate the model using flow field data with the same free-stream velocity $U_{\infty}=1$, viscosity $\mu=1\times 10^{-2}$, and density $\rho=1$. The circle radius is 1, and the characteristic length of the obstacles ranges from 1 (a rectangle or ellipse with its longitudinal axis aligned in the streamwise direction) to 4 (a rectangle or ellipse with its longitudinal axis aligned in the spanwise direction). Correspondingly, the Reynolds number $Re$ varies from 100 to 400. We solve the incompressible Navier-Stokes equations using OpenFOAM (Greenshields, 2022) to obtain the flow field around various obstacles as the ground truth (GT). Body-fitted meshes with 55000 cells are employed for a computational domain of $\left(-20,60\right)\times\left(-20,20\right)$, where the geometric centroid of the obstacle is positioned at $(x,y)=(0,0)$. Each flow-field snapshot contains the velocity $(u,v)$, pressure $p$, and vorticity $\omega$. Note that we interpolate the flow field to a $128\times 128$ uniform grid over a view box of $\left(-2,10\right)\times\left(-4,4\right)$ to obtain $\boldsymbol{z}_{0}=[u,v,p,\omega]$ for the model. For each case, 50 snapshots are sampled with a time step of 1.
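The quoted Reynolds-number range follows directly from $Re=\rho U_{\infty}L/\mu$ with the parameters above. A short check is given below; the function name is hypothetical, and taking the circle diameter (2) as its characteristic length is inferred from the radius of 1 and the value $Re=200$ quoted for the cylinder case in § 3.1.

```python
import math

def reynolds_number(length, velocity=1.0, density=1.0, viscosity=1.0e-2):
    """Re = rho * U * L / mu, with defaults set to the dataset parameters."""
    return density * velocity * length / viscosity

re_min = reynolds_number(1.0)      # shortest characteristic length
re_max = reynolds_number(4.0)      # longest characteristic length
re_circle = reynolds_number(2.0)   # circle of radius 1, using its diameter
```

This gives 100, 400, and 200, respectively.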

In the implementation, we empirically set $T=400$ and $\beta_{t}=1\times 10^{-4}+5\times 10^{-5}t$ . The parameters $\boldsymbol{\theta}$ in the NN are optimized to minimize the loss function in (3). The parameters are initialized by the Kaiming method (He et al., 2015) and updated using the Adam optimizer (Kingma & Ba, 2015). The initial learning rate is 0.001, subsequently reduced to one-tenth of its preceding value after every 2000 epochs.
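The noise schedule and the learning-rate decay stated above can be written out as a short sketch. The cumulative product $\bar{\alpha}_{t}=\prod_{s\leq t}(1-\beta_{s})$ is the standard DDPM quantity measuring how much of the data survives after $t$ forward steps; the variable and function names are hypothetical.

```python
import numpy as np

T = 400
t = np.arange(1, T + 1)
betas = 1e-4 + 5e-5 * t                 # beta_t = 1e-4 + 5e-5 * t
alpha_bar = np.cumprod(1.0 - betas)     # fraction of signal variance left after t steps

def learning_rate(epoch, initial=1e-3, factor=0.1, every=2000):
    """Step decay: the rate is reduced to one-tenth after every 2000 epochs."""
    return initial * factor ** (epoch // every)

beta_first, beta_last = float(betas[0]), float(betas[-1])
signal_left = float(alpha_bar[-1])
lr_start, lr_after_one_drop = learning_rate(0), learning_rate(2000)
```

With these values $\beta_t$ grows linearly from $1.5\times 10^{-4}$ to about $2\times 10^{-2}$, and $\bar{\alpha}_{T}$ is of the order of $10^{-2}$, so $\boldsymbol{z}_{T}$ is dominated by noise but is not exactly Gaussian for this finite $T$.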

For comparison, we trained a CNN model using the same U-Net and dataset. The CNN model loads geometry prompts at the entrance of the shrinking path and outputs flow fields at the exit of the expanding path, similar to Shirzadi et al. (2022). The CNN parameters are optimized to minimize the MSE of outputs against the GT. We specifically chose to compare with the CNN because it utilizes the same U-Net architecture as the G2F diffusion model, allowing for a direct comparison of the two approaches. Physics-informed models, such as PINN (Karniadakis et al., 2021), are not compared here because the presented diffusion model does not incorporate explicit physical constraints. The physics-informed diffusion model is discussed in supplementary material and will be further studied within the same network structure in future work.

3 Results

We evaluate the G2F diffusion model against the obstacles in the training and test sets. Given the random nature of the model, where a specific time is not designated, a snapshot in the GT solution is selected for comparison with the model output. We measure the similarity between the model output and GT using the $L_{1}$ error

$$L_{1}\left(\boldsymbol{z}_{0}^{M}\right)=\left\|\boldsymbol{z}_{0}^{M}-\boldsymbol{z}_{0}^{\mathrm{GT}}\right\|, \tag{4}$$

where the superscript $M$ denotes the model output. Then the GT snapshot with the minimal $L_{1}\left(\boldsymbol{z}_{0}^{M}\right)$ is used for comparison below.
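The snapshot-matching step can be sketched as a nearest-neighbour search in the $L_{1}$ sense. The data here are synthetic random arrays standing in for the 50 GT snapshots of one case, and the sum of absolute differences is used as the norm in (4), which is an assumption about its exact definition.

```python
import numpy as np

def closest_snapshot(z_model, gt_snapshots):
    """Return the index of the GT snapshot with the smallest L1 distance
    to the model output, together with that distance."""
    diffs = np.abs(gt_snapshots - z_model[None])
    errors = diffs.reshape(len(gt_snapshots), -1).sum(axis=1)
    k = int(np.argmin(errors))
    return k, float(errors[k])

rng = np.random.default_rng(0)
gt = rng.standard_normal((50, 4, 16, 16))       # 50 synthetic snapshots of [u, v, p, omega]
z_model = gt[17] + 0.01 * rng.standard_normal(gt[17].shape)   # "model output" near snapshot 17
k, err = closest_snapshot(z_model, gt)
```

For a periodic wake this amounts to selecting the GT snapshot whose shedding phase is closest to that of the generated field.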

3.1 Training set

We first present the flow past a cylinder as a standard ML-enhanced CFD case. Figure 3 compares the G2F diffusion and CNN models against the GT. At $Re=200$, the flow is characterized by periodic vortex shedding.

Figure 3: Flow past a cylinder generated by the G2F diffusion model and CNN, compared with the GT: (a) contour of $u$, (b) profiles of $u$ in the wake at $x=0$, 2, and 6, (c) contour of pressure $p$, and (d) pressure on the obstacle surface. The vertical dashed lines in the upper left panel mark the stations at $x=0$, 2, and 6.

Figures 3(a) and (c) compare the instantaneous contours of the streamwise velocity $u$ and pressure $p$ , respectively. Both models perform well in the stagnation region. The G2F diffusion model captures the von Kármán vortex street in the wake. The model output well approximates the alternating shear layers and low-pressure zones, although it exhibits minor discrepancies in values. In contrast, the CNN model fails to generate the asymmetric flow structures, displaying a ‘time-averaged’ flow field. This is because the diffusion model learns the distribution $q\left(\boldsymbol{z}_{0}\right)$ of the data $\boldsymbol{z}_{0}$ , where a sample from the distribution approximates one snapshot of the flow field. Meanwhile, the CNN is trained to minimize the distance between its output and all snapshots, resulting in an ‘averaged’ result.

We plot the profiles of $u$ at $x=0$, 2, and 6 in figure 3(b). The G2F diffusion model agrees with the GT on the shift of velocity peaks and valleys as the wake develops. However, quantitative errors remain in the model output. Note that the G2F diffusion model returns negative values of $u$ inside the obstacle, where $u$ should be 0. Because it averages over all snapshots, the CNN model gives symmetric $u$ profiles at all three stations.

We further compare the pressures on the obstacle from different models in figure 3(d). The pressure on the obstacle surface is plotted against the angle in the polar coordinate, as illustrated in figure 3(c). The G2F diffusion model follows the GT, capturing the peak at the stagnation point and the subsequent asymmetric pressure drop on both sides. It also accurately generates the pressure rises at $\varphi=\pi/2$ and $3\pi/2$ , indicating a good prediction on the boundary layer separation. Although the CNN gives reasonable results on the stagnation point and the local minimum position, its solution includes strong unphysical oscillations.

Overall, the G2F and CNN models exhibit distinct behaviours in predicting the flow field. The G2F diffusion model produces results that qualitatively match the instantaneous flow field, while the CNN model provides a time-averaged flow field. Similar results are observed in other geometries. This difference arises from the goals of the two deep-learning NNs. The G2F diffusion model is trained to minimize the KL divergence, equivalent to maximizing the likelihood. Sampling from the distribution $p_{\boldsymbol{\theta}}(\boldsymbol{z}_{0};\boldsymbol{c})$ then returns data close to one snapshot in training data. The CNN model struggles with asymmetry because it minimizes the sum of errors between its output and all snapshots, leading to an averaged result.

3.2 Test set

We employ obstacle geometries absent from the training set to test the extrapolation capability of the G2F diffusion model, including parallelograms, a cross, and the characters ‘PKU’. The parallelograms exhibit similarities to the simple geometries in the training set. In contrast, the cross and the characters ‘PKU’ are out-of-distribution cases, characterized by combinations of elementary geometries and concave shapes.

Figure 4 compares the velocity and pressure fields generated by the G2F diffusion and CNN models around the cross with the GT. The G2F diffusion model generates the instantaneous flow field with shedding vortices and clear boundary layers. Stagnation zones on the upstream side of the cross are predicted correctly, implied by the high-pressure values on the obstacle surface. Meanwhile, the CNN model returns obvious oscillations around the obstacle, and the fields are averaged in the wake. Moreover, the CNN model incorrectly predicts two negative pressure regions near the front arm of the cross, implying that it treats the cross arms as standalone objects in the flow field. The G2F diffusion model addresses this challenge effectively, demonstrating the extrapolation capability of generative models.

Figure 4: Flow past a cross generated by the G2F diffusion model and CNN, compared with the GT: (a) contour of $u$, (b) profiles of $u$ in the wake at $x=0$, 2, and 6, (c) contour of pressure $p$, and (d) pressure on a unit circle crossing the four arms. The vertical dashed lines in the upper left panel mark the stations at $x=0$, 2, and 6.

Due to the two cavities downstream of the cross, there exists a local minimum following the upper edge of the cross, as shown around $y=1.5$ in the $u$ profile at $x=2$ . This is not found for simple geometries in the training set. The G2F diffusion model indicates a weak shear layer at this position, although it cannot accurately reproduce this structure. In addition, the G2F diffusion model gives a narrower wake flow compared with the GT, due to two possible reasons. First, the cross-attention mechanism makes the geometry more influential on the flow field around the obstacle (Vaswani et al., 2017), while the far field is less affected by the geometry. Second, a lack of obstacle geometries with large widths in the $y$ -direction in the training set makes it difficult for the G2F diffusion model to predict a wide wake flow.

Generating flow fields around the characters ‘PKU’ is challenging due to the sharp corners, concave curves, and separated components, none of which are included in the training set. Given the difficulty in simulating this flow with a simple mesh, we present the streamwise velocity $u$ and pressure $p$ generated by the two models in figure 5, without the GT. The G2F diffusion model demonstrates a remarkable extrapolation capability, capturing the major flow features, despite some artefacts such as the spurious flows through the characters ‘P’ and ‘U’ in the velocity field. Again, the CNN model introduces noise at character boundaries and provides time-averaged-like fields. Model performance on other obstacle geometries is provided in supplementary material.

All geometries’ results are summarized in figure 6. The black and red symbols represent the geometries in the training and test sets, respectively. The horizontal axis measures the roundness $r$ of geometries. Note that all geometries in the training set have $r\geq 0.5$ , while several geometries in the test set have $r<0.5$ , indicating out-of-distribution cases.

Figure 6: Comparison of results for different obstacle geometries: (a) normalized $L_{1}$ error between ML methods and GT and (b) divergence. It shows that the G2F diffusion model outperforms the CNN in understanding physical laws, and achieves good performance in extrapolation tasks.

We compare the accuracy of the two ML models in figure 6(a) using the $L_{1}$ error of $\left[u,v\right]$ normalized by the difference of the maximum and minimum values in the GT, $L_{1}([u,v])/([u,v]_{\max}^{\mathrm{GT}}-[u,v]_{\min}^{\mathrm{GT}})$ . The pressure and vorticity are not included, considering the differences in the order of magnitude. Most of the G2F diffusion model’s results are better than CNN’s, especially in the test set, indicating that the diffusion model is more accurate and robust. We also notice that CNN performs well for some slender obstacles, as these cases have weaker vortex shedding. Consequently, the wake flows in these cases are close to time-averaged results by the CNN model.

We further evaluate the model realizability via the divergence $\mathcal{D}_{\mathrm{div}}=\left\|\partial_{x}u+\partial_{y}v\right\|$ averaged over the domain, which quantifies the departure from incompressible flow. Results in figure 6(b) demonstrate that $\mathcal{D}_{\mathrm{div}}$ increases overall as $r$ decreases. For all cases, the G2F diffusion model yields smaller $\mathcal{D}_{\mathrm{div}}$ than the CNN model, and its $\mathcal{D}_{\mathrm{div}}$ grows more slowly as $r$ decreases. This confirms the robustness of the G2F diffusion model across various geometries, also implying that incorporating a divergence-free constraint may not significantly contribute towards better performance. The physics-informed diffusion model is discussed in supplementary material. Additionally, flows with $Re$ closer to the training-set average (around 200) yield better results, suggesting that data at a broader range of $Re$ should be incorporated in the training set to enhance the model performance.
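A divergence metric of this kind can be evaluated on the uniform output grid with finite differences. The sketch below takes the domain average of $|\partial_{x}u+\partial_{y}v|$ using `numpy.gradient`; the choice of the absolute value as the norm, the finite-difference scheme, and the absence of any masking of solid cells are assumptions. Two analytic test fields are used instead of model output.

```python
import numpy as np

def mean_abs_divergence(u, v, dx, dy):
    """Domain average of |du/dx + dv/dy| for fields of shape (ny, nx)."""
    du_dx = np.gradient(u, dx, axis=1)    # x varies along columns
    dv_dy = np.gradient(v, dy, axis=0)    # y varies along rows
    return float(np.mean(np.abs(du_dx + dv_dy)))

x = np.linspace(-2.0, 10.0, 128)
y = np.linspace(-4.0, 4.0, 128)
X, Y = np.meshgrid(x, y)
dx, dy = x[1] - x[0], y[1] - y[0]

# Divergence-free test field: u = sin(x) cos(y), v = -cos(x) sin(y).
d_free = mean_abs_divergence(np.sin(X) * np.cos(Y), -np.cos(X) * np.sin(Y), dx, dy)
# Field with constant divergence 2: u = x, v = y.
d_comp = mean_abs_divergence(X, Y, dx, dy)
```

The divergence-free field returns a value near zero, limited by the discretization error, while the second field returns 2, so the metric separates the two cases clearly.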

4 Conclusion

We propose a G2F diffusion model for predicting flows past an obstacle with various geometries. The model employs a U-Net architecture with a cross-attention mechanism to incorporate geometry information as a prompt, guiding the reverse diffusion process to generate flow fields. A dataset including 2D flows with different geometries is established for training and testing. The G2F diffusion model is trained using flow fields for simple obstacle geometries and evaluated on both interpolation and extrapolation tasks.

We assess the G2F diffusion model using simple and complex obstacle geometries present and absent in the training set, respectively. The model results are compared with those from a CNN model and the GT. The results demonstrate that the diffusion model outperforms the CNN model, particularly in generating instantaneous flow fields and handling out-of-distribution geometries. In particular, the diffusion model successfully captures essential flow features while maintaining physical consistency, even in challenging scenarios. It reproduces instantaneous flow fields by learning data distribution, while the CNN model yields time-averaged data. Furthermore, the diffusion model’s ability to learn physical laws is evident from its consistent performance across various geometries, as quantified by the divergence.

Beyond the current preliminary study on the application of the diffusion model in CFD, there are several issues that deserve further investigation. In addition to the obstacle geometry, $Re$ is another key parameter, which underlines the importance of diverse training data for improving the model’s generalization capabilities. Future research includes expanding the dataset to cover a wider range of Reynolds numbers and geometries, exploiting the scaling law, and enhancing the model’s ability to handle diverse flow conditions.

While results indicate that the diffusion model has the potential to learn the underlying physics implicitly, incorporating physics information could accelerate the sampling process, as discussed in supplementary material. Since the conditioned diffusion model suffers from the diversity problem (Gu et al., 2024), further exploration of condition implementation approaches is necessary to address this challenge. Additionally, investigating the application of the diffusion model in generating time-dependent flow field data can provide insights for understanding and predicting complex fluid dynamics.

Funding. Numerical simulations were performed on the TH-2A supercomputer in Guangzhou, China. This work has been supported in part by the National Natural Science Foundation of China (Nos. 11925201, 52306126, 11988102, and 92270203), the National Key R&D Program of China (Grant No. 2020YFE0204200), and the Xplore Prize.

Declaration of interests. The authors report no conflict of interest.

Author ORCIDs. Jiajun Hu, https://orcid.org/0009-0005-5192-876X; Zhen Lu, https://orcid.org/0000-0002-6729-3771; Yue Yang, https://orcid.org/0000-0001-9969-7431

Author contributions. Y.Y., Z.L., and J.H. designed research. J.H. and Z.L. performed research. All the authors discussed the results and wrote the manuscript. All the authors have given approval for the manuscript.

References

Biever (2023) Biever, C. 2023 ChatGPT broke the Turing test — the race is on for new ways to assess AI. Nature 619, 686–689.
Chu et al. (2022) Chu, M., Liu, J., Zheng, Q., Franz, E., Seidel, H.-P., Theobalt, C. & Zayer, R. 2022 Physics informed neural fields for smoke reconstruction with sparse data. ACM Trans. Graph. 41, 119.
Fu et al. (2023) Fu, N., Wei, L., Song, Y., Li, Q., Xin, R., Omee, S. S., Dong, R., Siriwardane, E. M. D. & Hu, J. 2023 Material transformers: deep learning language models for generative materials design. Mach. Learn.: Sci. Technol. 4, 015001.
Fukami et al. (2019) Fukami, K., Fukagata, K. & Taira, K. 2019 Super-resolution reconstruction of turbulent flows with machine learning. J. Fluid Mech. 870, 106–120.
Fukami et al. (2024) Fukami, K., Goto, S. & Taira, K. 2024 Data-driven nonlinear turbulent flow scaling with Buckingham Pi variables. J. Fluid Mech. 984, R4.
Greenshields (2022) Greenshields, C. J. 2022 OpenFOAM User Guide, 10th edn. OpenFOAM Foundation Ltd.
Gu et al. (2024) Gu, J., Shen, Y., Zhai, S., Zhang, Y., Jaitly, N. & Susskind, J. M. 2024 Kaleido diffusion: Improving conditional diffusion models with autoregressive latent modeling, arXiv: 2405.21048.
He et al. (2015) He, K., Zhang, X., Ren, S. & Sun, J. 2015 Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In 2015 IEEE International Conference on Computer Vision, pp. 1026–1034.
Ho et al. (2020) Ho, J., Jain, A. & Abbeel, P. 2020 Denoising diffusion probabilistic models. In 34th Conference on Neural Information Processing Systems, vol. 33, pp. 6840–6851.
Hu et al. (2024) Hu, J., Lu, Z. & Yang, Y. 2024 Improving prediction of preferential concentration in particle-laden turbulence using the neural-network interpolation. Phys. Rev. Fluids 9, 034606.
Karniadakis et al. (2021) Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S. & Yang, L. 2021 Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440.
Kern et al. (2024) Kern, J. S., Negi, P. S., Hanifi, A. & Henningson, D. S. 2024 Onset of absolute instability on a pitching aerofoil. J. Fluid Mech. 988, A8.
Kim et al. (2021) Kim, H., Kim, J., Won, S. & Lee, C. 2021 Unsupervised deep learning for super-resolution reconstruction of turbulence. J. Fluid Mech. 910, A29.
Kim et al. (2024) Kim, J., Kim, J. & Lee, C. 2024 Prediction and control of two-dimensional decaying turbulence using generative adversarial networks. J. Fluid Mech. 981, A19.
Kingma & Ba (2015) Kingma, D. P. & Ba, J. L. 2015 Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, p. 4.
Kochkov et al. (2021) Kochkov, D., Smith, J. A., Alieva, A., Wang, Q., Brenner, M. P. & Hoyer, S. 2021 Machine learning-accelerated computational fluid dynamics. Proc. Natl. Acad. Sci. USA 118, e2101784118.
Kullback & Leibler (1951) Kullback, S. & Leibler, R. A. 1951 On information and sufficiency. Ann. Math. Statist. 22, 79–86.
Pan et al. (2023) Pan, Z., Zhou, X. & Tian, H. 2023 Arbitrary style guidance for enhanced diffusion-based text-to-image generation. In IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 4450–4460.
Perot (2011) Perot, J. B. 2011 Discrete conservation properties of unstructured mesh schemes. Annu. Rev. Fluid Mech. 43, 299–318.
Qiu et al. (2024) Qiu, J., Huang, J., Zhang, X., Lin, Z., Pan, M., Liu, Z. & Miao, F. 2024 Pi-fusion: Physics-informed diffusion model for learning fluid dynamics, arXiv: 2406.03711.
Ranade et al. (2021) Ranade, R., Hill, C., He, H., Maleki, A., Chang, N. & Pathak, J. 2021 A composable autoencoder-based iterative algorithm for accelerating numerical simulations, arXiv: 2110.03780.
Rombach et al. (2022) Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. 2022 High-resolution image synthesis with latent diffusion models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10674–10685.
Ronneberger et al. (2015) Ronneberger, O., Fischer, P. & Brox, T. 2015 U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention 18th International Conference, vol. 9351, pp. 234–241.
Shan et al. (2024) Shan, S., Wang, P., Chen, S., Liu, J., Xu, C. & Cai, S. 2024 Pird: Physics-informed residual diffusion for flow field reconstruction, arXiv: 2404.08412.
Shelley & Zhang (2011) Shelley, M. J. & Zhang, J. 2011 Flapping and bending bodies interacting with fluid flows. Annu. Rev. Fluid Mech. 43, 449–465.
Shirzadi et al. (2022) Shirzadi, M., Fukasawa, T., Fukui, K. & Ishigami, T. 2022 Prediction of submicron particle dynamics in fibrous filter using deep convolutional neural networks. Phys. Fluids 34, 123303.
Sohl-Dickstein et al. (2015) Sohl-Dickstein, J., Weiss, E. A., Maheswaranathan, N. & Ganguli, S. 2015 Deep unsupervised learning using nonequilibrium thermodynamics. In 32nd International Conference on Machine Learning, vol. 3, pp. 2246–2255.
Song & Ermon (2020) Song, J. & Ermon, S. 2020 Multi-label contrastive predictive coding. In 34th Conference on Neural Information Processing Systems, vol. 33, pp. 8161–8173.
Tong et al. (2022) Tong, W., Wang, S. & Yang, Y. 2022 Estimating forces from cross-sectional data in the wake of flows past a plate using theoretical and data-driven models. Phys. Fluids 34 (11), 111905.
Tong et al. (2021) Tong, W., Yang, Y. & Wang, S. 2021 Estimating thrust from shedding vortex surfaces in the wake of a flapping plate. J. Fluid Mech. 920, A10.
Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł. & Polosukhin, I. 2017 Attention is all you need. In 31st Conference on Neural Information Processing Systems, vol. 2017-December, pp. 5999–6009.
Vinuesa & Brunton (2022) Vinuesa, R. & Brunton, S. L. 2022 Enhancing computational fluid dynamics with machine learning. Nat. Comput. Sci. 2, 358–366.
Wu et al. (2020) Wu, X., Li, R.-L., Zhang, F.-L., Liu, J.-C., Wang, J., Shamir, A. & Hu, S.-M. 2020 Deep portrait image completion and extrapolation. IEEE Trans. Image Process. 29, 2344–2355.
Yang et al. (2023) Yang, L., Zhang, Z., Song, Y., Hong, S., Xu, R., Zhao, Y., Zhang, W., Cui, B. & Yang, M.-H. 2023 Diffusion models: A comprehensive survey of methods and applications. ACM Comput. Surv. 56, 1–39.