The key intuition is calculating the loss function without actually knowing the exact solution ("labels" in supervised learning parlance). Note that this is not unique to PINN: there are existing numerical methods that do exactly this.
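To make that concrete, here is a deliberately tiny sketch (my own toy illustration, not from any PINN library): for the ODE u'(t) = -u(t) with u(0) = 1, a one-parameter trial function u(t) = exp(theta * t) stands in for the neural network, and the loss is built purely from the equation's residual. No reference solution is evaluated anywhere.

```python
import math

# Toy version of a physics-based loss for u'(t) = -u(t), u(0) = 1
# (exact solution: exp(-t)). The "model" is a one-parameter trial
# function u(t) = exp(theta * t); a real PINN would use a neural net
# and autograd, but the loss has the same structure.
def residual_loss(theta, n_points=50):
    """Mean squared PDE residual u'(t) + u(t) over collocation points.

    Note that no labels (reference solution values) appear here:
    the loss is built entirely from the differential equation.
    """
    ts = [i / (n_points - 1) for i in range(n_points)]
    loss = 0.0
    for t in ts:
        u = math.exp(theta * t)
        du_dt = theta * u          # analytic derivative of the trial function
        loss += (du_dt + u) ** 2   # residual of u' = -u
    return loss / n_points
```

The loss is zero exactly at theta = -1 (the true solution) and positive everywhere else, which is all a gradient-based trainer needs.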
I used to solve PDEs for a living, and my academic background is in numerical solutions to PDEs before going into ML. In my industry and academic experience, PINN is a novel curiosity with perhaps niche applications that I am not as familiar with. Yes, I am aware of the work of Brunton, Duvenaud, et al. (and was even in the same lab group with some of them). I am happy to be corrected and to learn if PINN has found a strong application.
A better introduction to this approach, and a critique of it, is here: https://arxiv.org/pdf/2206.02016
Good read! I am developing PINNs at work and this certainly helped me recall important concepts. This post uses the deepxde library [1] to compose the PINN. Can anyone comment on how NVIDIA's Modulus [2] compares to it? Modulus appears to be much more verbose and poorly documented.
[1] https://github.com/lululxvi/deepxde
[2] https://github.com/nvidia/modulus
From "Physics-Based Deep Learning Book" (2021) https://news.ycombinator.com/item?id=28510010 :
> Physics-informed neural networks: https://en.wikipedia.org/wiki/Physics-informed_neural_networ...
I've never clearly understood the relationship/difference between PINNs and SciML (Scientific Machine Learning). The "how do they work" section here sounds pretty similar to how I've heard SciML described in the past.
From some searching around, it sounds like maybe SciML is a broader concept, with PINNs being a particular implementation of it? Maybe SciML started with PINN-related ideas but has broadened beyond that over time? Would appreciate an explanation from someone who's actively in this field.
Around a month ago there was a PINN post [1] on here, and there was a healthy amount of skepticism in the comments. Even amid the toxic positivity of LinkedIn, commenters say they're overhyped whenever an ML "influencer" posts that one GIF of an MLP and a PINN fitting an oscillator. I would be interested to see what they're actively being used for.
[1] https://news.ycombinator.com/item?id=42769623
I fully agree with the comments in that post. I started studying them because, well, they sound really cool, but my first impression was definitely that this sounded like a lot of effort (computationally) to solve a single equation. But then I only tested on simple equations where traditional solvers have no difficulties.
In my previous position, we studied the behaviour of black holes with exotic geometries, and we never could make our solvers work with the added time dependence. I would be very curious to see how a PINN would have fared on this (given enough compute time of course).
I can also recommend Steve Brunton's playlist [0] on the topic of physics informed machine learning, as well as the book 'Data-driven Science & Engineering' [1] by him and Nathan Kutz.
[0] https://www.youtube.com/playlist?list=PLMrJAkhIeNNQ0BaKuBKY4...
[1] https://databookuw.com
Neural ODEs are also interesting.
Very well explained to a lay person.
Are PINNs the current state of the art in ML methods for solving PDEs? What are their limitations?
It depends on the PDE and what you want to do with it. A PINN requires:
1. Some example data or other way to add boundary conditions
2. Autograd over PDE constraints
3. A training loop incorporating both of those
And it produces
a. An approximate, differentiable, mesh-free solution
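As a toy illustration of how (1)-(3) fit together (my own sketch, with a two-parameter trial function standing in for an MLP and finite differences standing in for autograd), again for u'(t) = -u(t), u(0) = 1:

```python
import math

# Minimal end-to-end sketch for the toy ODE u'(t) = -u(t), u(0) = 1
# (exact solution exp(-t)). The "network" is u(t) = a * exp(b * t).
TS = [i / 49 for i in range(50)]          # collocation points on [0, 1]

def loss(params):
    a, b = params
    bc = (a * math.exp(b * 0.0) - 1.0) ** 2               # (1) boundary condition
    res = sum((a * b * math.exp(b * t) + a * math.exp(b * t)) ** 2
              for t in TS) / len(TS)                      # (2) PDE residual u' + u
    return bc + res

def grad(params, h=1e-6):
    # central finite differences stand in for autograd here
    g = []
    for i in range(len(params)):
        up = list(params); up[i] += h
        dn = list(params); dn[i] -= h
        g.append((loss(up) - loss(dn)) / (2 * h))
    return g

params = [0.5, 0.0]
for _ in range(2000):                                     # (3) training loop
    g = grad(params)
    params = [p - 0.05 * gi for p, gi in zip(params, g)]

a, b = params   # converges toward a = 1, b = -1, i.e. u(t) = exp(-t)
```

The output (a) is then a mesh-free, differentiable function you can evaluate anywhere, not a grid of values.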
PINNs are most applicable when (1) is expensive (since that expense will apply more to traditional solvers, especially with fine meshes) and when the error in (a) is acceptable.
Regarding the error, PINNs are still extremely useful in generating an initial state to pass to a traditional solver even when the error is not tolerable, so that's not _really_ a concern. The main consideration is how expensive a particular problem is to solve classically. If it's too cheap, the PINN will never beat it.
You have a secondary consideration with (2) and (3). The training loop is a fixed cost which you can amortize over many executions, but you have to use the network enough times for that to actually pay off.
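Back-of-the-envelope, with made-up numbers (none of these costs come from the thread), the amortization argument is just:

```python
import math

# Hypothetical, purely illustrative costs:
train_cost = 10_000.0   # one-off PINN training (e.g. CPU-hours)
classical = 5.0         # classical solve, per problem instance
nn_eval = 0.1           # PINN inference, per problem instance

# Number of solves needed before the amortized training cost pays off
breakeven = math.ceil(train_cost / (classical - nn_eval))
print(breakeven)  # 2041 instances with these numbers
```

If you expect fewer executions than that, the training never pays for itself.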
The last point I want to bring up is that you can sometimes get value from the extra features in (a). Perhaps you want to use the PINN to figure out where your mesh should be finer, or you have a derived field you want to inspect. Neural net gradients in general tend to poorly approximate real gradients if you only train on the function itself, but PINNs have the gradients you're likely to care about baked into their definition (and can thus approximate them well), and they'll model those much more cheaply than traditional solvers will.
We used them for a few things at my last job, and they were definitely worth it. We erred toward smaller (faster) nets with higher errors just to accelerate convergence with a classical solver.
I get that PINN is a less expensive approximate solution method. If so, how does it outperform the many approximate, coarse numerical methods that already exist?
1. Those methods are coarse. The interpolation they provide is worse than what a PINN provides, meaning that equivalently performing PINNs (compared to coarse numerical methods) can easily and cheaply serve as better initializations for your finer numerical methods.
2. Go back to (1) from my previous message. For some intuition, fiddly solutions take a long time to optimize. Your only options (aside from spending more time and money) are tailoring the initial conditions and the algorithm for your particular problem. You see that a lot in, e.g., 1-3 atom quantum chemistry, where a good choice of basis functions is worth several papers. A neural network allows you to automagically bake everything that's hard about your problem into the training step and amortize those hard calculations across many experiments. It's not superior to enough man-centuries of human intuition, but it's dead simple to deploy, and for those sorts of hard problems it definitely beats a single human century of effort. Once you have a neural network output, the problem is well conditioned and suitable for refinement by a classical solver.
For a somewhat concrete example, imagine a problem where the space is largely uninteresting but there are a few tight swirls here and there. Coarse numerical methods can't really do anything with those. Adaptive-precision numerical methods can, but they're slow, and you have to re-run an intensive solving step for every new input. The PINN solution bakes everything that's hard about that into the neural net structure, and its solution will have approximately the right swirls in approximately the right places. If you want to refine them further, the fact that your solver doesn't have to dynamically handle resolution anymore and doesn't have to deal with any major phase shifts makes it much easier to iterate on via the normal classical methods.
Many thanks for your detailed input. For your concrete example, existing solutions employ FEM. My understanding of your point is that a NN abstracts away the meshing rules and learns the correct resolution in the areas of interest? I could see how this could be beneficial for quick solutions before a full blown solver.
If my above understanding is correct, then the following question is: why not use a NN to generate meshes directly? Let the classical solvers do what they do best: solve. Let NNs do what they do best: take care of the messy reality of geometry. This approach would actually give provable error bounds on the solution. I understand there are existing works on NN mesh generation, but I do not know of any work that proves error bounds or has been incorporated into mainstream engineering software. Any hints?
(Thanks for this fascinating discussion.)
> [thank you]
You're welcome! Thank you for your questions! This has been a ton of fun on my part too.
> existing solutions employ FEM
FEM is pretty great for a lot of problems. Some fields are devoted to particularly tricky PDEs and go a long way beyond that to generate asymptotically better solutions that don't generalize to other PDEs. Some easier problems have insights that improve on FEM.
> NN abstracts away the meshing rules and learns the correct resolution in the areas of interest
Something like that. That's certainly how I described it. The internals of a NN are a bit more wishy-washy, and for a variety of pathological problems (largely non-physical problems, e.g., ones devolving into rapidly tightening, infinite-curl spirals) none of the outputs will make intuitive sense when plotted against progressively finer-meshed classical solvers. For nice enough problems though (E&M with a smattering of QM, CFD, ...) that's approximately the net effect. Large portions of the weight space are devoted to interesting stuff, smaller portions to how it all fits together, and as you feed in information you'll naturally have more computation done with respect to the parts of the solution that need it.
> why not use a NN to generate meshes directly
If I'm understanding correctly, you're saying that this would be contrasted with the current technique of generating outputs at various inputs. There's nothing wrong with that idea per se, but it's about as computationally intensive as generating approximate outputs at each of those mesh descriptors, so you might as well use that information. Combine that with the fact that a lot of these problems are extremely messy (how long does it take to transform gaussian noise or your favorite other initialization into a benzene molecule? (answer: weeks to months depending on your desired degree of accuracy, much worse for relativistic atoms)), and a classical solver will still struggle if all you do is give it a mesh and tell it to go wild.
> would give provable bounds on the solution
That point is a little interesting since it requires assumptions about the bounds of various partial derivatives in each region. In real-world problems, even when you can come up with such bounds, often you can't easily get those bounds to depend meaningfully on the mesh size (and certainly not on local properties in regions of widely spaced meshes). The net effect is that you can't prove much about the solution quality for an arbitrary PDE just based on mesh specifics (other than asymptotic information, which we can prove very easily).
> incorporated into mainstream engineering software
No clue, but I suspect not yet. Everything I've read about, used, written, or seen has been a one-off PINN.
> there are existing works on NN mesh generation, but ... proves error bounds
That's one of those things where I'd be strongly inclined to let the traditional software do its job. Much like using ChatGPT for recipe generation -- you might list some ingredients you do or don't want used, give the model a persona capable of cooking the thing you want to eat (to bias it away from the dregs of the internet), and ask for ideas and then maybe a followup or three. Once you have that result, you won't just blindly broil your shrimp at 550F for 180min; you'll independently verify the results (and still, hopefully, save time overall since you now at least know the right search terms and whatnot).
These NN results are similar. They're just approximations, and their best use IMO is feeding them straight into a tool that improves their accuracy and gives you known error bounds. The chief advantage is the speed with which you can obtain results, and the fact that the speed transfers when used as an initialization elsewhere is a very happy surprise.
> generate meshes directly
The status quo isn't bad for that for most classes of problems. For small, simple problems you'd never use a PINN. For large, complicated problems, classical techniques are so slow that the overhead of just sampling NN output to uncover the mesh isn't a huge problem. It's the in-between cases where you might want something smarter. I've seen a few papers and a few problems, but it doesn't look like there's a lot of interest. I'm not sure why, but collecting a few of those problems, trying to come up with real-world use cases where you'd need to solve tens of thousands or more of them (to make the PINN training worth it), and then using that as the backdrop for your project is probably the first direction I'd take if I had to work on that middle ground.
> any hints?
Let the NN do NN stuff. Expensively transform bulk data into a model that can much more cheaply approximate that data. PINNs use gradient information to replace most of that data (relying, then, on low sample counts of experimental or synthesized data). However you do it, the goal is to distill something expensive and messy into a model and then use the model to do something. Nobody has good error tracking via NNs, so don't use the NNs for that; use the NNs to feed data into a tool with good error tracking. Similarly with any other hard criterion.
what a well-written response. thank you.
May I ask which applications your work involves? Your comments exhibit an exceptionally deep level of knowledge. As I mentioned in another comment, I am aware of some major authors' works (and was in the same research group at a point) but you exhibit a level of understanding uncommon even among those specialists.
How kind! Thank you.
Current applications are just ML/adtech, and the only parts of my job that have really used any PDE skills have been understanding the phase space of autoscaling and optimizing for a set of parameters that have minimal costs and don't wake the team up at night. There have been some other problems where my math background was helpful, but not in a PDE sense. Most of my current job doesn't use anything even halfway tangential to PINNs, despite being an ML engineer. I mostly do infrastructure work and make the machines go brr. I'm not positive yet, but I might be blogging about some of those things soon.
At my last job, one of the big problems was acquiring more MRI data in less time. One of the crucial steps in that is throwing away all the assumptions that make it easy (like having enormous field strengths and large enough relaxation times that you can treat anything nonlinear as gaussian noise). Those assumptions require time and money to generate a certain amount of data, but if you instead just blast blue noise at the patient and can model the physics involved well enough then you can gather much more data in much less time. The trick is in interpreting it. PINNs were very useful in speeding up classical solvers (in that case, entirely by choosing "good" initializations). For some applications (like quasi real-time plotting), you could even skip the classical solving step.
I've done a lot of things over the years. Back in school it was genomics and quantum chemistry. In between, I've had a lot of ideas (most of them bad, but no matter what anyone tells you I think the bad ideas are even more useful pedagogically), and I tend to throw at them the whole gamut of techniques I've learned as I explore. It's somewhere between "extremely wasteful", a "fun hobby", and "crucial to my professional learning and development". I'm not sure yet where the balance is, but I like how my career is progressing, so I keep studying things in detail.
If I had to guess, that "deep level of knowledge" you're referencing might be from my propensity for being a bit cocky and self-aggrandizing. Else, it might be from having built multiple versions of every optimization technique, ML framework, or other piece of software I've ever written about and studying what made them work and made them fail. I like to think it's more of the latter (enough so that I encourage other people to build things from first principles even when they only want to call an API and make a thing happen), but there's probably some truth to the former too.
It seems like you have had quite a career thus far. I have academic and professional backgrounds in PDEs (analytics and numerics) before moving on to ML. It's rare that I get a chance to talk to an expert in my own niche field. Thanks again for your expert answers and for greatly contributing to this community. I look forward to reading your blog.
Thank you!
As far as I can tell, PINNs are promising and an active research area, but they are also young and far from being as widely adopted as finite element methods (at least that's my experience in academic environments).
I do see great improvements being made, both in performance and in applications.
One aspect I didn't discuss in the post is the use for inverse solution search, where you fit experimental data to your equation, and where your parameters and your initial conditions can also be trainable parameters. This has great potential to improve the methodology of experimental results analysis.
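A minimal sketch of that inverse setup (my own toy example, not from the post: a trial function exp(b * t) in place of a network, finite differences in place of autograd, and the unknown coefficient k of u' = -k * u trained jointly against synthetic "experimental" data):

```python
import math

TS = [i / 19 for i in range(20)]
DATA = [math.exp(-2.0 * t) for t in TS]    # synthetic measurements; true k = 2

def loss(params):
    b, k = params                           # b: model parameter, k: unknown PDE coefficient
    # data-fitting term ties the model to the measurements
    data = sum((math.exp(b * t) - d) ** 2 for t, d in zip(TS, DATA)) / len(TS)
    # physics term: residual of u' + k * u, with k itself trainable
    res = sum((b * math.exp(b * t) + k * math.exp(b * t)) ** 2 for t in TS) / len(TS)
    return data + res

def grad(params, h=1e-6):
    g = []
    for i in range(len(params)):
        up = list(params); up[i] += h
        dn = list(params); dn[i] -= h
        g.append((loss(up) - loss(dn)) / (2 * h))
    return g

params = [-1.0, 0.0]
for _ in range(5000):
    g = grad(params)
    params = [p - 0.1 * gi for p, gi in zip(params, g)]

b, k = params   # b approaches -2, and the unknown coefficient k is recovered as ~2
```

The same gradient step that fits the solution also recovers the physical parameter, which is the appeal for experimental data analysis.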
> Are PINNs the current state of the art in ML methods for solving PDEs? What are their limitations?
I guess in a way they are. They aren't new; they have been around since the 90s [1]. The problem with them is that you typically need to train them on a specific problem (boundary conditions, domain, equation, PDE coefficients, etc.). Compared to a traditional solver, the training is much slower, and on top of that the results are typically much less accurate. The PDE + NN community has a bit of a problem dealing with this in general [2]; there are tons of papers that make NNs look much better at solving PDEs than they actually are compared to traditional solvers.
[1] https://www.cs.uoi.gr/~lagaris/papers/TNN-LLF.pdf
[2] https://www.nature.com/articles/s42256-024-00897-5
It's from the future. Must be really good :)
Physics informed neural networks 16 Feb 2026