I made Zig compute 33M satellite positions in 3 seconds

Nice results! SIMD can be a pain, good to know Zig makes it easy.

However, note that the plot under "Native SIMD Throughput Comparison" is extremely misleading: for an accurate proportional comparison between bars, the y-axis should start at zero. As presented, the data look like a 10-100x gain rather than the actual 2x improvement.
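
To put rough numbers on the truncated-axis effect, here is a minimal sketch. The values 10 and 20 and the axis start of 9 are made up for illustration and are not read off the article's plot; `apparent_ratio` is a hypothetical helper name.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical throughputs with a true 2x difference.
baseline, improved = 10.0, 20.0

honest = apparent_ratio(baseline, improved)           # axis starts at zero
truncated = apparent_ratio(baseline, improved, 9.0)   # axis starts just below the small bar

print(f"axis from 0: bars differ by {honest:.0f}x")
print(f"axis from 9: bars differ by {truncated:.0f}x")
```

The closer the axis start gets to the smaller value, the larger the visual ratio becomes, even though the underlying data are unchanged.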

I was going to comment the same. I saw the huge difference and went "wow", then read that it was a 2x improvement, had to check the axes properly, and thought "slightly less wow". It reminds me of that bar chart of women's average heights in different countries that starts at 5 feet: https://preview.redd.it/dohqa8l94kb41.png?auto=webp&s=865180...

Starting the axis at anything other than 0 for a length comparison is hilarious.

Looks like the author fixed the graph!

Beautiful visualization at https://attron.github.io/astroz-demo! (~160 MB transfer)

My (very amateur) belief is that we're in a quiet golden age of cool space viz/tools, for example:

- https://www.asterank.com/3d

- https://dmytry.github.io/space/

- https://www.tng-project.org/explore/3d

- https://github.com/da-luce/astroterm

- https://ssd.jpl.nasa.gov/tools

Feels like the halcyon techno-optimism from TOS + TNG!

This compares it to Python and Rust implementations. I wonder how it compares to Julia's SatelliteToolbox.jl? Julia also compiles via LLVM and has macros for explicit SIMD.

https://github.com/JuliaSpace/SatelliteToolbox.jl

Is that solving the right problem? The algorithm can give reasonably accurate positions at arbitrary points in the future, but you don't need to run it over and over if you need positions every second. You can generate keyframes and interpolate the positions between them, as the short-term orbital movement is smooth and simple.
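
A rough sketch of the keyframe idea, under loud assumptions: `circular_orbit` is a stand-in propagator on an idealised 2D circular orbit, NOT SGP4, and the radius, period, and 60-second keyframe spacing are arbitrary illustrative values. It uses cubic Hermite interpolation, which needs position and velocity at each keyframe. How well this holds up against real SGP4 output would need to be measured.

```python
import math

def circular_orbit(t, radius_km=7000.0, period_s=5800.0):
    """Stand-in propagator (NOT SGP4): position and velocity on a circular orbit."""
    w = 2.0 * math.pi / period_s
    pos = (radius_km * math.cos(w * t), radius_km * math.sin(w * t))
    vel = (-radius_km * w * math.sin(w * t), radius_km * w * math.cos(w * t))
    return pos, vel

def build_keyframes(duration_s, step_s):
    """Run the expensive propagator only once per keyframe."""
    count = int(duration_s // step_s) + 2
    return [circular_orbit(i * step_s) for i in range(count)]

def hermite(p0, v0, p1, v1, dt, s):
    """Cubic Hermite interpolation between two keyframes, with s in [0, 1]."""
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return tuple(
        h00 * a + h10 * dt * va + h01 * b + h11 * dt * vb
        for a, va, b, vb in zip(p0, v0, p1, v1)
    )

def interpolate(keyframes, step_s, t):
    """Cheap position lookup at time t from the two surrounding keyframes."""
    i = int(t // step_s)
    (p0, v0), (p1, v1) = keyframes[i], keyframes[i + 1]
    return hermite(p0, v0, p1, v1, step_s, (t - i * step_s) / step_s)

def max_error_km(duration_s=600, step_s=60.0):
    """Worst gap between interpolated and directly propagated positions."""
    keyframes = build_keyframes(duration_s, step_s)
    worst = 0.0
    for t in range(duration_s):
        exact, _ = circular_orbit(float(t))
        worst = max(worst, math.dist(exact, interpolate(keyframes, step_s, float(t))))
    return worst

print(f"max interpolation error: {max_error_km():.6f} km")
```

Here the propagator runs once per minute instead of once per second, and everything in between is a handful of multiplies and adds.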

Totally agree. Cool little project, but I cannot think of one use case where this is needed.

> But "fine" starts to feel slow when you need dense time resolution. Generating a month of ephemeris data at one-second intervals is 2.6 million propagations per satellite.

OK, except SGP4 loses its accuracy over WAY shorter time frames than a month (think hours or days).

> Pass prediction over a ground station network might need sub-second precision across weeks.

a) Sub-second ephemeris for antenna pointing is crazy overkill, and b) the same comment about accuracy applies as above.

I've never seen SIMD code before, and this is quite a nice little intro to both that and Zig.

Nice to have a huge speedup. Kudos for applying the right tools and the right approaches to get it done. I especially like your explanation of how to utilize SIMD in Zig. Learned something today.

It is funny how we often assume we need a graphics card for these kinds of calculations when a standard processor is actually plenty fast. The specific changes to the memory layout seemed to make the biggest difference here by allowing the hardware to actually use its vector capabilities.

At risk of being called out for my ignorance (I am still new to GPU development and have only limited experience with CUDA), it seems to come down to how appropriate the execution model is to the work e.g. SIMT vs SIMD here.

These days a single machine with lots of RAM and cores will handle almost everything you throw at it, barring specific compute-intensive or memory-bound scenarios (current AI, gaming, etc.).

There's one example given where the result of either a simple or a complex calculation is picked depending on eccentricity. The article mentions that it's faster to just always calculate both and pick one with a mask.

If you calculate both, wouldn't it be even faster to just always do the complex calculation? (presumably that's more precise?)
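
For anyone who hasn't seen the pattern, here is a minimal sketch of "compute both, then select with a mask". Plain Python lists stand in for SIMD lanes; `simple_path`, `complex_path`, and the `1.0e-4` threshold are hypothetical placeholders and not the article's actual SGP4 formulas. In Zig, the `@select` builtin plays the role of `select` here.

```python
def simple_path(e):
    """Hypothetical cheap formula (placeholder, not SGP4)."""
    return 1.0 + e

def complex_path(e):
    """Hypothetical costlier formula (placeholder, not SGP4)."""
    return 1.0 + e + e * e

def select(mask, if_true, if_false):
    """Per-lane pick: take if_true where mask is set, else if_false."""
    return [a if m else b for m, a, b in zip(mask, if_true, if_false)]

def branchy(eccentricities, threshold=1.0e-4):
    """Scalar version: each element takes exactly one path."""
    return [
        simple_path(e) if e < threshold else complex_path(e)
        for e in eccentricities
    ]

def branchless(eccentricities, threshold=1.0e-4):
    """Vector-style version: every lane computes both paths, then selects."""
    simple = [simple_path(e) for e in eccentricities]
    complex_ = [complex_path(e) for e in eccentricities]
    mask = [e < threshold for e in eccentricities]
    return select(mask, simple, complex_)

lanes = [0.00001, 0.0012, 0.02, 0.7]
print(branchless(lanes) == branchy(lanes))
```

In this sketch the two paths return different values, so the select is needed to preserve the original behaviour. Whether that is true of the real code, or whether the complex path alone would do, is exactly the open question.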

> If you calculate both, wouldn't it be even faster to just always do the complex calculation? (presumably that's more precise?)

The naming does imply that, but maybe "simple" and "complex" are just labels and the two paths are actually calculating different things? Seems like a stretch though.

Also the paragraph covering that part doesn't make much sense to me:

> This felt wasteful at first. Why compute both paths? But modern CPUs are so fast at arithmetic that computing both and selecting is often faster than branch misprediction. Plus, for SGP4, most satellites take the same path anyway, so we're rarely doing truly "wasted" work.

I'm always skeptical of claims about branch misprediction penalties without actual benchmarking (branch prediction is often very good!), and the claim also seems potentially undermined by the next sentence, that "most satellites take the same path anyway", i.e. the branch is easily predictable.

But I don't understand at all why most satellites taking the same path would mean the SIMD code is rarely doing wasted work, since the masking wastes part of the work by construction. (You could maybe handwave at pipelining or speculative execution making the extra work irrelevant, or wasted regardless, but there's no sign of those arguments.)

The library seems good, and it's always extremely nice when people produce write-ups like this, but it might just be that they're out over their skis when it comes to what was actually important about their optimizations.

I wonder if there is more floating-point drift from the extra math that makes it LESS accurate in the case where the simple equation suffices? I'm impressed that satellites even have eccentricities so close to 1.0 that this matters, but I guess it makes sense from a general orbital planning perspective.

You tell that language what to do!