I am really looking forward to JAX taking over PyTorch/CUDA over the next year...

kadushka · on Feb 4, 2025

Most PyTorch users don’t even bother with the simplest performance optimizations, and you are talking about PTX.

throwaway287391 · on Feb 4, 2025

I like JAX but I'm not sure how an ML framework debate like "JAX vs PyTorch" is relevant to DeepSeek/PTX. The JAX API is at a similar level of abstraction to PyTorch [0]. Both are Python libraries and sit a few layers of abstraction above PTX/CUDA and their TPU equivalents.

[0] Although PyTorch arguably encompasses two levels: a pure functional library like the JAX API, and a "neural network" framework on top of it. JAX doesn't have the latter and leaves that to separate libraries like Flax.

jdeaton · on Feb 5, 2025

The interesting thing about this comment is that JAX is actually higher-level than even PyTorch, generally. Since everything is compiled, you just express a logical program and let the compiler (XLA) worry about the rest.
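To make "express a logical program and let the compiler worry about the rest" concrete, here is a minimal sketch. The function `affine_relu` and the shapes are made up for illustration; `jax.jit` and `jax.make_jaxpr` are the only JAX-specific pieces.

```python
import jax
import jax.numpy as jnp

def affine_relu(x, w, b):
    # Plain array math: no kernels, no device placement, no manual fusion.
    return jnp.maximum(x @ w + b, 0.0)

# jax.jit traces the function once per input shape/dtype and hands the
# traced program to XLA, which decides how to compile it for the backend.
fast_affine_relu = jax.jit(affine_relu)

# Hypothetical inputs, chosen only so the result is easy to check by hand.
x = jnp.ones((4, 3))
w = jnp.ones((3, 2))
b = jnp.zeros(2)

y = fast_affine_relu(x, w, b)
print(y.shape)

# The traced intermediate program that gets lowered for XLA:
print(jax.make_jaxpr(affine_relu)(x, w, b))
```

Nothing in the function body says how the work is scheduled on the device; that part is left to the compiler.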

Are you suggesting that XLA would be where this "lower level" approach would reside since it can do more automatic optimization?

Scene_Cast2 · on Feb 5, 2025

I'm curious, what does paradigmatic JAX look like? Is there an equivalent of picoGPT [1] for JAX?

[1] https://github.com/jaymody/picoGPT/blob/main/gpt2.py

jdeaton · on Feb 5, 2025

yeah it looks exactly like that file but replace "import numpy as np" with "import jax.numpy as np" :)
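A rough sketch of what that swap looks like, using a few small helpers written in the style of that file (not copied from it). The try/except fallback is only an assumption added here so the snippet runs with or without JAX installed.

```python
try:
    import jax.numpy as np  # the one-line swap described above
except ImportError:
    import numpy as np      # fallback so the sketch still runs without JAX

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    # subtract the row max for numerical stability
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def layer_norm(x, g, b, eps=1e-5):
    # normalize over the last axis, then scale by g and shift by b
    mean = np.mean(x, axis=-1, keepdims=True)
    variance = np.var(x, axis=-1, keepdims=True)
    return g * (x - mean) / np.sqrt(variance + eps) + b

x = np.array([[1.0, 2.0, 3.0]])
print(softmax(x))
print(layer_norm(x, 1.0, 0.0))
```

One caveat to the "just swap the import" claim: JAX arrays are immutable, so NumPy code that assigns in place (`a[i] = v`) has to be rewritten as `a = a.at[i].set(v)`. Purely functional code like the helpers above carries over unchanged.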

achierius · on Feb 5, 2025

What PTX kerfuffle are you referring to?

saagarjha · on Feb 4, 2025

You do understand that PTX is part of CUDA, right?