torch v0.11.0 is now on CRAN! This blog post highlights some of the changes included
in this release. But you can always find the full changelog
on the torch website.
Improved loading of state dicts
For a long time it has been possible to use torch from R to load state dicts (i.e.Â
model weights) trained with PyTorch using the load_state_dict()
function.
However, it was common to get the error:
Error in cpp_load_state_dict(path) : isGenericDict() INTERNAL ASSERT FAILED at
This happened because when saving the state_dict
from Python, it wasn’t really
a dictionary, but an ordered dictionary. Weights in PyTorch are serialized as Pickle files – a Python-specific format similar to our RDS. To load them in C++, without a Python runtime,
LibTorch implements a pickle reader that’s able to read only a subset of the
file format, and this subset didn’t include ordered dicts.
This release adds support for reading the ordered dictionaries, so you won’t see
this error any longer.
Besides that, reading theses files requires half of the peak memory usage, and in
consequence also is much faster. Here are the timings for reading a 3B parameter
model (StableLM-3B) with v0.10.0:
system.time({
x <- torch::load_state_dict("~/Downloads/pytorch_model-00001-of-00002.bin")
y <- torch::load_state_dict("~/Downloads/pytorch_model-00002-of-00002.bin")
})
user system elapsed
662.300 26.859 713.484
and with v0.11.0
user system elapsed
0.022 3.016 4.016
Meaning that we went from minutes to just a few seconds.
Using JIT operations
One of the most common ways of extending LibTorch/PyTorch is by implementing JIT
operations. This allows developers to write custom, optimized code in C++ and
use it directly in PyTorch, with full support for JIT tracing and scripting.
See our ‘Torch outside the box’
blog post if you want to learn more about it.
Using JIT operators in R used to require package developers to implement C++/Rcpp
for each operator if they wanted to be able to call them from R directly.
This release added support for calling JIT operators without requiring authors to
implement the wrappers.
The only visible change is that we now have a new symbol in the torch namespace, called
jit_ops
. Let’s load torchvisionlib, a torch extension that registers many different
JIT operations. Just loading the package with library(torchvisionlib)
will make
its operators available for torch to use – this is because the mechanism that registers
the operators acts when the package DLL (or shared library) is loaded.
For instance, let’s use the read_file
operator that efficiently reads a file
into a raw (bytes) torch tensor.
torch_tensor
137
80
78
71
...
0
0
103
... [the output was truncated (use n=-1 to disable)]
[ CPUByteType{325862} ]
We’ve made it so autocomplete works nicely, such that you can interactively explore the available
operators using jit_ops$
and pressing
Other small improvements
This release also adds many small improvements that make torch more intuitive:
-
You can now specify the tensor dtype using a string, eg:
torch_randn(3, dtype = "float64")
. (Previously you had to specify the dtype using a torch function, such astorch_float64()
).torch_randn(3, dtype = "float64")
torch_tensor -1.0919 1.3140 1.3559 [ CPUDoubleType{3} ]
-
You can now use
with_device()
andlocal_device()
to temporarily modify the device
on which tensors are created. Before, you had to usedevice
in each tensor
creation function call. This allows for initializing a module on a specific device:with_device(device="mps", { linear <- nn_linear(10, 1) }) linear$weight$device
torch_device(type='mps', index=0)
-
It’s now possible to temporarily modify the torch seed, which makes creating
reproducible programs easier.with_torch_manual_seed(seed = 1, { torch_randn(1) })
torch_tensor 0.6614 [ CPUFloatType{1} ]
Thank you to all contributors to the torch ecosystem. This work would not be possible without
all the helpful issues opened, PRs you created, and your hard work.
If you are new to torch and want to learn more, we highly recommend the recently announced book ‘Deep Learning and Scientific Computing with R torch
’.
If you want to start contributing to torch, feel free to reach out on GitHub and see our contributing guide.
The full changelog for this release can be found here.
Photo by Ian Schneider on Unsplash
Reuse
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
Citation
For attribution, please cite this work as
Falbel (2023, June 7). Posit AI Blog: torch 0.11.0. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2023-06-07-torch-0-11/
BibTeX citation
@misc{torch-0-11-0, author = {Falbel, Daniel}, title = {Posit AI Blog: torch 0.11.0}, url = {https://blogs.rstudio.com/tensorflow/posts/2023-06-07-torch-0-11/}, year = {2023} }