Test driving Python integration in R, using the ‘reticulate’ package


Introduction

Not so long ago RStudio released the R package ‘reticulate’, an R interface to Python. Of course, it was already possible to execute Python scripts from within R, but this integration takes it one step further: imported Python modules, classes and functions can be called inside an R session as if they were native R functions.
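To make that concrete, here is a minimal sketch using the `os` module from the Python standard library. It assumes only that reticulate is installed and can find a Python installation; the file path is made up for illustration.

```r
library(reticulate)

# Import a module from the Python standard library
os <- import("os")

# Python functions and submodules are reached with R's `$` operator
cwd <- os$getcwd()

# The path below is a made-up example; it does not need to exist
file_name <- os$path$basename("data/raw/file.csv")

print(cwd)
print(file_name)
```

The return values arrive as ordinary R character vectors, so they can be passed straight to other R functions.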

Below you’ll find screenshots of code snippets that use certain Python modules within R via the reticulate package. On my GitHub page you’ll find the R files from which these snippets were taken.

Using Python packages

The nice thing about reticulate in RStudio is the support for code completion. When you have imported a Python module, RStudio will recognize the methods that are available in that module:

[Screenshot: RStudio code completion for the imported clarifai module]

The clarifai module

Clarifai provides a set of computer vision APIs for image recognition, face detection, extracting tags, etc. There is an official Python module and there is also an R package by Gaurav Sood, but it exposes less functionality. So I am going to use the Python module in R. The following code snippet shows how easy it is to call Python functions.

[Screenshot: R code calling the clarifai Python module]

The output returned from the clarifai call is a nested list and can be quite intimidating at first sight. To browse through these nested lists and to get a better idea of what is in them, you can use the package listviewer:

[Screenshot: listviewer showing the nested clarifai output]
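As a rough sketch of how such a nested list can be explored, the snippet below uses a small hand-made list. Its structure and field names are invented for illustration and are not the actual Clarifai response. It first uses base R and then, if the listviewer package happens to be installed, opens the interactive viewer.

```r
# Hypothetical nested list, loosely imitating a tagging response
# (the field names are made up, not the real Clarifai output)
response <- list(
  status = list(description = "Ok"),
  outputs = list(
    list(
      id = "example-output",
      concepts = list(
        list(name = "dog", value = 0.98),
        list(name = "grass", value = 0.87)
      )
    )
  )
)

# Base R: show only the top two levels of the structure
str(response, max.level = 2)

# Pull one field out of every concept and collect the result in a data frame
concepts <- response$outputs[[1]]$concepts
tags <- vapply(concepts, function(x) x$name, character(1))
scores <- vapply(concepts, function(x) x$value, numeric(1))
result <- data.frame(tag = tags, score = scores)
print(result)

# Interactive tree view, only when running interactively with listviewer installed
if (interactive() && requireNamespace("listviewer", quietly = TRUE)) {
  listviewer::jsonedit(response)
}
```

Limiting `str()` with `max.level` keeps the printout readable; the interactive viewer is more convenient once the list gets deeper.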

The pattern.nl module

The pattern.nl module contains a fast part-of-speech tagger for Dutch, sentiment analysis, and tools for Dutch verb conjugation and noun singularization & pluralization. At the moment it does not support Python 3. That is not a big deal: I am using Anaconda and created a Python 2.7 environment in which to install pattern.nl.

The nice thing about the reticulate package is that it allows you to choose a specific Python environment to be used.

[Screenshot: R code selecting a Python environment and calling pattern.nl]
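Selecting an environment could look roughly like the sketch below. The environment name `py27` is hypothetical, and the sketch assumes a reticulate version that still supports Python 2.7 as well as a `sentiment()` function in pattern.nl that returns a polarity and subjectivity pair; check both against your own setup.

```r
library(reticulate)

# "py27" is a hypothetical conda environment name; replace it with your own.
# This must run before any Python code has been executed in the session.
use_condaenv("py27", required = TRUE)

# Show which Python binary and version reticulate is bound to
py_config()

# Import pattern.nl only if it is installed in that environment
if (py_module_available("pattern.nl")) {
  pattern_nl <- import("pattern.nl")

  # Assumption: sentiment() returns polarity and subjectivity for the text
  print(pattern_nl$sentiment("Wat een mooie dag!"))
}
```

With `required = TRUE`, reticulate stops with an error if the environment cannot be found, rather than silently falling back to another Python installation.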

The PyTorch module

PyTorch is a Python package that provides tensor computations and deep neural networks. At the time of writing there is no ‘R torch’ equivalent, but we can use reticulate in R. There is an example of training a logistic regression in PyTorch, see the code here. It takes just a little rewriting of this code to make it work in R. See the first few lines in the figure below.

[Screenshot: first lines of the PyTorch logistic regression rewritten in R]
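The following is a minimal sketch of what one training step of a logistic regression can look like through reticulate. It is not the code from the linked example: it uses random data instead of a real data set, the layer sizes and learning rate are arbitrary, and it assumes a recent PyTorch version is installed in the Python environment that reticulate uses.

```r
library(reticulate)

torch <- import("torch")
nn <- torch$nn

# Python expects integers here, so use R's L suffix
input_size <- 784L
num_classes <- 10L
batch_size <- 32L

# Logistic regression is a single linear layer followed by a
# softmax, which CrossEntropyLoss applies internally
model <- nn$Linear(input_size, num_classes)
criterion <- nn$CrossEntropyLoss()
optimizer <- torch$optim$SGD(model$parameters(), lr = 0.001)

# Random inputs and labels, purely for illustration
x <- torch$randn(batch_size, input_size)
y <- torch$randint(0L, num_classes, list(batch_size))

# One optimisation step: forward pass, loss, backward pass, update
optimizer$zero_grad()
outputs <- model$forward(x)
loss <- criterion(outputs, y)
loss$backward()
optimizer$step()

# Convert the scalar loss tensor to an R number
loss_value <- loss$item()
print(loss_value)
```

In a real script this step would sit inside a loop over mini-batches of the training data.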

Conclusion

As a data scientist you should know both R and Python, so the reticulate package is no excuse for not learning Python! However, the reticulate package can be very useful if you want to do all your analysis in the RStudio environment. It works very well.

For example, I have used rvest to scrape some Dutch news texts, then used the Python module pattern.nl for Dutch sentiment analysis, and wrote an R Markdown document to present the results. In such a case the reticulate package is a nice way to keep everything in one environment.

Cheers, Longhow