For the last couple of years, we’ve been using the statistical programming language R for statistical analysis and data visualization at work. We typically deal with small data: most of the time, our data sets have somewhere between the high tens and the low hundreds of rows.

A lot of the time, we create R Notebooks with our analysis and visualizations. This works well for us: the R Notebook contains the code used to do the analysis, the results of the analysis and the visualizations, all in one place. This eliminates questions like: “did you remove outliers before making the graph?” Or, “did you check that the data are distributed normally before you did that test?” A reviewer of the R Notebook can see exactly what was done.

By default, an R Notebook produces an HTML file that you can open in your browser. You can email this HTML file to a colleague, and they can see your results and graphs, as well as exactly how you obtained them. If you made a logical mistake or an inappropriate assumption, your colleague has the opportunity to find it.

The exported HTML file also has a “Download Rmd” button. This lets your colleague open the notebook in RStudio and run your code. That is, if you also sent your data.

The one problem with just emailing R Notebooks to a colleague is that the R Notebook does not include the data. This might be okay if the data source is a file on a network, or a database that you both have access to, but in a lot of cases — at least in my work — the data is a CSV or Excel file. Now, if I want to send an R Notebook to a colleague to review, I need to remember to send the data file along with it.

Enter rde.

I wrote the package rde (which stands for Reproducible Data Embedding) to tackle this problem. This package allows you to embed data right in your R Notebook (or any other R code). It does so by compressing the data and then base-64 encoding it into an ASCII string. This string can be pasted into the R Notebook and converted back into the original data when someone re-runs the Notebook.
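To make the idea concrete, here is a minimal sketch of the compress-then-encode round trip. It is not rde’s actual implementation: it assumes bzip2 compression via base R’s memCompress(), borrows base64_enc() and base64_dec() from the jsonlite package, and uses a made-up data frame purely for illustration.

```r
# Illustrative sketch only; rde's internals may differ.
# Assumption: the jsonlite package is installed (used only for base-64 helpers).
library(jsonlite)

# A small, made-up data set standing in for a CSV you would normally read
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

# Encode: R object -> raw bytes -> compressed bytes -> ASCII string
raw_bytes  <- serialize(df, connection = NULL)
compressed <- memCompress(raw_bytes, type = "bzip2")
encoded    <- base64_enc(compressed)

# Decode: ASCII string -> compressed bytes -> raw bytes -> R object
decoded  <- base64_dec(encoded)
restored <- unserialize(memDecompress(decoded, type = "bzip2"))

# The round trip should give back exactly the same data frame
stopifnot(identical(df, restored))
```

Because the encoded value is plain ASCII text, it can live inside the notebook source as an ordinary string literal and travel with the code.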

I won’t go into all the details of how to use the package. If you’d like to learn more, you can read the package vignette.

This isn’t the first R package that I’ve written, but it is the first one that I’ve submitted to CRAN. When you install an R package using install.packages(), you’re installing it from CRAN by default. I think that CRAN is one of the best parts of the R ecosystem, since it effectively does continuous integration for all of the packages hosted there. This helps ensure that all the packages continue to work as R is updated and as other packages are updated. I’ll likely talk about this more in a future blog post.

If you’re an R user and you think that the package rde would help you in your workflow, check it out. You can install it by typing install.packages("rde") in R. If you find a bug, please file an issue on GitHub. And, if you would like to add functionality or improve it in some way, feel free to send me a pull request.