class: center, middle, inverse, title-slide # Reproduce the Problem ## ⚔
in order to solve it ### Julia Piaskowski ### 2022-03-08 --- class: center, middle ## How to ask for help <br> <br> *the coding error* *unsolvable mystery* *make it simple, friend* --- # Getting Help (again) * Eventually, you will probably need outside help to solve a programming problem. * there are many skilled R programmers out there: Stack Overflow, R4DS, RStudio Community, that one grad student who knows everything, ... * You are asking them to kindly help you, but in order for them to do that, they need to understand your problem. <img src="images/obi_wan.jpg" width="60%" /> --- # Example 1 >I want to ask: are there significant differences in the yield responses of corn to 3 different levels of manure application with and without biofertilizer? What R code should I use? Thank you so much! Manure application levels = none, low, normal Biofertilizer = Y or N. --- # Example 2 >I am new to R so I am struggling to figure out how to do this. I have a data frame with thousands of observations that looks like this x total subgroup1 subgroup2 .03 1000 500 500 .04 1050 550 500 .07 1100 525 575 .08 9900 4900 5000 I am trying to come up with a weighted value for x based on the following equation. weighted x for subgroup1 = (Σ x*subgroup1) / (Σ subgroup1) The idea is that for each observation we multiply the x value by the subgroup 1 value and add them all together and divide it by the sum of all the subgroup values. I also want to do this for subgroup 2. I imagine I will need some type of loop function. However, I am not sure where to start in R. Any leads would be great. --- class: center, middle ## Anyone feel ready to troubleshoot those problems in your R session? --- # Reprex = reproducible example = minimum version of what you are doing ~~in R~~ that is generating an error ![lion and cub](images/lion_and_cub.jpg) --- # A good reprex includes... 1. A short description of what you are trying to do programatically (e.g. 'filter a data set based on condition') 1. Code in the language you are using that reproduces the problem for someone trying to solve this *on their system* (people often cannot just look at code and see what is wrong, they need to work with it). 1. The error message being produced and/or the incorrect or undesirable output produced by your code. --- # Some good examples of a reprex * [Example 1](https://stackoverflow.com/questions/71403633/conditional-formatting-of-cells-using-if-else-or-hiding-columns-in-dt-package-i) * [Example 2](https://stackoverflow.com/questions/71402703/how-to-obtain-the-specific-grouping-of-cases-and-controls-in-r) * [Example 3](https://stackoverflow.com/questions/71403688/save-kable-in-r-writes-empty-file) --- # A minimal data set for a reprex 1. Use smallest, simplest data set. Preferably a built-in data set (e.g. iris or cars, yawn). 1. If you have to create a data set, keep it small, keep it simple. * minimum size (e.g. number of rows), minimum variables * enough information to replicate your situation - at least, the situation that is creating a problem <br><br><br> <center> **Remember, the goal is to recreate your problem!** </center> <center> **Tailor your (minimal) data set to meet that goal.** </center> --- # Tools to create a minimal data set * `data.frame()`, `matrix()`, `list()` and other base R functions * if you simulate new data through a random process, use `set.seed()`. ```r set.seed(208) my_df <- data.frame(x1 = c(1, 2, 3, 4), x2 = c("a", "a", "b", "b"), x3 = rnorm(4)) ``` ------------------ **mini-exercise:** make a `data.frame` with at least 2 variables and 5 rows. --- # Tools to create a minimal data set * Find those data sets! Use `data()`, `data(package = "dplyr")` or search the documentation (their data sets are listed in the help files). *(no point in running these in an Rmarkdown document - do this in your R session)* ------------------ **mini-exercise:** find a data set from a tidyverse package you've never used before and examine it. How big is it? What type of variables are present? --- # datapasta, a convenience package [<img src="images/datapasta_hex.png" style="max-width:22%;min-width:60px;" alt="datapasta" />](https://milesmcbain.github.io/datapasta/index.html) * Format a table so it is represented as R code, and put it on your system clipboard: ```r library(datapasta); library(dplyr) iris %>% head() %>% df_format() ``` * Next, paste the R-formatted data ([CTRL + V] or [CMD + V] on many systems) into an R source file. --- # datapasta, a convenience package * Pull part of table into R-formatted syntax: First, copy part of a table from a spreadsheet (e.g. Googlesheets, Excel, Numbers, Calc). Next, run this line of code to generate R code that will format your copied spreadsheet cells into an R data.frame: ```r df_paste() ``` You can also run this line of code to generate R code that will format your copied spreadsheet cells into an R tibble: ```r tribble_paste() ``` --- class: center, middle, inverse # Why do something so tedious?? ### Because we need this code for a reprex. This is not for us - it's for the anonymous programmers out there we are asking to help us. --- class: center, middle, inverse # [exercise](exercises/exercise-8.html) --- # Narrow down the problem: * DON'T provide a long description of why you want to do something, "We plan to save the world by comparing storm drainage rates..." * DON'T dump a long R script with the single descriptor "it didn't work" * DO Be very judicious in your use of screenshots. In most cases, text and code describes the problem. * DO find the line of code where it all breaks down (this does not mean digging into the source code - find the error in *your* code) * DO include all lines of code needed to recreate the problem - including libraries that must be loaded * DON'T load a meta-library like `library(tidyverse)` - only load the packages needed to recreate the problem * DO <mark>ruthlessly strip out every bit of unneeded code</mark>, any extra lines, extraneous arguments that can be ignored and still the error persists --- # Be considerate of your anonymous helpers .pull-left[ <img src="images/kind_rewind.jpg" width="840" /> ] .pull-right[ * no `rm = list(ls())` since it is running on someone else's system * no `setwd("path/to/my/computer")` * if your code changes systems systems (e.g. `par(...)`, make sure you reset it at the end of the reprex code * don't mask built-in functions (e.g. `mean()`) with your own custom functions ] --- # Last step: using `reprex()` * **reprex** is a package supporting how to generate reproducible examples. * like **datapasta**, it can make your life easier Let's pretend the code below is generating an error. Copy both lines of code. ```r x<- 1:10 plot(x) ``` Next, run this code in an R session: ```r reprex::reprex() ``` **You won't see much, but, valuable information has been put on your system clipboard!** Paste your clipboard contents on the resource where you are seeking help (e.g. [GitHub issue](https://github.com/IdahoAgStats/what-they-forgot-202203/issues), Stack Overflow) --- # Example: Let's Write a Reprex! ![](images/need_reprex.png) --- class: middle, center, inverse # [exercise](exercises/exercise-9.html) --- # Use good style ### (it's easier to read) * There is no cap on white space usage: ```r # No need to pack in code this tightly! # (We aren't trying to send this data stream to the moon) myfunc<-onebigfunc(mydf,arg=1,arg2=TRUE,arg4="Go") more<-evenmore(myfunc,argA=c("Idaho","Montana","Utah")) if(condition){dothis} else{dothat} ``` * Pay attention to function examples in documentation for code styling * There is actually a [code guide!](https://style.tidyverse.org/syntax.html) and a package, [styler](https://styler.r-lib.org/) to support this. Tip: <mark>CTRL + Shift + a</mark> will reformat things in RStudio. --- # Show me the money! * Creating a good reprex is **a lot** of work. * If you want to elicit feedback to solve your problem, this is how to do that. * Very frequently, the process of creating a reprex will reveal an error, a mistaken assumption, or something that will help you resolve the problem. <iframe src="https://giphy.com/embed/fdLR6LGwAiVNhGQNvf" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe><p><a href="https://giphy.com/gifs/fdLR6LGwAiVNhGQNvf">via GIPHY</a></p> --- # Other guides for creating a good reprex ### [Stack Overflow](https://stackoverflow.com/help/minimal-reproducible-example) ### [Rstudio Community](https://community.rstudio.com/t/faq-how-to-do-a-minimal-reproducible-example-reprex-for-beginners/23061) ### [Tidyverse Reprex Do's and Don'ts](https://reprex.tidyverse.org/articles/reprex-dos-and-donts.html)