By an odd coincidence, my sister and I will both be learning #R this summer.
@peterdrake Nice for doing some mathy stuff, horrible as a programming language though
@freemo My sense is that people who come to data science from stats like R, but the ones who come from computer science like Python.
@peterdrake I would say that's sort of true, but replace "like" with "taught first". No one in their right mind would build a product with R, but it's often all statisticians know, since they aren't taught to build products.
This is all very relative to what I'm doing now: I'm in charge of a team that includes a software division of statisticians who started out knowing just R, and I have literally needed to teach them to code in Python from scratch. While they know R better, even they would admit it's a horrible language to build concrete apps out of.
@freemo @peterdrake Unsolicited opinion time! 😂
I've used R for 6-7 years; in short, it's *not bad*. It has a lot of nice things: first-class functions and lambdas, automatic vectorized syntax, DECADES worth of really good packages, excellent publishing software with LaTeX integration, etc. But in my opinion it falls flat in a number of key areas:
1. Very slow in most cases. Can't be compiled and doesn't package easily, which makes transporting code to different people cumbersome.
2. Syntax is frustrating at times to deal with: there are different conventions across the myriad of opinionated packages, and none of them play nicely together. Tidyverse is great, but dealing with dataframes, tibbles, matrices, tables, etc. is really annoying. They're all basically tables, but you never really know what some function will return (unless you check), and they all look the same in the View/print output. Also, the typeof() and class() functions return different values on the same input, so you often have to run both to diagnose the issue.
3. Sometimes it "tries to help" a little too much in terms of assuming how certain code you write should be interpreted. Certain code that works on vectors also works on matrices and lists of lists, but other functions will fail spectacularly. These failures often don't error out and break the program; thus begins the bug hunt.
4. Error messages suck. A lot. No line numbers, nothing, just some arbitrary message, sometimes one with no discernible meaning. RStudio tries to help here, but it just provides an okay-ish debugger.
5. IMO it's barely usable without a REPL until you have a year or two under your belt, so you're basically locked in to RStudio or a Jupyter Notebook. The former has autocomplete and a better help system, so I would start there.
6. No pipes by default before R 4.1 added `|>`, and the base Map()/Reduce()/Filter() helpers are pretty bare-bones. Look into the purrr package for nicer versions of these features though!
7. While technically multiparadigm, I personally don't think they did a good job integrating the paradigms so they work cohesively. Julia does this much better.
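To make points 2 and 3 a bit more concrete, here's a tiny base-R sketch. The data frame and vectors are made up purely for illustration, and it only shows one instance of each complaint, not the whole picture:

```r
# Point 2: typeof() and class() can disagree on the same object.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # made-up example data
typeof(df)   # "list": the underlying storage type
class(df)    # "data.frame": the S3 class used for method dispatch

# Point 3: a shorter vector is silently recycled when its length
# divides the longer one evenly, so the length mismatch raises no error.
long  <- c(1, 2, 3, 4)   # hypothetical values
short <- c(10, 20)
recycled <- long + short
recycled     # 11 22 13 24, with no warning
```

If the shorter length doesn't divide the longer one evenly, R does emit a warning, but it still returns a result rather than stopping.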
I think that covers my take. While R certainly isn't as terrible as @freemo is making it out to be (sorry not sorry 😉), my current favorite is Julia by far. If you have the option to use it instead of R, or to interop it with R, I really suggest it.
Either way, I hope you have fun with it! I'm interested to see what you think after you give it a shot, assuming you're willing to do a review, lol.
Knowing both Python and R is a good place to be if you want to deal with the scientific community for sure.