For me the appeal is less that tidyverse is great and more that the R standard library is horrible. It's full of esoteric names, inconsistent use and order of parameters, unreasonable default behavior, behavior that surprises you coming from other programming experience. It's all in a couple massive packages instead of broken up into manageable pieces.
Tidyverse is imperfect and it feels heavy-handed and awkward to replace all the major standard library functions, but Tidyverse stuff is way more ergonomic.
I think the R standard library is quite excellent. It pretty much follows the Unix philosophy of "doing one thing right". The only exception being `reshape` which tries to do too many things, but it can usually be avoided. It isn't inconsistent. I think the problem is the lack of tutorials that explain how to use all the data manipulation tools effectively, because there are quite a lot of functions and it isn't easy to figure out how to use them together to accomplish practical things. Tidyverse may be consistent with itself, but it's inconsistent with everything else. Either you only use tidyverse, or your program looks like an inconsistent mess.
Honestly, it might partly be that I've used R somewhat irregularly and I put a lot of value in design choices that "make sense" and are easier to remember. I'm sure once you are intimately familiar with the whole base language you can be really happy and productive with it.
> I think the problem is the lack of tutorials that explain how to use all the data manipulation tools effectively, because there are quite a lot of functions and it isn't easy to figure out how to use them together to accomplish practical things.
Most languages solve this problem by not cramming quite a lot of functions in one package and using shared design concepts to make it easier to fit them together. I don't think tutorials would solve these problems effectively but I guess it makes sense that they affect newer users the most.
> Tidyverse may be consistent with itself, but it's inconsistent with everything else.
Yeah, totally agree and I really dislike this part.
Author here; I think I understand where you might be coming from. I find functional nature of R combined with pipes incredibly powerful and elegant to work with.
OTOH in a pipeline, you're mutating/summarising/joining a data frame, and it's really difficult to look at it and keep track of what state the data is in. I try my best to write in a way that you understand the state of the data (hence the tables I spread throughout the post), but I do acknowledge it can be inscrutable.
A "pipe" is simply a composition of functions. Tidyverse adds a different syntax for doing function composition, using the pipe operator, which I don't particularly like. My general objection to Tidyverse is that it tries to reinvent everything but the end result is a language that is less practical and less transparent than standard R.
Why? The tidyverse is so readable, elegant, compositional, functional and declarative. It allows me to produce a lot more and higher quality than I could without it. ggplot2 is the best visualization software hands down, and dplyr leverages Unix’s famous point free programming style (that reduces the surface area for errors).
I disagree. In this example tidyverse looks convoluted compared to just using an array and apply. ggplot2 is okay but we already had lattice. Lattice does everything ggplot2 does and produces much better-looking plots IMO.
I like simplicity and I love a good base R idiom, but there's a lot less consistency in base R compared to the tidyverse (and that comes with a productivity penalty).
Lattice is really low-level. It's like doing vis with matplotlib (requires a lot of time and hair-pulling). Higher level interfaces boost productivity.
For me the appeal is less that tidyverse is great and more that the R standard library is horrible. It's full of esoteric names, inconsistent use and order of parameters, unreasonable default behavior, behavior that surprises you coming from other programming experience. It's all in a couple massive packages instead of broken up into manageable pieces.
Tidyverse is imperfect and it feels heavy-handed and awkward to replace all the major standard library functions, but Tidyverse stuff is way more ergonomic.
I think the R standard library is quite excellent. It pretty much follows the Unix philosophy of "doing one thing right". The only exception being `reshape` which tries to do too many things, but it can usually be avoided. It isn't inconsistent. I think the problem is the lack of tutorials that explain how to use all the data manipulation tools effectively, because there are quite a lot of functions and it isn't easy to figure out how to use them together to accomplish practical things. Tidyverse may be consistent with itself, but it's inconsistent with everything else. Either you only use tidyverse, or your program looks like an inconsistent mess.
Honestly, it might partly be that I've used R somewhat irregularly and I put a lot of value in design choices that "make sense" and are easier to remember. I'm sure once you are intimately familiar with the whole base language you can be really happy and productive with it.
> I think the problem is the lack of tutorials that explain how to use all the data manipulation tools effectively, because there are quite a lot of functions and it isn't easy to figure out how to use them together to accomplish practical things.
Most languages solve this problem by not cramming quite a lot of functions in one package and using shared design concepts to make it easier to fit them together. I don't think tutorials would solve these problems effectively but I guess it makes sense that they affect newer users the most.
> Tidyverse may be consistent with itself, but it's inconsistent with everything else.
Yeah, totally agree and I really dislike this part.
Author here; I think I understand where you might be coming from. I find functional nature of R combined with pipes incredibly powerful and elegant to work with.
OTOH in a pipeline, you're mutating/summarising/joining a data frame, and it's really difficult to look at it and keep track of what state the data is in. I try my best to write in a way that you understand the state of the data (hence the tables I spread throughout the post), but I do acknowledge it can be inscrutable.
A "pipe" is simply a composition of functions. Tidyverse adds a different syntax for doing function composition, using the pipe operator, which I don't particularly like. My general objection to Tidyverse is that it tries to reinvent everything but the end result is a language that is less practical and less transparent than standard R.
Can you rewrite some of those snippets in standard R w/o Tidyverse? Curious what it would look like
3 replies →
Somehow, tidyverse didn't click with me. I still use it sometimes. But now I primarily use base R and data.table
Why? The tidyverse is so readable, elegant, compositional, functional and declarative. It allows me to produce a lot more and higher quality than I could without it. ggplot2 is the best visualization software hands down, and dplyr leverages Unix’s famous point free programming style (that reduces the surface area for errors).
I disagree. In this example tidyverse looks convoluted compared to just using an array and apply. ggplot2 is okay but we already had lattice. Lattice does everything ggplot2 does and produces much better-looking plots IMO.
I like simplicity and I love a good base R idiom, but there's a lot less consistency in base R compared to the tidyverse (and that comes with a productivity penalty).
Lattice is really low-level. It's like doing vis with matplotlib (requires a lot of time and hair-pulling). Higher level interfaces boost productivity.
the equivalent in any other language would be an ugly, unreadable, inconsistent mess.