I think most of these are extremely poor. They can only be interpreted in many cases if you already understand the data, such as by reading the table first.
I believe that's actually the point! Choosing the right way to display information is a skill all on its own!
Sure, but it’d be a lot more interesting and challenging to build 100 visualizations where each gives a unique insight into the same dataset. An isometric 3d bar chart is just going through the motions.
Next, 1 essay, 100 fonts!…
See this as a form of brainstorming.
Next session is about ranking and discarding.
From my POV this is worth bookmarking - there are many datasets that are much clearer with one chart type or another - having 100 styles with the same data will later offer a visual index to help me decide what will best serve my needs.
My thoughts exactly! At least half of these are chart types that I've never seen before or at least would never think of using so having this reference is awesome.
Interesting. I'm currently writing a book on visualization. It's pretty opinionated. My thesis after teaching viz to companies for a while is that most plots should be limited to one of 4: line, bar, scatter, and histogram.
I think I'm going to try this dataset with my thesis.
Is histogram not a bar chart? As in, it's a transformation you do on the data (to get the frequency per bin) and then most commonly represent it as a bar chart. It seems to me like saying that line smoothing is a different graph type: it's a new look, but if it was previously a line chart, it's still a line chart except that the lines are drawn between different values. How do you see that?
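To make the "transformation, then bar chart" view concrete, here is a minimal sketch in plain Python. The helper name `bin_counts`, the sample values, and the bin settings are all made up for illustration; nothing here comes from the original site.

```python
def bin_counts(values, lo, hi, n_bins):
    """Step 1 (the transformation): count values per equal-width bin on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            # A value equal to hi is folded into the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


# Hypothetical sample data, not taken from the thread.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
lo, hi, n_bins = 0, 10, 5
counts = bin_counts(values, lo, hi, n_bins)

# Step 2 (the chart): draw the counts as an ordinary bar chart, here in text.
width = (hi - lo) / n_bins
for i, c in enumerate(counts):
    left, right = lo + i * width, lo + (i + 1) * width
    print(f"{left:>4g} to {right:<4g} {'#' * c}")
```

Once the counts exist, any bar-drawing routine will do, which is the sense in which a histogram is a bar chart of binned data.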
I would also say there are more useful visualizations one can do than bar, line, and scatter. For example, though I'm not sure what it's called, there are charts that suit different orders of magnitude. They're like area charts (line chart with the area under the line filled in) but it wraps around from the bottom, when it would otherwise exceed the top, and shows that second layer in a different color. Think I mainly see them for like network traffic or latency graphs. I find them useful because you can see different scales without a lot of vertical space. (Don't know how well they work for color blind people, but then the same argument goes for any sight impairment and visualizations.) The underlying data points remain the same so I'd say it's actually a visualization change and not a change of the values being shown
Of course, most of the things people pick as more-pretty-looking alternatives to bar/line/scatter aren't good visualizations, so I agree with the sentiment. Just that there do seem to be more options that have benefits for certain datasets
There's a ton of visualizations. Most folks don't know how to tell basic stories using just these plots. If you look at professional visualizations from media outlets you don't see many "fancy" plots, you see well crafted versions of these 4 plots (and maps) 80% of the time.
Applying the Pareto principle, you will get the most bang for your buck if you master story telling with these. (And you won't need to touch the other types of plots).
Histogram uses bars but it illustrates frequency in buckets whereas by bars OP likely means more categorical in terms of what the bar represents.
Yes, it is a specialized version of a bar chart. But typically the goal is to look at the distribution of the data. I will also explore using specialized scatter plots (with line plots) to explore the distribution of the data.
I pretty much always use ECDF in preference to histogram these days.
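For anyone who hasn't met the ECDF (empirical cumulative distribution function): it needs no bin choice at all, which is a large part of its appeal. A minimal sketch in plain Python, with a hypothetical helper name `ecdf` and made-up sample values:

```python
def ecdf(values):
    """Return the sorted values and, for each, the cumulative fraction i/n.

    With tied values the same x appears more than once; when drawn as a
    step plot, the highest fraction at that x is the ECDF value there.
    """
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


# Hypothetical sample data, not taken from the thread.
sample = [3.1, 1.2, 4.8, 2.0, 2.0]
xs, ys = ecdf(sample)

for x, y in zip(xs, ys):
    print(f"{x:>4} {y:.1f}")
```

Plotting `xs` against `ys` as a step line (or as a scatter) gives the ECDF; every observation is shown, and quantiles can be read straight off the y-axis.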
It’s like saying a 100% bar chart is also a different chart type.
In my opinion violin plots are severely underused, they are basically box plots on steroids.
A random example: https://stackabuse.s3.amazonaws.com/media/seaborn-violin-plo...
Beeswarm plots are more intuitively interpretable, in my experience
I prefer a histogram or jittered scatter plot.
Hard same. So much effort has been wasted on fancy charts that are hard to make and often indecipherable.
This is really cool, I love the idea of demonstrating the different approaches that a visualization can take and how it affects the message.
It's very useful as a source of inspiration.
Fun exercise. #54 is the best, but still has plenty of room for improvement.
Completely agree. It’s what Edward Tufte calls a slopegraph. His canonical example is surprisingly similar to the data used here. His one design is better than any of these 100 - it shows the actual numbers, and removes the unnecessary vertical lines.
https://www.edwardtufte.com/notebook/slopegraphs-for-compari...
To me the vertical lines help me read the chart. I wouldn’t call them unnecessary. Also, depending on what the data means, maybe you want to add 0 on that line.
I suppose the whole point of this exercise is to show different ways to tell _different_ stories? (real time edit: “how we can tell different stories” from the source)
#54 is good for showing comparable increases in sites between years, right? But if the story you’re telling is primarily “how many sites were there then? How many now?” you kinda have to squint and guess. (One could improve on this one, as you suggest. But the primary story would still be rates of change.)
#60 is also good. Most of the others... a lot of them (#16, #22, #26, etc.) I guess would work as eyecandy to put on the side of an article when you don't know what else to put there, but I don't think I could think of a dataset which they're useful for, let alone this dataset being a good showcase for them
What is the name for #26?
It’s somewhat perfect for something I need to visualise, wondered if it has a name and/or a d3 implementation
Is there a way to get source files? To do similar layouts in excel or PowerPoint?
Would be great to see the code behind these. Is it posted somewhere?
Are Plotly and/or Seaborn still the best Python packages to get these kinds of visualisations out of the box? I am always looking for new ways to better visualise data in reporting, and some of these look very helpful in telling a story from data.
My (opinionated) take is that if you learn 4 basic plots, it will take you far. These are easy to do with Pandas. In fact, I think the easiest way to do Matplotlib is with pandas rather than the Matplotlib API.
I do pull out plotly for 3d scatter plots (for PCA visualization). Matplotlib is horrible for this.
Personally I find plotly hard to beat. Unless you're doing really fancy stuff, it gives you everything you need. Seaborn is also a great option IMO.
100 visualizations, and still room for more! Which really tells us how hard this is.
At least one more alternative I can come up with is to use the map more, adding colors to the countries.
Another would be similar to #76, but showing a miniature of each heritage site.
Reminds me of the CSS Zen Garden https://csszengarden.com/
I’m kinda interested to know what dataset would benefit from visualization number 26
https://news.ycombinator.com/item?id=42226891
Is this a freely available library (JS?) they use for this?
Why is a table not enough?
Because humans are really good at seeing patterns.
I've been saying this for years now in the context of sysadmin work and dashboards.
Some people think about graphs in dashboards as pointless frivolity and for show. I've heard/seen people claim: all we need is an indicator: green if OK, else red.
While that is useful, in my position it is often too late.
I want a graph that is always visible and shows whatever is important to my work: disk space, number of errors, (mega/giga/tera)bytes in/out per second/minute/hour. That allows me both to predict and react, often ahead of time, and also to diagnose more easily, both because I now have an eye into the system and because I have over time built a feeling for what is normal and what is not.
The same is true for visualizations, and we also have the same enemies: misleading scaling, too little or too much detail, distracting details, and colors that look way too similar.
Fair enough.
The usual example used in teaching is Anscombe's Quartet: https://en.wikipedia.org/wiki/Anscombe%27s_quartet
It consists of 4 datasets with the same summary statistics, but when plotted look very different. It's much easier to see the patterns in the plots than in the data table.
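A quick way to check the "same summary statistics" claim yourself, as a sketch in plain Python. The numbers are the published quartet values; the helper `pearson_r` and the `summary` dict are just names chosen for this example.

```python
from statistics import mean

# Anscombe's quartet: sets I-III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Mean of x, mean of y, and correlation for each of the four sets.
summary = {
    name: (mean(xs), mean(ys), pearson_r(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, my, r) in summary.items():
    print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The printed rows come out nearly identical across the four sets, which is exactly why the table alone hides what a scatter plot of each set makes obvious.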
Thanks for that reference.
Nice! No Sankey though?
Isn't #8 basically that, just vertically? Or am I misremembering what a sankey is
Edit: and #42 is a visually similar horizontal variant, but not completely the same
Not much to add other than: I love this.
Now, it will be nice to get the code snippets for these!
Right at the bottom: “HIRE US” :)
I suspect these are all so hand crafted that there’s not much in the way of code.