library("dplyr")
library("ggplot2")
library("plotly")
library("knitr")

Aisha Aslam
July 1, 2025
If you love books, you are at just the right place :) Wouldn’t it be interesting to find out:
Is there a correlation between a book's average rating and the number of text reviews it receives?
Let us find out using the following dataset.
We will be using the publicly accessible dataset ‘Good reads books’ (https://www.kaggle.com/datasets/jealousleopard/goodreadsbooks).
A sneak peek into the dataset
| X | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Harry Potter and the Half-Blood Prince (Harry Potter #6) | J.K. Rowling/Mary GrandPré | 4.57 | 0439785960 | 9.780440e+12 | eng | 652 | 2095690 | 27591 | 2006-09-16 | Scholastic Inc. |
| 2 | 2 | Harry Potter and the Order of the Phoenix (Harry Potter #5) | J.K. Rowling/Mary GrandPré | 4.49 | 0439358078 | 9.780439e+12 | eng | 870 | 2153167 | 29221 | 2004-09-01 | Scholastic Inc. |
| 3 | 4 | Harry Potter and the Chamber of Secrets (Harry Potter #2) | J.K. Rowling | 4.42 | 0439554896 | 9.780440e+12 | eng | 352 | 6333 | 244 | 2003-11-01 | Scholastic |
| 4 | 5 | Harry Potter and the Prisoner of Azkaban (Harry Potter #3) | J.K. Rowling/Mary GrandPré | 4.56 | 043965548X | 9.780440e+12 | eng | 435 | 2339585 | 36325 | 2004-05-01 | Scholastic Inc. |
| 5 | 8 | Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5) | J.K. Rowling/Mary GrandPré | 4.78 | 0439682584 | 9.780440e+12 | eng | 2690 | 41428 | 164 | 2004-09-13 | Scholastic |
| 6 | 9 | Unauthorized Harry Potter Book Seven News: Half-Blood Prince Analysis and Speculation | W. Frederick Zimmerman | 3.74 | 0976540606 | 9.780977e+12 | en-US | 152 | 19 | 1 | 2005-04-26 | Nimble Books |
Let’s visualize the book reviews and book ratings

More ratings = more reviews :)
---
title: "Good Reads"
author: "Aisha Aslam"
format: html
code-fold: true
code-summary: "Show code"
code-tools: true
#editor: visual
date: Jul-2025
warning: false
categories: [R, 25Winter, "data: books.csv"]
---
## Introduction
If you love books, you are at just the right place :) Wouldn't it be interesting to find out:
***Is there a correlation between a book's average rating and the number of text reviews it receives?***
Let us find out using the following dataset.
## The Dataset
We will be using the publicly accessible dataset 'Good reads books' (<https://www.kaggle.com/datasets/jealousleopard/goodreadsbooks>).
```{r}
#| eval: false
#| include: false
install.packages(c("dplyr", "ggplot2", "plotly"))
install.packages("knitr")
```
```{r}
library("dplyr")
library("ggplot2")
library("plotly")
library("knitr")
```
**A sneak peek into the dataset**
```{r}
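# Load the books data (the path is relative to this document)
goodreadsdata <- read.csv("../../../../data/books.csv")
# Show the first six rows as a formatted table
kable(head(goodreadsdata))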
```
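In the preview, `isbn13` appears in scientific notation (for example `9.780440e+12`), which suggests `read.csv()` parsed that identifier as a number. Below is a minimal sketch of one way to keep identifier columns as text. It assumes the column names shown in the preview; the inline two-row CSV is a stand-in for `books.csv`, and its ISBN values are placeholders made up for illustration, not real data.

```{r}
# Stand-in for books.csv: PLACEHOLDER identifier values, not real data
sample_csv <- "bookID,isbn,isbn13
1,0000000001,9780000000001
2,0000000002,9780000000002"

# Default parsing: the identifier columns are converted to numbers
as_numbers <- read.csv(text = sample_csv)

# Ask read.csv() to read the identifier columns as character strings
as_text <- read.csv(
  text = sample_csv,
  colClasses = c(isbn = "character", isbn13 = "character")
)

# Compare how the same column is stored in each case
str(as_numbers$isbn13)
str(as_text$isbn13)
```

The same `colClasses` argument could be passed when reading the real file, which would also preserve any leading zeros in `isbn`.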
**Let's visualize the book reviews and book ratings**
```{r}
# Scatter plot: average rating (x-axis) against number of text reviews (y-axis)
ggplot(goodreadsdata, aes(x = average_rating, y = text_reviews_count)) +
  geom_point(colour = "blue")
```
More ratings = more reviews :)
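
A scatter plot gives a visual impression; a correlation coefficient can put a number on it. The sketch below is one possible check, not part of the original analysis. It uses only the six rows from the dataset preview, so the values it prints say nothing about the full dataset. The helper `rank_cor()` is a hypothetical name introduced here. Because "more ratings" could mean either a higher `average_rating` or a larger `ratings_count`, the sketch compares both with `text_reviews_count`.

```{r}
# Rows copied from the six-row preview of the dataset (not the full data)
preview <- data.frame(
  average_rating     = c(4.57, 4.49, 4.42, 4.56, 4.78, 3.74),
  ratings_count      = c(2095690, 2153167, 6333, 2339585, 41428, 19),
  text_reviews_count = c(27591, 29221, 244, 36325, 164, 1)
)

# Hypothetical helper: Spearman rank correlation between two columns,
# ignoring rows where either value is missing
rank_cor <- function(df, x, y) {
  cor(df[[x]], df[[y]], method = "spearman", use = "complete.obs")
}

# Average rating vs. number of text reviews (the pair shown in the plot)
cor_rating_reviews <- rank_cor(preview, "average_rating", "text_reviews_count")
# Number of ratings vs. number of text reviews
cor_count_reviews <- rank_cor(preview, "ratings_count", "text_reviews_count")

cor_rating_reviews
cor_count_reviews
```

To run the same check on the whole dataset, `rank_cor(goodreadsdata, "average_rating", "text_reviews_count")` should work as long as both columns are numeric. Spearman's rank correlation is used here because review counts are heavily skewed, and a rank-based measure is less sensitive to a few very popular books.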