Practice Set 1
Create an R Markdown document with HTML output.
Install the R package randomNames.
Generate 100 student names (first and last names) using the randomNames package.
Generate 100 grades for the students above. The random numbers should have a mean of 70 and a standard deviation of 15. The functions rnorm and floor are useful here: rnorm(n, mean=a, sd=b) draws n normally distributed values, and floor rounds a value down to a whole number.
Put the student names and grades into one tibble (data frame). You can use the as_tibble function for this purpose. Your tibble should have at least 2 columns (names, grades).
Export this data frame as a CSV file, student_grades.csv.
Note the difference between write.csv, which is in base R, and write_csv, which is in the readr package of the tidyverse.
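The grade generation and export steps can be sketched in base R as follows. This is only an illustration: the placeholder names stand in for the output of the randomNames package, and the seed and the temporary file path are assumptions, not part of the exercise.

```r
set.seed(42)  # assumed seed, only to make the sketch reproducible

n <- 100
# Placeholder names; in the exercise these would come from randomNames.
student_names <- paste("Student", seq_len(n))

# Draw n values from a normal distribution and round down to whole grades.
grades <- floor(rnorm(n, mean = 70, sd = 15))

students <- data.frame(names = student_names, grades = grades)

# write.csv adds a column of row names unless row.names = FALSE is given;
# readr::write_csv never writes row names.
out_file <- file.path(tempdir(), "student_grades.csv")
write.csv(students, out_file, row.names = FALSE)

# Read the file back to check the round trip.
check <- read.csv(out_file)
```

With the tidyverse loaded, the same data frame could be converted with as_tibble and written with write_csv instead.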
Practice Set 2
1. Install the gapminder package. Gapminder contains life expectancy, GDP per capita, and population by country.
install.packages("gapminder")
library(tidyverse)
library(gapminder)
data(gapminder)
2. Using the gapminder dataset, plot the population of Germany over time. You should use
geom_point
geom_line
Which one is better?
3. Create the same plots for your own country.
4. Make a histogram of life expectancy values for the year 2007. Do not forget that you need to subset your dataset.
5. Create facets of histograms of life expectancy values by year.
6. Create a line plot of life expectancy values for the countries. In the aesthetics, you need to set x, y and by.
7. Create a line plot of life expectancy values for the countries. Set the color by continent. In the aesthetics, you need to set x, y, by and color.
8. Create a line plot of life expectancy values for the countries. Create facets for the continent values.
9. Create an R Markdown document that shows your plots.
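A few of these plots can be sketched as below. This is one possible approach, not the only one. The bins value is an arbitrary choice, and group is used as ggplot2's grouping aesthetic where the exercise says "by"; that reading of the exercise is an assumption.

```r
library(ggplot2)
library(gapminder)

# Population of Germany over time, with points and a connecting line.
germany <- subset(gapminder, country == "Germany")
p_germany <- ggplot(germany, aes(x = year, y = pop)) +
  geom_point() +
  geom_line()

# Histogram of life expectancy for 2007 only.
gap_2007 <- subset(gapminder, year == 2007)
p_hist <- ggplot(gap_2007, aes(x = lifeExp)) +
  geom_histogram(bins = 30)

# One histogram panel per year.
p_facets <- ggplot(gapminder, aes(x = lifeExp)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ year)

# One line per country, colored by continent.
p_lines <- ggplot(
  gapminder,
  aes(x = year, y = lifeExp, group = country, color = continent)
) +
  geom_line()
```

Printing any of these objects (for example p_germany) draws the plot. Replacing the color mapping with facet_wrap(~ continent) gives one panel per continent.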
Practice Set 3
1. Create an RStudio project named Lab-2021-10-19.
2. Create an R Markdown document with HTML output.
3. Replace the YAML part of your document with the following.
4. Write the following setup code in your R Markdown document.
#install.packages("gapminder")
library(tidyverse)
library(gapminder)
data(gapminder)
5. Create a header using ## for every question of this lab, like ## Question 5.
6. Write the necessary code so that the ____ parts below are filled in. You need to use inline code for this purpose.
Gapminder dataset contains information about countries and their life expectancy. It has _____ rows and ______ variables.
7. Print the columns of the gapminder dataset as an HTML table. Use both normal output and knitr::kable output.
8. Create a plot of the population of France over time.
9. Knit your document to HTML.
10. Knit your document to Word.
11. End your document by running the following code.
sessionInfo()
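In R Markdown, inline code is written inside the running text as `r expression`, and the value replaces the expression when the document is knitted. A minimal sketch of the values needed for question 6 and the two table styles for question 7 follows. The variable names n_rows and n_vars are hypothetical, and showing only the first rows with head() is an assumption about what the table should contain.

```r
library(gapminder)
library(knitr)

# Values to place in the sentence via inline code, e.g. `r n_rows`.
n_rows <- nrow(gapminder)
n_vars <- ncol(gapminder)

# Normal output: the default print method.
print(head(gapminder))

# knitr::kable output: a formatted table when the document is knitted.
kable(head(gapminder))
```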
Practice Set 4
1. Create an RStudio project named Lab-2021-10-19.
2. Create a new Shiny web application.
3. Give it the Application Name: AppTemplate.
4. Choose the Application Type: Single File (app.R) option.
5. Run the application and see your web application.
6. Change the slider input and watch the histogram change.
7. Create another Shiny web application.
8. Give it the Application Name: AppHelloWorld.
9. Choose the Application Type: Multiple File (ui.R/server.R) option.
10. In ui.R, remove all code and paste in the following code.
library(shiny)
shinyUI(
  fluidPage("Hello World")
)
11. Run the application and see your web application saying Hello World.
12. Stop your application and replace the contents of ui.R with the following code.
library(shiny)
shinyUI(
  fluidPage(
    textInput("name", "Enter your Name"),
    textOutput("outputHello")
  )
)
We are using a simple text input here.
13. Run the application and see your input text box. It does nothing yet, but you can enter text in it.
14. Replace your server.R code with the following code.
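library(shiny)
shinyServer(function(input, output) {
  # Rebuild the greeting whenever the text input changes.
  output$outputHello <- renderText({
    # paste already inserts one space between its arguments.
    paste("Hello", input$name)
  })
})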
15. Deploy your application to http://www.shinyapps.io/
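For comparison, the same Hello World application can be sketched as a single file (app.R), the layout used by AppTemplate. The object names ui, server and app are conventional choices and not something the exercise prescribes.

```r
library(shiny)

# UI: a text box and a placeholder for the greeting.
ui <- fluidPage(
  textInput("name", "Enter your Name"),
  textOutput("outputHello")
)

# Server: rebuild the greeting whenever input$name changes.
server <- function(input, output) {
  output$outputHello <- renderText({
    paste("Hello", input$name)
  })
}

# Combine both halves into one application object.
app <- shinyApp(ui = ui, server = server)

# runApp(app) starts the application in an interactive session.
```

Deployment to shinyapps.io is usually done from RStudio's Publish button or with rsconnect::deployApp(), after the account has been configured.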
Practice Set 5
1. Install the e1071 package in R.
install.packages("e1071")
2. Run the following R file in RStudio.
classification_iris_full_data.R
library("e1071")
df = iris
model = naiveBayes(Species~.,data=df)
predicted_values = predict(model,df[,1:4])
correctly_predicted = sum(predicted_values == df[,"Species"])
print(paste("Correctly predicted",correctly_predicted))
accuracy = correctly_predicted / nrow(df)
print(paste("accuracy",accuracy))
3. Our R code will always follow the same approach. The classification code calls the model function with a formula that names the column to predict. Here we are trying to predict Species. The data argument sets the dataset.
model = naiveBayes(Species~.,data=df)
For prediction, we use the predict function and give it our model and the test data.
predict(model,data)
In this file, we use the following line, since only the first 4 columns contain our X values. Our target is the 5th column; therefore, we exclude it from the data frame.
predict(model,df[,1:4])
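Beyond a single accuracy number, a confusion matrix shows which species are mistaken for each other. The following is a minimal sketch using base R's table function. It is an addition for illustration, not part of the original script.

```r
library(e1071)

df <- iris
model <- naiveBayes(Species ~ ., data = df)
predicted_values <- predict(model, df[, 1:4])

# Cross-tabulate predictions against the true labels.
confusion <- table(Predicted = predicted_values, Actual = df$Species)
print(confusion)

# Accuracy is the share of cases on the diagonal.
accuracy <- sum(diag(confusion)) / sum(confusion)
```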
4. Run the following R file in RStudio.
classification_iris_train_test_split.R
library("e1071")
df = iris
train_test_split_percentage = 0.66
train_rows = sample(nrow(df), nrow(df)*train_test_split_percentage)
train_data = df[train_rows,]
test_data = df[-train_rows,]
model = naiveBayes(Species~.,data=train_data)
predicted_values_test = predict(model,test_data[,1:4])
correctly_predicted_test = sum(predicted_values_test == test_data[,"Species"])
print(paste("Correctly predicted on TEST",correctly_predicted_test))
accuracy = correctly_predicted_test / nrow(test_data)
print(paste("accuracy on Test Dataset",accuracy))
Base R does not have a formal train/test split function, so we use the sample function to sample rows from the data frame. Using negative indexing, we get the test data as in the code below.
test_data = df[-train_rows,]
The other difference in our code is that the model and predict calls now use the train and test data, respectively.
model = naiveBayes(Species~.,data=train_data)
predicted_values = predict(model,test_data[,1:4])
correctly_predicted = sum(predicted_values == test_data[,"Species"])
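The split itself can be checked in isolation. The sketch below uses only base R. The seed and the explicit floor are assumptions added so that the split is reproducible and the training size is unambiguous.

```r
set.seed(1)  # assumed seed so the split is the same on every run

df <- iris
train_fraction <- 0.66

# sample() draws distinct row numbers; floor() makes the count explicit.
n_train <- floor(nrow(df) * train_fraction)
train_rows <- sample(nrow(df), n_train)

train_data <- df[train_rows, ]
test_data <- df[-train_rows, ]  # negative index: every row not in train_rows
```

Because sample draws without replacement by default, no row ends up in both sets.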
5. Install the rpart package in R.
install.packages("rpart")
6. Save a copy of classification_iris_full_data.R as classification_iris_full_data_dt.R.
7. Change the code so that it uses rpart instead of naiveBayes for classification.
8. predict.rpart does not work like the naiveBayes version. You need to give one more argument to predict, as below.
predicted_values = predict(model,df[,1:4],type = "class")
9. Save a copy of classification_iris_train_test_split.R as classification_iris_train_test_split_dt.R and make similar changes so that it works with an rpart decision tree.
10. Create an R Markdown document that shows the above steps.
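One way the decision tree version of the train/test script might look is sketched below. It is a hedged illustration rather than a reference solution. The seed is an assumption, and method = "class" is passed explicitly even though rpart would choose it for a factor target anyway.

```r
library(rpart)

set.seed(1)  # assumed seed for a reproducible split
df <- iris
train_rows <- sample(nrow(df), floor(nrow(df) * 0.66))
train_data <- df[train_rows, ]
test_data <- df[-train_rows, ]

# Fit a classification tree on the training rows only.
model <- rpart(Species ~ ., data = train_data, method = "class")

# type = "class" returns predicted labels instead of class probabilities.
predicted_values_test <- predict(model, test_data[, 1:4], type = "class")

accuracy <- mean(predicted_values_test == test_data$Species)
print(paste("accuracy on Test Dataset", accuracy))
```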
Practice Set 6
Load the example-data-bad.csv file into RStudio. You will have problems loading this file, so you need to customize the read_csv call to load it.
Create an R Markdown document.
In this R Markdown document, show the contents of example-data-bad in a table.
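The contents of example-data-bad.csv are not shown here, so the following sketch builds a small hypothetical file instead. Its title line, semicolon separator and "-" marker for missing values are all assumptions chosen to illustrate typical readr options. The real file may need different ones.

```r
library(readr)

# Hypothetical stand-in for a problematic file; not the real example-data-bad.csv.
bad_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Exported report",
  "id;name;score",
  "1;Ann;90",
  "2;Bob;-"
), bad_file)

# skip drops the title line, delim sets the separator, na marks missing values.
data_clean <- read_delim(
  bad_file,
  delim = ";",
  skip = 1,
  na = c("", "NA", "-"),
  show_col_types = FALSE
)

# problems() lists any rows that readr could not parse as expected.
problems(data_clean)
```

read_csv accepts the same skip and na arguments when the file is comma separated.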
Practice Set 7
1. Do either one of the following:
Go to https://www.kaggle.com/drubal/top-100-global-steel-producers-20112016 and download the dataset. You may need to create an account on Kaggle.
2. Import the dataset into a tibble.
3. This dataset is not in tidy format. Transform the dataset so that it is in tidy format.
4. Using ggplot2, create a bar chart that shows the total tonnage of steel produced between 2011 and 2016. That is, you must sum these years into one column. For example, Turkey should show 53.21.
5. Create a pie chart that shows each total as a percentage of the whole. This figure shows the same information as the previous part, but presents it differently.
6. Create an R Markdown file.
7. Put these two figures into this R Markdown file.
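The tidy, sum and plot steps can be sketched on a toy table. The column names and values below are made up for illustration and do not come from the Kaggle dataset, whose actual layout may differ.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical wide table: one column per year (made-up values).
steel_wide <- tibble(
  producer = c("Producer A", "Producer B"),
  `2011` = c(10, 20),
  `2012` = c(15, 25)
)

# Tidy format: one row per producer and year.
steel_tidy <- pivot_longer(
  steel_wide,
  cols = -producer,
  names_to = "year",
  values_to = "tonnage"
)

# Sum all years into one column per producer.
steel_total <- steel_tidy %>%
  group_by(producer) %>%
  summarise(total = sum(tonnage))

# Bar chart of the totals.
p_bar <- ggplot(steel_total, aes(x = producer, y = total)) +
  geom_col()

# A pie chart is a stacked bar drawn in polar coordinates.
p_pie <- ggplot(steel_total, aes(x = "", y = total, fill = producer)) +
  geom_col() +
  coord_polar(theta = "y")
```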