In this set we will cover most of the important Machine Learning and Deep Learning topics that can improve your skills as a data scientist:
1 Machine Learning Tasks
1.1 Using TSNE to Visualize the Digits Dataset in 3D
You have visualized the Digits dataset’s clusters in two dimensions.
In this exercise, you’ll create a 3D scatter plot using TSNE and Matplotlib’s Axes3D, which provides x-, y- and z-axes for plotting in three dimensions.
To do so, load the Digits dataset, create a TSNE estimator that reduces data to three dimensions and call the estimator’s fit_transform method to reduce the dataset’s dimensions. Store the result in reduced_data. Next, execute the following code:
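import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib

figure = plt.figure(figsize=(9, 9))
axes = figure.add_subplot(111, projection='3d')

# One point per sample, colored by its digit label (0-9).
# plt.get_cmap is used because plt.cm.get_cmap was removed in recent Matplotlib releases.
dots = axes.scatter(
    xs=reduced_data[:, 0],
    ys=reduced_data[:, 1],
    zs=reduced_data[:, 2],
    c=digits.target,
    cmap=plt.get_cmap('nipy_spectral_r', 10))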
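The preparatory steps described above (loading Digits, creating a three-component TSNE estimator and calling fit_transform) might look like the following minimal sketch, which would run before the plotting code. The random_state value is an arbitrary choice made here for reproducibility; the exercise does not specify one.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Load the 1,797 8-by-8 digit images (64 features per sample)
digits = load_digits()

# Reduce the 64 features to 3 so each sample becomes an (x, y, z) point.
# random_state=11 is an arbitrary seed chosen for this sketch.
tsne = TSNE(n_components=3, random_state=11)
reduced_data = tsne.fit_transform(digits.data)

print(reduced_data.shape)  # (1797, 3)
```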
2.2 Simple Linear Regression with Average Yearly NYC Temperatures Time Series
Go to NOAA’s Climate at a Glance page (https://www.ncdc.noaa.gov/cag) and download the available time series data for the New York City average annual temperatures from 1895 through the present. Reimplement the simple linear regression using the average yearly temperature data. How does the temperature trend compare to the trend for the average January high temperatures?
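One possible way to fit the trend line is SciPy's linregress function, sketched below. The helper names fit_trend and predict are hypothetical, and the five data points are placeholder values lying exactly on a line, not NOAA measurements. Replace them with the year and temperature columns from the downloaded file.

```python
from scipy import stats


def fit_trend(years, temperatures):
    """Fit temperature = slope * year + intercept by least squares."""
    result = stats.linregress(x=years, y=temperatures)
    return result.slope, result.intercept


def predict(slope, intercept, year):
    """Evaluate the fitted line for a given year."""
    return slope * year + intercept


# Placeholder values (NOT NOAA data), chosen to lie on a straight line
years = [2000, 2001, 2002, 2003, 2004]
temperatures = [50.0, 50.5, 51.0, 51.5, 52.0]

slope, intercept = fit_trend(years, temperatures)
print(f'slope: {slope:.3f} degrees per year')
print(f'estimate for 2010: {predict(slope, intercept, 2010):.1f}')
```

Comparing the slope obtained from the annual averages with the slope from the January highs answers the question about the two trends.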
2.3 Classification with the Iris Dataset: Hyperparameter Tuning
Using scikit-learn’s KFold class and cross_val_score function, determine the optimal k value for classifying Iris samples using a KNeighborsClassifier.
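A minimal sketch of this search follows. The candidate range (odd values from 1 through 19), the 10 folds and the random_state seed are assumptions made for illustration, not requirements of the exercise.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Shuffle before splitting because the Iris samples are ordered by class
kfold = KFold(n_splits=10, shuffle=True, random_state=11)

mean_accuracy = {}
for k in range(1, 20, 2):  # arbitrary search range chosen for this sketch
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(estimator=knn, X=iris.data, y=iris.target, cv=kfold)
    mean_accuracy[k] = scores.mean()
    print(f'k={k:<2} mean accuracy={scores.mean():.2%}')

# The k with the highest mean cross-validation accuracy
best_k = max(mean_accuracy, key=mean_accuracy.get)
print(f'best k: {best_k}')
```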
2.4 Classification with the Iris Dataset: Choosing the Best Estimator
Run multiple classification estimators for the Iris dataset and compare the results to see which one performs best.
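One way to organize the comparison is a dictionary that maps names to estimators, each scored with the same cross-validation folds. The four estimators below are an assumed selection for illustration; any scikit-learn classifier could be added to the dictionary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumed selection of classifiers, all with default hyperparameters
estimators = {
    'KNeighborsClassifier': KNeighborsClassifier(),
    'SVC': SVC(),
    'GaussianNB': GaussianNB(),
    'DecisionTreeClassifier': DecisionTreeClassifier(random_state=11),
}

# Reusing one KFold object gives every estimator the same folds
kfold = KFold(n_splits=10, shuffle=True, random_state=11)

results = {}
for name, estimator in estimators.items():
    scores = cross_val_score(estimator, iris.data, iris.target, cv=kfold)
    results[name] = (scores.mean(), scores.std())
    print(f'{name:>22}: mean={scores.mean():.2%}, std={scores.std():.2%}')
```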
2.5 Clustering the Digits Dataset with DBSCAN and MeanShift
Recall that when using the DBSCAN and MeanShift clustering estimators you do not specify the number of clusters in advance. Use each of these estimators with the Digits dataset to determine whether each estimator recognizes 10 clusters of digits.
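A minimal sketch of the experiment follows, using each estimator's default settings (an assumption; the exercise does not say which hyperparameters to use). The helper count_clusters is a hypothetical name. It ignores the label -1, which DBSCAN assigns to samples it treats as noise.

```python
from sklearn.cluster import DBSCAN, MeanShift
from sklearn.datasets import load_digits

digits = load_digits()


def count_clusters(labels):
    """Count distinct cluster labels, ignoring DBSCAN's noise label (-1)."""
    return len(set(labels) - {-1})


cluster_counts = {}
for estimator in (DBSCAN(), MeanShift()):
    estimator.fit(digits.data)  # MeanShift may take a while on 64 features
    name = type(estimator).__name__
    cluster_counts[name] = count_clusters(estimator.labels_)
    print(f'{name}: {cluster_counts[name]} clusters')
```

If neither count is close to 10, try adjusting DBSCAN's eps and min_samples or MeanShift's bandwidth and compare again.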
2.6 Using %timeit to Time Training and Prediction
In the k-nearest neighbors algorithm, the computation time for classifying samples increases with the value of k. Use %timeit to calculate the run time of the KNeighborsClassifier cross-validation for the Digits dataset. Use values of 1, 10 and 20 for k. Compare the results.
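The %timeit magic is available only inside IPython. As a sketch that also runs in a plain Python script, the standard-library timeit module can take a similar measurement. The helper cross_validate, the 10 folds and the best-of-3 policy are assumptions made for this illustration.

```python
import timeit

from sklearn.datasets import load_digits
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
kfold = KFold(n_splits=10, shuffle=True, random_state=11)


def cross_validate(k):
    """Run 10-fold cross-validation for a k-nearest neighbors classifier."""
    knn = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(knn, digits.data, digits.target, cv=kfold)


timings = {}
for k in (1, 10, 20):
    # Best of 3 single runs, similar in spirit to %timeit's repeated measurement
    timings[k] = min(timeit.repeat(lambda: cross_validate(k), number=1, repeat=3))
    print(f'k={k:<2} best of 3: {timings[k]:.3f} s')
```

In an IPython session, the equivalent measurement for one value would be %timeit cross_validate(10).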
2.7 Linear Regression with Sea Level Trends
NOAA’s Sea Level Trends website https://tidesandcurrents.noaa.gov/sltrends/ provides time series data for sea levels worldwide. Use their Trend Tables link to access tables listing sea-level time series for cities in the U.S. and worldwide. The date ranges available vary by city. Choose three cities for which 100% of the data is available (as shown in the % Complete column). Clicking the link in the Station ID column displays a table of time series data, which you can then export to your system as a CSV file. Load and plot each dataset on the same diagram using Seaborn’s regplot function. In IPython interactive mode, each call to regplot uses the same diagram by default and adds data in a new color. Implement Linear Regression for monthly mean sea level of each location and plot the predicted lines together with the datasets.
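The regression and plotting steps could be sketched as follows with scikit-learn's LinearRegression and Matplotlib. The two station series are hypothetical placeholders generated from straight lines, not NOAA measurements, and fit_station is a hypothetical helper. Loading the CSV files is left out because the column names depend on the export.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_station(decimal_years, levels):
    """Fit level = slope * year + intercept for one station's monthly means."""
    X = np.asarray(decimal_years, dtype=float).reshape(-1, 1)  # samples x 1 feature
    y = np.asarray(levels, dtype=float)
    return LinearRegression().fit(X, y)


# Placeholder monthly series for 2000-2009 (NOT NOAA data)
decimal_years = 2000 + np.arange(120) / 12
stations = {
    'Station A (placeholder)': 0.002 * (decimal_years - 2000),
    'Station B (placeholder)': 0.05 + 0.003 * (decimal_years - 2000),
}

figure, axes = plt.subplots()
models = {}
for name, levels in stations.items():
    models[name] = fit_station(decimal_years, levels)
    predicted = models[name].predict(decimal_years.reshape(-1, 1))
    axes.scatter(decimal_years, levels, s=5, label=name)  # the dataset
    axes.plot(decimal_years, predicted)                   # the predicted line
    print(f'{name}: slope {models[name].coef_[0]:.4f} per year')

axes.set_xlabel('Year')
axes.set_ylabel('Monthly mean sea level')
axes.legend()
```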
2.8 Linear Regression with Sea Temperature Trends
Ocean temperatures are changing fish migratory patterns. Download NOAA’s global average surface temperature anomalies time series data for 1880–2018 from https://www.ncdc.noaa.gov/cag/global/time-series/globe/ocean/1/12/1880-2021 then load and plot the dataset using Seaborn’s regplot function. What trend do you see? Implement Linear Regression with this dataset and plot the predicted line together with the dataset.
2.9 Linear Regression with the Diabetes Dataset
Investigate the Diabetes dataset bundled with scikit-learn (https://scikit-learn.org/stable/datasets/toy_dataset.html). The dataset contains 442 samples, each with 10 features and a label indicating the “disease progression one year after baseline.” Using this dataset, implement multiple linear regression.
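A minimal sketch of multiple linear regression on this dataset follows. The default 75%/25% train/test split and the random_state seed are assumptions made here; the exercise does not prescribe how to evaluate the model.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

diabetes = load_diabetes()
print(diabetes.data.shape)  # (442, 10)

X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, random_state=11)

# Multiple linear regression: one coefficient per feature plus an intercept
model = LinearRegression()
model.fit(X_train, y_train)

for name, coefficient in zip(diabetes.feature_names, model.coef_):
    print(f'{name:>4}: {coefficient:9.2f}')

# Coefficient of determination on the samples held out from training
r2 = r2_score(y_test, model.predict(X_test))
print(f'R2 on the test split: {r2:.3f}')
```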
2.10 Binary Classification with the Breast Cancer Dataset
Check out the Breast Cancer Wisconsin Diagnostic dataset that’s bundled with scikit-learn https://scikit-learn.org/stable/datasets/toy_dataset.html. The dataset contains 569 samples, each with 30 features and a label indicating whether a tumor was malignant (0) or benign (1). There are only two labels, so this dataset is commonly used to perform binary classification.
Classify this dataset using the GaussianNB (short for Gaussian Naive Bayes) estimator.
Execute multiple classifiers to determine which one is best for the Breast Cancer Wisconsin Diagnostic dataset. Include a LogisticRegression classifier in the estimators dictionary. Logistic regression is another popular algorithm for binary classification.
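The sketch below trains GaussianNB and LogisticRegression on the same stratified split and prints each one's accuracy and confusion matrix. Scaling the features before LogisticRegression is a choice made here so the default solver converges. The exercise itself does not ask for it, and other classifiers can be added to the dictionary in the same way.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=11, stratify=cancer.target)

estimators = {
    'GaussianNB': GaussianNB(),
    # Assumption of this sketch: standardize features before logistic regression
    'LogisticRegression': make_pipeline(StandardScaler(), LogisticRegression()),
}

accuracy = {}
for name, estimator in estimators.items():
    estimator.fit(X_train, y_train)
    accuracy[name] = estimator.score(X_test, y_test)
    print(f'{name}: {accuracy[name]:.2%}')
    # Rows are true classes, columns are predicted classes (0=malignant, 1=benign)
    print(confusion_matrix(y_test, estimator.predict(X_test)))
```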
3 Deep Learning Tasks
Convolutional Neural Networks
3.1 Image Recognition: The Fashion-MNIST Dataset
Keras comes bundled with the Fashion-MNIST database of fashion articles which, like the MNIST digits dataset, provides 28-by-28 grayscale images. Fashion-MNIST contains clothing-article images labeled in 10 categories: 0 (T-shirt/top), 1 (Trouser), 2 (Pullover), 3 (Dress), 4 (Coat), 5 (Sandal), 6 (Shirt), 7 (Sneaker), 8 (Bag) and 9 (Ankle boot). It has 60,000 training samples and 10,000 testing samples. Modify the convnet example to load and process Fashion-MNIST rather than MNIST. This requires only importing the correct module and loading the data; then re-run the entire example so the model trains on these images and labels. How well does the model perform on Fashion-MNIST compared to MNIST? How do the training times compare?
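Because Fashion-MNIST has the same image size and number of classes as MNIST, the data preparation can stay the same. The sketch below assumes the convnet example expects images shaped (samples, 28, 28, 1), scaled to the range 0 to 1, with one-hot encoded labels. That is an assumption, since the example itself is not shown here. The helper name prepare is hypothetical, and the arrays are zero-filled placeholders so the sketch runs without Keras.

```python
import numpy as np


def prepare(images, labels, num_classes=10):
    """Reshape to (samples, 28, 28, 1), scale pixels to [0, 1], one-hot encode labels."""
    x = images.reshape((-1, 28, 28, 1)).astype('float32') / 255
    y = np.eye(num_classes, dtype='float32')[labels]
    return x, y


# Placeholder arrays with Fashion-MNIST's per-sample shape. With Keras installed,
# the real arrays would come from
# tensorflow.keras.datasets.fashion_mnist.load_data().
images = np.zeros((5, 28, 28), dtype='uint8')
labels = np.array([0, 9, 3, 3, 7])

x, y = prepare(images, labels)
print(x.shape, y.shape)  # (5, 28, 28, 1) (5, 10)
```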
3.2 MNIST Handwritten Digits Hyperparameter Tuning: Changing the Kernel Size
In the MNIST convnet example, change the kernel size from 3-by-3 to 5-by-5. Re-execute the model. How does this change the prediction accuracy?
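Before re-running the model, it can help to see what the larger kernel changes structurally. The sketch below computes the output size and parameter count of a first convolution layer applied to a 28-by-28 single-channel image. It assumes no padding, a stride of 1 and 64 filters. These are illustrative assumptions, so check them against the actual model.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Side length of a convolution's output for a square input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_parameter_count(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


FILTERS = 64  # assumed filter count for illustration

for kernel_size in (3, 5):
    side = conv_output_size(28, kernel_size)
    params = conv_parameter_count(kernel_size, in_channels=1, filters=FILTERS)
    print(f'{kernel_size}x{kernel_size}: output {side}x{side}, {params} parameters')
# 3x3: output 26x26, 640 parameters
# 5x5: output 24x24, 1664 parameters
```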
3.3 MNIST Handwritten Digits Hyperparameter Tuning: Changing the Batch Size
In the MNIST convnet example, we used a training batch size of 64. Larger batch sizes can decrease model accuracy. Re-execute the model for batch sizes of 32 and 128. How do these values change the prediction accuracy?
3.4 Convnet Layers
Remove the first Dense layer in the convnet model. How does this change the prediction accuracy? Several Keras pretrained convnets contain Dense layers with 4096 neurons. Add such a layer before the two Dense layers in the convnet model. How does this change the prediction accuracy?
3.5 Does the Size of the Training Data Set Matter?
Rerun the MNIST convnet model with only 25% of the original training dataset, then 50%, then 75%. Use scikit-learn’s train_test_split function to randomly select the training dataset items. Compare the results to when you trained the model with the complete training dataset.
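Selecting the subsets could be sketched as follows. The arrays are small zero-filled placeholders standing in for the prepared MNIST training data, and stratifying by label is a choice made here to keep the class proportions equal across subsets. The model training call is left as a comment.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the MNIST training set (NOT real data)
X_train = np.zeros((1000, 28, 28, 1), dtype='float32')
y_train = np.arange(1000) % 10  # placeholder labels 0-9, 100 of each

subsets = {}
for fraction in (0.25, 0.50, 0.75):
    # Keep only the "train" part of the split; the remainder is discarded
    X_subset, _, y_subset, _ = train_test_split(
        X_train, y_train, train_size=fraction, random_state=11, stratify=y_train)
    subsets[fraction] = (X_subset, y_subset)
    print(f'{fraction:.0%}: {len(X_subset)} samples')
    # Train a fresh model on (X_subset, y_subset) here and record its test accuracy
```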