# NPTEL Introduction to Machine Learning Assignment 5 Answers 2023

Hello NPTEL Learners, In this article, you will find NPTEL Introduction to Machine Learning Assignment 5 Week 5 Answers 2023. All the answers are provided below to help the students as a reference. Don't straight away look for the solutions; first try to solve the questions by yourself. If you face any difficulty, then look at the solutions.

###### NPTEL Introduction to Machine Learning Assignment 5 Answers 2023 Join Group👇

Note: We are trying to give our best so please share with your friends also.

## NPTEL Introduction to Machine Learning Assignment 5 Answers 2023:

#### Q.1.

• y = wx with w > 0
• y = wx with w < 0
• y = x^w with w > 0
• y = x^w with w < 0

#### Q.2. For training a binary classification model with five independent variables, you choose to use neural networks. You apply one hidden layer with three neurons. What is the number of parameters to be estimated? (Consider the bias term as a parameter.)

• 16
• 21
• 3^4 = 81
• 4^3 = 64
• 12
• 22
• 25
• 26
• 4
• None of these
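The parameter count can be cross-checked layer by layer: each neuron has one weight per input plus one bias term. The sketch below uses the sizes stated in the question (5 inputs, one hidden layer of 3 neurons, 1 output):

```python
# Count trainable parameters of a fully connected network.
# Each neuron contributes (inputs + 1 bias) parameters.
def count_params(layer_sizes):
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += (n_in + 1) * n_out  # +1 for each neuron's bias
    return total

# 5 inputs -> 3 hidden neurons -> 1 output neuron
print(count_params([5, 3, 1]))  # (5+1)*3 + (3+1)*1 = 22
```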

#### Q.3. Suppose the marks obtained by randomly sampled students follow a normal distribution with unknown μ. A random sample of 5 marks is 25, 55, 64, 7 and 99. Using the given samples, find the maximum likelihood estimate of the mean.

• 54.2
• 67.75
• 50
• Information not sufficient for estimation
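For a normal distribution with unknown mean, the maximum likelihood estimate of μ is simply the sample mean, which is quick to verify:

```python
# MLE of the mean of a normal distribution is the sample mean.
marks = [25, 55, 64, 7, 99]
mle_mean = sum(marks) / len(marks)
print(mle_mean)  # 250 / 5 = 50.0
```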

#### Q.4. You are given the following neural network, which takes two binary-valued inputs x1, x2 ∈ {0, 1}; the activation function is the threshold function (h(x) = 1 if x > 0; 0 otherwise). Which of the following logical functions does it compute?

• OR
• AND
• NAND
• None of the above
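The network diagram is not reproduced here, but the general procedure is to feed all four binary input pairs through the threshold unit and read off the truth table. The weights below (w1 = w2 = 1, bias = −1.5) are hypothetical values that happen to realise AND; the actual answer depends on the weights in the figure:

```python
# Truth table of a single threshold unit h(z) = 1 if z > 0 else 0.
# Weights are illustrative only, not taken from the (missing) figure.
def threshold_unit(x1, x2, w1=1.0, w2=1.0, b=-1.5):
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, threshold_unit(x1, x2))  # matches AND for these weights
```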

#### Q.5. Using the notations used in class, evaluate the value of the neural network with a 3-3-1 architecture (2-dimensional input with 1 node for the bias term in both the layers). The parameters are as follows:

Using sigmoid function as the activation functions at both the layers, the output of the network for an input of (0.8, 0.7) will be

• 0.6710
• 0.9617
• 0.6948
• 0.7052
• 0.2023
• 0.7977
• 0.2446
• None of these
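The parameter table from the question is not reproduced above, so the sketch below uses placeholder weights just to show the mechanics of a 3-3-1 forward pass (bias handled as an extra node with activation 1 in each layer); plugging in the actual parameters from the assignment reproduces one of the listed options:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W_hidden, w_out):
    # Prepend the bias node (activation 1) to the input layer.
    a = [1.0] + list(x)
    # Hidden layer: two sigmoid units plus a bias node for the next layer.
    h = [1.0] + [sigmoid(sum(w * ai for w, ai in zip(row, a))) for row in W_hidden]
    # Output layer: a single sigmoid unit.
    return sigmoid(sum(w * hi for w, hi in zip(w_out, h)))

# Placeholder parameters (NOT the ones from the assignment).
W_hidden = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
w_out = [0.1, 0.2, 0.3]
print(forward((0.8, 0.7), W_hidden, w_out))
```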

#### Q.6. Which of the following statements are true:

• The chances of overfitting decreases with increasing the number of hidden nodes and increasing the number of hidden layers.
• A neural network with one hidden layer can represent any Boolean function given sufficient number of hidden units and appropriate activation functions.
• Two hidden layer neural networks can represent any continuous functions (within a tolerance) as long as the number of hidden units is sufficient and appropriate activation functions used.
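The second statement can be made concrete: XOR, which a single perceptron cannot represent, is computable by one hidden layer of threshold units. A minimal sketch:

```python
# XOR with one hidden layer of threshold units:
# h1 = OR(x1, x2), h2 = NAND(x1, x2), output = AND(h1, h2).
def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # OR
    h2 = step(-x1 - x2 + 1.5)   # NAND
    return step(h1 + h2 - 1.5)  # AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))
```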

#### Q.7. We have a function which takes a two-dimensional input x = (x1, x2) and has two parameters w = (w1, w2), given by f(x, w) = σ(σ(x1·w1)·w2 + x2), where σ(x) = 1/(1 + e^(−x)). We use backpropagation to estimate the right parameter values. We start by setting both the parameters to 1. Assume that we are given a training point x1 = 0, x2 = 1, y = 5. Given this information, answer the next two questions. What is the value of ∂f/∂w2?

• 0.150
• -0.25
• 0.125
• 0.098
• 0.0746
• 0.1604
• None of these
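Working the chain rule by hand: with w1 = w2 = 1 and the point x1 = 0, x2 = 1, the inner activation is σ(0) = 0.5, the output is f = σ(0.5·1 + 1) = σ(1.5), and ∂f/∂w2 = f(1 − f)·σ(x1·w1). A quick numeric check:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w1 = w2 = 1.0
x1, x2 = 0.0, 1.0

inner = sigmoid(x1 * w1)         # sigma(0) = 0.5
f = sigmoid(inner * w2 + x2)     # sigma(1.5)
grad_w2 = f * (1.0 - f) * inner  # sigma'(z) = sigma(z) * (1 - sigma(z))
print(round(grad_w2, 4))  # 0.0746
```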

#### Q.8. If the learning rate is 0.5, what will be the value of w2 after one update using backpropagation algorithm?

• 0.4197
• -0.4197
• 0.6881
• -0.6881
• 1.3119
• -1.3119
• 0.5625
• -0.5625
• None of these
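The question does not restate the loss function, so the sketch below assumes the squared-error loss L = (y − f)^2; under that assumption, one gradient step on w2 with learning rate 0.5 gives:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w1 = w2 = 1.0
x1, x2, y = 0.0, 1.0, 5.0
lr = 0.5

inner = sigmoid(x1 * w1)
f = sigmoid(inner * w2 + x2)
df_dw2 = f * (1.0 - f) * inner    # ~0.0746, from the previous question
dL_dw2 = -2.0 * (y - f) * df_dw2  # assumed loss: L = (y - f)^2
w2_new = w2 - lr * dL_dw2
print(round(w2_new, 4))  # 1.3119
```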

#### Q.9. Which of the following are true when comparing ANNs and SVMs?

• An ANN's error surface has multiple local minima, while an SVM's error surface has only one minimum.
• After training, an ANN might land on a different minimum each time, when initialized with random weights during each run.
• As shown for Perceptron, there are some classes of functions that cannot be learnt by an ANN. An SVM can learn a hyperplane for any kind of distribution.
• In training, ANN’s error surface is navigated using a gradient descent technique while SVM’s error surface is navigated using convex optimization solvers.

#### Q.10. Which of the following are correct?

• A perceptron will learn the underlying linearly separable boundary within a finite number of training steps.
• XOR function can be modelled by a single perceptron.
• Backpropagation algorithm used while estimating parameters of neural networks actually uses gradient descent algorithm.
• The backpropagation algorithm will always converge to global optimum, which is one of the reasons for impressive performance of neural networks.
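The first statement can be demonstrated: on linearly separable data, the perceptron update rule stops making mistakes after finitely many passes. A small sketch on a toy separable 2-D dataset (points and labels are illustrative):

```python
# Perceptron learning on a linearly separable 2-D dataset.
# Labels are +1/-1; the data is separable by x1 + x2 > 1.5.
data = [((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 1.0), -1),
        ((1.0, 1.0), 1), ((2.0, 1.0), 1)]
w = [0.0, 0.0]
b = 0.0
for _ in range(1000):  # finitely many passes suffice on separable data
    mistakes = 0
    for (x1, x2), y in data:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified
            w[0] += y * x1
            w[1] += y * x2
            b += y
            mistakes += 1
    if mistakes == 0:  # a clean pass: the boundary has been learnt
        break
print(all(y * (w[0] * x1 + w[1] * x2 + b) > 0 for (x1, x2), y in data))  # True
```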

Disclaimer: These answers are provided for discussion purposes only; if any answer turns out to be wrong, please don't blame us. If you have any doubt or suggestion regarding any question, kindly comment. The solution is provided by Chase2learn. This tutorial is only for discussion and learning purposes.

#### About NPTEL Introduction to Machine Learning Course:

With the increased availability of data from varied sources there has been increasing attention paid to the various data driven disciplines such as analytics and machine learning. In this course we intend to introduce some of the basic concepts of machine learning from a mathematically well motivated perspective. We will cover the different learning paradigms and some of the more popular algorithms and architectures used in each of these paradigms.

##### Course Outcome:
• Week 0: Probability Theory, Linear Algebra, Convex Optimization – (Recap)
• Week 1: Introduction: Statistical Decision Theory – Regression, Classification, Bias Variance
• Week 2: Linear Regression, Multivariate Regression, Subset Selection, Shrinkage Methods, Principal Component Regression, Partial Least squares
• Week 3: Linear Classification, Logistic Regression, Linear Discriminant Analysis
• Week 4: Perceptron, Support Vector Machines
• Week 5: Neural Networks – Introduction, Early Models, Perceptron Learning, Backpropagation, Initialization, Training & Validation, Parameter Estimation – MLE, MAP, Bayesian Estimation
• Week 6: Decision Trees, Regression Trees, Stopping Criterion & Pruning loss functions, Categorical Attributes, Multiway Splits, Missing Values, Decision Trees – Instability Evaluation Measures
• Week 7: Bootstrapping & Cross Validation, Class Evaluation Measures, ROC curve, MDL, Ensemble Methods – Bagging, Committee Machines and Stacking, Boosting
• Week 8: Gradient Boosting, Random Forests, Multi-class Classification, Naive Bayes, Bayesian Networks
• Week 9: Undirected Graphical Models, HMM, Variable Elimination, Belief Propagation
• Week 10: Partitional Clustering, Hierarchical Clustering, Birch Algorithm, CURE Algorithm, Density-based Clustering
• Week 11: Gaussian Mixture Models, Expectation Maximization
• Week 12: Learning Theory, Introduction to Reinforcement Learning, Optional videos (RL framework, TD learning, Solution Methods, Applications)
###### CRITERIA TO GET A CERTIFICATE:

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >= 10/25 AND EXAM SCORE >= 30/75. If either of the two criteria is not met, you will not get the certificate even if the final score is >= 40/100.
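The eligibility rule combines two separate thresholds with a weighted sum, which is easy to misread; a quick sketch of the computation (the scores below are hypothetical):

```python
# Certificate eligibility: both component thresholds must be met,
# regardless of the combined final score.
def certificate(avg_assignment_25, exam_75):
    # avg_assignment_25 is already scaled to /25, exam_75 to /75.
    final = avg_assignment_25 + exam_75
    eligible = avg_assignment_25 >= 10 and exam_75 >= 30
    return final, eligible

# Hypothetical learner: assignment component 12/25, exam component 28/75.
print(certificate(12, 28))  # (40, False): final score is 40, but exam < 30/75
```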

If you have not registered for the exam, kindly register through https://examform.nptel.ac.in/