RANDOM FOREST CLASSIFIER

A random forest is a meta-estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. When bootstrap=True (the default), the sub-sample size is controlled by the max_samples parameter; otherwise the whole dataset is used to build each tree.
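To make the sub-sampling idea concrete, here is a minimal sketch of drawing a bootstrap sample (sampling row indices with replacement). The helper name bootstrap_indices is hypothetical and only illustrates the concept; it is not how scikit-learn implements it internally, and it handles only an integer max_samples.

```python
import random

def bootstrap_indices(n_samples, max_samples=None, seed=0):
    """Draw row indices with replacement (hypothetical helper for illustration).

    If max_samples is None, the sample has as many rows as the dataset.
    """
    rng = random.Random(seed)
    size = n_samples if max_samples is None else max_samples
    return [rng.randrange(n_samples) for _ in range(size)]

# Each tree would be trained on its own draw; the same row can appear more than once.
indices_for_tree = bootstrap_indices(n_samples=10, max_samples=6, seed=42)
print(indices_for_tree)
```

Because sampling is done with replacement, some rows are left out of each draw; those left-out rows are what the out-of-bag score relies on.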

Random Forest is a well-known supervised machine learning algorithm. It can be applied to both classification and regression problems. It is built on ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and improve the model's performance. As the name implies, "Random Forest is a classifier that comprises a number of decision trees on various subsets of the provided dataset and takes the average to enhance the predicted accuracy of that dataset." Instead of depending on a single decision tree, the random forest collects the predictions from each tree and determines the final output based on the majority vote of those predictions.
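The majority-vote step can be sketched in a few lines of plain Python. The function majority_vote and the toy labels are assumptions made for illustration. Note that scikit-learn's RandomForestClassifier combines trees by averaging their predicted class probabilities rather than by counting hard votes, so treat this as a picture of the idea, not of the library's internals.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Combine per-tree label lists into one label per sample (illustrative helper).

    tree_predictions: one list of predicted labels per tree, all of equal length.
    """
    votes_per_sample = zip(*tree_predictions)  # regroup the votes by sample
    return [Counter(votes).most_common(1)[0][0] for votes in votes_per_sample]

# Three hypothetical trees, each predicting labels for the same three samples
tree_predictions = [
    ["cat", "dog", "dog"],
    ["cat", "cat", "dog"],
    ["dog", "cat", "dog"],
]
final = majority_vote(tree_predictions)
print(final)  # ['cat', 'cat', 'dog']
```

Using an odd number of trees avoids ties in a two-class problem; with ties, this sketch simply returns whichever label Counter lists first.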

See the figure below:

Note: To better understand the Random Forest Algorithm, you should have knowledge of the Decision Tree Algorithm.

THE MAIN CODE OF RANDOM FOREST

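    from sklearn.ensemble import RandomForestClassifier

    # 100 trees, 5 candidate features per split, out-of-bag scoring enabled
    random_forest = RandomForestClassifier(n_estimators=100, oob_score=True, max_features=5)
    random_forest.fit(X_train, Y_train)
    Y_pred = random_forest.predict(X_test)
    # Accuracy on the training data (optimistic; random_forest.oob_score_ is a fairer estimate)
    random_forest.score(X_train, Y_train)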
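The snippet above assumes that X_train, Y_train, and X_test already exist. As a self-contained sketch, the version below builds a synthetic dataset with make_classification in place of the real train and test files (the dataset, its size, and the random_state values are all assumptions) and compares training, out-of-bag, and held-out accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; swap in the real train/test files)
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same settings as the main snippet, plus a fixed seed for repeatable runs
random_forest = RandomForestClassifier(
    n_estimators=100, oob_score=True, max_features=5, random_state=0
)
random_forest.fit(X_train, Y_train)

train_accuracy = random_forest.score(X_train, Y_train)  # usually optimistic
oob_accuracy = random_forest.oob_score_                 # rows each tree did not see
test_accuracy = random_forest.score(X_test, Y_test)     # held-out data

print(f"train: {train_accuracy:.3f}  oob: {oob_accuracy:.3f}  test: {test_accuracy:.3f}")
```

The training score is typically the highest of the three because the trees have already seen those rows, so the out-of-bag and test scores are the more honest indicators of how the model generalizes.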

FULL CODE EXERCISE OF RANDOM FOREST CLASSIFIER

First, download the test and train files here, then see the full code below: