Analysis of the Apple Quality Dataset
DOI: https://doi.org/10.61173/ts88st51
Keywords: Data analysis, Machine learning, Statistics
Abstract
This article introduces a dataset of apple features with 4000 rows, covering apple identifiers, size, weight, sweetness, crispness, juiciness, ripeness, acidity, and other characteristics. The dataset supports both classification and regression tasks: quality features can serve as classification targets or be converted to numerical values for regression. In addition, the dataset's features are evenly distributed, which benefits model training. The experiment applied two algorithms, decision tree and random forest, to the classification task. The random forest reached an accuracy of 90.625%, outperforming the decision tree at 80.625%. These results confirm that the dataset is effective for classification tasks and that the random forest performs better than the decision tree on it.
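The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, and it uses a synthetic 4000-row, 7-feature dataset from `make_classification` as a stand-in for the apple data, so its accuracies will not match the reported figures. The helper name `compare_models`, the 80/20 split, the random seeds, and all hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, test_size=0.2, seed=42):
    """Fit a decision tree and a random forest on the same split; return test accuracies."""
    # Assumed 80/20 stratified split; the abstract does not state the split used.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Default hyperparameters, chosen only for illustration.
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data (NOT the apple dataset): 4000 rows, 7 numeric features,
# binary target playing the role of the quality label.
X, y = make_classification(
    n_samples=4000, n_features=7, n_informative=5, n_redundant=0, random_state=0
)
scores = compare_models(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3%}")
```

To run it on the real data, `X` would hold the numeric feature columns and `y` the quality label; evaluating both models on the same held-out split keeps the two accuracy figures directly comparable.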