This repository contains my personal study notes on various computer science topics, documenting my continuous learning.
The basic scikit-learn workflow: instantiate a model, fit it to features and a target, then predict.

```python
my_model = ModelName()
my_model.fit(features, target)
my_model.predict(data)
```

Load a dataset with pandas and check its dimensions:

```python
import pandas as pd

name = pd.read_csv(file_path)
name.shape
```
Quick ways to inspect a DataFrame after loading it with `pd.read_csv()`:

```python
name.columns          # column labels
name.head()           # first rows of the data
name.isnull().sum()   # missing values per column
name.describe()       # summary statistics; `count` shows how many rows have non-missing values
```

Select the prediction target and the features:

```python
y = name.Potato

name_features = ['names', 'of', 'columns']
X = name[name_features]
X.describe()
```

`DecisionTreeRegressor()` creates a new, untrained model.
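A minimal runnable sketch of the inspection calls above, using a small made-up DataFrame (the column names and values are hypothetical, just to illustrate what each call reports):

```python
import pandas as pd

# Toy data: one numeric column with a missing value
df = pd.DataFrame({
    "Potato": [3.0, 2.5, None, 4.1],
    "Size": [10, 12, 9, 15],
})

print(df.shape)           # (4, 2): rows, columns
print(df.columns.tolist())
print(df.isnull().sum())  # "Potato" has 1 missing value
print(df.describe())      # count for "Potato" is 3.0 (non-missing rows only)
```

Note how `describe()`'s `count` row drops missing values, which is why it can differ from `df.shape[0]`.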
```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

name_model = DecisionTreeRegressor(random_state=1)
name_model.fit(X, y)

# Split the data into training and validation sets
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
```
Use the `mean_absolute_error()` function from the `sklearn.metrics` module to measure validation error, and compare several tree sizes:

```python
from sklearn.metrics import mean_absolute_error

def get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y):
    model = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes, random_state=0)
    model.fit(train_X, train_y)
    preds_val = model.predict(val_X)
    mae = mean_absolute_error(val_y, preds_val)
    return mae

for max_leaf_nodes in [5, 50, 500, 5000]:
    my_mae = get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y)
    print("Max leaf nodes: %d \t\t Mean Absolute Error: %d" % (max_leaf_nodes, my_mae))
```
Use the `max_leaf_nodes` argument to control the trade-off between overfitting (too many leaves, each fitting noise) and underfitting (too few leaves, missing real patterns).
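A minimal sketch of that trade-off on synthetic data (the dataset, sizes, and leaf counts here are my own assumptions, not from the notes): with very few leaves both training and validation MAE stay high, while adding leaves keeps lowering training MAE even after validation MAE stops improving.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Hypothetical noisy 1-D regression problem
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)

for n in [2, 50, 400]:
    model = DecisionTreeRegressor(max_leaf_nodes=n, random_state=0)
    model.fit(train_X, train_y)
    train_mae = mean_absolute_error(train_y, model.predict(train_X))
    val_mae = mean_absolute_error(val_y, model.predict(val_X))
    print(f"leaves={n:4d}  train MAE={train_mae:.3f}  val MAE={val_mae:.3f}")
```

Training MAE typically falls monotonically as leaves are added; the leaf count with the lowest *validation* MAE is the one worth keeping.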