
Machine Learning: k-Nearest Neighbors (kNN)

J.H_DA 2022. 4. 14. 16:40

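kNN (k-Nearest Neighbors)

kNN is a supervised learning algorithm. Given a new data point, it classifies that point by looking at the labeled data points closest to it: the k nearest neighbors vote, and the most common class among them becomes the prediction.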
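To make the neighbor vote concrete, here is a minimal sketch of kNN in plain Python. The function name `knn_predict` and the toy 2D data are hypothetical and used only for illustration. It assumes Euclidean distance and an unweighted majority vote, which is one common setup, not the only one.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbors is the prediction
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters in 2D
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
prediction_near_b = knn_predict(train_points, train_labels, (5.1, 5.0), k=3)
print(prediction_near_a, prediction_near_b)
```

A query point next to the first cluster gets label "a" and one next to the second cluster gets label "b". scikit-learn's KNeighborsClassifier, used in the cells that follow, applies the same idea with faster neighbor search.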

 

# Load the iris dataset: 150 samples, 4 features, 3 classes
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
In [6]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# Fit kNN with k=25 on the training split
knn = KNeighborsClassifier(n_neighbors=25)
knn.fit(X_train, y_train)

# Score the predictions on the held-out test split
y_pred = knn.predict(X_test)
scores = metrics.accuracy_score(y_test, y_pred)
scores
Out[6]:
0.9666666666666667
In [21]:
# Refit on the full dataset, this time with k=5
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

# 0 = setosa, 1 = versicolor, 2 = virginica
classes = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}

# Present new data points the model has not seen before.
x_new = [[3, 4, 5, 2], [5, 4, 2, 2]]
y_predict = knn.predict(x_new)

# Print the predicted class indices, then their names
print(y_predict)

print(classes[y_predict[0]])
print(classes[y_predict[1]])
[1 0]
versicolor
setosa
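To see which training samples drove those two predictions, the fitted model's `kneighbors` method can be queried. The following is a small sketch, not part of the original notebook. It refits the same k=5 model on the full iris data and recomputes the vote by hand with `np.bincount`, assuming the default uniform weighting.

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

x_new = [[3, 4, 5, 2], [5, 4, 2, 2]]

# kneighbors returns the distances to the k nearest training points
# and the indices of those points in the training set
distances, indices = knn.kneighbors(x_new)

# Look up the labels of those neighbors; the prediction is their majority vote
neighbor_labels = y[indices]
votes = [np.bincount(row, minlength=3).argmax() for row in neighbor_labels]

print(neighbor_labels)
print(votes, knn.predict(x_new))
```

Each row of `neighbor_labels` holds the classes of the five nearest training samples for one new point, and the hand-computed vote should agree with `knn.predict`.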
In [2]:
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)

# Pipeline takes a list of (name, estimator) tuples
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('dtree', DecisionTreeClassifier())
])

# Use the pipeline object as you would a regular classifier
pipeline.fit(X_train, y_train)

y_preds = pipeline.predict(X_test)

accuracy_score(y_test, y_preds)
Out[2]:
0.9666666666666667
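The pipeline above wraps a decision tree, but the same pattern applies to kNN. Because kNN is distance-based, features on larger scales can dominate the neighbor search, so scaling first is often useful. Here is a minimal sketch that swaps in KNeighborsClassifier. The step name "knn", the variable names, and the choice of k=5 are assumptions for illustration, not taken from the original notebook.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=4
)

# Standardize the features, then classify with kNN
knn_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])

# The scaler is fit on the training split only, then reused for the test split
knn_pipeline.fit(X_train, y_train)

knn_accuracy = accuracy_score(y_test, knn_pipeline.predict(X_test))
print(knn_accuracy)
```

Putting the scaler inside the pipeline keeps test data from leaking into the scaling statistics, since `fit` only ever sees the training split.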