python - Logistic regression using scikit-learn


I'm trying to run a logistic regression with scikit-learn on the following data:

    from sklearn.linear_model import SGDClassifier
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([(456.09677777777779, 477.87349999999998),
                  (370.16702631578943, 471.41847368421048),
                  (208.0453181818182, 96.825818181818164),
                  (213.35274999999996, 509.25293750000003),
                  (279.30812500000002, 155.14600000000002),
                  (231.55695, 146.21420000000001),
                  (285.93539285714286, 140.41428571428571),
                  (297.28620000000001, 150.98409999999998),
                  (267.3011923076923, 136.76630769230769),
                  (226.57899999999998, 138.03450000000001),
                  (312.01369230769228, 158.06576923076923),
                  (305.04823076923083, 152.89192307692309),
                  (225.434, 138.76300000000001),
                  (396.39516666666663, 16.102166666666689),
                  (239.16028571428572, 125.58142857142856),
                  (235.898, 116.98099999999999),
                  (132.9879999999997, 361.85599999999999),
                  (120.1848, 391.27560000000005),
                  (495.972375, 223.47975000000002),
                  (485.80450000000002, 222.89939999999996),
                  (257.07245454545449, 136.36881818181817),
                  (441.60896153846159, 209.63723076923083),
                  (451.61168749999996, 212.58543750000001),
                  (458.90889285714286, 215.38342857142857),
                  (474.8958235294117, 218.99223529411765),
                  (467.85923529411775, 218.55094117647059),
                  (251.96968421052637, 407.74273684210527),
                  (181.53659999999999, 367.47239999999999),
                  (356.85722222222222, 342.36394444444443),
                  (234.99250000000001, 340.74079999999998),
                  (211.58613157894737, 360.8791052631579),
                  (207.18066666666667, 323.31349999999998),
                  (320.41081249999996, 341.58249999999998),
                  (316.88186842105262, 308.40215789473683),
                  (285.2390666666667, 322.81979999999999),
                  (300.14074999999997, 362.1682222222222),
                  (279.99999999999, 359.09577777777781)])
    y = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0,
                  0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0])

and then I fit the classifier with:

    clf = SGDClassifier(loss="hinge", alpha=0.01, n_iter=200, fit_intercept=True)
    clf.fit(x, y)

The resulting hyperplane does not come anywhere close to separating the data. Any ideas what is going on with this data?

Cheers,

Greg

PS: Here is the code I use to create the plot (just in case the problem is there):

    xx = np.linspace(0, 1000, 10)
    yy = np.linspace(0, 600, 10)
    X1, X2 = np.meshgrid(xx, yy)
    Z = np.empty(X1.shape)
    for (i, j), val in np.ndenumerate(X1):
        x1 = val
        x2 = X2[i, j]
        p = clf.decision_function([x1, x2])
        Z[i, j] = p[0]
    plt.contour(X1, X2, Z, [0], colors="blue")

If you want logistic regression using SGDClassifier:

  • Use loss="log", not loss="hinge": 'hinge' gives a linear SVM, while 'log' gives logistic regression.
  • Try more iterations and experiment with alpha and the learning rate (see the sketch after this list).
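For example, a minimal sketch along those lines, assuming the x and y arrays from the question; the alpha, n_iter and eta0 values are just placeholders to experiment with, not tuned for this data (in recent scikit-learn versions n_iter is called max_iter):

    from sklearn.linear_model import SGDClassifier

    # loss="log" makes SGDClassifier fit a logistic regression model
    # instead of the linear SVM that loss="hinge" gives.
    clf = SGDClassifier(loss="log",
                        alpha=0.001,              # placeholder regularization strength
                        n_iter=1000,              # more passes over the data (max_iter in newer versions)
                        learning_rate="constant",
                        eta0=0.0001,              # placeholder step size
                        fit_intercept=True)
    clf.fit(x, y)                                 # x, y as defined in the question
    print(clf.coef_, clf.intercept_)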

In fact, I would simply use the LogisticRegression classifier, e.g. clf = LogisticRegression(C=1, class_weight='auto', penalty='l2'), since SGDClassifier is based on stochastic gradient descent, i.e. the logistic regression model is trained incrementally there.
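A minimal sketch of that alternative, again assuming the x and y arrays from the question (note that class_weight='auto' was later renamed to 'balanced' in scikit-learn):

    from sklearn.linear_model import LogisticRegression

    # Batch logistic regression on the whole data set in one go,
    # rather than the incremental updates SGDClassifier performs.
    clf = LogisticRegression(C=1, class_weight='auto', penalty='l2')
    clf.fit(x, y)

    # Quick sanity check: fraction of training points classified correctly.
    print(clf.score(x, y))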

