I have this pipeline:
pl = Pipeline([
    ('union', FeatureUnion(
        transformer_list=[
            ('numeric_features', Pipeline([
                ("selector", get_numeric_data),
            ])),
            ('text_features', Pipeline([
                ("selector", get_text_data),
                ("vectorizer", HashingVectorizer(token_pattern=TOKENS_ALPHANUMERIC, non_negative=True, norm=None, binary=False, ngram_range=(1, 2))),
                ('dim_red', SelectKBest(chi2, chi_k))
            ]))
        ])),
    ("clf", LogisticRegression())
])
When I try to run
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
c_space = np.logspace(-5, 8, 15)
param_grid = {"C": c_space, "penalty": ['l1', 'l2']}
logreg_cv = GridSearchCV(pl, param_grid=param_grid, cv=5)
logreg_cv.fit(X_train, y_train)
it throws
ValueError: Invalid parameter C for estimator Pipeline(memory=None, steps=[('union', FeatureUnion(n_jobs=1, transformer_list=[('numeric_features', Pipeline(memory=None, steps=[('selector', FunctionTransformer(accept_sparse=False, func=<... at 0x00000190ECB49488>, inv_kw_args=None, inverse_func=None, kw_args=None, pass_y=...ty='l2', random_state=None, solver='liblinear', tol=0.0001, verbose=0, warm_start=False))]). Check the list of available parameters with estimator.get_params().keys().
even though "C" and "penalty" are legitimate parameters in this case. Please help me with how to do this.
1 Answer
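"C" and "penalty" are legitimate parameters of LogisticRegression, but not of the Pipeline object that you are passing to GridSearchCV.
Your pipeline currently has two components, "union" and "clf", so the pipeline cannot tell which of them a bare "C" or "penalty" should be sent to. You need to prefix each parameter name in the grid with the name of the step it belongs to, so that the pipeline can recognize it and forward it to the correct object. In your case, use "clf__C" and "clf__penalty" as the keys of param_grid instead of "C" and "penalty".
Note that there are two underscores between the name of the object in the pipeline and the parameter name.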
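To make the naming rule concrete, here is a minimal sketch on synthetic data. The two-step pipeline, the StandardScaler step, the smaller C grid, and all the demo_* names are assumptions made for illustration, not the asker's setup; the liblinear solver is chosen only because it accepts both 'l1' and 'l2'.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data; the asker's X_train / y_train are not available.
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

# A small pipeline whose final step is named "clf", as in the question.
demo_pl = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(solver="liblinear")),
])

# The pipeline only exposes step-prefixed parameter names.
print("C" in demo_pl.get_params())       # False
print("clf__C" in demo_pl.get_params())  # True

# Keys use <step name>__<parameter name>, with two underscores.
demo_grid = {"clf__C": np.logspace(-2, 2, 5), "clf__penalty": ["l1", "l2"]}
demo_search = GridSearchCV(demo_pl, param_grid=demo_grid, cv=3)
demo_search.fit(X_demo, y_demo)
print(demo_search.best_params_)
```

Checking membership in get_params() before fitting is a quick way to catch a misspelled key without waiting for the search to raise.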
This is described in the documentation for Pipeline and FeatureUnion, which illustrates the usage with various examples.
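After this, if you want to vary the ngram_range of the HashingVectorizer, you chain the step names in the same way, from the outermost step inwards: "union__text_features__vectorizer__ngram_range".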
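The nested case can be sketched as follows. This is only an illustration under stated assumptions: the toy data, the two lambda selectors (stand-ins for the question's get_numeric_data and get_text_data), the toy_* names, and the HashingVectorizer settings are hypothetical, and the SelectKBest step is left out to keep the example short.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy data: column 0 is numeric, column 1 is free text.
texts = ["school bus fuel and transport", "teacher salary and benefits"] * 20
X_toy = np.array([[float(i % 5), t] for i, t in enumerate(texts)], dtype=object)
y_toy = np.array([0, 1] * 20)

# Stand-ins for the question's selectors (assumed, not the asker's code).
select_numeric = FunctionTransformer(lambda X: X[:, :1].astype(float), validate=False)
select_text = FunctionTransformer(lambda X: X[:, 1], validate=False)

# Same step names as in the question, so the parameter paths match.
toy_pl = Pipeline([
    ("union", FeatureUnion(transformer_list=[
        ("numeric_features", Pipeline([("selector", select_numeric)])),
        ("text_features", Pipeline([
            ("selector", select_text),
            ("vectorizer", HashingVectorizer(n_features=2**8, norm=None,
                                             alternate_sign=False)),
        ])),
    ])),
    ("clf", LogisticRegression()),
])

# Each level of nesting adds one "<name>__" prefix to the parameter name.
toy_grid = {
    "union__text_features__vectorizer__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}
toy_search = GridSearchCV(toy_pl, param_grid=toy_grid, cv=2)
toy_search.fit(X_toy, y_toy)
print(toy_search.best_params_)
```

The key reads left to right as the path through the pipeline: the "union" step, then the "text_features" sub-pipeline inside it, then its "vectorizer" step, and finally the ngram_range parameter.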