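I'm looking for an efficient way to search for products by their attributes in a database where the data is organized using the EAV (Entity-Attribute-Value) model.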

There are 2 tables:

  • Product - stores products, has a product ID.

  • ProductAttribute - attribute values of products.

SQL:

CREATE TABLE `Product` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(150) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

CREATE TABLE `ProductAttribute` (
  `product_id` int(11) NOT NULL,
   # in real code it's attribute_id and value_id,
   # I just simplified this
  `attribute` varchar(10), 
  `value` varchar(255)
);

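Different products can have different numbers of attributes. E.g. the product 'T-shirt' has size and color, while the product 'Notebook' has cpu and ram. A product can have more than 10 attributes.

I want to create a search page for products that lets users pick values from filters appropriate to the current product type.

My problem is that the MySQL query for searching these tables is inefficient. Searching by 10 product attributes requires 10 JOINs against the ProductAttribute table: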

SELECT P.id
FROM Product P
JOIN ProductAttribute A1 ON (P.id = A1.product_id AND A1.attribute = 'cpu' AND A1.value = 'xxx')
JOIN ProductAttribute A2 ON (P.id = A2.product_id AND A2.attribute = 'ram' AND A2.value = 'yyy')
JOIN ProductAttribute A3 ON (P.id = A3.product_id AND A3.attribute = 'hdd' AND A3.value = 'zzz')
# ... more attributes require more joins
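For comparison, the same "match every attribute" search is often written without one JOIN per attribute, as a single GROUP BY / HAVING query (sometimes called relational division). This is not the query from the post, only a minimal sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL and uses the simplified `attribute` / `value` columns. The helper name `find_products` and the sample rows are hypothetical. Whether it is faster than the JOIN version depends on the data and the indexes and has not been measured here.

```python
import sqlite3

def find_products(conn, filters):
    """Return ids of products that match every (attribute, value) pair in `filters`."""
    if not filters:
        return []
    # One "(attribute = ? AND value = ?)" clause per requested filter.
    clauses = " OR ".join("(attribute = ? AND value = ?)" for _ in filters)
    params = [x for pair in filters.items() for x in pair]
    sql = (
        "SELECT product_id FROM ProductAttribute "
        f"WHERE {clauses} "
        "GROUP BY product_id "
        # Keep only products for which all requested attributes matched.
        "HAVING COUNT(DISTINCT attribute) = ? "
        "ORDER BY product_id"
    )
    return [row[0] for row in conn.execute(sql, params + [len(filters)])]

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductAttribute ("
    "product_id INTEGER NOT NULL, attribute TEXT, value TEXT)"
)
conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?)", [
    (1, "cpu", "xxx"), (1, "ram", "yyy"), (1, "hdd", "zzz"),
    (2, "cpu", "xxx"), (2, "ram", "other"), (2, "hdd", "zzz"),
    (3, "cpu", "xxx"), (3, "ram", "yyy"),
])

matches = find_products(conn, {"cpu": "xxx", "ram": "yyy", "hdd": "zzz"})
print(matches)
```

The query text grows by one OR clause per filter instead of one JOIN, but range and LIKE conditions on typed value columns would need their own clauses.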

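I can imagine how to solve this: I could create an index table for each attribute set (one table for notebooks, another for T-shirts, etc.) and regenerate these index tables whenever product attributes change.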
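The regeneration step described above could be sketched as a pivot of the EAV rows into one column per attribute, using conditional aggregation. This is only an illustration under assumptions: SQLite through Python's sqlite3 module stands in for MySQL, and the helper `rebuild_index`, the table name `NotebookIndex` and the attribute list are hypothetical names, not part of the original schema.

```python
import sqlite3

# Hypothetical attribute set for one product type (assumption, not from the post).
NOTEBOOK_ATTRIBUTES = ["cpu", "ram", "hdd"]

def rebuild_index(conn, table, attributes):
    """Drop and rebuild a flat index table with one column per attribute."""
    # Table and column names cannot be bound as parameters,
    # so they must come from a trusted list, never from user input.
    columns = ", ".join(f'"{a}" TEXT' for a in attributes)
    pivots = ", ".join(
        "MAX(CASE WHEN attribute = ? THEN value END)" for _ in attributes
    )
    marks = ", ".join("?" for _ in attributes)
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(
        f'CREATE TABLE "{table}" (product_id INTEGER PRIMARY KEY, {columns})'
    )
    # Placeholders are bound in textual order: the CASE pivots, then the IN list.
    conn.execute(
        f'INSERT INTO "{table}" '
        f"SELECT product_id, {pivots} FROM ProductAttribute "
        f"WHERE attribute IN ({marks}) GROUP BY product_id",
        list(attributes) + list(attributes),
    )

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductAttribute ("
    "product_id INTEGER NOT NULL, attribute TEXT, value TEXT)"
)
conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?)", [
    (1, "cpu", "i7"), (1, "ram", "16GB"),
    (2, "cpu", "i5"), (2, "ram", "8GB"), (2, "hdd", "1TB"),
    (3, "size", "M"), (3, "color", "red"),  # a T-shirt, not a notebook
])

rebuild_index(conn, "NotebookIndex", NOTEBOOK_ATTRIBUTES)
rows = conn.execute(
    "SELECT product_id, cpu, ram, hdd FROM NotebookIndex ORDER BY product_id"
).fetchall()
print(rows)
```

A product that lacks one of the attributes gets NULL in that column, and a product with none of them gets no row at all.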

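But I'd like to know whether there are ready-made solutions; perhaps some search engine can do this.

I use PHP / MySQL, but I'm not limited to them - any solution would be interesting.

UPDATE: (SPOILER: I'm still interested if any search engine/indexer exists for this task.) I created a comparison test of the multi-JOIN query and an index-table query. The index table is a table in which every attribute is a separate column. Results: JOINs - 4.5s, index table - 0.2s. See it on github.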

Example query using JOINs:

SELECT P.id FROM Product P
JOIN ProductAttribute color ON (P.id = color.product_id AND color.attribute_id = 17 AND color.int_value IN (152,146,101,152,118,109))
JOIN ProductAttribute size ON (P.id = size.product_id AND size.attribute_id = 34 AND size.int_value IN (288,210,246,275,258,289))
JOIN ProductAttribute brand ON (P.id = brand.product_id AND brand.attribute_id = 51 AND brand.int_value IN (305,303,300,375,308,340))
JOIN ProductAttribute material ON (P.id = material.product_id AND material.attribute_id = 68 AND material.int_value IN (426,463,465,459,418,460))
JOIN ProductAttribute length ON (P.id = length.product_id AND length.attribute_id = 85 AND length.float_value > 10 AND length.float_value < 49)
JOIN ProductAttribute height ON (P.id = height.product_id AND height.attribute_id = 102 AND height.float_value > 78 AND height.float_value < 186)
JOIN ProductAttribute weight ON (P.id = weight.product_id AND weight.attribute_id = 119 AND weight.float_value > 10 AND weight.float_value < 15)
JOIN ProductAttribute waterproof ON (P.id = waterproof.product_id AND waterproof.attribute_id = 136 AND waterproof.bool_value = 1)
JOIN ProductAttribute tags ON (P.id = tags.product_id AND tags.attribute_id = 153 AND tags.string_value LIKE '%alluring%')
;

It takes 4.5 seconds to execute.

Example query against the index table:

SELECT P.id FROM Product P
JOIN AttributeIndex AI ON (P.id = AI.product_id)
WHERE 
AI.color IN (152,146,101,152,118,109) 
AND AI.size IN (288,210,246,275,258,289) 
AND AI.brand IN (305,303,300,375,308,340) 
AND AI.material IN (426,463,465,459,418,460) 
AND AI.length > 10 AND AI.length < 49 
AND AI.height > 78 AND AI.height < 186 
AND AI.weight > 10 AND AI.weight < 15 
AND AI.waterproof = 1 
AND AI.tags LIKE '%alluring%';

It takes 0.2 seconds to execute.

CREATE TABLE statements used for the test (more info on github):

CREATE TABLE `Product` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(150) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE `ProductAttribute` (
  `product_id` int(11) NOT NULL,
  `attribute_id` smallint(6) DEFAULT NULL,
  `int_value` int(11) DEFAULT NULL,
  `float_value` float DEFAULT NULL,
  `bool_value` tinyint(1) DEFAULT NULL,
  `string_value` varchar(255) DEFAULT NULL,
  KEY `PA` (`product_id`,`attribute_id`)
) ENGINE=InnoDB;

CREATE TABLE `AttributeIndex` (
  `product_id` int(11) NOT NULL,
  `color` int(11) DEFAULT NULL,
  `size` int(11) DEFAULT NULL,
  `brand` int(11) DEFAULT NULL,
  `material` int(11) DEFAULT NULL,
  `length` float DEFAULT NULL,
  `height` float DEFAULT NULL,
  `weight` float DEFAULT NULL,
  `waterproof` tinyint(1) DEFAULT NULL,
  `tags` varchar(255) DEFAULT NULL,
  KEY `P` (`product_id`)
) ENGINE=InnoDB;

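So I can see that in this case the index table is more efficient than the JOINs. I'd like to know whether there is any solution that helps generate such index tables (and keep them up to date), or that offers some other fast way to search products by multiple attributes.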
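To make the "keep them up to date" part concrete, one hand-rolled option is to recompute the index row of a single product whenever its attributes change. This is a sketch under assumptions, not a ready-made solution. SQLite through Python's sqlite3 module stands in for MySQL. The attribute ids and value columns are taken from the example JOIN query, restricted to four attributes for brevity. The helper `refresh_product` and the sample product are hypothetical.

```python
import sqlite3

# attribute_id -> (index column, typed value column), as in the example JOIN query.
ATTRIBUTE_COLUMNS = {
    17: ("color", "int_value"),
    119: ("weight", "float_value"),
    136: ("waterproof", "bool_value"),
    153: ("tags", "string_value"),
}

def refresh_product(conn, product_id):
    """Recompute the AttributeIndex row of one product from its EAV rows."""
    values = {}
    rows = conn.execute(
        "SELECT * FROM ProductAttribute WHERE product_id = ?", (product_id,)
    )
    for r in rows:
        if r["attribute_id"] in ATTRIBUTE_COLUMNS:
            column, source = ATTRIBUTE_COLUMNS[r["attribute_id"]]
            values[column] = r[source]
    with conn:  # delete and insert are committed together
        conn.execute("DELETE FROM AttributeIndex WHERE product_id = ?", (product_id,))
        if values:
            names = ", ".join(values)  # column names come from the trusted mapping
            marks = ", ".join("?" for _ in values)
            conn.execute(
                f"INSERT INTO AttributeIndex (product_id, {names}) VALUES (?, {marks})",
                [product_id, *values.values()],
            )

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE ProductAttribute (product_id INTEGER NOT NULL, attribute_id INTEGER, "
    "int_value INTEGER, float_value REAL, bool_value INTEGER, string_value TEXT)"
)
conn.execute(
    "CREATE TABLE AttributeIndex (product_id INTEGER NOT NULL, color INTEGER, "
    "weight REAL, waterproof INTEGER, tags TEXT)"
)
conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?, ?, ?, ?)", [
    (7, 17, 152, None, None, None),
    (7, 119, None, 12.5, None, None),
    (7, 136, None, None, 1, None),
])
refresh_product(conn, 7)

# The color changes, so the application refreshes that product's index row.
conn.execute(
    "UPDATE ProductAttribute SET int_value = 146 WHERE product_id = 7 AND attribute_id = 17"
)
refresh_product(conn, 7)
index_row = tuple(conn.execute(
    "SELECT product_id, color, weight, waterproof, tags FROM AttributeIndex"
).fetchone())
print(index_row)
```

The refresh has to be called from every code path that writes to ProductAttribute. Database triggers or a periodic full rebuild are alternative designs with different trade-offs.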