SQLite读取速度慢（每秒100条记录）-Java 学习之路

-1

我有一个大的SQLite数据库（~134 GB），它有多个表，每个表有14列，大约3.3亿条记录和4个索引 . 数据库上使用的唯一操作是“Select *”，因为我需要所有列（无插入或更新） . 当我查询数据库时，结果集很大时响应时间很慢（获取~18,000条记录需要160秒） .

我已经多次改进了索引的使用，这是我得到的最快的响应时间 .

我正在运行数据库作为具有32 GB RAM的服务器上的Web应用程序的后端数据库 .

有没有办法使用RAM（或其他任何东西）来加快查询过程？

这是执行查询的代码 .

async.each(proteins,function(item, callback) {

   `PI[item] = [];                      // Stores interaction proteins for all query proteins
    PS[item] = [];                      // Stores scores for all interaction proteins
    PIS[item] = [];                     // Stores interaction sites for all interaction proteins
    var sites = {};                     // a temporarily holder for interaction sites

var query_string = 'SELECT * FROM ' + organism + PIPE_output_table +
        ' WHERE ' + score_type + ' > ' + cutoff['range'] + ' AND (protein_A = "' + item + '" OR protein_B = "' + item '") ORDER BY PIPE_score DESC';

db.each(query_string, function (err, row) {

        if (row.protein_A == item) {
            PI[item].push(row.protein_B);

            // add 1 to interaction sites to represent sites starting from 1 not from 0
            sites['S1AS'] = row.site1_A_start + 1;
            sites['S1AE'] = row.site1_A_end + 1;
            sites['S1BS'] = row.site1_B_start + 1;
            sites['S1BE'] = row.site1_B_end + 1;

            sites['S2AS'] = row.site2_A_start + 1;
            sites['S2AE'] = row.site2_A_end + 1;
            sites['S2BS'] = row.site2_B_start + 1;
            sites['S2BE'] = row.site2_B_end + 1;

            sites['S3AS'] = row.site3_A_start + 1;
            sites['S3AE'] = row.site3_A_end + 1;
            sites['S3BS'] = row.site3_B_start + 1;
            sites['S3BE'] = row.site3_B_end + 1;

            PIS[item].push(sites);
            sites = {};
        }
}

1 回答

0
The query you posted uses no variables.

它将始终返回相同的内容：所有具有空分数的行，其 protein 列等于其 protein_a 或 protein_b 列 . 然后，您必须在Javascript中过滤所有这些额外的行，从而获取比您需要的更多的行 .

这就是为什么......

如果我正确理解了这个查询，你有 WHERE Score > [Score] . 我之前从未遇到过这种语法，所以我查了一下 .

[keyword]方括号中的关键字是标识符 . 这不是标准的SQL . 此引用机制由MS Access和SQL Server使用，并包含在SQLite中以实现兼容性 .

标识符类似于列或表名，而不是变量 .

这意味着......
```
SELECT * FROM [TABLE]
WHERE Score > [Score] AND
      (protein_A = [Protein] OR protein_B = [Protein])
ORDER BY [Score] DESC;
```
跟这个一样......
```
SELECT * FROM `TABLE`
WHERE Score > Score AND
      (protein_A = Protein OR protein_B = Protein)
ORDER BY Score DESC;
```
You never pass any variables to the query. 它总会返回相同的东西 .

这可以在运行时看到 .
```
db.each(query_string, function (err, row) {
```
既然你正在检查每种蛋白质是否与自身相同（或者与它本身非常相似），那么你很可能会获取每一行 . 这就是为什么你必须再次过滤所有行 . 这就是您的查询速度如此之慢的原因之一 .
```
if (row.protein_A == item) {
```
BUT! WHERE Score > [Score] 永远不会是真的，一件事不能超过自己 except for null ！ Trinary logic is weird . 因此，只有 Score 为null才能成立 .

因此，您将返回得分为null且 protein 列等于 protein_a 或 protein_b 的所有行 . 这比你需要的行多得多，我想你有很多行有空分数 .

您的查询应该包含变量（我使用node-sqlite3）并在执行查询时传入它们的值 .
```
var query = "                                              \
    SELECT * FROM `TABLE`                                  \
    WHERE Score > $score AND                               \
          (protein_A = $protein OR protein_B = $protein)   \
    ORDER BY Score DESC;                                   \
";
var stmt = db.prepare(query);
stmt.each({$score: score, $protein: protein}, function (err, row) {
    PI[item].push(row.protein_B);
    ...
});
```
回复于 2024-04-20T10:59:30+08:00

SQLite读取速度慢（每秒100条记录）

1 回答

相关问题