turn off the AUTOINCREMENT feature in our use of sqlite? #1864

Open
opened 2012-11-18 06:41:52 +00:00 by zooko · 10 comments
zooko commented 2012-11-18 06:41:52 +00:00
Owner

I just discovered this doc:

http://sqlite.org/autoinc.html

There are two different ways that sqlite can provide automatic "ROWID". The standard one is a tad more efficient for sqlite to implement, the other one — AUTOINCREMENT — makes sure that you don't get the same ROWID twice in a row if you create a new row, then delete it, then create another new row. (The AUTOINCREMENT one also has a different behavior if you manually set your ROWID and you set it to the largest integer that sqlite can handle.)

As far as I can think, we don't mind if the same ROWID is used to refer both to a row that currently exists and a row that used to exist but does no longer. (Because we never delete a row without also deleting or updating all other references to it. Right?)

If I'm right, which I'm not sure of, then we could stop specifying to sqlite that we need the AUTOINCREMENT style, and instead use the standard and slightly more efficient style of ROWID.

I just discovered this doc: <http://sqlite.org/autoinc.html> There are two different ways that sqlite can provide automatic "ROWID". The standard one is a tad more efficient for sqlite to implement, the other one — AUTOINCREMENT — makes sure that you don't get the same ROWID twice in a row if you create a new row, then delete it, then create another new row. (The AUTOINCREMENT one also has a different behavior if you manually set your ROWID and you set it to the largest integer that sqlite can handle.) As far as I can think, we don't mind if the same ROWID is used to refer both to a row that currently exists and a row that used to exist but does no longer. (Because we never delete a row without also deleting or updating all other references to it. Right?) If I'm right, which I'm not sure of, then we could stop specifying to sqlite that we need the AUTOINCREMENT style, and instead use the standard and slightly more efficient style of ROWID.
tahoe-lafs added the
code
normal
enhancement
1.9.2
labels 2012-11-18 06:41:52 +00:00
tahoe-lafs added this to the undecided milestone 2012-11-18 06:41:52 +00:00
zooko commented 2012-11-20 11:49:58 +00:00
Author
Owner

Brian: please design-review this ticket. I.e., think about it and tell me if I should go ahead with this or forget about it.

Brian: please design-review this ticket. I.e., think about it and tell me if I should go ahead with this or forget about it.
tahoe-lafs added
code-storage
and removed
code
labels 2012-11-20 16:39:21 +00:00
davidsarah commented 2012-11-20 16:42:25 +00:00
Author
Owner

I read the linked sqlite documentation and I am +0 on this change. I suspect the performance improvement is small, but we do need to improve the performance of leasedb.

I read the linked sqlite documentation and I am +0 on this change. I suspect the performance improvement is small, but we do need to improve the performance of leasedb.
zooko commented 2012-11-28 15:42:56 +00:00
Author
Owner

(Because we never delete a row without also deleting or updating all other references to it. Right?)

Oh, in fact if we use foreign key constraints then sqlite will enforce that we never do this, right?

> (Because we never delete a row without also deleting or updating all other references to it. Right?) Oh, in fact if we use foreign key constraints then sqlite will enforce that we never do this, right?
davidsarah commented 2012-11-29 21:13:29 +00:00
Author
Owner

Replying to zooko:

(Because we never delete a row without also deleting or updating all other references to it. Right?)

Oh, in fact if we use foreign key constraints then sqlite will enforce that we never do this, right?

Yes, within the database. IIUC, with foreign key integrity checking turned on (which it is), AUTOINCREMENT would only be useful if you were persistently storing the integer primary keys outside the db (which we are not).

Replying to [zooko](/tahoe-lafs/trac-2024-07-25/issues/1864#issuecomment-131758): > > (Because we never delete a row without also deleting or updating all other references to it. Right?) > > Oh, in fact if we use foreign key constraints then sqlite will enforce that we never do this, right? Yes, within the database. IIUC, with foreign key integrity checking turned on ([which it is](https://github.com/davidsarah/tahoe-lafs/blob/1818-leasedb/src/allmydata/util/dbutil.py#L30)), AUTOINCREMENT would only be useful if you were persistently storing the integer primary keys outside the db (which we are not).
tahoe-lafs modified the milestone from undecided to 1.11.0 2012-11-29 21:15:29 +00:00
davidsarah commented 2013-01-04 23:29:54 +00:00
Author
Owner

Note that in https://github.com/daira/tahoe-lafs/commit/b7cea9a3f86c5cc4cfd3910d0300a1008b7b5a13, I got rid of the AUTOINCREMENT key on the leases table, replacing it with a semantic primary key. So there are no longer any AUTOINCREMENT keys on leasedb tables where records are created at all frequently. Therefore, I think there is no available performance improvement from this change, at least for leasedb.

If you (zooko or warner) agree, I'll just close this ticket as invalid.

Note that in <https://github.com/daira/tahoe-lafs/commit/b7cea9a3f86c5cc4cfd3910d0300a1008b7b5a13>, I got rid of the AUTOINCREMENT key on the `leases` table, replacing it with a semantic primary key. So there are no longer any AUTOINCREMENT keys on leasedb tables where records are created at all frequently. Therefore, I think there is no available performance improvement from this change, at least for leasedb. If you (zooko or warner) agree, I'll just close this ticket as invalid.
zooko commented 2013-05-30 16:20:19 +00:00
Author
Owner

Replying to davidsarah:

Note that in https://github.com/davidsarah/tahoe-lafs/commit/b7cea9a3f86c5cc4cfd3910d0300a1008b7b5a13, I got rid of the AUTOINCREMENT key on the leases table, replacing it with a semantic primary key. So there are no longer any AUTOINCREMENT keys on leasedb tables where records are created at all frequently. Therefore, I think there is no available performance improvement from this change, at least for leasedb.

If you (zooko or warner) agree, I'll just close this ticket as invalid.

It seems to me that we should make this change:

zooko@spark ~/playground/tahoe-lafs $ git diff
diff --git a/docs/backupdb.rst b/docs/backupdb.rst
index 5a36b51..e47ca3b 100644
--- a/docs/backupdb.rst
+++ b/docs/backupdb.rst
@@ -61,7 +61,7 @@ The database contains the following tables::
   
   CREATE TABLE caps
   (
-   fileid integer PRIMARY KEY AUTOINCREMENT,
+   fileid integer PRIMARY KEY,
    filecap varchar(256) UNIQUE    -- URI:CHK:...
   );
   
diff --git a/src/allmydata/scripts/backupdb.py b/src/allmydata/scripts/backupdb.py
index 75ee0d9..37180f9 100644
--- a/src/allmydata/scripts/backupdb.py
+++ b/src/allmydata/scripts/backupdb.py
@@ -27,7 +27,7 @@ CREATE TABLE local_files -- added in v1
 
 CREATE TABLE caps -- added in v1
 (
- fileid INTEGER PRIMARY KEY AUTOINCREMENT,
+ fileid INTEGER PRIMARY KEY,
  filecap VARCHAR(256) UNIQUE       -- URI:CHK:...
 );
 

Not primarily for efficiency (since creation of new caps during a backup, or a future "sync" or a future "cp -r --with-db" is not high frequency), but primarily for simplicity. We don't need the AUTOINCREMENT feature of sqlite, so we shouldn't say that we want it (and it might slow a backupdb file-creation operation down by a few hundred nanoseconds occasionally).

Daira, Brian: what do you say?

Replying to [davidsarah](/tahoe-lafs/trac-2024-07-25/issues/1864#issuecomment-131761): > Note that in <https://github.com/davidsarah/tahoe-lafs/commit/b7cea9a3f86c5cc4cfd3910d0300a1008b7b5a13>, I got rid of the AUTOINCREMENT key on the `leases` table, replacing it with a semantic primary key. So there are no longer any AUTOINCREMENT keys on leasedb tables where records are created at all frequently. Therefore, I think there is no available performance improvement from this change, at least for leasedb. > > If you (zooko or warner) agree, I'll just close this ticket as invalid. It seems to me that we should make this change: ``` zooko@spark ~/playground/tahoe-lafs $ git diff diff --git a/docs/backupdb.rst b/docs/backupdb.rst index 5a36b51..e47ca3b 100644 --- a/docs/backupdb.rst +++ b/docs/backupdb.rst @@ -61,7 +61,7 @@ The database contains the following tables:: CREATE TABLE caps ( - fileid integer PRIMARY KEY AUTOINCREMENT, + fileid integer PRIMARY KEY, filecap varchar(256) UNIQUE -- URI:CHK:... ); diff --git a/src/allmydata/scripts/backupdb.py b/src/allmydata/scripts/backupdb.py index 75ee0d9..37180f9 100644 --- a/src/allmydata/scripts/backupdb.py +++ b/src/allmydata/scripts/backupdb.py @@ -27,7 +27,7 @@ CREATE TABLE local_files -- added in v1 CREATE TABLE caps -- added in v1 ( - fileid INTEGER PRIMARY KEY AUTOINCREMENT, + fileid INTEGER PRIMARY KEY, filecap VARCHAR(256) UNIQUE -- URI:CHK:... ); ``` Not primarily for efficiency (since creation of new caps during a backup, or a future "sync" or a future "cp -r --with-db" is not high frequency), but primarily for simplicity. We don't need the `AUTOINCREMENT` feature of sqlite, so we shouldn't say that we want it (and it might slow a backupdb file-creation operation down by a few hundred nanoseconds occasionally). Daira, Brian: what do you say?
zooko commented 2013-05-30 16:24:45 +00:00
Author
Owner

This is a small change and easily reverted, so I'll go ahead and apply it as long as either daira or brian approves.

This is a small change and easily reverted, so I'll go ahead and apply it as long as *either* daira or brian approves.
daira commented 2013-05-30 16:36:25 +00:00
Author
Owner

I okayed this provided zooko smoke-tests compatibility with an existing backupdb file.

I okayed this provided zooko smoke-tests compatibility with an existing backupdb file.
warner commented 2013-05-30 18:17:09 +00:00
Author
Owner

Yeah, sounds good to me. I was unaware of this feature when I first wrote that schema. Let's also double-check that none of the SQL commands that insert rows are providing a value for fileid.

Yeah, sounds good to me. I was unaware of this feature when I first wrote that schema. Let's also double-check that none of the SQL commands that insert rows are providing a value for `fileid`.
zooko commented 2013-05-30 20:44:22 +00:00
Author
Owner

Replying to warner:

Yeah, sounds good to me. I was unaware of this feature when I first wrote that schema. Let's also double-check that none of the SQL commands that insert rows are providing a value for fileid.

Confirmed: I inspected and the fileid always comes from [get_or_allocate_fileid_for_cap()]source:trunk/src/allmydata/scripts/backupdb.py?annotate=blame&rev=a1a1b5bf8a6e1a09c70d4d2b618ad0c06b0f631f#L242, so sqlite always creates the fileid value.

Replying to [warner](/tahoe-lafs/trac-2024-07-25/issues/1864#issuecomment-131765): > Yeah, sounds good to me. I was unaware of this feature when I first wrote that schema. Let's also double-check that none of the SQL commands that insert rows are providing a value for `fileid`. Confirmed: I inspected and the `fileid` always comes from [get_or_allocate_fileid_for_cap()]source:trunk/src/allmydata/scripts/backupdb.py?annotate=blame&rev=a1a1b5bf8a6e1a09c70d4d2b618ad0c06b0f631f#L242, so sqlite always creates the `fileid` value.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: tahoe-lafs/trac-2024-07-25#1864
No description provided.