
Commit 2edff0b

Merge pull request #7822 from tangledbytes/utkarsh/cleanup/wal
[NSFS | NC] Wal Cleanup
2 parents 216c0ca + 25c30d3 commit 2edff0b

3 files changed: +48 -54 lines changed


docs/design/NSFSGlacierStorageClass.md

Lines changed: 25 additions & 26 deletions
@@ -15,31 +15,30 @@ There are 3 primary flows of concern and this document will discuss all 3 of the
 2. Restore objects that are uploaded to the `GLACIER` storage class (API: `RestoreObject`).
 3. Copy objects where the source is an object stored in `GLACIER` (API: `PutObject`).
 
-### WAL
-An important component of all the flows is the write ahead log (WAL). NooBaa has a `SimpleWAL` which, as the name states,
-is extremely simple in some senses. It does not deal with fsync issues, partial writes, holes, etc. rather just
-appends data separated by a new line character.
+### Persistent Log
+An important component of all the flows is the persistent log. NooBaa has a `PersistentLogger` which is extremely simple in some senses.
+It does not deal with fsync issues, partial writes, holes, etc. rather just appends data separated by a new line character.
 
-`SimpleWAL` features:
+`PersistentLogger` features:
 1. Exposes an `append` method which adds data to the file.
-2. Can perform auto rotation of the file which makes sure that a single WAL is never too huge for the
-WAL consumer to consume.
-3. Exposes a `process` method which allows "safe" iteration on the previous WAL files.
+2. Can perform auto rotation of the file which makes sure that a single log is never too huge for the
+log consumer to consume.
+3. Exposes a `process_inactive` method which allows "safe" iteration on the previous log files.
 4. Tries to make sure that no data loss happens due to process level races.
 
 #### Races which are handled by the current implementation
-1. `n` processes open the WAL file while a "consumer" swoops in and tries to process the file, effectively losing the
+1. `n` processes open the log file while a "consumer" swoops in and tries to process the file, effectively losing the
 current writes (due to processing a partially written file and ultimately invoking `unlink` on the file) - This isn't
-possible as the `process` method makes sure that it doesn't iterate over the "current active file".
-2. `k` processes out of `n` (such that `k < n`) open the WAL while a "consumer" swoops in and tries to process the
-file, effectively losing the current writes (due to unlinking the file others hold a reference to) - Although the `process`
+possible as the `process_inactive` method makes sure that it doesn't iterate over the "current active file".
+2. `k` processes out of `n` (such that `k < n`) open the persistent log while a "consumer" swoops in and tries to process the
+file, effectively losing the current writes (due to unlinking the file others hold a reference to) - Although the `process_inactive`
 method will not protect against this as technically the "current active file" is a different file, this is still **not**
 possible as the "consumer" needs to have an "EXCLUSIVE" lock on the files before it can process them; this makes sure
 that for as long as any process is writing to the file, the "consumer" cannot consume the file and will block.
-3. `k` processes out of `n` (such that `k < n`) open the WAL but before the NSFS process could get a "SHARED" lock on
+3. `k` processes out of `n` (such that `k < n`) open the persistent log but before the NSFS process could get a "SHARED" lock on
 the file, the "consumer" process swoops in, processes the files and then issues `unlink` on the file. The unlink will
 not delete the file as `k` processes have an open FD to the file, but as soon as those processes are done writing to
-it and close the FD, the file will be deleted, which will result in lost writes - This isn't possible as `SimpleWAL`
+it and close the FD, the file will be deleted, which will result in lost writes - This isn't possible as `PersistentLogger`
 does not allow writing to a file till it can get a lock on the file and ensure that there are `> 0` links to the file.
 If there are no links then it tries to open the file again, assuming that the consumer has issued `unlink` on the file
 it holds the FD to.
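The race handling above boils down to a small locking protocol on the writer side: hold a "SHARED" lock while appending and refuse to write to a file whose link count has dropped to zero. A minimal sketch of that protocol follows; `flock(fh, mode)` is a hypothetical stand-in for NooBaa's native flock binding, and plain Node `fs` replaces the native FS wrappers the real `PersistentLogger` uses.

```js
// Sketch of the writer-side append protocol described above (not NooBaa's code).
// `flock(fh, mode)` is a hypothetical helper; Node's stdlib has no flock.
const fs = require('fs').promises;

async function append_line(active_log_path, entry, flock) {
    for (;;) {
        const fh = await fs.open(active_log_path, 'a'); // open (or create) the active log file
        try {
            await flock(fh, 'SHARED');                  // writers share; the consumer takes EXCLUSIVE
            const stat = await fh.stat();
            if (stat.nlink > 0) {                       // file was not unlinked by the consumer
                await fh.appendFile(entry + '\n');      // entries are newline separated
                return;
            }
            // The consumer unlinked this file between our open and our lock;
            // retry so the write lands in a freshly created active file.
        } finally {
            await fh.close();
        }
    }
}
```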
@@ -66,7 +65,7 @@ which manages the actual movements of the file.
 2. NooBaa rejects the request if NooBaa isn't configured to support the given storage class. This is **not** enabled
 by default and needs to be enabled via `config-local.js` by setting `config.NSFS_GLACIER_ENABLED = true` and `config.NSFS_GLACIER_LOGS_ENABLED = true`.
 3. NooBaa will set the storage class to `GLACIER` by setting the `user.storage_class` extended attribute.
-4. NooBaa creates a simple WAL (Write Ahead Log) and appends the filename to the log file.
+4. NooBaa creates a persistent log and appends the filename to the log file.
 5. Completes the upload.
 
 Once the upload is complete, the file sits on the disk till the second process kicks in and actually does the movement
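Steps 3 and 4 of this phase map to a small amount of code. The sketch below is only an illustration of how the pieces fit together: `set_user_xattr` is a hypothetical helper for writing the `user.storage_class` extended attribute, the require paths mirror the ones visible in the `src/manage_nsfs` diff below, and the `{ locking: 'SHARED' }` option is assumed for the writer side.

```js
// Sketch of Flow 1, Phase 1 (steps 3-4); not the actual NSFS upload path.
const config = require('../../config');
const { PersistentLogger } = require('../util/persistent_logger');
const { GlacierBackend } = require('../sdk/nsfs_glacier_backend/backend');

async function record_glacier_upload(fs_context, file_path, set_user_xattr) {
    // Step 3: tag the uploaded file so the later passes can recognize it
    // (set_user_xattr is a hypothetical helper).
    await set_user_xattr(fs_context, file_path, { 'user.storage_class': 'GLACIER' });

    // Step 4: append the filename to the migration log. PersistentLogger takes
    // care of rotation and of the locking protocol sketched earlier.
    const log = new PersistentLogger(config.NSFS_GLACIER_LOGS_DIR, GlacierBackend.MIGRATE_WAL_NAME, { locking: 'SHARED' });
    await log.append(file_path);
}
```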
@@ -79,19 +78,19 @@ does as well).
 1. A scheduler (e.g. cron, human, script, etc.) issues `node src/cmd/manage_nsfs glacier migrate --interval <val>`.
 2. The command will first acquire an "EXCLUSIVE" lock so as to ensure that only one tape management command is running at once.
 3. Once the process has the lock it will start to iterate over the potentially currently inactive files.
-4. Before processing a WAL file, the process will get an "EXCLUSIVE" lock on the file ensuring that it is indeed the only
+4. Before processing a log file, the process will get an "EXCLUSIVE" lock on the file ensuring that it is indeed the only
 process processing the file.
-5. It will read the WAL one line at a time and will ensure the following:
+5. It will read the log one line at a time and will ensure the following:
    1. The file still exists.
    2. The file still has the `GLACIER` storage class. (This can happen if the user uploads another object with `STANDARD`
    storage class).
    3. The file doesn't have any of the `RestoreObject` extended attributes. This is to ensure that if the file was marked
    for restoration as soon as it was uploaded then we don't perform the migration at all. This is to avoid unnecessary
    work and also make sure that we don't end up racing with ourselves.
-6. Once a file name passes through all the above criteria then we add its name to a temporary WAL and hand over the file
+6. Once a file name passes through all the above criteria then we add its name to a temporary log and hand over the file
 name to the `migrate` script which should be in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory. We expect that the script will take the file name as its first parameter and will perform the migration. If `config.NSFS_GLACIER_BACKEND` is set to `TAPECLOUD` (default) then we expect the script to output data in compliance with the `eeadm migrate` command.
-7. We delete the temporary WAL that we created.
-8. We delete the WAL created by the NSFS process **iff** there were no failures in `migrate`. In case of failures we skip the WAL
+7. We delete the temporary log that we created.
+8. We delete the log created by the NSFS process **iff** there were no failures in `migrate`. In case of failures we skip the log
 deletion as a way to retry during the next trigger of the script. It should be noted that NooBaa's `migrate` (`TAPECLOUD` backend) invocation does **not** consider `DUPLICATE TASK` an error.
 
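The step-5 checks of the migration pass translate to a per-entry filter roughly like the following. This is a sketch, not the backend's code: `get_xattr` is a hypothetical helper, and the `user.noobaa.restore.request` attribute name is an assumption about how the `RestoreObject` marker is stored.

```js
// Sketch of the step-5 filter applied to every name read from the migration log.
const fs = require('fs').promises;

async function should_migrate(file_path, get_xattr) {
    // 5.1: the file must still exist (it may have been deleted since the upload).
    try {
        await fs.access(file_path);
    } catch (err) {
        return false;
    }

    // 5.2: a newer upload may have replaced the object with STANDARD storage class.
    if (await get_xattr(file_path, 'user.storage_class') !== 'GLACIER') return false;

    // 5.3: skip files already marked for restore so the migration does not race
    // with a restore request issued right after the upload (assumed attribute name).
    if (await get_xattr(file_path, 'user.noobaa.restore.request') !== undefined) return false;

    return true;
}
```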
### Flow 2: Restore Object
@@ -105,21 +104,21 @@ which manages the actual movements of the file.
 by default and needs to be enabled via `config-local.js` by setting `config.NSFS_GLACIER_ENABLED = true` and `config.NSFS_GLACIER_LOGS_ENABLED = true`.
 3. NooBaa performs a number of checks to ensure that the operation is valid (for example there is no
 restore request already going on, etc.).
-4. NooBaa saves the filename to a simple WAL (Write Ahead Log).
+4. NooBaa saves the filename to a persistent log.
 5. Returns the request with success indicating that the restore request has been accepted.
 
 #### Phase 2
 1. A scheduler (e.g. cron, human, script, etc.) issues `node src/cmd/manage_nsfs glacier restore --interval <val>`.
 2. The command will first acquire an "EXCLUSIVE" lock so as to ensure that only one tape management command is running at once.
 3. Once the process has the lock it will start to iterate over the potentially currently inactive files.
-4. Before processing a WAL file, the process will get an "EXCLUSIVE" lock on the file ensuring that it is indeed the only
+4. Before processing a log file, the process will get an "EXCLUSIVE" lock on the file ensuring that it is indeed the only
 process processing the file.
-5. It will read the WAL one line at a time and will store the names of the files that we expect to fail during an eeadm restore
+5. It will read the log one line at a time and will store the names of the files that we expect to fail during an eeadm restore
 (this can happen, for example, because a `RestoreObject` was issued for a file but later on that file was deleted before we could
 actually process the file).
-6. The WAL is handed over to the `recall` script which should be present in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory. We expect that the script will take the file name as its first parameter and will perform the recall. If `config.NSFS_GLACIER_BACKEND` is set to `TAPECLOUD` (default) then we expect the script to output data in compliance with the `eeadm recall` command.
-7. If we get any unexpected failures then we mark it as a failure and make sure we do not delete the WAL file (so as to retry later).
-8. We iterate over the WAL again to set the final extended attributes. This is to make sure that we can communicate the latest with
+6. The log is handed over to the `recall` script which should be present in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory. We expect that the script will take the file name as its first parameter and will perform the recall. If `config.NSFS_GLACIER_BACKEND` is set to `TAPECLOUD` (default) then we expect the script to output data in compliance with the `eeadm recall` command.
-7. If we get any unexpected failures then we mark it as a failure and make sure we do not delete the log file (so as to retry later).
+8. We iterate over the log again to set the final extended attributes. This is to make sure that we can communicate the latest with
 the NSFS processes.
 
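Step 6 of the restore pass hands the whole restore log to an external executable. A rough sketch of that handoff, assuming plain `child_process` wiring (the real backend drives it through its own process utilities and parses `eeadm recall`-compatible output):

```js
// Sketch of handing the restore log to the `recall` script (step 6).
const path = require('path');
const { execFile } = require('child_process');
const { promisify } = require('util');
const config = require('../../config');

async function run_recall(log_file) {
    const recall_bin = path.join(config.NSFS_GLACIER_TAPECLOUD_BIN_DIR, 'recall');
    // The script takes the log file name as its first parameter and performs the recall;
    // its output is parsed later to tell expected failures from unexpected ones.
    const { stdout } = await promisify(execFile)(recall_bin, [log_file]);
    return stdout;
}
```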
### Flow 3: Copy Object with Glacier Object as copy source

src/manage_nsfs/manage_nsfs_glacier.js

Lines changed: 14 additions & 24 deletions
@@ -20,11 +20,13 @@ async function process_migrations() {
     const fs_context = native_fs_utils.get_process_fs_context();
 
     await lock_and_run(fs_context, CLUSTER_LOCK, async () => {
+        const backend = getGlacierBackend();
+
         if (
-            await low_free_space() ||
+            await backend.low_free_space() ||
             await time_exceeded(fs_context, config.NSFS_GLACIER_MIGRATE_INTERVAL, MIGRATE_TIMESTAMP_FILE)
         ) {
-            await run_glacier_migrations(fs_context);
+            await run_glacier_migrations(fs_context, backend);
             await record_current_time(fs_context, MIGRATE_TIMESTAMP_FILE);
         }
     });
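The change threads one backend instance through each run instead of re-creating it in every helper. The commands only rely on the small surface below; this class shape is inferred from the calls visible in this file and is not the actual `GlacierBackend` definition.

```js
// Surface used by the manage_nsfs glacier commands, inferred from this diff.
// The bodies are placeholders, not NooBaa's implementation.
class GlacierBackendSketch {
    /** @returns {Promise<boolean>} true when free space is low enough to force an early migration */
    async low_free_space() { return false; }

    /** Migrate the file names listed in one inactive migration WAL file. */
    async migrate(fs_context, log_file) { /* hand entries to the `migrate` script */ }

    /** Restore (recall) the file names listed in one inactive restore WAL file. */
    async restore(fs_context, log_file) { /* hand entries to the `recall` script */ }

    /** Expire restored copies whose restore window has passed. */
    async expiry(fs_context) { /* run the expiry pass */ }
}
```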
@@ -33,31 +35,33 @@ async function process_migrations() {
 /**
  * run_tape_migrations reads the migration WALs and attempts to migrate the
  * files mentioned in the WAL.
- * @param {nb.NativeFSContext} fs_context
+ * @param {nb.NativeFSContext} fs_context
+ * @param {import('../sdk/nsfs_glacier_backend/backend').GlacierBackend} backend
  */
-async function run_glacier_migrations(fs_context) {
+async function run_glacier_migrations(fs_context, backend) {
     // This WAL is getting opened only so that we can process all the process WAL entries
     const wal = new PersistentLogger(
         config.NSFS_GLACIER_LOGS_DIR,
         GlacierBackend.MIGRATE_WAL_NAME,
         { disable_rotate: true, locking: 'EXCLUSIVE' },
     );
 
-    const backend = getGlacierBackend();
     await wal.process_inactive(async file => backend.migrate(fs_context, file));
 }
 
 async function process_restores() {
     const fs_context = native_fs_utils.get_process_fs_context();
 
     await lock_and_run(fs_context, CLUSTER_LOCK, async () => {
+        const backend = getGlacierBackend();
+
         if (
-            await low_free_space() ||
+            await backend.low_free_space() ||
             !(await time_exceeded(fs_context, config.NSFS_GLACIER_RESTORE_INTERVAL, RESTORE_TIMESTAMP_FILE))
         ) return;
 
 
-        await run_glacier_restore(fs_context);
+        await run_glacier_restore(fs_context, backend);
         await record_current_time(fs_context, RESTORE_TIMESTAMP_FILE);
     });
 }
@@ -66,16 +70,16 @@ async function process_restores() {
  * run_tape_restore reads the restore WALs and attempts to restore the
  * files mentioned in the WAL.
  * @param {nb.NativeFSContext} fs_context
+ * @param {import('../sdk/nsfs_glacier_backend/backend').GlacierBackend} backend
  */
-async function run_glacier_restore(fs_context) {
+async function run_glacier_restore(fs_context, backend) {
     // This WAL is getting opened only so that we can process all the process WAL entries
     const wal = new PersistentLogger(
         config.NSFS_GLACIER_LOGS_DIR,
         GlacierBackend.RESTORE_WAL_NAME,
         { disable_rotate: true, locking: 'EXCLUSIVE' },
     );
 
-    const backend = getGlacierBackend();
     await wal.process_inactive(async file => backend.restore(fs_context, file));
 }
 
@@ -86,16 +90,11 @@ async function process_expiry() {
         if (!(await time_exceeded(fs_context, config.NSFS_GLACIER_EXPIRY_INTERVAL, EXPIRY_TIMESTAMP_FILE))) return;
 
 
-        await run_glacier_expiry(fs_context);
+        await getGlacierBackend().expiry(fs_context);
         await record_current_time(fs_context, EXPIRY_TIMESTAMP_FILE);
     });
 }
 
-async function run_glacier_expiry(fs_context) {
-    const backend = getGlacierBackend();
-    await backend.expiry(fs_context);
-}
-
 /**
  * time_exceeded returns true if the time between last run recorded in the given
  * timestamp_file and now is greater than the given interval.
@@ -120,15 +119,6 @@ async function time_exceeded(fs_context, interval, timestamp_file) {
     return false;
 }
 
-/**
- * low_free_space returns true if the default backend has low disk space
- * @returns {Promise<boolean>}
- */
-async function low_free_space() {
-    const backend = getGlacierBackend();
-    return backend.low_free_space();
-}
-
 /**
  * record_current_time stores the current timestamp in ISO format into
  * the given timestamp file
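For reference, the `time_exceeded` / `record_current_time` pair documented above implements a simple timestamp-file schedule. A minimal sketch using plain Node `fs` (the real code goes through NooBaa's native FS wrappers and config-driven intervals):

```js
// Sketch of the timestamp-file scheduling described by the JSDoc above.
const fs = require('fs').promises;

async function time_exceeded(timestamp_file, interval_ms) {
    try {
        const last_run = new Date((await fs.readFile(timestamp_file, 'utf8')).trim());
        return Date.now() - last_run.getTime() > interval_ms;
    } catch (err) {
        if (err.code === 'ENOENT') return true; // no previous run recorded
        throw err;
    }
}

async function record_current_time(timestamp_file) {
    // Stored in ISO format, matching what time_exceeded parses back.
    await fs.writeFile(timestamp_file, new Date().toISOString());
}
```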

src/util/file_reader.js

Lines changed: 9 additions & 4 deletions
@@ -91,10 +91,15 @@ class NewlineReader {
     }
 
     async init() {
-        const fh = await nb_native().fs.open(this.fs_context, this.path, 'r');
-        if (this.lock) await fh.flock(this.fs_context, this.lock);
-
-        this.fh = fh;
+        let fh = null;
+        try {
+            fh = await nb_native().fs.open(this.fs_context, this.path, 'r');
+            if (this.lock) await fh.flock(this.fs_context, this.lock);
+
+            this.fh = fh;
+        } finally {
+            if (fh && !this.fh) await fh.close(this.fs_context);
+        }
     }
 
     async close() {
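For context, a consumer drives `NewlineReader` roughly as below; the constructor arguments and the `forEach` iteration helper are assumptions based on the fields used in `init()` above.

```js
// Hypothetical NewlineReader usage with an EXCLUSIVE lock: init() opens and locks
// the file, each callback receives one newline-terminated entry, close() releases it.
async function process_log(fs_context, log_path, on_entry) {
    const reader = new NewlineReader(fs_context, log_path, 'EXCLUSIVE');
    await reader.init();
    try {
        await reader.forEach(async entry => {
            await on_entry(entry);
            return true; // returning false would stop the iteration early
        });
    } finally {
        await reader.close();
    }
}
```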
