
Commit b2dbc56

maxceem, imcaizheng, nursoltan-s authored
ES/DB compare script (#459)
* apply patch from original submission
* fix 4 issues
* fix issues on array comparison
* add descriptions to each kind of mismatches
* initial commit
* chore: es-db-compare script code structure enhancements
  - added verification scripts and guides to the repo
  - moved README to the script folder
  - README and Verification guide improvements
  - ignore paths configured for production use
* chore: add comment
* fix: Report ES/DB - Upload to AWS S3
* chore: es-db-compare script enhancements
* chore: dummy prices

Co-authored-by: Aaron Peng <imcaizheng@gmail.com>
Co-authored-by: Nursoltan Saipolda <Nursoltan.s@gmail.com>
1 parent 7eb9294 commit b2dbc56

18 files changed, +172613 -2 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -47,3 +47,6 @@ jspm_packages
 !.elasticbeanstalk/*.global.yml
 .DS_Store
 .idea
+
+# Report which might be generated using `scripts/es-db-compare` script
+report.html

package.json

Lines changed: 5 additions & 2 deletions
@@ -23,7 +23,8 @@
     "test": "NODE_ENV=test npm run lint && NODE_ENV=test npm run sync:es && NODE_ENV=test npm run sync:db && NODE_ENV=test ./node_modules/.bin/istanbul cover ./node_modules/mocha/bin/_mocha -- --timeout 10000 --require babel-core/register $(find src -path '*spec.js*') --exit",
     "test:watch": "NODE_ENV=test ./node_modules/.bin/mocha -w --require babel-core/register $(find src -path '*spec.js*')",
     "seed": "babel-node src/tests/seed.js --presets es2015",
-    "demo-data": "babel-node local/seed"
+    "demo-data": "babel-node local/seed",
+    "es-db-compare": "babel-node scripts/es-db-compare"
   },
   "repository": {
     "type": "git",
@@ -54,8 +55,11 @@
     "express-request-id": "^1.1.0",
     "express-sanitizer": "^1.0.2",
     "express-validation": "^0.6.0",
+    "handlebars": "^4.5.3",
     "http-aws-es": "^4.0.0",
     "joi": "^8.0.5",
+    "jsondiffpatch": "^0.4.1",
+    "jsonpath": "^1.0.2",
     "jsonwebtoken": "^8.3.0",
     "lodash": "^4.17.11",
     "memwatch-next": "^0.3.0",
@@ -65,7 +69,6 @@
     "pg": "^7.11.0",
     "pg-native": "^3.0.0",
     "sequelize": "^5.8.7",
-    "jsonpath": "^1.0.2",
     "swagger-ui-express": "^4.0.6",
     "tc-core-library-js": "appirio-tech/tc-core-library-js.git#v2.6.3",
     "traverse": "^0.6.6",

scripts/es-db-compare/README.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# Script to find mismatches between data in DB and ES
+
+We keep all the data in two places: in the DB (database) and in ES (Elasticsearch index). Every time we change data in the DB, the change is also reflected in Elasticsearch. Under some circumstances, data in ES and DB can become inconsistent.
+
+This script may be run to find all the inconsistencies between the data we have in ES and DB and create a report.
+
+## Configuration
+
+The following properties can be set from env variables:
+
+- `PROJECT_START_ID`: if set, only projects with an id greater than or equal to the value are compared.
+- `PROJECT_END_ID`: if set, only projects with an id less than or equal to the value are compared.
+- `PROJECT_LAST_ACTIVITY_AT`: if set, only projects whose `lastActivityAt` property is greater than or equal to the value are compared.
+- `REPORT_S3_BUCKET`: if set, the report is uploaded to this S3 bucket; otherwise the report is saved to disk.
+- `AWS_ACCESS_KEY_ID`: AWS credentials, required to upload the report to the S3 bucket.
+- `AWS_SECRET_ACCESS_KEY`: AWS credentials, required to upload the report to the S3 bucket.
+
+There could be some fields that always mismatch in ES and DB.
+The `ignoredPaths` variable in `scripts/es-db-compare/constants.js` maintains a list of JSON paths which are ignored
+during the comparison. You may need to add, modify, or delete items in the list.
+
+### Required
+
+- `PROJECT_START_ID` and `PROJECT_END_ID` must exist together.
+- At least one of `PROJECT_START_ID` with `PROJECT_END_ID` or `PROJECT_LAST_ACTIVITY_AT` needs to be set before running the script.
+- If you want to upload the report to AWS S3, you need to set the `REPORT_S3_BUCKET`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` environment variables.
+
+## Usage
+
+Set up the configuration and execute the command `npm run es-db-compare` on the command line.
+It will then generate an HTML report named `report.html` in the current directory.
+
+Example commands:
+
+- Generate a report comparing ALL the projects:
+
+```bash
+PROJECT_LAST_ACTIVITY_AT=0 npm run es-db-compare
+```
+
+- Generate a report comparing projects that have been updated on **26 December 2019** or later:
+
+```bash
+PROJECT_LAST_ACTIVITY_AT="2019-12-26" npm run es-db-compare
+```
+
+- Generate a report comparing projects with ID range:
+
+```bash
+PROJECT_START_ID=5000 PROJECT_END_ID=6000 npm run es-db-compare
+```
+
+- Any of the commands above can also be run with the `REPORT_S3_BUCKET`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` environment variables set to upload the report to an S3 bucket, for example:
+
+```bash
+REPORT_S3_BUCKET=<S3 bucket name> AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID> AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY> PROJECT_LAST_ACTIVITY_AT="2019-12-26" npm run es-db-compare
+```
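The three rules under "Required" in the README can be checked before any comparison starts. The sketch below is only an illustration of those rules and is not taken from the commit: `validateConfig` and `isSet` are hypothetical names, and treating an empty string as "not set" is an assumption.

```javascript
// Hypothetical helper, not part of the commit: it encodes the rules listed
// under "Required" in the README. An empty string counts as "not set"
// (an assumption of this sketch).
const isSet = value => value !== undefined && value !== '';

function validateConfig(env) {
  const errors = [];
  const hasStart = isSet(env.PROJECT_START_ID);
  const hasEnd = isSet(env.PROJECT_END_ID);
  const hasLastActivity = isSet(env.PROJECT_LAST_ACTIVITY_AT);

  // Rule 1: the ID range bounds must be given together.
  if (hasStart !== hasEnd) {
    errors.push('PROJECT_START_ID and PROJECT_END_ID must be set together');
  }
  // Rule 2: either a complete ID range or a last-activity filter is needed.
  if (!(hasStart && hasEnd) && !hasLastActivity) {
    errors.push('Set PROJECT_START_ID with PROJECT_END_ID, or PROJECT_LAST_ACTIVITY_AT');
  }
  // Rule 3: uploading to S3 needs the bucket name and both AWS credentials.
  if (isSet(env.REPORT_S3_BUCKET)
    && !(isSet(env.AWS_ACCESS_KEY_ID) && isSet(env.AWS_SECRET_ACCESS_KEY))) {
    errors.push('REPORT_S3_BUCKET requires AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY');
  }
  return errors;
}

// Only one bound of the ID range is given, so rules 1 and 2 both fail.
console.log(validateConfig({ PROJECT_START_ID: '5000' }));
```

In a real run the function would receive `process.env`; returning a list of messages rather than throwing lets a caller print every problem at once.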
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
+/* eslint-disable no-console */
+/* eslint-disable consistent-return */
+/* eslint-disable no-restricted-syntax */
+/* eslint-disable no-param-reassign */
+/*
+ * Compare metadata between ES and DB.
+ */
+const lodash = require('lodash');
+
+const scriptUtil = require('./util');
+const scriptConstants = require('./constants');
+
+const hashKeyMapping = {
+  ProjectTemplate: 'id',
+  ProductTemplate: 'id',
+  ProjectType: 'key',
+  ProductCategory: 'key',
+  MilestoneTemplate: 'id',
+  OrgConfig: 'id',
+  Form: 'id',
+  PlanConfig: 'id',
+  PriceConfig: 'id',
+  BuildingBlock: 'id',
+};
+
+/**
+ * Process a single delta.
+ *
+ * @param {String} modelName the model name the delta belongs to
+ * @param {Object} delta the diff delta.
+ * @param {Object} dbData the data from DB
+ * @param {Object} esData the data from ES
+ * @param {Object} finalData the data patched
+ * @returns {Object|undefined} the classified delta, or undefined if the delta type is not handled
+ */
+function processDelta(modelName, delta, dbData, esData, finalData) {
+  const hashKey = hashKeyMapping[modelName];
+  if (delta.dataType === 'array' && delta.path.length === 1) {
+    if (delta.type === 'delete') {
+      console.log(`one dbOnly found for ${modelName} with ${hashKey} ${delta.originalValue[hashKey]}`);
+      return {
+        type: 'dbOnly',
+        modelName,
+        hashKey,
+        hashValue: delta.originalValue[hashKey],
+        dbCopy: delta.originalValue,
+      };
+    }
+    if (delta.type === 'add') {
+      console.log(`one esOnly found for ${modelName} with ${hashKey} ${delta.value[hashKey]}`);
+      return {
+        type: 'esOnly',
+        modelName,
+        hashKey,
+        hashValue: delta.value[hashKey],
+        esCopy: delta.value,
+      };
+    }
+  }
+  if (['add', 'delete', 'modify'].includes(delta.type)) {
+    const path = scriptUtil.generateJSONPath(lodash.slice(delta.path, 1));
+    const hashValue = lodash.get(finalData, lodash.slice(delta.path, 0, 1))[hashKey];
+    const hashObject = lodash.set({}, hashKey, hashValue);
+    const dbCopy = lodash.find(dbData, hashObject);
+    const esCopy = lodash.find(esData, hashObject);
+    console.log(`one mismatch found for ${modelName} with ${hashKey} ${hashValue}`);
+    return {
+      type: 'mismatch',
+      kind: delta.type,
+      modelName,
+      hashKey,
+      hashValue,
+      path,
+      dbCopy,
+      esCopy,
+    };
+  }
+}
+
+
+/**
+ * Compare Metadata data from ES and DB.
+ *
+ * @param {Object} dbData the data from DB
+ * @param {Object} esData the data from ES
+ * @returns {Object} the data to feed handlebars template
+ */
+function compareMetadata(dbData, esData) {
+  const data = {
+    nestedModels: {},
+  };
+
+  const countInconsistencies = () => {
+    lodash.set(data, 'meta.totalObjects', 0);
+    lodash.map(data.nestedModels, (model) => {
+      const counts = Object.keys(model.mismatches).length + model.dbOnly.length + model.esOnly.length;
+      lodash.set(model, 'meta.counts', counts);
+      data.meta.totalObjects += counts;
+    });
+  };
+
+  const storeDelta = (modelName, delta) => {
+    if (lodash.isUndefined(data.nestedModels[modelName])) {
+      data.nestedModels[modelName] = {
+        mismatches: {},
+        dbOnly: [],
+        esOnly: [],
+      };
+    }
+    if (delta.type === 'mismatch') {
+      if (lodash.isUndefined(data.nestedModels[modelName].mismatches[delta.hashValue])) {
+        data.nestedModels[modelName].mismatches[delta.hashValue] = [];
+      }
+      data.nestedModels[modelName].mismatches[delta.hashValue].push(delta);
+      return;
+    }
+    if (delta.type === 'dbOnly') {
+      data.nestedModels[modelName].dbOnly.push(delta);
+      return;
+    }
+    if (delta.type === 'esOnly') {
+      data.nestedModels[modelName].esOnly.push(delta);
+    }
+  };
+
+  for (const refPath of Object.keys(scriptConstants.associations.metadata)) {
+    const modelName = scriptConstants.associations.metadata[refPath];
+    const { deltas, finalData } = scriptUtil.diffData(
+      dbData[refPath],
+      esData[refPath],
+      {
+        hashKey: hashKeyMapping[modelName],
+        modelPathExprssions: lodash.set({}, modelName, '[*]'),
+      },
+    );
+    for (const delta of deltas) {
+      if (scriptUtil.isIgnoredPath(`metadata.${refPath}`, delta.path)) {
+        continue; // eslint-disable-line no-continue
+      }
+      const deltaWithCopy = processDelta(modelName, delta, dbData[refPath], esData[refPath], finalData);
+      if (deltaWithCopy) {
+        storeDelta(modelName, deltaWithCopy);
+      }
+    }
+  }
+  countInconsistencies();
+  return data;
+}
+
+module.exports = {
+  compareMetadata,
+};
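To see how the report data is grouped and counted, it may help to run the bookkeeping on its own. The sketch below is a simplified stand-in for the `storeDelta` and `countInconsistencies` closures in the file above, written without lodash. `summarize` is a hypothetical name, and the sample entries (their `hashValue` and `path` values) are made up for illustration. They only mimic the shape of the objects `processDelta` returns.

```javascript
// Simplified, lodash-free stand-in for storeDelta/countInconsistencies.
// It takes already-classified results shaped like processDelta's return value.
function summarize(results) {
  const data = { nestedModels: {}, meta: { totalObjects: 0 } };

  for (const result of results) {
    if (!data.nestedModels[result.modelName]) {
      data.nestedModels[result.modelName] = { mismatches: {}, dbOnly: [], esOnly: [] };
    }
    const model = data.nestedModels[result.modelName];
    if (result.type === 'mismatch') {
      // Mismatches are grouped per object, keyed by the object's hash value.
      if (!model.mismatches[result.hashValue]) {
        model.mismatches[result.hashValue] = [];
      }
      model.mismatches[result.hashValue].push(result);
    } else if (result.type === 'dbOnly' || result.type === 'esOnly') {
      model[result.type].push(result);
    }
  }

  // An object with several mismatching fields is counted once.
  for (const model of Object.values(data.nestedModels)) {
    const counts = Object.keys(model.mismatches).length
      + model.dbOnly.length + model.esOnly.length;
    model.meta = { counts };
    data.meta.totalObjects += counts;
  }
  return data;
}

// Made-up sample data: two field mismatches on one ProjectType,
// one ProjectType that exists only in DB, one Form that exists only in ES.
const sample = [
  { type: 'mismatch', kind: 'modify', modelName: 'ProjectType', hashKey: 'key', hashValue: 'app', path: 'displayName' },
  { type: 'mismatch', kind: 'delete', modelName: 'ProjectType', hashKey: 'key', hashValue: 'app', path: 'icon' },
  { type: 'dbOnly', modelName: 'ProjectType', hashKey: 'key', hashValue: 'website' },
  { type: 'esOnly', modelName: 'Form', hashKey: 'id', hashValue: 7 },
];
const summary = summarize(sample);
console.log(summary.meta.totalObjects); // 3
```

The total is 3 rather than 4 because both field-level mismatches belong to the same `ProjectType` object, so `meta.counts` and `meta.totalObjects` report inconsistent objects, not individual differing fields.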
