coa-updater: Convert to TS
bperel committed Jul 7, 2024
1 parent 11586aa commit 405e13a
Showing 9 changed files with 379 additions and 85 deletions.
2 changes: 1 addition & 1 deletion apps/coa-updater/Dockerfile
@@ -1,5 +1,5 @@
FROM debian:buster-slim
-MAINTAINER Bruno Perel
+LABEL maintainer="Bruno Perel"

RUN apt-get update && \
apt-get install -y mariadb-client wget csvtool && \
12 changes: 1 addition & 11 deletions apps/coa-updater/README.md
@@ -1,17 +1,7 @@
# coa-updater

A simple way to set up and provision a COA database.

### Provisioning

Provisioning is executed when the container starts.

### Related projects

* [DucksManager](https://github.com/bperel/DucksManager) is a free and open-source website enabling comic book collectors to manage their Disney collection.
* [dm-server](https://github.com/bperel/dm-server) is the back-end project that DucksManager reads data from and writes data to.
* [WhatTheDuck](https://github.com/bperel/WhatTheDuck) is the DucksManager mobile app. It lets users check the contents of their collection on a mobile device and add issues to the collection by photographing comic book covers.
* [Duck cover ID](https://github.com/bperel/duck-cover-id) is a collection of shell scripts launched by a daily cron job. They retrieve comic book covers from the Inducks website and add the features of these pictures to a Pastec index. This index is searched when a user takes a picture of a cover in the WhatTheDuck app.
* [COA updater](https://github.com/bperel/coa-updater) is a shell script launched by a daily cron job. It retrieves the structure and the contents of the Inducks database and creates a local copy of this database.
* [DucksManager-stats](https://github.com/bperel/DucksManager-stats) contains a list of scripts launched by a daily cron job. They calculate statistics about the issues recommended to DucksManager users, depending on the authors that they prefer.

![DucksManager architecture](https://raw.githubusercontent.com/bperel/DucksManager/master/server_architecture.png)
23 changes: 23 additions & 0 deletions apps/coa-updater/docker-compose-dev.yml
@@ -0,0 +1,23 @@
services:
  coa-updater:
    init: true
    container_name: coa-updater
    build:
      context: .
    command: 'run index.ts'
    volumes:
      - .:/home/bun/app
      - inducks_data:/tmp/inducks
    environment:
      MYSQL_HOST: db
      MYSQL_DATABASE: coa
      MYSQL_DATABASE_NEW: coa_new
      MYSQL_ROOT_PASSWORD: ${MYSQL_ROOT_PASSWORD}
    networks:
      - db-network

volumes:
  inducks_data:
networks:
  db-network:
    external: true
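
The `environment` block above defines the four variables that `index.ts` reads from `process.env`. A minimal sketch of failing fast when one of them is missing, assuming a hypothetical helper named `requireEnv` that is not part of this commit:

```typescript
// Hypothetical helper, not part of the commit: checks that every variable
// declared in docker-compose-dev.yml is set before any database work starts.
const REQUIRED_VARS = [
  "MYSQL_HOST",
  "MYSQL_DATABASE",
  "MYSQL_DATABASE_NEW",
  "MYSQL_ROOT_PASSWORD",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

function requireEnv(
  env: Record<string, string | undefined>
): Record<RequiredVar, string> {
  // Collect every missing or empty variable so the error names all of them
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const result = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    result[name] = env[name] as string;
  }
  return result;
}
```

Under these assumptions, a call such as `const config = requireEnv(process.env);` at the top of the script would stop the container with an explicit message instead of building SQL statements that contain `undefined`.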
165 changes: 165 additions & 0 deletions apps/coa-updater/index.ts
@@ -0,0 +1,165 @@
#!/usr/bin/env bun

import { $ } from "bun";
import { execSync } from "child_process";
import { parse } from "csv-parse";
import { createReadStream, mkdirSync, readFileSync, writeFileSync } from "fs";
import { createPool } from "mariadb";

const dataPath = "/tmp/inducks",
  isvPath = `${dataPath}/isv`;

const poolParams = {
  host: process.env.MYSQL_HOST,
  user: "root",
  password: process.env.MYSQL_ROOT_PASSWORD,
  multipleStatements: true,
  connectionLimit: 5,
  permitLocalInfile: true,
};
const pool = createPool(poolParams);

console.log("Pool created");

try {
  // mkdirSync(isvPath, { recursive: true });
  // await $`wget -c https://inducks.org/inducks/isv.tgz -O - | tar -xz -C ${isvPath}`;

  // Ignore lines with invalid UTF-8 characters
  // List files in the directory and iterate over them

  for await (const file of $`ls ${isvPath}/*.isv`.lines()) {
    if (file) {
      await $`iconv -f utf-8 -t utf-8 -c "${file}" > "${file}.clean" && mv -f "${file}.clean" "${file}"`;
    }
  }

  console.log("iconv done");
  let cleanSql = readFileSync(`${isvPath}/createtables.sql`, "utf8")
    .split("\n")
    .filter(
      (line) =>
        !(
          ["USE ", "RENAME ", "DROP ", "# Step ", "#End of file"].some(
            (prefix) => line.startsWith(prefix)
          ) ||
          /^.+priv[^;]+;$/.test(line) ||
          /^CREATE TABLE IF NOT EXISTS ([^ ]+) LIKE \1_temp/.test(line)
        )
    )
    .join("\n")
    // Replace "pk0" indexes with actual primary keys
    .replace(/KEY pk0/gms, "CONSTRAINT `PRIMARY` PRIMARY KEY")
    .replace(
      /LOAD DATA LOCAL INFILE ".\/([^"]+)"/gms,
      `LOAD DATA LOCAL INFILE '${dataPath}/$1'`
    )

    // Prefix fulltext indexes with table name
    .replace(
      /(ALTER TABLE )(([^ ]+)_temp)( ADD FULLTEXT)(\([^()]+\));/gs,
      "$1$2$4 fulltext_$3 $5;"
    );

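  console.log("Renaming foreign keys...");
  // Insert the table name into each fkN key name (fk0 becomes fk_<table>0).
  // The index is interpolated without quotes: literal quotes in the pattern
  // would never match key names such as fk0.
  for (let fkIndex = 0; fkIndex <= 5; fkIndex++) {
    cleanSql = cleanSql.replace(
      new RegExp(
        `(CREATE TABLE ((?:(?!_temp).)+?)_temp(?:(?!KEY fk${fkIndex})[^;])+?)KEY (fk)(${fkIndex})`,
        "gms"
      ),
      "$1KEY $3_$2$4"
    );
  }
  console.log("done.");

  cleanSql = cleanSql.replace(/_temp/g, "");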

  const textFieldsToTransform = { inducks_person: ["fullname"] };
  for (const [table, fields] of Object.entries(textFieldsToTransform)) {
    for (const field of fields) {
      const parser = createReadStream(`${isvPath}/${table}.isv`).pipe(
        parse({
          delimiter: "^",
          columns: true,
          quote: null,
        })
      );
      let maxLength = 0;
      for await (const record of parser) {
        if (record[field]) {
          maxLength = Math.max(maxLength, record[field].length);
        }
      }
      console.log(`Max length for ${table}.${field}: ${maxLength}`);
      cleanSql = cleanSql.replace(
        new RegExp(`(?<=CREATE TABLE ${table})([^;]+ ${field} )text`),
        `$1varchar(${maxLength})`
      );
    }
  }

  cleanSql = `
    set unique_checks = 0;
    set foreign_key_checks = 0;
    set sql_log_bin=0;
    ${cleanSql}
    ALTER TABLE inducks_entryurl ADD id INT AUTO_INCREMENT NOT NULL, ADD PRIMARY KEY (id);
    # Add full text index on entry titles
    ALTER TABLE inducks_entry ADD FULLTEXT INDEX entryTitleFullText(title);
    set unique_checks = 1;
    set foreign_key_checks = 1;
    set sql_log_bin=1`;

  const cleanSqlStatements = cleanSql.split(";");

  const connection = await pool.getConnection();
  await connection.query(
    `DROP DATABASE IF EXISTS ${process.env.MYSQL_DATABASE_NEW};CREATE DATABASE ${process.env.MYSQL_DATABASE_NEW} CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; set global net_buffer_length=1000000;
    set global max_allowed_packet=1000000000; `
  );

  const newDbPool = createPool({
    ...poolParams,
    database: process.env.MYSQL_DATABASE_NEW,
  });
  const newDbConnection = await newDbPool.getConnection();
  for (const statement of cleanSqlStatements) {
    console.log(`Executing statement: ${statement}`);
    await newDbConnection.query(statement);
    console.log(" done.");
  }

  const tables = (
    await newDbConnection.query(
      `SELECT table_name FROM information_schema.tables WHERE table_schema = ?`,
      [process.env.MYSQL_DATABASE_NEW]
    )
  ).map((row: { table_name: string }) => row.table_name);
  newDbConnection.release();

  for (const table of tables) {
    console.log(`Renaming ${table}...`);
    await connection.query(
      `set foreign_key_checks = 0;
      drop table if exists ${process.env.MYSQL_DATABASE}.${table};
      rename table ${process.env.MYSQL_DATABASE_NEW}.${table} to ${process.env.MYSQL_DATABASE}.${table};
      set foreign_key_checks = 1;`
    );
    console.log(" done.");
  }

  await connection.query(`drop database ${process.env.MYSQL_DATABASE_NEW}`);
  connection.release();

  console.log("mysqlcheck...");
  execSync(
    `mysqlcheck -h ${process.env.MYSQL_HOST} -uroot -p${process.env.MYSQL_ROOT_PASSWORD} -v ${process.env.MYSQL_DATABASE}`
  );
  console.log(" done.");
  process.exit(0);
} catch (error) {
  console.error("Error:", (error as { message: string }).message);
  // Exit with a non-zero code so the failure is visible to the caller
  process.exit(1);
}
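
The schema rewriting in `index.ts` is easier to follow on a small input. The sketch below applies the same replacements, in the same order, to a made-up schema. The helper name `rewriteSchema` and the sample table definitions are hypothetical, and the sketch assumes that key names in `createtables.sql` have the form `pk0`, `fk0`, `fk1` and so on, which the commit itself does not show.

```typescript
// Hypothetical helper mirroring the replacements in index.ts on an in-memory string.
function rewriteSchema(sql: string, dataPath: string, maxFkIndex = 5): string {
  let out = sql
    // "pk0" keys become real primary keys
    .replace(/KEY pk0/g, "CONSTRAINT `PRIMARY` PRIMARY KEY")
    // Relative data file paths become absolute paths under dataPath
    .replace(
      /LOAD DATA LOCAL INFILE ".\/([^"]+)"/g,
      `LOAD DATA LOCAL INFILE '${dataPath}/$1'`
    )
    // Full-text indexes get a name derived from the table name
    .replace(
      /(ALTER TABLE )(([^ ]+)_temp)( ADD FULLTEXT)(\([^()]+\));/g,
      "$1$2$4 fulltext_$3 $5;"
    );

  // Assumption: secondary keys are named fk0, fk1, ... inside each CREATE TABLE.
  // Each pass inserts the table name into the first matching key of every table.
  for (let fkIndex = 0; fkIndex <= maxFkIndex; fkIndex++) {
    out = out.replace(
      new RegExp(
        `(CREATE TABLE ((?:(?!_temp)[\\s\\S])+?)_temp(?:(?!KEY fk${fkIndex})[^;])+?)KEY (fk)(${fkIndex})`,
        "g"
      ),
      "$1KEY $3_$2$4"
    );
  }

  // Finally drop the "_temp" suffix from every table name
  return out.replace(/_temp/g, "");
}

// Made-up sample input; the real createtables.sql may be laid out differently.
const sampleSql = [
  "CREATE TABLE inducks_entry_temp (",
  "  entrycode varchar(22),",
  "  title text,",
  "  KEY pk0 (entrycode),",
  "  KEY fk0 (title(10))",
  ");",
  "CREATE TABLE inducks_issue_temp (",
  "  issuecode varchar(17),",
  "  publicationcode varchar(12),",
  "  KEY pk0 (issuecode),",
  "  KEY fk0 (publicationcode)",
  ");",
  'LOAD DATA LOCAL INFILE "./isv/inducks_entry.isv" INTO TABLE inducks_entry_temp;',
  "ALTER TABLE inducks_entry_temp ADD FULLTEXT(title);",
].join("\n");

console.log(rewriteSchema(sampleSql, "/tmp/inducks"));
```

On this sample the two `fk0` keys come out as `fk_inducks_entry0` and `fk_inducks_issue0`, and the full-text index is named `fulltext_inducks_entry`. Keeping the rewriting in a pure function like this would also make it possible to test the regular expressions without a database.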
21 changes: 21 additions & 0 deletions apps/coa-updater/package.json
@@ -0,0 +1,21 @@
{
  "name": "~coa-updater",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "csv-generate": "^4.4.1",
    "csv-parse": "^5.5.6",
    "mariadb": "^3.3.1"
  },
  "devDependencies": {
    "@types/bun": "latest",
    "@types/node": "^18.0.0",
    "typescript": "^5.5.3"
  },
  "module": "index.ts",
  "type": "module"
}
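
The `csv-parse` dependency is used in `index.ts` to read caret-delimited `.isv` files and find the longest value of a column, so that a `text` column can be narrowed to `varchar(n)`. The sketch below illustrates that idea without any dependency, on an in-memory string. The helper names `maxFieldLength` and `narrowTextColumn` and the sample rows are hypothetical, and the plain `split("^")` parsing is a simplification rather than a replacement for `csv-parse`.

```typescript
// Hypothetical helper: longest value of one column in caret-delimited content
// whose first line is a header row.
function maxFieldLength(isvContent: string, field: string): number {
  const lines = isvContent.split("\n").filter((line) => line.length > 0);
  if (lines.length === 0) {
    return 0;
  }
  const index = lines[0].split("^").indexOf(field);
  if (index === -1) {
    throw new Error(`Unknown column: ${field}`);
  }
  let max = 0;
  for (const line of lines.slice(1)) {
    const value = line.split("^")[index] || "";
    max = Math.max(max, value.length);
  }
  return max;
}

// Hypothetical helper: turn "<field> text" into "<field> varchar(n)" inside
// the CREATE TABLE statement of the given table.
function narrowTextColumn(
  sql: string,
  table: string,
  field: string,
  length: number
): string {
  return sql.replace(
    new RegExp(`(CREATE TABLE ${table}[^;]+ ${field} )text`),
    `$1varchar(${length})`
  );
}

// Made-up sample data; the real inducks_person.isv has more columns.
const sampleIsv = "personcode^fullname\nCB^Carl Barks\nDR^Don Rosa\n";
const sampleDdl =
  "CREATE TABLE inducks_person (personcode varchar(79), fullname text);";

const longest = maxFieldLength(sampleIsv, "fullname");
console.log(narrowTextColumn(sampleDdl, "inducks_person", "fullname", longest));
```

Streaming the file through `csv-parse`, as the commit does, avoids loading it fully into memory. The in-memory version here only serves to show the calculation.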
140 changes: 140 additions & 0 deletions apps/coa-updater/pnpm-lock.yaml
