Migrate scans to have their own mongo collection #1915

Open
wants to merge 21 commits into base: main
Changes from 8 commits
Commits
90d8a5e
BAI-1627 add initial Scan mongodb type and create Docker digest get s…
PE39806 Feb 19, 2025
f79acd9
BAI-1627 migrate backend to use new Scan Document rather than File.av…
PE39806 Feb 20, 2025
5110ff0
BAI-1627 add filescan to getRelease
PE39806 Feb 21, 2025
f97ee02
BAI-1627 update backend tests for new Scan Document
PE39806 Feb 21, 2025
26a9800
BAI-1627 fix bad test by casting
PE39806 Feb 21, 2025
48e0275
BAI-1627 remove unnecessary test cast
PE39806 Feb 21, 2025
b085401
BAI-1627 rework backend to pass file scan results within file objects
PE39806 Feb 21, 2025
887208c
BAI-1627 allow File.avScan to be an empty array and fix id check
PE39806 Feb 24, 2025
db43754
BAI-1627 apply review comments to remove File.avScan property, consol…
PE39806 Feb 25, 2025
36ac211
BAI-1627 fix legacy migrations
PE39806 Feb 25, 2025
78a4445
BAI-1627 fix getFilesByIds aggregate fields
PE39806 Feb 25, 2025
8a42f41
BAI-1627 improve frontend AvScanResult type and remove unnecessary co…
PE39806 Feb 25, 2025
4c0e2a5
BAI-1627 correct artefactKind typing
PE39806 Feb 25, 2025
2c7eb78
BAI-1627 rework scan and file typing, fix tests and re-add scans to r…
PE39806 Mar 5, 2025
19ed985
BAI-1567 fix mongoose _id typing
PE39806 Mar 5, 2025
35b4a1e
BAI-1627 fix broken file tests due to ObjectID
PE39806 Mar 5, 2025
004b6af
BAI-1627 retype FileWithScanResultsInterface and propagate changes, t…
PE39806 Mar 6, 2025
a72a2c1
BAI-1627 fix bad file and scan aggregate pipelines
PE39806 Mar 6, 2025
9c2c0d1
BAI-1627 fix backend build with FileWithScanResultsInterface typing
PE39806 Mar 6, 2025
73d2bcc
Merge remote-tracking branch 'origin/main' into feature/BAI-1627-migr…
PE39806 Mar 6, 2025
60a39eb
BAI-1627 fix uploadFile file spread syntax bad typing
PE39806 Mar 7, 2025
35 changes: 35 additions & 0 deletions backend/src/migrations/015_migrate_avscan_to_own_model.ts
@@ -0,0 +1,35 @@
+import FileModel from '../models/File.js'
+import ScanModel, { ArtefactType } from '../models/Scan.js'
+
+export async function up() {
+  // convert avScan from being stored in File to a new Scan Document
+  const files = await FileModel.find({ avScan: { $exists: true } })
+  for (const file of files) {
+    // TS linting fix - we already check for existing so this should never happen
+    if (file.avScan === undefined) {
+      continue
+    }
+    for (const avResult of file.avScan) {
+      // create new Scan Document
+      const newScan = new ScanModel({
+        artefactType: ArtefactType.File,
+        fileId: file._id,
+        toolName: avResult.toolName,
+        scannerVersion: avResult.scannerVersion,
+        state: avResult.state,
+        isInfected: avResult.isInfected,
+        viruses: avResult.viruses,
+        lastRunAt: avResult.lastRunAt,
+        createdAt: file.createdAt,
+        updatedAt: file.updatedAt,
+      })
+      await newScan.save()
+    }
+  }
+  // remove all old avScan fields
+  await FileModel.updateMany({ avScan: { $exists: true } }, { $unset: { avScan: 1 } })
+}
+
+export async function down() {
+  /* NOOP */
+}
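Note on the migration above: the inner loop copies each embedded `avScan` entry into its own Scan document. A minimal sketch of that mapping as a pure function, which could be unit tested without a database, might look like the following. The `LegacyAvScan`, `LegacyFile` and `ScanPayload` types and the `toScanPayloads` name are hypothetical, trimmed to the fields the migration copies; they are not part of this PR.

```typescript
// Hypothetical shapes, limited to the fields the migration copies.
interface LegacyAvScan {
  toolName: string
  scannerVersion?: string
  state: string
  isInfected?: boolean
  viruses?: string[]
  lastRunAt: Date
}

interface LegacyFile {
  _id: string
  avScan?: LegacyAvScan[]
  createdAt: Date
  updatedAt: Date
}

interface ScanPayload extends LegacyAvScan {
  artefactType: 'file'
  fileId: string
  createdAt: Date
  updatedAt: Date
}

// One payload per embedded scan result; a file without avScan yields none.
function toScanPayloads(file: LegacyFile): ScanPayload[] {
  return (file.avScan ?? []).map((avResult) => ({
    artefactType: 'file' as const,
    fileId: file._id,
    toolName: avResult.toolName,
    scannerVersion: avResult.scannerVersion,
    state: avResult.state,
    isInfected: avResult.isInfected,
    viruses: avResult.viruses,
    lastRunAt: avResult.lastRunAt,
    createdAt: file.createdAt,
    updatedAt: file.updatedAt,
  }))
}
```

Keeping the mapping separate from the database writes would let a test cover the field copying on its own.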
25 changes: 16 additions & 9 deletions backend/src/models/File.ts
@@ -41,16 +41,23 @@ const FileSchema = new Schema<FileInterfaceDoc>(
     bucket: { type: String, required: true },
     path: { type: String, required: true },

-    avScan: [
-      {
-        toolName: { type: String },
-        scannerVersion: { type: String },
-        state: { type: String, enum: Object.values(ScanState) },
-        isInfected: { type: Boolean },
-        viruses: [{ type: String }],
-        lastRunAt: { type: Schema.Types.Date },
+    avScan: {
+      type: [
+        {
+          toolName: { type: String },
+          scannerVersion: { type: String },
+          state: { type: String, enum: Object.values(ScanState) },
+          isInfected: { type: Boolean },
+          viruses: [{ type: String }],
+          lastRunAt: { type: Schema.Types.Date },
+        },
+      ],
+      required: false,
+      // prevent legacy field from being used
+      validate: function (val: any): boolean {
+        return val === undefined || val.length === 0
       },
-    ],
+    },

     complete: { type: Boolean, default: false },
   },
110 changes: 110 additions & 0 deletions backend/src/models/Scan.ts
@@ -0,0 +1,110 @@
+import { model, ObjectId, Schema } from 'mongoose'
+import MongooseDelete, { SoftDeleteDocument } from 'mongoose-delete'
+
+import { ScanState, ScanStateKeys } from '../connectors/fileScanning/Base.js'
+
+// This interface stores information about the properties on the base object.
+// It should be used for plain object representations, e.g. for sending to the
+// client.
+export interface ScanInterface {
+  _id: ObjectId
+
+  // file or image
+  artefactType: ArtefactTypeKeys
+  // for files
+  fileId?: string
+  // for images
+  // map this to registry GET /v2/<name>/manifests/<reference>
+  repositoryName?: string
+  imageDigest?: string
+
+  toolName: string
+  scannerVersion?: string
+  state: ScanStateKeys
+  isInfected?: boolean
+  viruses?: string[]
+  lastRunAt: Date
+
+  createdAt: Date
+  updatedAt: Date
+}
+
+export const ArtefactType = {
+  File: 'file',
+  Image: 'image',
+} as const
+export type ArtefactTypeKeys = (typeof ArtefactType)[keyof typeof ArtefactType]
+
+// The doc type includes all values in the plain interface, as well as all the
+// properties and functions that Mongoose provides. If a function takes in an
+// object from Mongoose it should use this interface
+export type ScanInterfaceDoc = ScanInterface & SoftDeleteDocument
+
+const ScanSchema = new Schema<ScanInterfaceDoc>(
+  {
+    artefactType: { type: String, enum: Object.values(ArtefactType), required: true },
+    fileId: {
+      type: String,
+      required: function (): boolean {
+        return this['artefactType'] === ArtefactType.File
+      },
+      validate: function (val: any): boolean {
+        if (this['artefactType'] === ArtefactType.File && val) {
+          return true
+        }
+        throw new Error(`Cannot provide a 'fileId' with '${JSON.stringify({ artefactType: this['artefactType'] })}'`)
+      },
+    },
+    imageDigest: {
+      type: String,
+      required: function (): boolean {
+        return this['artefactType'] === ArtefactType.Image
+      },
+      validate: function (val: any): boolean {
+        if (this['artefactType'] === ArtefactType.Image && val) {
+          return true
+        }
+        throw new Error(
+          `Cannot provide an 'imageDigest' with '${JSON.stringify({ artefactType: this['artefactType'] })}'`,
+        )
+      },
+    },
+    repositoryName: {
+      type: String,
+      required: function (): boolean {
+        return this['artefactType'] === ArtefactType.Image
+      },
+      validate: function (val: any): boolean {
+        if (this['artefactType'] === ArtefactType.Image && val) {
+          return true
+        }
+        throw new Error(
+          `Cannot provide a 'repositoryName' with '${JSON.stringify({ artefactType: this['artefactType'] })}'`,
+        )
+      },
+    },
+
+    toolName: { type: String, required: true },
+    scannerVersion: { type: String },
+    state: { type: String, enum: Object.values(ScanState), required: true },
+    isInfected: { type: Boolean },
+    viruses: [{ type: String }],
+    lastRunAt: { type: Schema.Types.Date, required: true },
+  },
+  {
+    timestamps: true,
+    collection: 'v2_scans',
+    toJSON: { getters: true },
+  },
+)
+
+ScanSchema.plugin(MongooseDelete, {
+  overrideMethods: 'all',
+  deletedBy: true,
+  deletedByType: Schema.Types.ObjectId,
+  deletedAt: true,
+})
+
+const ScanModel = model<ScanInterfaceDoc>('v2_Scan', ScanSchema)
+
+export default ScanModel
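Note on the schema above: the `required` and `validate` functions together encode one rule, namely that a file scan carries `fileId` only and an image scan carries `repositoryName` and `imageDigest` only. A rough sketch of that rule as a standalone check, independent of Mongoose, is below. `ScanTarget` and `checkScanTarget` are hypothetical names used for illustration and do not appear in this PR.

```typescript
type ArtefactKind = 'file' | 'image'

// Hypothetical plain-object view of the identifying fields on a scan.
interface ScanTarget {
  artefactType: ArtefactKind
  fileId?: string
  repositoryName?: string
  imageDigest?: string
}

// Returns a list of problems; an empty list means the target is consistent.
function checkScanTarget(target: ScanTarget): string[] {
  const problems: string[] = []
  if (target.artefactType === 'file') {
    if (!target.fileId) problems.push("'fileId' is required for a file scan")
    if (target.repositoryName) problems.push("'repositoryName' is not allowed on a file scan")
    if (target.imageDigest) problems.push("'imageDigest' is not allowed on a file scan")
  } else {
    if (!target.repositoryName) problems.push("'repositoryName' is required for an image scan")
    if (!target.imageDigest) problems.push("'imageDigest' is required for an image scan")
    if (target.fileId) problems.push("'fileId' is not allowed on an image scan")
  }
  return problems
}
```

Collecting all problems in one pass, rather than throwing on the first, is a design choice of this sketch and differs from the per-field validators in the schema.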
64 changes: 64 additions & 0 deletions backend/src/scripts/listRegistryDigests.ts
@@ -0,0 +1,64 @@
+import fetch from 'node-fetch'
+
+import { getAccessToken } from '../routes/v1/registryAuth.js'
+import { getHttpsAgent } from '../services/http.js'
+import log from '../services/log.js'
+import config from '../utils/config.js'
+import { connectToMongoose, disconnectFromMongoose } from '../utils/database.js'
+
+const httpsAgent = getHttpsAgent({
+  rejectUnauthorized: !config.registry.insecure,
+})
+
+async function script() {
+  await connectToMongoose()
+
+  const registry = `https://localhost:5000/v2`
+
+  const token = await getAccessToken({ dn: 'user' }, [{ type: 'registry', class: '', name: 'catalog', actions: ['*'] }])
+
+  const authorisation = `Bearer ${token}`
+
+  const catalog = (await fetch(`${registry}/_catalog`, {
+    headers: {
+      Authorization: authorisation,
+    },
+    agent: httpsAgent,
+  }).then((res) => res.json())) as object
+
+  await Promise.all(
+    catalog['repositories'].map(async (repositoryName) => {
+      const repositoryToken = await getAccessToken({ dn: 'user' }, [
+        { type: 'repository', class: '', name: repositoryName, actions: ['*'] },
+      ])
+      const repositoryAuthorisation = `Bearer ${repositoryToken}`
+
+      const repositoryTags = (await fetch(`${registry}/${repositoryName}/tags/list`, {
+        headers: {
+          Authorization: repositoryAuthorisation,
+        },
+        agent: httpsAgent,
+      }).then((res) => res.json())) as object
+
+      await Promise.all(
+        repositoryTags['tags'].map(async (tag) => {
+          const repositoryDigest = await fetch(`${registry}/${repositoryName}/manifests/${tag}`, {
+            headers: {
+              Authorization: repositoryAuthorisation,
+              Accept: 'application/vnd.docker.distribution.manifest.v2+json',
+            },
+            agent: httpsAgent,
+          }).then((res) => {
+            return res.headers.get('docker-content-digest')
+          })
+
+          log.info({ repositoryName: repositoryName, tag: tag, digest: repositoryDigest }, 'Digest')
+        }),
+      )
+    }),
+  )
+
+  setTimeout(disconnectFromMongoose, 50)
+}
+
+script()
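Note on the script above: the manifest lookup builds the same URL and header set for every tag. A small sketch of that request construction pulled out as a pure helper is below, so the URL shape and `Accept` header can be checked without a running registry. `ManifestRequest` and `buildManifestRequest` are hypothetical names, not part of this PR; the URL pattern and header values are copied from the script.

```typescript
// Hypothetical description of the manifest request the script issues per tag.
interface ManifestRequest {
  url: string
  headers: Record<string, string>
}

// Builds the URL and headers for GET /v2/<name>/manifests/<reference>.
function buildManifestRequest(
  registry: string,
  repositoryName: string,
  reference: string,
  token: string,
): ManifestRequest {
  return {
    url: `${registry}/${repositoryName}/manifests/${reference}`,
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: 'application/vnd.docker.distribution.manifest.v2+json',
    },
  }
}
```

The script would then pass `url` and `headers` to `fetch` alongside its `agent` option and read `docker-content-digest` from the response headers as it does now.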
35 changes: 21 additions & 14 deletions backend/src/services/file.ts
@@ -7,6 +7,7 @@ import authorisation from '../connectors/authorisation/index.js'
 import { FileScanResult, ScanState } from '../connectors/fileScanning/Base.js'
 import scanners from '../connectors/fileScanning/index.js'
 import FileModel, { FileInterface, FileInterfaceDoc } from '../models/File.js'
+import ScanModel, { ArtefactType } from '../models/Scan.js'
 import { UserInterface } from '../models/User.js'
 import config from '../utils/config.js'
 import { BadReq, Forbidden, NotFound } from '../utils/error.js'
@@ -29,7 +30,6 @@ export function isFileInterfaceDoc(data: unknown): data is FileInterfaceDoc {
     !('bucket' in data) ||
     !('path' in data) ||
     !('complete' in data) ||
-    !('avScan' in data) ||
     !('deleted' in data) ||
     !('createdAt' in data) ||
     !('updatedAt' in data) ||
@@ -81,19 +81,18 @@ export async function uploadFile(user: UserInterface, modelId: string, name: str

 async function updateFileWithResults(_id: Schema.Types.ObjectId, results: FileScanResult[]) {
   for (const result of results) {
-    const updateExistingResult = await FileModel.updateOne(
-      { _id, 'avScan.toolName': result.toolName },
+    const updateExistingResult = await ScanModel.updateOne(
+      { fileId: _id, toolName: result.toolName },
       {
-        $set: { 'avScan.$': { ...result } },
+        $set: { ...result },
       },
     )
     if (updateExistingResult.modifiedCount === 0) {
-      await FileModel.updateOne(
-        { _id, avScan: { $exists: true } },
-        {
-          $push: { avScan: { toolName: result.toolName, state: result.state, lastRunAt: new Date() } },
-        },
-      )
+      await ScanModel.create({
+        artefactType: ArtefactType.File,
+        fileId: _id,
+        ...result,
+      })
     }
   }
 }
@@ -148,8 +147,15 @@ export async function getFilesByIds(user: UserInterface, modelId: string, fileId
     throw NotFound(`The requested files were not found.`, { fileIds: notFoundFileIds })
   }

+  const fileAvScans = await ScanModel.find({ fileId: { $in: fileIds } })
+  const filesWithAvScans = files.map((file) => {
+    const relevantAvScans = fileAvScans.filter((scan, _) => scan.fileId === file._id.toString())
+    file.avScan = (file.avScan || []).concat(relevantAvScans)
+    return file
+  })
+
   const auths = await authorisation.files(user, model, files, FileAction.View)
-  return files.filter((_, i) => auths[i].success)
+  return filesWithAvScans.filter((_, i) => auths[i].success)
 }

 export async function removeFile(user: UserInterface, modelId: string, fileId: string) {
@@ -196,13 +202,14 @@ export async function markFileAsCompleteAfterImport(path: string) {
   }
 }

-function fileScanDelay(file: FileInterface): number {
+async function fileScanDelay(file: FileInterface): Promise<number> {
   const delay = config.connectors.fileScanners.retryDelayInMinutes
   if (delay === undefined) {
     return 0
   }
   let minutesBeforeRetrying = 0
-  for (const scanResult of file.avScan) {
+  const fileAvScans = await ScanModel.find({ fileId: file._id })
+  for (const scanResult of fileAvScans) {
     const delayInMilliseconds = delay * 60000
     const scanTimeAtLimit = scanResult.lastRunAt.getTime() + delayInMilliseconds
     if (scanTimeAtLimit > new Date().getTime()) {
@@ -229,7 +236,7 @@ export async function rerunFileScan(user: UserInterface, modelId, fileId: string
   if (!file.size || file.size === 0) {
     throw BadReq('Cannot run scan on an empty file')
   }
-  const minutesBeforeRescanning = fileScanDelay(file)
+  const minutesBeforeRescanning = await fileScanDelay(file)
   if (minutesBeforeRescanning > 0) {
     throw BadReq(`Please wait ${plural(minutesBeforeRescanning, 'minute')} before attempting a rescan ${file.name}`)
   }
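Note on `getFilesByIds` in this file: the change attaches scans to files by filtering the full scan list once per file. A possible alternative, sketched below under the assumption that files and scans are plain objects rather than Mongoose documents, groups the scans by `fileId` in a `Map` first so each file is a single lookup. `FileLike`, `ScanLike` and `attachScans` are hypothetical names and are not part of this PR.

```typescript
// Hypothetical minimal shapes; real files and scans carry more fields.
interface ScanLike {
  fileId: string
  toolName: string
}

interface FileLike {
  _id: { toString(): string }
  name: string
}

// Groups scans by fileId once, then attaches each group to its file.
function attachScans<F extends FileLike, S extends ScanLike>(files: F[], scans: S[]): Array<F & { avScan: S[] }> {
  const scansByFileId = new Map<string, S[]>()
  for (const scan of scans) {
    const bucket = scansByFileId.get(scan.fileId) ?? []
    bucket.push(scan)
    scansByFileId.set(scan.fileId, bucket)
  }
  // Files with no matching scans get an empty array, not undefined.
  return files.map((file) => ({ ...file, avScan: scansByFileId.get(file._id.toString()) ?? [] }))
}
```

The sketch returns new objects instead of mutating `file.avScan`. With Mongoose documents the spread would likely need a plain-object conversion first.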
2 changes: 1 addition & 1 deletion backend/src/services/mirroredModel.ts
@@ -605,7 +605,7 @@ async function checkReleaseFiles(user: UserInterface, modelId: string, semvers:
     failedScan: Array<{ name: string; id: string }>
   } = { missingScan: [], incompleteScan: [], failedScan: [] }
   for (const file of files) {
-    if (!file.avScan) {
+    if (!file.avScan || file.avScan.length === 0) {
       scanErrors.missingScan.push({ name: file.name, id: file.id })
     } else if (file.avScan.some((scanResult) => scanResult.state !== ScanState.Complete)) {
       scanErrors.incompleteScan.push({ name: file.name, id: file.id })
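Note on the condition change above: an empty array is truthy in JavaScript, so `!file.avScan` alone would not flag a file whose scan list is present but empty. A tiny sketch of the combined check is below; `hasNoScans` is a hypothetical helper name, not something this PR adds.

```typescript
// True when the scan list is absent or present but empty.
function hasNoScans(avScan?: unknown[]): boolean {
  return !avScan || avScan.length === 0
}
```

Without the length check, a file with `avScan: []` would fall through to the `some(...)` branch, which returns false for an empty array, so the file would not be reported under either of the two categories shown.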
27 changes: 27 additions & 0 deletions backend/test/services/__snapshots__/file.spec.ts.snap
@@ -11,6 +11,33 @@ exports[`services > file > downloadFile > success 1`] = `
 exports[`services > file > getFilesByIds > success 1`] = `
 [
   {
+    "avScan": [],
+    "example": "file",
+  },
+]
+`;
+
+exports[`services > file > getFilesByIds > success with scans mapped 1`] = `
+[
+  {
+    "_id": "123",
+    "avScan": [
+      {
+        "fileId": "123",
+      },
+      {
+        "fileId": "123",
+      },
+    ],
+    "example": "file",
+  },
+  {
+    "_id": "321",
+    "avScan": [
+      {
+        "fileId": "321",
+      },
+    ],
     "example": "file",
   },
 ]