Scores are possibly off #9

giuliaguidi · 2020-02-24T20:12:11Z

Hi,

I ran libgaba using a linear gap penalty and regardless of the sequences similarity and scoring matrix, the scores are way off with respect to the ground truth.

Could you please let me know if I'm doing something wrong?

Here's the program I'm using:

int gabaAlign(int mat, int mis, int gap, int k, int xdrop, 
    std::string& targetSeg, std::string& querySeg)
{
gaba_t *ctx = gaba_init(GABA_PARAMS(
		// match award, mismatch penalty, gap open penalty (G_i), and gap extension penalty (G_e)
		GABA_SCORE_SIMPLE(mat, -mis, 0, -gap),
		gfa : 0,
		gfb : 0,
		xdrop : (int8_t)xdrop,
		filter_thresh : 0,
	));

	char const t[64] = {0};	// tail array
	gaba_section_t asec = gaba_build_section(0, (uint8_t const *)targetSeg.c_str(), (uint32_t)targetSeg.size());
	gaba_section_t bsec = gaba_build_section(2, (uint8_t const *)querySeg.c_str(),  (uint32_t)querySeg.size());
	gaba_section_t tail = gaba_build_section(4, t, 64);

	// create thread-local object
	gaba_dp_t *dp = gaba_dp_init(ctx);	// dp[0] holds a 64-cell-wide context

	// init section pointers
	gaba_section_t const *ap = &asec, *bp = &bsec;
	gaba_fill_t const *f = gaba_dp_fill_root(dp, // dp -> &dp[_dp_ctx_index(band_width)] makes the band width selectable
		ap, 0,					// a-side (reference side) sequence and start position
		bp, 0,					// b-side (query)
		UINT32_MAX				// max extension length
	);

	// until x-drop condition is detected
	gaba_fill_t const *m = f;

	while((f->status & GABA_TERM) == 0) {
		// substitute the pointer by the tail section's if it reached the end
		if(f->status & GABA_UPDATE_A) { ap = &tail; }
		if(f->status & GABA_UPDATE_B) { bp = &tail; }

		f = gaba_dp_fill(dp, f, ap, bp, UINT32_MAX);	// extend the banded matrix
		m = f->max > m->max ? f : m;					// swap if maximum score was updated
	}

	// alignment path
	gaba_alignment_t *r = gaba_dp_trace(dp,
		m,		// section with the max
		NULL	// custom allocator: see struct gaba_alloc_s in gaba.h
	);

    int score = r->score;
    
	// clean up
	gaba_dp_res_free(dp, r); gaba_dp_clean(dp);
	gaba_clean(ctx);    

    return score;
}

For example, using seq1.seq and seq2.seq as input sequences, x = 50, match = 1, mismatch = -1, and gap = -1, the score should be 3105, while libgaba return 4197.

You can find the benchmark code here: https://github.com/giuliaguidi/xavier/tree/master/benchmark. It can be compiled simply via:

make

and run as:

./test-score-exe <x-drop> <seq1-file> <seq2-file>

The ground truth can be generated using PyAligner and running it in the https://github.com/giuliaguidi/xavier/tree/master/benchmark folder as:

python3 run_pyaligner.py <seq1-file> <seq2-file> -x <x-drop>

Thank you very much!

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scores are possibly off #9

Scores are possibly off #9

giuliaguidi commented Feb 24, 2020

Scores are possibly off #9

Scores are possibly off #9

Comments

giuliaguidi commented Feb 24, 2020