Home
This wiki is under construction and will be updated. Some of the answers might not be complete yet.
GenMap offers two algorithms for suffix array construction (needed for building the FM index) with different resource requirements and running times. Both algorithms build exactly the same index.
In terms of running time, Skew7 (-A skew) performs much better on repetitive data, but most of the algorithm is not parallelized. Radixsort (-A radix), however, is fully parallelized but is not recommended for repetitive data.
In terms of space consumption, Skew7 uses large amounts of secondary memory (disk space) in your TMP directory, whereas Radixsort uses large amounts of main memory. As long as you have enough secondary memory (approx. 20 times the size of the input FASTA file), we recommend using Skew7.
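To make the rule of thumb above concrete, here is a minimal sketch that checks whether a directory has enough free space before you choose Skew7. It is not part of GenMap: the helper names `estimatedTmpBytes` and `tmpHasRoomFor` are hypothetical, the factor 20 is only the approximate figure quoted above, and C++17 `std::filesystem` is assumed.

```cpp
#include <cstdint>
#include <filesystem>

namespace fs = std::filesystem;

// Approximate factor taken from the text above (not an exact requirement).
constexpr std::uintmax_t skewTmpFactor = 20;

// Hypothetical helper: estimated temporary disk space in bytes
// for a FASTA file of the given size.
std::uintmax_t estimatedTmpBytes(std::uintmax_t fastaBytes)
{
    return fastaBytes * skewTmpFactor;
}

// Hypothetical helper: true if tmpDir currently has at least the estimated
// amount of free space. Throws fs::filesystem_error if a path is invalid.
bool tmpHasRoomFor(fs::path const & fastaPath, fs::path const & tmpDir)
{
    std::uintmax_t const needed = estimatedTmpBytes(fs::file_size(fastaPath));
    return fs::space(tmpDir).available >= needed;
}
```

For example, `tmpHasRoomFor("genome.fasta", fs::temp_directory_path())` would compare the estimate against the free space of the system's default temporary directory; the file name is only a placeholder.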
GenMap can only handle nucleotide sequences (A, C, G, T/U, N). If you load files including other letters (such as IUPAC notation for ambiguous bases), GenMap will print a warning and convert them to N.
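The conversion described above can be pictured with the following sketch. It only illustrates the rule and is not GenMap's actual code: the helper names are hypothetical, and treating lowercase letters like uppercase ones is an assumption made here for simplicity.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: keeps A, C, G, T, U and N, maps every other
// character (e.g. IUPAC ambiguity codes such as R or Y) to 'N'.
// Assumption: lowercase input is folded to uppercase first.
char toSupportedBase(char c)
{
    char const upper = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    switch (upper)
    {
        case 'A': case 'C': case 'G': case 'T': case 'U': case 'N':
            return upper;
        default:
            return 'N';
    }
}

// Applies toSupportedBase to every character of the sequence.
std::string normalizeSequence(std::string seq)
{
    for (char & c : seq)
        c = toSupportedBase(c);
    return seq;
}
```

Running your sequences through a check like this beforehand shows how many positions would be turned into N.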
If you want to load the raw output with the frequency or mappability vector into your program, you can use the following snippet:
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
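template <typename value_t>
void load(std::vector<value_t> & vec, std::string const & path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
    {
        std::cerr << "Could not open " << path << '\n';
        return;
    }
    // determine the file size to know how many values to read
    file.seekg(0, std::ios_base::end);
    std::streamsize const fileSize = file.tellg();
    file.seekg(0, std::ios_base::beg);
    if (fileSize < 0)
    {
        std::cerr << "Could not determine the size of " << path << '\n';
        return;
    }
    // the file is read as a raw array of value_t in one go
    vec.resize(static_cast<std::size_t>(fileSize) / sizeof(value_t));
    file.read(reinterpret_cast<char *>(vec.data()),
              static_cast<std::streamsize>(vec.size() * sizeof(value_t)));
}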
int main(int argc, char ** argv)
{
    // load mappability vector
    std::vector<float> mappability;
    load(mappability, "c.map");
    // print mappability vector
    std::copy(mappability.begin(), mappability.end(), std::ostream_iterator<float>(std::cout, " "));
    std::cout << '\n';
    // load frequency vector (for freq16 please use uint16_t)
    std::vector<uint8_t> frequency;
    load(frequency, "c.freq8");
    // print frequency vector
    std::copy(frequency.begin(), frequency.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
    return 0;
}
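The snippet above divides the file size by `sizeof(value_t)`, which silently rounds down if the element type does not match the file (for example, reading an odd-sized freq8 file as `uint16_t`). A small guard can catch that case. This is a sketch under the same assumption the snippet makes, namely that the file is a flat array of fixed-size values; the helper name `elementCount` is hypothetical and C++17 is assumed for `std::optional`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper: number of whole value_t elements in a raw file of
// fileBytes bytes, or std::nullopt if the size is not a multiple of
// sizeof(value_t), which suggests the wrong element type was chosen.
template <typename value_t>
std::optional<std::size_t> elementCount(std::uintmax_t fileBytes)
{
    if (fileBytes % sizeof(value_t) != 0)
        return std::nullopt;
    return static_cast<std::size_t>(fileBytes / sizeof(value_t));
}
```

Note that this check cannot detect every mismatch: any file size is a multiple of `sizeof(uint8_t)`, so a freq16 file read as `uint8_t` would still pass.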