Study note

Data structure

Array

Usage in the following cases:
1, The number of data items is known
2, Access randomly by index. O(1)
Extensions:
1, ArrayList: dynamic resizing array while still having O(1) access
2, StringBuffer: a structure which has the end pointer of a string

Linked list

Usage in the following cases:
1, The number of data items is unknown
2, Access time > O(1) because the pointer is only for one direction
Techniques:
1, Sentinels: a dummy object in the head or tail of the linked list. Purpose: clarity of the code
2, Runner

Queues

Usage in the following cases:
1, Breadth first search (BPF): shortest path, least number of moves
Extensions:
1, Priority queue: not pop oldest, but element with highest value; can be implemented by heap (e.g., always do max heap when push, and put the max value to the front;)
solution1: O(nlogn) + O(1)

push: heapsort each time[O(nlogn)]; // or other sort algorithm
pop: pop_back each time[O(1)]

solution2: O(n) + O (n)

push: buildheap [O (n)]
pop: swap(begin, end)(O(1)) + pop_back (O(1)) + buildheap (O(n)) [total:O (n) ];

Operations: enqueue, dequeue
Implementation: FIFO + linked list (if queue length is unknown), FIFO + array (if queue length is known)

Stacks

Usage in the following cases:
1, Recursive function: similar to per-ordered traversal
2, Depth first search
Operations: pop, push, peek
Implementation: LIFO + linked list

Heap

an array/a vector, with complete binary search tree feature. maxheapify->buildheap->heapsort.

Trees

Concept: root node and multiple child nodes
Extensions:
1, Binary tree: one root node has two child nodes
2, Binary search tree: binary tree that left < current < right key
3, Balanced binary search tree:

Red-black tree： using colour, height at most 2*lg(n+1); balance: insert, re-colour + rotation;
AVL tree

4, B-Tree: generalized binary search tree which can have more than two children
6, Tries
Operations: traversal (per-ordered, in-ordered, post-ordered), insert, search, delete

Graph

Operations:
1, DFS: per-ordered traversal
3, BFS: queue

Hash table

Concept: hash table = table + hash function + collision process;
Implementation(c++):
1, table:

array/vector: access time: O (1); e.g., std::vector<int, std::list > // use list as a chaining;
BST: access time O (logN)

2, hash function: index = f (key), key is nature number. If not, then need to map keys to nature numbers.

division: index = key mod m // m better uses a prime not too close to an exact power of 2， in order to low the collision possibility）
multiplication: index = floor (m * (keyA - floor （keyA）) ) // m = 2^p, A = (sqrt(5)-1)/2 = 0.6180339887 from "the art of computer programming"
djb2: in my test, the index is much larger than others
sdbm: in my test, the index is much larger than others
lose base

3, collision:

chaining(linked list, self-balanced tree): search table index (O(1)), then search chaining
open addressing: search table index (O(1)), then do probe sequence
NOTE:
chaining = open hashing, closed addressing
open addressing = closed hashing
Performance:
Open hashing is not good in high load factor (>0.7) due to clustering. Hence, the hash function must distribute the keys more uniformly over the buckets, and void clustering.

4, dynamically increase/decrease space for hash table

Operations:
1, insert, search, delete
Other concepts: load factor: the number of entries / the number of buckets

Basic algorithm

Bit operations

SetBit, GetBit, ClearBit

Sorting

Bubble sort
Selection sort
Merge sort: divide and conquer, all casAes O(NlogN)
Heap sort:
Quick sort: divide and conquer, worst case O(N^2), average/bes tO(NlgN)
Radix sort: example c++ code

Search

Binary search
Hash table

Skills

Start
you should consider following aspects:

write input and output in the comments at the beginning
clarify conditions, such as double or single linked list? ascii or unicode string? integer or float?
define interfaces and return types of function
ask interviewer can I write the logic idea first and write the codes later?
ask interviewer can I write the methods first and write the main function later?
List rules, special cases and main codes for general cases
For general cases, consider brute force solution first
write main function for testing: write test codes

Note:

special cases: 0, null, negative, max, min
always do the legally check for the input!
do not do assumption by yoursefl
modular coding

String

ASCII string （0 - 255, totally 256） or Unicode string （much much much more!）
compare strings: check comparison is 1) case sensitive? e.g., A and a are diff or same? 2) is white space sensitive? e.g., "a " and "a" are same or diff?
count the number of symbol in a string: a[key] or hash table[key] count method, or bit vector(only check duplicate)

bit vector: int checker = 0
for: each key value
////if ( checker & (1 << key) > 0 )
//////{
////////// the key-th position in the bit vector checker is already 1. Therefore, this is a duplicate key, do something here
//////}
////else
//////add the key-th position in bit vector to 1
//////checker = checker | (1<<key);

Linked list

Check: double vs single, head vs head+tail pointers
Runner technique
///slow = head;
///fast = head;
///while (fast != NULL && fast->next != NULL)
/////{
///////slow = slow->next;
///////fast = fast->next->next;
/////}
////
////if (fast != NULL)
//////{
//////// // there are odd number of nodes in the linked list
//////}

Sort
you should familiar with both space and time complexity in worst, average, best cases.
Stack and queue

Move one node from stack s to stack r: r.push (s.top()); s.pop(); // because void pop ()

Other tips

"assume efficient space" usually means start from the end of string/vector/...
In many cases, O(N^2) is acceptable.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly