Skip to content
Dizhi Zhou edited this page Mar 15, 2014 · 47 revisions

Data structure

Array

Usage in the following cases:
1, The number of data items is known
2, Access randomly by index. O(1)
Extensions:
1, ArrayList: dynamic resizing array while still having O(1) access
2, StringBuffer: a structure which has the end pointer of a string

Linked list

Usage in the following cases:
1, The number of data items is unknown
2, Access time > O(1) because the pointer is only for one direction
Techniques:
1, Sentinels: a dummy object in the head or tail of the linked list. Purpose: clarity of the code
2, Runner

Queues

Usage in the following cases:
1, Breadth first search (BPF): shortest path, least number of moves
Extensions:
1, Priority queue: not pop oldest, but element with highest value; can be implemented by heap (e.g., always do max heap when push, and put the max value to the front;)
solution1: O(nlogn) + O(1)

  • push: heapsort each time[O(nlogn)]; // or other sort algorithm
  • pop: pop_back each time[O(1)]

solution2: O(n) + O (n)

  • push: buildheap [O (n)]
  • pop: swap(begin, end)(O(1)) + pop_back (O(1)) + buildheap (O(n)) [total:O (n) ];

Operations: enqueue, dequeue
Implementation: FIFO + linked list (if queue length is unknown), FIFO + array (if queue length is known)

Stacks

Usage in the following cases:
1, Recursive function: similar to per-ordered traversal
2, Depth first search
Operations: pop, push, peek
Implementation: LIFO + linked list

Heap

an array/a vector, with complete binary search tree feature. maxheapify->buildheap->heapsort.

Trees

Concept: root node and multiple child nodes
Extensions:
1, Binary tree: one root node has two child nodes
2, Binary search tree: binary tree that left < current < right key
3, Balanced binary search tree:

  • Red-black tree: using colour, height at most 2*lg(n+1); balance: insert, re-colour + rotation;
  • AVL tree

4, B-Tree: generalized binary search tree which can have more than two children
6, Tries
Operations: traversal (per-ordered, in-ordered, post-ordered), insert, search, delete

Graph

Operations:
1, DFS: per-ordered traversal
3, BFS: queue

Hash table

Concept: hash table = table + hash function + collision process;
Implementation(c++):
1, table:

  • array/vector: access time: O (1); e.g., std::vector<int, std::list > // use list as a chaining;
  • BST: access time O (logN)

2, hash function: index = f (key), key is nature number. If not, then need to map keys to nature numbers.

  • division: index = key mod m // m better uses a prime not too close to an exact power of 2, in order to low the collision possibility)
  • multiplication: index = floor (m * (keyA - floor (keyA)) ) // m = 2^p, A = (sqrt(5)-1)/2 = 0.6180339887 from "the art of computer programming"
  • djb2: in my test, the index is much larger than others
  • sdbm: in my test, the index is much larger than others
  • lose base

3, collision:

  • chaining(linked list, self-balanced tree): search table index (O(1)), then search chaining
  • open addressing: search table index (O(1)), then do probe sequence
    NOTE:
    chaining = open hashing, closed addressing
    open addressing = closed hashing
    Performance:
    Open hashing is not good in high load factor (>0.7) due to clustering. Hence, the hash function must distribute the keys more uniformly over the buckets, and void clustering.

4, dynamically increase/decrease space for hash table

Operations:
1, insert, search, delete
Other concepts: load factor: the number of entries / the number of buckets

Basic algorithm

Bit operations

SetBit, GetBit, ClearBit

Sorting

Bubble sort
Selection sort
Merge sort: divide and conquer, all casAes O(NlogN)
Heap sort:
Quick sort: divide and conquer, worst case O(N^2), average/bes tO(NlgN)
Radix sort: example c++ code

Search

Binary search
Hash table

Skills

  1. Start
    you should consider following aspects:
  • write input and output in the comments at the beginning
  • clarify conditions, such as double or single linked list? ascii or unicode string? integer or float?
  • define interfaces and return types of function
  • ask interviewer can I write the logic idea first and write the codes later?
  • ask interviewer can I write the methods first and write the main function later?
  • List rules, special cases and main codes for general cases
  • For general cases, consider brute force solution first
  • write main function for testing: write test codes

Note:

  • special cases: 0, null, negative, max, min
  • always do the legally check for the input!
  • do not do assumption by yoursefl
  • modular coding
  1. String
  • ASCII string (0 - 255, totally 256) or Unicode string (much much much more!)
  • compare strings: check comparison is 1) case sensitive? e.g., A and a are diff or same? 2) is white space sensitive? e.g., "a " and "a" are same or diff?
  • count the number of symbol in a string: a[key] or hash table[key] count method, or bit vector(only check duplicate)

bit vector: int checker = 0
for: each key value
////if ( checker & (1 << key) > 0 )
//////{
////////// the key-th position in the bit vector checker is already 1. Therefore, this is a duplicate key, do something here
//////}
////else
//////add the key-th position in bit vector to 1
//////checker = checker | (1<<key);

  1. Linked list
  • Check: double vs single, head vs head+tail pointers
  • Runner technique
    ///slow = head;
    ///fast = head;
    ///while (fast != NULL && fast->next != NULL)
    /////{
    ///////slow = slow->next;
    ///////fast = fast->next->next;
    /////}
    ////
    ////if (fast != NULL)
    //////{
    //////// // there are odd number of nodes in the linked list
    //////}
  1. Sort
    you should familiar with both space and time complexity in worst, average, best cases.

  2. Stack and queue

  • Move one node from stack s to stack r: r.push (s.top()); s.pop(); // because void pop ()
  1. Other tips
  • "assume efficient space" usually means start from the end of string/vector/...
  • In many cases, O(N^2) is acceptable.