dataTypes.txt

==common data types seen in Director/Shockwave formats==
format
{
shortened name(bytesize)
long name
aka
purposes/description
}

Uint32(4):
Unsigned 32-bit Integer
Unsigned Integer, Uint
commonly used for encoding lengths in binary files

Int32(4):
Signed 32-bit Integer
Signed Integer, Integer, Int
like a Uint32, but able to hold both positive and negative values. The first (most
significant) bit acts as a 'sign' bit, that is, it indicates whether the number is negative
or positive. These are used for storing a multitude of values, excluding file/data sizes
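
A quick illustration of the signed/unsigned difference, using Python's standard struct module.
The sample bytes are made up, and big-endian order is assumed purely for the demo:

```python
import struct

raw = bytes([0xFF, 0xFF, 0xFF, 0xFE])  # made-up sample bytes

# ">" = big-endian (an assumption for this example)
# "I" = Uint32, "i" = Int32
as_uint32 = struct.unpack(">I", raw)[0]
as_int32 = struct.unpack(">i", raw)[0]

print(as_uint32)  # the same four bytes read as an unsigned value
print(as_int32)   # ...and read as a signed value
```

The bytes are identical in both cases; only the interpretation of the top bit changes.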

Uint16(2)
Unsigned 16-bit Integer
Unsigned Short, Ushort
half the size of a Uint32, so it can only store a much smaller range of values (0-65,535),
hence being more commonly referred to as an unsigned "short". Used especially for
image dimensions, because most images have a width and height below 65,535

Int16(2)
Signed 16-bit Integer
Signed Short, Short
like an Int32, but two bytes, instead of four.

Uint8(1)
Unsigned 8-bit Integer
Unsigned Byte, Ubyte, Byte
For values that won't exceed the range 0-255
The Shockwave formats use these especially in encoding the length of shorter text strings

String(?)
String
String
Values that store text. These have an arbitrary length, which is defined somewhere before the text,
or even stored completely apart from the actual data.
{modifiers}
Null-terminated: character data followed immediately by a byte 0x00
Length-prefixed: character data prefixed by some number indicating its length
ASCII: text is encoded as ASCII, rather not go into super-specific details... character size: 1 byte
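
A minimal sketch of reading the two string layouts above in Python. The function names are
hypothetical, the sample bytes are invented, and the length prefix is assumed to be a Uint8
(other prefix sizes are possible):

```python
def read_null_terminated(buf, offset):
    """Return (text, offset just past the 0x00 terminator)."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("ascii"), end + 1


def read_length_prefixed(buf, offset):
    """Return (text, offset just past the text); assumes a Uint8 length prefix."""
    length = buf[offset]
    start = offset + 1
    return buf[start:start + length].decode("ascii"), start + length


# Invented sample: one null-terminated string followed by one length-prefixed string
sample = b"Hello\x00" + b"\x05World"

first, pos = read_null_terminated(sample, 0)
second, pos = read_length_prefixed(sample, pos)
```

Both helpers return the next read position, so they can be chained while walking through a buffer.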

Varint(?)
Varying Length Integer
Variable-Length Quantity
a special type of integer that, for all practical purposes, is made up of a group of 7-bit values, one per byte.
The first (highest) bit of each byte signifies whether the next byte is also part of the value. The DCR format uses these almost exclusively
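
A rough decoding sketch in Python. It assumes the 7-bit groups are stored most significant
group first, as in a standard Variable-Length Quantity; that ordering is an assumption here,
and read_varint is a hypothetical name:

```python
def read_varint(buf, offset=0):
    """Return (value, offset just past the last byte of the varint)."""
    value = 0
    while True:
        byte = buf[offset]
        offset += 1
        # Low 7 bits carry data; earlier groups are assumed more significant
        value = (value << 7) | (byte & 0x7F)
        # High bit clear means this was the last byte
        if not (byte & 0x80):
            return value, offset


small = read_varint(bytes([0x7F]))        # fits in one byte
larger = read_varint(bytes([0x82, 0x2C]))  # needs two bytes
```

Values up to 127 take a single byte, which is why this encoding suits files full of small lengths.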

Boolean(1)
Boolean
Bool
A value storing true(1) or false(0). The smallest addressable size in memory or on disk is a byte,
so these always take up an entire byte when in use. When either not in use, or when written to a file,
these can sometimes be stored in Bitfields.

Bitfield(?)
Bitfield
Flags, Bitmask
Effectively, a super-compact array of Booleans
It's not always efficient to store entire booleans to
files, this was especially true of older systems, which
had limited storage space. As the name suggests, each
individual bit is its own value. These are typically read
using bitwise operators.
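
As a small sketch of what "read using bitwise operators" looks like in practice (Python; the
flag names and bit positions are hypothetical and not taken from any actual Shockwave structure):

```python
# Hypothetical flags, chosen only for this example
FLAG_VISIBLE = 0x01  # bit 0
FLAG_LOCKED = 0x02   # bit 1
FLAG_LOOPED = 0x08   # bit 3

flags = 0b00001001  # made-up byte, as it might be read from a file

# Test individual bits with AND
is_visible = bool(flags & FLAG_VISIBLE)
is_locked = bool(flags & FLAG_LOCKED)
is_looped = bool(flags & FLAG_LOOPED)

# Set a bit with OR, clear a bit with AND plus an inverted mask
flags |= FLAG_LOCKED
flags &= ~FLAG_VISIBLE & 0xFF  # keep the result within one byte
```

One byte holds up to eight such flags, versus eight bytes if each were stored as its own Boolean.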

FourCC(4)
Four Character Code
FourCC
A string of exactly four characters, used by IFF and RIFF formats as ID codes for various sections/blocks

Tag
RIFF tag
RIFF tag, RIFF chunk, Chunk, Section
These are used to provide the basic structure of Shockwave files, as Shockwave is a variant of
R.I.F.F. : Resource Interchange File Format
The basic structure is this:
{
	FourCC ID
	Uint32 length
	<length> bytes data
}
the main tag of the entire file also has one extra fourCC in the first four bytes of data:
FourCC formID
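
A minimal sketch of reading one tag with Python's struct module. read_tag is a hypothetical
helper, the sample bytes and the "TEST" form ID are invented, and the byte order is left as a
parameter because it depends on the file:

```python
import struct


def read_tag(buf, offset=0, endian=">"):
    """Return (fourcc, length, data) for the tag starting at offset."""
    fourcc = buf[offset:offset + 4]
    (length,) = struct.unpack_from(endian + "I", buf, offset + 4)
    data = buf[offset + 8:offset + 8 + length]
    return fourcc, length, data


# Invented main tag: 8 bytes of data, the first four being the form ID
sample = b"RIFX" + struct.pack(">I", 8) + b"TEST" + b"\x01\x02\x03\x04"

fourcc, length, data = read_tag(sample)
form_id = data[:4]  # only meaningful for the main tag of the file
```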

DCRTag
DCRTag
DCRTag, DCRchunk, DCRsection
Similar to a RIFF tag, but these use Varints to encode their lengths

UTCDate(4)
Coordinated Universal Time (UTC) Date/Time
Date
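counts time since January 1, 1970 (the Unix epoch); with only 4 bytes the unit is presumably
seconds rather than milliseconds, since 2^32 milliseconds is only about 49.7 days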

GUID
Globally Unique Identifier
UUID, Universally Unique Identifier
16 bytes representing a unique id for an object or codebase
Shockwave/Director makes use of these in XTRAs
{
	// for more info: https://en.wikipedia.org/wiki/Universally_unique_identifier#Format
	Uint32 time_low
	Uint16 time_mid
	Uint16 time_hi_and_version
	Uint8 clock_seq_hi_and_res
	Uint8 clock_seq_low
	Uint48 node //6 bytes
}
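
A sketch of splitting 16 bytes into the fields above, cross-checked against Python's standard
uuid module. The sample bytes are made up, and big-endian field order is assumed; whether
Director stores the first three fields big- or little-endian is not established here:

```python
import struct
import uuid

raw = bytes(range(16))  # made-up 16 bytes: 00 01 02 ... 0f

# ">" = big-endian (assumed); I = Uint32, H = Uint16, B = Uint8
(time_low, time_mid, time_hi_and_version,
 clock_seq_hi_and_res, clock_seq_low) = struct.unpack_from(">IHHBB", raw)
node = int.from_bytes(raw[10:16], "big")  # the 6-byte node field

# The standard library parses the same layout
guid = uuid.UUID(bytes=raw)
text = str(guid)
```

If the first three fields turn out to be little-endian (as in many Windows GUID layouts),
uuid.UUID(bytes_le=raw) handles that variant instead.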

Block
Block
Struct //specifically in languages like C
Some arbitrarily sized and formatted piece of data, generally a composite of other data types, which
is effectively a small format in and of itself. Pretty much any custom format has these in some shape or
form. There's no universal rule on how to lay these out, and it varies largely between developers.

==other stuff==
[<big|little>-end] , [<big|little|lit>]
{
	used to indicate if a value's byte order is big-endian or little-endian
	By default, shockwave tends to use little-endian byte order, including the FourCC
}
[ghost]
{
	The actual data does not exist for this entry, but it's been pointed to for an unknown reason
}
/* */
{
	multi-line comment
	Not all that necessary in documentation, but nonetheless is used to
	keep sidenotes separate from actual information about the format
}
//
{
	single line comment, same purpose as multi-line comment, but only occurs on one line
}
()
{
	primarily used to indicate possible values and meanings of certain data
	indicates a size if appended directly to a name like section(len)
}
@
{
	marks what follows as specific to a certain type of data
}
[loose]
{
	this structure is not concrete and might be split into pieces
}
Byte-Aligned
{
	Something has been forced to always be a multiple of a certain length in bytes
}
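
The usual arithmetic for byte alignment, as a small Python sketch (the helper names are
hypothetical):

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def padding_needed(size, alignment):
    """Number of filler bytes to add after size bytes of data."""
    return align_up(size, alignment) - size


aligned = align_up(9, 4)      # 9 bytes of data in a 4-byte-aligned slot
pad = padding_needed(5, 2)    # odd-length data aligned to 2 bytes
```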

==RIFF/RIFX==
RIFF: http://www.johnloomis.org/cpe102/asgn/asgn1/riff.html
basically, 
(
	FourCC(4) chunkID
	Uint32(4) chunkLength
	<chunkLength> bytes chunkData (plus extra byte if not even length)
)

special chunks (like "RIFF") have one extra field
(
	FourCC(4) chunkID
	Uint32(4) chunkLength
	<chunkLength> bytes chunkData (plus extra byte if not even length), which contains:
	(
		FourCC(4) formType ("WAVE", for example)
		sub-chunks
	)
)
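
A sketch of walking the sub-chunks of a plain RIFF file in Python, including the pad byte
after odd-length data. The file contents here are invented ("DEMO", "abc " and "def " are not
real chunk IDs) and iter_chunks is a hypothetical helper:

```python
import struct


def iter_chunks(buf, start, end, endian="<"):
    """Yield (chunk_id, chunk_data) for each chunk in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        chunk_id = buf[pos:pos + 4]
        (length,) = struct.unpack_from(endian + "I", buf, pos + 4)
        yield chunk_id, buf[pos + 8:pos + 8 + length]
        # Skip header, data, and the pad byte that follows odd-length data
        pos += 8 + length + (length & 1)


# Invented file: form type "DEMO" holding two sub-chunks (plain RIFF is little-endian)
body = (b"DEMO"
        + b"abc " + struct.pack("<I", 3) + b"xyz" + b"\x00"  # odd length, so one pad byte
        + b"def " + struct.pack("<I", 2) + b"hi")
riff = b"RIFF" + struct.pack("<I", len(body)) + body

form_type = riff[8:12]
chunks = list(iter_chunks(riff, 12, 8 + len(body)))
```

Passing endian=">" would read the lengths as big-endian instead, for files that need it.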

RIFX:
network byte order / big-endian

XFIR:
network byte order / big-endian
EXCEPT
(
	chunkID
	chunkLength
	formType
), which are little-endian

==VLIST==
Vlist { // or Vector List
	// Thanks to Tommysshadow for figuring this out
	U32 length // inclusive
	/*
		There is no known way to magically identify a VList's type
		It must be manually analyzed in a hex editor or similar
		In any case, the only difference between a VLIST16 and a VLIST32 is
		the type of numbers stored in its numbers array
		
		The numbers Array, also, is not technically considered part of the VList
	*/
	@VLIST16 : {
		int numlength = ((length - 4) / 2)
		U16Array(numlength) numbers[]
	}
	@VLIST32 : {
		int numlength = ((length - 4) / 4)
		U32Array(numlength) numbers[]
	}
	U16 entrycount
	U32Array(entrycount) endpointers[]
	U32 listendpointer // or can just read the array with one extra value
	Array(entrycount) vlist [ // could call it entries, whatever...
		/*
			The length of a given entry <x> is calculated with
			endpointers[x + 1] - endpointers[x]
			This is also why including the listendpointer in the endpointers array
			is useful, because it avoids declaring an extra variable and needing an
			if statement to detect the one time that variable needs to be used

			It would probably be more accurate to call endpointers 'pointers',
			because the pointer corresponding to any given entry is in fact its
			relative offset from the base of the vlist. It is the next entry which
			marks the end of the current entry's data

			the base offset of the vlist is immediately after listendpointer,
			in case that wasn't obvious
		*/
	]
}
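
A rough Python sketch of parsing the layout above. parse_vlist and its wide parameter are
hypothetical names, the sample bytes are invented, and big-endian byte order is an assumption
(pass endian="<" if the data turns out to be little-endian). As noted above, the caller has to
know beforehand whether it is a VLIST16 or a VLIST32:

```python
import struct


def parse_vlist(buf, offset=0, wide=False, endian=">"):
    """Return (numbers, entries); wide=True reads the numbers as a VLIST32."""
    (length,) = struct.unpack_from(endian + "I", buf, offset)
    item_code, item_size = ("I", 4) if wide else ("H", 2)
    num_count = (length - 4) // item_size  # length includes its own 4 bytes
    numbers = list(struct.unpack_from(f"{endian}{num_count}{item_code}", buf, offset + 4))
    pos = offset + length

    (entry_count,) = struct.unpack_from(endian + "H", buf, pos)
    pos += 2
    # Read entry_count + 1 values so listendpointer lands in the same array
    pointers = struct.unpack_from(f"{endian}{entry_count + 1}I", buf, pos)
    base = pos + 4 * (entry_count + 1)  # entry data starts right after listendpointer

    entries = [buf[base + pointers[i]:base + pointers[i + 1]]
               for i in range(entry_count)]
    return numbers, entries


# Invented VLIST16: three numbers, then two entries "ab" and "cde"
sample = (struct.pack(">I3H", 10, 1, 2, 3)   # length = 4 + 3 * 2
          + struct.pack(">H3I", 2, 0, 2, 5)  # entrycount, endpointers[2], listendpointer
          + b"abcde")

numbers, entries = parse_vlist(sample)
```

Reading listendpointer as the last element of the pointer array is what lets the entry loop
run without a special case for the final entry.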