[ This file is way out of date ]

Performance:

  - MMX ycrcb conversion to rgb 

  - combine multiplies across weight/quant/idct prescale (?)

  - parse:
        - re-arrange data so that coeff blocks are all one big array
          (alignment + one big memset (mmx!) at beginning of segment)
        - more efficient bookkeeping (vs current brute force mark and sweep) 
          in second and third passes of parse

  - still optimize vlc:
        - combine lookup tables that use the same index:
            - first level of classes, class_index_mask, class_index_rshift
	      (all indexed by maxbits)
            - vlc_lookups, vlc_index_mask, vlc_index_rshift (all indexed by class)
            - sign_mask, sign_rshift (indexed by vlc len)
	  
  - think about optimizing vlc/getbits interface based on a few 
    observations:
	- there are three lookups in vlc of the form ((bits & mask) >> shift) are really doing this:
              bitstream_show_skip(bs,skip,len) // show len bits, beginning skip bits from current position
        - if we add that interface, and then mmx getbits, this could free registers
          for better tuning the rest of the vlc lookup code.
            - note that start and len are bounded to the range 0-16, it might pay
              to ensure that after flush, show can always count on at least 16 bits
              remaining in bs->current_word
            - (there are multiple shows for each flush - eliminates branch in show)
	- since we parse a whole video segment before we do idcts, we can reserve
          mmx registers for getbits state for the entire duration of parsing a video segment
             - note that bitstream state is re-initialized everytime we start a new video segment

 - mmx version of 248 idct

  - tune cache footprint: access input and output withouth polluting L1

  - get everything working in Windows and use VTune to analyze and
    improve x86 performance.

Documentation:

  - there is none!

  - the contents of this file has/will move to the project task list
    on sourceforge.