[gimp-dds] / tags / release-2.0.7 / README.dxt Repository:
ViewVC logotype

View of /tags/release-2.0.7/README.dxt

Parent Directory Parent Directory | Revision Log Revision Log


Revision 126 - (download) (annotate)
Fri Dec 12 19:25:26 2008 UTC (11 months, 1 week ago) by cocidius
File size: 8617 byte(s)
Release 2.0.7 tag
    1 The following notes are from an email I received from Fabian 'ryg' Giesen. They
    2 should help answer some questions regarding the code found in dxt.c
    3 
    4 ---
    5 
    6 mul8bit: This formula is equivalent to (a*b)/255 for a fairly large set of
    7 values (didn't bother finding out the exact bounds). It's a fairly well-known
    8 trick also used in other parts of the GIMP codebase (among others) and was
    9 first documented by Jim Blinn, I think. A good reference is his book "Dirty
   10 Pixels".
   11 
   12 ---
   13 
   14 lerp_rgb: The expression computed is exactly equivalent to mul8bit(a[i],255-f)
   15 + mul8bit(b[i],f) - I just verified that by brute force for -255 <= b[i]-a[i]
   16 <= 255 because I couldn't be bothered to find a derivation for this :) . You
   17 customarily use a factor between 0 and 256 incluse for LERPing if you can, but
   18 normal DXT blocks have colors placed at 1/3 and 2/3 between the two
   19 interpolated colors. 255 is divisible by 3, so lerp_rgb can later be used in
   20 eval_colors to determine the result of
   21 
   22   a*(1/3) + b*(2/3)  and  a*(2/3) + b*(1/3)
   23 
   24 exactly, which is nice :)
   25 
   26 ---
   27 
   28 dither_block: This is just Floyd-Steinberg dithering. Distributing the error
   29 terms to the adjacent pixels for each source pixel is the customary variant to
   30 write this, but since blocks are so small, it's nearly all boundary
   31 cases; "gathering" the error terms per source pixel turned out to be simpler.
   32 
   33 ---
   34 
   35 match_colors_block: This includes a few tricks. We want to map each source
   36 color to its nearest representable color (index), using the euclidean distance
   37 as a metric.
   38 
   39 The obvious, brute-force way is to just compare squared distances to the 4
   40 representable colors for each source pixel (using, for example,
   41 color_distance); this requires a lot of arithmetic operations.
   42 
   43 Instead, the code uses the fact that the 4 colors lie on a line in RGB space
   44 (only approximately in truth, since we have discrete steps). It's a well-known
   45 fact in geometry that if P is the closest point to the line L in 3D space, and
   46 Q is the point closest to P on L, then (P-Q) is orthogonal to the direction of
   47 L. So (in R3 at least), we can simply determine Q by projecting P onto the
   48 direction vector of L, which is done by the 16 dot products in the first for
   49 loop. Since the RGB values have discrete steps in reality, this is just an
   50 approximation, but a quite good one.
   51 
   52 The loop after that determines where the 4 actually representable colors lie
   53 along the line. After that, you simply need to determine which of those 4
   54 values your current pixel's dot product is closest to. Instead of doing a
   55 bunch of comparisions per pixel, the code computes the points along the line
   56 at which the decision would change (that's c0pt, halfpt and c3pt). This would
   57 still require 3 comparisions; by testing the middle point - which is halfpt -
   58 first, one point is always excluded from consideration, which reduces the
   59 number of compares to two in all cases. No big deal, but hey, why not :)
   60 
   61 Similarly, instead of dithering full RGB values, I just dither the dot product
   62 values. Again, by my experiments this works just as well and reduces the
   63 amount of work significantly.
   64 
   65 ---
   66 
   67 optimize_colors_block: This first determines min/max/mean for r,g,b and the
   68 covariance matrix for the color distribution. The latter is used to determine
   69 the principal component (=eigenvector with largest eigenvalue) of that color
   70 distribution, or the direction along which the colors in the block vary most
   71 (in layman's terms) - the eigenvector is determined using simple power
   72 iteration (a standard technique). That iteration needs a seed vector; I just
   73 use (max_r-min_r,max_g-min_g,max_b-min_b), which works well in practice. If
   74 the iteration converges to a vector with very small magnitude (or zero), which
   75 can happen sometimes, it just defaults to an approximation of the YCbCr Y
   76 vector (scaled appropriately to make sure no precision is lost with the dot
   77 products).
   78 
   79 This is then used as an initial approximation for the direction of the line
   80 through RGB color space that is used to select colors for that block. It
   81 simply uses the two most extreme points along that axis as the two colors
   82 stored in the block.
   83 
   84 ---
   85 
   86 refine_block: This takes a block and a chosen set of color indices, and tries
   87 to determine the optimal endpoints for these indices (i.e. the full process
   88 is: take block, use color distribution to get rough estimate of optimal
   89 direction, assign color indices accordingly, use these to get better
   90 endpoints, assign indices again). The computation just solves a least-squares
   91 system to minimize the square error between the actual pixels and the
   92 interpolated colors (solving for the two extremal colors). The least-squares
   93 computation turns out to boil down to solving a 2x2 system of linear equations
   94 for each of the RGB color channels; the actual solution is computed using
   95 Cramer's rule.
   96 
   97 The code is somewhat weird (especially the "prods"/"akku" thing), but that's
   98 just to reduce the amount of computation done (this piece of code is a hot
   99 spot, so it's worth it).
  100 
  101 The (!yy || !xx || xx * yy == xy*xy) part checks whether the system of linear
  102 equations is degenerate. After pondering about this some months ago, I found
  103 out that the only case in which this can ever happen is when all pixels of the
  104 source block get mapped to the same color value. But that case can be handled
  105 better in any case, by just using the single-color lookup tables. I've
  106 attached the new version of refine_block using that observation - it both
  107 increases performance (a bit) and image quality, so it's pretty neat.
  108 
  109 ---
  110 
  111 encode_alpha_block_DXT5: The only thing that shouldn't be obvious is the index
  112 computation. This just uses some two's complement arithmetic and bit shuffling
  113 to avoid computing
  114 
  115   7 * (in_alpha - min_alpha) / (max_alpha - min_alpha)
  116 
  117 which would be more expensive. (The extra calc with idx is just because of the
  118 weird DXT color numbering).
  119 
  120 ---
  121 
  122 Some more notes on the general flow:
  123 
  124 The computation without dithering is just as I explained in the part about
  125 refine_block:
  126 1. Calc initial endpoints directly from block colors
  127 2. Determine color indices for these endpoints
  128 3. Optimize endpoints given color indices
  129 4. Determine new color indices given optimized endpoints
  130 
  131 With dithering, there's a twist: The first two steps are done using a version
  132 of the block dithered to colors that are representable using 16-bit 565 RGB
  133 values. I've found that this significantly improves visual quality with
  134 dithering; if you don't do this, the colors inside a block typically vary too
  135 little for dithering to be useful. This process decreases objective quality
  136 but typically looks notably better.
  137 
  138 The single-color match (omatch5/omatch6) trick: If the block only contains a
  139 single color (or, for the improved version of refine_block, if the color
  140 values are sufficiently close to all map to the same index), an optimal
  141 solution can be used instead.
  142 
  143 You normally want solid-color blocks to map to solid-color block (because
  144 dithering patterns are very obvious otherwise). This means that all color
  145 indices for the block are going to be identical, i.e. all 0, 1, 2 or 3. All-0
  146 is symmetrical to all-1 (with the endpoints flipped), and all-2 is symmetrical
  147 to all-3 (again with the endpoints flipped). So you only need to consider
  148 all-0 or all-2 indices for the block. Furthermore, all-0 means that the first
  149 endpoint specified in the block gets used for all pixels; you can get the same
  150 result by setting both endpoints to the same value and using index 2 for
  151 everything.
  152 
  153 In short, you can always set all indices to 2 without sacrificing any quality
  154 whatsoever. For any of the color components R,G,B, you then want to determine
  155 5-bit or 6-bit values a and b such that
  156 
  157   expand[a]*(2/3) + expand[b]*(1/3) is as close as possible to R/G/B
  158 
  159 and that's exactly what's in omatch5 (for 5-bit values) and omatch6 (for 6-bit
  160 values).
  161 
  162 If you use the 3-color+transparency mode of DXT1, you need separate versions
  163 for omatch5/omatch6 for this case, since the interpolated value is exactly
  164 halfway between the two endpoints instead of 1/3 of the way along. But I
  165 recommend against that mode, because even if your top-level mipmap has 1-bit
  166 transparency, mipmaps will have more than 2 distinct values, and the DXT mode
  167 is selected per texture. That's why my original code doesn't support the
  168 3-color mode of DXT1 at all: I don't think it's useful in practice.
  169 
  170 Not sure if all of this is useful to you or not, but I guess it might make
  171 sense to at least put this in a seperate text file or whatever, because
  172 otherwise it's really quite hard to see what's going on in some places.
  173 
  174 Cheers,
  175 -Fabian "ryg" Giesen

ViewVC Help
Powered by ViewVC 1.0.4