Parent Directory
|
Revision Log
Release 2.0.3 tag
1 The following notes are from an email I received from Fabian 'ryg' Giesen. They 2 should help answer some questions regarding the code found in dxt.c 3 4 --- 5 6 mul8bit: This formula is equivalent to (a*b)/255 for a fairly large set of 7 values (didn't bother finding out the exact bounds). It's a fairly well-known 8 trick also used in other parts of the GIMP codebase (among others) and was 9 first documented by Jim Blinn, I think. A good reference is his book "Dirty 10 Pixels". 11 12 --- 13 14 lerp_rgb: The expression computed is exactly equivalent to mul8bit(a[i],255-f) 15 + mul8bit(b[i],f) - I just verified that by brute force for -255 <= b[i]-a[i] 16 <= 255 because I couldn't be bothered to find a derivation for this :) . You 17 customarily use a factor between 0 and 256 incluse for LERPing if you can, but 18 normal DXT blocks have colors placed at 1/3 and 2/3 between the two 19 interpolated colors. 255 is divisible by 3, so lerp_rgb can later be used in 20 eval_colors to determine the result of 21 22 a*(1/3) + b*(2/3) and a*(2/3) + b*(1/3) 23 24 exactly, which is nice :) 25 26 --- 27 28 dither_block: This is just Floyd-Steinberg dithering. Distributing the error 29 terms to the adjacent pixels for each source pixel is the customary variant to 30 write this, but since blocks are so small, it's nearly all boundary 31 cases; "gathering" the error terms per source pixel turned out to be simpler. 32 33 --- 34 35 match_colors_block: This includes a few tricks. We want to map each source 36 color to its nearest representable color (index), using the euclidean distance 37 as a metric. 38 39 The obvious, brute-force way is to just compare squared distances to the 4 40 representable colors for each source pixel (using, for example, 41 color_distance); this requires a lot of arithmetic operations. 42 43 Instead, the code uses the fact that the 4 colors lie on a line in RGB space 44 (only approximately in truth, since we have discrete steps). It's a well-known 45 fact in geometry that if P is the closest point to the line L in 3D space, and 46 Q is the point closest to P on L, then (P-Q) is orthogonal to the direction of 47 L. So (in R3 at least), we can simply determine Q by projecting P onto the 48 direction vector of L, which is done by the 16 dot products in the first for 49 loop. Since the RGB values have discrete steps in reality, this is just an 50 approximation, but a quite good one. 51 52 The loop after that determines where the 4 actually representable colors lie 53 along the line. After that, you simply need to determine which of those 4 54 values your current pixel's dot product is closest to. Instead of doing a 55 bunch of comparisions per pixel, the code computes the points along the line 56 at which the decision would change (that's c0pt, halfpt and c3pt). This would 57 still require 3 comparisions; by testing the middle point - which is halfpt - 58 first, one point is always excluded from consideration, which reduces the 59 number of compares to two in all cases. No big deal, but hey, why not :) 60 61 Similarly, instead of dithering full RGB values, I just dither the dot product 62 values. Again, by my experiments this works just as well and reduces the 63 amount of work significantly. 64 65 --- 66 67 optimize_colors_block: This first determines min/max/mean for r,g,b and the 68 covariance matrix for the color distribution. The latter is used to determine 69 the principal component (=eigenvector with largest eigenvalue) of that color 70 distribution, or the direction along which the colors in the block vary most 71 (in layman's terms) - the eigenvector is determined using simple power 72 iteration (a standard technique). That iteration needs a seed vector; I just 73 use (max_r-min_r,max_g-min_g,max_b-min_b), which works well in practice. If 74 the iteration converges to a vector with very small magnitude (or zero), which 75 can happen sometimes, it just defaults to an approximation of the YCbCr Y 76 vector (scaled appropriately to make sure no precision is lost with the dot 77 products). 78 79 This is then used as an initial approximation for the direction of the line 80 through RGB color space that is used to select colors for that block. It 81 simply uses the two most extreme points along that axis as the two colors 82 stored in the block. 83 84 --- 85 86 refine_block: This takes a block and a chosen set of color indices, and tries 87 to determine the optimal endpoints for these indices (i.e. the full process 88 is: take block, use color distribution to get rough estimate of optimal 89 direction, assign color indices accordingly, use these to get better 90 endpoints, assign indices again). The computation just solves a least-squares 91 system to minimize the square error between the actual pixels and the 92 interpolated colors (solving for the two extremal colors). The least-squares 93 computation turns out to boil down to solving a 2x2 system of linear equations 94 for each of the RGB color channels; the actual solution is computed using 95 Cramer's rule. 96 97 The code is somewhat weird (especially the "prods"/"akku" thing), but that's 98 just to reduce the amount of computation done (this piece of code is a hot 99 spot, so it's worth it). 100 101 The (!yy || !xx || xx * yy == xy*xy) part checks whether the system of linear 102 equations is degenerate. After pondering about this some months ago, I found 103 out that the only case in which this can ever happen is when all pixels of the 104 source block get mapped to the same color value. But that case can be handled 105 better in any case, by just using the single-color lookup tables. I've 106 attached the new version of refine_block using that observation - it both 107 increases performance (a bit) and image quality, so it's pretty neat. 108 109 --- 110 111 encode_alpha_block_DXT5: The only thing that shouldn't be obvious is the index 112 computation. This just uses some two's complement arithmetic and bit shuffling 113 to avoid computing 114 115 7 * (in_alpha - min_alpha) / (max_alpha - min_alpha) 116 117 which would be more expensive. (The extra calc with idx is just because of the 118 weird DXT color numbering). 119 120 --- 121 122 Some more notes on the general flow: 123 124 The computation without dithering is just as I explained in the part about 125 refine_block: 126 1. Calc initial endpoints directly from block colors 127 2. Determine color indices for these endpoints 128 3. Optimize endpoints given color indices 129 4. Determine new color indices given optimized endpoints 130 131 With dithering, there's a twist: The first two steps are done using a version 132 of the block dithered to colors that are representable using 16-bit 565 RGB 133 values. I've found that this significantly improves visual quality with 134 dithering; if you don't do this, the colors inside a block typically vary too 135 little for dithering to be useful. This process decreases objective quality 136 but typically looks notably better. 137 138 The single-color match (omatch5/omatch6) trick: If the block only contains a 139 single color (or, for the improved version of refine_block, if the color 140 values are sufficiently close to all map to the same index), an optimal 141 solution can be used instead. 142 143 You normally want solid-color blocks to map to solid-color block (because 144 dithering patterns are very obvious otherwise). This means that all color 145 indices for the block are going to be identical, i.e. all 0, 1, 2 or 3. All-0 146 is symmetrical to all-1 (with the endpoints flipped), and all-2 is symmetrical 147 to all-3 (again with the endpoints flipped). So you only need to consider 148 all-0 or all-2 indices for the block. Furthermore, all-0 means that the first 149 endpoint specified in the block gets used for all pixels; you can get the same 150 result by setting both endpoints to the same value and using index 2 for 151 everything. 152 153 In short, you can always set all indices to 2 without sacrificing any quality 154 whatsoever. For any of the color components R,G,B, you then want to determine 155 5-bit or 6-bit values a and b such that 156 157 expand[a]*(2/3) + expand[b]*(1/3) is as close as possible to R/G/B 158 159 and that's exactly what's in omatch5 (for 5-bit values) and omatch6 (for 6-bit 160 values). 161 162 If you use the 3-color+transparency mode of DXT1, you need separate versions 163 for omatch5/omatch6 for this case, since the interpolated value is exactly 164 halfway between the two endpoints instead of 1/3 of the way along. But I 165 recommend against that mode, because even if your top-level mipmap has 1-bit 166 transparency, mipmaps will have more than 2 distinct values, and the DXT mode 167 is selected per texture. That's why my original code doesn't support the 168 3-color mode of DXT1 at all: I don't think it's useful in practice. 169 170 Not sure if all of this is useful to you or not, but I guess it might make 171 sense to at least put this in a seperate text file or whatever, because 172 otherwise it's really quite hard to see what's going on in some places. 173 174 Cheers, 175 -Fabian "ryg" Giesen
| ViewVC Help | |
| Powered by ViewVC 1.0.4 |