Rev 4349 | Author: Serge

    The official guide to swscale for confused developers.
   ========================================================

Current (simplified) Architecture:
---------------------------------
                         Input
                           v
                   _______OR_________
                 /                   \
               /                       \
       special converter     [Input to YUV converter]
              |                         |
              |          (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
              |                         |
              |                         v
              |                 Horizontal scaler
              |                         |
              |     (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
              |                         |
              |                         v
              |       Vertical scaler and output converter
              |                         |
              v                         v
                        output


Swscale has 2 scaler paths. Each path must be capable of handling
slices, that is, consecutive non-overlapping rectangles of dimension
(0,slice_top) - (picture_width, slice_bottom).

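To make the slice geometry concrete, here is a minimal sketch that cuts a picture into consecutive, non-overlapping slices. The Slice struct and split_into_slices() are hypothetical names invented for this illustration. They are not part of swscale.

```c
#include <stddef.h>

/* Hypothetical slice descriptor: rows [top, bottom) across the full width. */
typedef struct Slice {
    int top;     /* first row of the slice (inclusive) */
    int bottom;  /* one past the last row of the slice */
} Slice;

/*
 * Split a picture of `height` rows into consecutive slices of at most
 * `slice_height` rows. Returns the number of slices written to `out`,
 * or -1 if the arguments are invalid or `out` is too small.
 */
static int split_into_slices(int height, int slice_height,
                             Slice *out, size_t max_slices)
{
    size_t n = 0;

    if (height <= 0 || slice_height <= 0 || !out)
        return -1;

    for (int top = 0; top < height; top += slice_height) {
        if (n == max_slices)
            return -1;
        out[n].top    = top;
        out[n].bottom = top + slice_height < height ? top + slice_height
                                                    : height;
        n++;
    }
    return (int)n;
}
```

Each slice starts where the previous one ended, so together they cover every row exactly once. Only the last slice may be shorter.
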
special converter
    These generally are unscaled converters of common
    formats, like YUV 4:2:0/4:2:2 -> RGB12/15/16/24/32, though this path
    could also in principle contain scalers optimized for specific common
    cases.

Main path
    The main path is used when no special converter can be used. The code
    is designed as a destination line pull architecture. That is, for each
    output line the vertical scaler pulls lines from a ring buffer. When
    the ring buffer does not contain the wanted line, then it is pulled from
    the input slice through the input converter and horizontal scaler.
    The result is also stored in the ring buffer to serve future vertical
    scaler requests.
    When no more output can be generated because lines from a future slice
    would be needed, then all remaining lines in the current slice are
    converted, horizontally scaled and put in the ring buffer.
    [This is done for luma and chroma, each with possibly different numbers
     of lines per picture.]

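The pull architecture can be sketched in a few lines of C. Everything here is a simplified assumption made for illustration: the LineRing type, the fixed RING_LINES and LINE_WIDTH, the 2-tap vertical average, and the stand-in convert_and_hscale() are all hypothetical. Slice boundaries and the separate chroma buffer are left out.

```c
#define RING_LINES 4   /* hypothetical ring size; must cover the vertical taps */
#define LINE_WIDTH 8   /* hypothetical fixed line width for this sketch */

typedef struct LineRing {
    int line_no[RING_LINES];           /* source line held per slot, -1 if empty */
    int data[RING_LINES][LINE_WIDTH];  /* converted, horizontally scaled samples */
    int fills;                         /* number of lines pulled from the input */
} LineRing;

static void ring_init(LineRing *r)
{
    for (int i = 0; i < RING_LINES; i++)
        r->line_no[i] = -1;
    r->fills = 0;
}

/* Stand-in for "input converter + horizontal scaler": widen 8 -> 15 bits. */
static void convert_and_hscale(const unsigned char *src_line, int *dst)
{
    for (int x = 0; x < LINE_WIDTH; x++)
        dst[x] = src_line[x] << 7;
}

/* Return source line y from the ring, pulling it from the input on a miss. */
static const int *ring_get_line(LineRing *r, const unsigned char *src, int y)
{
    int slot = y % RING_LINES;

    if (r->line_no[slot] != y) {
        convert_and_hscale(src + y * LINE_WIDTH, r->data[slot]);
        r->line_no[slot] = y;
        r->fills++;
    }
    return r->data[slot];
}

/* Vertical scaler: average source lines y0 and y0 + 1 into one 8-bit line. */
static void vscale_output_line(LineRing *r, const unsigned char *src,
                               int y0, unsigned char *dst)
{
    const int *a = ring_get_line(r, src, y0);
    const int *b = ring_get_line(r, src, y0 + 1);

    for (int x = 0; x < LINE_WIDTH; x++)
        dst[x] = (unsigned char)((a[x] + b[x] + 128) >> 8);
}
```

The vertical scaler drives everything: it asks for the lines it needs, and a line is converted and horizontally scaled only the first time it is requested. Producing output lines 0 and 1 touches source lines 0, 1 and 2, but line 1 is converted once and then served from the ring.
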
Input to YUV Converter
    When the input to the main path is not planar 8 bits per component YUV or
    8-bit gray, it is converted to planar 8-bit YUV. Two sets of converters
    exist for this currently: One performs horizontal downscaling by 2
    before the conversion, the other leaves the full chroma resolution,
    but is slightly slower. The scaler will try to preserve full chroma
    when the output uses it. It is possible to force full chroma with
    SWS_FULL_CHR_H_INP even for cases where the scaler thinks it is useless.

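The difference between the two converter sets can be illustrated with a small sketch that computes only the U plane from packed RGB. The function names are hypothetical, and the coefficients are a common integer approximation of BT.601 limited range, assumed here for illustration. They are not the tables swscale uses.

```c
/*
 * U (Cb) from 8-bit RGB using an assumed integer BT.601 approximation.
 * +128 rounds, and +(128 << 8) adds the chroma offset before the shift
 * so the shifted value is never negative.
 */
static unsigned char rgb_to_u(int r, int g, int b)
{
    return (unsigned char)((-38 * r - 74 * g + 112 * b
                            + 128 + (128 << 8)) >> 8);
}

/* Full chroma: one U sample per input pixel. rgb holds packed R,G,B bytes. */
static void rgb_to_u_full(const unsigned char *rgb, int width,
                          unsigned char *u)
{
    for (int x = 0; x < width; x++)
        u[x] = rgb_to_u(rgb[3 * x], rgb[3 * x + 1], rgb[3 * x + 2]);
}

/* Half chroma: average each horizontal pixel pair, then convert once. */
static void rgb_to_u_half(const unsigned char *rgb, int width,
                          unsigned char *u)
{
    for (int x = 0; x < width / 2; x++) {
        const unsigned char *p = rgb + 6 * x;
        int r = (p[0] + p[3] + 1) >> 1;
        int g = (p[1] + p[4] + 1) >> 1;
        int b = (p[2] + p[5] + 1) >> 1;

        u[x] = rgb_to_u(r, g, b);
    }
}
```

The half-resolution variant does one conversion per two pixels, which is where the speed difference comes from in this sketch. The full-resolution variant keeps every chroma sample for the later stages.
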
Horizontal scaler
    There are several horizontal scalers. A special case worth mentioning is
    the fast bilinear scaler that is made of runtime-generated MMXEXT code
    using specially tuned pshufw instructions.
    The remaining scalers are specially tuned for various filter lengths.
    They scale 8-bit unsigned planar data to 16-bit signed planar data.
    Future >8 bits per component inputs will need to add a new horizontal
    scaler that preserves the input precision.

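A plain C sketch of a generic filter-based horizontal scaler might look like the following. It assumes the 1 << 14 coefficient scale this guide gives for horizontal filters and the 15-bit intermediate range shown in the architecture diagram. The function name, the parameter layout and the clamping of negative sums to zero are choices made for this sketch, not a description of the actual swscale routines.

```c
#include <stdint.h>

/*
 * Generic FIR horizontal scaler sketch.
 *   dst         16-bit signed output line, values kept in the 15-bit range
 *   src         8-bit unsigned input line
 *   filter      dst_w * filter_size coefficients, 1.0 == 1 << 14
 *   filter_pos  first source sample used for each output sample
 */
static void hscale_8_to_15(int16_t *dst, int dst_w, const uint8_t *src,
                           const int16_t *filter, const int *filter_pos,
                           int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int sum = 0;

        for (int j = 0; j < filter_size; j++)
            sum += src[filter_pos[i] + j] * filter[i * filter_size + j];

        /* Clamp negative overshoot before shifting (sketch choice). */
        if (sum < 0)
            sum = 0;
        /* 8-bit sample * 14-bit coefficient = 22 bits; drop 7 to get 15. */
        sum >>= 7;
        if (sum > INT16_MAX)
            sum = INT16_MAX;
        dst[i] = (int16_t)sum;
    }
}
```

With two taps of 1 << 13 each, this averages neighbouring pixels: an input pair (0, 255) yields 255 * 8192 >> 7 = 16320, which is 127.5 scaled by 128.
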
Vertical scaler and output converter
    There is a large number of combined vertical scalers + output converters.
    Some are:
    * unscaled output converters
    * unscaled output converters that average 2 chroma lines
    * bilinear converters (C, MMX and accurate MMX)
    * arbitrary filter length converters (C, MMX and accurate MMX)
    And
    * Plain C 8-bit 4:2:2 YUV -> RGB converters using LUTs
    * Plain C 17-bit 4:4:4 YUV -> RGB converters using multiplies
    * MMX 11-bit 4:2:2 YUV -> RGB converters
    * Plain C 16-bit Y -> 16-bit gray
      ...

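As a rough illustration of the "arbitrary filter length" case for planar 8-bit output, the sketch below combines several 15-bit intermediate lines into one output line. It assumes the 1 << 12 coefficient scale this guide gives for vertical filters. The function name and signature are hypothetical, and no dithering or colorspace conversion is done.

```c
#include <stdint.h>

/*
 * Vertical FIR sketch producing one 8-bit output line.
 *   lines   filter_size pointers to 15-bit intermediate lines
 *   filter  filter_size coefficients, 1.0 == 1 << 12
 */
static void vscale_15_to_8(uint8_t *dst, int width,
                           const int16_t *const *lines,
                           const int16_t *filter, int filter_size)
{
    for (int x = 0; x < width; x++) {
        /* Start at half of the final divisor so the shift rounds to nearest. */
        int sum = 1 << 18;

        for (int j = 0; j < filter_size; j++)
            sum += lines[j][x] * filter[j];

        if (sum < 0)
            sum = 0;
        /* 15-bit sample * 12-bit coefficient = 27 bits; drop 19 to get 8. */
        sum >>= 19;
        dst[x] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

The shift of 19 is just 15 + 12 - 8: intermediate precision plus coefficient precision minus output precision.
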
RGB with less than 8 bits per component uses dither to improve the
subjective quality and low-frequency accuracy.


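To show why dithering helps low-frequency accuracy, here is a minimal ordered-dither sketch that reduces an 8-bit component to 5 bits. The 2x2 offset table and the function name are assumptions for illustration. The dither patterns swscale actually uses are not reproduced here.

```c
#include <stdint.h>

/* 2x2 ordered-dither offsets covering the 3 bits dropped by 8 -> 5 bit. */
static const uint8_t dither_2x2[2][2] = {
    { 0, 4 },
    { 6, 2 },
};

/* Quantize one 8-bit component at pixel (x, y) to 5 bits with dither. */
static uint8_t dither_8_to_5(uint8_t v, int x, int y)
{
    int q = (v + dither_2x2[y & 1][x & 1]) >> 3;

    return (uint8_t)(q > 31 ? 31 : q);
}
```

For a flat area of value 100, plain truncation gives 12 everywhere, which is 96 after scaling back by 8. With the offsets above, the four pixels of each 2x2 block become 12, 13, 13 and 12, whose average of 12.5 corresponds to exactly 100.
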
Filter coefficients:
--------------------
There are several different scalers (bilinear, bicubic, lanczos, area,
sinc, ...). Their coefficients are calculated in initFilter().
Horizontal filter coefficients have a 1.0 point at 1 << 14, vertical ones at
1 << 12. The 1.0 points have been chosen to maximize precision while leaving
a little headroom for convolutional filters like sharpening filters and
minimizing SIMD instructions needed to apply them.
It would be trivial to use a different 1.0 point if some specific scaler
would benefit from it.
Also, as already hinted at, initFilter() accepts an optional convolutional
filter as input that can be used for contrast, saturation, blur, sharpening,
shift, chroma vs. luma shift, ...
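
The idea of a fixed-point 1.0 point can be sketched as follows: floating-point filter weights are normalized and rounded so that the integer coefficients sum to exactly the chosen 1.0 value. quantize_filter() is a hypothetical helper written for this illustration and is not the initFilter() code. It assumes every resulting coefficient fits in 16 bits.

```c
#include <stdint.h>

/*
 * Convert n floating-point filter weights to fixed point so that they
 * sum to exactly `one` (for example 1 << 14 or 1 << 12).
 * Returns 0 on success, -1 if n is not positive or the weights sum to zero.
 */
static int quantize_filter(const double *weights, int n, int one,
                           int16_t *coeffs)
{
    double sum = 0.0;
    double acc = 0.0;  /* running sum of the normalized weights */
    int emitted = 0;   /* running sum of the integer coefficients */

    if (n <= 0)
        return -1;
    for (int i = 0; i < n; i++)
        sum += weights[i];
    if (sum == 0.0)
        return -1;

    for (int i = 0; i < n; i++) {
        int target;

        acc += weights[i] / sum;
        /* Round the running total, not each tap, so errors do not pile up. */
        target = (int)(acc * one + (acc >= 0 ? 0.5 : -0.5));
        coeffs[i] = (int16_t)(target - emitted);
        emitted = target;
    }
    /* Guard against floating-point drift in the last running total. */
    coeffs[n - 1] += (int16_t)(one - emitted);
    return 0;
}
```

For the weights (1, 2, 1) and a 1.0 point of 1 << 14 this yields 4096, 8192 and 4096. Keeping the sum exact matters because a filter whose taps sum to slightly more or less than 1.0 would brighten or darken flat areas.
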