Rev | Author | Line No. | Line |
---|---|---|---|
4358 | Serge | 1 | |
2 | |||
3 | |||
4 | |||
5 |
|
||
6 | |||
7 | |||
8 | |||
9 | |||
10 | |||
11 |
|
||
12 | |||
13 | |||
14 | |||
15 | |||
16 | |||
17 |
|
||
18 | |||
19 | |||
20 | The Gallium llvmpipe driver is a software rasterizer that uses LLVM to |
||
21 | do runtime code generation. |
||
22 | Shaders, point/line/triangle rasterization and vertex processing are |
||
23 | implemented with LLVM IR which is translated to x86 or x86-64 machine |
||
24 | code. |
||
25 | Also, the driver is multithreaded to take advantage of multiple CPU cores |
||
26 | (up to 8 at this time). |
||
27 | It's the fastest software rasterizer for Mesa. |
||
28 | |||
29 | |||
30 | |||
31 |
|
||
32 | |||
33 | |||
34 | |||
35 |
|
||
36 | |||
37 | Support for SSE2 is strongly encouraged. Support for SSSE3 and SSE4.1 will |
||
38 | yield the most efficient code. The fewer features the CPU has the more |
||
39 | likely it is that you will run into underperforming, buggy, or incomplete code. |
||
40 | |||
41 | |||
42 | See /proc/cpuinfo to find out what your CPU supports. |
||
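As a minimal sketch of the /proc/cpuinfo check, the helper below lists which of the instruction set extensions mentioned in this README show up in the CPU flags. The function name cpu_simd_flags is hypothetical, and the optional file argument exists only so the helper can be tried on a saved copy of cpuinfo. It assumes the usual Linux flag spellings (sse2, ssse3, sse4_1, avx).

```shell
# Hypothetical helper, not part of Mesa: print the SIMD-related CPU flags
# that this README mentions, one per line, without duplicates.
# $1 is an optional path to a cpuinfo-style file (default: /proc/cpuinfo).
cpu_simd_flags() {
    # -o prints only the matched flag, -w avoids partial matches
    # (for example "avx" inside "avx2"), sort -u merges the per-core repeats.
    grep -o -w -E 'sse2|ssse3|sse4_1|avx' "${1:-/proc/cpuinfo}" | sort -u
}

# Usage on a Linux machine:
#   cpu_simd_flags
```

An empty result suggests that none of these extensions is reported, or that the system does not expose x86 flags in this form.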
43 | |||
44 | |||
45 | |||
46 |
|
||
47 |
|
||
48 | Intel AVX extensions (e.g. Sandy Bridge). LLVM's code generator will |
||
49 | fail when trying to emit AVX instructions. This was fixed in LLVM 2.9. |
||
50 | |||
51 | |||
52 | For Linux, on a recent Debian-based distribution, do: |
||
53 | |||
54 | |||
55 | aptitude install llvm-dev |
||
56 | |||
57 | |||
58 | For an RPM-based distribution, do: |
||
59 | |||
60 | |||
61 | yum install llvm-devel |
||
62 | |||
63 | |||
64 | |||
65 | For Windows you will need to build LLVM from source with MSVC or MinGW |
||
66 | (either natively or through cross compilers) and CMake, and set the LLVM |
||
67 | environment variable to the directory you installed it to. |
||
68 | |||
69 | LLVM will be statically linked, so when building on MSVC it needs to be |
||
70 | built with the same CRT as Mesa, and you'll need to pass |
||
71 | -DLLVM_USE_CRT_DEBUG=MTd for debug and checked builds, |
||
72 | -DLLVM_USE_CRT_RELEASE=MT for profile and release builds. |
||
73 | |||
74 | You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86 |
||
75 | to cmake. |
||
76 | |||
77 | |||
78 | |||
79 | |||
80 |
|
||
81 | |||
82 | |||
83 | |||
84 | |||
85 |
|
||
86 | |||
87 | To build everything on Linux invoke scons as: |
||
88 | |||
89 | |||
90 | scons build=debug libgl-xlib |
||
91 | |||
92 | |||
93 | Alternatively, you can build it with GNU make, if you prefer, by invoking it as |
||
94 | |||
95 | |||
96 | make linux-llvm |
||
97 | |||
98 | |||
99 | but the rest of these instructions assume that scons is used. |
||
100 | |||
101 | For Windows the procedure is similar, except for the target: |
||
102 | |||
103 | |||
104 | scons build=debug libgl-gdi |
||
105 | |||
106 | |||
107 | |||
108 |
|
||
109 | |||
110 | On Linux, building will create a drop-in alternative for libGL.so in |
||
111 | |||
112 | |||
113 | build/foo/gallium/targets/libgl-xlib/libGL.so |
||
114 | |||
115 | or |
||
116 | |||
117 | lib/gallium/libGL.so |
||
118 | |||
119 | |||
120 | To use it set the LD_LIBRARY_PATH environment variable accordingly. |
||
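One way to set LD_LIBRARY_PATH is sketched below. The function name prepend_lib_dir is hypothetical, and the directory you pass should be whichever of the two locations above actually contains the freshly built libGL.so on your system.

```shell
# Hypothetical helper, not part of Mesa: put a directory at the front of
# LD_LIBRARY_PATH so the dynamic linker finds its libGL.so first.
prepend_lib_dir() {
    libdir=$1
    # Append the previous value only if it was set and non-empty,
    # so no stray ":" (meaning "current directory") is introduced.
    export LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Usage (directory is a placeholder, as in the text above):
#   prepend_lib_dir "$PWD/lib/gallium"
#   /my/application
```

The setting only affects programs started from the same shell afterwards. If glxinfo is installed, its renderer string is one way to check which libGL.so was picked up.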
121 | |||
122 | For performance evaluation pass debug=no to scons, and use the corresponding |
||
123 | lib directory without the "-debug" suffix. |
||
124 | |||
125 | On Windows, building will create a drop-in alternative for opengl32.dll. To use |
||
126 | it put it in the same directory as the application. It can also be used by |
||
127 | replacing the native ICD driver, but it's quite an advanced usage, so if you |
||
128 | need to ask, don't even try it. |
||
129 | |||
130 | |||
131 |
|
||
132 | |||
133 | |||
134 | To profile llvmpipe you should build as |
||
135 | |||
136 | |||
137 | scons build=profile <same-as-before> |
||
138 | |||
139 | |||
140 | |||
141 | This will ensure that frame pointers are used both in C and JIT functions, and |
||
142 | that no tail call optimizations are done by gcc. |
||
143 | |||
144 | |||
145 |
|
||
146 | |||
147 | |||
148 | On Linux, it is possible to have symbol resolution of JIT code with Linux perf: |
||
149 | |||
150 | |||
151 | |||
152 | perf record -g /my/application |
||
153 | perf report |
||
154 | |||
155 | |||
156 | |||
157 | When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with |
||
158 | a symbol address table. It also dumps assembly code to /tmp/perf-XXXXX.map.asm, |
||
159 | which can be used by the bin/perf-annotate-jit script to produce disassembly of |
||
160 | the generated code annotated with the samples. |
||
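To get a quick overview of what the JIT produced, the map file can be inspected directly. The sketch below assumes the usual perf map layout of one symbol per line in the form "START SIZE name", with START and SIZE in hexadecimal and no 0x prefix. The function name perf_map_symbols is hypothetical.

```shell
# Hypothetical helper, not part of Mesa: list the symbols in a perf map
# file with their size in bytes (decimal), largest first.
# $1 is the path to the map file, e.g. /tmp/perf-XXXXX.map.
perf_map_symbols() {
    while read -r start size name; do
        # The size field is hexadecimal without a prefix; printf converts it.
        printf '%d %s\n' "0x$size" "$name"
    done < "$1" | sort -rn
}

# Usage:
#   perf_map_symbols /tmp/perf-XXXXX.map | head
```

This is only a convenience for browsing the symbol table. The annotated disassembly still comes from the bin/perf-annotate-jit script mentioned above.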
161 | |||
162 | |||
163 |
|
||
164 | Gprof2Dot. |
||
165 | |||
166 | |||
167 |
|
||
168 | |||
169 | |||
170 | Building will also create several unit tests in |
||
171 | build/linux-???-debug/gallium/drivers/llvmpipe: |
||
172 | |||
173 | |||
174 | |||
175 | |||
176 | |||
177 | |||
178 | |||
179 | |||
180 | |||
181 | Some of these tests can output results and benchmarks to a tab-separated file |
||
182 | for later analysis, e.g.: |
||
183 | |||
184 | |||
185 | build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv |
||
186 | |||
187 | |||
188 | |||
189 |
|
||
190 | |||
191 | |||
192 | |||
193 | When looking at this code for the first time, start in lp_state_fs.c, and |
||
194 | then skim through the lp_bld_* functions called in there, and the comments |
||
195 | at the top of the lp_bld_*.c files. |
||
196 | |||
197 | |||
198 | The driver-independent parts of the LLVM / Gallium code are found in |
||
199 | src/gallium/auxiliary/gallivm/. The filenames and function prefixes |
||
200 | need to be renamed from "lp_bld_" to something else though. |
||
201 | |||
202 | |||
203 | We use LLVM-C bindings for now. They are not documented, but follow the C++ |
||
204 | interfaces very closely, and appear to be complete enough for code |
||
205 | generation. See |
||
206 | http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html |
||
207 | for a stand-alone example. See the llvm-c/Core.h file for reference. |
||
208 | |||
209 | |||
210 | |||
211 | |||
212 | |||