<?xml version='1.0' encoding='iso-8859-1'?>

<sect1 id="implementation">
	<title>Implementation</title>

	<para>
	This section describes the implementation details of a DRI driver.
	The topics loosely follow the order in which information flows when an application is using a DRI driver,
	i.e., they mimic the graphics pipeline.
	</para>

	<sect2>
		<title>The DRI driver initialization process?</title>

		<para>
		This is a description of the DRI driver initialization process.
		<footnote>
			<para>
			Extracted and edited from a series of emails between Ian Romanick and Brian Paul
			</para>
		</footnote>
		</para>

		<itemizedlist>
			<listitem>
				<para>
				The whole process begins when an application calls <function>glXCreateContext</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/glx/glxcmds.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/glx/glxcmds.c</filename>
				</ulink>).
				<function>glXCreateContext</function> is just a stub that calls
				<function>CreateContext</function>.  The real work begins when <function>CreateContext</function> calls
				<function>__glXInitialize</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/glx/glxext.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/glx/glxext.c</filename>
				</ulink>).
				</para>
			</listitem>

			<listitem>
				<para>
				The driver specific initialization process starts with <function>__driCreateScreen</function>.
				Once the driver is loaded (via <function>dlopen</function>), <function>dlsym</function> is used to get a pointer to
				this function.  The function pointer for each driver is stored in the
				<varname>createScreen</varname> array in the <structname>__DRIdisplay</structname> structure.  This initialization is
				done in <function>driCreateDisplay</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/dri/dri_glx.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/dri/dri_glx.c</filename>
				</ulink>), which is called by
				<function>__glXInitialize</function>.
				</para>

				<para>
				Note that <function>__driCreateScreen</function> really
				is the bootstrap of a DRI driver.  It's the only
				<footnote>
					<para>
					That's not strictly true: there's also the <function>__driRegisterExtensions</function>
					function that <filename>libGL</filename> uses to implement <function>glXGetProcAddress</function>.  That's another
					long story.
					</para>
				</footnote>
				function in a DRI driver
				that <filename>libGL</filename> directly knows about.  All the other DRI functions are accessed via
				the <structname>__DRIdisplayRec</structname>, <structname>__DRIscreenRec</structname>,
				<structname>__DRIcontextRec</structname> and <structname>__DRIdrawableRec</structname>
				structs defined in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/glx/glxclient.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/glx/glxclient.h</filename>
				</ulink>.  Those structures are pretty well documented in the file.
				</para>
			</listitem>

			<listitem>
				<para>
				After performing the <function>__glXInitialize</function> step, <function>CreateContext</function> calls the
				<function>createContext</function> function for the requested screen.  Here the driver creates
				two data structures.  The first, <structname>GLcontext</structname> (<filename>extras/Mesa/src/mtypes.h</filename>),
				contains all of the device independent state, device dependent constants
				(e.g., texture size limits and light limits), and device dependent
				function tables.  The driver also allocates a structure that contains all
				of the device dependent state.  The <structname>GLcontext</structname> structure links to the
				device dependent structure via the <structfield>DriverCtx</structfield> pointer.  The device
				dependent structure also has a pointer back to the <structname>GLcontext</structname> structure.
				</para>

				<para>
				The device dependent structure is where the driver stores context
				specific hardware state (register settings, etc.) so that it can be restored when
				context switches (in the OpenGL / X sense) occur.  This structure is
				analogous to the buffers where the OS stores CPU state when a program
				context switch occurs.
				</para>

				<para>
				The texture images really are stored within Mesa's
				data structures.  Mesa supports about a dozen texture formats which
				happen to satisfy what all the DRI drivers need.  So, the texture format/
				packing is dependent on the hardware, but Mesa understands all the
				common formats.  See Mesa/src/texformat.h.  Gareth and Brian spent a lot of
				time on that.
				</para>
			</listitem>

			<listitem>
				<para>
				<function>createScreen</function> (i.e., the driver specific initialization function) is called
				for each screen from <function>AllocAndFetchScreenConfigs</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/glx/glxext.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/glx/glxext.c</filename>
				</ulink>).
				This is also called from <function>__glXInitialize</function>.
				</para>
			</listitem>

			<listitem>
				<para>
				For all of the existing drivers, the <function>__driCreateScreen</function> function is just a
				wrapper that calls <function>__driUtilCreateScreen</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/dri/dri_util.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/dri/dri_util.c</filename>
				</ulink>)
				with a
				pointer to the driver's API function table (of type <structname>__DriverAPIRec</structname>).  This
				creates a <structname>__DRIscreenPrivate</structname> structure for the display and fills it in
				(mostly) with the supplied parameters (i.e., screen number, display
				information, etc.).
				</para>

				<para>
				It also opens and initializes the connection to the DRM.  This includes
				opening the DRM device, mapping the frame buffer (note: the DRM
				documentation says that the function used for this is called <function>drmAddMap</function>, but
				it is actually called <function>drmMap</function>), and mapping the SAREA.  The final step is
				to call the driver initialization function for the driver (from the
				<structfield>InitDriver</structfield> field in the <structname>__DriverAPIRec</structname>, which is the <structfield>DriverAPI</structfield> field of the
				<structname>__DRIscreenPrivate</structname>).
				</para>
			</listitem>

			<listitem>
				<para>
				The <function>InitDriver</function> function does (at least in the Radeon and i810 drivers) two
				broad things.  It first verifies the version of the services (XFree86,
				DDX, and DRM) that it will use.
				</para>

				<para>
				The driver then creates an internal representation of the screen and
				stores it (the pointer to the structure) in the private field of the
				<structname>__DRIscreenPrivate</structname> structure.  The driver-private data may include things
				such as mappings of MMIO registers, mappings of display and texture
				memory, information about the layout of video memory, chipset version
				specific data (feature availability for the specific chip revision, etc.),
				and other similar data.  This is the handle that identifies the specific
				graphics card to the driver (in case there is more than one card in the
				system that will use the same driver).
				</para>
			</listitem>

			<listitem>
				<para>
				After performing the <function>__glXInitialize</function> step, <function>CreateContext</function> calls the
				<function>createContext</function> function for the requested screen.  This is where it gets
				pretty complicated.  I have only looked at the Radeon driver.
				<function>radeonCreateContext</function>
				(<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/mesa/src/drv/radeon/radeon_context.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xc/lib/GL/mesa/src/drv/radeon/radeon_context.c</filename>
				</ulink>)
				allocates a <structname>GLcontext</structname> structure (actually <userinput>struct __GLcontextRec</userinput> from
				extras/Mesa/src/mtypes.h).  Here it fills in function tables for virtually
				every OpenGL call.  Additionally, the <structname>__GLcontextRec</structname> has pointers to
				buffers where the driver will store context specific hardware state
				(textures, register settings, etc.) for when context (in terms of
				OpenGL / X context) switches occur.
				</para>

				<para>
				The <structname>__GLcontextRec</structname> (i.e. <structname>GLcontext</structname> in Mesa) doesn't have any buffers
				of hardware-specific data (except texture image data if you want to be
				picky).  All Radeon-specific, per-context data should be hanging off
				of the struct radeon_context.
				</para>

				<para>
				All the DRI drivers define a hardware-specific context structure
				(such as <userinput>struct radeon_context</userinput>, typedef'd to be <structname>radeonContextRec</structname>, or
				<userinput>struct mga_context_t</userinput>, typedef'd to be <structname>mgaContext</structname>).
				</para>

				<para>
				<structname>radeonContextRec</structname> has a pointer back to the Mesa <structname>__GLcontextRec</structname>
				and Mesa's <userinput>__GLcontextRec->DriverCtx</userinput> pointer points back to the
				<structname>radeonContextRec</structname>.
				</para>

				<para>
				If we were writing all this in C++ (don't laugh) we'd treat Mesa's
				<structname>__GLcontextRec</structname> as a base class and create driver-specific derived
				classes from it.
				Inheritance like this is actually pretty common in the DRI code,
				even though it's sometimes hard to spot.
				</para>

				<para>
				These buffers are analogous to the
				buffers where the OS stores CPU state when a program context switch occurs.
				</para>

				<para>
				Note that we don't do any fancy hardware context switching in our drivers.
				When we make-current a new context, we basically update all the hardware
				state with that new context's values.
				</para>
			</listitem>

			<listitem>
				<para>
				When each of the function tables is initialized (see radeonInitSpanFuncs
				for an example), an internal Mesa function is called.  This function
				(e.g., <function>_swrast_GetDeviceDriverReference</function>) both allocates the buffer and
				fills in the function pointers with the software fallbacks.  If a driver
				were to just call these allocation functions and not replace any of the
				function pointers, it would be the same as the software renderer.
				</para>
			</listitem>

			<listitem>
				<para>
				The next part seems to start when the createDrawable function in the
				<structname>__DRIscreenRec</structname> is called, but I don't see where this happens.
				</para>

				<para>
				<function>createDrawable</function> should be called via <function>glXMakeCurrent</function> since that's the
				first time we're given an X drawable handle.  Somewhere during <function>glXMakeCurrent</function>
				we use a DRI hash lookup to translate the X Drawable handle
				into a pointer to a <structname>__DRIdrawable</structname>.  If we get a <envar>NULL</envar> pointer that means
				we've never seen that handle before and now have to allocate the
				<structname>__DRIdrawable</structname> and initialize it (and put it in the hash table).
				</para>
			</listitem>
		</itemizedlist>
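
		<para>
		The two-way link between the device independent context and the device dependent
		context described above can be sketched roughly as follows.  This is only a minimal
		illustration, not the actual Mesa or Radeon code: the names
		<structname>sketch_gl_context</structname>, <structname>sketch_hw_context</structname>,
		<function>sketch_create_context</function> and <function>sketch_destroy_context</function>
		are hypothetical, and only the <structfield>DriverCtx</structfield> field name is taken from the text.
		</para>

<programlisting><![CDATA[
```c
#include <stdlib.h>

/* Hypothetical stand-in for Mesa's device independent context. */
typedef struct sketch_gl_context {
    int max_texture_size;   /* example of a device dependent constant */
    void *DriverCtx;        /* link to the driver's private context */
} sketch_gl_context;

/* Hypothetical stand-in for a driver's device dependent context. */
typedef struct sketch_hw_context {
    sketch_gl_context *glCtx;     /* pointer back to the generic context */
    unsigned int saved_regs[4];   /* per-context hardware state (placeholder) */
} sketch_hw_context;

/* Allocate both structures and link them in both directions.
 * Returns NULL if either allocation fails. */
sketch_hw_context *sketch_create_context(int max_texture_size)
{
    sketch_gl_context *gl = calloc(1, sizeof *gl);
    sketch_hw_context *hw = calloc(1, sizeof *hw);

    if (gl == NULL || hw == NULL) {
        free(gl);
        free(hw);
        return NULL;
    }
    gl->max_texture_size = max_texture_size;
    gl->DriverCtx = hw;   /* generic context -> driver context */
    hw->glCtx = gl;       /* driver context -> generic context */
    return hw;
}

/* Release both halves of the linked pair. */
void sketch_destroy_context(sketch_hw_context *hw)
{
    if (hw != NULL) {
        free(hw->glCtx);
        free(hw);
    }
}
```
]]></programlisting>

		<para>
		With this arrangement, code that is handed either structure can reach the other one with a single pointer dereference.
		</para>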
	</sect2>

	<sect2 id="mesa-internals">
		<title>Mesa internals</title>

		<sect3>
			<title>How does one write a new Mesa driver?</title>

			<para>
			There are two basic aspects to writing a new driver.
			</para>

			<para>
			First, define the public OpenGL / window system API.  In the case of GLX,
			these are the <function>glx*()</function> functions.  For OSMesa these are the <function>OSMesa*()</function> functions seen in <filename>include/GL/osmesa.h</filename>.  You'll basically need functions for specifying
			frame buffer formats (bits per rgb, bits for Z, bits for stencil, etc.),
			functions for creating/destroying contexts, binding contexts to windows, etc.
			</para>

			<para>
			Second, implement the internal functions needed by the "DD" interface.
			Look at the <filename>osmesa.c</filename> file and grep for <quote>ctx->Driver. = </quote>.  This
			is where the driver hooks itself into the core of Mesa.  In many cases
			we hook in fall-back functions (like <function>_swrast_DrawPixels</function>).
			</para>
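
			<para>
			As a rough illustration of this hooking step, the sketch below fills a small
			function table with software fall-backs first and then overrides one entry with a
			driver-specific function.  It is not the real <quote>DD</quote> interface: the table layout,
			the <function>sketch_*</function> names and the return codes are all hypothetical and
			exist only to show the pattern.
			</para>

<programlisting><![CDATA[
```c
#include <stddef.h>

typedef struct sketch_context sketch_context;

/* Hypothetical, heavily reduced stand-in for the driver function table. */
struct sketch_dd_table {
    int (*Clear)(sketch_context *ctx);
    int (*DrawPixels)(sketch_context *ctx, int width, int height);
};

struct sketch_context {
    struct sketch_dd_table Driver;
};

/* Return codes used only to tell the two code paths apart. */
enum { SKETCH_SOFTWARE = 0, SKETCH_HARDWARE = 1 };

/* Software fall-backs (placeholders for the software rasterizer). */
static int sketch_swrast_clear(sketch_context *ctx)
{
    (void)ctx;
    return SKETCH_SOFTWARE;
}

static int sketch_swrast_draw_pixels(sketch_context *ctx, int width, int height)
{
    (void)ctx; (void)width; (void)height;
    return SKETCH_SOFTWARE;
}

/* A function the imaginary hardware can accelerate. */
static int sketch_hw_clear(sketch_context *ctx)
{
    (void)ctx;
    return SKETCH_HARDWARE;
}

/* Hook the driver into the core: fall-backs first, then overrides. */
void sketch_init_driver_funcs(sketch_context *ctx)
{
    ctx->Driver.Clear = sketch_swrast_clear;
    ctx->Driver.DrawPixels = sketch_swrast_draw_pixels;

    /* Replace only what the hardware actually accelerates. */
    ctx->Driver.Clear = sketch_hw_clear;
}
```
]]></programlisting>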

			<para>
			This isn't simple (or even as straightforward as it used to be), but the system is designed for efficiency, flexibility and modularity.
			If the device driver interface were made for simplicity above all else
			there would probably only be two driver functions: <function>ReadPixel()</function> and
			<function>WritePixel()</function>.
			</para>

			<para>
			The OSMesa driver is pretty simple.  The only complexity comes from supporting
			all the different frame buffer formats like RGB, RGBA, BGRA, ABGR, etc.
			I think the Windows driver is in pretty good shape too.  The XMesa driver
			(upon which Mesa's GLX is layered) is rather large because of lots of
			frame buffer formats and optimized point/line/triangle rendering functions.
			</para>
		</sect3>

		<sect3 id="mesa-3.4.x-internals">
			<title>Old Mesa 3.4.x Implementation Notes</title>

			<para>
			This document is an overview of the internal structure of Mesa and is meant for those who are interested in modifying or enhancing Mesa, or just curious.
			</para>

			<note>
				<para>
				Based on the original <ulink url="http://www.mesa3d.org/docs/Implementation.html">Mesa Implementation Notes</ulink> by Brian Paul.
				</para>
			</note>

			<sect4>
				<title>Library State and Contexts</title>

				<para>
				OpenGL uses the notion of a state machine.
				Mesa encapsulates the state in one large structure: <structname>gl_context</structname>, as seen in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/types.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>types.h</filename>
				</ulink>.
				</para>

				<para>
				The <structname>gl_context</structname> structure actually contains a number of sub structures which exactly correspond to OpenGL's attribute groups.
				This organization made <function>glPushAttrib</function> and <function>glPopAttrib</function> trivial to implement and proved to be a good way of organizing the state variables.
				</para>
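
				<para>
				To see why grouping state by attribute group makes the attribute stack easy to
				implement, consider the following minimal sketch, in which saving or restoring a
				group is a single structure assignment.  The two groups, the bit values, the stack
				depth and all <function>sketch_*</function> names are assumptions chosen for the example
				and do not reproduce Mesa's actual definitions.
				</para>

<programlisting><![CDATA[
```c
#include <stddef.h>

/* Two hypothetical attribute groups, each a plain struct. */
struct sketch_depth_attrib { int Test; float Clear; };
struct sketch_fog_attrib   { int Enabled; float Density; };

#define SKETCH_DEPTH_BIT   0x1u
#define SKETCH_FOG_BIT     0x2u
#define SKETCH_STACK_DEPTH 16

/* One saved stack entry: which groups were saved, plus their copies. */
struct sketch_saved_attribs {
    unsigned mask;
    struct sketch_depth_attrib Depth;
    struct sketch_fog_attrib Fog;
};

struct sketch_attr_context {
    struct sketch_depth_attrib Depth;
    struct sketch_fog_attrib Fog;
    struct sketch_saved_attribs Stack[SKETCH_STACK_DEPTH];
    size_t StackDepth;
};

/* Save the selected groups; returns 0 on stack overflow, 1 on success. */
int sketch_push_attrib(struct sketch_attr_context *ctx, unsigned mask)
{
    struct sketch_saved_attribs *s;

    if (ctx->StackDepth >= SKETCH_STACK_DEPTH)
        return 0;
    s = &ctx->Stack[ctx->StackDepth++];
    s->mask = mask;
    if (mask & SKETCH_DEPTH_BIT)
        s->Depth = ctx->Depth;   /* whole group copied at once */
    if (mask & SKETCH_FOG_BIT)
        s->Fog = ctx->Fog;
    return 1;
}

/* Restore the groups saved by the matching push; returns 0 on underflow. */
int sketch_pop_attrib(struct sketch_attr_context *ctx)
{
    const struct sketch_saved_attribs *s;

    if (ctx->StackDepth == 0)
        return 0;
    s = &ctx->Stack[--ctx->StackDepth];
    if (s->mask & SKETCH_DEPTH_BIT)
        ctx->Depth = s->Depth;
    if (s->mask & SKETCH_FOG_BIT)
        ctx->Fog = s->Fog;
    return 1;
}
```
]]></programlisting>

				<para>
				Groups that were not named in the mask are left untouched by the pop, which matches the behaviour one expects from an attribute stack.
				</para>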
			</sect4>

			<sect4>
				<title>Vertex buffer</title>

				<para>
				The vertices between <function>glBegin</function> and <function>glEnd</function> are accumulated in the
				vertex buffer (see
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/vb.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>vb.h</filename>
				</ulink>
				and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/vb.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>vb.c</filename>
				</ulink>
				).
				When either the vertex buffer becomes full or a state change is made outside a
				<function>glBegin</function>/<function>glEnd</function> pair, we must flush
				the buffer.
				That is, we apply the vertex transformations, compute lighting,
				fog, texture coordinates etc.
				Then, we can render the vertices as
				points, lines or polygons by calling the <function>gl_render_vb()</function> function in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/render.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>render.c</filename>
				</ulink>.
				</para>

				<para>
				When we're outside of a <function>glBegin</function>/<function>glEnd</function> pair the information in this
				structure is retained pending either of the flushing events
				described above.
				</para>
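
				<para>
				The accumulate-then-flush behaviour can be sketched as below.  This is a simplified
				model rather than the code in <filename>vb.c</filename>: the buffer size, the counters and
				the <function>sketch_vb_*</function> names are hypothetical, and the actual transformation
				and lighting work is reduced to a comment.
				</para>

<programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

#define SKETCH_VB_SIZE 4   /* deliberately tiny so that flushes are easy to see */

struct sketch_vertex_buffer {
    float Obj[SKETCH_VB_SIZE][3];   /* accumulated object-space positions */
    size_t Count;                   /* vertices currently buffered */
    size_t Flushes;                 /* how many times the buffer was flushed */
    size_t VerticesProcessed;       /* total vertices handed to the pipeline */
};

void sketch_vb_init(struct sketch_vertex_buffer *vb)
{
    memset(vb, 0, sizeof *vb);
}

/* Process all buffered vertices as one batch, then empty the buffer. */
void sketch_vb_flush(struct sketch_vertex_buffer *vb)
{
    if (vb->Count == 0)
        return;
    /* Transformation, lighting, fog, etc. would run here over the batch. */
    vb->VerticesProcessed += vb->Count;
    vb->Flushes++;
    vb->Count = 0;
}

/* Buffer one vertex; flush automatically when the buffer becomes full. */
void sketch_vb_vertex3f(struct sketch_vertex_buffer *vb, float x, float y, float z)
{
    vb->Obj[vb->Count][0] = x;
    vb->Obj[vb->Count][1] = y;
    vb->Obj[vb->Count][2] = z;
    vb->Count++;
    if (vb->Count == SKETCH_VB_SIZE)
        sketch_vb_flush(vb);
}

/* A state change outside a begin/end pair forces pending vertices out first. */
void sketch_vb_state_change(struct sketch_vertex_buffer *vb)
{
    sketch_vb_flush(vb);
}
```
]]></programlisting>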

				<note>
					<para>
					Originally, Mesa didn't accumulate vertices in this way.
					Instead, <function>glVertex</function> transformed and lit each vertex as it was received, then buffered it.
					When enough vertices to draw the primitive (1 for points, 2 for lines, &gt;2 for polygons) were accumulated, the primitive was drawn and the buffer cleared.
					</para>

					<para>
					The new approach of buffering many vertices and then transforming, lighting and clip testing is faster because it's done in a <quote>vectorized</quote> manner.
					See <function>gl_transform_points</function> in
					<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xform.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
						<filename>xform.c</filename>
					</ulink>
					for an example.
					Also, vertices shared between primitives (e.g. in a <varname>GL_LINE_STRIP</varname>) are only transformed once.
					</para>
				</note>

				<para>
				The only complication is clipping.
				If no vertices in the vertex buffer have their clip flag set, the rasterization functions can be applied directly to the vertex buffer.
				Otherwise, a clipping function is called before rasterizing each primitive.
				If clipping introduces new vertices they will be stored at the end of the vertex buffer.
				</para>

				<para>
				For best performance, Mesa clients should try to maximize the number of vertices between <function>glBegin</function>/<function>glEnd</function> pairs and use connected primitives when possible.
				</para>
			</sect4>

			<sect4>
				<title>Rasterization</title>

				<para>
				The point, line and polygon rasterizers are called via the <structfield>PointsFunc</structfield>, <structfield>LineFunc</structfield>, and <structfield>TriangleFunc</structfield> function pointers in the <structname>dd_function_table</structname> driver function pointer table.
				Whenever the library state is changed in a significant way, the <structfield>NewState</structfield> context flag is raised.
				When <function>glBegin</function> is called, <structfield>NewState</structfield> is checked. If the flag is set we re-evaluate the state to determine what rasterizers to use.
				Special purpose rasterizers are selected according to the status of certain state variables such as flat vs. smooth shading, depth-buffered vs. non-depth-buffered, etc.
				The <function>gl_set_point|line|polygon_function</function> functions do this analysis.
				They in turn query the device driver for <quote>accelerated</quote> rasterizers.
				More on that later.
				</para>

				<para>
				In general, typical states (depth-buffered &amp; smooth-shading) result in optimized rasterizers being selected.
				Non-typical states (stenciling, blending, stippling) result in slower, general purpose rasterizers being selected.
				</para>
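
				<para>
				The following sketch models that selection step for triangles only.  It assumes a
				much smaller state than Mesa really tracks, and every name in it other than
				<structfield>NewState</structfield> and <structfield>TriangleFunc</structfield> is hypothetical.
				The rasterizers simply report which kind they are, so that the choice can be observed.
				</para>

<programlisting><![CDATA[
```c
/* Which rasterizer ended up being selected (for illustration only). */
enum sketch_raster_kind {
    SKETCH_RASTER_GENERAL,    /* slow, handles every state combination */
    SKETCH_RASTER_SMOOTH_Z    /* optimized: smooth shaded, depth buffered */
};

typedef enum sketch_raster_kind (*sketch_triangle_func)(void);

struct sketch_raster_context {
    int NewState;        /* raised whenever relevant state changes */
    int DepthTest;
    int SmoothShading;
    int Blend;
    sketch_triangle_func TriangleFunc;
    unsigned Reevaluations;   /* how often the selection was redone */
};

static enum sketch_raster_kind sketch_general_triangle(void)
{
    return SKETCH_RASTER_GENERAL;
}

static enum sketch_raster_kind sketch_smooth_z_triangle(void)
{
    return SKETCH_RASTER_SMOOTH_Z;
}

/* Any significant state change raises the NewState flag. */
void sketch_set_blend(struct sketch_raster_context *ctx, int enabled)
{
    ctx->Blend = enabled;
    ctx->NewState = 1;
}

/* Called at the start of a primitive: re-select only if state changed. */
void sketch_begin(struct sketch_raster_context *ctx)
{
    if (!ctx->NewState)
        return;
    if (ctx->DepthTest && ctx->SmoothShading && !ctx->Blend)
        ctx->TriangleFunc = sketch_smooth_z_triangle;   /* typical state */
    else
        ctx->TriangleFunc = sketch_general_triangle;    /* non-typical state */
    ctx->NewState = 0;
    ctx->Reevaluations++;
}
```
]]></programlisting>

				<para>
				Deferring the analysis until the start of the next primitive means that several state changes in a row cost only one re-evaluation.
				</para>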
			</sect4>

			<sect4>
				<title>Pixel (fragment) buffer</title>

				<para>
				The general purpose point, line and bitmap rasterizers accumulate fragments (pixels plus color, depth, texture coordinates) in the PB (Pixel Buffer) structure seen in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/pb.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>pb.h</filename>
				</ulink>
				and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/pb.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>pb.c</filename>
				</ulink>.
				When the pixel buffer is full or <function>glEnd</function> is called the pixel buffer is flushed.
				This includes clipping the fragments against the window, depth testing, stenciling, blending, stippling, etc.
				Finally, the pixel buffer's pixels are drawn to the display buffer by calling one of the device driver functions.
				The goal is to maximize the number of pixels processed inside loops and to minimize
				the number of function calls.
				</para>
			</sect4>

			<sect4>
				<title>Pixel spans</title>

				<para>
				The polygon, <function>glDrawPixels</function>, and <function>glCopyPixels</function> functions generate horizontal runs of pixels called spans.
				Spans are processed in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/span.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>span.c</filename>
				</ulink>.
				Processing includes window clipping, depth testing, stenciling, texturing, etc.
				After processing, the span is written to the frame buffer by calling a device driver function.
				</para>
			</sect4>

			<sect4>
				<title>Device Driver</title>

				<para>
				There are three Mesa data types which are meant to be used by device
				drivers:
				</para>

				<glosslist>
					<glossentry>
						<glossterm>GLcontext</glossterm>

						<glossdef>
							<para>
							this contains the Mesa rendering state
							</para>
						</glossdef>
					</glossentry>

					<glossentry>
						<glossterm>GLvisual</glossterm>

						<glossdef>
							<para>
							this describes the color buffer (rgb vs. ci), whether
							or not there's a depth buffer, stencil buffer, etc.
							</para>
						</glossdef>
					</glossentry>

					<glossentry>
						<glossterm>GLframebuffer</glossterm>

						<glossdef>
							<para>
							contains pointers to the depth buffer, stencil
							buffer, accum buffer and alpha buffers.
							</para>
						</glossdef>
					</glossentry>
				</glosslist>

				<para>
				These types should be encapsulated by corresponding device driver
				data types.  See
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xmesa.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xmesa.h</filename>
				</ulink> and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xmesaP.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xmesaP.h</filename>
				</ulink> for an example.
				</para>

				<para>
				In OOP terms, <structname>GLcontext</structname>, <structname>GLvisual</structname>, and <structname>GLframebuffer</structname> are base classes
				which the device driver must derive from.
				</para>

				<para>
				The structure <structname>dd_function_table</structname> seen in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/dd.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>dd.h</filename>
				</ulink>,
				defines the device driver functions.
				By using a table of pointers, the device driver can be changed dynamically at runtime.
				For example, the X/Mesa and OS/Mesa (Off-Screen rendering) device drivers can co-exist in one library and be selected at runtime.
				</para>

				<para>
				In addition to the device driver table functions, each Mesa driver has its own set of unique interface functions.
				For example, the X/Mesa driver has the <function>XMesaCreateContext</function>, <function>XMesaBindWindow</function>, and <function>XMesaSwapBuffers</function>
				functions while the Windows/Mesa interface has <function>WMesaCreateContext</function>, <function>WMesaPaletteChange</function> and <function>WMesaSwapBuffers</function>.
				New Mesa drivers need to both implement the <structname>dd_function_table</structname> functions and define a set of unique window system or operating system-specific interface functions.
				</para>

				<para>
				The device driver functions can roughly be divided into four groups:
				</para>

				<orderedlist>
					<listitem>
						<para>
						pixel span functions which read or write horizontal runs of RGB or color-index pixels.
						Each function takes an array of mask flags which indicate whether or not to plot each pixel in the span.
						</para>
					</listitem>

					<listitem>
						<para>
						pixel array functions which are very similar to the pixel span functions except that they're used to read or write arrays of pixels at random locations rather than horizontal runs.
						</para>
					</listitem>

					<listitem>
						<para>
						miscellaneous functions for window clearing, setting the current drawing color, enabling/disabling dithering, returning the current frame buffer size, specifying the window clear color, synchronization, etc.
						Most of these functions directly correspond to higher level OpenGL functions.
						</para>
					</listitem>

					<listitem>
						<para>
						if your graphics hardware or operating system provides accelerated point, line and polygon rendering operations, they can be utilized through the <structfield>PointsFunc</structfield>, <structfield>LineFunc</structfield>, and <structfield>TriangleFunc</structfield> functions.
						Mesa will call these functions to <quote>ask</quote> the device driver for accelerated functions through the <function>UpdateState</function> function.
						If the device driver can provide an appropriate renderer, given the current Mesa state, then a pointer to that function can be returned.
						Otherwise the <structfield>PointsFunc</structfield>, <structfield>LineFunc</structfield>, and <structfield>TriangleFunc</structfield> function pointers can just be set to NULL.
						</para>
					</listitem>
				</orderedlist>
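
				<para>
				As an example of the first group, a masked span write might look roughly like the
				sketch below.  The frame buffer layout (one packed 32-bit value per pixel), the
				signature and the names <structname>sketch_framebuffer</structname> and
				<function>sketch_write_rgba_span</function> are assumptions made for illustration;
				the real prototypes are the ones declared in <filename>dd.h</filename>.
				</para>

<programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical linear frame buffer: one packed 32-bit value per pixel. */
struct sketch_framebuffer {
    uint32_t *Pixels;
    size_t Width;
    size_t Height;
};

/* Write a horizontal run of n pixels starting at (x, y).
 * mask[i] == 0 means "do not plot pixel i"; a NULL mask plots all pixels.
 * Pixels that would fall outside the buffer are silently skipped. */
void sketch_write_rgba_span(struct sketch_framebuffer *fb,
                            size_t n, size_t x, size_t y,
                            const uint32_t rgba[],
                            const unsigned char mask[])
{
    uint32_t *row;
    size_t i;

    if (y >= fb->Height)
        return;
    row = fb->Pixels + y * fb->Width;
    for (i = 0; i < n; i++) {
        if (x + i >= fb->Width)
            break;                      /* rest of the span is off-screen */
        if (mask == NULL || mask[i])
            row[x + i] = rgba[i];
    }
}
```
]]></programlisting>

				<para>
				A pixel array function would differ mainly in taking per-pixel x and y arrays instead of a single starting position.
				</para>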

				<para>
				Even if hardware accelerated renderers aren't available, the device driver may implement tuned, special purpose code for common kinds of points, lines or polygons.
				The X/Mesa device driver does this for a number of lines and polygons.
				See the
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xmesa3.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xmesa3.c</filename>
				</ulink> file.
				</para>
			</sect4>

			<sect4>
				<title>Overall Organization</title>

				<para>
				The overall relation of the core Mesa library, X device driver/interface, toolkits and application programs is shown in this diagram:
				</para>

<programlisting>
+-----------------------------------------------------------+
|                                                           |
|                   Application Programs                    |
|                                                           |
|          +- glu.h -+------ glut.h -------+                |
|          |         |                     |                |
|          |   GLU   |        GLUT         |                |
|          |         |      toolkits       |                |
|          |         |                     |                |
+---------- gl.h ------------+-------- glx.h ----+          |
|                            |                   |          |
|         Mesa core          |   GLX functions   |          |
|                            |                   |          |
+---------- dd.h ------------+------------- xmesa.h --------+
|                                                           |
|              XMesa* and device driver functions           |
|                                                           |
+-----------------------------------------------------------+
|                 Hardware/OS/Window System                 |
+-----------------------------------------------------------+
</programlisting>

			</sect4>
		</sect3>

		<sect3 id="mesa-4.x-internals">
			<title>
				Mesa 4.x
				<footnote>
					<para>
					The big changes in Mesa were made between
					Mesa 3.4.x and Mesa 3.5.  That's when Keith re-modularized the source
					code into separate modules for T&amp;L, s/w rasterization, etc.
					</para>
				</footnote>
				Implementation Notes
			</title>

			<para>
			This document is an overview of the internal structure of Mesa and is meant for those who are interested in modifying or enhancing Mesa, or just curious.
			</para>

			<note>
				<para>
				Based on the original <ulink url="http://www.mesa3d.org/docs/Implementation.html">Mesa Implementation Notes</ulink> and corrections by Brian Paul.
				</para>
			</note>

			<sect4>
				<title>Library State and Contexts</title>

				<para>
				OpenGL uses the notion of a state machine.
				Almost all OpenGL state is contained in
				one large structure: <structname>__GLcontextRec</structname> (typedef'd to <structname>GLcontext</structname>), as seen in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/mtypes.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>mtypes.h</filename>
				</ulink>. This is the central context data structure for Mesa.
				</para>

				<para>
				The <structname>__GLcontextRec</structname> structure actually contains a number of sub structures which exactly correspond to OpenGL's attribute groups.
				This organization made <function>glPushAttrib</function> and <function>glPopAttrib</function> trivial to implement and proved to be a good way of organizing the state variables.
				</para>
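
				<para>
				The following is a minimal sketch of why per-group sub structures make attribute push and pop simple.
				It is not Mesa's code: the structure names, the two attribute groups, the <literal>SKETCH_*</literal> bits and the stack depth are all hypothetical, chosen only to illustrate the idea of saving and restoring whole groups by structure copy.
				</para>

<programlisting><![CDATA[
#define MAX_ATTRIB_DEPTH 16

/* Hypothetical attribute groups; the real ones are declared in mtypes.h. */
struct fog_attrib  { int enabled; float density; };
struct line_attrib { float width; int stipple_enabled; };

/* Hypothetical mask bits, one per group (in the spirit of GL_FOG_BIT). */
#define SKETCH_FOG_BIT  0x1u
#define SKETCH_LINE_BIT 0x2u

struct saved_attribs {
    unsigned mask;              /* which groups this entry holds */
    struct fog_attrib fog;
    struct line_attrib line;
};

struct sketch_context {
    struct fog_attrib fog;      /* one sub structure per attribute group */
    struct line_attrib line;
    struct saved_attribs stack[MAX_ATTRIB_DEPTH];
    int stack_depth;
};

/* Save the requested groups by plain structure copy. Returns 0 on overflow. */
int sketch_push_attrib(struct sketch_context *ctx, unsigned mask)
{
    struct saved_attribs *s;
    if (ctx->stack_depth >= MAX_ATTRIB_DEPTH)
        return 0;
    s = &ctx->stack[ctx->stack_depth++];
    s->mask = mask;
    if (mask & SKETCH_FOG_BIT)
        s->fog = ctx->fog;
    if (mask & SKETCH_LINE_BIT)
        s->line = ctx->line;
    return 1;
}

/* Restore only the groups the matching push saved. Returns 0 on underflow. */
int sketch_pop_attrib(struct sketch_context *ctx)
{
    const struct saved_attribs *s;
    if (ctx->stack_depth <= 0)
        return 0;
    s = &ctx->stack[--ctx->stack_depth];
    if (s->mask & SKETCH_FOG_BIT)
        ctx->fog = s->fog;
    if (s->mask & SKETCH_LINE_BIT)
        ctx->line = s->line;
    return 1;
}
]]></programlisting>

				<para>
				Because each group is a single structure, adding a new state variable to a group requires no change to the push or pop code in this sketch.
				</para>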
			</sect4>

			<sect4>
				<title>Vertex buffer</title>

				<para>
				The <structname>immediate</structname> structure represents everything that can take
				place between <function>glBegin</function> and <function>glEnd</function>,
				and can hold multiple <function>glBegin</function>/<function>glEnd</function> pairs.
				It can be used to losslessly encode this information in display lists.
				See
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/tnl/t_context.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>t_context.h</filename>
				</ulink>
				and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/tnl/t_imm_api.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>t_imm_api.c</filename>
				</ulink>.
				</para>

				<para>

				When either the vertex buffer becomes full or a state change is made outside a
				<function>glBegin</function>/<function>glEnd</function> pair, we must flush
				the buffer.
				That is, we apply the vertex transformations and compute lighting,
				fog, texture coordinates, etc.
				The various vertex transformations are implemented as software pipeline
				stages by the
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/tnl/t_pipeline.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>tnl/t_pipeline.c</filename>
				</ulink>
				and <filename>tnl/t_vb_*.c</filename> files.
				</para>

				<para>
				When we're outside of a <function>glBegin</function>/<function>glEnd</function> pair the information in this
				structure is retained pending either of the flushing events
				described above.
				</para>

				<note>
					<para>
					Originally, Mesa didn't accumulate vertices in this way.
					Instead, <function>glVertex</function> transformed and lit then buffered each vertex as it was received.
					When enough vertices to draw the primitive (1 for points, 2 for lines, &gt;2 for polygons) were accumulated the primitive was drawn and the buffer cleared.
					</para>

					<para>
					The new approach of buffering many vertices and then transforming, lighting and clip testing is faster because it's done in a <quote>vectorized</quote> manner.
					See <function>gl_transform_points</function> in
					<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/math/m_xform.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
						<filename>math/m_xform.c</filename>
					</ulink>
					for an example.
					</para>
				</note>

				<para>
				For best performance Mesa clients should try to maximize the number of vertices between <function>glBegin</function>/<function>glEnd</function> pairs and use connected primitives when possible.
				</para>
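
				<para>
				The buffering and flushing behaviour described in this section can be sketched roughly as follows.
				This is an illustration only: the <structname>sketch_vb</structname> structure, its tiny size and all function names are assumptions, and only the transform step is shown where the real pipeline would also do lighting, fog and clipping.
				</para>

<programlisting><![CDATA[
#define VB_SIZE 4   /* tiny on purpose; a real buffer holds many more vertices */

struct sketch_vb {
    float obj[VB_SIZE * 4];    /* buffered object-space positions (x, y, z, w) */
    float clip[VB_SIZE * 4];   /* output of the transform step */
    float matrix[16];          /* current column-major transform */
    int count;                 /* vertices waiting in the buffer */
    int flushes;               /* flush counter, for illustration only */
};

/* Transform all n buffered points in one tight loop ("vectorized"). */
static void sketch_transform_points(float *out, const float *in, int n,
                                    const float m[16])
{
    int i;
    for (i = 0; i < n; i++) {
        const float x = in[4 * i], y = in[4 * i + 1];
        const float z = in[4 * i + 2], w = in[4 * i + 3];
        out[4 * i + 0] = m[0] * x + m[4] * y + m[8] * z + m[12] * w;
        out[4 * i + 1] = m[1] * x + m[5] * y + m[9] * z + m[13] * w;
        out[4 * i + 2] = m[2] * x + m[6] * y + m[10] * z + m[14] * w;
        out[4 * i + 3] = m[3] * x + m[7] * y + m[11] * z + m[15] * w;
    }
}

/* Process everything that is pending, then empty the buffer. */
void sketch_flush(struct sketch_vb *vb)
{
    if (vb->count == 0)
        return;
    sketch_transform_points(vb->clip, vb->obj, vb->count, vb->matrix);
    /* lighting, fog, clipping and rendering would follow here */
    vb->count = 0;
    vb->flushes++;
}

/* glVertex-like entry point: just store the vertex, flush when full. */
void sketch_vertex(struct sketch_vb *vb, float x, float y, float z)
{
    float *v = &vb->obj[4 * vb->count];
    v[0] = x; v[1] = y; v[2] = z; v[3] = 1.0f;
    if (++vb->count == VB_SIZE)
        sketch_flush(vb);
}

/* A state change must flush first so pending vertices use the old state. */
void sketch_state_change(struct sketch_vb *vb, const float new_matrix[16])
{
    int i;
    sketch_flush(vb);
    for (i = 0; i < 16; i++)
        vb->matrix[i] = new_matrix[i];
}
]]></programlisting>

				<para>
				In this sketch the per-vertex entry point does almost no work, and the arithmetic is concentrated in one loop over the whole buffer, which is the property the note above credits for the speedup.
				</para>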
			</sect4>

			<sect4>
				<title>Rasterization</title>

				<para>
				The point, line and polygon rasterizers are called via the <structfield>Point</structfield>,
				<structfield>Line</structfield>, and <structfield>Triangle</structfield> function pointers in
				the <structname>SWcontext</structname> structure in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/swrast/s_context.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
						<filename>swrast/s_context.h</filename>
				</ulink>.
				Whenever the library state is changed in a significant way, the <structfield>NewState</structfield> context flag is raised.
				When <function>glBegin</function> is called <structfield>NewState</structfield> is checked. If the flag is set we re-evaluate the state to determine what rasterizers to use.

				Special purpose rasterizers are selected according to the status of certain state variables such as flat vs. smooth shading, depth-buffered vs. non-depth-buffered, etc.
				The <function>_swrast_choose_*</function> functions do this analysis.

				It's up to the device driver to choose optimized
				or accelerated rasterization functions to replace those in the general
				software rasterizer.
				</para>

				<para>
				In general, typical states (depth-buffered &amp; smooth-shading) result in optimized rasterizers being selected.
				Non-typical states (stenciling, blending, stippling) result in slower, general purpose rasterizers being selected.
				</para>
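
				<para>
				As a rough illustration of the selection mechanism, the sketch below re-chooses a triangle function only when a dirty flag is set.
				The context layout, the three placeholder rasterizers and the selection rules are assumptions made for the example; the real logic lives in the <function>_swrast_choose_*</function> functions and is considerably more detailed.
				</para>

<programlisting><![CDATA[
struct sketch_raster_ctx;
typedef int (*sketch_triangle_func)(struct sketch_raster_ctx *ctx);

/* Identifiers returned by the placeholder rasterizers below. */
enum { SKETCH_GENERAL, SKETCH_SMOOTH_Z, SKETCH_FLAT_Z };

struct sketch_raster_ctx {
    int smooth_shading, depth_test;   /* "typical" state */
    int stencil, blend, stipple;      /* "non-typical" state */
    int new_state;                    /* raised on every significant change */
    sketch_triangle_func Triangle;    /* currently selected rasterizer */
};

/* Placeholders: a real rasterizer would draw; these only identify themselves. */
static int general_triangle(struct sketch_raster_ctx *ctx)  { (void)ctx; return SKETCH_GENERAL; }
static int smooth_z_triangle(struct sketch_raster_ctx *ctx) { (void)ctx; return SKETCH_SMOOTH_Z; }
static int flat_z_triangle(struct sketch_raster_ctx *ctx)   { (void)ctx; return SKETCH_FLAT_Z; }

/* Pick a special purpose function when the state allows it. */
static void sketch_choose_triangle(struct sketch_raster_ctx *ctx)
{
    if (ctx->stencil || ctx->blend || ctx->stipple || !ctx->depth_test)
        ctx->Triangle = general_triangle;     /* slow but handles everything */
    else if (ctx->smooth_shading)
        ctx->Triangle = smooth_z_triangle;
    else
        ctx->Triangle = flat_z_triangle;
}

void sketch_raster_init(struct sketch_raster_ctx *ctx)
{
    ctx->smooth_shading = 1;
    ctx->depth_test = 1;
    ctx->stencil = ctx->blend = ctx->stipple = 0;
    ctx->new_state = 1;               /* force a choice on first use */
    ctx->Triangle = general_triangle;
}

/* Any state setter raises the flag instead of re-choosing immediately. */
void sketch_set_blend(struct sketch_raster_ctx *ctx, int on)
{
    ctx->blend = on;
    ctx->new_state = 1;
}

/* glBegin-like entry point: re-evaluate only if something changed. */
int sketch_begin(struct sketch_raster_ctx *ctx)
{
    if (ctx->new_state) {
        sketch_choose_triangle(ctx);
        ctx->new_state = 0;
    }
    return ctx->Triangle(ctx);
}
]]></programlisting>

				<para>
				Deferring the choice until the next <function>glBegin</function> means that a burst of state changes costs a single re-evaluation in this sketch.
				</para>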
			</sect4>

			<sect4>
				<title>Pixel spans</title>

				<para>
				<function>Point</function>, <function>Line</function>, <function>Triangle</function>, <function>glDrawPixels</function>, <function>glCopyPixels</function> and <function>glBitmap</function>
				all use the <structname>sw_span</structname> structure and the functions in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/swrast/s_span.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>swrast/s_span.c</filename>
				</ulink> to generate horizontal runs of pixels called spans.
				Processing includes window clipping, depth testing, stenciling, texturing, etc.
				After processing, the span is written to the frame buffer by calling a device driver function.
				The goal is to maximize the number of pixels processed inside loops and to minimize
				the number of function calls.
				</para>

				<note>
					<para>
					Pixel buffers are no longer present in the latest Mesa code (4.1).

					All fragment (pixels plus color, depth, texture coordinates) processing is done via the span functions in <filename>swrast/s_span.c</filename>.
					</para>
				</note>
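
				<para>
				A minimal sketch of span processing with a per-pixel mask is given below.
				The structures are simplified stand-ins rather than Mesa's <structname>sw_span</structname>, the fixed 8x2 frame buffer is an assumption made to keep the example self-contained, and only window clipping and a less-than depth test are shown.
				</para>

<programlisting><![CDATA[
#define FB_W 8
#define FB_H 2
#define MAX_SPAN FB_W

struct sketch_fb {
    float depth[FB_H][FB_W];
    unsigned color[FB_H][FB_W];
};

struct sketch_span {
    int x, y, count;                /* leftmost pixel, row, length */
    float z[MAX_SPAN];
    unsigned color[MAX_SPAN];
    unsigned char mask[MAX_SPAN];   /* 1 = pixel is still to be plotted */
};

void sketch_clear_fb(struct sketch_fb *fb, float depth)
{
    int x, y;
    for (y = 0; y < FB_H; y++)
        for (x = 0; x < FB_W; x++) {
            fb->depth[y][x] = depth;
            fb->color[y][x] = 0;
        }
}

/* Build a constant-depth, constant-colour span with every pixel enabled. */
void sketch_fill_span(struct sketch_span *s, int x, int y, int count,
                      float z, unsigned color)
{
    int i;
    s->x = x; s->y = y;
    s->count = count > MAX_SPAN ? MAX_SPAN : count;
    for (i = 0; i < s->count; i++) {
        s->z[i] = z; s->color[i] = color; s->mask[i] = 1;
    }
}

/* Window clipping: mask off pixels that fall outside the buffer. */
static void clip_span(struct sketch_span *s)
{
    int i;
    for (i = 0; i < s->count; i++)
        if (s->y < 0 || s->y >= FB_H || s->x + i < 0 || s->x + i >= FB_W)
            s->mask[i] = 0;
}

/* Depth test over the whole span in one loop, with no per-pixel calls. */
static void depth_test_span(struct sketch_fb *fb, struct sketch_span *s)
{
    int i;
    for (i = 0; i < s->count; i++) {
        if (!s->mask[i])
            continue;
        if (s->z[i] < fb->depth[s->y][s->x + i])
            fb->depth[s->y][s->x + i] = s->z[i];
        else
            s->mask[i] = 0;
    }
}

/* Stand-in for the driver's span writer; returns the number of pixels written. */
static int driver_write_span(struct sketch_fb *fb, const struct sketch_span *s)
{
    int i, written = 0;
    for (i = 0; i < s->count; i++)
        if (s->mask[i]) {
            fb->color[s->y][s->x + i] = s->color[i];
            written++;
        }
    return written;
}

int sketch_process_span(struct sketch_fb *fb, struct sketch_span *s)
{
    if (s->count > MAX_SPAN)
        s->count = MAX_SPAN;
    clip_span(s);
    depth_test_span(fb, s);
    return driver_write_span(fb, s);
}
]]></programlisting>

				<para>
				Each processing step in the sketch clears mask entries instead of shortening the span, so later steps and the final driver write can all run over the same arrays.
				</para>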

			</sect4>

			<sect4>
				<title>Device Driver</title>

				<para>
				There are three Mesa data types which are meant to be used by device
				drivers:
				</para>

				<glosslist>
					<glossentry>
						<glossterm>GLcontext</glossterm>

						<glossdef>
							<para>
							this contains the Mesa rendering state
							</para>
						</glossdef>
					</glossentry>

					<glossentry>
						<glossterm>GLvisual</glossterm>

						<glossdef>
							<para>
							this describes the color buffer (rgb vs. ci), whether
							or not there's a depth buffer, stencil buffer, etc.
							</para>
						</glossdef>
					</glossentry>

					<glossentry>
						<glossterm>GLframebuffer</glossterm>

						<glossdef>
							<para>
							contains pointers to the depth buffer, stencil
							buffer, accum buffer and alpha buffers.
							</para>
						</glossdef>
					</glossentry>
				</glosslist>

				<para>
				These types should be encapsulated by corresponding device driver
				data types.  See
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xmesa.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xmesa.h</filename>
				</ulink> and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dri/xc/xc/extras/Mesa/src/xmesaP.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>xmesaP.h</filename>
				</ulink> for an example.
				</para>

				<para>
				In OOP terms, <structname>GLcontext</structname>, <structname>GLvisual</structname>, and <structname>GLframebuffer</structname> are base classes
				which the device driver must derive from.
				</para>
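
				<para>
				C has no inheritance, so this <quote>derivation</quote> is expressed by wrapping.
				The sketch below shows one possible arrangement, in which the driver object points at the core context and the core context keeps an opaque pointer back to the driver object.
				All names here are hypothetical; consult <filename>xmesaP.h</filename> for how the X driver actually lays out its types.
				</para>

<programlisting><![CDATA[
#include <stdlib.h>

/* Stand-in for the core rendering context. */
struct sketch_core_context {
    int depth_test;      /* placeholder for the core rendering state */
    void *driver_ctx;    /* opaque pointer back to the wrapping driver object */
};

/* Hypothetical driver type that "derives" from the core context. */
struct sketch_x_context {
    struct sketch_core_context *core;   /* the "base class" part */
    int window_id;                      /* driver-private data */
};

/* Create both halves and link them in both directions. */
struct sketch_x_context *sketch_x_create(int window_id)
{
    struct sketch_x_context *xc = calloc(1, sizeof *xc);
    if (!xc)
        return NULL;
    xc->core = calloc(1, sizeof *xc->core);
    if (!xc->core) {
        free(xc);
        return NULL;
    }
    xc->core->driver_ctx = xc;
    xc->window_id = window_id;
    return xc;
}

/* A driver callback is handed only the core context and recovers its own data. */
int sketch_x_window_of(const struct sketch_core_context *core)
{
    const struct sketch_x_context *xc = core->driver_ctx;
    return xc->window_id;
}

void sketch_x_destroy(struct sketch_x_context *xc)
{
    if (xc) {
        free(xc->core);
        free(xc);
    }
}
]]></programlisting>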

				<para>
				The structure <structname>dd_function_table</structname> seen in
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/dd.h?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>dd.h</filename>
				</ulink>,
				defines the device driver functions
				<footnote>
					<para>
					Many of the functions which used to be in the dd_function_table are
					now moved into the tnl or swrast modules.
					</para>
				</footnote>.
				By using a table of pointers, the device driver can be changed dynamically at runtime.
				For example, the X/Mesa and OS/Mesa (Off-Screen rendering) device drivers can co-exist in one library and be selected at runtime.
				</para>
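
				<para>
				The runtime switching that a table of pointers allows can be sketched as follows.
				The two-entry table, the two toy drivers and their private pixel arrays are assumptions for illustration only; the real <structname>dd_function_table</structname> has many more entries with different signatures.
				</para>

<programlisting><![CDATA[
#define SKETCH_PIXELS 4

/* Hypothetical, heavily reduced driver function table. */
struct sketch_dd_table {
    void (*Clear)(unsigned value);
    const unsigned *(*GetPixels)(void);
};

/* Toy "window" driver with its own private surface. */
static unsigned window_pixels[SKETCH_PIXELS];
static void window_clear(unsigned value)
{
    int i;
    for (i = 0; i < SKETCH_PIXELS; i++)
        window_pixels[i] = value;
}
static const unsigned *window_get_pixels(void) { return window_pixels; }

/* Toy "off-screen" driver with a separate private surface. */
static unsigned offscreen_pixels[SKETCH_PIXELS];
static void offscreen_clear(unsigned value)
{
    int i;
    for (i = 0; i < SKETCH_PIXELS; i++)
        offscreen_pixels[i] = value;
}
static const unsigned *offscreen_get_pixels(void) { return offscreen_pixels; }

/* Selecting a driver is just a matter of filling in the table. */
void sketch_use_window_driver(struct sketch_dd_table *dd)
{
    dd->Clear = window_clear;
    dd->GetPixels = window_get_pixels;
}

void sketch_use_offscreen_driver(struct sketch_dd_table *dd)
{
    dd->Clear = offscreen_clear;
    dd->GetPixels = offscreen_get_pixels;
}

/* Core code never names a driver; it only goes through the table. */
void sketch_core_clear(const struct sketch_dd_table *dd, unsigned value)
{
    dd->Clear(value);
}
]]></programlisting>

				<para>
				Both toy drivers are linked into the same program, and the core function stays unchanged whichever one is installed in the table.
				</para>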

				<para>
				In addition to the device driver table functions, each Mesa driver has its own set of unique interface functions.
				For example, the X/Mesa driver has the <function>XMesaCreateContext</function>, <function>XMesaBindWindow</function>, and <function>XMesaSwapBuffers</function>
				functions while the Windows/Mesa interface has <function>WMesaCreateContext</function>, <function>WMesaPaletteChange</function> and <function>WMesaSwapBuffers</function>.
				New Mesa drivers need to both implement the <structname>dd_function_table</structname> functions and define a set of unique window system or operating system-specific interface functions.
				</para>

				<para>
				The device driver functions can roughly be divided into three groups:
				</para>

				<orderedlist>
					<listitem>
						<para>
						pixel span functions which read or write horizontal runs of RGB or color-index pixels.
						Each function takes an array of mask flags which indicate whether or not to plot each pixel in the span.
						</para>
					</listitem>

					<listitem>
						<para>
						miscellaneous functions for window clearing, setting the current drawing color, enabling/disabling dithering, returning the current frame buffer size, specifying the window clear color, synchronization, etc.
						Most of these functions directly correspond to higher level OpenGL functions.
						</para>
					</listitem>

					<listitem>
						<para>
						if your graphics hardware or operating system provides accelerated point, line and polygon rendering operations, they can be utilized through the <structfield>PointsFunc</structfield>, <structfield>LineFunc</structfield>, and <structfield>TriangleFunc</structfield> functions.
						Mesa will call these functions to <quote>ask</quote> the device driver for accelerated functions through the <function>UpdateState</function> function.
						If the device driver can provide an appropriate renderer, given the current Mesa state, then a pointer to that function can be returned.
						Otherwise the <structfield>PointsFunc</structfield>, <structfield>LineFunc</structfield>, and <structfield>TriangleFunc</structfield> function pointers can just be set to NULL.
						</para>
					</listitem>
				</orderedlist>

				<para>
				Even if hardware accelerated renderers aren't available, the device driver may implement tuned, special purpose code for common kinds of points, lines or polygons.
				The X/Mesa device driver does this for a number of lines and polygons.
				See the
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/X/xm_line.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>X/xm_line.c</filename>
				</ulink> and
				<ulink url="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mesa3d/Mesa/src/X/xm_tri.c?rev=HEAD&amp;content-type=text/vnd.viewcvs-markup">
					<filename>X/xm_tri.c</filename>
				</ulink> files.
				</para>
			</sect4>

			<sect4>
				<title>Overall Organization</title>

				<para>
				The overall relation of the core Mesa library, X device driver/interface, toolkits and application programs is shown in this diagram:
				</para>

<programlisting>
+-----------------------------------------------------------+
|                                                           |
|                   Application Programs                    |
|                                                           |
|          +- glu.h -+------ glut.h -------+                |
|          |         |                     |                |
|          |   GLU   |        GLUT         |                |
|          |         |      toolkits       |                |
|          |         |                     |                |
+---------- gl.h ------------+-------- glx.h ----+          |
|                            |                   |          |
|         Mesa core          |   GLX functions   |          |
|                            |                   |          |
+---------- dd.h ------------+------------- xmesa.h --------+
|                                                           |
|              XMesa* and device driver functions           |
|                                                           |
+-----------------------------------------------------------+
|                 Hardware/OS/Window System                 |
+-----------------------------------------------------------+
</programlisting>

			</sect4>
		</sect3>

		<sect3>
			<title>Mesa's pipeline</title>

			<para>
			The work starts in <filename>t_pipeline.c</filename>, where a driver-configurable pipeline is run in response to either
			the vertex buffer filling up or a state change.
			</para>

			<para>
			The pipeline stages operate on context variables (such
			as vertex coordinates, colors, normals, texture coordinates, etc.), applying the
			necessary operations of an OpenGL pipeline (such as coordinate transformation,
			lighting, etc.).
			</para>
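
			<para>
			A configurable pipeline of this kind can be pictured as an array of stages that each work on shared context data.
			The sketch below is not taken from <filename>t_pipeline.c</filename>: the stage structure, the three toy stages and the convention that a stage returns 0 to end the run are assumptions for illustration.
			</para>

<programlisting><![CDATA[
#include <stddef.h>

#define N_VERTS 3

/* Shared per-run data that every stage reads and writes. */
struct sketch_tnl {
    float coord[N_VERTS];   /* one coordinate per vertex, to keep it short */
    float color[N_VERTS];
    int rendered;           /* number of vertices handed to the renderer */
};

struct sketch_stage {
    const char *name;
    int active;                          /* inactive stages are skipped */
    int (*run)(struct sketch_tnl *t);    /* returns 0 to end the pipeline */
};

static int run_transform(struct sketch_tnl *t)
{
    int i;
    for (i = 0; i < N_VERTS; i++)
        t->coord[i] *= 2.0f;             /* placeholder for a real transform */
    return 1;
}

static int run_lighting(struct sketch_tnl *t)
{
    int i;
    for (i = 0; i < N_VERTS; i++)
        t->color[i] = t->coord[i] * 0.5f; /* placeholder, not a lighting model */
    return 1;
}

static int run_render(struct sketch_tnl *t)
{
    t->rendered = N_VERTS;
    return 0;                            /* last stage: nothing runs after it */
}

/* A driver could install its own array, e.g. with a different render stage. */
struct sketch_stage sketch_default_pipeline[] = {
    { "transform", 1, run_transform },
    { "lighting",  1, run_lighting  },
    { "render",    1, run_render    },
    { NULL,        0, NULL          }    /* terminator */
};

/* Run the active stages in order; returns how many stages were executed. */
int sketch_run_pipeline(struct sketch_tnl *t, const struct sketch_stage *stages)
{
    int ran = 0;
    for (; stages->run != NULL; stages++) {
        if (!stages->active)
            continue;
        ran++;
        if (!stages->run(t))
            break;
    }
    return ran;
}
]]></programlisting>

			<para>
			In this arrangement a state change only has to toggle the <structfield>active</structfield> flag of the affected stages, for example switching lighting off, and the next run skips them.
			</para>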

			<para>
			The last stage, rendering, calls <function>*BuildVertices</function> in <filename>*_vb.c</filename>, which applies the
			viewport transformation, perspective divide and data type conversion, and packs the
			vertex data in the context (in the arrays <structfield>tnl->vb->*Ptr->data</structfield>) into a driver-dependent
			buffer with just the information relevant for the current OpenGL
			state (e.g., with/without texture, fog, etc.). The template <filename>t_dd_vbtmp.h</filename> does
			this into a D3D-like vertex structure format.
			</para>
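
			<para>
			The vertex building step can be sketched as below.
			This is not the code generated by <filename>t_dd_vbtmp.h</filename>: the vertex layout, the viewport structure (a scale and a translation per axis), the ARGB packing order and the function name are all assumptions, and the clip-space <literal>w</literal> is assumed to be non-zero because clipping has already happened.
			</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Hypothetical viewport mapping: window = ndc * scale + translate. */
struct sketch_viewport { float sx, sy, sz, tx, ty, tz; };

/* Hypothetical D3D-like hardware vertex. */
struct sketch_hw_vertex {
    float x, y, z, rhw;   /* window coordinates and 1/w */
    unsigned argb;        /* colour packed as 8 bits per channel */
    float u, v;           /* texture coordinates, zero when texturing is off */
};

/* Data type conversion: clamp a float colour channel to 0..255. */
static unsigned to_ubyte(float c)
{
    if (c <= 0.0f) return 0u;
    if (c >= 1.0f) return 255u;
    return (unsigned)(c * 255.0f + 0.5f);
}

/*
 * clip: n * 4 floats (x, y, z, w) in clip space
 * rgba: n * 4 floats in the range 0..1
 * tex:  n * 2 floats, or NULL when the current state has no texture
 */
void sketch_build_vertices(struct sketch_hw_vertex *dst, int n,
                           const float *clip, const float *rgba,
                           const float *tex,
                           const struct sketch_viewport *vp)
{
    int i;
    for (i = 0; i < n; i++) {
        const float *c = clip + 4 * i;
        const float *col = rgba + 4 * i;
        const float oow = 1.0f / c[3];            /* perspective divide */

        dst[i].x = c[0] * oow * vp->sx + vp->tx;  /* viewport transform */
        dst[i].y = c[1] * oow * vp->sy + vp->ty;
        dst[i].z = c[2] * oow * vp->sz + vp->tz;
        dst[i].rhw = oow;

        dst[i].argb = (to_ubyte(col[3]) << 24) | (to_ubyte(col[0]) << 16)
                    | (to_ubyte(col[1]) << 8)  |  to_ubyte(col[2]);

        if (tex) {
            dst[i].u = tex[2 * i];
            dst[i].v = tex[2 * i + 1];
        } else {
            dst[i].u = dst[i].v = 0.0f;
        }
    }
}
]]></programlisting>

			<para>
			A real driver would typically select among several such functions, one per combination of enabled attributes, so that only the relevant fields are written for the current state.
			</para>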

			<para>
			For instance, if we needed to premultiply the texture coordinates, as is
			done in the tdfx and mach64 drivers, we would need to make a customized version of
			<filename>t_dd_vbtmp.h</filename> for that purpose, or change it and supply a configuration
			parameter to control that behavior.
			</para>

			<para>
			This buffer is then used to render the primitives in <filename>*_tris.c</filename>. This vertex data
			is intended to be copied almost verbatim into DMA buffers, with a header
			command, in most chips with DMA.
			</para>

			<para>
			But in the case of Mach64, where the commands are interleaved with each of the
			vertex data elements, it will be necessary to use a different structure of
			*Vertex to do the same, and probably to come up with a rather different
			implementation of t_dd_vbtmp.h as well.
			</para>

			<para>
			Indeed, if the chip expects something quite different to the d3d vertices, one
			will certainly want to look at this.  In the meantime, it may be simplest to
			go with a <quote>normal-looking</quote> <filename>*_vb.c</filename> and do some extra stuff in the
			triangle/line/point functions.  The ffb and glint drivers are a bit like this,
			I think.
			</para>

			<para>
			This whole mechanism is controlled with function pointers in the context, which
			are rechosen whenever the OpenGL state changes enough. These function
			pointers can also be overwritten with those in the <filename>sw_*</filename> modules to fall back to
			software rendering.
			</para>
		</sect3>
	</sect2>

	<sect2>
		<title>
		How about the main X drawing surface?  Are 2 extra "window
		sized" buffers allocated for primary and secondary buffers in a
		page-flipping configuration?
		</title>

		<para>
		Right now, we don't do page flipping at all. Everything is a blit from
		back to front. The biggest problem with page flipping is detecting when
		you're in full screen mode, since OpenGL doesn't really have a concept
		of full screen mode. We want a solution that works for existing
		games. So we've been designing a solution for it. It should get
		implemented fairly soon since we need it for antialiasing on the V5.
		</para>

		<para>
		In the current implementation the X front buffer is the 3D front
		buffer. When we do page flipping we'll continue to do the same
		thing. Since you have an X window that covers the screen it is safe for
		us to use the X surface's memory. Then we'll do page flipping. The only
		issue will be falling back to blitting if the window is ever moved from
		covering the whole screen.
		</para>
	</sect2>

	<sect2>
		<title>Clipping</title>

		<para>
		This section introduces several concepts associated with clipping.
		<footnote>
			<para>
			Contributed by Leif Delgass.
			</para>
		</footnote>
		</para>

		<sect3>
			<title>Scissors</title>

			<para>
			The scissors are register settings that determine a hardware clipping rect
			in window coords.  Any part of a primitive or other drawing operation that
			extends beyond the scissors is not drawn.  The scissors can be set through