WinGlide News Archive


Some of the links in the news archive may be broken.
2/28/00
Supersampling capable WinGlide released
I finished the supersample WinGlide release. It can provide four sample antialiasing in a window with certain applications when used with a Voodoo Graphics or Voodoo2 based card and the 3dfx MiniGL. Remember some of the key points I've mentioned in previous news updates when considering if it's worth downloading and trying out this release:
- Installation and uninstallation is complicated. See some of the previous news updates for more information.
- Only low resolutions are supported. Nothing above 400x300 antialiased will work.
- Performance is fairly low. A GLQuake timedemo demo1 will get around 10 frames per second on many P6 class systems with Voodoo2s when running at 400x300 antialiased. Going to 320x240 should bring this up to around 15 frames per second. A number of factors including multiple processors can move these numbers up or down, but don't expect to get over 20 frames per second at 320x240 antialiased even when running on a high end configuration.
- This release may be of interest to you if you can handle the difficult installation procedure and just wanted to see what it would look like if it could be done or are interested in looking at the details in various frames when supersample antialiasing is being used.

The installation instructions and files for this release are on the Supersample antialiasing WinGlide page.

2/15/00
Antialiasing update
I am close to having final code ready that will add a four sample antialiasing mode to WinGlide. Although it is possible to add a four sample antialiasing mode to WinGlide, it will have fairly low performance and will only be able to use low resolutions. Also, installing and later uninstalling it will be fairly complicated.

Intended audience
Because of the low performance and complicated installation procedure, this WinGlide modification will probably only be of interest to a small number of people. Installation is significantly more complex than just putting WinGlide's glide2x.dll in an application directory. You might want to try it out if you can handle the complex installation procedure and:
- you want to run a magnifying glass tool and look at the details of how four sample antialiasing affects image quality in various situations.
- you just want to see what it would look like if it were added to WinGlide (keeping in mind that it will be slow and only support low resolutions).

If you have a Voodoo2 based video card and simply want to see the benefits that more samples can have on image quality without doing a detailed analysis with a magnifying glass tool, don't bother with trying out my WinGlide antialiasing hack. It is much easier to just compare 512x384 full screen with 800x600 full screen.

Preliminary compatibility, performance, and installation information
I am designing the four sample antialiasing WinGlide mode to work with the 3dfx MiniGL (various versions of the MiniGL should work), GLQuake, and Quake 2. Other OpenGL applications that support windowed video modes and work with the 3dfx MiniGL are likely to work, although these will be the two I will run tests with.

I'm making it compatible with GLQuake because it's my favorite application for testing and developing things like this WinGlide mode. Unfortunately, the color scheme used throughout much of GLQuake may not show off the benefits of multisample antialiasing very well. For example, smoothing of jagged edges may be more apparent when looking out windows at the orange sky in Quake 2.

The maximum resolution will be limited to 400x300 with Voodoo2 based video cards and 320x240 with Voodoo Graphics based video cards. 400x300 using four sample antialiasing will be fairly slow. When using the four sample antialiasing mode on my system with a Voodoo2 and running GLQuake at 400x300, demo1 got 11.0 FPS with the SMP code enabled and 10.0 FPS without the SMP code (where it has performance close to that of a similar single processor system). While this is not a very playable frame rate, it is good enough for moving around and seeing how the multisample antialiasing affects various areas.

Installing WinGlide's four sample antialiasing mode will require more than just putting a new glide2x.dll in the application directory. It will also require renaming the 3dfx MiniGL and installing a different opengl32.dll in the application directory that will modify parameters passed to a few OpenGL API calls and then pass things on to the renamed 3dfx MiniGL. This patch file will trick the 3dfx MiniGL into rendering at four times the resolution so that WinGlide will have four samples to work with. As you may have figured out by now, uninstalling the four sample antialiasing demo will require removing the opengl32.dll patch file and renaming the 3dfx MiniGL back to its original name.

The four sample antialiasing comparison mode didn't work out
I attempted to implement a mode that would allow for easy comparison between a four sample antialiased image and a single sample image. The idea was to get the single sample image by choosing one pixel from every four pixel block in the larger image that contained four samples for each pixel used to get the four sample antialiased image. I would then display the one sample image next to the four sample image. The one sample image would be off by a sub pixel on both the x and y axis, but this would not be very noticeable. Unfortunately, texture filtering did not look right on the single sample images I derived from the larger images used to get the four sample image so I decided to drop this mode.
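
As an illustration of the "one pixel from every four pixel block" idea, here is a minimal sketch in C. It is not WinGlide's code: the function name, the 16-bit pixel type, and the choice of the top left pixel of each block are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a single sample image by keeping only the
   top left pixel of every 2x2 block of the larger source image.
   Pitches are measured in pixels, not bytes. */
void point_sample_2x2(const uint16_t *src, size_t src_pitch,
                      uint16_t *dst, size_t dst_pitch,
                      size_t dst_w, size_t dst_h)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < dst_h; ++y) {
        for (size_t x = 0; x < dst_w; ++x) {
            /* Source pixel (2x, 2y) stands in for destination pixel (x, y). */
            dst[y * dst_pitch + x] = src[(2 * y) * src_pitch + 2 * x];
        }
    }
}
```

Because the kept pixel sits in one corner of the block rather than at its center, the result is shifted by a fraction of a destination pixel relative to the blended image, which matches the sub pixel offset described above.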

Some notes about the code
I wrote the standard version of the pixel blending code in C this time because the compiler did a fairly good job on code generation with it. There is also an MMX version of the pixel blending code written in assembly which can process twice as many pixels at a time.
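
To give a rough idea of how packed pixels can be handled several at a time, here is a small sketch in portable C. It is not the MMX code and not WinGlide's blend routine: it assumes 16-bit RGB565 pixels, uses a hypothetical function name, and averages only two pixels per channel (rounding down), so a full four sample blend built from it could differ slightly from an exact average.

```c
#include <stdint.h>

/* Hypothetical helper: average two pairs of RGB565 pixels at once.
   Each 32-bit argument holds two 16-bit pixels. The identity
   (a + b) / 2 == (a & b) + ((a ^ b) >> 1) is applied per channel; the
   mask clears the lowest bit of every channel before the shift so no
   bit can slide into the neighbouring channel. */
uint32_t avg2_packed_rgb565(uint32_t a, uint32_t b)
{
    return (a & b) + (((a ^ b) & 0xF7DEF7DEu) >> 1);
}
```

For example, averaging a white pixel (0xFFFF) with a black pixel (0x0000) gives 0x7BEF, which is half intensity in each of the three channels.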

I decided not to add the optimization I had been thinking about earlier that would keep the data for the pixel blend stage in the cache. It would be nice to try it out, but implementing it gets a bit complicated because of how other parts of the code are written. My best estimate is that this optimization could only improve overall performance by 3-5% on many systems, which is the other reason I'm not going to spend time working on it.

It was easy to integrate the pixel blend code into WinGlide's existing SMP architecture. I only had to change a few lines of code to do this. This should make the time needed to do the extra pixel blend stage negligible on many dual processor systems.

Don't ask about a Glide 3 version of this WinGlide antialiasing mode for possible use with the new 3dfx OpenGL drivers
As many of you may already know, WinGlide for Glide 3.0 (the glide3x.dll version of WinGlide) does not work very well when used in conjunction with the new 3dfx OpenGL drivers. A special patch is required for it to work right at all and crashes are fairly common. I did spend a little time experimenting with the WinGlide four sample antialiasing mode and the new 3dfx OpenGL drivers but decided it just wasn't worth pursuing with all of the problems that were present. Things got even worse after I installed the latest NT 4 Voodoo2 drivers released around the end of January. I was already used to having frequent application crashes when working with the Glide3x version of WinGlide and the 3dfx OpenGL drivers. These could be fixed by just killing the crashed application in Task Manager. But now, I was getting frequent hard system lockups during Voodoo2 initialization after a few of the application crashes. These were the no blue screen and Ctrl-Alt-Del doesn't work kind of lockups. Because of the new even more severe crashes and because WinGlide never worked that well with the new 3dfx OpenGL drivers anyway, I don't have any more plans for WinGlide for Glide 3.0 besides one SMP related bug fix that has been delayed for quite a while.

12/19/99
Fix for using WinGlide with the new 3dfx OpenGL drivers released
First, note that this release has a number of compatibility and reliability problems with certain system configurations. Don't try installing it unless you are prepared for a fairly complicated installation procedure, some crashes, and some hangs.

The first release of the fix for using WinGlide with the new 3dfx OpenGL drivers that link to glide3x.dll has been released. Unfortunately, there are a number of compatibility problems with this fix. Although I have only tested it on a few hardware/software configurations, I believe it will cause hangs on a large number of systems. Some of these problems may be WinGlide related and it may be possible to fix them in a later release. Others appear to be caused by the 3dfx drivers and there is probably nothing I can do in WinGlide or this WinGlide fix to correct these problems. Although I was able to run the Quake III Demo Test windowed with my Voodoo Graphics card using this fix and WinGlide for Glide 3.0, installing the fix is a rather complicated process. Making a mistake while installing it can result in a useless Quake III installation until things get cleaned up. I was hoping this fix might allow WinGlide and the new 3dfx OpenGL drivers to be used with other programs that use OpenGL, but it does not work right with a large number of programs I have tested it with. It works with a couple of games like the Quake III Demo Test and GLQuake, but fails with a number of simple OpenGL programs I used to use to test WinGlide with 3dfx OpenGL Beta 2.1.

If you decide to experiment with using this WinGlide fix, make sure to read ALL of the release notes and ALL of the installation instructions. It is likely that it will not work well on a large number of systems. Installing it incorrectly can cause problems. This WinGlide fix may work great with some systems, but fail badly on others. The release notes, installation instructions, and the fix for using WinGlide with the new 3dfx OpenGL drivers are on this page.

11/5/99
WinGlide with Quake3 compatible 3dfx OpenGL status update
I've compiled the first likely to be released build of the fix for using WinGlide with the Quake3 compatible 3dfx OpenGL drivers. Extensive testing of the fix has revealed some problems. Although the fix works great when everything is configured correctly, there are many things that can go wrong when trying to get a correct configuration set up. I may not be able to fix all of these problems if they are not caused by something in my code. I'm working on putting together improved documentation for dealing with these problems.

I was able to use this fix to get Q3Test to transition from full screen to windowed rendering with WinGlide on my system without having any major problems most of the time. Sometimes, something would hang, Q3Test would need to be terminated, and its graphics configuration would be left in an unusable state. Starting Q3Test again would cause a hang and the video output on my 3dfx card would often be stuck on. To fix things without restarting the system, I had to switch the video cables on the back of my machine around so I could get back to the main 2D display, terminate Q3Test, and work on repairing the configuration. Fixing things involved deleting the q3config.cfg file and starting over, or manually editing the q3config.cfg file so that it had a valid windowed or full screen configuration.

Testing with the fix did not go so well on my friend's system. Switching Q3Test from a full screen video mode to a windowed video mode with WinGlide on his system would always cause a hang to occur. Often, the video output on the 3dfx card was stuck on and video cables had to be switched to get back to the main 2D display to fix things. Since a video mode switch with Q3Test from full screen to windowed with WinGlide always caused a hang, the only way we found to get WinGlide and the fix working on his system was to open the q3config.cfg file in a text editor, enter a valid windowed configuration, and then start Q3Test. Once this was done, Q3Test ran windowed with WinGlide just fine. One difference between this system and my system is that I have a Voodoo Graphics card, but my friend's system has a Voodoo2. I'm not sure if these video mode switching problems are related to differences in the drivers for these two cards, or if something completely different could be responsible for the problems.

On both our systems, it was much easier to get WinGlide, the fix, and the new Quake3 compatible 3dfx OpenGL drivers working with GLQuake. This was because GLQuake only loads opengl32.dll instead of having options to load multiple OpenGL drivers, and because selecting windowed and full screen video modes can easily be done from the command line.

10/22/99
Working on fixing the hang on exit problem with the new Quake3 compatible 3dfx OpenGL drivers
I think I found out why WinGlide has been having a hang on exit problem when used with the new Quake3 compatible 3dfx OpenGL drivers. It looks like changes to the 3dfx OpenGL drivers are conflicting with some of my thread exit code and causing a deadlock situation. The good news is that this problem does not happen when the multi-processor optimizations are not active and when the gamma correction dialog box is not used. The bad news is that fixing this problem will make my thread exit code a bit more complicated.

My fairly standard code for shutting down a worker thread looks something like this, and it is part of what is causing the deadlock to occur:
   TellThreadToExit();
   WaitForSingleObject(hMyThread, INFINITE);
   CleanupSharedSynchronizationObjects();
The problem is that the WaitForSingleObject call on the thread is sometimes not returning even though I called ExitThread from within the thread I was waiting on so that it would exit. It looks like the 3dfx OpenGL driver is calling grSstWinClose from within its DLL entry point function and it is also set up to receive DLL_THREAD_DETACH notifications. This is what leads to the deadlock. Only one thread is allowed to be in a DLL entry point at a time. If my thread shutdown code is called from within the DLL entry point of the 3dfx OpenGL driver, no other thread can enter the 3dfx OpenGL driver's DLL entry point until I return so that it can return from its DLL entry point. Unfortunately, to get my thread to finish exiting so that my WaitForSingleObject call can return requires a DLL_THREAD_DETACH notification in the context of a different thread to go through the 3dfx OpenGL driver's DLL entry point. This DLL_THREAD_DETACH notification cannot happen until I return from my shutdown code in this case, but my code does not return until that notification goes through. This leaves a couple of threads deadlocked and the application being used with WinGlide hangs. Fixing this problem will require that I do not wait for my thread to exit, but I can't clean up my shared synchronization objects until the thread is done using them. I think I've figured out a solution for this problem that should work fairly well, but it will make the thread exit code less straightforward. I wish I could just use that WaitForSingleObject call on my thread, but it will have to go.

WinGlide's gamma correction dialog box also uses an extra thread and has problems with the new 3dfx OpenGL drivers. Although modifying its thread exit code can fix deadlock problems, it isn't enough to get it working correctly. The way these new 3dfx OpenGL drivers like to shut down and reinitialize the 3dfx card every time the application window being used with them loses and regains focus presents some problems for this WinGlide feature.

10/16/99
Some notes about antialiasing
I'm making this update to clarify a few things about WinGlide and the antialiasing methods I've been working with. If you are wondering why I'm trying to put antialiasing in WinGlide when Voodoo Graphics and Voodoo2 based cards already support antialiasing, it is because they support a different kind of antialiasing than what I have been experimenting with.

Just about all of the consumer level 3D accelerators available today support some form of edge antialiasing. There may also be some consumer level 3D cards that support supersample antialiasing, but many of them do not. Edge antialiasing works differently than the supersample antialiasing I've been working on implementing in WinGlide. Some edge antialiasing implementations require that applications be written specifically to take advantage of them (some require that polygons be sorted in order to work correctly). Some other edge antialiasing methods may not work correctly when certain features on the 3D accelerator are in use. On the other hand, the supersample antialiasing I've been working on does not need applications to be written specifically to take advantage of it as the necessary changes can be handled in the drivers. It does not have any conflicts with certain 3D features that cause problems for some types of edge antialiasing. Supersampling can do more than just smooth out jagged edges as it can also get rid of problems with polygon popping. Although supersample antialiasing does not have certain restrictions that may be present when using edge antialiasing, I believe there are many cases where available edge antialiasing support can provide better looking results than 2x2 supersample antialiasing.

I searched for antialiasing information about various video cards and found a few web pages of interest. Keep in mind that I was looking for information about supersample antialiasing. All of the following video cards support some kind of edge antialiasing.

NVIDIA RIVA TNT
It looks like the TNT is capable of doing supersample antialiasing. I found a good article on RivaZone called Anti-Aliasing via SuperSampling on the TNT. This article has some good high resolution screen shots comparing a non-antialiased image to a 2x2 supersample antialiased image. It also points out that the supersampling provides increased detail in addition to smoothing out jagged edges.

By taking a close look at the edges in some of the images, I was able to see the same kinds of visual artifacts I've been observing when doing 2x2 supersample antialiasing with WinGlide.

3dfx Voodoo3
I don't think the Voodoo3 can do supersampling right now. There is a mention of full scene antialiasing under Q&A #27 in the Voodoo3 FAQ which is probably referring to this type of antialiasing.

3dfx Voodoo2
A single Voodoo2 does not support supersample antialiasing. This is covered under Q&A #23 in the Voodoo2 FAQ.

It is possible to get realtime supersample antialiasing from a Mercury subsystem from Quantum 3D. This high end graphics product uses 8 Voodoo2 3 chip graphics subsystems to provide both good image quality and good performance.

10/9/99
Antialiasing with WinGlide
I've received a few emails asking about supersample antialiasing and WinGlide in the past. Working on it was not a high priority because I already knew that performance even at low resolutions would not be all that great. Now that I've had some extra time to play around with it some this past week, I've been experimenting with adding this kind of full screen antialiasing to WinGlide.

Before I mention some of the more technical details, here is a summary of what supersample antialiasing with WinGlide will and will not be able to do. It will be more of a tech demo than anything else because it will not be able to provide fast frame rates or high resolutions. My target resolution is 320x240 antialiased, which will require rendering at 640x480. 320x240 is a fairly low resolution, but my Monster 3D can't render any higher than 640x480 with the depth buffer enabled. Even at 320x240 antialiased, performance will be fairly low. Rendering each frame at 640x480 on the 3dfx card, having WinGlide copy each 640x480 frame off of the 3dfx card, antialias it, and then send it to the 2D card will not be fast. GLQuake timedemo demo1 frame rates for many systems should fall in and around the 10-15 fps range for 320x240 antialiased with WinGlide.

How it will work
I'll be doing 2x2 supersampling which will take every 2x2 block of pixels WinGlide reads from each frame and blend them into a single pixel in the destination frame. This will require that each frame be rendered with twice as many pixels in both the x and y direction relative to the destination image size. So, to get a 320x240 antialiased image, a 640x480 image will need to be rendered on the 3dfx card. Then WinGlide will read the 640x480 image back from the 3dfx card, blend the pixels, and send the 320x240 image to the primary display adapter.
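
The blend step described above can be sketched in C as follows. This is a minimal illustration rather than WinGlide's actual code: it assumes 16-bit RGB565 pixels, and the function names and the use of pitches measured in pixels are choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Average four RGB565 pixels (assumed layout: 5 bits red, 6 bits green,
   5 bits blue). Each channel is summed separately so carries cannot
   spill into a neighbouring channel, then divided by four with rounding. */
static uint16_t blend4_rgb565(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
    unsigned r  = (a >> 11) + (b >> 11) + (c >> 11) + (d >> 11);
    unsigned g  = ((a >> 5) & 0x3F) + ((b >> 5) & 0x3F)
                + ((c >> 5) & 0x3F) + ((d >> 5) & 0x3F);
    unsigned bl = (a & 0x1F) + (b & 0x1F) + (c & 0x1F) + (d & 0x1F);

    r  = (r  + 2) >> 2;
    g  = (g  + 2) >> 2;
    bl = (bl + 2) >> 2;
    return (uint16_t)((r << 11) | (g << 5) | bl);
}

/* Blend every 2x2 block of the source into one destination pixel.
   The source must be at least 2*dst_w by 2*dst_h pixels. */
void downsample_2x2_rgb565(const uint16_t *src, size_t src_pitch,
                           uint16_t *dst, size_t dst_pitch,
                           size_t dst_w, size_t dst_h)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < dst_h; ++y) {
        const uint16_t *row0 = src + (2 * y) * src_pitch;  /* upper source row */
        const uint16_t *row1 = row0 + src_pitch;           /* lower source row */
        uint16_t *out = dst + y * dst_pitch;
        for (size_t x = 0; x < dst_w; ++x) {
            out[x] = blend4_rgb565(row0[2 * x], row0[2 * x + 1],
                                   row1[2 * x], row1[2 * x + 1]);
        }
    }
}
```

For the 320x240 target, dst_w and dst_h would be 320 and 240 and the source would be the 640x480 frame read back from the 3dfx card.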

Unfortunately, there are already some problems showing up here. An application that is set to run at 640x480 will expect to get a 640x480 image, but when the antialiasing process is done, I will only have a 320x240 image. Also, an application that is set to run at 320x240 is not going to render the 640x480 frame I need for the antialiasing process. I've found a couple of ways to solve these problems.

One way is to simply pixel double the image after I'm done antialiasing it. This is easy to implement and should work with anything WinGlide works with now, but there are some unfortunate side effects. Almost all aspects of the 640x480 frames antialiased to 320x240 and then pixel doubled back to 640x480 will look worse than if the original 640x480 frame was used. Performance will also be worse than if the original 640x480 frame was used. The image quality will be much better than 320x240 pixel doubled, which is what it should be compared to in order to see the benefits of the antialiasing.
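
Pixel doubling itself is simple. A minimal sketch in C is shown here, assuming 16-bit pixels and a hypothetical function name; it is meant only to show the operation, not how WinGlide implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write each source pixel into a 2x2 block of the destination, which
   must be at least 2*src_w by 2*src_h pixels. Pitches are in pixels. */
void pixel_double_rgb565(const uint16_t *src, size_t src_pitch,
                         size_t src_w, size_t src_h,
                         uint16_t *dst, size_t dst_pitch)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < src_h; ++y) {
        const uint16_t *in = src + y * src_pitch;
        uint16_t *out0 = dst + (2 * y) * dst_pitch;  /* upper output row */
        uint16_t *out1 = out0 + dst_pitch;           /* lower output row */
        for (size_t x = 0; x < src_w; ++x) {
            uint16_t p = in[x];
            out0[2 * x] = p;
            out0[2 * x + 1] = p;
            out1[2 * x] = p;
            out1[2 * x + 1] = p;
        }
    }
}
```

Since every destination pixel is just a copy, the doubled image carries no more detail than the 320x240 antialiased image it was made from, which is why it looks worse than the original 640x480 frame.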

Another solution I've been working on to solve this problem is to get the application being used with WinGlide to render an image that is twice as tall and twice as wide in the 3dfx card. Then, after WinGlide antialiases the image, it will be the size that the application is expecting. I've been using an opengl32.dll replacement to intercept certain OpenGL calls, modify certain parameters, and do a few other things to get the 3dfx MiniGL to render the larger image. So far, I only have this working with GLQuake (my favorite WinGlide test application). To get everything to draw correctly, I had to add some GLQuake at 320x240 specific code, so this fix may not work correctly at other resolutions or with other applications. I'm not too concerned about supporting other resolutions right now because the minimum resolution GLQuake runs at is 320x240, and my Monster 3D can only run at a maximum of 640x480. So, on my system, GLQuake at 320x240 antialiased with WinGlide is the only supported resolution.
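
The general shape of such a wrapper can be sketched as follows. This is an assumption-heavy illustration: which OpenGL calls need adjusting is not listed here, so a viewport style call is used only as a plausible example, and all of the names are hypothetical. In a real replacement DLL the function pointer would be looked up in the renamed driver; in this sketch a recording stub stands in for the driver so the code can run on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Signature of the call being forwarded (modelled on a viewport call). */
typedef void (*viewport_fn)(int x, int y, int width, int height);

enum { SUPERSAMPLE_FACTOR = 2 };  /* twice as wide and twice as tall */

static viewport_fn real_viewport = NULL;  /* would point into the renamed driver */

void set_real_viewport(viewport_fn fn)
{
    real_viewport = fn;
}

/* What the application ends up calling: scale the parameters, then
   pass the call on to the real driver. */
void wrapped_viewport(int x, int y, int width, int height)
{
    assert(real_viewport != NULL);
    real_viewport(x * SUPERSAMPLE_FACTOR, y * SUPERSAMPLE_FACTOR,
                  width * SUPERSAMPLE_FACTOR, height * SUPERSAMPLE_FACTOR);
}

/* Stand-in for the real driver: just records the last call it saw. */
int last_call[4];

void fake_driver_viewport(int x, int y, int width, int height)
{
    last_call[0] = x;
    last_call[1] = y;
    last_call[2] = width;
    last_call[3] = height;
}
```

After set_real_viewport(fake_driver_viewport), a call to wrapped_viewport(0, 0, 320, 240) reaches the stub as a 640x480 request, which is the kind of enlargement needed before the blend stage shrinks the frame back to the size the application expects.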

Initial performance
I've been using GLQuake timedemo demo1 frame rates to measure performance as I've been working on supersample antialiasing with WinGlide. Right now, I'm getting around 13.0 fps at 320x240 antialiased. There are many things that can cause WinGlide performance to vary from system to system. Here are some of the things that may make WinGlide performance on my system faster or slower than on others:
- Dual Celerons running at 376 MHz. WinGlide's multi-processor optimizations help some although I was still able to get 12.5 fps when they were turned off.
- Front side bus slightly overclocked at 68.50 MHz which puts the PCI bus at 34.25 MHz. Faster PCI bus speeds should allow one of the slowest parts of WinGlide (which is reading each frame back from the 3dfx card) to go faster.
- Voodoo Graphics running at 57 MHz with a couple of environment variable tweaks that help performance slightly. The much higher fill rate of a Voodoo2 should be able to improve performance over this a bit.
- I haven't finished optimizing the pixel blending code in WinGlide yet. There are still a couple more optimizations left to try out. If I don't do any pixel blending at all, I get 14.0 fps instead of 13.0 fps, so these optimizations won't be able to speed things up by any more than 1.0 fps on my system. Right now, the pixel blending code works on each 640x480 frame after it gets copied to main memory. Because of this, each pixel it accesses will not be in the L1 or L2 cache. Note that these 640x480 frames will not fit in the 128KB L2 cache of a Celeron or the 512KB L2 cache of a Pentium II, Pentium III, or Athlon. But, if it were running on a Xeon with 1MB of L2 cache or more, the frames would fit in the L2 cache just fine. It is possible to have the pixel blending code run on smaller parts of each frame that fit in the L1 or L2 cache by processing these parts of each frame as they are read from the 3dfx card. This should speed things up some. I have both MMX and non-MMX versions of the pixel blending code. The MMX version is only slightly faster than the non-MMX version right now even though it can process approximately twice as many pixels per clock. It might be able to show more of a speed advantage if it were working on data from the cache rather than main memory. Another thing I'd like to try is handing off the pixel blending processing to a second processor in a multi-processor system. If the load balancing works out right, this could make a significant difference. It wouldn't be as good for keeping the data in the cache, but this might not matter if the second processor had time to spare while waiting on the first processor to prepare the next frame.
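
The cache size arithmetic in the last item can be checked with a few lines of C. The sketch assumes 16 bits per pixel, which is consistent with a 640x480 frame being larger than 512KB but smaller than 1MB, and the function names are hypothetical. It also ignores everything else competing for the cache, so the strip sizes are upper bounds rather than tuned values.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one frame read back from the 3dfx card. */
size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Largest even number of source rows that fits in cache_bytes.
   The count is kept even because every blended output row consumes
   two source rows. */
size_t rows_per_strip(size_t width, size_t bytes_per_pixel, size_t cache_bytes)
{
    assert(width > 0 && bytes_per_pixel > 0);
    size_t rows = cache_bytes / (width * bytes_per_pixel);
    return rows & ~(size_t)1;
}
```

With these assumptions, frame_bytes(640, 480, 2) is 614,400 bytes (600KB), and rows_per_strip(640, 2, 128 * 1024) is 102, so a 128KB L2 cache could hold a strip of roughly 100 source rows at a time.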

9/30/99
Working on making WinGlide work with the Quake3 compatible OpenGL drivers from 3dfx
I've made some progress towards fixing the compatibility problems with WinGlide and the Quake3 compatible 3dfx OpenGL drivers. The solution I'm working on now gets the 3dfx OpenGL driver to render images in the right place by modifying parameters passed to certain OpenGL functions. To modify these parameters, I've been using an opengl32.dll replacement that intercepts all of the calls to the 3dfx OpenGL driver. It works like WinGlide does for intercepting function calls to glide2x.dll or glide3x.dll.

Although the code I'm working on gets things rendered in the right place, there are still some stability problems when using WinGlide with the Quake3 compatible 3dfx OpenGL drivers. When running windowed with WinGlide, changing windows sometimes does not work correctly and causes the application using WinGlide and the 3dfx OpenGL driver to hang. Also, I've never been able to successfully exit the applications I've been using with WinGlide and the Quake3 compatible 3dfx OpenGL driver. I've always had to kill them after they hang each time while exiting. Some of these problems might be caused by certain things WinGlide does and, if this is the case, I can look into ways to fix any problems I might find. On the other hand, these stability problems may be caused by the 3dfx OpenGL driver in which case there is not much I can do to fix them. I already know these new Quake3 compatible 3dfx OpenGL drivers have some stability issues including those mentioned on 3dfx's Known Issues with Quake3(tm) compatible Drivers page.

9/2/99
FAQ Update
I added a couple more questions and answers to the WinGlide FAQ.

One new topic covers questions about using WinGlide with a Voodoo Rush, Voodoo Banshee, or Voodoo3. The short answer is that WinGlide is designed to be used with Voodoo Graphics and Voodoo2 based video cards. WinGlide should not be used with and is not designed to be used with video cards based on a 2D/3D chipset from 3dfx like those based on the Voodoo Rush, Voodoo Banshee, or Voodoo3 chipset. Windowed support for these 2D/3D cards should be provided entirely by the display drivers.

The second new topic talks about why the latest WinGlide release and many previous releases do not work when SLI mode is in use. It then lists some technical details about why not being able to use the Glide API to lock the back buffer for read access when SLI mode is in use presents a couple of major problems for certain WinGlide features and optimizations.

7/25/99
WinGlide v1.02 released
WinGlide v1.02 has been released. This version of WinGlide fixes a thread synchronization bug in some of the error handling code that could cause problems when multiprocessor optimizations are enabled. This bug does not affect any previous versions of WinGlide when the multiprocessor optimizations are not enabled.

WinGlide v1.02 can be downloaded from the [WinGlide] section of the page.

WinGlide v1.02 for Glide 3.0 can be downloaded from the WinGlide for Glide 3.0 page.

6/25/99
Found a minor multiprocessor bug in WinGlide v1.01
I found a couple of cases where WinGlide's threads could deadlock if multiprocessor optimizations are enabled and if one of a couple of Glide functions it calls during a buffer swap fails. This is not a major problem since these functions are very unlikely to fail. I've already figured out how to fix the problem and implementing the fix should be fairly easy to do.

5/17/99
Why WinGlide is not working correctly with the new Quake3 compatible drivers from 3dfx
Philip Langdale discovered a problem with windowed output being displayed in the wrong place or not at all when using WinGlide with the new Quake3 compatible OpenGL drivers from 3dfx. I was able to verify that the same problem occurred on my system and after running some tests, I have a fairly good idea of what may be causing the problem.

The problem
If an application is run windowed with WinGlide for Glide 3.0 and the Quake3 compatible 3dfx OpenGL driver, the top part of each frame may be displayed in the bottom part of the window and the top part of the window will be blank. How far the output from WinGlide appears to be slid downward in the window varies depending on what windowed resolution is used. At low windowed resolutions, nothing may be visible at all.

Some background information
The Quake3 compatible OpenGL drivers from 3dfx are the first 3dfx OpenGL driver release I know of where the OpenGL driver is linked to glide3x.dll rather than glide2x.dll. This means that WinGlide for Glide 3.0 is used with these OpenGL drivers to provide windowed rendering support for Voodoo and Voodoo2 based cards, instead of the original WinGlide that works with Glide 2.x.

Test results and the likely explanation
Although the new 3dfx OpenGL drivers were released with Q3Test in mind, I used GLQuake for testing because it worked with the drivers, was easier to run tests with than Q3Test, and the same problem occurred when using it as the test application. One possible explanation is that WinGlide is reading back the wrong part of the frame buffer on the 3dfx card but my tests have shown that this is not the case. The best proof of this came when I modified WinGlide so that it did not disable the video output from my 3dfx card when running GLQuake windowed. When I hooked my monitor up to my primary display adapter, I saw the top part of the output displayed in the bottom part of the window. Then, when I hooked my monitor up to the 3dfx card, I saw the exact same thing I was seeing in the window. The top half of the screen was black and the top part of each frame was displayed in the bottom part of the screen. So, WinGlide is reading back the right part of the frame buffer from the 3dfx card and it is displaying it in the right place in the window.

This still does not explain why the top part of each frame was displayed in the bottom part of the 3dfx card when running GLQuake windowed at 640x480, but when running GLQuake full screen at 640x480, everything looked okay. When I switched my desktop resolution down to 640x480 from 1024x768 and ran GLQuake windowed at 640x480, the windowed output looked correct. Further testing at various different windowed and desktop resolutions revealed that each frame in the 3dfx card was slid down by the height of the desktop resolution minus the height of the windowed resolution. Full screen output worked okay because GLQuake switched the resolution of the primary display adapter to match the resolution on the 3dfx card so there was no difference in height when running full screen.
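
The relationship found in these tests can be written down as a short calculation. The sketch below is only a restatement of the observation in C with hypothetical function names. How many rows remain visible is my own inference from the offset, and the case of a desktop shorter than the window was not tested, so it is treated as no shift here.

```c
/* Rows by which each frame was observed to be slid down on the 3dfx card. */
int frame_offset_rows(int desktop_height, int window_height)
{
    return desktop_height - window_height;
}

/* Rows of the frame that still land inside the window after the shift. */
int visible_rows(int desktop_height, int window_height)
{
    int offset = frame_offset_rows(desktop_height, window_height);
    if (offset <= 0)
        return window_height;   /* assumption: no shift, whole frame visible */
    if (offset >= window_height)
        return 0;               /* frame pushed completely out of view */
    return window_height - offset;
}
```

For a 1024x768 desktop and a 640x480 window the offset is 288 rows, leaving 192 rows of the frame visible at the bottom. For a 320x240 window on the same desktop the offset is 528 rows, which is larger than the window, so nothing is visible. With a 640x480 desktop the offset is zero and the output looks correct.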

Conclusion
It looks like the Quake3 compatible 3dfx OpenGL driver is not working correctly when used with WinGlide for Glide 3.0 because the 3dfx OpenGL driver is making a bad decision about what to render and where to render based on the height of the desktop resolution. This is not the kind of thing that could be easily worked around in WinGlide, if at all, because of where it fits in at a level below the 3dfx OpenGL driver. I can't think of a good reason why the 3dfx OpenGL driver would want to render frames at the wrong place on the 3dfx card based on the height of the desktop resolution. It would seem more logical for it to decide where to render things based on the height of the window whose handle is passed to the OpenGL driver. Perhaps this issue could be fixed in a future release of the 3dfx OpenGL driver.

5/3/99
WinGlide v1.01 released
WinGlide v1.01 has been released. This version of WinGlide has some minor improvements over version 1.00.

Some changes have been made to WinGlide v1.01 that could prevent crashes in certain, hopefully not so common, situations. There are also a couple of changes related to how it interacts with the window identified by the window handle passed to grSstWinOpen.

WinGlide v1.01 can be downloaded from the [WinGlide] section of the page.

4/16/99
WinGlide v1.01 for Glide 3.0 released
I finished the initial release of WinGlide designed to work with 3dfx's Glide 3.0 library. This version of WinGlide for Glide 3.0 is based on the WinGlide v1.00 code and contains a few new additions that are aimed at improving stability and preventing crashes in certain situations. This version of WinGlide will allow compatible applications that use Glide 3.0 with a Voodoo Graphics card to provide windowed output (with a performance hit of course). This release has not been tested with a Voodoo2 based card but there should not be any reasons why it would have trouble when used with a Voodoo2 based card rather than a Voodoo Graphics based card.

Testing of this initial release has been limited to use with a couple of test applications I wrote and tests run on my computer which has a Voodoo Graphics card (a Diamond Monster 3D). As far as compatibility goes, I don't know of any Glide 3.0 applications besides my tests that work with this version of WinGlide right now since I didn't have any available for testing. There are some notes about how WinGlide decides if it is going to run an application windowed or just let it run full screen on the WinGlide for Glide 3.0 page. These notes may be of interest if you are programming a Glide 3.0 application and want it to work with WinGlide.

This version of WinGlide uses a file called glide3x.dll to intercept calls to the glide3x.dll in the system directory. Remember, WinGlide is not a replacement for the glide3x.dll from 3dfx. It just needs to have the same name so it can intercept calls to this file.

This first release of WinGlide for Glide 3.0 can be downloaded from the WinGlide for Glide 3.0 page.

4/6/99
FAQ Updated
A few new questions and answers have been added to the WinGlide FAQ.

3/25/99
WinGlide for Glide 3.x
The current WinGlide works with Glide 2.x and it intercepts function calls that go to glide2x.dll. I received a couple of requests for a WinGlide for Glide 3.x around a month ago and I have been working on programming a new WinGlide for Glide 3.x. Using the prototype version of WinGlide I have been working on and a small test program I wrote that draws a single triangle with Glide 3.x, I was able to have my Monster 3D draw the triangle and have WinGlide display it in a window. Although the simple test application only draws a single triangle, this test goes a long way towards verifying that the WinGlide part of things is working correctly.

Substantial parts of WinGlide needed to be rewritten in order to make it work with glide3x.dll rather than glide2x.dll. Fortunately, much of the code used by the current WinGlide can be used unchanged in the new WinGlide for Glide 3.x so it can easily inherit the many new features I've added to WinGlide.

3/3/99
WinGlide v1.00 released
WinGlide v1.00 has been released. This version of WinGlide adds new capabilities to the Overlay WinGlide mode. It is now possible for Overlay WinGlide to clip overlays on video cards that use the ET6000 chipset and this version of Overlay WinGlide can work with video cards that use the PERMEDIA 2 chipset. No changes have been made to the other WinGlide modes in this release.

A note about PERMEDIA 2 overlay support
When I ran some WinGlide tests a month or two ago on a Diamond Fire GL 1000 Pro in Windows 98, I did not see DirectDraw overlay support being reported when using the latest drivers from Diamond. After installing the latest reference drivers from 3Dlabs, DirectDraw overlay support was present. If you get the error message "No overlay support detected." from Overlay WinGlide, the drivers you are using do not have DirectDraw overlay support. If you have a PERMEDIA 2 and find that the latest drivers from your card's vendor are not supporting DirectDraw overlays, you may want to consider using the Off-screen Surface WinGlide mode instead. This mode provides performance and features similar to Overlay WinGlide and it should be able to work with just about all of the PERMEDIA 2 based video cards out there.

WinGlide v1.00 can be downloaded from the [WinGlide] section of the page.

2/24/99
Overlay clipping detection
The next version of Overlay WinGlide, which is currently being worked on, supports two methods of clipping the overlay surface it uses. Overlay clipping through the use of destination color keying is supported by the current release of WinGlide and has been supported in past releases. The new clipping support clips the overlay by using a DirectDraw clipper. Many video cards only support clipping an overlay using destination color keying but some video cards can only clip an overlay with a DirectDraw clipper. Since WinGlide will be able to clip overlays using two different methods, the Clipping key from the [Overlay] section of wg.ini will be split into two keys: the ColorKeyClipping and WindowClipping keys. With the new overlay clipping detection code, if neither of these keys is present in wg.ini, Overlay WinGlide will detect what kind of overlay clipping the hardware supports and use it automatically.
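The selection logic described here can be sketched roughly as follows. This is an illustration only, not WinGlide's code: the function name and the two hardware capability flags are hypothetical, and the behavior when only one key is present or when the hardware supports both methods is an assumption, since the text only covers the case where neither key is present.

```python
import configparser


def choose_clipping(ini_text: str, hw_color_key: bool, hw_clipper: bool) -> str:
    """Return "colorkey", "clipper", or "none" for the overlay clipping method."""
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)

    keys_present = cp.has_section("Overlay") and (
        cp.has_option("Overlay", "ColorKeyClipping")
        or cp.has_option("Overlay", "WindowClipping")
    )

    if keys_present:
        # Assumption: if either key is present, the manual settings win
        # and a missing key counts as 0.
        if cp.getint("Overlay", "ColorKeyClipping", fallback=0) == 1:
            return "colorkey"
        if cp.getint("Overlay", "WindowClipping", fallback=0) == 1:
            return "clipper"
        return "none"

    # Neither key present: fall back to what the hardware reports.
    # Assumption: color keying is preferred when both are available.
    if hw_color_key:
        return "colorkey"
    if hw_clipper:
        return "clipper"
    return "none"
```

For example, an [Overlay] section with neither key on hardware that can only clip with a DirectDraw clipper would select "clipper" without any manual configuration.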

Some PERMEDIA 2 Overlay WinGlide test results
I was able to use Kris' PERMEDIA 2 based video card over the weekend for testing some new Overlay WinGlide code. Overlay WinGlide was able to run on the card with the new overlay clipping support that uses a DirectDraw clipper (not available in the current Overlay WinGlide release but will be available in the next one). I was only able to spend a short amount of time running tests and for the most part things worked well. The only problem I encountered was with restoring the overlay surface when it was lost.

If another application that uses DirectDraw is run or when video modes are switched, the overlay surface being used by Overlay WinGlide can be lost and needs to be restored so that the overlay can be displayed again. In one of the tests I ran that caused the overlay surface to be lost, Overlay WinGlide was not able to restore the surface and I needed to restart the application being used with Overlay WinGlide in order to get things working again. I didn't have any time to investigate this issue at the time I ran the tests but if it turns out to be much of a problem in certain cases, the YUY2 off-screen surface mode that works with PERMEDIA 2 based video cards could be used instead (this mode has features and performance very similar to the overlay mode on these video cards). I'm already thinking about a work around for this type of problem that should be fairly easy to implement, so if I hear about this type of problem occurring after the next Overlay WinGlide release, I can look into putting a work around together for testing.

I was thinking about doing a WinGlide beta release so that the new overlay clipping with a DirectDraw clipper could be tested by more people with ET6000 or PERMEDIA 2 based video cards but since I was able to run tests on a PERMEDIA 2 over the weekend, this new feature will be made available in the next regular WinGlide release.

2/9/99
Overlay clipping
I've spent some more time working on adding overlay clipping capabilities that use a DirectDraw clipper to Overlay WinGlide. Thijs Terlouw ran some tests on an ET6000 for me with a test release of WinGlide which has these new clipping capabilities. Although there were a few unexpected results, there were no major problems. Since I don't think I'll be able to use a video card that can clip overlays with a DirectDraw clipper anytime soon, I'll probably put together a public WinGlide beta release with the new clipping capabilities so that more tests can be run. I'd like to add some more hardware detection capabilities to Overlay WinGlide that would allow it to automatically select overlay clipping using color keying or overlay clipping using a DirectDraw clipper based on what the video hardware supports so that this information does not have to be manually configured in wg.ini. Also, thanks to Petri Räty who helped test some Overlay WinGlide releases on his ET6000 and sent me some good information about the chipset back in July. Although I wasn't able to get anything working then, the information he sent me about the ET6000 and overlay clipping is coming in handy now that I have fixed certain problems with how Overlay WinGlide works with DirectDraw overlays.

1/25/99
WinGlide v0.99 released
WinGlide v0.99 has been released. This version of WinGlide adds two new modes and some other new features. Here's what's new:
- The 320x240 windowed fix for when WinGlide is used with 3Dfx MiniGL v1.46.
- A StretchBlt mode that uses GDI for stretching WinGlide's output.
- A mode that uses off-screen video memory surfaces and hardware features available on certain video cards for improved stretching performance. I have tested this mode with a PERMEDIA 2 based card and it may work with other video cards that support the right features. A couple of modifications need to be made to the wg.ini that comes with WinGlide v0.99 in order to enable this mode so if you want to try it, make sure to read the details listed below that explain how to set it up.
- A dialog box that allows gamma correction to be adjusted while WinGlide is running.

I haven't added any new sections to the web page with information about the two new WinGlide modes yet. You can download WinGlide v0.99 (which supports the two new modes in addition to the B2 WinGlide mode and the Overlay WinGlide mode) from the B2 WinGlide section of the page. Additional details about the two new modes are listed below and included in the readme.txt file that comes with WinGlide v0.99.

Mode settings
To enable a certain WinGlide mode set the key in the [Mode] section of wg.ini corresponding to that mode equal to 1 and make sure that all other keys in the [Mode] section of wg.ini are set equal to 0. Additional notes about mode settings are included in the readme.txt file that comes with WinGlide v0.99.

The 320x240 windowed fix
Setting the 320x240Fix key in the [WinGlide] section of wg.ini equal to 1 will enable a fix that can make 320x240 windowed work properly when WinGlide is used with the 3Dfx MiniGL v1.46.

The StretchBlt WinGlide mode
WinGlide's new StretchBlt mode (enabled through the StretchBltWinGlide key in the [Mode] section of wg.ini) stretches WinGlide's output with a GDI StretchBlt call. This mode should work with any primary display adapter but it is likely that WinGlide's stretching performance will be better when using the Overlay WinGlide mode or the new Off-screen Surface WinGlide mode. This mode should only be used with applications that display their output in a fixed sized window.

Off-screen Surface WinGlide mode
WinGlide's new off-screen surface mode (enabled through the OffScreenSurfaceWinGlide key in the [Mode] section of wg.ini) provides stretching capabilities by using off-screen surfaces in video memory and hardware StretchBlt support. Right now this mode only works with off-screen surfaces defined by the FourCC YUY2. This mode must be used with a video card that supports off-screen YUY2 surfaces and functions that can StretchBlt images from these surfaces to the primary surface. When using this mode, the UseFourCCSurface key in the [Overlay] section of wg.ini must be set equal to 1 and the FourCC key in the [Overlay] section of wg.ini must be set equal to YUY2.
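The settings this mode needs can be checked with a short script. The sketch below is a hypothetical helper, not part of WinGlide; it only uses the section and key names mentioned in these notes, and the sample ini text is an example rather than the wg.ini that ships with WinGlide v0.99.

```python
import configparser

# Example settings for the off-screen surface mode (illustrative only).
SAMPLE_INI = """
[Mode]
B2WinGlide=0
OverlayWinGlide=0
StretchBltWinGlide=0
OffScreenSurfaceWinGlide=1

[Overlay]
UseFourCCSurface=1
FourCC=YUY2
"""


def offscreen_mode_problems(ini_text: str) -> list:
    """List the settings that would keep the off-screen surface mode from
    being configured as described above. An empty list means all three
    required settings are in place."""
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)
    problems = []
    if cp.getint("Mode", "OffScreenSurfaceWinGlide", fallback=0) != 1:
        problems.append("[Mode] OffScreenSurfaceWinGlide must be 1")
    if cp.getint("Overlay", "UseFourCCSurface", fallback=0) != 1:
        problems.append("[Overlay] UseFourCCSurface must be 1")
    if cp.get("Overlay", "FourCC", fallback="").strip().upper() != "YUY2":
        problems.append("[Overlay] FourCC must be YUY2")
    return problems


if __name__ == "__main__":
    print(offscreen_mode_problems(SAMPLE_INI))
```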

This mode is designed to provide stretching support on video cards that currently do not work with Overlay WinGlide. I have tested this mode on a PERMEDIA 2 based video card and found that stretching performance was much better than when using WinGlide's StretchBlt mode. The information I have about certain video cards from Matrox that do not work with Overlay WinGlide suggests that they may have the hardware support needed to use this mode. I have not been able to run any tests myself but if you have a video card with a Matrox chipset that does not work with Overlay WinGlide, you may want to see if this mode can provide improved stretching performance on your video card.

There is a small amount of image degradation when using this mode because of the nature of the YUY2 pixel format. If you want to compare image quality differences, you can compare this mode with WinGlide's StretchBlt mode. The image degradation is caused by horizontally adjacent pixels in the YUY2 format sharing the same chrominance. If you see white text over a different colored background (one that does not have equal red, green, and blue values), you may see some blending. Fortunately, because of the way most 3D applications do their rendering, this small amount of color blending should not cause any severe image degradation.

The gamma correction dialog box
If WinGlide's SystemMenuOptions are enabled and gamma correction is enabled, there will be an item on the system menu of the application being used with WinGlide that can open the gamma correction dialog box. This dialog box allows WinGlide's gamma correction to be adjusted while in use. Any gamma correction adjustments made by using this interface will not be saved back to wg.ini so if you want to make any permanent gamma correction changes, you will have to modify wg.ini. I have tested this gamma correction dialog box with GLQuake and found no problems, and with Quake II I only saw one small problem that was not serious. I designed the gamma correction dialog box so that it should work well with many applications but there may be some applications that do not work properly when a new user interface component like this gamma correction dialog box shows up.

1/24/99
WinGlide RGB to YUY2 conversion performance and the P6 WinGlide performance anomaly strikes again
Since my video card supports both RGB 565 and YUY2 overlays, I ran a few benchmarks to compare the performance of WinGlide's RGB to YUY2 conversion code on my Pentium II 266. There are two versions of the optimized conversion code in WinGlide: one that uses MMX for improved performance and one without MMX. Using GLQuake to run the benchmarks and with WinGlide's MMX and gamma correction support enabled, my timedemo demo1 score when running at 320x240 only dropped 0.4 fps from 39.2 fps to 38.8 fps when enabling the RGB to YUY2 conversion. Of course, when using slower processors and processors without MMX, there may be more of a performance hit when using WinGlide's RGB to YUY2 conversion capabilities. It's just a question of when WinGlide's RGB to YUY2 conversion code becomes more of a bottleneck than reading data from the 3Dfx card and writing data to main memory or the primary display adapter.

I encountered the strange P6 WinGlide performance anomaly while running benchmarks with WinGlide's RGB to YUY2 conversion code. This is closely related to WinGlide performing much better on P6 systems when gamma correction is enabled rather than disabled even though this gives the CPU more instructions to execute per pixel transferred from the 3Dfx card. With MMX disabled and gamma correction enabled, my timedemo demo1 score when using 320x240 in GLQuake was 28.2 fps but when I enabled the RGB to YUY2 conversion, it went up to 31.0 fps. If you have a Pentium Pro system and are using Overlay WinGlide with a video card that supports YUY2 overlays in addition to RGB 565 overlays, you might find that enabling WinGlide's YUY2 conversion capabilities could trade a little bit of image quality for improved performance. If you have a Pentium II system, using MMX and enabling gamma correction already offers much better performance than not using MMX. It also seems to lessen the effect of the P6 WinGlide performance anomaly: with these WinGlide settings, adding the YUY2 conversion calculations made the frame rate go down rather than up on my Pentium II 266 system.

More PERMEDIA 2 information
Tom Forsyth from 3Dlabs sent me some useful information about the PERMEDIA 2 including news that there is overlay support in newer drivers. The DirectDraw overlay support in these drivers will not work with the current version of Overlay WinGlide because I'm not using a DirectDrawClipper but there is a good chance I can get a future version of Overlay WinGlide working with them. Even though Overlay WinGlide will not work with the new drivers right now, the YUY2 off-screen surface support with StretchBlt will be available in the next WinGlide release and should be almost as good. If I can use the overlay support instead of off-screen YUY2 StretchBlt, it will be possible to skip the RGB 565 to YUY2 conversion calculations and the small image quality loss in the RGB 565 to YUY2 conversion will not happen.

1/19/99
Run time adjustable gamma correction
I finished adding a new feature to WinGlide which will allow gamma correction to be adjusted while WinGlide is in use and SystemMenuOptions are enabled. Selecting the new gamma correction option from the system menu will bring up a dialog box with three sliders allowing red, green, and blue gamma correction values to be adjusted. This is the last new feature planned for the next release.

The gamma correction table calculation code has been optimized because adjusting the gamma correction with the new gamma correction dialog box can cause the table to be frequently recalculated. Sliding the sliders can cause a table recalculation to happen every frame. The table calculation code from WinGlide v0.98 took around 300 milliseconds to calculate a new table on my computer. This does not cause problems for this version of WinGlide since the table only gets calculated once when initializing WinGlide, but when sliding the new gamma correction sliders, this will drop the frame rate to a couple of frames per second until the adjustments stop. Now the gamma correction table calculation code is many times faster so that it is possible to see smooth gamma correction adjustments in the application being used with WinGlide.
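One common way to make a table like this cheap to rebuild is to compute a small table per color channel and then combine them with integer operations. The sketch below shows that general idea only. It assumes a 65536 entry table indexed by RGB 565 pixel values and the usual out = in^(1/gamma) curve; neither the table layout nor the formula is confirmed as what WinGlide actually uses, and the function names are made up for this example.

```python
def channel_table(levels: int, gamma: float) -> list:
    """Gamma curve for one color channel with the given number of levels
    (32 for 5-bit red/blue, 64 for 6-bit green). gamma must be > 0."""
    top = levels - 1
    return [round(top * (i / top) ** (1.0 / gamma)) for i in range(levels)]


def build_rgb565_gamma_table(gamma_r: float, gamma_g: float, gamma_b: float) -> list:
    """Build a lookup table mapping every RGB 565 pixel to its corrected value.

    Only 32 + 64 + 32 power calculations are needed; the 65536 entries are
    then filled in with shifts, masks, and small table lookups.
    """
    red = channel_table(32, gamma_r)
    green = channel_table(64, gamma_g)
    blue = channel_table(32, gamma_b)
    table = [0] * 65536
    for pixel in range(65536):
        r = red[pixel >> 11]
        g = green[(pixel >> 5) & 0x3F]
        b = blue[pixel & 0x1F]
        table[pixel] = (r << 11) | (g << 5) | b
    return table
```

Keeping separate red, green, and blue curves also fits a dialog with one slider per channel, since moving a single slider only changes one of the three small tables.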

I ended up creating a separate thread to manage the new gamma correction dialog box. This allowed me to use a modeless dialog box instead of a modal dialog box and still handle dialog box keyboard controls since I could put the IsDialogMessage call in the message loop for the thread I created. Using a separate thread also helps keep the gamma correction dialog box and the application being used with WinGlide from causing each other to be unresponsive.

Although there are benefits to using the separate thread, it also causes some complications. The parent window of the gamma correction dialog box is the main window of the application being used with WinGlide which has its message queue in a different thread. This causes the input queues of the two threads to be attached and causes a minor but annoying problem to occur. In the tests I've run with WinGlide and a couple of applications, when the mouse is clicked in the gamma correction dialog box in order to switch the focus to it from the main application window, there is about a half second delay before it is focused and becomes responsive. Switching from the gamma correction dialog to the main application window happens instantaneously like it should.

If I don't give the gamma correction dialog box a parent window, the input queues of the two threads are not attached and the problem does not occur. Unfortunately, this causes a new problem. Since the gamma correction dialog box does not have the main application window as its parent, it can easily become separated from the main application window and get lost behind other windows. I might try to fix this problem in a future release by not giving the gamma correction dialog box a parent window and then seeing if I can add code to WinGlide that will make it work as if the main application window were the parent window of the gamma correction dialog box. It would have been nice if attaching those thread input queues didn't cause this problem so I could just let the operating system take care of things like this.

1/4/99
ET6000 overlay clipping revisited
Overlay WinGlide currently supports clipping an overlay by using color keying which is the overlay clipping method supported by most video cards. Video cards based on the Tseng Labs ET6000 chipset clip overlays differently by using clip lists instead of color keying. I sent out a few WinGlide test releases to people with ET6000s a while ago in an attempt to get overlay clipping working on video cards with this chipset but was not successful. I learned a few new things about DirectDraw and clipping while working on getting WinGlide to use off screen YUY2 surfaces and StretchBlt capabilities on a video card with the PERMEDIA 2 chipset. This gave me a few ideas that could help me get overlay clipping working on video cards with the ET6000 chipset. If anyone has a video card with an ET6000 (and of course a Voodoo or Voodoo2) and would like to try to get WinGlide overlay clipping working, send me an email regarding this and I'll see if I can put together another test release. There is only so much I can do without having the actual hardware available but I think there is a good chance I can get it working this time.

YUY2 StretchBlt on a PERMEDIA 2
I was able to use Kris' computer at the LAN party a couple of weeks ago for a little bit of WinGlide development. He has a Diamond Fire GL 1000 Pro (PERMEDIA 2) and a Monster 3D. The PERMEDIA 2 is one of the graphics chipsets that does not support overlay surfaces through DirectDraw but does support off screen surfaces with the YUY2 pixel format (YUV color space) and has hardware support for color space conversion and StretchBlts so that the images from these surfaces can be transferred to the primary display surface.

I added a new mode to WinGlide that has it attempt to create an off screen surface in video memory with the YUY2 pixel format. WinGlide can transfer data from the 3Dfx card to this off screen surface by using the RGB to YUY2 color space conversion functions that I put together earlier. WinGlide then tells the video card to do a video memory to video memory StretchBlt from this off screen surface to the primary surface to finish displaying each frame. Although there is some image quality loss due to the nature of the YUY2 pixel format, there may be some benefits to using these graphics capabilities. Compared to doing a StretchBlt from a DIB section through the Win32 API, these benefits might include:
  1. The possibility of faster performance since it should be possible for the StretchBlt to take place asynchronously and each frame does not pass through system memory.
  2. The image quality of the stretched image might be better if the video card does a different kind of filtering with this type of surface compared to what takes place with an RGB StretchBlt.
  3. Tearing could be reduced since the source of the StretchBlt is in video memory rather than system memory and a flag can be set that tells the video card to schedule the blit so that tearing is avoided. I did see some tearing in the tests I ran but I do not know if this is because the driver or video card could not schedule the blit to avoid tearing or if I just made the window large enough so that there was not enough blitter bandwidth to avoid tearing at the current resolution, bit depth, and refresh rate.

I don't have any hardware available right now that supports off screen YUY2 surfaces and StretchBlt capabilities that work with these surfaces so I don't know for sure how this new WinGlide mode will compare to the StretchBlt mode I recently added to WinGlide. Both this new mode and the StretchBlt mode should be available in the next WinGlide release so that anyone interested in comparing the performance and quality of these two modes can do so.

1/2/99
Starting a WinGlide FAQ page
I started working on a WinGlide FAQ page. It's fairly short right now but I'm planning on adding more to it in the future.

Dana Neff sent in some config files for Quake II that make it easy to switch between full screen with 3DNow! and windowed without 3DNow! (very useful since the 3DNow! enhanced version of Quake II does not work with WinGlide). These config files are listed in the FAQ.

12/31/98
StretchBlt
I've added StretchBlt capabilities to WinGlide. This will allow WinGlide to do stretching on any video card rather than just video cards that support the type of overlay used by Overlay WinGlide (although if the StretchBlt gets software emulated, performance will be bad). Stretching frames with StretchBlt comes at a cost so it will be best to use Overlay WinGlide for stretching if it works with your video card because overlays can often be stretched at no cost. This new WinGlide mode that does StretchBlts has the same multi-processor optimization that B2 WinGlide has.

I ran some benchmarks on my computer that compare performance when stretching frames from GLQuake to various sizes. The StretchBlt is being handled by my OEM Diamond Stealth 3D 2000 (S3 ViRGE) with the slow 50 ns memory and 55 MHz clock so there are probably a lot of video cards out there that can do a faster StretchBlt. The following benchmarks were run with GLQuake set to use a resolution of 400x300 and timedemo demo1 was used to get the frame rates.

                           Single processor (fps)   Multi-processor (fps)
Shrunk to 320x240          14.3                     17.1
Shrunk to 396x297          11.3                     12.9
Not stretched at 400x300   27.4                     30.9
Stretched to 404x303       20.3                     25.3
Stretched to 512x384       18.7                     22.8
Stretched to 640x480       16.2                     19.0
Stretched to 800x600       13.4                     15.2

At 400x300, since there is no stretching, performance is the same as B2 WinGlide which uses a BitBlt instead of a StretchBlt. The low numbers from shrinking the image suggest that shrinking may be getting software emulated on my system (or the hardware is just doing it extremely slowly), but this is not much of a problem since stretching will be used in most cases. The 404x303 numbers show that doing even a little bit of stretching results in a large performance hit on my video card.
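To put numbers on that performance hit, the frame rates can be compared against the unstretched 400x300 result. The sketch below just does that arithmetic with the figures from the benchmark; the names used are only for illustration.

```python
# Frame rates from the benchmark above: (single processor, multi-processor)
BENCHMARKS = {
    "320x240": (14.3, 17.1),
    "396x297": (11.3, 12.9),
    "400x300": (27.4, 30.9),  # not stretched
    "404x303": (20.3, 25.3),
    "512x384": (18.7, 22.8),
    "640x480": (16.2, 19.0),
    "800x600": (13.4, 15.2),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    base_single, base_multi = BENCHMARKS["400x300"]
    for size, (single, multi) in BENCHMARKS.items():
        print(
            size,
            round(percent_change(base_single, single), 1),
            round(percent_change(base_multi, multi), 1),
        )
```

For the 404x303 case this works out to a drop of roughly a quarter of the frame rate on a single processor and a bit under a fifth with the multi-processor optimization.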

12/19/98
Done with finals
With final exams and other stuff keeping me busy at the end of the quarter, I didn't have much time to work on WinGlide these past two weeks. I've made some progress on designing a dialog box for adjusting gamma correction while WinGlide is in use. If it doesn't take too long to get it working, I'd like to add it to the next release so there are more new features than just the 320x240 fix for the 3Dfx MiniGL v1.46.

12/4/98
320x240 windowed not working with the latest MiniGL
Michael C found a problem with WinGlide and the recently released 3Dfx MiniGL v1.46. Further investigation revealed that 320x240 windowed no longer works when WinGlide is used with this new MiniGL but other resolutions still work okay. I found out what this new MiniGL is doing differently than the previous one and I should be able to add a work around to WinGlide that will allow 320x240 windowed to be used with it.

If a resolution of 512x384 or less is requested with the previous MiniGL, it will tell grSstWinOpen to use a resolution of 512x384. With the new MiniGL, if the resolution 320x240 is requested, it will tell grSstWinOpen to use a resolution of 320x240 but I think most common Voodoo and Voodoo 2 based boards do not support this full screen resolution so the grSstWinOpen call will fail. Since WinGlide intercepts all calls to glide2x.dll, I was able to work around the problem by telling grSstWinOpen to use 512x384 instead of 320x240 when the new MiniGL passes this resolution to grSstWinOpen.
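The substitution itself is a very small piece of logic. The sketch below shows just the decision as a stand-alone function; the function name is made up, and it uses a plain width and height pair for readability even though the real grSstWinOpen call takes an enumerated resolution value. It is not the code that is in WinGlide.

```python
from typing import Tuple

Resolution = Tuple[int, int]


def resolution_for_grsstwinopen(requested: Resolution) -> Resolution:
    """Pick the resolution to forward to the real grSstWinOpen.

    320x240 is replaced with 512x384, which is what the previous MiniGL
    asked for in this case; every other resolution is passed through
    unchanged.
    """
    if requested == (320, 240):
        return (512, 384)
    return requested


if __name__ == "__main__":
    print(resolution_for_grsstwinopen((320, 240)))
    print(resolution_for_grsstwinopen((640, 480)))
```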

12/2/98
WinGlide v0.98 released
WinGlide v0.98 has been released. This version of WinGlide combines B2 WinGlide and Overlay WinGlide into one file. A new [Mode] section has been added to the ini file wg.ini which allows either B2 WinGlide mode or Overlay WinGlide mode to be selected.

Each available WinGlide mode will have a key in the [Mode] section of wg.ini. Only one of these keys should be set equal to 1 at any one time. Right now, the two keys B2WinGlide and OverlayWinGlide can be added to the [Mode] section of wg.ini.
- To use B2 WinGlide, set the B2WinGlide key equal to 1 and set the OverlayWinGlide key equal to 0, or comment the OverlayWinGlide key out by putting a semicolon in front of it, or do not include it in the ini file at all.
- To use Overlay WinGlide, set the OverlayWinGlide key equal to 1 and set the B2WinGlide key equal to 0, comment it out, or do not include it in the ini file.
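The rules in the list above can be sketched as a small reader for the [Mode] section. This is an illustration rather than WinGlide's own parsing code: the function name is hypothetical, and raising an error when zero or several modes are enabled is an assumption, since these notes do not say what WinGlide does in that case.

```python
import configparser

MODE_KEYS = ("B2WinGlide", "OverlayWinGlide")


def selected_mode(ini_text: str) -> str:
    """Return the single mode key that is set equal to 1 in [Mode].

    A key that is set to 0, commented out with a semicolon, or missing
    is treated as disabled.
    """
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)
    enabled = [k for k in MODE_KEYS if cp.getint("Mode", k, fallback=0) == 1]
    if len(enabled) != 1:
        # Assumption: exactly one mode must be enabled.
        raise ValueError("exactly one key in [Mode] should be set equal to 1")
    return enabled[0]


if __name__ == "__main__":
    print(selected_mode("[Mode]\nB2WinGlide=1\n;OverlayWinGlide=1\n"))
```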

The B2 WinGlide part of WinGlide v0.98 now uses the slightly optimized MMX gamma correction code that was added to Overlay WinGlide v0.97 but was not present in B2 WinGlide v0.96.

Better support for loading a renamed glide2x.dll has been added to this version of WinGlide. Merging B2 WinGlide and Overlay WinGlide into one file may also be beneficial to those interested in loading the renamed glide2x.dll. You can read more about how to use these new options in the Advanced options section of the page.

Previous versions of WinGlide searched for glide2x.dll in both the Windows and System directories. This version of WinGlide only searches for glide2x.dll in the System directory.

The Overlay WinGlide part of WinGlide v0.98 includes some new code that can convert the RGB data from the 3Dfx card to the YUY2 pixel format. Unless there are video cards that support overlay surfaces with the YUY2 pixel format but do not support 16 bit RGB (565) overlay surfaces, this new code will not be useful right now because there is a little bit of image degradation when converting from RGB to YUY2 and the conversion requires more processing than doing gamma correction alone. There are some video cards that support off-screen YUY2 surfaces but do not support overlay surfaces and have hardware accelerated color conversion and stretching routines that can transfer and convert data from off-screen YUY2 surfaces to the RGB pixel format currently being used for the desktop resolution. If a future version of Overlay WinGlide could be modified so it could use these off-screen YUY2 surfaces, it may be possible to bring Overlay WinGlide features to video cards that currently do not work with Overlay WinGlide. The keys that can be used to tell Overlay WinGlide to try to use an overlay surface with a YUY2 pixel format are documented in the Advanced options section of the page.

11/22/98
Additional Overlay WinGlide usage information
I added a few more notes about what kinds of applications Overlay WinGlide should or should not be used with to the Overlay WinGlide section of the page. Also, if you have problems with a certain application and Overlay WinGlide, try using B2 WinGlide instead and see how it compares. There are fewer restrictions on what kinds of applications B2 WinGlide will work with compared to Overlay WinGlide.

11/16/98
Multiprocessor detection
I added multiprocessor detection to the almost finished combined version of B2 WinGlide and Overlay WinGlide that I've been working on.

11/8/98
Two versions of WinGlide in one file
I've begun working on combining B2 WinGlide and Overlay WinGlide into one file. Instead of having a separate glide2x.dll for each of these two different versions of WinGlide, there will be one glide2x.dll for both of them and ini file settings will select B2 WinGlide mode or Overlay WinGlide mode.

10/28/98
FourCC overlay update
My video card supports overlay surfaces that use the YUY2 pixel format. I've added code to the version of Overlay WinGlide I've been working on that can convert the RGB data from the 3Dfx card to the YUY2 pixel format used by the overlay surface. There is some image quality loss in the conversion but it is not very noticeable in GLQuake or Quake 2. The YUY2 pixel format gives each pixel its own luminance value but pairs of horizontally adjacent pixels share the same chrominance values. This is the main cause of the image degradation when converting the RGB data from the 3Dfx card to the YUY2 pixel format used by the overlay surface.

If you already have a video card that works with Overlay WinGlide, and therefore supports a 16 bit RGB overlay surface, this new pixel format conversion code will not be beneficial because it requires extra processing to do the RGB to YUY2 conversion and some image quality loss can occur when converting to the YUY2 pixel format. If you have a video card that does not currently work with Overlay WinGlide because it does not support the 16 bit RGB overlay surface used by the current version of Overlay WinGlide, then this new pixel format conversion code should allow Overlay WinGlide to work with your video card if it supports overlay surfaces using the YUY2 pixel format.

Although I wanted to support other 16 bit YUV pixel formats similar to YUY2 in the next release, I will probably only support YUY2 because having to average chrominance values made the conversion code a bit more complicated and my video card does not support overlay surfaces that use other common 16 bit YUV pixel formats.
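To make the YUY2 layout described above concrete, here is a minimal C sketch of converting one row of 16 bit RGB (565) pixels to YUY2. It is not WinGlide's actual conversion code. The function names are hypothetical, and the coefficients are a commonly used integer approximation of the BT.601 studio-range conversion, which is an assumption; the source does not say which formula WinGlide uses. Each output pair is stored as Y0 U Y1 V, so both pixels keep their own luminance while the chrominance values of the pair are averaged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one RGB565 pixel to 8 bits per channel by replicating the high bits. */
static void rgb565_to_rgb888(uint16_t p, int *r, int *g, int *b)
{
    int r5 = (p >> 11) & 0x1F;
    int g6 = (p >> 5) & 0x3F;
    int b5 = p & 0x1F;
    *r = (r5 << 3) | (r5 >> 2);
    *g = (g6 << 2) | (g6 >> 4);
    *b = (b5 << 3) | (b5 >> 2);
}

/* Assumed BT.601 integer approximation (not taken from WinGlide).
 * The +32768 term adds the 128 chroma offset before the shift so the
 * value being shifted is never negative. */
static int luma(int r, int g, int b)
{
    return ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
}

static int chroma_u(int r, int g, int b)
{
    return (-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8;
}

static int chroma_v(int r, int g, int b)
{
    return (112 * r - 94 * g - 18 * b + 128 + 32768) >> 8;
}

/* Convert one row of RGB565 pixels to YUY2 (byte order Y0 U Y1 V).
 * width must be even because each 4 byte group describes two pixels. */
void rgb565_row_to_yuy2(const uint16_t *src, uint8_t *dst, size_t width)
{
    assert(width % 2 == 0);
    for (size_t x = 0; x < width; x += 2) {
        int r0, g0, b0, r1, g1, b1;
        rgb565_to_rgb888(src[x], &r0, &g0, &b0);
        rgb565_to_rgb888(src[x + 1], &r1, &g1, &b1);

        /* Each pixel keeps its own luminance. */
        dst[0] = (uint8_t)luma(r0, g0, b0);
        dst[2] = (uint8_t)luma(r1, g1, b1);

        /* The pair shares one U and one V: average the two, rounding up. */
        dst[1] = (uint8_t)((chroma_u(r0, g0, b0) + chroma_u(r1, g1, b1) + 1) >> 1);
        dst[3] = (uint8_t)((chroma_v(r0, g0, b0) + chroma_v(r1, g1, b1) + 1) >> 1);
        dst += 4;
    }
}
```

A caller would pass a source row of width pixels and a destination buffer of width * 2 bytes. The shared chrominance is where the image quality loss comes from: a red pixel next to a blue pixel keeps two different Y values but ends up with a single blended U and V.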

10/21/98
Using WinGlide with the Infinite Worlds beta
I received a few emails regarding problems with the Infinite Worlds beta and the latest v0.97 WinGlide release. I downloaded the beta and ran some tests with it. Because of the way the Infinite Worlds beta displays OpenGL in a window, it is not the kind of program that is meant to be used with Overlay WinGlide. Overlay WinGlide is designed to work with programs that display their OpenGL output in a fixed sized window with a title bar (games like GLQuake for example). The latest B2 WinGlide release (v0.96) worked well with the Infinite Worlds beta. WinGlide compatibility has become quite a bit more complicated now that it is being used with so many more programs than just GLQuake and Quake 2. I'll add some additional information about what types of programs different versions of WinGlide should be used with some time soon.

Although Overlay WinGlide was not designed to work with the Infinite Worlds beta because of the way it displays OpenGL in a window, I was able to get it to display some output in a window under certain circumstances. Even though I was able to get the overlay to display, there were still a few problems so B2 WinGlide should be used instead. Since the window displaying the OpenGL output could not be stretched, a minimum overlay stretch factor that is less than or equal to 1 was required or else the overlay could not be displayed. I had to lower my resolution and refresh rate in order to get a minimum overlay stretch factor of 1.000 on my video card so I could run these tests.

Unfortunately, Overlay WinGlide was not able to draw its error messages or the background used for color keying when used with the Infinite Worlds beta. I was able to get the overlay to be displayed by disabling clipping but this presented some problems because the overlay remained after returning to town or the main menu and this prevented the upper right corner of the screen from being seen. Since there was always a white background displayed where the Overlay WinGlide error messages or the background color used for color keying should have been displayed, I was able to use clipping if I told Overlay WinGlide to use an all white background color (Red, Green, and Blue set to 255) since this matched the background color that was already there. There were some side effects from using white for color keying because the overlay was still displayed over white pixels in town or on the main menu but this was much better than having the overlay completely cover the upper right corner of the screen.

When using Overlay WinGlide, the OpenGL output was not displayed in the correct place. It was a little bit lower than it should have been. I think this occurred because when Overlay WinGlide changes the window style of the window it is being used with to be resizable, it will give it a title bar if it doesn't have one already and the OpenGL output looked like it was about a title bar's height too low.

Some information about Overlay WinGlide not working with certain Matrox video cards
I found some information on the Chips page of The Almost Definitive FOURCC Definition List indicating that the Matrox Millennium and Matrox Mystique do not support overlays but instead support off screen YUV surfaces and BitBlt functions that can do color space conversion. This would explain why Overlay WinGlide does not work with these video cards. I'm not sure if newer cards from Matrox have similar limitations or if they might support an overlay surface with a common YUV pixel format. More information can be gathered when I get an Overlay WinGlide release with support for common YUV pixel formats put together.

10/18/98
Overlay WinGlide v0.97 released
Overlay WinGlide v0.97 has been released.
Overlay WinGlide can now use triple buffered overlays which can improve performance over the double buffered overlay support available in the previous version of Overlay WinGlide. The DoubleBuffering key in the [Overlay] section of wg.ini has been changed to the BackBufferCount key. Set the BackBufferCount key equal to 0 for a single buffered overlay, 1 for a double buffered overlay, or 2 for a triple buffered overlay.

Error handling has changed some. Instead of allowing the application being used with Overlay WinGlide to run without video if the overlay surface could not be created, Overlay WinGlide will cause the grSstWinOpen call to fail when this error and other similar errors occur.

The minimum overlay stretch factor is checked to see if it will prevent the overlay from being displayed even if the window being used with Overlay WinGlide is maximized.

Setting the new SystemMenuOptions key in the [WinGlide] section of wg.ini equal to 1 will allow WinGlide to add menu items to the system menu. Right now it only adds one item to the system menu that will reset the window size of the application being used with Overlay WinGlide so that the overlay is not stretched. More options may be added in the future. If you are using WinGlide with an application that adds its own items to the system menu, this option should be disabled because it could cause WinGlide to interfere with the processing of those menu items.
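As an illustration of the two keys described in this release, a wg.ini fragment might look like the one below. The section and key names come from the notes above. The particular values chosen, and the use of ; for comments (the usual convention for Windows ini files), are assumptions, and any other keys in your wg.ini are left out.

```ini
[WinGlide]
; 1 lets WinGlide add its item to the system menu; leave it disabled
; for applications that add their own system menu items.
SystemMenuOptions=1

[Overlay]
; 0 = single buffered, 1 = double buffered, 2 = triple buffered overlay
BackBufferCount=2
```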

10/14/98
Improving performance with triple buffered overlays
When Overlay WinGlide is using a double buffered overlay, it is possible for the next frame to be drawn in the back buffer of the 3Dfx card before the overlay gets flipped but it is not possible to copy a completed frame from the back buffer of the 3Dfx card to the back buffer overlay before the overlay gets flipped. If the overlay can be flipped in a shorter amount of time than it takes for the 3Dfx card to draw the next frame, there will not be a performance hit when using a double buffered overlay instead of a single buffered overlay. If the 3Dfx card can draw the next frame before the overlay can be flipped, there will be a performance hit when going from a single buffered overlay to a double buffered overlay because Overlay WinGlide will have to wait for the overlay to be flipped before copying the completed frame from the back buffer of the 3Dfx card to the back buffer overlay.

I ran some benchmarks on my computer with GLQuake and Overlay WinGlide and found that at 512x384, the frame rate did not decrease when going from single to double buffering. At 400x300, the frame rate did decrease when switching from single to double buffering. These performance decreases can be eliminated by using a triple buffered overlay which will be able to provide tear free output with a frame rate equivalent to the frame rate when using a single buffered overlay (as long as the frame rate is lower than the refresh rate of the 2D card displaying the overlay). I've tested triple buffering with Overlay WinGlide on my computer and verified that it was able to provide the same frame rate as single buffering in all of the benchmarks I ran where there had been a performance decrease when using double buffering instead of single buffering.
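The reasoning above can be turned into a very small timing model. The C sketch below is only an illustration under simplifying assumptions: every frame takes the same time to draw and to copy, the wait for an overlay flip is a fixed number of milliseconds, and the frame rate stays below the refresh rate so that a triple buffered overlay never has to wait. The function name and its parameters are hypothetical and are not part of WinGlide.

```c
#include <assert.h>

/* Return the larger of two times. */
static double max_ms(double a, double b)
{
    return a > b ? a : b;
}

/* Estimated time per frame in milliseconds.
 *   draw_ms           time for the 3Dfx card to draw a frame
 *   copy_ms           time to copy the finished frame to the overlay
 *   flip_wait_ms      time between requesting a flip and the flip completing
 *   back_buffer_count 0 = single, 1 = double, 2 = triple buffered overlay
 */
double frame_time_ms(double draw_ms, double copy_ms,
                     double flip_wait_ms, int back_buffer_count)
{
    assert(back_buffer_count >= 0 && back_buffer_count <= 2);
    assert(draw_ms >= 0.0 && copy_ms >= 0.0 && flip_wait_ms >= 0.0);

    if (back_buffer_count == 1) {
        /* Double buffering: the next frame is drawn while the flip is
         * pending, but the copy cannot start until both have finished. */
        return max_ms(draw_ms, flip_wait_ms) + copy_ms;
    }

    /* Single buffering never waits for a flip. Triple buffering always
     * has a free back buffer (assumed: frame rate below refresh rate). */
    return draw_ms + copy_ms;
}
```

With made-up numbers, a frame that takes 10 ms to draw and 20 ms to copy costs 30 ms single or triple buffered, but 36 ms double buffered if the flip takes 16 ms to complete. If drawing takes longer than the flip wait, all three modes give the same result, which is consistent with the double buffered frame rate dropping at 400x300 but not at 512x384.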

10/8/98
Overlay WinGlide v0.96 released
Overlay WinGlide v0.96 has been released.
Overlay WinGlide v0.96 has the option to use a double buffered overlay which prevents tearing. A few bugs in previous versions of Overlay WinGlide have been fixed.

What else is new in Overlay WinGlide v0.96
- More detailed error messages are displayed when an overlay cannot be created.
- Minor optimizations in the MMX gamma correction code.
- Fast pass through to glide2x.dll.
- MMX detection. If the MMX key is not present in wg.ini, WinGlide will detect the presence of MMX support and use MMX instructions automatically if they are available. If you want to, it is possible to override the MMX detection. See the readme file for details.
- Better error checking and handling in the window subclassing code.

10/7/98
Working double buffered overlays
Double buffered overlays are working on my video card this time. The next Overlay WinGlide release will have the option to use a double buffered overlay surface which will be able to provide output in a window without tearing.

10/3/98
New Monster 3D
My replacement Monster 3D arrived yesterday. I was able to test the new MMX gamma correction code I had been working on and it had no problems. I'm going to start working on adding double buffering to Overlay WinGlide next.

10/1/98
Back to school
The first day of classes is today.

A day late
My replacement Monster 3D arrived the day after I left for Davis so it will need to get mailed again.

9/23/98
Minor MMX gamma correction optimizations
While thinking about ways to do 16 bit RGB to YUV conversions, I've found some ways to optimize the MMX gamma correction code. When I first added the MMX gamma correction code to WinGlide, I didn't have an MMX system to test it on so I was more concerned about making sure it worked right the first time rather than aggressively optimizing it. Although I should be able to significantly reduce the number of instructions needed to process each pixel, there will probably be little or no performance increase (depending on processor type and clock speed) because of the throughput bottleneck when reading from the 3Dfx card. I still haven't replaced my broken 3Dfx card yet but there is a good chance I will be able to get this taken care of this week. Once I get the new card, I'll be able to test the new MMX gamma correction code and I'll be able to start testing double buffered overlay support.

9/14/98
RGB, YUV, FOURCC
Support for overlay surfaces and supported pixel formats varies considerably. Right now Overlay WinGlide only supports 16 bit RGB (565) overlay surfaces. If an overlay surface with that pixel format is not supported by your video card, Overlay WinGlide will not work with it. If you see a video card in the Overlay WinGlide hardware compatibility list that did not work with Overlay WinGlide at all, this does not necessarily mean that the video card has no hardware overlay support. It just means the video card did not support the 16 bit RGB overlay surface used by Overlay WinGlide. It is also possible that the video card supported the correct overlay surface pixel format but that there was not enough memory left on the video card to store the overlay surface. I'd like to make future versions of Overlay WinGlide report when the creation of an overlay surface failed because of insufficient video memory (right now Overlay WinGlide does not say why it was unable to create an overlay surface). This would make it easier to tell if Overlay WinGlide did not work because the pixel format it wanted was not supported or because there was not enough video memory.

Support for overlay surfaces with non-RGB pixel formats seems to be fairly common. The pixel formats of these surfaces are identified by Four-Character Codes (FOURCC). Video data is often stored in the YUV color space and many video cards support overlay surfaces with YUV pixel formats. I've been wondering if some video cards support certain 16 bit per pixel YUV overlay surfaces but do not support 16 bit per pixel RGB overlay surfaces. If this is the case, it would be possible to make Overlay WinGlide work with more video cards if I could convert the 16 bit per pixel RGB data read from the 3Dfx card to a supported YUV pixel format. The common 16 bit YUV pixel formats that I've seen sample Y values every pixel and sample U and V values every other pixel. This will make converting the RGB data to these formats a bit more complicated than if Y, U, and V were sampled for every pixel. If these formats stored a full set of Y, U, and V values in each pixel's 16 bits, instead of only averaging 16 bits per pixel, I could simply calculate a gamma correction table that would do gamma correction and color space conversion (RGB to YUV) at the same time and performance would be the same as when doing gamma correction alone. I'm hoping I can find a way to do the RGB to YUV conversion without reducing performance very much. I'm also not sure how much image quality loss there will be in the RGB to YUV conversion but hopefully it will be minimal.

For more information about YUV pixel formats and FOURCCs, visit The Almost Definitive FOURCC Definition List.

9/11/98
Another hard drive with bad bearings
Last week, my hard drive began making very loud high pitched whining noises. Fortunately it was still operational when my replacement drive arrived yesterday and I was able to transfer everything to the new drive. My last hard drive failure was also caused by bad bearings but it happened more suddenly. I was using my computer and it started making some noises that sounded like they might be coming from the speaker but they were actually coming from the hard drive. Soon after the noises began, I got a disk error. Everything on that drive was lost.

9/3/98
Double buffered overlays
Overlay WinGlide has more problems with tearing than B2 WinGlide because of the low throughput when reading from the 3Dfx card. I'm thinking about adding an option to a future version of Overlay WinGlide that will tell it to use a double buffered overlay surface. When I first started working on Overlay WinGlide, I tried using a double buffered overlay surface with my video card but it didn't work right (there were some strange image quality problems). Hopefully other video cards will be able to correctly flip the type of overlay surface used by Overlay WinGlide.

8/27/98
B2 WinGlide v0.96 released for testing
My Monster 3D is pretty much completely broken now so I haven't been able to do much testing with this version of WinGlide. Since I won't be able to get a suitable replacement within a week, I've decided to release B2 WinGlide v0.96 for testing. I've carefully reviewed the changes to this version of WinGlide and I am fairly confident that it will work as expected. Earlier, I was able to run some longer tests with some of the new features in this version of WinGlide when my Monster 3D was able to run for a minute or two before failing. By the time I compiled the final build, GLQuake was only able to display a couple of garbled images in a window before my Monster 3D locked up. Although this did help verify that some parts of this WinGlide release were working, I would have liked to have tested the final build more extensively. If you have any problems or if this version of WinGlide works okay on your system, either way I'd like to know, so if you get a chance, send me an email.

B2 WinGlide replaces Gamma WinGlide
With this release of B2 WinGlide, B2 WinGlide is now a replacement for Gamma WinGlide. B2 WinGlide is Gamma WinGlide plus support for window resizing when used with 3Dfx OpenGL Beta 2.1. Since the window resizing support should not interfere with programs that use fixed sized windows, B2 WinGlide should be compatible with all of the programs that currently work with Gamma WinGlide. Only adding new features to B2 WinGlide instead of both B2 WinGlide and Gamma WinGlide will make WinGlide development easier for me. Just in case there are any fixed sized window programs that do not work with B2 WinGlide but do work with Gamma WinGlide, the next release of B2 WinGlide may include an option to disable B2 WinGlide's window resizing support (this would make it work like Gamma WinGlide).

What's new in B2 WinGlide v0.96
- Fast pass through to glide2x.dll. This version of WinGlide significantly reduces the number of instructions needed to pass calls through to the glide2x.dll in the system directory. On systems where the CPU is more of a limiting factor than the 3Dfx card, this should improve performance a little bit. The performance increase is not all that big and will be most easily noticed when WinGlide is installed with a program running in a full screen video mode.
- MMX detection. If the MMX key is not present in wg.ini, WinGlide will detect the presence of MMX support and use MMX instructions automatically if they are available. If you want to, it is possible to override the MMX detection. See the readme file for details.
- Better error checking and handling in the window subclassing code.
- An advanced option has been added to this version of WinGlide.


8/20/98
MMX detection
Seeing that I didn't have any MMX detection built into WinGlide, Andy Murdoch sent me some MMX detection code. I'm finally getting around to adding this to WinGlide. The next B2 WinGlide release will feature MMX detection and if it works out well, it will be added to other new versions of WinGlide. If desired, it will be possible to override the MMX detection.
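For readers curious what MMX detection involves, here is a minimal C sketch. It is not the code Andy Murdoch contributed. On x86 processors, CPUID function 1 reports MMX support in bit 23 of the EDX register. The sketch uses the __get_cpuid helper from the cpuid.h header shipped with GCC and Clang, and reports no MMX on other compilers or processors. The use_mmx function and its -1/0/1 encoding of the MMX key are hypothetical; the actual values accepted in wg.ini are described in the readme file.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* CPUID function 1 sets bit 23 of EDX when MMX is supported. */
#define CPUID_EDX_MMX (1u << 23)

/* Decode the MMX feature bit from an EDX value. */
int edx_has_mmx(uint32_t edx)
{
    return (edx & CPUID_EDX_MMX) != 0;
}

/* Ask the processor whether MMX is available.
 * Returns 0 when CPUID cannot be queried in this build. */
int detect_mmx(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx_has_mmx(edx);
#endif
    return 0;
}

/* Hypothetical override logic: ini_value is -1 when the MMX key is
 * absent (auto detect), otherwise 0 or 1 to force MMX off or on. */
int use_mmx(int ini_value)
{
    assert(ini_value >= -1 && ini_value <= 1);
    return ini_value < 0 ? detect_mmx() : ini_value;
}
```

Keeping the bit test in its own function separates the decoding from the CPUID query, so the override path can be exercised even on a machine without MMX.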

When Plug N Play attacks
I spent a few hours over the past couple of days trying to fix some of my Plug N Play problems. Plug N Play can be great when it works right but when it doesn't...

The routine sound driver update gone bad
I tried to update the drivers for my Sound Blaster 16 PnP in WinNT 4.0 from revision 7 to revision 8 and didn't expect any major problems, but after installing the new drivers, the 16 bit DMA channel wasn't working. I could still get some sound, but it didn't sound very good and sound was choppy in games. Then I tried going back to the revision 7 drivers that were installed before but the 16 bit DMA channel was still not working. I finally got my 16 bit sound working again by disabling the PnP ISA enabler, installing some really old drivers, and instead of using PnP, I just told the driver what IO ports, IRQ, and DMA channels to use. I might try getting the new drivers, or maybe the revision 7 drivers that were working before, to work again, but after 20-30 reboots trying various things in an attempt to get these newer drivers working, I'm going to stick with the old ones for a while.

Setting up the printer port
I wanted to try out my printer with my new computer but the printer port wasn't working in Win95. I'm not sure what Plug N Play had done with it but to fix this, I went into the BIOS and told it to use IRQ 7 and put its IO port at 378. Now the printer port didn't have the yellow exclamation point next to it in Win95 but when I tried to go into WinNT 4.0, my computer just hung. Taking a closer look at what was going on during boot up, I saw that telling the printer port to use IRQ 7 had caused my SCSI controller and video card to share an interrupt. I had a feeling that the drivers for one or both of these devices might not be designed to share interrupts. I was finally able to get NT to start up again by disabling one of my serial ports (fortunately, I had a serial port that was not in use). That moved my SCSI controller to IRQ 4 and now everything seems to work okay. I think the IDE controller on my SB16 is wasting one of my interrupts. I'm going to be working on the other computer at home and it has a SB16 with jumpers. Maybe I'll trade my new PnP card for the old one with jumpers so I can easily turn off the IDE controller and get my interrupt back. Perhaps if NT 4.0 had better Plug N Play support, this might not be a problem.

8/15/98
My Monster 3D is dying
Although there have been times when I have crashed my computer a lot while working on WinGlide, I didn't think the lockups I have been getting over the last couple of days were being caused by the next version of WinGlide that I've been working on. Some tests with full screen games (and no WinGlide installed) confirmed that my Monster 3D is dying. It works for a while and then it locks up (sometimes displaying a few messed up textures first). Hopefully this won't interfere with the next WinGlide release too much more than it has already. Maybe it will work reliably a little longer if I under clock it.

8/13/98
WinGlide and full screen performance
Installing WinGlide can reduce full screen performance a little bit. This is because each call to the glide2x.dll in the system directory passes through WinGlide. GLQuake is fill rate limited on my computer (PII 266 with Monster 3D) so I was not able to notice a difference in full screen performance when comparing timedemos with or without WinGlide installed in my Quake directory. On systems where the CPU rather than the 3Dfx card is more of a limiting factor, small full screen performance decreases may be observed when WinGlide is installed.

With the next release of WinGlide, I'm planning on adding some optimizations that will reduce the amount of processing done by WinGlide when passing functions through to the glide2x.dll in the system directory. On systems where there is a small full screen performance decrease with WinGlide installed, this should make this already small performance decrease even smaller. These changes will also make WinGlide smaller. The size of the B2 WinGlide glide2x.dll I've been working on went from 84KB to 80KB.

8/10/98
User feedback
Using force triple color buffer can cause problems
Daniel Selman reports that having force triple color buffer enabled in the advanced settings dialog for his Orchid Righteous 3D II caused problems with WinGlide and a windowed OpenGL application. The application ran okay with force triple color buffer disabled. Using 3 color buffers on the 3Dfx card will not benefit WinGlide performance but it does increase full screen performance when V Sync is enabled. I'm thinking about ways to fix this problem and if I am able to find a solution, I will add it to a future version of WinGlide.

WinGlide does not work with the 3DNow! optimized 3Dfx miniGL for Quake II
Dana Neff sent word that the 3DNow! 3Dfx miniGL for Quake II from AMD does not work with WinGlide. We discovered that it does not work with WinGlide because Glide v2.5 was integrated into the miniGL driver. With the integrated version of Glide, it does not load glide2x.dll and therefore does not load WinGlide.

8/7/98
GLQuake and Quake 2 WinGlide benchmarks
I finished the WinGlide benchmarks. These benchmarks compare the performance of single-threaded Gamma WinGlide, multi-threaded Gamma WinGlide, and Overlay WinGlide.

7/30/98
More benchmarks soon
I'm planning on running a set of WinGlide benchmarks on my computer with Quake 2 and GLQuake soon. These benchmarks will compare WinGlide with multi-threading disabled, Overlay WinGlide, and WinGlide running with multi-threading enabled on a dual processor system. I also plan to compare WinGlide performance at common resolutions and to include full screen performance numbers for supported resolutions.

7/24/98
Gamma WinGlide benchmarks
Here are some benchmarks that time the three main parts of Gamma WinGlide on my computer. These numbers show that the low read throughput when reading data from my 3Dfx card contributes most to WinGlide's slow frame rate. Some of the benchmarks I received while testing MMX gamma correction in the past suggest that common Voodoo 2 based cards have similar read throughput compared to my Monster 3D (Voodoo 1).

System Information:
Two 266MHz Pentium II processors
66 MHz bus speed (33MHz PCI bus speed)
128 MB SDRAM
Monster 3D running at 57MHz
Diamond Stealth 3D 2000 set to use 16 bit color
Windows NT 4.0
SET SST_FASTMEM=1
SET SST_FASTPCIRD=1
Although I have a dual processor system, WinGlide's multi-threading enhancements were disabled while running these benchmarks so that each of the three parts could be timed accurately. MMX and gamma correction were both enabled for optimal performance on my computer.

GLQuake 0.97 and timedemo demo1 were used to obtain the numbers presented below.
A timer with one millisecond resolution was used to time the three parts of Gamma WinGlide.

512x384 windowed
Time and throughput:
GLQuake: 16 ms
3Dfx to main memory: 33 ms, 11.9 MB/s
Main memory to 2D card: 8 ms, 49 MB/s
Rounding errors and overhead: 1 ms
Total time per frame: 58 ms
Frame rate reported by GLQuake: 17.2 fps
(1000 ms/s) / (58 ms/frame) = 17.2 fps which matches the frame rate reported by GLQuake.
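The throughput and frame rate figures above can be reproduced with a little arithmetic. The C sketch below is only an illustration, with hypothetical function names. It assumes 2 bytes per pixel (16 bit color) and that MB/s means millions of bytes per second, which is the interpretation that matches the numbers listed.

```c
#include <assert.h>

/* Size of one frame in bytes. */
long frame_bytes(int width, int height, int bytes_per_pixel)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (long)width * height * bytes_per_pixel;
}

/* Throughput in MB/s (assumed: 1 MB = 1,000,000 bytes) for a transfer
 * of `bytes` bytes that takes `ms` milliseconds. */
double throughput_mb_per_s(long bytes, double ms)
{
    assert(ms > 0.0);
    return ((double)bytes / 1.0e6) / (ms / 1000.0);
}

/* Frame rate in frames per second for a given time per frame. */
double frames_per_second(double ms_per_frame)
{
    assert(ms_per_frame > 0.0);
    return 1000.0 / ms_per_frame;
}
```

A 512x384 frame at 2 bytes per pixel is 393,216 bytes. Moving that in 33 ms works out to about 11.9 MB/s and moving it in 8 ms to about 49 MB/s. Adding the four times together (16 + 33 + 8 + 1 = 58 ms) gives about 17.2 frames per second.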

512x384 full screen with VSYNC disabled
Frame rate reported by GLQuake: 62.6 fps
As expected, the full screen GLQuake frame rate of 62.6 fps = 16 ms per frame which matches the time per frame reported by the modified benchmarking version of Gamma WinGlide.


Does writing to main memory slow down WinGlide's 3Dfx to main memory throughput very much?
No. To make sure that writing to main memory was not contributing much to low 3Dfx to main memory throughput, I modified the benchmarking version of Gamma WinGlide so that it would read data from the 3Dfx card but not write it to main memory. With these modifications, the read throughput from the Monster 3D on my computer was 12.3 MB/s which is only slightly higher than the 3Dfx to main memory throughput of 11.9 MB/s listed above. This means that even if the CPU is used to transfer frames directly to the 2D card (like in Overlay WinGlide), the low read throughput from the 3Dfx card will still limit WinGlide performance a lot.

Are there any ways that WinGlide can read data faster from the 3Dfx card?
Yes, but not much faster right now.
- Overclocking the PCI bus will increase read throughput some but since it is already so low, it will still limit WinGlide performance a lot.
- PCI cards that use bus mastering to write data directly to main memory can often get much better throughput compared to using the CPU to transfer the data to main memory. This can also leave the CPU free to do other processing while the transfer is taking place. Unfortunately, Glide does not support this type of transfer.
- It might be possible to get better read throughput if a bus mastering video card could read data directly from the 3Dfx card. Although certain video cards support using bus mastering to read data from system memory, I don't know if they would work correctly if I gave them the address of the locked frame buffer on the 3Dfx card which is on the PCI bus. DirectX 5 would be required to try something like this and some limitations in DirectX 5 would make things difficult. Even if it did work, I don't know if it would be faster, slower, or the same speed as using the CPU to do the transfer. When I get my new video card, I may spend some time looking into this if it supports using bus mastering to read data from system memory.

7/22/98
Timing the three parts of Gamma WinGlide
I've made some modifications to Gamma WinGlide so I can see how long each of the three main parts of Gamma WinGlide takes.
These are the three parts:
1. The time it takes the game to draw a frame (1 divided by the full screen frame rate)
2. Copying the frame from the 3Dfx card to memory
3. Copying the frame from memory to the 2D card

Using the CPU to read data from video cards on the PCI bus is much slower than using the CPU to write data to video cards on the PCI bus. The low throughput when using the CPU to read data from the 3Dfx card is what slows WinGlide down the most. My initial testing using GLQuake at 400x300 and timedemo demo1 showed part 2 taking longer than parts 1 and 3 combined. I calculated my 3Dfx card (Monster 3D) to main memory throughput to be around 12 MB/s. The bad news is that this is not very fast. The good news is that my WinGlide performance enhancements have doubled this throughput compared to what it was before.

For comparison, I wrote a program that uses DirectDraw to lock the frame buffer on my 2D card (Diamond Stealth 3D 2000) and then uses the MMX copy function from Gamma WinGlide to move data to and from video memory. My video card was set to use high color (16 bits per pixel) and 400x300 images were copied to and from video memory.
Here are the numbers:
Main memory to video memory throughput: 50 MB/s
Video memory to main memory throughput: 7 MB/s

I'm planning on posting some more detailed results and analysis soon.

7/12/98
Multi-threading has been added to B2 WinGlide and Mesa WinGlide
B2 WinGlide v0.95 and Mesa WinGlide v0.95 now support multi-threading. Enabling WinGlide's multi-threading feature can improve performance on multi-processor systems.

7/3/98
New computer
I've been working on putting my new computer together and moving everything to the new hard drive.

6/25/98
Gamma WinGlide v0.94 and Overlay WinGlide v0.94 released
Gamma WinGlide v0.94 and Overlay WinGlide v0.94 have been released.
Multi-threaded WinGlide is now a part of Gamma WinGlide (Multi-threading can be enabled in Gamma WinGlide's ini file).

These releases fix a bug that caused the wrong part of the frame buffer in the 3Dfx card to be copied to the 2D video card when using 3Dfx OpenGL Beta 2.1 with WinGlide at certain resolutions. This bug is not present in B2 WinGlide v0.94.

6/16/98
WinGlide and 3Dfx OpenGL Beta 2.1 do not support programs that use OpenGL to render to multiple windows
3Dfx OpenGL Beta 2.1 and the Mesa Voodoo driver only support programs that use OpenGL to render to a single window. Since multiple windows are not supported, many programs that use OpenGL will not work with WinGlide and 3Dfx OpenGL Beta 2.1 or the Mesa compatible version of WinGlide and the Mesa Voodoo driver. I have received many emails asking if 3D Studio MAX R2 will work with WinGlide. It will not work because it uses OpenGL to render to multiple windows.

6/8/98
Faster Mesa compatible version of WinGlide
I added my WinGlide performance enhancements to Gerhard Pfeiffer's version of WinGlide that works with Mesa. This new version of WinGlide can be found in the [Mesa WinGlide] section of the page.

If you are using WinGlide with a Pentium Pro or Pentium II, make sure to read about Gamma correction and P6 WinGlide performance in the [Improving WinGlide performance] section of the page.

WinGlide with window resizing support that works with 3Dfx OpenGL Beta 2
B2 WinGlide is designed to be used with 3Dfx OpenGL Beta 2 or 3Dfx OpenGL Beta 2.1 to support some programs that use OpenGL to render to a single window. Unlike Gamma WinGlide, B2 WinGlide will handle window resizing when used with 3Dfx OpenGL Beta 2 or 3Dfx OpenGL Beta 2.1. You can get more information about this version of WinGlide and download the first release in the [B2 WinGlide] section of the page.

If you are using WinGlide with a Pentium Pro or Pentium II, make sure to read about Gamma correction and P6 WinGlide performance in the [Improving WinGlide performance] section of the page.

6/5/98
3Dfx OpenGL Beta 2.1
3Dfx has released 3Dfx OpenGL Beta 2.1.

The latest releases of WinGlide do not always work correctly when used with 3Dfx OpenGL Beta 2.1. I'm currently working on fixing these WinGlide compatibility problems.

6/2/98
WinGlide v0.93 released
Gamma WinGlide v0.93, Overlay WinGlide v0.93, and Multi-threaded WinGlide v0.93 have been released.
WinGlide v0.93 now includes gamma correction support that uses MMX instructions. On Pentium MMX systems, the new MMX gamma correction is not much faster than the gamma correction in WinGlide v0.92. On Pentium II systems, the new MMX gamma correction is considerably faster than the gamma correction support in previous versions of WinGlide.

If you are using WinGlide with a Pentium Pro or Pentium II, make sure to read about Gamma correction and P6 WinGlide performance in the [Improving WinGlide performance] section of the page.

Overlay WinGlide v0.93 fixes a bug with the Close command on the system menu. In order for Overlay WinGlide to add minimize and maximize buttons to the GLQuake or Quake 2 window, it must also add a system menu. This does not cause problems with GLQuake because its window already has a system menu. The Quake 2 window does not have a system menu before Overlay WinGlide changes the style of its window. When using previous versions of Overlay WinGlide with Quake 2, selecting Close from the system menu (or clicking on the X in the upper right corner) will close the Quake 2 window but still leave Quake 2 running. Overlay WinGlide v0.93 will now disable the Close command on the system menu when used with Quake 2.

WinGlide and 3Dfx OpenGL Beta 2
I'm sure many of you have already discovered that WinGlide can be used with 3Dfx OpenGL Beta 2 to run some programs that use OpenGL (besides GLQuake and Quake 2) to render in a window. Right now, WinGlide does not handle window resizing but I'm working on modifying it so that window resizing is handled when WinGlide is used with programs that use OpenGL to render to a single resizable window. Although these modifications have been working well with some programs, many programs are still incompatible with WinGlide and 3Dfx OpenGL Beta 2. If you have been using WinGlide and 3Dfx OpenGL Beta 2 with a program that uses OpenGL to render to a single resizable window and the size of the image displayed does not change from its initial size when you change the size of the window, the modifications I'm working on may solve this problem. If there were other problems like no image being displayed at all, these modifications will probably not help fix things. If you've been using Gerhard's modified WinGlide with Mesa, I'm planning on adding my WinGlide performance enhancements to this version of WinGlide. If all goes well, both of these new WinGlides will be released this week.

5/31/98
MMX gamma correction performance
Thanks to all who helped test out my beta release of WinGlide with MMX gamma correction support. The MMX gamma correction code will be added to the next release of WinGlide. Here are the results of the benchmarks comparing the new MMX gamma correction code with WinGlide's current gamma correction code. Quake 2 framerates were used to make the comparisons.

On Pentium MMX systems, the new MMX gamma correction code did not make much of a difference when compared with WinGlide v0.92 gamma correction. It was between 0-2% faster. None of the benchmarks showed the new MMX gamma correction code running slower than the gamma correction in WinGlide v0.92.

On Pentium II systems, the new MMX gamma correction code was much faster than the gamma correction in WinGlide v0.92. Part of this performance increase may have come from using MMX instructions but I think that much of the performance increase was a result of the MMX gamma correction code fixing some of the performance problems that P6 processors are having with WinGlide. When compared with the gamma correction in WinGlide v0.92, performance increases of around 40-50% were seen, but when compared with WinGlide v0.91 with gamma correction enabled and UseP5Code=1, the new MMX gamma correction code was around 10% faster. Although WinGlide v0.91 with gamma correction enabled and UseP5Code=1 was supposed to be slower on P6 processors, based on the benchmarks I've received, this is currently the fastest version of WinGlide for P6 processors. I'm continuing to investigate WinGlide performance on P6 processors.

5/23/98
More information about WinGlide performance on P6 processors
Further testing has indicated that the differences between the way P5s and P6s handle caching are not having much of an impact on WinGlide performance when gamma correction is enabled. On Pentium II systems, test releases of WinGlide that used the small gamma correction table from WinGlide v0.91 had almost identical performance compared to similar test releases that used the big gamma correction table from WinGlide v0.92. All of the results I have received so far regarding WinGlide performance on P6 processors indicate that WinGlide runs much faster when gamma correction is enabled. I'm still not sure why this is happening. If you have a Pentium Pro or Pentium II and find that WinGlide runs faster when gamma correction is enabled rather than disabled but you don't want gamma correction, you can enable gamma correction and set the gamma correction factor to 1.0. On Pentium and Pentium MMX processors, as expected, having gamma correction enabled reduces WinGlide performance a little bit.

5/19/98
WinGlide v0.92 gamma correction slower than WinGlide v0.91 gamma correction on Pentium Pro and Pentium II systems
I've received emails from a few people with P6s saying that the new gamma correction code in WinGlide v0.92 is slower than the gamma correction code in WinGlide v0.91. I'm working on fixing this problem and I may end up adding an option to the ini file that will select P6 optimized gamma correction code or P5 optimized gamma correction code.

I think the gamma correction in WinGlide v0.92 may have run slower than WinGlide v0.91 on P6 processors because P6s use a different caching strategy than P5s. The way an Intel Pentium or Pentium MMX processor handles write misses works much better with WinGlide's gamma correction feature than the way Intel's Pentium Pro and Pentium II processors handle write misses. By using the half size gamma correction table (where each 16 bit entry is aligned on 16 bit boundaries instead of 32 bit boundaries) from WinGlide v0.91, it is possible to fit twice as many entries from the gamma correction table in the processor's L1 cache compared to the gamma correction table used in WinGlide v0.92. Since the P6 caching strategy does not cache the gamma correction table used by WinGlide as well as P5s, the larger gamma correction table from WinGlide v0.92 was probably causing a lot more cache misses on P6s than the smaller table from WinGlide v0.91. Although the P6 caching strategy does not work as well as the P5 caching strategy does with WinGlide's gamma correction feature, it was used with the P6 processors because it increases performance in many cases.

The test release of WinGlide with MMX gamma correction works
The MMX gamma correction code works. This is good news because trying to figure out what's wrong with MMX code without a processor that has MMX instructions around to run it on could be difficult. Based on the benchmarks I've received so far, the new MMX gamma correction code has either been faster or as fast as the gamma correction code from WinGlide v0.92. I haven't gotten many benchmarks back yet so I'll post some more information about performance after more results have been sent in.

5/18/98
3Dfx OpenGL Beta 2
3Dfx has released 3Dfx OpenGL Beta 2.

5/17/98
WinGlide with MMX gamma correction testing begins
I have enough volunteers for testing the prototype version of WinGlide with MMX gamma correction support. If this release works and performance is improved on some systems, MMX gamma correction support will be added to the next release of WinGlide. If you emailed me about testing and benchmarking this prototype version of WinGlide, I'll be sending out the instructions for using it later tonight.

5/16/98
Using MMX for faster WinGlide gamma correction
I spent some time looking into speeding up WinGlide gamma correction by using MMX instructions. I have a prototype version of Gamma WinGlide with MMX gamma correction support that may be able to run faster on Pentium MMX and Pentium II systems but I don't have a system with MMX to test it out on. If you have a Pentium MMX or Pentium II system and would like to help test and benchmark this prototype version of WinGlide for me, send me an email asking about it and I'll tell you where to get the files and any other special information you need to know about this version of WinGlide. Although this version of WinGlide will actually have to execute more instructions than Gamma WinGlide v0.92, using MMX instructions to access memory may help it run faster overall.

5/13/98
WinGlide v0.92 released
Gamma WinGlide v0.92, Overlay WinGlide v0.92, and Multi-threaded WinGlide v0.92 have been released.
WinGlide v0.92 has faster gamma correction support than WinGlide v0.91. A few bugs have also been fixed in the v0.92 release.

Gamma WinGlide v0.92
Beginning with this release of Gamma WinGlide, Pentium WinGlide and MMX WinGlide are now part of Gamma WinGlide. If gamma correction is disabled, Gamma WinGlide will work like Pentium WinGlide if MMX support is not enabled or MMX WinGlide if MMX support is activated.

Overlay WinGlide v0.92
With previous versions of Overlay WinGlide, when used with Quake 2, the overlay would be displayed in the upper left hand corner of the screen until the Quake 2 window was moved or resized. This bug has been fixed.

Multi-threaded WinGlide v0.92
A bug that could have allowed memory to be freed before one of the threads was done accessing it has been fixed.

I ran some benchmarks on my computer using GLQuake and timedemo demo1 to compare the new gamma correction code in WinGlide v0.92 with the gamma correction code in WinGlide v0.91. The benchmarks were run using a 400x300 windowed resolution.
Gamma WinGlide
v0.91: 20.4 fps - gamma correction is always enabled with v0.91
v0.92: 21.0 fps - with gamma correction enabled
v0.92: 21.3 fps - with gamma correction disabled (Pentium WinGlide)
Overlay WinGlide with gamma correction enabled
v0.91: 21.9 fps
v0.92: 22.5 fps
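
The relative differences implied by the figures above can be worked out with a few lines of arithmetic. This is only an illustrative sketch: the function name is hypothetical, and the inputs are the framerates listed above.

```python
def percent_faster(new_fps, old_fps):
    """Return how much faster new_fps is than old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0

# Gamma WinGlide: v0.92 with gamma correction enabled vs. v0.91
gamma_gain = percent_faster(21.0, 20.4)

# Overlay WinGlide with gamma correction enabled: v0.92 vs. v0.91
overlay_gain = percent_faster(22.5, 21.9)

# Gamma WinGlide v0.92: gamma correction disabled vs. enabled
gamma_cost = percent_faster(21.3, 21.0)

for label, value in [
    ("Gamma WinGlide v0.92 over v0.91", gamma_gain),
    ("Overlay WinGlide v0.92 over v0.91", overlay_gain),
    ("v0.92 gamma disabled over enabled", gamma_cost),
]:
    print(f"{label}: {value:.1f}% faster")
```

On these numbers, both v0.92 gains are a little under 3%, and disabling gamma correction in v0.92 is worth roughly another 1.4%.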

I didn't run extensive benchmarks with Quake 2 or GLQuake at different resolutions but the quick comparisons I made using both Gamma WinGlide and Overlay WinGlide showed that the v0.92 gamma correction support was also a little bit faster than the v0.91 gamma correction support when used with Quake 2.

5/8/98
Coming soon - Better WinGlide gamma correction support
I'm working on improving WinGlide's gamma correction support. The new gamma correction code accesses the gamma correction table 32 bits at a time instead of 16 bits at a time. This can make the color lookups faster because slow prefixed opcodes are no longer being used but it also makes the gamma correction table twice as big (256KB instead of 128KB). On my computer, the new gamma correction code runs GLQuake a little bit faster. Quake 2 runs at about the same speed. This is probably because Quake 2 is more colorful than GLQuake and with the larger gamma correction table, this could cause more cache misses. Since this new code accesses memory 32 bits at a time, it will not cause partial register stalls on P6 processors and the UseP5Code option in the ini file will not be needed. The P6 friendly code in previous versions of WinGlide with gamma correction support may have actually run slower on some P6 systems than the P5 code because the extra memory accesses used to prevent partial register stalls may have hurt performance more than the partial register stall penalties.
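
The table sizes quoted above are consistent with a lookup table that has one entry for every possible 16 bit pixel value (65536 entries at 2 or 4 bytes each). The following is a minimal sketch of that idea, not WinGlide's actual code: it assumes an RGB565 pixel layout and a simple power-law gamma formula, and all function names are hypothetical.

```python
def build_gamma_table(gamma):
    """Build a 65536-entry table mapping each 16 bit pixel to its corrected value.

    Assumes RGB565 pixels (5 bits red, 6 bits green, 5 bits blue).
    """
    def correct(value, max_value):
        # out = max * (in / max) ** (1 / gamma), rounded to the nearest level
        return int(round(max_value * (value / max_value) ** (1.0 / gamma)))

    red = [correct(v, 31) for v in range(32)]
    green = [correct(v, 63) for v in range(64)]
    blue = [correct(v, 31) for v in range(32)]

    table = [0] * 65536
    for pixel in range(65536):
        r = red[(pixel >> 11) & 0x1F]
        g = green[(pixel >> 5) & 0x3F]
        b = blue[pixel & 0x1F]
        table[pixel] = (r << 11) | (g << 5) | b
    return table


def gamma_correct_scanline(pixels, table):
    """Gamma correct a scanline with one table lookup per pixel."""
    return [table[p] for p in pixels]


def table_size_kb(bytes_per_entry):
    """Size of a 65536-entry table in KB for a given entry width."""
    return 65536 * bytes_per_entry // 1024


identity = build_gamma_table(1.0)   # a factor of 1.0 leaves pixels unchanged
brighter = build_gamma_table(1.5)   # factors above 1.0 brighten midtones
```

With this layout, `table_size_kb(2)` and `table_size_kb(4)` give the 128KB and 256KB figures mentioned above. Widening each entry doubles the memory the lookups touch, which fits the cache miss concern.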

5/3/98
Was not able to make fullscreen only games run windowed
I tried to get some fullscreen only games that use Glide to run windowed but found out that this was not possible because they use DirectDraw exclusive mode. It was possible to copy the frame from the Voodoo's frame buffer to the fullscreen only display on the 2D card but this doesn't help much because it is slower and it is still fullscreen. This would be beneficial if you had a Voodoo card with a broken video output system. I won't be spending any more time on this because of the compatibility problems that I had with games, besides GLQuake and Quake 2, that used Glide. If you have a Voodoo card with a broken video output and a 2D card that supports Overlay WinGlide, it is possible to run GLQuake or Quake 2 at a low resolution such as 320x240 or 400x300, stretch the output window to a good size, and still get a good framerate. It is also possible to use WinGlide to run GLQuake or Quake 2 in a fullscreen low resolution video mode such as 320x240 or 400x300 if your primary video card supports these DirectDraw video modes.

5/1/98
Future plans updated
I added a few things to the Future Plans section of the page. Future plans lists stuff I may be working on now or would like to work on in the future. Some of the things listed there may not happen any time soon or at all because of limited time and limited hardware support. I may release a new version of Gamma WinGlide with MMX gamma correction support soon that could speed things up by a few percent. Since I don't have an MMX capable system to try it out on, the first release would be a public alpha release.

4/26/98
Multi-threaded WinGlide v0.91 (Beta)
[Multi-threaded WinGlide] version 0.91 for multi-processor systems has been released. It includes MMX support, gamma correction support, more efficient thread synchronization, and a couple of minor bug fixes.

Since I don't have a multi-processor system to test it out on, it will be called a beta release until I hear that it's improving performance on multi-processor systems. If you have a multi-processor system and try out this version of WinGlide, please send me a quick set of benchmarks so I know that the new thread synchronization code I added is working properly. Make sure to mention how you benchmarked it and what resolution the benchmark was run at. This version of Multi-threaded WinGlide is similar to Multi-threaded WinGlide release 1 so it should improve performance by about 10-30% over the versions of WinGlide without multi-threading when run on a multi-processor system.
- If MMX and gamma correction are both disabled, compare this version of WinGlide with [Pentium WinGlide].
- If MMX is enabled and gamma correction is disabled, compare this version of WinGlide with [MMX WinGlide].
- If gamma correction is enabled, compare this version of WinGlide with [Gamma WinGlide].
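
The comparison rules in the list above can be written as a small helper for anyone scripting benchmark runs. This is just a sketch; the function name is hypothetical.

```python
def baseline_for(mmx_enabled, gamma_enabled):
    """Return the single-threaded WinGlide version to benchmark against."""
    if gamma_enabled:
        # Gamma correction decides the baseline regardless of the MMX setting.
        return "Gamma WinGlide"
    if mmx_enabled:
        return "MMX WinGlide"
    return "Pentium WinGlide"


print(baseline_for(mmx_enabled=True, gamma_enabled=False))
```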

4/23/98
Multi-threaded WinGlide Release 2 discontinued
Multi-threaded WinGlide Release 2 is being discontinued because of stability problems. Although it works okay for a while, it eventually causes Windows NT 4.0 to crash. Sometimes there is a blue screen and other times it just completely locks up. I'm going to continue work on Multi-threaded WinGlide Release 1 because it does not have the stability problems that Release 2 has and it can improve WinGlide performance considerably on multi-processor systems. Multi-threaded WinGlide v0.91 will be based on Multi-threaded WinGlide Release 1 and include better thread synchronization, MMX support, and gamma correction.

4/20/98
Two new versions of WinGlide with gamma correction support have been released
Gamma WinGlide v0.91 and Overlay WinGlide v0.91 with gamma correction support have been released.
WinGlide v0.91 fixes the problems that v0.9 was having with Quake 2 and Windows 95.

Gamma WinGlide v0.91
- Has adjustable gamma correction support.
- Not as fast as Pentium WinGlide or MMX WinGlide (3-5% lower framerate than Pentium WinGlide on my Pentium 133 with a Monster 3D).
- Use this version of WinGlide instead of Pentium WinGlide or MMX WinGlide if you need gamma correction support.

Overlay WinGlide v0.91 with gamma correction support
- Has all of the features included in Overlay WinGlide v0.8.
- Overlay WinGlide's previous gamma correction feature has been renamed to alpha blending.
- Includes the new gamma correction support from Gamma WinGlide v0.91. Since the gamma correction is done by the CPU, it will work on any system that supports Overlay WinGlide.
- The new gamma correction feature does reduce performance some (10-15% lower framerate than Overlay WinGlide with gamma correction disabled on my Pentium 133 with a Monster 3D). Even with this reduced framerate, Overlay WinGlide v0.91 with gamma correction enabled is still faster or at least as fast as Pentium WinGlide when used with GLQuake and Quake 2 on my computer.

Working on fixing the problem that WinGlide v0.9 is having with Quake 2 and Windows 95
I found out how to work around the problem that WinGlide v0.9 is having with Quake 2 and Windows 95. I should be able to complete and upload fixed versions of WinGlide (Gamma WinGlide v0.91 and Overlay WinGlide v0.91) later today. The workaround involves statically linking the C Run-time library. The side effect is that this makes WinGlide much larger (Gamma WinGlide's glide2x.dll goes from 23.5KB to 81.5KB).

Quake 2 and Windows 95 compatibility problem with Gamma WinGlide v0.9 and Overlay WinGlide v0.9
When using WinGlide v0.9 with Quake 2 and Windows 95, an invalid page fault will occur in KERNEL32.DLL when exiting Quake 2 and when switching video modes. GLQuake and WinGlide v0.9 do not have this problem in Windows 95 or Windows NT 4.0. Quake 2 and WinGlide v0.9 do not have this problem in Windows NT 4.0. Even though Windows 95 says the program will be shut down after the page fault occurs in KERNEL32.DLL, Quake 2 continues to run after pressing the Close button.

I'm working on fixing this compatibility problem and will upload fixed versions of WinGlide with gamma correction support as soon as possible. I've already determined that the problem is caused by initializing the C Run-time library. Previous versions of WinGlide did not have this problem because they did not need to use the C Run-time library. The gamma correction support added to WinGlide v0.9 does some floating point math that requires functions from the C Run-time library.

4/19/98
Two new versions of WinGlide with gamma correction support have been released
Gamma WinGlide v0.9 and Overlay WinGlide v0.9 with gamma correction support have been released.

Special update: Make sure you read about the compatibility problem that these two new versions of WinGlide have with Quake 2 and Windows 95 (It's mentioned in the 4/20/98 news). WinGlide v0.9 will not work unless the C Run-time library msvcrt.dll is installed in your system directory. This dll is redistributable so I'll either include it in the next release or it may not be necessary to include it if I end up statically linking the C Run-time library to the versions of WinGlide with gamma correction support. I think Windows NT 4.0 comes with msvcrt.dll but you may not have this file in Windows 95 if another application did not install it in your system directory.

Gamma WinGlide v0.9
- Includes adjustable gamma correction support.
- Not as fast as Pentium WinGlide or MMX WinGlide (3-5% lower framerate than Pentium WinGlide on my Pentium 133 with a Monster 3D).

Overlay WinGlide v0.9 with gamma correction support
- Has all of the features included in Overlay WinGlide v0.8
- Overlay WinGlide's previous gamma correction feature has been renamed to alpha blending.
- Includes the new gamma correction support from Gamma WinGlide v0.9. Since the gamma correction is done by the CPU, it will work on any system that supports Overlay WinGlide.
- The new gamma correction feature does reduce performance some (10-15% lower framerate than Overlay WinGlide with gamma correction disabled on my Pentium 133 with a Monster 3D). Even with this reduced framerate, Overlay WinGlide v0.9 with gamma correction enabled is still faster or at least as fast as Pentium WinGlide when used with GLQuake and Quake 2 on my computer.

4/17/98
More preliminary gamma correction information
I tried out my new gamma correction code with Overlay WinGlide. Since the gamma correction is done by the CPU, it will work with any video card that supports Overlay WinGlide even though the card may not support Overlay WinGlide's current gamma correction feature that uses destination alpha blending. I ran some benchmarks with GLQuake and Quake 2 at 512x384 and having the CPU do the gamma correction reduced the framerate by about 10-15% when compared with Overlay WinGlide v0.8. Even with the reduced framerate, Overlay WinGlide with the new gamma correction code was still faster or at least as fast as Pentium WinGlide v0.8 on my computer. The versions of WinGlide with this new gamma correction support will have the option of choosing between P6 friendly code that prevents partial register stalls and P5 optimized code. The P5 optimized code increased the framerate by 1-2% on my Pentium 133 when compared with the P6 friendly code.
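
As an illustration of what an ini-selected code path could look like, here is a minimal sketch that reads such a setting. The UseP5Code name appears elsewhere in this archive. The [WinGlide] section name, the GammaCorrection key, and the function name are assumptions made up for this example and may not match the real ini file.

```python
import configparser

# Hypothetical ini contents; only the UseP5Code key name comes from the news archive.
SAMPLE_INI = """\
[WinGlide]
UseP5Code=1
GammaCorrection=1.3
"""


def load_settings(text):
    """Parse the ini text and return the options used to pick a code path."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["WinGlide"]
    return {
        # 1 selects the P5 optimized code, 0 the P6 friendly code
        "use_p5_code": section.getint("UseP5Code", fallback=0) != 0,
        # 1.0 means no visible correction
        "gamma": section.getfloat("GammaCorrection", fallback=1.0),
    }


settings = load_settings(SAMPLE_INI)
print(settings)
```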

4/15/98
Overlay WinGlide HCL update
I've added a few more video cards to the Overlay WinGlide hardware compatibility list. I'll try to keep the list updated about once a week if any new results have been submitted.

4/14/98
Gamma correction
I've put together a prototype version of WinGlide that supports gamma correction. I've already optimized it some and it's fast. I ran a few benchmarks on my computer (Pentium 133 with a Monster 3D) and compared to GLQuake and Quake 2 with Pentium WinGlide, using this prototype version of WinGlide with gamma correction only dropped my framerate by 3-5%. I'd like to get a first release of this new version of WinGlide out this week that will support setting the level of gamma correction in its ini file. Right now, some of the assembly code runs great on P5 processors (Pentium and Pentium MMX) but causes partial register stalls on P6 processors (Pentium Pro and Pentium II) so this will need to be fixed. I'm not sure how much the code that fixes the partial register stall problems on the P6's will affect P5 performance but if it makes much of a difference I will add a flag to the ini file that will select between using the P5 optimized code or the P6 optimized code. Other features I would like to add that may not make it into the first release include gamma correction for Overlay WinGlide because there is very limited support for Overlay WinGlide's current gamma correction feature, the ability to adjust the level of gamma correction while WinGlide is in use, and gamma correction support for Multi-threaded WinGlide.

4/11/98
New versions of WinGlide
Pentium WinGlide, MMX WinGlide, and Overlay WinGlide version 0.8 have been released.

What's new in v0.8?
Smaller code size - The WinGlide glide2x.dll for each of these three versions of WinGlide is about 16KB smaller than the previous version.
WinGlide now copies all of the pixels in each scanline regardless of width - If you ran GLQuake with previous versions of WinGlide on this page except for WinGlide v0.5 and selected a width that was not evenly divisible by 4, the last 1, 2, or 3 pixels on each scanline were not copied from the 3Dfx's frame buffer.
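
The missing pixels described above are what one would expect from a copy loop that only moves whole groups of four pixels. The sketch below assumes that is what happened (the news entry does not say so explicitly) and shows the usual fix of copying the leftover pixels separately. The function name is hypothetical.

```python
def copy_scanline(src, width):
    """Copy `width` pixels, four at a time, then copy the 1-3 leftover pixels."""
    dst = [0] * width
    full = width - (width % 4)      # number of pixels covered by whole groups of 4

    # Main loop: whole groups of four pixels.
    for x in range(0, full, 4):
        dst[x:x + 4] = src[x:x + 4]

    # Tail: the last 1, 2, or 3 pixels when width is not divisible by 4.
    for x in range(full, width):
        dst[x] = src[x]
    return dst


scanline = list(range(1, 323))      # a 322 pixel wide scanline
copied = copy_scanline(scanline, 322)
```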

In addition to the v0.8 changes listed above, Overlay WinGlide v0.8 has some additional features.
MMX support - MMX support can be enabled in Overlay WinGlide's ini file.
More error messages - Overlay WinGlide now displays more error messages when it is having trouble displaying the overlay.

4/6/98
More information about the possibility of getting WinGlide to work with 3D Studio MAX R2
I ran some tests with WinGlide, the Mesa Voodoo driver, and 3D Studio MAX R2 and was able to see that these three programs have a couple of major problems working together right now. There may be a way to fix these problems and get one of the windows in 3DSMAX to support Voodoo accelerated rendering by using a modified version of WinGlide and a modified version of the Mesa Voodoo driver. Here's some more information about the problems I observed and how I would try to fix them:
After starting 3DSMAX using software rendering support, I saw four windows: the Top, Left, Front, and Perspective windows. When using the Mesa Voodoo driver with 3DSMAX, it will switch to fullscreen and I can only see the Left window and the grid drawn in it but at least this shows that 3DSMAX is capable of rendering something using the Mesa Voodoo driver. When using the Mesa Voodoo driver with WinGlide, nothing is drawn in any of the four windows.
Problem #1
WinGlide is designed to copy the frame from the Voodoo to the 2D card when the front and the back buffer are swapped. Since it looks like 3DSMAX is only using the front buffer, WinGlide will never copy the image from the Voodoo to the 2D card. This could be fixed by modifying the Mesa Voodoo driver and WinGlide so that the image in the front buffer is copied to the 2D card when the image in the Voodoo is modified. It may be necessary to limit how often the frame from the Voodoo is copied to the 2D card because copying it too often could have a large negative impact on overall performance.
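
One way to limit how often the front buffer is copied is to mark the frame as modified on every change but only copy when a minimum interval has passed. The following is a rough sketch of that idea only. The class, its parameters, and the callback are hypothetical and are not part of WinGlide or the Mesa Voodoo driver.

```python
import time


class FrontBufferCopier:
    """Copy a modified front buffer at most once per `min_interval` seconds."""

    def __init__(self, copy_frame, min_interval, clock=time.monotonic):
        self.copy_frame = copy_frame      # callback that performs the actual copy
        self.min_interval = min_interval
        self.clock = clock                # injectable so the logic can be tested
        self.last_copy = None
        self.dirty = False

    def mark_modified(self):
        """Call whenever something is drawn to the front buffer."""
        self.dirty = True
        self.maybe_copy()

    def maybe_copy(self):
        """Copy if the buffer changed and the previous copy is old enough."""
        if not self.dirty:
            return False
        now = self.clock()
        if self.last_copy is not None and now - self.last_copy < self.min_interval:
            return False                  # too soon; stay dirty and retry later
        self.copy_frame()
        self.last_copy = now
        self.dirty = False
        return True
```

A real implementation would also need something (a timer or an idle hook) to call `maybe_copy` after the last modification, otherwise the final change could be left uncopied.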
Problem #2
3DSMAX was using OpenGL to render to four windows but the Mesa Voodoo driver only supports one rendering context. While it may be possible to modify the Mesa Voodoo driver to support multiple windows, this could get very complicated so I'll just stick with describing how I would go about getting one of the windows working with the Mesa Voodoo driver and getting the other windows working with the default OpenGL support. Although the Mesa Voodoo driver only supports one rendering context, I'm wondering if it would be possible to modify it so that it keeps track of multiple rendering contexts, accelerates one of them using the Voodoo and passes calls related to the other rendering contexts to the default OpenGL driver in the system directory. If there is a way to get this working properly, it may be possible to accelerate the rendering in the Perspective window and leave the rendering in the other windows to the default OpenGL driver.
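
The routing idea described above can be sketched in a few lines. This only illustrates the dispatch logic under the assumption that each call can be tied to a rendering context. The class and method names are hypothetical and do not correspond to the real Mesa Voodoo driver or to any OpenGL API.

```python
class ContextRouter:
    """Send calls for one chosen context to the accelerated driver, the rest to the default."""

    def __init__(self, accelerated_driver, default_driver):
        self.accelerated_driver = accelerated_driver
        self.default_driver = default_driver
        self.accelerated_context = None   # only one context can use the Voodoo

    def create_context(self, context_id, wants_acceleration=False):
        # The first context that asks for acceleration gets it; later requests
        # quietly fall back to the default driver.
        if wants_acceleration and self.accelerated_context is None:
            self.accelerated_context = context_id

    def driver_for(self, context_id):
        if context_id == self.accelerated_context:
            return self.accelerated_driver
        return self.default_driver

    def call(self, context_id, name, *args):
        """Forward a named call to whichever driver owns this context."""
        return getattr(self.driver_for(context_id), name)(*args)
```

In the 3DSMAX case this would mean registering the Perspective window's context as the accelerated one and letting the Top, Left, and Front windows fall through to the default driver.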
----------
Although I'm not going to be looking into making the Mesa Voodoo driver, the version of WinGlide by Gerhard Pfeiffer that emulates a Voodoo Rush, and 3DSMAX work together any time soon, the source code for both Gerhard's version of WinGlide and the Mesa Voodoo driver is available if anyone else would like to try to get 3DSMAX to use a Voodoo for rendering in a window.

More RAM
I've been running with 48MB of RAM for about a week now and it has made a big difference compared to the 32MB I had before. Of course NT 4.0 would benefit from even more memory but I don't want to spend too much money on upgrading my old Pentium 133 system. With the extra RAM and a Monster 3D, it will still be fast enough for a while.

WinGlide compatibility page
Added the WinGlide compatibility page. It contains information about WinGlide compatibility with programs other than GLQuake or Quake 2. Since WinGlide doesn't work with many programs that use glide besides Quake, there is just a short explanation of why these other programs don't work with WinGlide. There's also a section on getting windowed OpenGL support working with WinGlide and the Mesa Voodoo driver. It's not fully compliant OpenGL support and it has certain limitations but it does work with some programs that use OpenGL.

4/3/98
WinGlide with 3D Studio MAX R2?
I told a couple people who were asking about using a Voodoo with 3D Studio MAX how to set up the Mesa Voodoo driver and a version of WinGlide by Gerhard Pfeiffer that emulates a Voodoo Rush. Using these two drivers, it is possible for some programs to use the Voodoo for windowed OpenGL rendering. Unfortunately, this combination that I thought would have the best chance of working with 3D Studio MAX did not work. If anyone has had any success getting 3D Studio MAX working on a Voodoo with WinGlide and would like to send in a solution, I'll add it to the page.

Fullscreen and windowed Voodoo accelerated OpenGL for some programs that use OpenGL
I'll add instructions for using WinGlide and the Mesa Voodoo driver to run some OpenGL programs on a Voodoo and in a window to the WinGlide compatibility page soon. Right now, the Mesa Voodoo driver is an alpha release and many programs that use OpenGL either don't work with it or don't display the proper output but some programs work just fine and compatibility may improve in future releases. You can use the Mesa Voodoo driver by itself (without WinGlide) to run some fullscreen OpenGL programs on your Voodoo and since it is fullscreen, performance is much better than when using WinGlide.
If you would like to learn more about Mesa, visit The Mesa 3-D graphics library page.
If you don't want to compile Mesa yourself or you don't have a compiler, you can get the precompiled Mesa dlls from Phil Frisbie's Programming Page.

3/31/98
Found another video card that works with Overlay WinGlide
The ATI 3D PRO TURBO PC2TV works with Overlay WinGlide and has been added to the Overlay WinGlide hardware compatibility list. This card did not have the minimum overlay stretch factor limitations that my OEM Diamond Stealth 3D 2000 has but it looked like there was a bug in the current Windows 95 drivers that affected clipping support. The current Windows NT 4.0 drivers for this card worked properly with clipping enabled.

3/26/98
Improving WinGlide performance
Added a section to the page that has tips for Improving WinGlide performance.

3/24/98
WinGlide file archive
I've added a WinGlide file archive that contains links to old versions of WinGlide.

3/21/98
Multi-threaded WinGlide Release 2
Release 2 of Multi-threaded WinGlide for multi-processor NT 4.0 systems has been released. With some systems, it's possible that Release 2 will not perform better than Release 1. This will happen if copying each frame from system memory to the 2D card is slower than Quake drawing each frame in the 3Dfx. If Quake takes longer to draw a frame than it takes to copy a frame from system memory to the 2D card, then Release 2 should be able to speed things up. I'll post the results of the benchmarks after I receive four or five sets of numbers for this new version of Multi-threaded WinGlide.

Multi-threaded WinGlide Release 1 benchmarks
Based on the numbers I've received so far, Multi-threaded WinGlide Release 1 has been increasing the framerate by 10-30% over Pentium WinGlide with Quake 2 at 640x480 on dual processor NT 4.0 systems.

3/20/98
Overlay WinGlide hardware compatibility update
A few video cards have been added to the Overlay WinGlide hardware compatibility list. It looks like hardware overlays are not as widely supported as I first thought. So far, video cards with the S3 ViRGE chipset work with Overlay WinGlide but many other cards do not. That's the bad news. The good news is that cards with the S3 ViRGE chipset are fairly common.

3/18/98
What to expect in Overlay WinGlide version 0.8
The [Future Plans] section of the page now has a list of the new features I'm planning on adding to the next version of Overlay WinGlide.

3/17/98
The Overlay WinGlide hardware compatibility list is ready to receive data
I've added a page for submitting Overlay WinGlide hardware compatibility information to the Overlay WinGlide hardware compatibility list.

3/14/98
New version of Overlay WinGlide
Version 0.7 of Overlay WinGlide is available for download. The new ini file support allows gamma correction and some other features to be configured.

Release 1 of Multi-threaded WinGlide is ready
Release 1 of Multi-threaded WinGlide for multi-processor NT 4.0 systems is finished and available for download. I'll post information about benchmarks received from people with multi-processor systems in a few days.

3/12/98
Invalid return addresses
I've received a couple of emails asking questions about WinGlide that had invalid return addresses. Make sure that you have your return address set up properly if you would like to receive a response.

Overlay WinGlide hardware compatibility list
I've started work on adding an Overlay WinGlide hardware compatibility list to the page. It's not done yet but if you would like to see what it looks like so far, click here. I still need to set up a page for submitting Overlay WinGlide compatibility information so I can start adding to the list.

3/10/98
- I received some responses from people interested in trying out a multi-threaded version of WinGlide that should be able to run faster on dual processor NT 4.0 systems. See the [Multi-threaded WinGlide] section of the page for more information about this new version of WinGlide that I'll be working on.
- I'm planning on releasing a new version of Overlay WinGlide sometime this weekend. It will have ini file support, allowing some of its features like the level of gamma correction to be user configurable.

3/5/98
- Chris' WinGlide page is up.
- MMX WinGlide officially released.
- Three versions of Overlay WinGlide have been released. These versions of WinGlide are faster than Pentium WinGlide and come with new features including gamma correction and resizable output windows that will stretch the image from the Voodoo card to fit the window. See the [Overlay WinGlide] section of the page for details.


Copyright 1998-1999 Chris Dohnal