Posted: Sat Feb 16, 2013 9:58 pm
That was how my original "shut up and plot pixels" approach was done. The performance was atrocious. I was maxing out a thread on my i7 (!), and Visual Studio's profiler indicated that NVIDIA's OpenGL library was using 80% of the CPU time. It strained my processor enough that the audio was starting to stutter, so I didn't even bother trying it on a lower-end system. Keep in mind that in action-heavy boards, it would have to convert and send roughly 71 640x350 RGBA textures per second to the GPU, which winds up consuming about 64MB/s of bandwidth and a ton of CPU (when textures are loaded in OpenGL, they are transformed in software into whatever format is most efficient for the GPU). OpenGL generally does better with power-of-two textures, but 1024x1024 would probably have caused even more CPU strain.
The new approach only uses about 10-15% of a processor thread with 50% of it going to NVIDIA's OpenGL library. There's still some overhead somewhere (then again, it could be the relatively high refresh rate) but considering how well it worked on my fairly low-end Mac, the results are promising.
If I could find a good way to colorize the blocks (palettized textures would probably work, but the overhead of implementing them would likely cancel out any gains I made elsewhere) and reduce it to one pass per block, that would save a fair amount of rendering time. I'm pretty positive that I could do it with shaders, but that's something I would implement in a separate rendering path.
Posted: Sun Feb 17, 2013 10:26 pm
That's just so weird. I was able to do this with DirectDraw utilizing nearly 0 CPU power. I wonder why there's such a huge difference in power requirements.
Edit: one thing I will note is that the format of the pixels was forced to be the same in memory as it was when transferred to the GPU: 32-bit color with premultiplied alpha. The last part is important because it helps performance. We also use full alpha the whole time, so no further processing on the pixels is needed. That's actually why Format32bppPArgb is used as the pixel format: GDI+ uses it natively and converts everything to it for processing.
Edit 2: Have you tested creating new textures every frame vs. locking pixels on an already-created screen texture?
Posted: Sun Feb 17, 2013 11:43 pm
I have a feeling it's mostly because the image has to be processed a second time by the OpenGL implementation, considering that the OpenGL implementation is the process that eats all of the CPU. I suspect there's a lot more flexibility in DirectDraw's texturing than there is in OpenGL's. OpenGL textures seem to be meant to be loaded once and used repeatedly, which is probably why so much CPU time is dedicated to converting the texture format.
I think there are a few ways that I can improve, so I'm going to experiment with it and do a little bit of research. If I could reduce the number of quads that it draws (perhaps drawing areas with contiguous spans of a certain background color with a single quad) it should reduce the amount of data that the system has to send to the GPU.
Oh, also, I tested the GL renderer on my laptop's Intel HD 4000 and it performed very well, rendering at full speed at 2x scaling with no visual anomalies. Not too surprising, really; Ivy Bridge's GPU is surprisingly good. My mom's laptop has a GMA 965, so that should be a bit more interesting.
Posted: Mon Feb 18, 2013 1:25 am
I'm thrilled to see that you have so many platforms and machines to test on! It sure beats crowdsourcing the effort (which I have done until this point...)
If you're doing one quad for the background and one textured quad for the foreground, you might run into performance issues: all the foreground quads are going to have alpha on them, which causes serious performance degradation on older chipsets. It might require a little work, but I think one way to solve this is to create every single texture that could be used: render all 256 characters for every foreground color and every background color. That would actually run you 65536 textures (yes, you would want to do brights just for the option). In 32-bit color, with an 8x16 font, it would occupy 32MB of video memory. If you use a palettized 2-bit format (I don't know if modern video cards like this), that cuts the necessary video memory down to a more manageable 2MB.
Edit: As for missing out on smoothing... the only solution in this case is a screen filter.
Posted: Mon Feb 18, 2013 8:13 am
I think most people would argue that I have entirely too much hardware. :p It does come in handy for projects like this, though!
That's not a bad method, honestly. Considering I had a 32MB graphics card as early as 1998, it wouldn't cause problems on any modern systems. The only systems that I think would be problematic are ones that came out during that awkward period before the Vista compatibility specs were released. I seem to recall some systems made in around 2004/2005 that had two VRAM options: 1MB and 8MB. Then again, considering the performance of some of those UniChrome and early Intel "GPUs," they might see better performance with GDI. It would definitely take some investigation and profiling to see if the performance gain outweighs the heavy VRAM usage.
Implementing that would actually be quite simple with the routines that I'm using now. All I would have to do is do one texture load pass for each foreground and background color, then call the texture like this:
Glyphs[char + (0x100 * color)]
Regarding palettized textures, those do tend to be very problematic even with older drivers. I remember having issues with those as early as 1998. I believe GLQuake converted textures to 15-bit prior to loading them into video memory due to issues with palettized textures in video card drivers (I'd have to check the Quake source code again; it's been a while since I looked at its texture loading routines). I also remember palettized textures being a huge problem for the earlier versions of Final Fantasy VII for PC. The NVIDIA patch corrected that (I believe it simply converted the textures prior to loading them).
One more thing I could try is bumping the texture size up to the next highest power of two. Right now, 8x14 fonts are being loaded as 8x14 textures, which could be causing some issues. I haven't done any profiling on 8x8 and 8x16 fonts at this time.
I'm going to have to look into getting a screen filter or something of that nature implemented soon. It kind of bothers me that there's a feature that simply doesn't work with the renderer at this time. I'll also have to manage dynamic font and palette changes (I will most likely do this by adding an event to TextModeVideo -- this would be handy even for the GDI renderer, as the UI could be signaled to change its size if a different sized font is loaded), but I don't think I'll lose sleep over that one at this time. ;)
Oh, also, I added the ability to override config settings via the command line, such as:
lyon +Video.Renderer=OpenGL +Video.Font=RED.COM +Video.Scale=2 +Audio.Enabled=1
Verbs will be specified using a hyphen, such as:
...but those aren't implemented yet.
You can also bypass the FileOpenDialog at the beginning and jump right into a world by simply typing in the filename. You can also combine this with other settings:
lyon \zzt\bugtown.zzt +Video.Renderer=OpenGL
That should make fonts and palettes a bit less annoying to load and will allow games that use custom fonts/palettes to be loaded easily via a batch file.
Posted: Mon Feb 25, 2013 9:46 pm
After some discussion I think it's probably a good idea to have each instance of Roton run in its own thread. This provides many benefits. Here are the main two:
- Code accuracy: ZZT has a lot of loops aside from the main game loop. This doesn't work well if the game is running in the same thread as the host application: if we make it run precisely like ZZT, it'll hang the host application in these loops. On its own thread, it can suspend the thread as long as it wants.
- Scaling performance: if you are running multiple Roton sessions, separating them into their own threads makes it possible to take advantage of multiple cores and processors.
I am in the process of building a framework that will also contain the memory-layout functionality in a branch called Threaded. After this branch matures enough, I'll begin migrating what we have in the trunk, piece by piece, until we can safely merge it. A lot of this framework was already put together in the last 24 hours.
The way the framework will work is like this: the ZZT code won't see much of the other layers, as they will be hidden behind interfaces. These interfaces can be tied to anything you want, which makes it easy to substitute any interaction between ZZT and the system for another. For example, if we want to replace disk-based storage with online storage, or change graphics/audio/whatever frameworks, it can be done with little effort.
Posted: Tue Feb 26, 2013 5:48 am
So these OpenGL solutions (i.e. dependencies) are for the output of Lyon, and don't actually affect the running of ZZT (Roton) proper? So I could run it on some sort of Raspberry Pi or similar device and have the ZZT-OOP move an actual robot around? Or serve the program to text terminals?
Posted: Tue Feb 26, 2013 8:09 am
Commodore wrote:So these OpenGL solutions (i.e. dependencies) are for the output of Lyon, and don't actually affect the running of ZZT (Roton) proper? So I could run it on some sort of Raspberry Pi or similar device and have the ZZT-OOP move an actual robot around? Or serve the program to text terminals?
As long as you can build using Mono, you don't need any other libraries to use the Roton core by itself. You don't even need to hook up video, audio or controls to it. Even if you do need these features, there are fallback ones built in that use only the .NET framework.
Edit: Text terminals might require a bit of work - theoretically it's possible. If you could somehow tie a console window to a terminal, a wrapper could be coded for it: it would be in charge of relocating the cursor, etc.
Posted: Thu Feb 28, 2013 2:35 am
https://code.google.com/p/roton-zzt/sou ... Wrapper.cs
These values might be useful to someone wanting to hack the original executable.
Posted: Thu Feb 28, 2013 11:45 pm
I think the main complication with console output is going to be finding libraries to help out with that. I haven't had any luck finding any stable .NET bindings for PDCurses, NCurses, and the like. I believe there is some support for text positioning and color in System.Console, but I don't recall its support being low-level enough to allow for any sort of sane game development. I'll have to look into it again to be sure.
Meanwhile, I'm working on an updated audio system that should do a better job of staying synced up. Yaaaaaay.
Posted: Sat Mar 02, 2013 3:18 am
Any chance we can get some binaries for the mac version? (and anything else if there's more recent developments ready to be live)
A poor friend of mine who had never heard of ZZT got into it when I forced him to play Town.
Posted: Mon Mar 11, 2013 9:48 pm
Well, the audio system is a bit crap at the moment, but aside from the need to put up with some crackles and occasionally double-tap B it more or less does its job. The OpenGL renderer works fine with my extremely humble late-2010 MacBook Air, so I think it's safe to say that just about any Intel Mac will run it just fine. One little caveat: the save feature isn't currently functional, either, so that might make or break this whole thing. Also, don't introduce him to Chrono Wars just yet...there's an edge case that we're still working out with the OOP interpreter.
He is definitely going to have to install the Mono runtimes (it should be available here, but the page isn't loading for me at the moment, so I'm not sure where to go once you get there).
Finally, I'll need to put together some sort of package. For some stupid reason there's no easy way of launching .NET applications by default on OS X (you need to go "mono Lyon.exe" from the command line), but at worst I should be able to rig up a shell script that will launch it and throw a DMG together.
As an added bonus, the same EXE will work on all supported platforms, so yay for that at least.
Posted: Wed Mar 20, 2013 11:58 pm
Google Code had some problems with pushing updates the last couple of days. It seems to have been resolved. Work will continue now.
Posted: Fri Jun 21, 2013 6:04 am
I really, really want to prove myself wrong about "I can't finish jack shit." Given enough time everything eventually gets done, but I get sidetracked.
Well okay, the last few months were due to seeking professional help for other issues. We're back to the regular programming now.
Posted: Tue Jun 25, 2013 6:29 pm
Saxxon wrote:I really, really want to prove myself wrong about "I can't finish jack shit."
Story of my life...