JesperEngine
22 posts
jesperengine · 8 years ago
Text
Now that I have about 200 different values, each monitoring an average bpm, it’s time to mix all of these together into one value that tells us what the bpm is. I’m plotting all these values in a graph; you can see this graph in the picture below.
Tumblr media
The green lines are the raw audio data from the fast Fourier transform; the white graph at the top of this window shows the plotted values from the 200 frequency bins. The faster a beat is, the further it gets plotted to the left. The more accurate a beat is, the higher it gets plotted. I’m also applying an algorithm that checks how many values are near each other: the more values are near each other, the higher their accuracy gets, so they get plotted higher as well. This graph has a logarithmic scale, so every time a beat gets twice as fast, it’s plotted exactly one unit to the right. The reason for this is that songs usually have multiple beats, each one twice as fast as the next.
The song I’m using in this screenshot is a mix from Riley Reinhold (at about 5:50). This song has a very strong beat which is why I’m using it in this example, though it works with songs that have a less noticeable beat as well.
As you can see, there are two very clear white lines in the graph. The white line on the right represents all the thick green lines you see below the white graph; the white line on the left represents all the smaller green lines.
I’m trying to take the three highest points of this graph, which is kind of difficult because more often than not there are multiple values very close to each other, so I can’t just sort the values and take the three highest ones. Using some magic, and by basically looking at the distance to an earlier taken value, I managed to get three high points. As you can see below, it clearly detects the two white lines. I made it draw a red, green and blue line on the three highest points. For some reason it also detects a high point a bit to the left of the graph, where the blue line is. This is good though: apparently the graph is a bit higher there, and it seems like all three lines are the same distance apart. Which makes sense, because when multiple beats are detected they usually are twice as fast as a slower beat. And since this is a logarithmic scale, all detected beats should be exactly one unit apart, which is the case now.
Tumblr media
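Roughly sketched in code (illustrative names only, not the engine’s actual implementation), the peak picking with a minimum-distance rule could look like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Pick up to `count` peaks from a histogram, skipping candidates that sit too
// close to an already picked peak. `minDistance` is measured in bins; with a
// log2 scale, a beat twice as fast sits exactly one octave's worth of bins away.
std::vector<std::size_t> pickPeaks(const std::vector<float>& histogram,
                                   std::size_t count, std::size_t minDistance)
{
    // Sort bin indices by height, highest first.
    std::vector<std::size_t> order(histogram.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return histogram[a] > histogram[b]; });

    std::vector<std::size_t> peaks;
    for (std::size_t candidate : order) {
        bool tooClose = false;
        for (std::size_t p : peaks) {
            std::size_t distance = candidate > p ? candidate - p : p - candidate;
            if (distance < minDistance) { tooClose = true; break; }
        }
        if (!tooClose) peaks.push_back(candidate);
        if (peaks.size() == count) break;
    }
    return peaks;
}
```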
Finally, I use these three values to calculate an average beat. This isn’t really possible though, since there can be multiple beats. So instead I created a value ranging from zero to one, representing the offset of the beat. I then use this offset to draw a bunch of lines, as you can see below. I’ve also added 10 beat visualisers that make these lines blink based on the beat they represent. Each visualiser also detects an offset for when exactly the line should blink, so it hits the beat right on time.
Tumblr media
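Sketched in code (again with made-up names, not the engine’s actual code), the zero-to-one phase value and the per-visualiser offset could look like this:

```cpp
#include <cmath>

// Advance a beat phase in the range [0, 1) each frame. `framesPerBeat` comes
// from one of the detected peaks; the phase wraps around on every beat.
float advanceBeatPhase(float phase, float framesPerBeat)
{
    phase += 1.0f / framesPerBeat;      // one frame worth of progress
    return phase - std::floor(phase);   // wrap back into [0, 1)
}

// A visualiser blinks when its (offset) phase wraps past zero this frame.
bool shouldBlink(float previousPhase, float phase, float visualiserOffset)
{
    float before = std::fmod(previousPhase + visualiserOffset, 1.0f);
    float after  = std::fmod(phase + visualiserOffset, 1.0f);
    return after < before;  // the wrapped value jumped back down: a beat happened
}
```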
All of this is just a visual representation of what is happening behind the scenes. I made this to get an understanding of what is happening myself. It’s simply a lot easier to debug something like this than a bunch of numbers in a terminal.
I’ve linked all this real-time information to the shader I had earlier, so what I have right now is a bunch of fractals that are sort of dancing to the music. It looks at both the beat of the music and whether there are big changes in it. I’ve also made some parameters change based on the energy of the music.
jesperengine · 8 years ago
Text
Beat detection
Detecting a beat in a piece of music seems easy; most people can find the beat in a song without much effort. Unfortunately, this is a lot more difficult for computers. Detecting a beat requires you to have a feeling for the rhythm of a song, and computers aren’t really good at feeling stuff.
While searching the internet, trying to come up with a good way of analysing the beat of a song, I came across a lot of articles and papers saying I should use the autocorrelation approach. This approach involves shifting the entire song a few samples to the left and comparing it against the unshifted version, then shifting further until you’ve found the shift with the least amount of difference.
This works very well when you have one specific song you want to analyse, but it doesn’t suit my case: I’m not analysing one specific song, I’m analysing the current stream of music in real time. I only have 16 ms (or 11 ms in VR) per frame to detect the beat, so I had to come up with a more optimised approach.
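For contrast, this is roughly what the textbook autocorrelation looks like (scored here with products rather than differences, and with made-up names); the nested loop over the whole buffer is exactly what doesn’t fit into the frame budget:

```cpp
#include <cstddef>
#include <vector>

// Naive autocorrelation: shift the signal by `lag` samples and measure how
// similar it is to itself. The lag with the best score corresponds to the beat
// period. O(samples * lags), which is why it is hard to do per frame on a live stream.
std::size_t bestLag(const std::vector<float>& samples,
                    std::size_t minLag, std::size_t maxLag)
{
    std::size_t best = minLag;
    float bestScore = -1e30f;
    for (std::size_t lag = minLag; lag < maxLag && lag < samples.size(); ++lag) {
        float score = 0.0f;
        for (std::size_t i = 0; i + lag < samples.size(); ++i)
            score += samples[i] * samples[i + lag];  // high when the shifted signal lines up
        if (score > bestScore) { bestScore = score; best = lag; }
    }
    return best;
}
```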
What I’ve come up with works like this: you analyse the beat per frequency bin; after all, that’s kind of how humans do this as well. You’ll be able to detect a beat from the bass, for example, even though it’s surrounded by a lot of noise in all the other frequency bins.
For each frequency bin, I analyse the volume, similar to how I analysed the amount of overall change in the song, except that now I’m just looking at the energy in a single frequency bin rather than using the five magic formulas. For each frequency bin, I keep a number that increments every frame. Once a big change in volume is detected in that bin, the number is reset to zero. Every time it is reset, it is folded into a running average. So you end up with an ‘average amount of frames per beat’ for each frequency, rather than a bpm for each frequency.
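In code, the per-bin tracking described above comes down to something like this (illustrative names, not the engine’s actual implementation):

```cpp
#include <vector>

// Per-frequency-bin beat tracking. Every frame the counter grows by one; when
// a volume spike is detected in that bin, the counter is folded into a running
// average and then reset.
struct BinBeatTracker {
    int   framesSinceBeat  = 0;
    float avgFramesPerBeat = 0.0f;
    float smoothing        = 0.1f;  // how quickly the average adapts

    void update(bool beatDetectedThisFrame) {
        ++framesSinceBeat;
        if (beatDetectedThisFrame) {
            if (avgFramesPerBeat == 0.0f)
                avgFramesPerBeat = static_cast<float>(framesSinceBeat);
            else
                avgFramesPerBeat += smoothing *
                    (static_cast<float>(framesSinceBeat) - avgFramesPerBeat);
            framesSinceBeat = 0;
        }
    }
};

// One tracker per frequency bin:
// std::vector<BinBeatTracker> trackers(numBins);
```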
When plotting this, this is the result.
Tumblr media
Once again the green stuff is the Fast Fourier transform. The red lines are the numbers we are tracking. As you can see, every time a beat is detected the value jumps to zero (black), then slowly increases, causing the line to become more red.
jesperengine · 8 years ago
Text
Finding differences
That’s some nice variables you’ve got there. But the next step is analysing these five values and detecting when there’s a sudden change, which isn’t easy, because these values are already changing rapidly all the time.
There’s one thing I noticed though: these values are always changing rapidly between a minimum and a maximum. So what I do now is monitor these five values and track their average, their average minimum and their average maximum. Then I compare the min, max and average to their values from about half a second ago. The amount of difference is calculated by subtracting them, and this is done for all five variables, resulting in a single number each frame that says how much of a difference there is.
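Sketched in code (hypothetical names, assuming a short history of the stats is kept per feature), the differencing step could look like this:

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <deque>

// Smoothed average, minimum and maximum for one feature on one frame.
struct FeatureStats { float avg = 0, minimum = 0, maximum = 0; };

// Compare the newest stats against the stats from roughly half a second ago,
// for all five features, and sum the differences into one number per frame.
float changeAmount(const std::array<std::deque<FeatureStats>, 5>& history,
                   std::size_t framesHalfSecond)
{
    float difference = 0.0f;
    for (const auto& h : history) {
        if (h.size() <= framesHalfSecond) continue;
        const FeatureStats& now  = h.back();
        const FeatureStats& then = h[h.size() - 1 - framesHalfSecond];
        difference += std::fabs(now.avg     - then.avg)
                    + std::fabs(now.minimum - then.minimum)
                    + std::fabs(now.maximum - then.maximum);
    }
    return difference;  // how much the texture of the music changed
}
```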
When you plot this number in a graph, this is roughly what it looks like. The green stuff in the background is the Fast Fourier transform from the previous post; the white line is the amount of change. Specifically, this graph is from the Smash Hit OST (part 2, at 1:38).
Tumblr media
So now that this is working, it’s time to start working on the beat detection.
jesperengine · 8 years ago
Text
The last 3 weeks I’ve been working on getting sound into my engine. Not the ability to play sound, that’s already built into the SDL library, but rather the ability to monitor the sound that is currently being played by Windows. It took longer than I expected, but with lots of help from this example I’ve finally got it working. There were some issues that I didn’t expect, so it took me a lot of time to debug them. The most important one was the format in which I expected to receive the audio waveform: I expected every byte to be one sample, but this obviously wasn’t the case, so I spent a lot of time figuring out why the data I was getting looked so random. Another issue was not being able to send this data to the GPU via OpenGL properly, for the same reason; the values had to be converted to a float array. There were also gaps missing in my waveform, because I wasn’t taking enough data from the array. On OS X the exact opposite happened: I was taking too much data from the array, causing extra samples to be generated, usually all zero values or weird corrupt memory.
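For illustration, converting a chunk of 16-bit PCM bytes into floats looks something like this (the real sample format has to be read from the device’s format description and may just as well be 32-bit float, so this is only a sketch):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One sample is several bytes, not one. Here: signed 16-bit, interleaved.
std::vector<float> pcm16ToFloat(const uint8_t* bytes, std::size_t byteCount)
{
    const std::size_t sampleCount = byteCount / sizeof(int16_t);
    std::vector<float> out(sampleCount);
    for (std::size_t i = 0; i < sampleCount; ++i) {
        int16_t s;
        std::memcpy(&s, bytes + i * sizeof(int16_t), sizeof(int16_t));
        out[i] = static_cast<float>(s) / 32768.0f;  // map to roughly [-1, 1]
    }
    return out;
}
```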
The next step was to turn the waveform into something more usable. I knew I had to apply a Fast Fourier transform to the data, though after doing this my values still looked very weird and unusable. I looked at the implementation used by web browsers, since all modern web browsers have built-in functionality to convert a waveform to frequency data. Apparently you have to apply a Blackman window to the waveform first; this filters out useless data.
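The window itself is simple. A sketch of applying a Blackman window before handing the chunk to whatever FFT library is used (names are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Taper the edges of the chunk to zero before the FFT; without this the hard
// cut at the buffer boundaries smears energy across all frequency bins.
void applyBlackmanWindow(std::vector<float>& samples)
{
    const float N  = static_cast<float>(samples.size());
    const float pi = 3.14159265358979f;
    for (std::size_t n = 0; n < samples.size(); ++n) {
        float x = static_cast<float>(n) / (N - 1.0f);
        float w = 0.42f - 0.5f * std::cos(2.0f * pi * x)
                        + 0.08f * std::cos(4.0f * pi * x);
        samples[n] *= w;
    }
}
```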
After having done all of this, this is the result:
Tumblr media
This is the spectrogram of the song Cult of the Zealous by Disasterpeace.
Now the task was to simplify this data, in order to see where big changes occur in the music. Luckily I’m not the first one to try to do this. This paper by George Tzanetakis and Perry Cook was very helpful: it describes 5 features that together capture the texture of a piece of music. After writing algorithms to detect these 5 features, I put them into a graph. The same piece of music then looks like this.
Tumblr media
The 5 green lines show, from top to bottom (a rough sketch of each follows the list):
loudness (amplitude)
zero crossings (amount of noise)
flux (difference per frame)
rolloff (tone sharpness)
centroid (average frequency)
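Rough, simplified versions of these five features could look like this (bin indices instead of Hz, and the exact definitions in the paper differ in detail):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Loudness: RMS amplitude of the time-domain chunk.
float loudness(const std::vector<float>& samples) {
    float sum = 0.0f;
    for (float s : samples) sum += s * s;
    return std::sqrt(sum / static_cast<float>(samples.size()));
}

// Zero crossings: how often the waveform changes sign (noisiness).
int zeroCrossings(const std::vector<float>& samples) {
    int count = 0;
    for (std::size_t i = 1; i < samples.size(); ++i)
        if ((samples[i - 1] < 0.0f) != (samples[i] < 0.0f)) ++count;
    return count;
}

// Flux: how much the magnitude spectrum changed since the previous frame
// (both spectra are assumed to have the same size).
float flux(const std::vector<float>& spectrum, const std::vector<float>& previous) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < spectrum.size(); ++i) {
        float d = spectrum[i] - previous[i];
        sum += d * d;
    }
    return sum;
}

// Rolloff: the bin below which 85% of the spectral energy is concentrated.
std::size_t rolloff(const std::vector<float>& spectrum) {
    float total = 0.0f;
    for (float m : spectrum) total += m;
    float accum = 0.0f;
    for (std::size_t i = 0; i < spectrum.size(); ++i) {
        accum += spectrum[i];
        if (accum >= 0.85f * total) return i;
    }
    return spectrum.empty() ? 0 : spectrum.size() - 1;
}

// Centroid: magnitude-weighted average bin, the "centre of mass" of the spectrum.
float centroid(const std::vector<float>& spectrum) {
    float weighted = 0.0f, total = 0.0f;
    for (std::size_t i = 0; i < spectrum.size(); ++i) {
        weighted += static_cast<float>(i) * spectrum[i];
        total    += spectrum[i];
    }
    return total > 0.0f ? weighted / total : 0.0f;
}
```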
When I look at these five values, it doesn’t really look like something I can work with; they still all seem a bit random. Though according to the paper I should be able to calculate the Mahalanobis distance over these 5 values. I’ll try to add that and see if it gives any useful results.
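For reference, the Mahalanobis distance itself is straightforward once a running mean and inverse covariance of the recent feature history are kept (both assumed to be maintained elsewhere); a sketch:

```cpp
#include <array>

// d^2 = (x - mean)^T * invCov * (x - mean) for the 5-element feature vector.
float mahalanobisSquared(const std::array<float, 5>& x,
                         const std::array<float, 5>& mean,
                         const std::array<std::array<float, 5>, 5>& invCov)
{
    std::array<float, 5> d;
    for (int i = 0; i < 5; ++i) d[i] = x[i] - mean[i];
    float result = 0.0f;
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 5; ++j)
            result += d[i] * invCov[i][j] * d[j];
    return result;
}
```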
jesperengine · 8 years ago
Text
Anti-aliasing
Anti-aliasing makes pictures look good; in games it makes wires look less jittery and makes textures look smooth. It’s the difference between this
Tumblr media
and this
Tumblr media
But you don’t want anti-aliasing in VR just because it looks nice; there are several reasons why anti-aliasing is practically required in VR.
It makes the screen resolution of your HMD look higher.
You have two eyes and if two images are aliased they look different from each other, which is confusing for your brain.
Your head is constantly moving, so aliased edges are especially noticeable in VR compared to traditional games.
The traditional way of anti-aliasing is to render to a very large texture and then downscale it so the edges get smooth. The downside of this technique is that it is very expensive: if you render to a texture at twice your original screen resolution, you’ll need to render four times as many pixels.
In my engine this is even less of an option: the size of the render texture is determined by your framerate, so it is already using the best texture size possible. If we were to double the texture size we’d get frame drops for sure. So we’ll need another way to anti-alias the final image.
Luckily there are lots of ways to anti-alias an image in real time. One commonly used method is MSAA. It uses the geometry in your scene and only takes multiple samples for the pixels that need it, along polygon edges. Unfortunately, we’re working with screen-space shaders here, so there are no vertices. Besides, MSAA is still quite expensive most of the time.
After doing a bit of research I’ve come to the conclusion that the best method to go with is SMAA. Similar to FXAA, it is a filter that you apply after the final image is rendered: it detects aliased edges and fixes them.
So I’ve spent the last couple of days implementing SMAA in my engine, which took a while since I’m not very experienced with graphics libraries yet. Yesterday I spent all day figuring out why I wasn’t getting the expected results. In the end it turned out you have to set all textures to linear filtering, and there were a couple of textures set to point filtering. I found out about this by looking at the implementation of SMAA in this Unity plugin.
Anyway, these are the steps I’ve done so far. This is the aliased version of the engine.
Tumblr media
As you can see when you zoom in, the edges don’t look smooth at all.
Tumblr media
In the first pass you have to detect the edges, resulting in an image like this.
Tumblr media
When a pixel is red, that means there’s an edge on the left side of that pixel; when a pixel is green, there’s an edge below it. And if a pixel is yellow, there’s a combination of the two.
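A heavily simplified stand-in for that first pass, just to illustrate the red/green encoding described above (the real SMAA edge detection from the GitHub page does quite a bit more), written as GLSL held in a C++ raw string:

```cpp
// Simplified luma-based edge detection; not the actual SMAA shader.
const char* kEdgeDetectFrag = R"GLSL(
#version 330 core
uniform sampler2D uColor;   // the aliased image
uniform vec2 uTexelSize;    // 1.0 / resolution
in vec2 vUv;
out vec4 fragColor;

float luma(vec3 c) { return dot(c, vec3(0.299, 0.587, 0.114)); }

void main() {
    float here  = luma(texture(uColor, vUv).rgb);
    float left  = luma(texture(uColor, vUv - vec2(uTexelSize.x, 0.0)).rgb);
    float below = luma(texture(uColor, vUv - vec2(0.0, uTexelSize.y)).rgb);
    const float threshold = 0.1;
    float edgeLeft  = step(threshold, abs(here - left));   // red channel
    float edgeBelow = step(threshold, abs(here - below));  // green channel
    fragColor = vec4(edgeLeft, edgeBelow, 0.0, 1.0);       // yellow = both
}
)GLSL";
```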
Then you can use this image to create something like this.
Tumblr media
The algorithm measures how long the edges are and does some more SMAA magic; this texture basically shows how much each pixel should be blended with the pixel on the other side of the edge. Which, after the final pass, results in this.
Tumblr media
But don’t take my word for it: I got a lot of info from the SMAA GitHub page and this OpenGL port of SMAA.
If you’re implementing SMAA in OpenGL, make sure you flip the area and search textures horizontally, that is, over the y axis, so left becomes right and right becomes left. That’s what I initially did wrong: I flipped the textures vertically.
Also make sure you have set all textures and FBOs to linear filtering. I misread the instructions and set most textures to nearest neighbor (point filtering) and it took me way too long to realise my mistake.
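In code, that boils down to something like this for each of the textures involved (assuming an OpenGL loader header is already included, as in the engine):

```cpp
// Make sure the SMAA area/search textures and the intermediate render targets
// sample with linear filtering; point filtering breaks the blend weight lookups.
void setLinearFiltering(GLuint texture)
{
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}
```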
If you can’t get SMAA working right, try inspecting the SMAA Unity plugin. Changing some of the code in that plugin immediately made me realise what I did wrong when importing the textures: when I rendered the area texture to the screen to see what it looked like, I noticed it had linear filtering rather than point filtering.
jesperengine · 8 years ago
Text
C++ is cool, but very complicated. Sure, it’s easy to write a simple, or even a complex, program. But once your project grows beyond a certain number of lines of code, it becomes difficult to keep track of what code does what and where you need to look when you need a specific part of it. That’s why it’s important to keep your code organised.
You could make one giant file with all your code in it, without using any classes. This is something I do for another project (splix.io), where I have one giant JavaScript file of 6640 lines of code right now. I’m not using any classes, and it becomes quite a mess once you want to add something, because you’re never sure where exactly to add it.
With this project I’m doing it a little bit differently, the way you’re supposed to: I’m using multiple files with my code grouped into them. Until now I hadn’t used any classes though; at first I thought they were too complicated and that I could achieve the same thing without them. But now I’ve converted my shader loading methods into classes. It does look a lot cleaner, and while I still don’t quite understand how you’re supposed to use classes, I think it’s a huge improvement. I’m not sure if I did it right though; I’m using a lot of static methods and I’m not sure if there’s a better way to do it.
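Just to illustrate the direction, a hypothetical shape for such a shader class (not the engine’s actual code; error checking left out, and an OpenGL loader header is assumed to be included):

```cpp
#include <string>

// Constructor compiles and links, use() binds, destructor cleans up,
// so no static state is needed.
class ShaderProgram {
public:
    ShaderProgram(const std::string& vertexSrc, const std::string& fragmentSrc) {
        GLuint vs = compile(GL_VERTEX_SHADER, vertexSrc);
        GLuint fs = compile(GL_FRAGMENT_SHADER, fragmentSrc);
        program_ = glCreateProgram();
        glAttachShader(program_, vs);
        glAttachShader(program_, fs);
        glLinkProgram(program_);
        glDeleteShader(vs);  // safe once attached and linked
        glDeleteShader(fs);
    }
    ~ShaderProgram() { glDeleteProgram(program_); }

    void use() const { glUseProgram(program_); }
    GLuint id() const { return program_; }

private:
    static GLuint compile(GLenum type, const std::string& source) {
        GLuint shader = glCreateShader(type);
        const char* src = source.c_str();
        glShaderSource(shader, 1, &src, nullptr);
        glCompileShader(shader);
        // Real code should check GL_COMPILE_STATUS / GL_LINK_STATUS here.
        return shader;
    }
    GLuint program_ = 0;
};
```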
jesperengine · 8 years ago
Video
tumblr
I’ve been working a lot on this particular shader lately. It feels pretty random and uses the system time as seed. I think this is a pretty good example of what the final version will look like, except more random and more complex. This already works very well in VR, so it can only get better.
jesperengine · 8 years ago
Text
Fractal test
I created another shader to better understand how fractals work. I'm starting to understand fractals a bit. Although this is not the kind of fractal you're able to fly through, I think it's a good start.
jesperengine · 8 years ago
Text
I think I may just have fixed all the things that needed fixing, well, at least as far as the adaptive quality is concerned. It turns out the time glBlitFramebuffer takes doesn’t count as ‘your application’ in the SteamVR frame timing graphs, which is why the GPU seemed to be spending so much time on ‘other’. I changed my code so it’s not using glBlitFramebuffer anymore; instead it uses a full screen quad and renders the old render target to the new one that way.
This somehow fixed the asynchronous reprojection issue, because I’m still having a good framerate with that setting enabled.
I also increased the max size of the off-screen render target to twice the recommended size. That way you get some neat anti-aliasing when the scene isn’t very heavy.
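For reference, a rough sketch of the fullscreen copy mentioned above (illustrative only: the shader compilation and the empty VAO binding are assumed to happen elsewhere, and the names are made up):

```cpp
// Vertex shader: one triangle that covers the whole screen,
// with positions generated from gl_VertexID so no vertex buffer is needed.
const char* kCopyVert = R"GLSL(
#version 330 core
out vec2 vUv;
void main() {
    vec2 pos = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);
    vUv = pos;
    gl_Position = vec4(pos * 2.0 - 1.0, 0.0, 1.0);
}
)GLSL";

// Fragment shader: sample the old render target.
const char* kCopyFrag = R"GLSL(
#version 330 core
uniform sampler2D uSource;
in vec2 vUv;
out vec4 fragColor;
void main() { fragColor = texture(uSource, vUv); }
)GLSL";

// Render the old render target into the new framebuffer.
void copyToTarget(GLuint copyProgram, GLuint sourceTexture,
                  GLuint targetFbo, int width, int height)
{
    glBindFramebuffer(GL_FRAMEBUFFER, targetFbo);
    glViewport(0, 0, width, height);
    glUseProgram(copyProgram);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, sourceTexture);
    glUniform1i(glGetUniformLocation(copyProgram, "uSource"), 0);
    glDrawArrays(GL_TRIANGLES, 0, 3);  // the fullscreen triangle
}
```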
jesperengine · 8 years ago
Text
Alright, SDL has way more features and support for sound, which I’ll need at some point, so I’ll stick with SDL for now.
jesperengine · 8 years ago
Text
Just tried out the SDL library, and it was a huge pain to get it working. I was also still getting high CPU times; it turns out I was calling glFlush() and glFinish(), not sure why. Now I’m not sure whether I should stay with SDL or revert to glfw3. I don’t want to throw away the work I’ve done just now, and SDL might actually be more useful than glfw, but on the other hand I might run into more linking problems, since I’m not sure whether SDL is currently statically or dynamically linked.
jesperengine · 8 years ago
Text
I made a bit more progress with the adaptive quality system. Someone on GitHub responded, so now I know how to get the refresh rate of the HMD and I don’t have to hardcode it.
I switched to monitoring the total GPU time, though for some reason the framerate now gets really low when asynchronous reprojection is enabled. I think it’s because the total GPU time never actually gets higher than 11 ms; once it would exceed that it basically jumps back to 0, so when a frame is dropped OpenVR reports a total GPU time of 2 ms or so.
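For reference, reading these values through the OpenVR API looks roughly like this (based on the headers as far as I can tell; error handling kept minimal, and exact fields may differ between SDK versions):

```cpp
#include <openvr.h>

// Total GPU time of the last frame, in milliseconds, as reported by the compositor.
float getGpuTimeMs()
{
    vr::Compositor_FrameTiming timing = {};
    timing.m_nSize = sizeof(vr::Compositor_FrameTiming);
    if (vr::VRCompositor()->GetFrameTiming(&timing, 0))
        return timing.m_flTotalRenderGpuMs;
    return 0.0f;
}

// Refresh rate of the HMD, falling back to 90 Hz if the property is unavailable.
float getHmdRefreshRate()
{
    vr::ETrackedPropertyError err = vr::TrackedProp_Success;
    float hz = vr::VRSystem()->GetFloatTrackedDeviceProperty(
        vr::k_unTrackedDeviceIndex_Hmd, vr::Prop_DisplayFrequency_Float, &err);
    return (err == vr::TrackedProp_Success && hz > 0.0f) ? hz : 90.0f;
}
```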
I also noticed the CPU time is incredibly high when my engine is running, compared to other engines. This is what my frame timing looks like in the profiler that comes with SteamVR:
Tumblr media
At first I thought this was because I’m using OpenGL, but when I run the hellovr_opengl sample it runs fine:
Tumblr media
This example is using SDL though, and I’m using glfw right now. So I’ll try out SDL and see if that changes anything; it might just fix all the issues I’m currently having.
jesperengine · 8 years ago
Text
Adaptive quality
I think I’ve got adaptive quality working now, well, almost; there are still a couple of issues. Setting the ray precision wasn’t really a success, since the difference between two quality levels is clearly visible, so the size of objects would jump around a lot. Changing the precision or iteration count doesn’t really speed up the render times either, so in the end I just decided to change the resolution depending on the frame time. This is working remarkably well, though there are still a couple of things that need fixing.
If you enable asynchronous reprojection in the SteamVR settings for example, the quality gets very low for some reason. I think it’s because of the way I’m monitoring the render times. I’m currently using m_flCompositorIdleCpuMs from the Compositor_FrameTiming struct. This value changes when async reprojection is enabled.
I had a look at The Lab Renderer just now, because I realised they were using the same system. Apparently they’re monitoring the total GPU time, though you’ll need to know the refresh rate of the HMD in order to use this. Unity has an option built in for this, but I couldn’t find anything like it in the OpenVR SDK. I’ll hardcode it to 90 fps for now and see if anyone on GitHub responds to my request.
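The resolution scaling itself can be sketched like this (completely made-up thresholds and step sizes, just to illustrate comparing the GPU time against the frame budget derived from the refresh rate):

```cpp
#include <algorithm>

// Nudge the render target scale based on the last GPU frame time versus the
// frame budget (1000 / 90 is roughly 11.1 ms at 90 Hz).
float adjustRenderScale(float currentScale, float gpuFrameTimeMs, float refreshRateHz)
{
    const float budgetMs = 1000.0f / refreshRateHz;
    if (gpuFrameTimeMs > budgetMs * 0.9f)
        currentScale *= 0.9f;    // about to drop a frame: back off quickly
    else if (gpuFrameTimeMs < budgetMs * 0.7f)
        currentScale *= 1.02f;   // plenty of headroom: creep back up slowly
    return std::max(0.5f, std::min(currentScale, 2.0f));  // keep within sane bounds
}
```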
jesperengine · 8 years ago
Text
I’ve put some more time into fixing some of the smaller issues. I gave a friend an early alpha version of the engine, which didn’t work for him because a DLL was missing. This issue is now fixed.
I also added the spectator window, which was a bit of a hassle because it was trying to vsync on my main monitor, which is 60 Hz, so SteamVR would report lots of dropped frames. After a while I realised you can call glfwSwapInterval(0); to disable vsync.
Another issue that is now fixed is that the engine doesn’t render unnecessary pixels anymore. At first I simply tested the distance from the center of the screen and rendered a black pixel if it got too far away, but now I’m using the actual hidden area mesh from the OpenVR SDK as a stencil mask.
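Roughly, using the hidden area mesh as a stencil mask looks like this (a sketch with illustrative names; the vertex attribute setup and the trivial position-only shader are assumed to exist elsewhere, and an OpenGL loader header is assumed to be included):

```cpp
#include <openvr.h>

// Upload the hidden area mesh for one eye; the vertices are 2D UVs in [0, 1]
// of the eye render target.
GLuint uploadHiddenAreaMesh(vr::EVREye eye, GLsizei& outVertexCount)
{
    vr::HiddenAreaMesh_t mesh = vr::VRSystem()->GetHiddenAreaMesh(eye);
    outVertexCount = static_cast<GLsizei>(mesh.unTriangleCount * 3);
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, outVertexCount * sizeof(vr::HmdVector2_t),
                 mesh.pVertexData, GL_STATIC_DRAW);
    return vbo;
}

// At the start of each eye's frame: write 1 into the stencil buffer wherever
// the mesh covers, then let later passes only shade pixels where the stencil is 0.
void writeHiddenAreaStencil(GLsizei vertexCount)
{
    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_ALWAYS, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);  // stencil only, no colour
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glStencilFunc(GL_EQUAL, 0, 0xFF);                     // skip the masked pixels later
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
}
```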
I’m currently working on adaptive quality: I’m monitoring frame timings, and when the engine feels like a frame is about to get dropped it automatically lowers the quality to prevent this from happening. This makes sure the engine runs at 90 fps no matter what video card you use (as long as it’s a decent one), and if you have a better video card it will actually increase the quality.
At first I tried changing the ray precision depending on the quality level, but this makes all the objects grow and shrink rapidly and it just looks weird. Changing the ray precision also doesn’t really gain that much performance. It turned out that changing the resolution works much better.
I’m not finished yet; there’s still a dropped frame occasionally, and the resolution usually ends up a bit below the recommended resolution, so there’s still a bit of work left.
The talks from Alex Vlachos have really helped a lot so far; if you’re interested in this kind of stuff, I suggest you watch them here and here.
jesperengine · 9 years ago
Text
It seems like we’re getting somewhere. I’ve been playing around with the current state of the engine for a bit, trying to figure out how ray-marched shaders work exactly. To give myself a bit of a goal, I tried to recreate the desert from the game NaissanceE. This is what it currently looks like:
Tumblr media
And here’s the original:
Tumblr media
When trying to build the desert I realised I needed noise functions, so I searched the internet for a bit to find out how these noise functions work exactly. It turns out you can get away with just a bunch of cosine waves, though loading a texture with noise is usually a much better idea. So I implemented a way to add textures to the shaders.
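A rough idea of what such a noise function looks like in GLSL (this is the common sine-hash value noise, not necessarily the exact trick mentioned above), held here in a C++ raw string:

```cpp
const char* kNoiseGlsl = R"GLSL(
// Cheap pseudo-random hash for a 2D coordinate.
float hash(vec2 p) {
    return fract(sin(dot(p, vec2(127.1, 311.7))) * 43758.5453);
}

// Value noise: hash the four surrounding lattice points and blend smoothly.
float valueNoise(vec2 p) {
    vec2 i = floor(p);
    vec2 f = fract(p);
    vec2 u = f * f * (3.0 - 2.0 * f);  // smoothstep-style interpolation
    return mix(mix(hash(i),                  hash(i + vec2(1.0, 0.0)), u.x),
               mix(hash(i + vec2(0.0, 1.0)), hash(i + vec2(1.0, 1.0)), u.x),
               u.y);
}
// In practice, sampling a small repeating noise texture
// (texture(uNoise, p * scale).r) is usually cheaper than computing this per pixel.
)GLSL";
```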
I’m still not quite sure how I’m going to combine multiple shaders, if I’m going to do that at all. But I’ve been thinking about a way to put a bunch of different shaders in a folder, with a JSON file included that tells the program which textures to load.
jesperengine · 9 years ago
Video
youtube
I’ve finally managed to get the matrices right, and now all the objects are displayed exactly where they need to be. At first the objects were moving in exactly the same direction as your head, while they should actually move in the opposite direction so that they appear to be standing still. Other times the movement was right, but there was something weird going on with the field of view or aspect ratio that made everything look distorted. But now I’ve managed to get everything stuck to the position where it needs to be.
I used some of the source code from this article as a reference. It’s written in Lua and has a pretty different workflow from mine, but it was useful for figuring out how to get the matrices right.
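For anyone trying the same thing, this is roughly the matrix plumbing, sketched with glm for the math (which may not match the engine’s own types, and depending on the OpenVR SDK version GetProjectionMatrix may take an extra graphics-API argument):

```cpp
#include <glm/glm.hpp>
#include <openvr.h>

// OpenVR matrices are row-major, glm is column-major, so the conversion transposes.
glm::mat4 toGlm(const vr::HmdMatrix44_t& m) {
    glm::mat4 r;
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[col][row] = m.m[row][col];
    return r;
}

glm::mat4 toGlm(const vr::HmdMatrix34_t& m) {
    glm::mat4 r(1.0f);  // bottom row stays (0, 0, 0, 1)
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 4; ++col)
            r[col][row] = m.m[row][col];
    return r;
}

// headPose is the HMD's mDeviceToAbsoluteTracking from WaitGetPoses().
// The view matrix is the *inverse* of where the head/eye is, which is why the
// scene follows the head when the inversion is missing.
glm::mat4 eyeViewProjection(vr::EVREye eye, const vr::HmdMatrix34_t& headPose,
                            float nearZ, float farZ)
{
    glm::mat4 proj      = toGlm(vr::VRSystem()->GetProjectionMatrix(eye, nearZ, farZ));
    glm::mat4 eyeToHead = toGlm(vr::VRSystem()->GetEyeToHeadTransform(eye));
    glm::mat4 view      = glm::inverse(toGlm(headPose) * eyeToHead);
    return proj * view;
}
```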
Now that the basics are working I can start writing shaders. Of course there are still other things that need to be added like a sound engine and a way to load textures. Perhaps functionality to combine different shaders. But I think this is a really good start and I’ll be spending most of my time working in GLSL now.
jesperengine · 9 years ago
Text
I’ve finally managed to get a working build for Windows. I also added the OpenVR SDK. All is working fine; I managed to send frames to the HMD, but I haven’t yet found a way to render images with the right perspective. I’ve done some research on how matrices work, and I feel like I’ve learned a lot, so I’ll try to use this information to get the right perspective rendered.