diff --git a/README.md b/README.md index 47a8c52..5e8e93e 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,57 @@ -Instructions - Vulkan Grass Rendering +University of Pennsylvania, CIS 565: GPU Programming and Architecture, +Project 4 - Vulkan Grass Rendering ======================== +* Tabatha Hickman + * LinkedIn: https://www.linkedin.com/in/tabatha-hickman-335987140/ +* Tested on: Windows 10 Pro, i7-5600U CPU @ 2.60GHz 16GB, GeForce 840M (personal computer) + +![](img/withWind.gif) + +This project involved simulating and rendering a large number of blades of grass using Vulkan. The simulation and several optimizations used in the project are described in the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). The general pipeline, described in further detail below, includes a compute shader where vertex positions are modified based on forces and blades are culled as necessary, a vertex shader for the grass blades, a tessellation control shader which prepares the vertex information to be tessellated using input inner and outer tessellation levels, a tessellation evaluation shader where the computed tessellation coordinates are used to find the positions of points on the blade, and a fragment shader to color the blades. + +## Blade Geometry and Tessellation + +Each blade in the simulation is represented by a Bezier curve with three control points. The blades also contain information about the direction of their height and width, the value of their height and width, and the stiffness of the blade. This information is passed into the tessellation control shader from the vertex shader. In the tessellation control shader we set the inner and outer levels of tessellation. 
In this case, since the blade will curve along its height and its height is much greater than its width, inner level 1 is much greater than inner level 0 and outer levels 0 and 2 are much greater than outer levels 1 and 3. (Inner level 1 and outer levels 0 and 2 relate to the vertical dimension of the blade, while the others relate to the horizontal.) After the tessellation control shader completes, the tessellation engine, which cannot be modified, computes the tessellation. Then in the tessellation evaluation shader, a position is computed for each tessellation coordinate using bilinear interpolation and equations that mold the blade into a specific shape. I used the triangle-tip shape provided in the paper for my simulation. The tessellation evaluation shader also sets gl_Position for this coordinate, using the camera matrices to project the point into clip space. Finally, I pass the normal of the point and the height coordinate from tessellation on to the fragment shader. There I color the blades by interpolating between a darker and a lighter green, with the darker closer to the plane. I also attempted Lambertian shading using the normals, but I didn't like the artifacts I was getting toward the tips of the grass so I took it out. Below is the result of this work. + +![](img/geomNoForces.JPG) + +## Simulating Forces + +**Gravity:** The first and most basic force added was gravity. An environmental gravity was calculated by multiplying the direction of the force by its magnitude. This was added to the contribution of gravity with respect to the front face of the blade. That force made sure the blades would fall such that they curved in the direction of the front face. Below is the result of adding just this force. + +![](img/withGravity.JPG) + +**Recovery:** Since grass often has some resilience to the forces acting on it, a recovery force is added that inclines the blade toward its initial position. 
This is easy because the initial position of the blade's tip is simply its base position offset by the blade's height along the up direction. Adding this force, I got this result. + +![](img/withRecovery.JPG) + +**Wind:** Finally, wind is added to keep the simulation lively. I made my wind direction a function of the time which has elapsed since the beginning of the simulation. The direction has a magnitude in the x direction related to the cosine of that time, and in the z direction related to the sine of that time. This creates a subtle circular repetitive motion. Because wind acts more powerfully on blades of grass which directly face its direction than on blades that are perpendicular to it, we multiply the wind direction by a wind alignment term defined in the paper. This term is a combination of the relative direction of the blade's front and the wind (computed with a dot product) and the uprightness of the blade, calculated by comparing the height of the tip of the blade to its initial height. After adding this force, I got something like the motion seen in the gif at the top of the page. + +In the compute shader, all of these forces are computed, summed, multiplied by the delta time since the last update, and added to the position of the tip of the blade. Then several calculations are made to correct this translation in order to preserve the positioning and length of the blade. The tip is constrained to stay above the ground plane, the second (invisible) control point is recalculated, and then both are rescaled by the ratio between the prescribed length and the proposed length. + +## Culling Optimizations + +**Orientation Culling:** Blades whose front faces are perpendicular to the camera are barely seen because the blades have no thickness, which can cause aliasing artifacts. This culling removes any blades which face too perpendicularly to the camera, using a dot product between the view vector and the face direction to decide. 
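The orientation test can be tried on the CPU with a small C++ sketch. It mirrors the comparison in `compute.comp` from this change (a dot product against a 0.9 threshold), but `Vec3`, its helper functions and `orientationCulled` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-in for GLSL's vec3 (hypothetical helper type).
struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);  // a zero-length vector has no direction
    return {v.x / len, v.y / len, v.z / len};
}

// Returns true when the blade would be culled by the orientation test.
// bladeAngle is the blade's orientation angle (stored in v0.w in the shader).
bool orientationCulled(Vec3 camEye, Vec3 bladeBase, float bladeAngle,
                       float threshold = 0.9f) {
    // Blade direction in the ground plane, built the same way as `f` in the shader.
    Vec3 bladeDir = {std::cos(bladeAngle), 0.0f, std::sin(bladeAngle)};
    Vec3 toCamera = normalize(sub(camEye, bladeBase));
    return dot(bladeDir, toCamera) > threshold;
}
```

As in the shader, the comparison is signed, so only blades whose direction points toward the camera are rejected. Taking the absolute value of the dot product would also reject blades pointing directly away.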
+ +**View-Frustum Culling:** In this culling, we simply forgo the tessellation and other computations for blades which are outside the camera's view and thus are not seen in the render. We do this by projecting the base, tip and midpoint of the blade into clip space and culling the blade if all three points lie outside the range given by the homogeneous coordinate plus an additional tolerance. + +**Distance Culling:** Here we want to skip rendering blades which are too far from the camera to be seen. We set a max distance past which blades will not be rendered and a number of buckets for incremental culling. Then, we cull any blades past the max distance and b out of n blades for every bucket b between the camera and that max distance, where n is the number of buckets. Here is a gif showing a rather extreme example of this culling. In this case I set the max distance to 30 and there are 6 buckets. + +![](img/distanceCull.gif) + +Here is a chart comparing the performance benefit of each of the culling optimizations. I'm not extremely confident in these numbers because it was very difficult to find viewpoints of the grass that did not favor one type of culling over another. In any case, distance culling does seem to be substantially more effective than the other two, which I believe is because distance culling is at work in all views, whereas view-frustum culling is of no use when the full mass of grass is visible. + +![](img/perfCulling.JPG) + +## Performance vs Number of Blades + +Below is a chart of the average frame render time vs the number of blades rendered, with all forces and culling enabled. It seems to be a power relationship: increasing the number of blades degrades performance faster than linearly. It's probable that this relationship is much more dramatic when no culling optimizations are used. 
The optimizations definitely become more useful as the number of blades increases so this may contribute to why the relationship is almost linear. + +![](img/overallPerf.JPG) + +### Instructions - Vulkan Grass Rendering + This is due **Wednesday 10/9, evening at midnight**. **QUICK NOTE**: Please use `git clone --recursive` when cloning this repo as there are submodules which need to be cloned as well. diff --git a/img/distanceCull.gif b/img/distanceCull.gif new file mode 100644 index 0000000..7323ed6 Binary files /dev/null and b/img/distanceCull.gif differ diff --git a/img/geomNoForces.JPG b/img/geomNoForces.JPG new file mode 100644 index 0000000..ceed1ce Binary files /dev/null and b/img/geomNoForces.JPG differ diff --git a/img/overallPerf.JPG b/img/overallPerf.JPG new file mode 100644 index 0000000..68c0019 Binary files /dev/null and b/img/overallPerf.JPG differ diff --git a/img/perfCulling.JPG b/img/perfCulling.JPG new file mode 100644 index 0000000..17b36fe Binary files /dev/null and b/img/perfCulling.JPG differ diff --git a/img/withGravity.JPG b/img/withGravity.JPG new file mode 100644 index 0000000..3993f66 Binary files /dev/null and b/img/withGravity.JPG differ diff --git a/img/withOrientationCull.JPG b/img/withOrientationCull.JPG new file mode 100644 index 0000000..877727e Binary files /dev/null and b/img/withOrientationCull.JPG differ diff --git a/img/withRecovery.JPG b/img/withRecovery.JPG new file mode 100644 index 0000000..2ad7968 Binary files /dev/null and b/img/withRecovery.JPG differ diff --git a/img/withWind.gif b/img/withWind.gif new file mode 100644 index 0000000..60bd673 Binary files /dev/null and b/img/withWind.gif differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..d5fbd21 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -44,9 +44,9 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstVertex = 0; indirectDraw.firstInstance = 0; - BufferUtils::CreateBufferFromData(device, 
commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); - BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); + BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } VkBuffer Blades::GetBladesBuffer() const { diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..5097a17 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -195,9 +195,43 @@ void Renderer::CreateTimeDescriptorSetLayout() { } void Renderer::CreateComputeDescriptorSetLayout() { - // TODO: Create the descriptor set layout for the compute pipeline + // Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + // Describe the binding of the descriptor set layout + + VkDescriptorSetLayoutBinding inputLayoutBinding = {}; + inputLayoutBinding.binding = 0; + inputLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + inputLayoutBinding.descriptorCount = 
1; + inputLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + inputLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding culledLayoutBinding = {}; + culledLayoutBinding.binding = 1; + culledLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledLayoutBinding.descriptorCount = 1; + culledLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + culledLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding numLayoutBinding = {}; + numLayoutBinding.binding = 2; + numLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numLayoutBinding.descriptorCount = 1; + numLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { inputLayoutBinding, culledLayoutBinding, numLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -215,7 +249,8 @@ void Renderer::CreateDescriptorPool() { // Time (compute) { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, - // TODO: Add any additional types and counts of descriptors you will need to allocate + // Blades + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , 3 * static_cast<uint32_t>(scene->GetBlades().size()) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +353,44 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: Create Descriptor sets for the grass. + // Create Descriptor sets for the grass. 
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo grassBufferInfo = {}; + grassBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + grassBufferInfo.offset = 0; + grassBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &grassBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -358,8 +429,74 @@ void Renderer::CreateTimeDescriptorSet() { } void Renderer::CreateComputeDescriptorSets() { - // TODO: Create Descriptor sets for the compute pipeline - // The descriptors should point to Storage buffers which will hold the grass blades, the culled 
grass blades, and the output number of grass blades + // Create Descriptor sets for the compute pipeline + // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo computeInputBufferInfo = {}; + computeInputBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + computeInputBufferInfo.offset = 0; + computeInputBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo computeCulledBufferInfo = {}; + computeCulledBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + computeCulledBufferInfo.offset = 0; + computeCulledBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo computeIndirectBufferInfo = {}; + computeIndirectBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + computeIndirectBufferInfo.offset = 0; + computeIndirectBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 0].dstBinding = 0; + descriptorWrites[3 * i + 0].dstArrayElement = 0; + descriptorWrites[3 * 
i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 0].descriptorCount = 1; + descriptorWrites[3 * i + 0].pBufferInfo = &computeInputBufferInfo; + descriptorWrites[3 * i + 0].pImageInfo = nullptr; + descriptorWrites[3 * i + 0].pTexelBufferView = nullptr; + + descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 1].dstBinding = 1; + descriptorWrites[3 * i + 1].dstArrayElement = 0; + descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 1].descriptorCount = 1; + descriptorWrites[3 * i + 1].pBufferInfo = &computeCulledBufferInfo; + descriptorWrites[3 * i + 1].pImageInfo = nullptr; + descriptorWrites[3 * i + 1].pTexelBufferView = nullptr; + + descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 2].dstBinding = 2; + descriptorWrites[3 * i + 2].dstArrayElement = 0; + descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 2].descriptorCount = 1; + descriptorWrites[3 * i + 2].pBufferInfo = &computeIndirectBufferInfo; + descriptorWrites[3 * i + 2].pImageInfo = nullptr; + descriptorWrites[3 * i + 2].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateGraphicsPipeline() { @@ -716,8 +853,7 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.module = computeShaderModule; computeShaderStageInfo.pName = "main"; - // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, 
timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -884,6 +1020,11 @@ void Renderer::RecordComputeCommandBuffer() { vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); // TODO: For each group of blades bind its descriptor set and dispatch + for (int b = 0; b < computeDescriptorSets.size(); ++b) + { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[b], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, ceil(float(NUM_BLADES) / float(WORKGROUP_SIZE)), 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -975,14 +1116,13 @@ void Renderer::RecordCommandBuffers() { for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; - // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); - // TODO: Bind the descriptor set for each grass blades model + // Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, graphicsPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw - // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1041,8 +1181,6 @@ void Renderer::Frame() { Renderer::~Renderer() { vkDeviceWaitIdle(logicalDevice); - // TODO: destroy any resources you created - 
vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data()); vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer); @@ -1057,6 +1195,7 @@ Renderer::~Renderer() { vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..36caa9b 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -56,12 +56,15 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector<VkDescriptorSet> modelDescriptorSets; VkDescriptorSet timeDescriptorSet; + std::vector<VkDescriptorSet> grassDescriptorSets; + std::vector<VkDescriptorSet> computeDescriptorSets; VkPipelineLayout graphicsPipelineLayout; VkPipelineLayout grassPipelineLayout; diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..f0a087b 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -21,20 +21,25 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: -// 1. Store the input blades -// 2. Write out the culled blades -// 3. Write the total number of blades remaining - -// The project is using vkCmdDrawIndirect to use a buffer as the arguments for a draw call -// This is sort of an advanced feature so we've showed you what this buffer should look like -// -// layout(set = ???, binding = ???) 
buffer NumBlades { -// uint vertexCount; // Write the number of blades remaining here -// uint instanceCount; // = 1 -// uint firstVertex; // = 0 -// uint firstInstance; // = 0 -// } numBlades; +// Add bindings to: +// 1. Store the input blades -> binding 0 +// 2. Write out the culled blades -> binding 1 +// 3. Write the total number of blades remaining -> binding 2 + +layout(set = 2, binding = 0) buffer InputBlades { + Blade iBlades[]; +} inputBlades; + +layout(set = 2, binding = 1) buffer CulledBlades { + Blade cBlades[]; +} culledBlades; + +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); @@ -43,14 +48,99 @@ bool inBounds(float value, float bounds) { void main() { // Reset the number of blades to 0 if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point - // TODO: Apply forces on every blade and update the vertices in the buffer + uint idx = gl_GlobalInvocationID.x; + Blade b = inputBlades.iBlades[idx]; + + vec3 v0pt = b.v0.xyz; + float fDir = b.v0.w; // front facing direction of blade + vec3 v1pt = b.v1.xyz; + float ht = b.v1.w; // height of blade + vec3 v2pt = b.v2.xyz; + float wd = b.v2.w; // width of blade + vec3 upDir = b.up.xyz; + float stf = b.up.w; // stiffness of blade + + // Apply forces on every blade and update the vertices in the buffer + + // Gravity + vec4 D = vec4(0.0, -1.0, 0.0, 5.0); // gravity direction D.xyz, magnitude D.w + vec3 gE = normalize(D.xyz) * D.w; // environmental gravity + vec3 f = normalize(vec3(cos(fDir), 0.0, sin(fDir))); // direction of front face of blade + vec3 gF = 0.25 * abs(gE.y) * f; + vec3 G = gE + gF; + + // Recovery + vec3 iv2pt = ht * upDir + v0pt; // v2 would initially be 
v0 plus the height of the blade + vec3 R = (iv2pt - v2pt) * stf; + + // Wind + vec3 windDir = vec3(1.0, 0.0, 0.0) * cos(totalTime) + vec3(0.0, 0.0, 1.0) * sin(totalTime); + + float fd = 1 - abs(dot(normalize(windDir), normalize(v2pt - v0pt))); + float fr = dot((v2pt - v0pt), upDir) / ht; + float windAlign = fd * fr; + vec3 W = windDir * windAlign; + + // Combine forces and adjust v1 and v2 as needed + vec3 tv2 = (G + R + W) * deltaTime; // proposed translation of v2 + v2pt += tv2; - // TODO: Cull blades that are too far away or not in the camera frustum and write them + v2pt = v2pt - upDir * min(dot(upDir, (v2pt - v0pt)), 0.0); + + float projLength = length(v2pt - v0pt - upDir * dot((v2pt - v0pt), upDir)); + v1pt = v0pt + ht * upDir * max(1.0 - projLength/ht, 0.05 * max(projLength/ht, 1.0)); + + float l0 = length(v2pt - v0pt); + float l1 = length(v1pt - v0pt) + length(v2pt - v1pt); + float bladeLength = (2.0*l0 + l1) / 3.0; + float r = ht / bladeLength; + v1pt = v0pt + r*(v1pt - v0pt); + v2pt = v0pt + r*(v2pt - v0pt); // equals corrected v1 + r*(v2 - v1) with the uncorrected v1 + + // Cull blades that are too far away or not in the camera frustum and write them // to the culled blades buffer // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount // You want to write the visible blades to the buffer without write conflicts between threads + + vec3 camEye = inverse(camera.view)[3].xyz; + + // Orientation Culling + float viewBladeDot = dot(f, normalize(camEye - v0pt)); + if (viewBladeDot > 0.9) { return; } + + // View-frustum Culling + vec3 m = 0.25 * v0pt + 0.5 * v1pt + 0.25 * v2pt; + vec4 v0ndc = camera.proj * camera.view * vec4(v0pt, 1.0); + float v0h = v0ndc.w + 1.0; + bool v0InView = (v0ndc.x < v0h && v0ndc.x > -v0h) && + (v0ndc.y < v0h && v0ndc.y > -v0h) && + (v0ndc.z < v0h && v0ndc.z > -v0h); + vec4 mndc = camera.proj * camera.view * vec4(m, 1.0); + float mh = mndc.w + 1.0; + bool mInView = (mndc.x < mh && mndc.x > -mh) && + (mndc.y < mh && mndc.y > -mh) && + (mndc.z < mh 
&& mndc.z > -mh); + vec4 v2ndc = camera.proj * camera.view * vec4(v2pt, 1.0); + float v2h = v2ndc.w + 1.0; + bool v2InView = (v2ndc.x < v2h && v2ndc.x > -v2h) && + (v2ndc.y < v2h && v2ndc.y > -v2h) && + (v2ndc.z < v2h && v2ndc.z > -v2h); + if(!v0InView && !mInView && !v2InView) { return; } + + // Distance Culling + float dmax = 30.0; // max distance rendered + float n = 6.0; //number of buckets + float dproj = length(v0pt - camEye - upDir * dot((v0pt - camEye), upDir)); + + if(idx % int(n) > floor(n * (1.0 - dproj/dmax))) { return; } + + inputBlades.iBlades[idx].v1.xyz = v1pt; + inputBlades.iBlades[idx].v2.xyz = v2pt; + + uint cIdx = atomicAdd(numBlades.vertexCount, 1); + culledBlades.cBlades[cIdx] = inputBlades.iBlades[idx]; } diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..be0a074 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -6,12 +6,18 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare fragment shader inputs +// Declare fragment shader inputs +layout(location = 0) in vec3 pos; +layout(location = 1) in vec3 nor; +layout(location = 2) in float h; layout(location = 0) out vec4 outColor; void main() { - // TODO: Compute fragment color + vec3 lightDir = vec3(1.0, 1.0, 1.0); + float lambert = abs(dot(lightDir, nor)); - outColor = vec4(1.0); + vec3 col = (1.0 - h) * vec3(0.1, 0.2, 0.0) + h * vec3(0.3, 0.6, 0.0); + + outColor = vec4(col, 1.0);// * clamp(lambert, 0.2, 1.0); } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..10ddfa3 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -8,19 +8,34 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation control shader inputs and outputs +// Tessellation control shader inputs and outputs + +layout(location = 0) in vec4 iv0[]; +layout(location = 1) in vec4 iv1[]; +layout(location = 2) in vec4 iv2[]; +layout(location = 3) in vec4 
iup[]; + +layout(location = 0) out vec4 ov0[]; +layout(location = 1) out vec4 ov1[]; +layout(location = 2) out vec4 ov2[]; +layout(location = 3) out vec4 oup[]; void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; - // TODO: Write any shader outputs + // Write any shader outputs + ov0[gl_InvocationID] = iv0[gl_InvocationID]; + ov1[gl_InvocationID] = iv1[gl_InvocationID]; + ov2[gl_InvocationID] = iv2[gl_InvocationID]; + oup[gl_InvocationID] = iup[gl_InvocationID]; - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? - // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ??? + // Set level of tesselation + // Should have more vertical tesselation than horizontal + gl_TessLevelInner[0] = 2.0; + gl_TessLevelInner[1] = 7.0; + gl_TessLevelOuter[0] = 5.0; + gl_TessLevelOuter[1] = 2.0; + gl_TessLevelOuter[2] = 5.0; + gl_TessLevelOuter[3] = 2.0; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..84d8235 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -8,11 +8,42 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation evaluation shader inputs and outputs +// Tessellation evaluation shader inputs and outputs + +layout(location = 0) in vec4 iv0[]; +layout(location = 1) in vec4 iv1[]; +layout(location = 2) in vec4 iv2[]; +layout(location = 3) in vec4 iup[]; + +layout(location = 0) out vec3 pos; +layout(location = 1) out vec3 nor; +layout(location = 2) out float h; void main() { float u = gl_TessCoord.x; float v = gl_TessCoord.y; - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + // Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + + // Compute Normal + vec3 
lerpA = iv0[0].xyz + v * (iv1[0] - iv0[0]).xyz; + vec3 lerpB = iv1[0].xyz + v * (iv2[0] - iv1[0]).xyz; + vec3 lerpC = lerpA + v * (lerpB - lerpA); + + vec3 bitan = vec3(sin(iv0[0].w), 0.0, cos(iv0[0].w)); // vector along width of blade + vec3 tan = normalize(lerpB - lerpA); + nor = normalize(cross(tan, bitan)); + + // Compute Position + vec3 lerp1 = lerpC - iv2[0].w * bitan; // iv2.w is width of blade + vec3 lerp2 = lerpC + iv2[0].w * bitan; + + float thresh = 0.2; + float t = 0.5 + (u - 0.5) * (1 - (max(v - thresh, 0))/(1 - thresh)); // triangle tip shape + + pos = (1 - t) * lerp1 + t * lerp2; + + h = v; + + gl_Position = camera.proj * camera.view * vec4(pos, 1.0f); } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..5690fe1 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -6,12 +6,30 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; }; -// TODO: Declare vertex shader inputs and outputs +// Vertex shader inputs and outputs + +layout(location = 0) in vec4 iv0; +layout(location = 1) in vec4 iv1; +layout(location = 2) in vec4 iv2; +layout(location = 3) in vec4 iup; + +layout(location = 0) out vec4 ov0; +layout(location = 1) out vec4 ov1; +layout(location = 2) out vec4 ov2; +layout(location = 3) out vec4 oup; out gl_PerVertex { vec4 gl_Position; }; void main() { - // TODO: Write gl_Position and any other shader outputs + // Write gl_Position and any other shader outputs + ov0 = iv0; + ov1 = iv1; + ov2 = iv2; + oup = iup; + + vec4 outPos = iv0; + outPos.w = 1.0; + gl_Position = outPos; }
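As a CPU-side illustration of the distance-culling test in `compute.comp`, the bucket rule can be sketched in C++. This is a minimal sketch and not part of the change itself: `distanceCulled` is a hypothetical helper name, and its default arguments simply mirror the shader's `dmax = 30.0` and `n = 6`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical CPU mirror of the shader's distance-culling test.
// idx   : blade index (the shader uses gl_GlobalInvocationID.x)
// dproj : camera-to-blade distance projected onto the ground plane
// dmax  : distance beyond which every blade is culled
// n     : number of distance buckets
bool distanceCulled(std::uint32_t idx, float dproj, float dmax = 30.0f, int n = 6) {
    assert(n > 0 && dmax > 0.0f);
    // Largest value of idx % n that still survives at this distance.
    // It goes negative once dproj exceeds dmax, so every blade is culled there.
    float keepLevel = std::floor(static_cast<float>(n) * (1.0f - dproj / dmax));
    return static_cast<float>(idx % static_cast<std::uint32_t>(n)) > keepLevel;
}
```

With the default values, a blade 15 units from the camera survives only if `idx % 6` is at most 3, so 2 of every 6 blades are dropped at that distance, and the dropped fraction grows as the distance approaches `dmax`.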