Categories
godot performance

Godot Leveraging Timers

This is part 4 of 4 in a series about some recent performance tuning I completed on Elevation TD.

Intro

Start with a very basic assertion: anything that does not need to be done every frame should not be done every frame. There is no golden rule about what should or should not be done every frame – it depends highly on your specific game and what you are trying to achieve.

For example:

  • In an FPS shooter, movements often need to be instantaneous (i.e. checked every frame)
  • In a top-down strategy game, movements can be a bit laggy (i.e. an enemy might only evaluate whether it needs to go to a new target every second or two – it does not need to be checked every frame)

It’s important to remind yourself that if something only needs to happen once every 2 seconds instead of every frame, that is the difference between running some code 120 times (at 60 FPS) versus just once. Running it once will not only be less impactful on performance, but will also free up cycles to do a more complicated version of the process if desired. So if you have code that is checking for enemies each frame, checking to see if the player is in range each frame, seeking a new destination each frame, you might benefit a lot by using timers.

What is a timer?

Let me start off by saying that Godot has Timers you can add via Add Child Node and then link up with Signals. You can read about them here: https://docs.godotengine.org/en/stable/classes/class_timer.html. I don’t use them. I “roll my own” because I like to have a multitude of things I am running timers on and, frankly, it’s pretty easy to code.

Let’s take a very simple use case of “regoaling” – it’s common for enemies in a tower defense to periodically check to see if they have a new target – they start out with a goal, new towers are built by the player, they have to change their goals to attack those new towers, more towers are built, they change goals again, etc… You don’t want them doing this constantly, maybe every 3 seconds or so – and you want it to be configurable. So put your regoaling logic in a function and call it from a timer like so:

# make the frequency of checking configurable from editor
@export var reassessGoalFreq:float = 3000
# give yourself something to track it with
var nextReGoalTime:float=0

func _process(delta):
  # if your tracking number is in the past, run regoal
  if nextReGoalTime < Time.get_ticks_msec(): regoal()

func regoal():
  # set your next check time in the future by whatever
  # reassessGoalFreq is set to
  nextReGoalTime = Time.get_ticks_msec() + reassessGoalFreq
  # ...
  # do whatever complicated logic you need to do to check
  # if this thing needs a new goal
  # ...

IMHO, the above is way easier than adding nodes, linking signals, setting callbacks, etc… This is basically just two variable declarations and two lines of code. The check in _process is very lightweight as you’re just comparing two numbers. The way this plays out is:

  • On the first iteration of _process, nextReGoalTime (initially 0) will always be less than get_ticks_msec() (the number of milliseconds since the engine started), so it will fire off regoal() to set an initial goal
  • regoal() promptly resets nextReGoalTime to some point in the future: the current milliseconds since start plus whatever you have reassessGoalFreq set to – it’s actually kind of important to do this first – if you have anything in regoal() that does an await (for example on get_tree().create_timer(...).timeout), then _process might start doubling up calls to regoal() if nextReGoalTime isn’t already in the future
  • regoal() finishes and life returns to _process() until you are reassessGoalFreq milliseconds in the future

It’s worth noting that since nextReGoalTime and reassessGoalFreq are class-level floats, you can adjust them as needed – for example, if you spawn a bunch of enemies at the same time and want to make sure they don’t regoal at the same time, you can modify reassessGoalFreq with a random float to make sure they are all operating at slightly different intervals.
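As a minimal sketch of that staggering idea: the script below adds a random offset to each enemy's interval when it spawns. The reassessGoalJitter export is a hypothetical knob that is not part of the original code, and the 500 ms default is only an example value.

```gdscript
extends Node3D

# base interval between regoal checks, in milliseconds
@export var reassessGoalFreq: float = 3000
# hypothetical knob (an assumption): maximum random offset per enemy, in milliseconds
@export var reassessGoalJitter: float = 500

var nextReGoalTime: float = 0

func _ready():
  # give each enemy an interval somewhere in [freq, freq + jitter]
  # so enemies spawned together drift apart over time
  reassessGoalFreq += randf_range(0.0, reassessGoalJitter)

func _process(_delta):
  if nextReGoalTime < Time.get_ticks_msec():
    regoal()

func regoal():
  # push the next check into the future before doing any real work
  nextReGoalTime = Time.get_ticks_msec() + reassessGoalFreq
  # ... goal selection logic goes here ...
```

Note that every enemy still runs its first regoal() on its first frame. If that initial burst matters, setting nextReGoalTime to a small random value in _ready() would spread it out as well.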

In the case of Elevation TD, I have a whole set of these kinds of timers controlling enemy movements, resetting goals, checking if enemies should be attacking, checking if towers should be attacking, and so on. In my case, leveraging timers like in the above made a significant difference in FPS.

Conclusion

Using timers to control the frequency of events is a great example of the kind of “unsexy plumbing” in the code that can make a big difference in performance. The above implementation style is easy to wrap around a wide variety of situations. All you need to do to take advantage of it is identify processes that are currently running each frame that can be run much less often.


Godot Off-Screen Processing Control

This is part 3 of 4 in a series about some recent performance tuning I completed on Elevation TD.

Overview

Out of the box, Godot supports culling of off-screen object display – which is great, but if you have scripts attached to those culled-for-display-purposes-only objects, the scripts are still running, chewing up CPU for no displayable reason. Wouldn’t it be nice to mitigate that off-screen impact?

Godot gives you a way to do exactly that – it’s actually really easy to use, but you have to put a little thought into what you do with it. At any point in time, you can check to see if the node a script belongs to is within the camera’s frustum (i.e. in the camera’s field of view) via Camera3D.is_position_in_frustum(global_position) – see: https://docs.godotengine.org/en/stable/classes/class_camera3d.html#class-camera3d-method-is-position-in-frustum.

Frustum Checking

With a reference to your main camera, you can call this method in a conditional any time you want to see if your script is attached to an object that is visible (in my case, I have a static class called “StateMachine” that the Camera is referenced through):

if StateMachine.mainCamera.is_position_in_frustum(global_position):
  # do on-screen activities
  pass
else:
  # do an off-screen version of activities
  pass

You want to be a bit careful about how often you check for this – don’t run it every frame. In the case of Elevation TD, I have enemies that perform a series of activities where it might be interesting to check their on-camera/off-camera status for:

  • Conduct a walking motion
  • Attack when in close proximity to targets
  • Do a little performance when they die

When these enemies are off-screen, I still need them to move, attack, search for targets, and die – the camera never has the whole battlefield so there’s pretty much always some enemies off-camera. However, I can periodically check to see if they are off-camera and then perform a lighter-weight version of each of these activities:

  • Don’t perform walking motion, just move to a next position
  • Don’t perform any of the visible parts of an attack, just damage the target
  • Don’t do a death performance, just die and remove yourself from the battlefield

So the enemy script checks if the enemy is off camera at the start of each step motion, when it attacks, and when it dies. You can see the difference in the graphic below – note the FPS when looking at all the enemies attacking vs the FPS when staring into the water:

Checking the camera’s frustum is an easy check – the harder part is coming up with lighter-weight versions of the script’s activities.
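Here is a rough sketch of checking at the start of a step rather than every frame. It assumes the active camera can be fetched from the viewport (the original goes through a StateMachine class instead), and startStep() and playWalkCycle() are hypothetical names used only for illustration.

```gdscript
extends Node3D

# result of the most recent frustum check, refreshed once per step
var onCamera: bool = true

# called when the enemy begins a new step, not from _process()
func startStep(nextPosition: Vector3):
  var cam = get_viewport().get_camera_3d()
  onCamera = cam != null and cam.is_position_in_frustum(global_position)
  if onCamera:
    playWalkCycle(nextPosition)
  else:
    # off-screen: skip the walking motion and just move
    global_position = nextPosition

# hypothetical placeholder for the animated version of a step
func playWalkCycle(nextPosition: Vector3):
  # a real version would tween or animate toward nextPosition
  global_position = nextPosition
```

Caching the result in onCamera lets the attack and death code reuse the last answer if a fresh check is not worth the cost.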

Bonus Round

Once you fold in some of these camera checks, it may occur to you that there are some other things you could check to govern whether to show a “full experience” vs a “lighter experience”. For example, what’s the current FPS…

In the above image with the battle scene, one of the things you might not realize is that there are three versions of an enemy’s death:

  • Off-Camera – Just die and go away, no drama
  • On-Camera – Explode, particle effects, and other drama
  • On-Camera and FPS is below 90 – Just explode, provide a watered down version of drama

Doing this provides another protective layer to control how much CPU processing is going on under the hood. Like checking the camera frustum, you don’t want to do this on every frame, but it can be handy to control FPS impact of specific events.

This is a very simple check against the Engine API:

if Engine.get_frames_per_second() > 90:
  # perform extra dying acts for visual drama
  pass

The nice thing about this kind of check is it allows you to provide a more visually rich presentation when there are fewer actors fighting it out on the battlefield – which you want because the fewer the actors on the field, the closer the player will be looking at them. However, in a larger battle scene, it will tone down the visuals, but the player will largely not notice because focusing on any one actor in a battlefield of two hundred other actors is unlikely.
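Putting the three death variants together might look something like the sketch below. The function names and the fullDramaMinFps export are assumptions for illustration, and the camera is again taken from the viewport rather than from the StateMachine class used earlier.

```gdscript
extends Node3D

# threshold from the text above, exported so it can be tuned
@export var fullDramaMinFps: float = 90.0

func die():
  var cam = get_viewport().get_camera_3d()
  var onCamera = cam != null and cam.is_position_in_frustum(global_position)
  if not onCamera:
    # Off-Camera: just die and go away, no drama
    pass
  elif Engine.get_frames_per_second() > fullDramaMinFps:
    playFullDeath()
  else:
    playLightDeath()
  # a real version would wait for any effects to finish before removal
  queue_free()

# hypothetical placeholders for the two on-camera variants
func playFullDeath():
  pass # explode, particle effects, and other drama

func playLightDeath():
  pass # just explode
```

Because die() runs once per enemy rather than once per frame, both checks stay cheap even in a large battle.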


Godot Performant Nav Agent

This is part 2 of 4 in a series about some recent performance tuning I completed on Elevation TD.

Background

Godot’s built-in navigation and agent system works reasonably well. Before proceeding with the rest of this article, please be sure you’ve read and implemented everything in Godot’s Nav Agent tutorial – we are assuming you have a working nav mesh with working nav agents and want to optimize agent performance:

https://docs.godotengine.org/en/stable/tutorials/navigation/navigation_using_navigationagents.html

Problem Description

Let’s say you have hundreds of Nav Agents running concurrently in your game and you’re starting to notice that performance is lagging. Try this: take that entire block of code Godot’s tutorial tells you to put in _physics_process and comment it out – hit play again and see if you just got a whole bunch of FPS back.

The FPS you got back represents an estimate of how much you are losing on frame-by-frame Nav Agent controls with the default implementation – of course, the problem is now nothing is moving and you still need your Agents to move around and follow their paths to their targets.

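Leave that code commented out and add the following – first, add two class-level vars called currentNavArray and nextWaypoint – then modify the set_movement_target function:

# class-level variables (top of the script)
# start with an empty path so size() checks are safe before the first path arrives
var currentNavArray: PackedVector3Array = PackedVector3Array()
# set this to global_position in _ready() so the agent holds still until it has a path
var nextWaypoint: Vector3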

func set_movement_target(movement_target: Vector3):
  navigation_agent.set_target_position(movement_target)
  # give the agent a moment to repath itself
  await get_tree().create_timer(1.0).timeout
  # tell it to grab its next position
  navigation_agent.get_next_path_position()
  # grab the entire navigation path and store it
  # the path is simply an array of Vector3's
  currentNavArray = navigation_agent.get_current_navigation_path()
  # make sure it has at least one entry and grab the first
  if currentNavArray.size() > 0:
    nextWaypoint = currentNavArray[0]
  else:
    # otherwise leave at current location
    nextWaypoint = global_position

At this point, you have an array of the entire navigation path to the target position for this agent. You simply need to implement a very plain vanilla “go to next waypoint” system in your _process() that is much lighter-weight than what Godot’s tutorial had put in. Remove or comment out everything Godot’s tutorial put in _physics_process() and replace it with the below in either _physics_process() or _process() (this does not need to be in a _physics_process()):

if currentNavArray.size() < 2:
  # if you are almost at the end of your waypoints, 
  # don't get closer than 7 to keep a bit of distance
  if global_position.distance_to(nextWaypoint) > 7:
    global_position = global_position.move_toward(nextWaypoint,movement_speed*delta)
else:
  global_position = global_position.move_toward(nextWaypoint,movement_speed*delta)
  # if you are close to your current waypoint, get a next one
  if global_position.distance_to(nextWaypoint) < .05:
    nextWayPoint()

The above code eliminates multiple calls to the nav server to establish what the next Agent position should be in favor of a couple of move_toward() calls, which have much less overhead. It’s important to note that one of the things the Nav Agent takes care of is not actually moving the Nav Agent right on top of the target – in the above example, if you are on the last waypoint, it will keep the agent a distance of 7 from the destination to stop it from moving right on top of the goal. If you are not on the last waypoint and you are less than .05 distance to the waypoint, then you call nextWayPoint() to get the next waypoint.

The nextWayPoint function is actually very simple:

# rng is assumed to be a class-level RandomNumberGenerator,
# e.g. var rng = RandomNumberGenerator.new()
func nextWayPoint():
  if currentNavArray.size() > 1:
    currentNavArray.remove_at(0)
    nextWaypoint = currentNavArray[0]
    nextWaypoint.x += rng.randf_range(-1,1)
    nextWaypoint.z += rng.randf_range(-1,1)
  else:
    nextWaypoint = global_position

If you have more than one waypoint left in currentNavArray, remove position 0, pulling all the rest in the array forward, and then grab the new one at position 0. This is also an opportunity to modify that waypoint to add some randomness to movements and avoid a bunch of enemies forming a “conga line”. If you are already at the last position, just set nextWaypoint to global_position and the agent will go nowhere.
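If shifting the array on every waypoint ever shows up in profiling, one alternative (not what the original does) is to leave the path untouched and walk it with an index. The waypointIndex and advanceWaypoint() names below are hypothetical, and the movement code would need to compare waypointIndex against the array size instead of checking currentNavArray.size() < 2.

```gdscript
extends Node3D

var currentNavArray: PackedVector3Array = PackedVector3Array()
var nextWaypoint: Vector3
# hypothetical cursor into currentNavArray; reset to 0 whenever a new path is stored
var waypointIndex: int = 0
var rng = RandomNumberGenerator.new()

func advanceWaypoint():
  if waypointIndex < currentNavArray.size() - 1:
    # step the cursor forward instead of removing element 0
    waypointIndex += 1
    nextWaypoint = currentNavArray[waypointIndex]
    # same random nudge as above to avoid a "conga line"
    nextWaypoint.x += rng.randf_range(-1, 1)
    nextWaypoint.z += rng.randf_range(-1, 1)
  else:
    # already on the last waypoint: stay put
    nextWaypoint = global_position
```

For the short paths typical of a tower defense map the difference is likely small, so this is only worth doing if the simpler version proves to be a bottleneck.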

Presumably, you’ll have some events in your game that reset the target of the agent – when you call set_movement_target() with that new Vector3 target position, the process will simply repeat itself: you’ll call navigation_agent.set_target_position(), rebuild the currentNavArray, and move waypoint-waypoint-waypoint.

Conclusion

In my case, working on Elevation TD, each enemy is a Nav Agent so I had scenarios where there might be hundreds of Nav Agents at one time. For me, the above was a significant performance boost. I can easily see that the above agent code:

  1. Might not show significant performance gains in situations where there are very few agents
  2. Creates very “coarse grained” pathing movement – it’s probably not “reactive” enough if you were working on an FPS-style game – but for something like an overhead strategy or tower defense game, that coarseness might not matter (may even be desirable)
  3. Forces you to implement your own Agent behavior (like not getting closer than 7 to the destination) – some of that “overhead” we’re getting rid of handles things like agent avoidance and various other nuances of agent behavior. In my case, it didn’t matter – in other cases, it might, especially if you want very fine-grained movements.


Godot RenderingServer

This is part 1 of 4 in a series about some recent performance tuning I completed on Elevation TD.

Background

Elevation TD is a tower defense game in which everything in a level is instantiated dynamically – the landscape is built tile-by-tile, each one nudged around with materials applied at runtime to create a unique look each time you play – enemies are constructed at runtime from a library of “bodies” and “legs” to create visual diversity – same with towers and the objects thrown around like “shots” and even the small decorative plants and rocks. Each time you visit a level, it follows a general template to position things, but it never looks the exact same twice.

The drawback of all this on-the-fly construction is performance. Once you add up all the individual objects, you have thousands of Node3D’s wrapping around thousands of meshes, and the default workflow in Godot of “put a GLB in a Node3D and tweak its positioning and other characteristics” starts to not scale. This is where RenderingServer comes in.

Heads Up

If you are not comfortable coding, stop here. This isn’t an easy haul, but it can give you some solid performance boosts – in my case it was fairly significant, but not everyone has a game with thousands of distinct objects. If you don’t have a lot of distinct visual objects, this path might not be worth it.

When you have a Node3D that you drop in a scene, you can really spend a lot of time adjusting its look and feel, nesting all kinds of stuff inside it, and it all comes along for one happy ride. When you use the RenderingServer, you are basically writing a distinct mesh, and only the mesh, to the RenderingServer. You lose anything nested under it, and you lose its scale, rotation, and position, so you will need to reapply all those aspects of it in code. You also lose almost all ability to manage it in the Godot editor. That’s the bad news. The good news is that you’ll be writing that mesh and all its display instructions directly to Godot’s rendering server, so it’s much faster and lighter weight.

The Basics

Godot provides a good discussion around RenderingServer along with a sample implementation here:

https://docs.godotengine.org/en/stable/tutorials/performance/using_servers.html

This YouTube video is also a good walkthrough of the basics:

You should familiarize yourself with the RenderingServer API doc at Godot – you’ll need it to do much more than the basics:

https://docs.godotengine.org/en/stable/classes/class_renderingserver.html

A Simpler Example

Let’s start with the simple version where you instantiate a mesh in a position and you never touch it again, like the landscape and the small rocks and chunks of ice in the below:

At the class level:

var yourMesh

func _ready():
  yourMesh = <wherever you get your meshes from>

  #
  # Mesh Renderer Instance for ground decoration
  # We'll call the instance "tmpDecoInstance"
  # You should have a mesh stored in the variable
  # yourMesh at the class level
  # (decoPos and rng are assumed to be defined elsewhere in the script)
  #

  # first get your instance ID and scenario
  var tmpDecoInstance = RenderingServer.instance_create()
  var tmpScenario = get_world_3d().scenario

  # then set your scenario and base
  RenderingServer.instance_set_scenario(tmpDecoInstance, tmpScenario)
  RenderingServer.instance_set_base(tmpDecoInstance, yourMesh)

  # create your transform3d and set its origin up front
  var tmpxform = Transform3D.IDENTITY
  tmpxform.origin = decoPos

  # apply rotations as needed (in this case, randomized)
  # 90 degrees in radians = 1.5708 / 360 degrees = 6.28319 (TAU)
  tmpxform = tmpxform.rotated_local(Vector3.UP, rng.randf_range(0, TAU))

  # set scale
  var tmpScale = Vector3(1,1,1) # or whatever scale you need
  tmpxform = tmpxform.scaled_local(tmpScale)

  # set the instance to position at the transform
  RenderingServer.instance_set_transform(tmpDecoInstance, tmpxform)

This is basically the same example as is in the Godot docs except that I’m setting rotation, origin, and scale – why? When you load meshes via the Rendering Server, you will quickly discover what their actual scale and alignment are and it might not be what you think it is, especially if you didn’t create it yourself. This was a rude awakening for me as I had meshes from a variety of different sources – so a small flower was suddenly huge and sideways and a giant boulder was suddenly tiny. You either need to set the scale/rotation or resize/reorient them in Blender.

Once you’ve pulled the mesh out of whatever it was in, you might realize that you need to set its material:

RenderingServer.instance_geometry_set_material_override(tmpDecoInstance, yourMaterial.get_rid())

In case it’s not already clear, you’ll need to do this once per mesh you want to display – so if you have a character or structure that you “kit-bashed” together from multiple GLB’s, you will need to either combine all those meshes into one mesh or iterate over the above chunk of code once per mesh.
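The "iterate once per mesh" option could be sketched roughly as follows. The createParts() name and both arrays are assumptions for illustration; the script keeps a reference to each Mesh as well as each RID, since the Godot docs warn that the resource must stay referenced for the instance to keep rendering.

```gdscript
extends Node3D

# keep the Mesh references alive for as long as the instances exist
var partMeshes: Array[Mesh] = []
# keep the RIDs so the instances can be moved or freed later
var partInstances: Array[RID] = []

# meshes: the Mesh resources that make up one kit-bashed object (assumed input)
# xform: the shared transform to place every part at
func createParts(meshes: Array[Mesh], xform: Transform3D):
  var scenario = get_world_3d().scenario
  for mesh in meshes:
    var instance = RenderingServer.instance_create()
    RenderingServer.instance_set_scenario(instance, scenario)
    RenderingServer.instance_set_base(instance, mesh.get_rid())
    RenderingServer.instance_set_transform(instance, xform)
    partMeshes.append(mesh)
    partInstances.append(instance)
```

Using one shared transform assumes the parts were modeled around a common origin. If each part has its own offset, that offset would need to be multiplied into xform per part.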

Moving Things Around

You may have noticed that nowhere in the simple example did I add_child() anything – you can’t with Rendering Server – the mesh does not exist as a Node that you can add to anything. This should inspire you to ask how you manage it – move it around, make it rotate, etc… That’s done via the instance RID that you get from RenderingServer.instance_create().

To make it easier to move instances around, let’s separate the creation of the instance from the manipulation of the instance – you create the instance in your _ready and then you manipulate it via a func. In the below example, we’re representing a “shot” that is “fired” from an enemy to its target – for example, these trees throwing trees:

So first we create the instance in _ready, but we save the instance and the mesh at the class level:

var shotMesh
var shotInstance
var shotRotation = 0.0 # current rotation in radians
var shotRotSpeed = 0.1 # radians added per placeShot() call (example value, tune as needed)
var shotScale = Vector3.ONE
var shotRotDirection = Vector3.LEFT

func _ready():
  shotInstance = RenderingServer.instance_create()
  var scenario = get_world_3d().scenario
  RenderingServer.instance_set_scenario(shotInstance, scenario)
  shotMesh = <where ever you get your mesh from>
  RenderingServer.instance_set_base(shotInstance, shotMesh)
  # initial placement, far outside the play area
  placeShot(Vector3(10000,-10000,10000))

You’ll notice that looks very similar to the simple example, but stops halfway through and calls that “placeShot” function – placeShot, as the name suggests, places the shot where you want it to be along with handling rotation, scale, etc…:

func placeShot(position):
  # 90 degrees in radians = 1.5708 / 360 degrees = 6.28319
  shotRotation += shotRotSpeed
  if shotRotation > 6.28319: shotRotation = 0
  # create transform
  var xform = Transform3D.IDENTITY
  # set global position
  xform.origin = position
  # rotate as needed
  xform = xform.rotated_local(shotRotDirection, shotRotation)
  # set scale
  xform = xform.scaled_local(shotScale)
  # hand the finished transform to the rendering server
  RenderingServer.instance_set_transform(shotInstance, xform)

You’ll notice that “shotInstance” is leveraged as a pointer to the mesh that was instanced in the Rendering Server. To move the shot around, create a Transform3D representing its new position and orientation and then instance_set_transform the shotInstance to that Transform3D. Putting that all into a general function means you can just call the placeShot() function to position the visual wherever you need it, whenever you need it (i.e. from inside _process() more than likely).

A few things I will point out here:

  • https://docs.godotengine.org/en/stable/classes/class_transform3d.html – the Transform3D class is your friend – use it.
  • Note that each time placeShot is called you are basically building the Transform3D from scratch – every time I tried to manage the Transform or Basis persistently, I got erratic behavior from the RenderingServer. Once I made peace with re-establishing all positionality factors each time on a new temporary Transform3D, things worked a lot more consistently.
  • Because I’m not persistently managing the transform, I needed to keep track of how rotated the object needed to be to create a smooth rotation. Each time you create a new transform, the rotation is reset, so you need to be prepared to tell it how rotated you want it and its scale on each call (hence the shotRotation variable managed at the class level).
  • You’ll notice the first thing I set on the Transform is its origin (i.e. the coordinates you want it to appear at) – do this first. If you attempt to perform operations like Transform3D.looking_at() and then set the origin, it does not work correctly. Origin first, everything else second works the most consistently.
  • You’ll notice I’m using scaled_local and rotated_local – again, this produces the most consistent result over a large series of updates.

I use the word “consistently” several times in those bullet points. Perhaps the most annoying thing about working with the Rendering Server is that when it doesn’t like what you are doing, objects tend to just disappear. I iterated many times over changes that should have either worked fine or made a very small difference to the visual display only to have the mesh completely disappear with no errors in the error console. Remember, when you use Rendering Server, you are working outside the node tree, so you can’t even look at Remote and see what’s going on – you basically have nothing to fall back on except debug in your code. Once I got the above “recipe” in place, things worked pretty consistently.
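One more consequence of working outside the node tree is cleanup: an instance made with instance_create() is not owned by any node, so it has to be released with RenderingServer.free_rid(). A minimal sketch, assuming the shot's owner is a Node3D and using a built-in SphereMesh as a stand-in for the real shot mesh:

```gdscript
extends Node3D

var shotMesh: Mesh
var shotInstance: RID

func _ready():
  shotInstance = RenderingServer.instance_create()
  RenderingServer.instance_set_scenario(shotInstance, get_world_3d().scenario)
  # assumption: a SphereMesh stands in for wherever the real mesh comes from
  shotMesh = SphereMesh.new()
  RenderingServer.instance_set_base(shotInstance, shotMesh.get_rid())

func _exit_tree():
  # the instance is not a node, so freeing this node does not free it
  if shotInstance.is_valid():
    RenderingServer.free_rid(shotInstance)
    shotInstance = RID()
```

Without this, a shot whose owner is removed can stay on screen, with nothing in the Remote tree to explain why.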

What’s Your Mesh?

It’s worth taking a moment to remember that a GLB isn’t a “mesh”, it’s a collection of things, one of which is a mesh. If you instantiate a GLB, grab the first child and then you can get the mesh. There are a lot of examples out there (including Godot’s tutorial) where they just load(Path-to-Mesh) into a variable, but that only works if you actually have a literal Mesh – if you do that with a GLB or FBX, it won’t work right – it seems like you need to instantiate it:

var selectedMesh = arrayOfGLBS.pick_random().instantiate()
var yourMesh = selectedMesh.get_child(0).mesh
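Grabbing get_child(0) works when the mesh is the first child, but imported scenes can nest their MeshInstance3D nodes deeper or contain several of them. As a hedged sketch, the hypothetical helpers below walk the instantiated scene and collect every mesh they find, then free the temporary nodes.

```gdscript
extends Node3D

# recursively gather the Mesh resource from every MeshInstance3D under root
func collectMeshes(root: Node) -> Array[Mesh]:
  var found: Array[Mesh] = []
  var meshNode := root as MeshInstance3D
  if meshNode != null and meshNode.mesh != null:
    found.append(meshNode.mesh)
  for child in root.get_children():
    found.append_array(collectMeshes(child))
  return found

# glb: a PackedScene loaded from a GLB file (assumed input)
func meshesFromScene(glb: PackedScene) -> Array[Mesh]:
  var sceneRoot = glb.instantiate()
  var meshes = collectMeshes(sceneRoot)
  # the meshes are Resources and stay alive through the returned array,
  # so the temporary node tree can be freed right away
  sceneRoot.free()
  return meshes
```

This ignores the transforms of the nodes inside the GLB, which is consistent with the earlier warning that scale and rotation have to be reapplied in code.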

Summary

In my case, with dozens of concurrent shots, hundreds of enemies, and many hundreds of distinct landscaping elements, using the RenderingServer like above resulted in a significant improvement of FPS (something in the range of 20-40 FPS recovered). In the next post, we’ll explore optimizing Godot’s Nav Agent…
