Programmers are all too familiar with the time-consuming work of analyzing the crash and error reports they receive during new feature development, game testing, or released play. Understanding what happened and whether it matters takes a lot of back and forth between the programmer and the player who reported the issue. The Backtrace Error Management Platform was created to automate the collection and analysis of crash and exception reports, cutting the time for these tasks in half so programmers can get to work fixing gameplay-impacting issues more quickly.
Let’s explore how Backtrace helps teams find, analyze, and solve crashes and errors.
Determining what happened
The first thing programmers or managers who are triaging issues need to determine is which component broke and under what conditions. This requires automating the capture of crashes or exceptions, including a full stack trace with contextual information such as environment variables, system information, custom metadata, log entries, and attached screenshots or videos. The most difficult-to-diagnose problems often require a more sophisticated analysis of the system runtime to understand details such as the state of all running threads, which modules were loaded, and which versions of critical device drivers were in use.
Backtrace provides this through its capture libraries, which can be embedded in games for many types of platforms, including consoles like PS4, Xbox One, and Nintendo Switch, streaming services like Stadia, desktops like PC or Mac, mobile devices with iOS and Android, and web platforms with WebGL. Backtrace also supports back-end platforms such as MultiPlay and technologies like Docker containers and Kubernetes, making collection a configurable option on those platforms.
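To make the idea of a crash report concrete, here is a minimal Python sketch of capturing an exception together with its stack trace and surrounding context. It is not the Backtrace SDK or its report format: the function `build_crash_report`, the field names, and the example `load_level` function are all assumptions made for illustration.

```python
import os
import platform
import sys
import threading
import traceback
from datetime import datetime, timezone


def build_crash_report(exc, custom_attributes=None, env_keys=("PATH",)):
    """Collect a stack trace plus context for one exception (hypothetical layout)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "error_type": type(exc).__name__,
        "message": str(exc),
        # Full stack trace of the exception as a list of formatted strings.
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
        # Basic system information.
        "system": {
            "os": platform.system(),
            "os_version": platform.release(),
            "machine": platform.machine(),
        },
        # Only an allow-list of environment variables, to avoid leaking secrets.
        "environment": {k: os.environ[k] for k in env_keys if k in os.environ},
        # Names of all threads alive at capture time.
        "threads": [t.name for t in threading.enumerate()],
        # Names of the modules loaded at capture time.
        "loaded_modules": sorted(sys.modules.keys()),
        # Custom metadata supplied by the game code.
        "attributes": dict(custom_attributes or {}),
    }


def load_level(name):
    # Hypothetical game function that fails.
    raise ValueError(f"missing asset bundle for level {name!r}")


try:
    load_level("harbor")
except ValueError as exc:
    report = build_crash_report(exc, {"build": "1.4.2", "scene": "harbor"})
```

A real capture library would also handle native crashes, attach logs or screenshots, and upload the report; the sketch only shows what kind of context travels with the stack trace.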
Does it matter?
The obvious way to prioritize crashes is to sort them by how frequently they occur. That’s the right place to start, but there are other things to keep in mind:
- Some crashes aren’t going to be fixable – e.g., a PC crash could be caused by an outdated graphics driver, or game crashes might be caused by exploits or antivirus software.
- Some issues are already under investigation, and you need to make sure you have some way of communicating that – e.g., through a link to your bug tracking system.
- Some crashes have a very low user impact – e.g., crashes after the main window has shut down.
- Some issues have a higher impact than others – e.g., regressions and server crashes often fall into this category.
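The considerations above can be sketched as a simple scoring function: start from frequency, then scale the score down or up. The `CrashGroup` fields, the weights, and the ticket ID `BUG-123` are hypothetical values chosen for illustration, not something Backtrace prescribes.

```python
from dataclasses import dataclass


@dataclass
class CrashGroup:
    fingerprint: str
    count: int                    # occurrences in the time window
    fixable: bool = True          # False for e.g. outdated-driver or antivirus crashes
    ticket: str = ""              # bug tracker reference, if already under investigation
    after_shutdown: bool = False  # crash happened after the main window closed
    is_regression: bool = False
    is_server: bool = False


def priority(group):
    """Start from frequency, then adjust for the considerations listed above."""
    score = float(group.count)
    if not group.fixable:
        score *= 0.1   # little the team can do about it
    if group.ticket:
        score *= 0.5   # someone is already looking at it
    if group.after_shutdown:
        score *= 0.2   # very low user impact
    if group.is_regression or group.is_server:
        score *= 3.0   # high impact
    return score


groups = [
    CrashGroup("gpu-driver-old", count=900, fixable=False),
    CrashGroup("exit-teardown", count=400, after_shutdown=True),
    CrashGroup("matchmaking-null", count=150, is_server=True),
    CrashGroup("save-corrupt", count=120, is_regression=True, ticket="BUG-123"),
]
ranked = sorted(groups, key=priority, reverse=True)
```

With these example weights, the server crash ranks first even though it is far from the most frequent, and the two high-volume but low-value groups drop to the bottom.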
Backtrace provides teams with tools to better understand what’s most important now:
- View and Manage by State – Use filter shortcuts like Open, In Progress, Muted, or Resolved to view fingerprints that are relevant to your activity in the Triage view.
- Regression Detection – Programmers can set criteria such as “Resolved Until seen in version x.y.z or greater” so the system reopens issues if they reappear.
- Collaboration and Integration – User assignment, linking to issue tracking systems, comments, and tags can all help you to understand the state of things under investigation.
- Tag crashes by team – Apply regex rules to tag crashes as belonging to one or more categories, such as rendering, networking, or physics.
- Flame graphs – These visual tools give programmers an overall picture of where problems are happening. Teams find them useful for visualizing the scale of different types of crashes.
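Two of the tools above, regression detection and regex-based tagging, are easy to illustrate in a few lines. The following Python sketch shows the underlying logic only; the rule table, the patterns, and the function names are assumptions and do not reflect how Backtrace configures or implements these features.

```python
import re

# Hypothetical tagging rules: tag name -> pattern matched against stack frame names.
TAG_RULES = {
    "rendering": re.compile(r"Render|Shader|Gfx"),
    "networking": re.compile(r"Socket|Net|Http"),
    "physics": re.compile(r"Physics|Collider|Rigidbody"),
}


def tags_for(frames):
    """Return every tag whose pattern matches at least one stack frame."""
    return sorted(
        tag for tag, pattern in TAG_RULES.items()
        if any(pattern.search(frame) for frame in frames)
    )


def parse_version(text):
    """Turn 'x.y.z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def should_reopen(resolved_until, seen_in):
    """Reopen a resolved issue if it is seen in that version or a later one."""
    return parse_version(seen_in) >= parse_version(resolved_until)
```

For example, `tags_for(["NetClient.Send", "PhysicsWorld.Step"])` yields both the networking and physics tags, and `should_reopen("1.10.0", "1.9.3")` is false because versions are compared numerically rather than as strings.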
How can it be fixed?
To perform sophisticated debugging analysis and reporting, game developers need to ask questions like the following:
- What patterns of bits are set in a faulting address?
- What is the distribution of process uptime?
- What is the highest memory consumer in a snapshot?
- What are the unique values of an attribute where a group matches a regular expression?
- What is a histogram of occurrences for a set of traces of a certain age by group?
To enable this, all process statistics need to be exposed, from interrupts to memory usage and open descriptors. Programmers must be free to add their own attributes in an ad hoc fashion, and the system needs to provide flexible aggregation and analysis across hundreds of dimensions.
These types of analytic workloads (large compressible data sets, queries that require full scans and materialize a small subset of columns) are well suited to columnar databases. The Backtrace system was designed for this purpose, and it allows some of the largest gaming studios in the world to get real-time feedback while performing robust fold operations and introspection into value distribution.
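A few of the questions listed above can be expressed over a small in-memory data set. The sketch below uses plain Python rather than Backtrace's query interface; the records, the attribute names (`fault_addr`, `uptime_s`, `gpu`), and the bucket edges are made up for illustration.

```python
import re

# Hypothetical crash records; attribute names are illustrative.
crashes = [
    {"group": "render/ShaderCache", "fault_addr": 0x08, "uptime_s": 12, "gpu": "A"},
    {"group": "render/DrawMesh", "fault_addr": 0xDEADBEEF00000010, "uptime_s": 340, "gpu": "B"},
    {"group": "net/Socket", "fault_addr": 0x00, "uptime_s": 45, "gpu": "A"},
    {"group": "render/ShaderCache", "fault_addr": 0x18, "uptime_s": 3700, "gpu": "A"},
]


def set_bits(address):
    """Positions of the bits set in an address (bit 0 is least significant)."""
    return [i for i in range(address.bit_length()) if (address >> i) & 1]


def uptime_histogram(records, edges=(60, 3600)):
    """Count crashes per uptime bucket; edges are upper bounds in seconds."""
    labels = [f"<{e}s" for e in edges] + [f">={edges[-1]}s"]
    hist = {label: 0 for label in labels}
    for record in records:
        for edge, label in zip(edges, labels):
            if record["uptime_s"] < edge:
                hist[label] += 1
                break
        else:
            # No edge was larger than the uptime: it belongs in the last bucket.
            hist[labels[-1]] += 1
    return hist


def unique_values(records, attribute, group_pattern):
    """Unique values of an attribute among records whose group matches a regex."""
    rx = re.compile(group_pattern)
    return sorted({r[attribute] for r in records if rx.search(r["group"])})
```

Here `set_bits(0x18)` returns `[3, 4]`, which hints at a small offset from a null pointer, and `unique_values(crashes, "gpu", r"^render/")` lists the GPUs seen in rendering crashes.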
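As a rough illustration of why a column-oriented layout suits this workload, the sketch below stores each attribute in its own array, folds over a single column without touching the others, and run-length encodes a repetitive column. It is a toy model under assumed attribute names, not a description of Backtrace's storage engine.

```python
from array import array
from collections import Counter

# Column-oriented layout: one sequence per attribute (names are illustrative).
columns = {
    "uptime_s": array("I", [12, 340, 45, 3700, 45]),
    "memory_mb": array("I", [512, 2048, 768, 4096, 640]),
    "version": ["1.4.1", "1.4.2", "1.4.2", "1.4.2", "1.4.1"],
}


def fold(column, step, initial):
    """Reduce one column to a single value; other columns are never read."""
    acc = initial
    for value in column:
        acc = step(acc, value)
    return acc


def run_length_encode(values):
    """Compress runs of repeated values, which similar column data often contains."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


max_memory = fold(columns["memory_mb"], max, 0)
total_uptime = fold(columns["uptime_s"], lambda acc, v: acc + v, 0)
version_distribution = Counter(columns["version"])
encoded_versions = run_length_encode(columns["version"])
```

Each query reads only the column it needs, and the value distribution of `version` falls out of a single scan, which is the access pattern the paragraph above describes.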
To gain more actionable insights, Backtrace also provides a knowledge base of instruction sets and languages, as well as a rule system that allows the definition of static analysis rules on functions. The knowledge base consists of constraints for the underlying instruction set including alignment, permissions, storage requirements, and flags. This provides teams with insights such as:
- Identification of heap corruption through forensic analysis of allocator data structures and pointers
- Stale pointer detection that sweeps across all pointers and performs heap and type analysis to detect inconsistencies
- Malware detection that scans executable code and memory of code paths associated with faulting regions to detect ROP chains
With Backtrace integrated into the game development lifecycle, game programmers can cut the time they spend solving crashes and errors in half, since the system helps detect and resolve the most difficult-to-diagnose issues. Backtrace is a Unity Verified Solutions Partner, which means that Unity has verified that its SDK is optimized for the latest version of the Unity Editor and provides a seamless experience for Unity developers. To sign up for a free trial or to learn more about the solution, visit https://backtrace.io/for/unity/.