The garbage collector is said to manage the application memory, which means the programming language is managed. The main advantage of managed languages is that developers do not have to reason about object lifetimes and free objects manually. Forgetting to free objects leaks memory, and premature freeing results in dangling pointers.
V8 and Blink use mark-sweep-compact garbage collectors where a single garbage-collection cycle consists of three phases: marking, where live objects are identified; sweeping, where dead objects are released; and compaction, where live objects are relocated to reduce memory fragmentation. During marking, the garbage collector finds all objects reachable from a defined set of root references, conceptually traversing an object graph, where the nodes of the graph are objects and the edges are fields of objects.
Cross-component references express liveness over component boundaries and have to be modeled explicitly in the graph. The simplest way to manage those references is by treating them as roots into the corresponding component. In other words, references from Blink to V8 would be treated as roots in V8 and vice versa. This creates the problem of reference cycles across components, which is analogous to regular reference cycles1 within a single garbage-collection system, where objects form groups of strongly connected components that are otherwise unreachable from the live object graph.
Cycles require either manual breaking through the use of weak references or the use of some managed system able to infer liveness by inspecting the system as a whole. Manually breaking a cycle is not always an option because the semantics of the involved objects may require all their referents to stay alive through strong references. Another option would be to restrict the involved components in such a way that cycles cannot be constructed. Note that in the case of Chrome and the Web this is not always possible, as shown later.
While the cycle problem can be avoided by unifying the memory-management systems of two components, it may still be desirable to manage the memory of the two components independently to preserve separation of concerns, since it is simpler to reuse a component in another system if there are fewer dependencies. For example, V8 is used not only in Chrome, but also in the Node.js server-side runtime, making it undesirable to add Blink-specific knowledge to V8.
Cross-component tracing enables efficient, effective, and safe garbage collection across component boundaries.
Assuming the components cannot be unified, the cross-component reference cycles can lead to either memory leaks when graphs involving cycles cannot be reclaimed by the components' garbage collectors, heavily impacting browser performance, or premature collection of objects resulting in use-after-free security vulnerabilities and program crashes that put users at risk.
This article describes an approach called cross-component tracing (CCT),3 which is implemented in V8 and Blink to solve the problem of memory management across component boundaries. Cross-component tracing also integrates nicely with existing tooling infrastructure and improves the debugging capabilities of Chrome Dev-Tools (https://developers.google.com/web/tools/chrome-devtools/).
Figure 1 shows an example that creates a temporary object, a loading bar (
loadingBar), that is then replaced by actual content (
loading-Bar object alive. Treating such references as uniformly weak would result in reclamation of the body and the div elements by the V8 garbage collector, which would leave behind dangling pointers for Blink.
We propose CCT as a way to tackle the general problem of reference cycles across component boundaries. For CCT, the garbage collectors of all involved components are extended to allow tracing into a different component, managing objects of potentially different programming languages. CCT uses the garbage collector of one component as the master tracer to compute the full transitive closure of live objects to break cycles.
Other components assist by providing a remote tracer that can traverse the objects of the component when requested by the master tracer. The system can then be treated as one managed heap. As a consequence, the simple algorithm of CCT can be extended to allow moving collectors and incremental or concurrent marking as needed by just following existing garbage collection principles.8 The pseudocode of the master and remote tracer algorithms is available in our full research article.3
Chrome strives to provide smooth user experiences, updating the screen at 60fps (frames per second), leaving V8 and Blink around 16.6 milliseconds to render a frame. Since marking large heaps may take hundreds of milliseconds, both V8 and Blink employ a technique called incremental marking, which means that marking is divided into steps during which objects are marked for only a small amount of time (for example, 1ms).
The application is free to change object references between the steps. This means that the application may hide a reference to an unmarked object in an already-marked object, which would result in premature collection of a live object. Incremental marking requires a garbage collector to keep the marking state consistent by preserving the strong tri-color-marking invariant.8 This invariant states that fully marked objects are allowed to point only to objects that are also fully marked or stashed somewhere for processing. V8 and Blink preserve the marking invariant using a conservative Dijkstra-style write barrier6 that ensures that writing a value into an object also marks the value. In fact, V8 even provides concurrent marking on a background thread this way while relying on incremental tracing in Blink.5
loadingBar, in this example) are reclaimed by the garbage collector. Note that from V8's point of view, there is no difference between the div elements
loadingBar, and only CCT makes it clear which object can be reclaimed by V8's garbage collector. Once the unreachable V8 object is gone, any subsequent garbage collections in Blink will not see a root for the corresponding
HTMLDivElement and reclaim the other half of the wrapper-wrappable pair.
Incremental CCT as implemented today in Chrome eliminates those problems by providing a much better approximation by computing liveness of objects through reachability and by enabling incremental processing. The detailed performance analysis can be found in the main research paper.3 We are currently working on concurrent marking of the Blink C++ heap and on integrating CCT into such a scheme.
fetchContent function from Figure 1 keeps, perhaps because of a bug, an internal reference to the provided callback, as shown in Figure 4.
Without knowing the implementation of the
fetchContent function, a Web developer observes that the
loadingBar element from the previous example is not reclaimed by the garbage collector. Can debugging tools help track down why the element is leaking?
loadingBar element. The path shows that the leaking DOM element is captured by the
loadingBar variable in the environment (called context in V8) of an anonymous closure, which is retained by the
internal-State field of the
Reclaiming Memory in Other Heterogeneous Systems
More interestingly, though, we are not aware of other sophisticated systems integrating VMs that provide cross-component memory management. While VMs often provide bridges for integration in other systems, such as Java Native Interface (JNI) and NativeScript, cross-component references require manual management in all of them. Developers using those systems must manually create and destroy links that can form cycles. This is error prone and can lead to the aforementioned problems.
Note, however, that CCT comes with significant implementation overhead, as it requires implementations of tracers in each component. Ultimately, implementers need to weigh the effort of either avoiding cycles by enforcing restrictions on their systems or implementing a mechanism to reclaim cycles, such as CCT. Chrome was already equipped with garbage collectors in V8 and Blink, and thus we chose to implement a generic solution such as CCT that allows the systems on top to stay as flexible as needed.
Idle-Time Garbage-Collection Scheduling
Ulan Degenbaev et al.
Real-time Garbage Collection
David F. Bacon
1. Bacon, D.F. and Rajan, V.T. Concurrent cycle collection in reference counted systems. In Proceedings of the 15th European Conf. Object-Oriented Programming. Springer-Verlag, London, U.K., 2001, 207–235; https://doi.org/10.1007/3-540-45337-7_12.
2. Chambers, C., Ungar, D. and Lee, E. An efficient implementation of SELF, a dynamically-typed object-oriented language based on prototypes. In Proceedings of the Conf. Object-Oriented Programming Systems, Languages and Applications. ACM SIGPLAN, 1989, 49–70; https://dl.acm.org/citation.cfm?doid=74877.74884.
3. Degenbaev, U. et al. Cross-component garbage collection. In Proceedings of the ACM on Programming Languages 2, OOPSLA Article 151, 2018; https://dl.acm.org/citation.cfm?doid=3288538.3276521.
4. Degenbaev, U., Filippov, A., Lippautz, M. and Payer, H. Tracing from JS to the DOM and back again. V8, 2018; https://v8.dev/blog/tracing-js-dom.
5. Degenbaev, U., Lippautz, M. and Payer, H. Concurrent marking in V8. V8, 2018; https://v8.dev/blog/concurrent-marking.
6. Dijkstra, E.W., Lamport, L., Martin, A.J., Scholten, C.S. and Steffens, E.F.M. On-the-fly garbage collection: An exercise in cooperation. Commun. ACM 21, 11 (Nov. 1978), 966–975; https://dl.acm.org/citation.cfm?doid=359642.359655.
7. Hablich, M. and Payer, H. Lessons learned from the memory roadshow; https://bit.ly/2018-memory-roadshow.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2019 ACM, Inc.