Firstly, hello! (As with many of you, I'm sure, I've been busy - lots of ajax-enabled/web 2.0 apps to build out there.)
OK, all jokes aside - a serious question for the local JS gurus.
I have run into some issues with IE (and only IE,) where the browser's DOM handling from within Javascript appears to degrade in a linear fashion as I add nodes and create JS objects.
In most cases, this may be expected behaviour - but I am taking steps to keep the load light in terms of rendering performance and DOM load (eg. number of elements) - plus this is an IE-only issue (Firefox does not exhibit this behaviour,) so I suspect something particular to IE's JS/rendering engine.
I'm building a fairly heavy page that uses "pages" of content (Javascript-driven so you stay on the same page,) where new elements and JS objects are being created to track the properties and events relating to these elements. For conceptual reference, it's like a tabular spreadsheet-type sort of application.
Each "page" of data is created and rendered on-the-fly (eg. I'm creating a bunch of elements as efficiently as possible, creating JS objects for these, attaching objects->elements and finally appending to the document.)
I have noticed that this process (preparing data -> creating elements -> creating objects -> linking objects+elements -> append to body) takes ~500 ms more for *each* successive new page in IE (CPU use is 100% during this time,) even though I'm actively hiding/removing previously-created (non-displayed) pages as I go.
Regarding efficiency in moving between existing pages: when you view page 2 for example, page 1 is removed from the DOM (but stored in a JS object for when you go back to that page, at which point it is re-inserted.) Obviously I also need to maintain state of changes that may have been made, so it's simplest to let the Javascript objects remain in memory rather than destroy them and try to dynamically re-create, etc.
I was previously doing a simple "show/hide" of page content after it had been written to the document, but figured this was perhaps the cause of the growing slowness in IE (more elements = slower.) Unfortunately, even now that I'm swapping out these nodes, it does not seem to make a difference. I tried a few other tricks like setting document.body.style.display to 'none', to see if it was perhaps related to rendering - that didn't seem to help.
It appears that several DOM methods like getElementsByTagName() are slowed down here (and are used extensively in creating a new page,) but others such as cloneNode() seem to be relatively unaffected. I suspect the DOM may be "flooded" by a large number of elements and therefore slow (having to parse a very large tree,) even though I am actively removing "hidden" pages so that only one is actually appended to the document body and visible at any given time, and am also scoping the getElementsByTagName() call to the smallest "branch" to minimize the number of nodes involved.
The degradation seems to be quite linear as well, as I am almost always creating the same number of new elements and objects. It appears to be tied directly to the total amount of "stuff" present within the document or JS environment - unfortunately, I have not yet determined whether one or the other is specifically responsible.
Navigating away from, unloading, or reloading the page solves this issue, so it is not like a memory leak which "persists" across pages; it happens within a single page view, and only after a large number of elements + JS objects have been created.
Has anyone else seen this kind of issue, or am I just crazy? (Then again, this is the Asylum after all.)
My 2 cents - and I only vaguely remember these situations, since I haven't done JS for a long time - are:
It is not as simple as "many elements = less speed".
It comes down to some memory management issue me thinks, and there has to be a js way to prevent it.
This reminds me of how painfully slow it was to "inject" one new layer at a time into the innerHTML of a document, as opposed to how fast it was to rebuild the whole innerHTML and then inject that.
It also reminds me of Liorean mentioning something about DOM traversal..
Are all nodes always created on the fly? Because this would sound bad to me.
Some pre-coffee brainstorming: JS in IE doesn't like "on-the-fly" instantiation added to large DOM traversal.
It would "love" pre-baked references to instances instead of "100%" on the fly.
Do I make sense? My.. I need that coffee now.
Wait, yeah:
quote:
preparing data -> creating elements -> creating objects -> linking objects+elements -> append to body
When you build one page, do you append the nodes to the body as you create them?
Have you tried creating a documentFragment to do all your magic in, and then appending it to the BODY? That way you'll trigger a single big reflow instead of a lot of small reflows. Depending on how you touch/alter the pages you could even store their documentFragment to restore it afterwards.
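Roughly what I have in mind - buildItemElement(), itemData and pageContainer are just stand-in names for whatever you actually use:
code:
// Build the whole "page" off-DOM, then append it in one shot.
var frag = document.createDocumentFragment();
for (var i = 0; i < itemData.length; i++) {
  frag.appendChild(buildItemElement(itemData[i]));
}
// One appendChild = one big reflow, instead of one reflow per item.
document.getElementById('pageContainer').appendChild(frag);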
Thanks for the replies! (_Mauro, you are/were InI? )
To clarify, the time it takes to do all that stuff is *increasing* by 500 msec each time. It takes <1 second to do in Firefox, 1.5 seconds perhaps in IE the first time... and every subsequent page is +500 msec after that in IE (eg. 2 seconds for page 2, 2.5 seconds for page 3, etc.)
I am cloning an offline (referenced within JS, but not appended to document.body) collection of nodes, doing all the work and finally doing a single append to the body (as opposed to appending each node individually - just the parent that holds them all) - which seems to be pretty fast as you mentioned, so the browser only reflows once (similar to doing .innerHTML, logically.)
From what I've found, it seems that it's "walking the DOM" that is being slowed the most. Assuming this is related to the number of elements in the document (and even then, it seems odd given I'm trying to control scope etc.), it's a bit disturbing that even "offline" elements appear to be considered at this point.
I was looking into documentFragment ideas also, and I'm going to investigate this further. The idea of separating "offline" nodes further from the local document's DOM sounds crazy enough that it might just work. I wonder if there's any difference between a JS reference to a detached node, like someNode = document.body.removeChild(div);, and someDocumentFragment.someNode, though.
Haven't personally run into this but my JS scripts etc usually fall well under some constraints of IE.
However, I was reading this thread @ Slashdot, where there are some interesting comments about this - particularly this one.
hm... call me uncool, but when I last did such an application, I assigned myContentDiv.innerHTML and let the browser's C code worry about building nodes out of it... this may or may not be faster than building your own nodes in a lot of javascript calls, depending on how good a job the browser does of it.
So... are you in a position to make such a comparison easily?
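Something along these lines, just to illustrate - itemData, label and pageContainer are made-up names:
code:
// Concatenate the markup for the whole page, then hand it to the parser once.
var html = [];
for (var i = 0; i < itemData.length; i++) {
  html.push('<div class="item">' + itemData[i].label + '</div>');
}
document.getElementById('pageContainer').innerHTML = html.join('');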
Actually, using innerHTML should be a little faster than DOM traversal.
And yep, Scott, I am the guy formerly known as InI. How did you guess, since you didn't seem to know?
I'm not trying to analyse the inner workings of IE, more just hoping to determine the cause/find a workaround for what appears to be a browser-specific problem with DOM traversal/manipulation. It seems to be a weird quirk, possibly triggered by a small bug or pattern I'm unaware of or missing. I figured the Asylum would be a good place to look, since lots of the people here have tons of experience with browser quirks.
To re-iterate and in an attempt to summarize: it seems that DOM methods like getElementsByTagName() take longer and longer as more nodes/objects are created within the browser, despite attempts to minimize the effects of non-active nodes and minimize the number of nodes being searched.
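For illustration, this is what I mean by scoping the call - pageNode standing in for the root element of the currently-visible page:
code:
// Searches only the subtree under the current page...
var cells = pageNode.getElementsByTagName('td');
// ...as opposed to walking every node in the document:
var allCells = document.getElementsByTagName('td');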
I might do that. The problem is the code is quite involved, and I've been trying to find a "magic bullet" solution based on some suspicions around IE's known performance issues. If I decide to try simplifying the code to a test case and can reproduce it, I will post a link here.
The speed of innerHTML relies a lot on exactly what changes you are making. In some cases, for some types of changes, it's indeed faster. Not so for every case, however. In fact, there are plenty of cases where innerHTML is much slower too.
I also happen to think innerHTML is "malpractice", as it is not object-oriented, and it's dirty and ugly (I used it a lot though).
And most benchmarks suck unless you make them for yourself.
Huh?
InnerHTML is *the* way to go if you're loading parts of HTML from any server side scripting language... or do you think returning javascript that's then eval()ed and builds a DOM tree is cleaner than just returning regular HTML for innerHTML?
(now... building HTML in javascript, we can argue about, but if you're just including some server side output... innerHTML. Every time.)
I just mean that innerHTML makes for "hacks". As I said, I used it a lot because I am comfortable with quirky code, but writing reusable code using innerHTML? The next geek will have to be an Asylumnite to even try and read your code.
All HTML elements can be treated as objects through DOM methods like createElement, appendChild, setAttribute and getAttribute, and these are the methods that make for clean content rewriting.
In school, we are strongly encouraged to avoid hacks and foster readability/reusability: sparing the next code monkey the hassle of reverse-engineering and reading my code has started to matter.
This said, plain text processing through innerHTML still is very fast.
And non-standard, as bit pointed out.
In my experience, innerHTML is blazing fast when you have to insert a zillion similar tags. But as said before, it's not standard... yet. It also has the bad habit of screwing up the events attached programmatically to the sibling nodes (when doing element.innerHTML += tagSoup;). And the reflow-count argument doesn't stand, since using DOM methods you can create a documentFragment and thus have only one reflow.
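To illustrate that last gotcha (container is just a stand-in name):
code:
// innerHTML += serializes, concatenates and re-parses the container,
// so its children are recreated and any handlers attached via script are lost:
container.innerHTML += '<div>new item</div>';
// Appending a node (or a documentFragment) leaves the existing children untouched:
var div = document.createElement('div');
div.appendChild(document.createTextNode('new item'));
container.appendChild(div);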
Writing to innerHTML doesn't work under true XHTML standards mode (eg. XHTML markup served with the application/xhtml+xml MIME type), so I see using standard DOM methods as a more future-proof method. (Argue value of standards-based JS and future-proof value here, etc.)
In my case and as Poi pointed out, I am appending a single node with a large structure via appendChild() only once, so the behaviour (reflow etc. on the browser's part) should be the same as .innerHTML.
My problem doesn't really seem to be that innerHTML or DOM is slow in "writing out" nodes; it's more that DOM methods are becoming slower over time. The degradation I'm seeing seems to relate to the javascript objects I'm creating, not so much to the complexity of the markup/DOM structure from what I can tell - eg. several arrays of ~100 objects each, defined via object literal notation and constructors, are created for each "page" of content I'm parsing/rendering. (You can browse multiple "pages" within one view, next/previous-style.)
It seems like document.getElementsByTagName() gets slower as more pages are loaded, even when parsing "offline" nodes such as XML returned to an XMLHttpRequest object.
Some of the reading I've done seems to indicate that even though I may be actively removing elements from the DOM and retaining the nodes in Javascript, IE still seems to be reflecting the "weight" of those elements / objects. Because the behaviour seems to be linear and pretty consistent (each new page takes 500 msec more to get/display,) it sounds like a simple sort of "looping through a growing collection of objects takes longer" problem.
If I have time and can isolate the behaviour to a minimal demo/example, I'll post a URL.
As in memory leaks in an environment without garbage collection, you mean? Have you tried explicitly "destroying" the elements that are not in use through some DOM method - something like removeChild()?
If you assign a new value to such a node without explicit deletion in between, you may be bloating up the IE/JS engine's memory, and this can indeed cause slowdowns. That memory may have no garbage collection mechanism whatsoever and fully rely on "lightweight scripts" and the user changing page to reset the memory state (which would suck).
_Mauro: I'm swapping out pages and storing them "offscreen" in a Javascript object using replaceChild(). Thus, an offscreen page array contains a growing number of nodes which are then swapped back into the viewable area as needed (one page at a time.)
I've tried calling window.CollectGarbage() after swapping DOM nodes to see if it would matter (it didn't) - so I suspect I may have to more actively delete objects.
It would be nice to retain the off-screen objects (or at least, their states,) however - ie., not actively destroying the objects/data - because then I would have to re-fetch/create/initialize those objects depending on how much I destroyed. However, if this works around the problem for IE, maybe the "destroy-and-recreate" route is what I'll have to do until the cause of the slowdown is isolated.
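For reference, the swap itself is basically this - holder and pages[] are illustrative names, with pages[] holding the detached page nodes:
code:
function swapPage(holder, pages, currentIndex, newIndex) {
  var current = pages[currentIndex];              // the node currently in the document
  holder.replaceChild(pages[newIndex], current);  // detach it, insert the stored page
  // 'current' stays referenced in pages[currentIndex], ready to be swapped back in later
}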
CollectGarbage just calls on the JScript garbage collector, so COM objects won't be affected. It's mostly useful if you create enormous arrays or container objects containing huge numbers of strings or numbers, and always change existing properties instead of creating new variables, objects and properties. This because of the way the JScript garbage collection is triggered - it's triggered by threshold numbers of different types of allocations. See <uri:http://blogs.msdn.com/ericlippert/archive/2003/09/17/53038.aspx> for more on that. (I think he lists the numbers for the thresholds in the comments of another posting, but I can't remember which and can't be arsed to search it out... It's not that important for your situation anyways.)
I don't know how IE keeps track of nodes belonging to a certain document but not in the tree of that document. It might be that it uses filters on the entire set of nodes, in the document or otherwise, to build its collections. If it in fact does keep track of all nodes, then it might be that storing the innerHTML of the nodes instead of the actual nodes could work. (i.e. Destroy the nodes, but store their data...)
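A rough sketch of that idea, with pageHtml and holder as illustrative names:
code:
var pageHtml = [];  // serialized markup per page
function storeAndDestroyPage(holder, index) {
  pageHtml[index] = holder.innerHTML;  // keep the data...
  holder.innerHTML = '';               // ...but throw away the actual node tree
}
function restorePage(holder, index) {
  holder.innerHTML = pageHtml[index];  // let the parser rebuild the nodes
  // note: any handlers attached via script would have to be re-attached here
}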
Wow, a month has passed since I originally posted this.
OK, I've made some progress in this area. From what I have found, it seems that DOM (and possibly overall JS engine?) performance under IE (and perhaps Safari to an extent) is affected by a growing number of Javascript objects - but not under Firefox. It does not appear to be solely DOM references (though assigning event handlers can contribute some "weight") - it is mostly just data inside of objects.
I'm creating objects which contain DOM references, arrays and other data, and as more objects are created, DOM performance (eg. looping through an array created as the result of getElementsByTagName() ) appears to be hit.
For each new "page" of data in the earlier-described project I'm working on, I've found that I have to destroy the javascript objects for the current page. I'm creating a fixed number of objects (eg. 100 items means 100 objects) per page, and performance appears to be related to the number of active objects.
Eg. putting the following arbitrary assignment code in the constructor of each "item" object contributes heavily to the slowdown:
code:
for (var i = 0; i < 100; i++) {
  this['random' + i] = 'random' + i;
}
Obviously creating these name/property values takes some time, but there's more to it than that. As I'm showing 100 items per page, this equates to 10,000 property assignments. As new pages are loaded, performance (time to create a new page) appears to degrade exponentially. I could assign these to either JS objects or DOM nodes referenced by the object; the effect appears to be the same. The type or "weight" of the data may not really matter: assigning a long string to each object seems to have a similar effect.
The obvious downside here is that the previously-viewed pages can't be just swapped out (ie. hide page 2, show page 1,) the data must be re-fetched from the API and related objects re-created.
The key point from all of this: If I delete the objects for the currently-active page before moving to the next one (at which point 100 new objects are created, etc.,) the performance / render time is consistent and does not degrade.
I have not had time to try making an isolated test case, but it may be worthwhile. I would be interested in knowing what part of the browser/JS engine is "bottlenecking," (object look-ups, ?) and if there are techniques that can be used as far as JS code style etc. to avoid the problem.
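For what it's worth, the per-page teardown boils down to something like this (items, releaseHandlers and o are illustrative names):
code:
function teardownPage(items) {
  for (var i = 0; i < items.length; i++) {
    items[i].releaseHandlers();  // detach DOM event handlers
    items[i].o = null;           // drop the DOM reference held by the object
    items[i] = null;             // drop the object itself
  }
  items.length = 0;              // empty the array so nothing keeps the objects alive
}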
quote:
I have not had time to try making an isolated test case, but it may be worthwhile. I would be interested in knowing what part of the browser/JS engine is "bottlenecking," (object look-ups, ?) and if there are techniques that can be used as far as JS code style etc. to avoid the problem.
If you do find time to make one, I'd be very interested to see it.
I'm currently working on a project where I'm about to generate and handle a large number of div tags.
Would be good to know about any possible issues before I find them the hard way...
I've made some "benchmarking" tests earlier with my vector graphics engine to compare the rendering speeds between creating objects with the DOM approach, innerHTML and insertAdjacentHTML, with and without buffering the tags before outputting them. What then turned out to be fastest was to use insertAdjacentHTML on each tag (no buffering). Don't know why though, but I'm not sure it's going to be the same in this new project since I now have to destroy a bunch of objects before/during generation of new ones.
Can you hear the smooth, sweet voice of InI on a rampage?
Fine.
IE7 SUCKS BALLS.
And huge ones at that.
Gave it a whirl after you posted the link, bit, and it has to disappear from the surface of the planet.
It has the same issues, copies FF poorly, is not usable at all, AND adds poor man's security checks that mostly just prevent the user from getting anything done.
On to the actual issue...
quote:
The key point from all of this: If I delete the objects for the currently-active page before moving to the next one (at which point 100 new objects are created, etc.,) the performance / render time is consistent and does not degrade.
I have not had time to try making an isolated test case, but it may be worthwhile. I would be interested in knowing what part of the browser/JS engine is "bottlenecking," (object look-ups, ?) and if there are techniques that can be used as far as JS code style etc. to avoid the problem.
And more than a month before Scott got back with these observations, guess who had nailed something
very similar?
quote:
As in memory leaks in an environment without garbage collection, you mean? Have you tried explicitly "destroying" the elements that are not in use through some DOM method - something like removeChild()?
So I'll trust my sixth sense, and will assume that IE is implicitly deleting references to objects when you change pages...
But not the actual objects.
This is exactly what happened to my first OGL apps: pointers to objects were removed when options were being changed, but the actual data and memory space remained, leading to the application gently slowing down and finally dying.
This may be due to specific circumstances: it may happen on this or that Windows setup for this or that reason. The issue occurs at the application layer, but is it related to the user profile? How does it work on other Windows setups?
If you can reproduce the same issue on several different environments and user profiles, then I am afraid you've found a Microschmuck secret feature, aka a consequence of poor software engineering, aka a lovely little MS bug.
I am afraid all you can do is file a bug report, and assume internauts will not switch to IE7 because, as a long time IE user and advocate, I have to say it's gone lower than I'd ever have suspected it could go.
IE7 must die. And its engineers should be restructured.
I have plenty of work for them.
bitdamaged: The latest IE7 beta I tried also showed similar performance problems (but the workaround of throwing away "old" data prevented it, similarly.)
_Mauro: In fact, I am not removing the DOM nodes (I am actually keeping those intact,) but rather the JS objects that reference them.
eg.
code:
Item.prototype.destructor = function() {
  this.releaseHandlers(); // remove any mouseover/out etc. from this.o (eg. an image or anchor)
  this.o = null;          // remove the DOM reference entirely
};
I found that destroying these objects alleviates the performance issue, despite leaving the related DOM nodes intact (though in a "swapped out" JS object, not actually appended to the body as mentioned earlier.)
I am not talking about DOM nodes. I am talking about pointers and objects the way they are represented in memory. Let me explain....
Pointers are "handles" containing physical adresses for allocated memory spaces.
It tells where in the disk(s) or ram your object(s) reside.
What happens when you allocate an object in any language (new, tipycally), is that a given memory space is allocated for it, and the pointer referring to the beginning of that space is "hidden" behind the variable name you are using.
And what you describe, and your solution, is what happens when pointers are nulled out but the memory space somehow remains allocated - locked for reading, no writing allowed: it's a "memory leak".
Memory leaks invariably lead to memory bloated with "ghost objects", which leads to massive slowdowns, which leads to a crash at some point, when there is no more free space.
Some space programs actually crashed because of this: rockets, satellites... there are known situations where subtle memory leaks killed an immense project.
So my guess is that your JS objects which refer to nodes contain complex constructs for DOM traversal, and that what leaks are those constructs (not the node itself; the node is statically stored somewhere else).
And it's what my 6th sense has hinted at from the start, only I didn't look at your code, and didn't have time, quite frankly, to get used to js again.
That's why I think you've spotted a bug in js garbage collection in some engines, like the IE engine.
I think IE makers have never tested this situation to that extent, and never spotted the actual leak.
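For what it's worth, the classic pattern people point at for this kind of build-up in IE (a JScript object and a DOM node keeping each other alive) looks something like this - purely illustrative:
code:
function attach(node) {
  var wrapper = { o: node };   // the JS object holds the DOM node...
  node.onclick = function() {  // ...and the node's handler closes over the JS object
    alert(wrapper.someState);
  };
  return wrapper;
}
function detach(wrapper) {
  wrapper.o.onclick = null;  // break the DOM -> JS link
  wrapper.o = null;          // break the JS -> DOM link
}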
--------------------------------------
On to the onion layers theory: this *is* one of the worst possible gaps in terms of security.
It can be used to put the system in an unpredictable state, and abuse that state if you can, somehow, "predict the unpredictable".
Eg: if you know how the browser will try to trap this memory leak when it notices something is going wrong, prior to crashing severely, or how Windows will try to handle the issue, you'll be able to let some low-level code (machine language instructions) slide into some memory space where it could be executed.
Basically, C and C++ give you access to pointers and use them a lot. Hence, OO languages written on top of C/C++ have no choice but to use those structures and "hide" them from the JS programmer.
You don't check the bounds of an array, for example; JS does it at runtime.
To correct what I said in the above post, a memory leak is a defect per se, but becomes a threat like a buffer overflow only in rare and particular cases.