History | Log In     View a printable version of the current page. Get help!  
Issue Details [XML]

Key: SES-519
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Critical Critical
Assignee: Arjohn Kampman
Reporter: Andrew Newman
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Sesame

Concurrent Modification Exception in BTree

Created: 24/Jan/08 07:43 AM   Updated: 20/Mar/08 08:03 PM
Component/s: Native Sail
Affects Version/s: 2.0
Fix Version/s: 2.0.1

Environment: Java 1.6, Windows XP


 Description   
The following stack trace occurs in JRDF's copy of the BTree from Sesame:
Exception in thread "main" java.util.ConcurrentModificationException
                at java.util.HashMap$HashIterator.nextEntry(HashMap.java:841)
                at java.util.HashMap$KeyIterator.next(HashMap.java:877)
                at org.jrdf.graph.local.index.longindex.sesame.BTree$Node.notifyNodeSplit(BTree.java:1345)
                at org.jrdf.graph.local.index.longindex.sesame.BTree$Node.splitAndInsert(BTree.java:1279)
                at org.jrdf.graph.local.index.longindex.sesame.BTree.insertInNode(BTree.java:609)
                at org.jrdf.graph.local.index.longindex.sesame.BTree.insertInTree(BTree.java:581)
                at org.jrdf.graph.local.index.longindex.sesame.BTree.insertInTree(BTree.java:586)
                at org.jrdf.graph.local.index.longindex.sesame.BTree.insertInTree(BTree.java:586)
                at org.jrdf.graph.local.index.longindex.sesame.BTree.insert(BTree.java:539)
                at org.jrdf.graph.local.index.longindex.sesame.LongIndexSesame.add(LongIndexSesame.java:84)
                at org.jrdf.graph.local.index.longindex.sesame.LongIndexSesame.add(LongIndexSesame.java:74)

We fixed the issue in our code (http://jrdf.svn.sourceforge.net/viewvc/jrdf/trunk/jrdf/src/java/org/jrdf/graph/local/index/longindex/sesame/BTree.java?view=markup&pathrev=1859) by having nodeSplit (1665) return itself if it is to be deregistered and added it to a set in notifyNodeSplit (1290). This is then used to remove the listeners after the loop - thereby eliminating the ConcurrentModification.

This all seems a bit like a hack, hopefully there's a better way to do it. For example, does it need to be a set (does more than one listener have to be removed)?

Looking at the Sesame 2 code:
This is in notifyNodeSplit (1289) calling nodeSplit which then calls currentNode.deregister (1749) or parentNode.deregister (1767).

Here are the two modified methods (a dodgy diff if you will):
        private void notifyNodeSplit(Node rightNode, int medianIdx)
            throws IOException {
            synchronized (listeners) {
                Set<NodeListener> listenersToDeregister = new HashSet<NodeListener>();
                for (NodeListener l : listeners) {
                    NodeListener listener = l.nodeSplit(this, rightNode, medianIdx);
                    if (listener != null) {
                        listenersToDeregister.add(listener);
                    }
                }
                listeners.removeAll(listenersToDeregister);
            }
        }

        public NodeListener nodeSplit(Node node, Node newNode, int medianIdx) throws IOException {
            if (node == currentNode) {
                if (currentIdx > medianIdx) {
                    currentNode.release();
                    //currentNode.deregister(this);

                    newNode.use();
                    newNode.register(this);

                    currentNode = newNode;
                    currentIdx -= medianIdx + 1;

                    return this;
                }
            } else {
                for (int i = 0; i < parentNodeStack.size(); i++) {
                    Node parentNode = parentNodeStack.get(i);

                    if (node == parentNode) {
                        int parentIdx = parentIndexStack.get(i);

                        if (parentIdx > medianIdx) {
                            parentNode.release();
                            //parentNode.deregister(this);

                            newNode.use();
                            newNode.register(this);

                            parentNodeStack.set(i, newNode);
                            parentIndexStack.set(i, parentIdx - medianIdx - 1);
                        }

                        return this;
                    }
                }
            }
            return null;
        }

 All   Comments   Change History      Sort Order:
Comment by Arjohn Kampman [24/Jan/08 12:52 PM]
The HashSet that was used to store the listeners has been replaced with a ConcurrentLinkedQueue.

Comment by Arjohn Kampman [25/Jan/08 01:27 PM]
Apparently, there are some performance issues with ConcurrentLinkedQueue.remove(...):
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6493942

I verified this bug and it seems to surface frequently with the bug's test program. However, when the number of elements in the queue is reduced to 1000, which looks like a reasonable upper limit, everything's OK.

Comment by Arjohn Kampman [29/Jan/08 01:15 PM]
Some performance measurements performed by Andrew Newman indicate that the use of a ConcurrentLinkedQueue has a considerable negative effect on update performance. Doing 10,000 updates on a dataset of 1M elements takes 149 seconds, while the same test takes 19 seconds with his fix.

Comment by Arjohn Kampman [08/Feb/08 03:01 PM]
BTree now uses synchronized LinkedList for node listeners instead of ConcurrentLinkedQueue. The latter has considerable overhead which is visible when performing ranged searches (up to a factor 2 in simple benchmark).